It turns out that `gemini-1.5-pro-002` in Taiwan is noticeably faster than `gemini-1.5-pro-002` in the US; its response time is roughly on par with `gemini-2.0-flash-exp` in the US, and the output quality is also good.
I opened a PR to prioritize `gemini-1.5-pro-002` in Taiwan and Japan:
https://github.com/cofacts/rumors-api/pull/361
As for the latest stable release, `gemini-2.0-flash-001`, we can revisit it once it gets faster.
#361 Adjust LLM parameters to speed things up
The most recent model, `gemini-2.0-flash-001`, suffers from noticeably slower responses compared to its predecessor, the experimental `gemini-2.0-flash-exp`.
(Screenshot: response-time comparison of `gemini-2.0-flash-001` vs `gemini-2.0-flash-exp`)
On the other hand, I found that `gemini-1.5-pro-002` in `asia-east1` (Taiwan) and `asia-northeast1` (Tokyo) is actually as fast as `gemini-2.0-flash-exp` in `us-central1`, and its quality is also pretty good.
(Screenshot: response-time comparison of `gemini-1.5-pro-002` across regions)
This PR updates the LLM model settings for LLM transcripts so that `gemini-1.5-pro-002` in Asia is tried first, with `gemini-2.0-flash-exp` as the fallback. For now, we skip the latest `gemini-2.0-flash-001` until it gets a significant speed boost.
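The prioritization described above can be sketched as an ordered list of (model, location) candidates that are tried in sequence until one succeeds. This is only an illustrative sketch: the `ModelCandidate` type, `CANDIDATES` list, and `transcribeWithFallback` function are hypothetical names, not the actual rumors-api code in the PR.

```typescript
// Hypothetical sketch of region-prioritized model fallback.
// Candidate order reflects the PR: Asian gemini-1.5-pro-002 first,
// then gemini-2.0-flash-exp in the US as the fallback.
type ModelCandidate = { model: string; location: string };

const CANDIDATES: ModelCandidate[] = [
  { model: 'gemini-1.5-pro-002', location: 'asia-east1' },      // Taiwan
  { model: 'gemini-1.5-pro-002', location: 'asia-northeast1' }, // Tokyo
  { model: 'gemini-2.0-flash-exp', location: 'us-central1' },   // fallback
];

// Tries each candidate in priority order; `invoke` is whatever function
// actually calls the LLM endpoint. Rethrows the last error if all fail.
async function transcribeWithFallback(
  invoke: (c: ModelCandidate) => Promise<string>
): Promise<string> {
  let lastError: unknown;
  for (const candidate of CANDIDATES) {
    try {
      return await invoke(candidate);
    } catch (err) {
      lastError = err; // e.g. quota exhausted or region unavailable
    }
  }
  throw lastError;
}
```

Keeping the priority list as plain data makes it easy to reorder candidates later, for example once `gemini-2.0-flash-001` becomes fast enough to include.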