| Summary: | open vsx is slow again | ||
|---|---|---|---|
| Product: | Community | Reporter: | Anton Kosyakov <anton> |
| Component: | Infrastructure | Assignee: | Eclipse Webmaster <webmaster> |
| Status: | RESOLVED FIXED | QA Contact: | |
| Severity: | normal | ||
| Priority: | P3 | CC: | denis.roy, mikael.barbero |
| Version: | unspecified | ||
| Target Milestone: | --- | ||
| Hardware: | PC | ||
| OS: | Mac OS X | ||
| Whiteboard: | |||
|
Description
Anton Kosyakov
Could you please elaborate on the "cannot connect to it"? Thanks.

Around one hour ago, users were getting timeout errors while trying to search Open VSX from the VS Code integration in Gitpod, ending up with the user-facing error `We cannot connect to the Extensions Marketplace at this time, please try again later.`

Do you have an idea of how long and how many requests were "slow"? I see half a dozen timeouts in the logs, but nothing more. Also, this was during searches, correct? Not direct queries for extensions?

> Do you have an idea of how long and how many requests were "slow"? I see half a dozen timeouts in the logs, but nothing more.
Unfortunately, we don't collect such information yet. It is only logged in the client browser for now, as the client talks directly to Open VSX.

> Also, this was during searches, correct? Not direct queries for extensions?
Yes, it is about using the Open VSX API. If a user manages to get a link to Azure Storage, then it works.

I noticed that the ES nodes were spending a lot of time in GC. I've increased the memory allocated to the nodes (see https://github.com/EclipseFdn/open-vsx.org/commit/31fbf071dd4073bbb89b5d83e866c98ad9c4b3b6). Let's keep this one open for a week. Feel free to add any other issue you may have during this time. If we don't see the issue during this period, I'll then close the ticket.

We are still seeing errors like:
Failed to find extension 'tyriar.sort-lines@1.9.0:39TPtnzbLWepVuVCVtlO3g==' in 'https://open-vsx.org' registry:","error":"Error: ESOCKETTIMEDOUT\n at ClientRequest.<anonymous> (/app/node_modules/request/request.js:816:19)
Configured timeout is 5 mins.

I've opened https://github.com/eclipse/openvsx/issues/256 as this is now the only log remaining. @webmaster, meanwhile, could you please have a look at the nginx side of open-vsx.org?
Sorry, actually I was wrong: the timeout is 5s, not 5 mins, on our side: https://github.com/gitpod-io/gitpod/blob/48dfd9faae8ed6b21684365eb26837065e1adc79/components/server/src/theia-plugin/theia-plugin-service.ts#L244

Could the two issues not be related? If the server is exhausting a 30s timeout to the database (possibly because of connection pool exhaustion), wouldn't that lead a client to a timeout?

> sorry, actually I was wrong timeout is 5s not 5mins on our side:
Thanks; while we strive to get sub-second response times in all cases, 5 seconds isn't very forgiving in the odd case of congestion or other contention.
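
To see how often a given client timeout would actually be exceeded, a small probe can time each request and record whether it timed out. The following is a minimal Python sketch under stated assumptions, not Gitpod's actual client code (which is TypeScript): `is_timeout`, `timed_get` and the `opener` parameter are hypothetical names introduced here for illustration.

```python
import socket
import time
import urllib.error
import urllib.request


def is_timeout(exc):
    """Return True if exc is a socket timeout, raised directly or wrapped in URLError."""
    if isinstance(exc, (socket.timeout, TimeoutError)):
        return True
    return isinstance(exc, urllib.error.URLError) and isinstance(
        exc.reason, (socket.timeout, TimeoutError)
    )


def timed_get(url, timeout_s=30.0, opener=urllib.request.urlopen):
    """Fetch url and return (status, elapsed_seconds).

    status is None when the request timed out; other errors are re-raised.
    opener is a hypothetical hook so the function can be exercised offline.
    """
    start = time.monotonic()
    try:
        with opener(url, timeout=timeout_s) as resp:
            resp.read()
            status = resp.status
    except OSError as exc:  # URLError and socket.timeout are OSError subclasses
        if not is_timeout(exc):
            raise
        status = None
    return status, time.monotonic() - start
```

Calling `timed_get("https://open-vsx.org/api/rust-lang/rust", timeout_s=5.0)` repeatedly and counting the `None` results would give a rough picture of how forgiving a 5-second limit is compared with a longer one.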
> Thanks; while we strive to get sub-second response times in all cases, 5 seconds isn't very forgiving in the odd case of congestion or other contention.
What timeout would you recommend? Is 30s enough?
30s should be plenty. If >1s becomes normal, we'll investigate the cause.

From deeper analysis, the issue is on the app side (too many heavy SQL requests). There is an effort to denormalize the DB schema to lighten the load. See https://github.com/eclipse/openvsx/issues/261

We're still seeing a very low cache hit rate (as reported in https://github.com/eclipse/openvsx/issues/214), but we still have no Docker image to deploy with the latest changes.

(In reply to Denis Roy from comment #13)
> 30s should be plenty. If >1s becomes normal, we'll investigate the cause.
Indeed, we are seeing 9s ~ 14s response times from anything that cannot be cached/accelerated.

A cacheable call responds very quickly:
$ time wget -S https://open-vsx.org/api/rust-lang/rust
HTTP request sent, awaiting response...
HTTP/1.1 200
Server: nginx
[snip]
X-Frame-Options: DENY
X-Proxy-Cache: HIT
real 0m0.281s

A call with an Origin header bypasses the cache:
$ time wget -S --header="Origin: x" https://open-vsx.org/api/rust-lang/rust
HTTP request sent, awaiting response...
HTTP/1.1 200
Server: nginx
[snip]
X-Frame-Options: DENY
X-Proxy-Cache: BYPASS
real 0m7.538s

Thx for the summary, Mikaël.

We have resolved this. |
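
The two `wget` runs above can be reproduced programmatically by sending the same request with and without an `Origin` header and reading the `X-Proxy-Cache` response header. This is a rough Python sketch: `build_request`, `probe_cache` and the `opener` parameter are hypothetical names chosen for illustration, and the `"UNKNOWN"` fallback is an assumption for responses that lack the header.

```python
import time
import urllib.request


def build_request(url, origin=None):
    """Build a GET request, optionally carrying an Origin header."""
    headers = {"Origin": origin} if origin else {}
    return urllib.request.Request(url, headers=headers)


def probe_cache(url, origin=None, timeout_s=30.0, opener=urllib.request.urlopen):
    """Return (x_proxy_cache_value, elapsed_seconds) for a single request.

    opener is a hypothetical hook so the function can be exercised offline.
    """
    req = build_request(url, origin)
    start = time.monotonic()
    with opener(req, timeout=timeout_s) as resp:
        resp.read()
        # "UNKNOWN" is an assumed placeholder when the proxy omits the header.
        cache = resp.headers.get("X-Proxy-Cache", "UNKNOWN")
    return cache, time.monotonic() - start
```

Comparing `probe_cache(url)` with `probe_cache(url, origin="x")` for the same URL should show whether the `Origin` header alone moves a request from `HIT` to `BYPASS`, along with the latency difference.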