| Summary: | Provide a way to refresh that finds new children without doing attribute checks on other resources | ||||||
|---|---|---|---|---|---|---|---|
| Product: | [Eclipse Project] Platform | Reporter: | Chris Recoskie <recoskie> | ||||
| Component: | Resources | Assignee: | Platform-Resources-Inbox <platform-resources-inbox> | ||||
| Status: | CLOSED WONTFIX | QA Contact: | |||||
| Severity: | enhancement | ||||||
| Priority: | P3 | CC: | angvoz.dev, jamesblackburn+eclipse, malaperle, overholt, pwebster, Szymon.Brandys, yevshif | ||||
| Version: | 3.7 | ||||||
| Target Milestone: | --- | ||||||
| Hardware: | All | ||||||
| OS: | All | ||||||
| Whiteboard: | stalebug | ||||||
| Bug Depends on: | |||||||
| Bug Blocks: | 278257 | ||||||
Description
Chris Recoskie
I guess the main assumption here is that there are many fewer listFiles than stats. Existing directories would need to be listed to discover new resources, and newly discovered resources would need to be queried to discover whether or not they're files. If files greatly outnumber directories (which I guess is very likely), this would be an order of magnitude improvement.

Taking one of my slow refresh projects (I'm on NFS):

bash:jamesb:xl-cbga-20:32892> find . -type f |wc
  50412   51240 4979196
bash:jamesb:xl-cbga-20:32893> find . -type d |wc
   2863    2887  177557

So we might get a 20x time saving. What do your projects look like, Chris?

The other point to note is that we really don't want the Workspace locked while the refresh is happening - or at least it shouldn't be locked for long periods.

> DEPTH_ZERO – refresh only the resource itself; do not refresh any children.
Shouldn't DEPTH_ZERO applied to a folder discover (but not refresh) any children? File systems commonly implement a folder as a special kind of file whose content is the list of files in the folder. If you refresh that, you should already have the list of files.
Created attachment 189862
work in progress
Attaching work in progress for discussion purposes.
The new type of refresh is now ever so slightly faster than a regular refresh, but not really fast enough. On a remote project hosted with RSE that has 250 folders, each with 100 C++ files (25,000 source files in total) that get compiled into 25,000 object files, a DEPTH_INFINITE refresh on the project is about one second slower than updating the folder contents with the new DEPTH_FOLDER_CONTENTS flag.
The problem is that individual calls to IFileStore.fetchInfo(...) are really slow; if you refresh that way, fetching info only for newly discovered children, the refresh actually ends up slower overall than a full refresh. The existing refresh functionality avoids this by fetching all child infos in one call via IFileStore.childInfos(...). Doing it that way in the new method made it one second faster than a full refresh, but that essentially defeats the purpose of what we're trying to do: for one second's difference you might as well just do a full refresh at DEPTH_ONE, since you are fetching info for all the children anyway.
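The cost difference described above can be illustrated with a rough count of remote calls. This is only a back-of-the-envelope model, not a measurement: it assumes every call costs one round trip regardless of payload, and that all 100 object files in each of the 250 folders are new. The class and method names (`RefreshCostModel`, `perChildCalls`, `batchedCalls`) are hypothetical and not part of EFS.

```java
/** Rough round-trip model for refreshing folders on a slow file system (illustrative only). */
final class RefreshCostModel {

    /** One name listing per folder, then one fetchInfo-style call per newly discovered child. */
    public static long perChildCalls(long folders, long newChildrenPerFolder) {
        return folders + folders * newChildrenPerFolder;
    }

    /** One batched call per folder, in the style of IFileStore.childInfos(...). */
    public static long batchedCalls(long folders) {
        return folders;
    }

    public static void main(String[] args) {
        long folders = 250;       // folder count from the test project in this comment
        long newPerFolder = 100;  // assumption: every object file in a folder is new

        System.out.println("per-child calls: " + perChildCalls(folders, newPerFolder)); // 25250
        System.out.println("batched calls:   " + batchedCalls(folders));                // 250
    }
}
```

Under these assumptions the per-child approach makes about a hundred times as many calls, even though it fetches info for fewer resources.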
In order to make this worthwhile, we'd need a way to ask an IFileStore to get the child infos for a specific list of children in one operation. That way we could retrieve the list of filesystem children, reconcile them against the workspace members, and then only fetchInfo() for any newly discovered resources.
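The reconcile step could look roughly like the sketch below. It is self-contained and uses plain collections rather than the real workspace or EFS types; `NewChildReconciler` and `findNewChildren` are hypothetical names, and the batched info call that would consume the result does not exist in EFS today. It finds new children only, and deliberately ignores deletions and changes to existing members.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

/** Illustrative reconciliation of a folder listing against known workspace members. */
final class NewChildReconciler {

    /** Returns names present on the file system but unknown to the workspace, in listing order. */
    public static List<String> findNewChildren(List<String> fileSystemNames,
                                               Set<String> workspaceMembers) {
        List<String> discovered = new ArrayList<>();
        for (String name : fileSystemNames) {
            if (!workspaceMembers.contains(name)) {
                discovered.add(name);
            }
        }
        return discovered;
    }

    public static void main(String[] args) {
        // Names a listing call such as IFileStore.childNames(...) might return (made-up data).
        List<String> onDisk = Arrays.asList("a.cpp", "a.o", "b.cpp", "b.o");
        // Members the workspace already knows about (made-up data).
        Set<String> known = new HashSet<>(Arrays.asList("a.cpp", "b.cpp"));

        // Only these names would be handed to the proposed batched info call.
        System.out.println(findNewChildren(onDisk, known)); // prints [a.o, b.o]
    }
}
```

Keeping the workspace members in a hash set makes the reconcile pass linear in the size of the listing, so the remote calls remain the dominant cost.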
(In reply to comment #3)
I talked to James B. today about the patch. It seems that we all are busy at the end of M7. Since James has been working in the refresh area recently, I asked him to look at the patch to speed the work up. His comment will be valuable. I will find time next week to look and comment.

(In reply to comment #4)
> (In reply to comment #3)
> I talked to James B. today about the patch. It seems that we all are busy at
> the end of M7. Since James has been working in the refresh area recently, I
> asked him to look at the patch to speed the work up. His comment will be
> valuable. I will find time next week to look and comment.

Well basically the feature doesn't seem to be worth doing without adding some API to EFS to go along with it, and in order to validate the performance, I'd have to implement the API for some EFS provider (RSE would make most sense). I have had other things to work on so I was not planning to look at this again until Juno timeframe.

This bug hasn't had any activity in quite some time. Maybe the problem got resolved, was a duplicate of something else, or became less pressing for some reason - or maybe it's still relevant but just hasn't been looked at yet.

If you have further information on the current state of the bug, please add it. The information can be, for example, that the problem still occurs, that you still want the feature, that more information is needed, or that the bug is (for whatever reason) no longer relevant.

If the bug is still relevant, please remove the "stalebug" whiteboard tag.