Some Eclipse Foundation services are deprecated, or will be soon. Please ensure you've read this important communication.
Bug 105554 - cyclic symbolic link causes java.lang.OutOfMemoryError: Java heap space
Summary: cyclic symbolic link causes java.lang.OutOfMemoryError: Java heap space
Status: VERIFIED FIXED
Alias: None
Product: Platform
Classification: Eclipse Project
Component: Resources (show other bugs)
Version: 3.3   Edit
Hardware: All Unix All
: P3 normal with 9 votes (vote)
Target Milestone: 3.3 M7   Edit
Assignee: John Arthorne CLA
QA Contact:
URL:
Whiteboard:
Keywords: greatbug
: 131461 145200 161014 161015 172160 215047 (view as bug list)
Depends on: 184433
Blocks: 232426
  Show dependency tree
 
Reported: 2005-07-29 03:07 EDT by Hironori Komaba CLA
Modified: 2018-07-30 09:03 EDT (History)
15 users (show)

See Also:


Attachments
One way to solve the endless loop during refresh for recursive symlinks (10.19 KB, patch)
2006-05-10 08:27 EDT, Max Weninger CLA
no flags Details | Diff
Updated patch (19.61 KB, patch)
2006-08-23 16:04 EDT, John Arthorne CLA
no flags Details | Diff
Updated patch (11.56 KB, patch)
2006-08-24 09:38 EDT, John Arthorne CLA
no flags Details | Diff
recursive symlink example (16.51 KB, application/octet-stream)
2006-09-19 13:54 EDT, Max Weninger CLA
no flags Details
Updated my way to handle recursive symlinks (10.57 KB, patch)
2006-10-31 07:32 EST, Max Weninger CLA
no flags Details | Diff
Performance test (4.79 KB, patch)
2007-02-06 18:13 EST, John Arthorne CLA
no flags Details | Diff
Test projects for symbolic links. These break Eclipse 3.3M6!! (19.83 KB, application/x-compressed)
2007-04-27 08:58 EDT, Martin Oberhuber CLA
no flags Details
Patch fixing all recursive symbolic link issues in UnifiedTree (21.00 KB, patch)
2007-04-27 09:20 EDT, Martin Oberhuber CLA
john.arthorne: iplog+
Details | Diff
Minor tweaks, code formatting, etc (21.53 KB, patch)
2007-04-30 10:31 EDT, John Arthorne CLA
no flags Details | Diff
Patch adding performance counters (development-time only) (3.30 KB, patch)
2007-05-03 10:48 EDT, Martin Oberhuber CLA
no flags Details | Diff
Patch adding a UI Preference for suppressLinkedDuplicates (8.88 KB, patch)
2007-05-03 10:52 EDT, Martin Oberhuber CLA
no flags Details | Diff

Note You need to log in before you can comment on or make changes to this bug.
Description Hironori Komaba CLA 2005-07-29 03:07:42 EDT
An Eclipse project contains symbolic links as its child resources.
If such symbolic link points to its ancestor directory, when the workspace
refreshing is invoked (by pressing F5 key in the resource navigator, for 
example), Eclipse is killed by 'java.lang.OutOfMemoryError: Java heap space'.

This is caused by the infinite loop execution of
  FileSystemResourceManager.refreshResource() --> UnifiedTree.accept()
Comment 1 Rafael Chaves CLA 2005-08-01 15:38:12 EDT
See also bug 42438.
Comment 2 Udo Rader CLA 2005-08-24 03:54:01 EDT
This is even worse when having "automatic refresh" in action. A real
showstopper, IMHO.
Comment 3 John Arthorne CLA 2005-08-24 10:34:12 EDT
Having cycles in your file system is a show-stopper - if it hurts, don't do it.
Comment 4 Udo Rader CLA 2005-08-25 15:15:27 EDT
Hmm, various other applications deal quite happily with it. A symlink that
points to the containing directory is fine with all *nix filesystems and that's
why one can do it.

If an application does not deal with all normal situations a file system can be
in, then the application is buggy.

From my endusers point of view eclipse lacks appropriate deadlock prevention
when dealing with the filesystem.
Comment 5 James Sleeman CLA 2005-08-31 23:55:01 EDT
(In reply to comment #3)
> Having cycles in your file system is a show-stopper - if it hurts, don't do it.

having cycles in your file system is a perfectly normal and acceptable practice.
 Applications should use realpath to canonicalize directories and ensure there
is no recursion before enumerating thier contents.

I found this report because I tried adding a project containing a cyclic link,
it was unsuccessful while other IDEs and editors have no issue with cyclicity.

Comment 6 Max Weninger CLA 2005-10-05 07:36:38 EDT
(In reply to comment #5)
> (In reply to comment #3)
> > Having cycles in your file system is a show-stopper - if it hurts, don't do it.
> 
> having cycles in your file system is a perfectly normal and acceptable practice.
>  Applications should use realpath to canonicalize directories and ensure there
> is no recursion before enumerating thier contents.

I can only agree! You will find such symlinks on nearly any linux system e.g. in
/usr/include. Without starting a discussion if this practice is good or not the
fact is that eclipse will have to handle such situations.

BTW: it is not an "infinite loop" it will stop after 40 iterations but this
also too much :-)

We will use a fix by changing the native stat impl for linux systems in 
libcore_3_1_0.so. There we check if the path is a link to "." and if yes
this path will be skipped.

If you are interessted we can provide the patch.

But of course the real fix would be to check the path for such situations 
in the java part with calls like getCannonicalPath.
Comment 7 John Arthorne CLA 2005-10-05 09:40:35 EDT
Sorry, my comment #3 was a bit harsh - I think I was just reacting to the
comment that the bug was a show-stopper (comment #2), which seemed odd because
after four years of Eclipse being around this was the first bug on the subject.

I agree it is a bug, and it should be fixed.  The suggested fix in comment #6
seems insufficient, since a sym-link to the parent would presumably also create
such a loop.
Comment 8 Max Weninger CLA 2005-10-05 14:26:14 EDT
(In reply to comment #7)
> I agree it is a bug, and it should be fixed.  The suggested fix in comment #6
> seems insufficient, since a sym-link to the parent would presumably also create
> such a loop.

Sure it does only fix a special problem that eclipse currently has.
But "real" symlink loops are not the normal case
and are normally the result of an error of the user or the system setup.

As where links to "." are pretty "normal" and are IMHO not real "loops".

But the best fix would be of course in the resource layer :-)
Comment 9 Max Weninger CLA 2005-10-06 07:59:47 EDT
This does not really belong to this bug but the reason is the same
The "import from file system" wizard has also some problems
when importing sources which contain symlinks to "."

It will create 40 iterations of the folder structure 
as real folders before stopping.

Should I create a separate bugzille entry for this?
Comment 10 Max Weninger CLA 2005-10-06 09:48:30 EDT
(In reply to comment #9)
> This does not really belong to this bug but the reason is the same
> The "import from file system" wizard has also some problems
> when importing sources which contain symlinks to "."
> 
> It will create 40 iterations of the folder structure 
> as real folders before stopping.

A fix would be possible in FileSystemStructureProvider
getChildren() with getCannonicalFile a link to the current folder
could be detected and ignored.

But this is also just symptom fixing.

The best solution for any recursive file system structure traversal 
would be to collect the possible folders in front, check them for
loops etc. and iterate just over the "cleanuped" list.

Comment 11 Mark Belanger CLA 2005-11-09 17:46:05 EST
I'd like to reiterate that this is a real showstopper if
you are trying to import filesystems that contain such
links
Comment 12 Oli Bye CLA 2006-02-01 07:00:43 EST
This problem isn't only restricted to UNIX.

For example I'm using Eclipse CDT on Windows against a project in a ClearCase dynamic view hosted on Solaris.

It's a serious obstacle to the take up of Eclipse on our team. Mainly because it occurs when people first try Eclipse, giving them a very bad first impression.

I'm aware that this is a more complex problem. I don't think getCanonicalFile will help us here as the information about link being a link is only available at the ClearCase level. I don't think it's even exposed at the SMB level.

A great compromise would be if I could get the resource manager to just exclude certain names when it does the refresh. It isn't enough to just exclude at the C/C++ Project Paths->Source Tab->Exclusion Filter level.
Comment 13 John Arthorne CLA 2006-03-13 10:13:57 EST
*** Bug 131461 has been marked as a duplicate of this bug. ***
Comment 14 Matt McCutchen CLA 2006-03-15 18:15:19 EST
I would very much like it if Eclipse's resources layer would handle all symlinks in an intelligent fashion.  I'm writing a plugin, and I wanted to make it remember a path in the workspace in the form of a symlink, but I realized this would confuse the resource system if the path went up.

When Eclipse's filesystem scanner finds a symlink, it could report the symlink as a linked resource; that would fix the bug if Eclipse already has logic that avoids cycles of linked resources.
Comment 15 Max Weninger CLA 2006-05-10 08:27:48 EDT
Created attachment 40924 [details]
One way to solve the endless loop during refresh for recursive symlinks

This is a patch against 3.2RC3
It is a change in RefreshLocalVisitor which remembers folders
that have already been visited in the refresh operation.
File.getCanonicalPath is used to get the unique path.

The rest of the change is adding a checkbox in the
workspace preferences to enable this check.
The default value is false.

This does NOT fix the same problem in the file system
import wizard but a similair approach can be used there too
Comment 16 John Arthorne CLA 2006-06-05 11:41:42 EDT
*** Bug 145200 has been marked as a duplicate of this bug. ***
Comment 17 David Biesack CLA 2006-07-24 09:32:59 EDT
This bug also manifests itself in apparent infinite loops running batch audits of codebases with symbolic links in them. Instantiation's CodePro AnalytiX uses the Eclipse compiler to parse the code. Symbolic links get expanded and reexpanded in the bin directory of the temporary audit workspace. For example, if your source folder has org/eclipse and that folder has a symlink resources -> ../.. then you will see org/eclipse/orgeclipse/org/eclipse/org/eclipse ... created in the bin directory. I did not verify if this stops after N levels, however, but the operation did appear to be looping.
Comment 18 Max Weninger CLA 2006-08-02 04:27:26 EDT
Will this patch be considered for 3.2.1?
Comment 19 John Arthorne CLA 2006-08-23 16:04:56 EDT
Created attachment 48520 [details]
Updated patch

This is an update on the patch in comment #15.  I haven't tested it yet - just posting here as a work in progress.  This patch moves the recursive check out of RefreshLocalVisitor and into UnifiedTree.  If you fix only the refresh local case, you'll still get the infinite loop for other clients of the UnifiedTree (this is a tree that simutaneously does a breadth-first traversal of the local file system and the resource tree). I also believe the previous patch would have scaled poorly to workspaces with thousands of folders (not uncommon).
Comment 20 Max Weninger CLA 2006-08-24 07:56:29 EDT
Great that there is progress on this problem.
I am really not an expert with this aerea of the code 
so I am glad to learn from you :-)

If you like I can test it as soon you say it makes sense
Comment 21 John Arthorne CLA 2006-08-24 09:38:47 EDT
Created attachment 48580 [details]
Updated patch

The previous patch didn't catch all cases.  I have been testing this latest patch and it seems to work well.  If anyone else wants to give it a try it would be good to hear your feedback.
Comment 22 Max Weninger CLA 2006-08-28 08:03:33 EDT
Tested it today:

Applied patch against 3.2

-Use "Import archive wizard" to import a tgz which contains recursive symlinks in an existing project 
 
-> works (recursive symlinks are ignored)

-Extract same tgz on cmdline in an existing simple project and use refresh 

-> hangs until out of memory as before

Comment 23 Max Weninger CLA 2006-09-19 10:33:41 EDT
I just looked again at your patch.

It is not enough just to check the parent<->child relation
because recursive symlinks can spawn more then one 
directory level.

And that will not be discovered from this patch
Comment 24 John Arthorne CLA 2006-09-19 12:23:21 EDT
Can you give a concrete example where it doesn't work?  If you have a link at /a/b/c that links to /a, then the prefix check should still work. A simple failing example to test against would be helpful.
Comment 25 Max Weninger CLA 2006-09-19 13:54:23 EDT
Created attachment 50494 [details]
recursive symlink example
Comment 26 Max Weninger CLA 2006-09-19 13:55:44 EDT
Of course you must extract the archive on a unix/linux host :-)
Comment 27 Anton Leherbauer CLA 2006-10-16 11:05:15 EDT
*** Bug 161015 has been marked as a duplicate of this bug. ***
Comment 28 John Arthorne CLA 2006-10-16 11:59:11 EDT
*** Bug 161014 has been marked as a duplicate of this bug. ***
Comment 29 Max Weninger CLA 2006-10-31 07:32:52 EST
Created attachment 52979 [details]
Updated my way to handle recursive symlinks

Bugfix to included already existing folders in the visited list
Comment 30 Martin Oberhuber CLA 2007-01-30 10:00:52 EST
This is certainly a hard problem, especially when the recursive link is in a ClearCase view accessed on Windows (Comment #12). File.getCanonicalPath() fails in this case. The only possible solution I can see is when the getChildren() call computes a hash code of the child list (including child basenames, length, access time, attributes from EFS/IFileInfo), and in case exactly the same children are found again break the recursion (with a visited list as in Comment #29). Such an approach would be a bit similar to the way how cddb identifies audio CD's by computing a hash code from the TOC, length of the songs etc and it seems to work reasonably well.

For the more common case where symbolic links can be really identified as symbolic links by the underlying file system, EFS now has API to identify them when the native file library is loaded (bug 170317: EFS.ATTRIBUTE_SYMLINK, currently implemented on Linux only).

I think that if a solution is desired that avoids cyclic links in all possible cases, there is no way around a visited list since the recursion may involve any number of symbolic links. A->. is just the simplest case, A.b->B with B.a->A is the mutually recursive case, A.b->B, B.c->C, C.a->A involves three folders and so on -- one can dream up recursion involving any number of links / folders involved.

I think the solution from comment #29 is close to perfect, just given that it fails for the ClearCase case (and java.io.File.getCanonicalPath() may be expensive), a solution including a hashCode of child infos might be worth a thought. A simple solution could involve an ArrayList as follows:

static class UniqueResourceId {
  private String names;
  private long modTime;
  private int type;
  public UniqueResourceId (String name, long modTime, int type) {
     this.name=name; this.modTime=modTime; this.type=type;
  }
  public boolean equals(Object other) {
    if (other instanceof UniqueResourceId) {
      UniqueResourceId o = (UniqueResourceId)other;
      return modTime==o.modTime && type == o.type && name.equals(o.name);
    }
    return false;
  }
  public int hashCode() {
    return name.hashCode(); //just hash the name
  }
}
static class FreezableArrayList extends ArrayList {
    private boolean frozen=false;
    private int hashCode;
    public void freeze() {
        this.hashCode=super.hashCode();
        frozen=true;
    }
    public int hashCode() {
        if(frozen) return this.hashCode;
        return super.hashCode();
    }
}
Object computeUniqueFolderId(IContainer folder) {
    IResource[] children = folder.members();
    FreezableArrayList uuid = new FreezableArrayList(children.length);
    for(int i=0; i<children.length; i++) {
        IResource c=children[i];
        uuid.add(new UniqueResourceId(c.getName(), 
             c.getLocalTimeStamp(), c.getType());
    }
    uuid.freeze();
    return uuid;
}
Comment 31 MayXu CLA 2007-01-30 22:13:16 EST
I update a little base the attachement of Comment #29 in 
order to resolve the relative link
RefreshLocalVisitor.java checkFolder method

String s = p.toFile().getCanonicalPath();
String s1 = p.toFile().getPath();
if (s1.length() > s.length() && s1.startsWith(s))
{
	return false;
}
else if(!fVisitedList.contains(s)){
	fVisitedList.add(s);
} else {
	return false;
}
Comment 32 Martin Oberhuber CLA 2007-01-31 06:39:00 EST
(In reply to comment #31)
Ah, you are taking a shortcut in case the symbolic link just points up in the folder hierarchy e.g.  myself->.  or  parent->.. 

Was your intention just performance optimization or something else? I think your code doesn't correctly cover the prefix case where you sit in a folder
  /path/to/folder
and it contains a symbolic link
  parallel -> ../fol
which points to the folder
  /path/to/fol

Your code wouldn't refresh /path/to/fol. I believe this could be fixed by appending File.separatorChar to the canonicalPath before doing your prefix check.

With respect to my previous proposal in comment #30, an even better return value for the method computing the unique identifier would be (since direct access to elements is no more needed for the UID):

static class UIDFromList {
    private int hashCode;
    private List list;
    public UIDFromList(List l) { 
        this.list=l; this.hashCode=l.hashCode();
    }
    public int hashCode() {
        return this.hashCode;
    }
    public boolean equals(Object other) {
        if (other instanceof UIDFromList) {
            return this.list.equals(((UIDFromList)other).list);
        }
        return false;
    }
}
Comment 33 John Arthorne CLA 2007-02-05 17:15:42 EST
I have two criteria for a solution to this bug:

1) Must have trivial or no performance impact when no symbolic links are present.
2) Must be solved in UnifiedTree, rather than in RefreshLocalVisitor. As mentioned earlier, all the current patches put the fix in RefreshLocalVisitor. However, if some other visitor walks over the UnifiedTree in this case, they will hit the same infinite recursion (DeleteVisitor and CopyVisitor being prominent examples).

I think now that we have native symlink support in EFS, there is an opportunity to satisfy 1). I.e., we can detect symlinks at trivial cost since the attribute will be fetched anyway as part of the IFileStore#fetchInfo call.  Off the top of my head here's a possible solution:

- UnifiedTreeNode has a "visited" field, which is a Set of IFileStores visited already.  By default this field is "null", so it adds trivial extra cost for non-symlink case.
- When we go to fetch children of a node in the tree (UnifiedTree.addChildren), check if it is a sym link. If it is, we initialize the visited field with a new Set.  If the visited field is non-null, add the current node's file store to the visited set before proceeding to add children.
- As we create child nodes, check if it is already in the visited set. If so, don't create a node for that child.
- For each child node that is created, pass along the visited set (or null) from the parent.

With such a solution I believe we could throw away the UI portion of this patch - no need to add a preference if the performance is good in the normal case.  

The only problem this doesn't solve is the case where the symbolic link is not detectable via the IFileStore native (ClearCase on Windows mounting a Linux drive, for example).  I don't think we should be hung up on this solving problem right now.. I would much rather solve the majority case of plain symlinks on *nix/Vista machines. It would then be possible to solve the CC problem via a specialized IFileStore implementation, for example, that knows how to detect symlinks for that case.

If someone wants to take a crack this, be my guest.  If not, I'll try to find time to code this up in the 3.3 M6 period.
Comment 34 Martin Oberhuber CLA 2007-02-05 18:10:25 EST
Do you know of an existing unit test to check the performance impact (and probably also memory consumption) of this? 
Unchanged M5, and the existing patch could be the cornerstones of a performance check.
Comment 35 John Arthorne CLA 2007-02-06 18:13:03 EST
Created attachment 58388 [details]
Performance test

This patch adds a performance test for refreshing a large project.  Tweak the suite() method to enable only the single refresh test.  I'll release this after M5 so we have a baseline.
Comment 36 Philipe Mulet CLA 2007-04-24 10:56:01 EDT
*** Bug 172160 has been marked as a duplicate of this bug. ***
Comment 37 Martin Oberhuber CLA 2007-04-24 12:13:08 EDT
(In reply to comment #33)
> If someone wants to take a crack this, be my guest.  If not, I'll try to find
> time to code this up in the 3.3 M6 period.

With M7 close, is it realistic that you find time for this? Or until what time would you accept a good patch? - Max Weninger has submitted two patches already which are actually part of our product based on Eclipse, but have not been accepted into the Platform...

> I think now that we have native symlink support in EFS, there is an 
> opportunity to satisfy 1). I.e., we can detect symlinks at trivial cost 

yes.

> If the visited field is non-null, add the current node's file store to
> the visited set before proceeding to add children.
> - As we create child nodes, check if it is already in the visited set.

I think we need to put java.io.File#getCanonicalPath() into the "visited" Set, and also check for the canonical path of each folder when drilling further down, since the path we get from the IFileStore will be deeper and deeper even if the symbolic link points to itself.
Since getting the canonical path is typically expensive (and needs to be done for every folder drilling down, as soon as we stepped over a symbolic link), I still think it's a better idea to compute a hash code from each folder's children and attributes - as proposed in comment #32. Then, we need to get the canonical path only if the hashcodes are the same.

> The only problem this doesn't solve is the case where the symbolic link is not
> detectable via the IFileStore native

Yes - but in such a situation also getCanonicalPath() does not help. I agree that it is OK to live with this limitation for now.
Comment 38 John Arthorne CLA 2007-04-24 13:59:51 EDT
Yes, it's unlikely I'll get to this in 3.3.
Comment 39 Martin Oberhuber CLA 2007-04-26 05:25:43 EDT
I'm working on this now and will try to come up with a good patch.
Comment 40 Martin Oberhuber CLA 2007-04-27 08:58:58 EDT
Created attachment 65204 [details]
Test projects for symbolic links. These break Eclipse 3.3M6!!

Attached archive contains test projects for all possible cases of symbolic links I could think of. It also includes the "udev" real-world example from attachment 50494 [details] (and obsoletes it). These are the projects I've tested my implementation against.

There are 6 test projects, which can be imported into Eclipse (extract the archive on a UNIX platform, then choose File > Import > Existing Projects into Workspace). Two other folder hierarchies are also needed to test recursive symbolic links pointing outside the projects known to the workspace.

Attention: DO NOT IMPORT test1..test5 INTO ECLIPSE 3.3M6 or earlier. It will fail with an OutOfMemoryError during refresh, since these test cases all contain recursive symbolic links. Also, DO NOT import these projects on a platform that supports symbolic links but where Eclipse does not know about them (currently MacOS X, HP-UX, AIX, Linux x86-64, or an SMB mounted share on Windows).

test6 is a special project: it does not contain recursive symbolic links and should thus work on any Eclipse installation. This test is there to check for a specific small shortcoming of the implementation (to be attached next).

Legal Message: I, Martin Oberhuber, declare that I developed attached code from scratch, without referencing any 3rd party materials except material licensed under the EPL. I am authorized by my employer, Wind River Systems Inc., to make this contribution under the EPL.
Comment 41 Martin Oberhuber CLA 2007-04-27 09:20:52 EDT
Created attachment 65205 [details]
Patch fixing all recursive symbolic link issues in UnifiedTree

Attached patch fixes all problems shown by all known symbolic link test cases as of the attachment to comment #40.

It follows the rules established by John in comment #33:
* Entire fix is in UnifiedTree
* Special handling ONLY for symlinks pointing to folders (as detected by EFS)
  Thus there is guaranteed trivial extra cost, and no change of behavior in
  the non symlink case.

Because EFS symlink detection is used, this fix only works when a liblocalfile native library fragment is installed that supports detecting symbolic links. Currently, this is the case for Linux-x86, Linux-PPC, and Solaris-SPARC. Other platforms are not affected by this fix and will still break. Re-compiling the liblocalfile for other UNIX platforms is trivial, though; MaxOS X may be harder since it has a different code base.

The fix detects also mutually recursive symbolic links (a->b, b->c, c->a) by maintaining a history of symbolic link parents during traversal of the tree. Because this history is not stored after traversal is done, and because traversal is done in a breadth-first-search manner, the fix has two "strange" effects:

1.) In case of mutually recursive a symbolic links, refreshing on level 0 has
    a different result than refreshing on lower levels - lower levels reveal
    one level more of the recursive link loop, while refreshing on level 0
    reveals only the first one. This can be shown by refreshing on various 
    levels in project test1.

2.) There is one specific case where even non-recursive symbolic links show
    fewer duplicate resources now, than it was before the fix. Look at test6
    from the attachment to comment #40 for details.

Both these "strange" results appear in very rare scenarios, so I think these are acceptable taking the benefit of the fix into account. Fixing them would require a lot more effort, most probably including things like keeping track of previous locations while following links and keeping this history in sync with workspace changes; resource tree lookups during tree traversal; deferring symbolic link evaluation in the BFS queue until all non-symbolic-links are evaluated.

Summing up, the patch has the following properties:
  * it works with a symlink-enabled EFS provider only.
  * with EFS enabled on localstore, it fixes all known recursion issues.
  * recursive symbolic links are completely hidden from the resource tree.
  * it NEVER leads to missing unique resources that would otherwise be shown.
  * it MAY suppress duplicate resources under very unlikely conditions only.
  * in case of recursive links only, it MAY lead to different results
    during refresh, depending on where a query was started, with the
    difference being that dupliates are either shown or hidden.

When a recursive symbolic link is detected, it is completely hidden (as if it would not exist). I think that this is a potential enhancement for the future -- displaying symbolic links like linked resources in the resource tree, and handling them accordingly.

The patch includes a testcase for the new utility class (PrefixPool) but no automated test for the various symbolic link scenarios. I think that converting the tgz archive of the test projects into automated tests should not be too hard, the test cases could simply count the number of resources after a full refresh; I was just not sure how to best add this. Of course, the test could only run on EFS-symlink-enabled-platforms.

Because the code apparently does no work in the non-symlink-case, I also did not check the performance test.
Comment 42 Martin Oberhuber CLA 2007-04-27 09:22:47 EDT
(In reply to comment #41)
Legal Message: I, Martin Oberhuber, declare that I developed attached code from
scratch, without referencing any 3rd party materials except material licensed
under the EPL. I am authorized by my employer, Wind River Systems Inc., to make
this contribution under the EPL.

Comment 43 John Arthorne CLA 2007-04-27 09:48:06 EDT
Thanks Martin. Targetting to review and release for M7 unless I find problems. Also marking this as a "greatbug" because of comment #40 and comment #41, which is an exemplary example of a contribution, including detailed documentation and test cases.
Comment 44 John Arthorne CLA 2007-04-27 17:48:12 EDT
Some questions:

1) Why are the parentPath and childPath both inserted into pathPrefixHistory on each iteration? Won't we eventually find all loops if we only insert the childPath? I must be missing something, but this seems like we have double the necessary number of calls to File.getCanonicalPath.

2) suppressLinkedDuplicates field is always false. Can we just remove this field and leave in the TODO comment about potential future changes?

3) My regex understanding is probably rusty, but the pattern "\\.[./]*" seems incorrect.  To me this reads "dot, followed by any sequence of either characters or front slashes".   Won't this match a folder name that starts with a dot, such as ".metadata"? Is this intended to be parent references (../), or current references (./) ?

4) In initLinkHistoriesIfNeeded, it adds both the root resource's location, and the location of the root resource's project. I think this doesn't account for linked resources in the workspace. Example:

 - Project /P1 has location /home/johna/p1
 - Linked folder /P1/Folder has location /home/johna/folder
 - Child folder /P1/Folder/SymLink is a symbolic link /home/johna/folder/SymLink to /home/johna/p1.

-> This isn't a cyclic symbolic link chain, since the symlink does not point to a parent of the symlink.

I think rather than adding the project location, this code should add the location of the nearest linked resource, or the project location if not nested within a linked resource.  If you agree, I can fix this up because I have code elsewhere the computes the "file system root" for a given resource that can be used here. Or even simpler, it might just work with adding the root resource's location, and eventually via traversal we will catch any loops that lead back to the root resource.
Comment 45 John Arthorne CLA 2007-04-27 17:50:32 EDT
By the way, despite my comments I am optimistic we can solve this in M7. I agree this has no impact on non-symlink cases, and even if we don't catch all possibilities right away, we'll be better off than the existing behaviour.
Comment 46 Martin Oberhuber CLA 2007-04-27 19:09:40 EDT
(In reply to comment #44)
Let me answer the simple ones first:

> 2) suppressLinkedDuplicates field is always false. Can we just remove this
> field and leave in the TODO comment about potential future changes?

Sure, you are right. Initially, I thought the intention of this were clearer
with the extra field, and would allow for programmtic modification of the behavior. But now I think that suppressing duplicate resources
is typically problematic since not backward compatible so the field should
be removed keeping only a short comment.

> 3) the pattern "\\.[./]*" seems incorrect.
No, this is correct. Inside the character class [] every character stands for
itself and you don't need to quote. [\\./] would also match a String like
"..\../.." which may be interesting when considering symbolic links on Vista,
though so if you feel better I'm absolutely fine with such a change (anticipating that some 3rd party would write an EFS provider that supports symlinks on vista).

Now for the harder ones:

> 1) Why are the parentPath and childPath both inserted into pathPrefixHistory?
> Won't we eventually find all loops if we only insert the childPath?
This is a tricky question, and I'd like to sleep over it :-) My initial concern with this is, that the childPath is currently NOT inserted into the pathPrefixHistory since a link is only recursive if it walks back to where it came from at some point. I think you'd 

 * get more false positives by adding the child path, since when two folders
   a->/link  and b->/link  point to the same resource (which is not recursive
   per se) the second occurrence would necessarily be treated as "recursive"
   and thus suppressed. Right now, this happens much more rarely (only if a
   b1/b2->/link points to a link that is a prefix of a path that has a link
   parent at a higher level).

I guess your thinking is that if it's really a loop, it doesn't matter whether
the loop is detected sooner or later and the loop must come across the same 
place at some time. That's certainly true, but the problem is that we are doing a breadth-first search and not a depth-first search... and I cannot see how we could remember the history of previous links on a resource "tree" (though this would be a good thing).

If we could switch from BFS to DFS when a symbolic link is encountered, and clear the link history when this happens, this would be the best solution (and also fix all the other "strange" issues!!). Can we do that? I think I'm not familiar enough with the queue and how inserting works, but this might be an option...

At any rate, remember that the getCanonicalPath() only happens for symbolic links so it does not impact the non-symlink-case, AND the solution is also better than all previously proposed patches which did getCanonicalPath() for every folder (and not just symlinks). Your suggestion would only be a linear (*2) improvement for the symlink case and I think that's not worth the risk mentioned above.

If we cannot switch from BFS to DFS, I suggest keeping it as-is for now and probably shooting for other improvements in the future. I see room for improvements by trying to do more interpretation on the link target (trying to resolve it internally and do without any getCanonicalPath() at all) in the future, with getCanonicalPath() as a very last resort only later. This would also work better on remote file systems. But it seems harder and less robust so I'd prefer thinking about this later.

> 4) In initLinkHistoriesIfNeeded, it adds both the root resource's location,
> and the location of the root resource's project. I think this doesn't account
> for linked resources in the workspace. Example:
> -> This isn't a cyclic symbolic link chain, since the symlink does not point
>    to a parent of the symlink.

right, and I think this is essentially the BFS versus DFS issue again.
Adding better or more "file system roots" as you suggest is certainly a good think and cannot hurt, I think. I'm afraid though, that because of the BFS
the positive effect is probably smaller than you expect.

Currently, the "spanning filesystem locations" are remembered in separate PrefixPool just in order to ensure that a link pointing to a prefix of a previously found link root is NOT suppressed if it is outside a file system "root". So, adding more file system "roots" would lead to suppressing MORE
symbolic links and thus producing more false positives, I think.

The whole "spanning filesystem location" chack would be unnecessary if we were doing DFS, so let me reiterate again that if we could switch this would be the best solution.

> Or even simpler, it might just work with adding the root resource's
> location, and eventually via traversal we will catch any loops that lead back
> to the root resource.

I think you are right, following my arguments above it is better to not add the project root to the spanning filesystem check. It means probably creating more duplicate resources, but also certainly suppressing less false positives.

Thanks for your thorough review. It's good knowing that 4 eyes are watching this! I left out linked resources since I don't know enough how to deal with them. Your example above (and probably more) should definitely be added to the test cases. Also, I did not test what happens with the DeleteVisitor and CopyVisitor you mentioned in comment #33 since I did not know how to invoke them.
Comment 47 Matt McCutchen CLA 2007-04-27 20:08:09 EDT
This bug doesn't apply to DeleteVisitor because DeleteVisitor deletes symlinks rather than following them; see bug 44106.  However, it might still be good to check that any fix for this bug doesn't break DeleteVisitor.
Comment 48 Martin Oberhuber CLA 2007-04-29 08:49:13 EDT
(In reply to comment #47)
> This bug doesn't apply to DeleteVisitor because DeleteVisitor deletes symlinks
Interesting... how does it know it's a symlink and not just a folder?

Comment 49 Matt McCutchen CLA 2007-04-29 10:54:08 EDT
(In reply to comment #48)
> Interesting... how does it know it's a symlink and not just a folder?

DeleteVisitor calls on the filesystem implementation (in the case of the local filesystem, LocalFile) to do the delete.  LocalFile.internalDelete attempts a nonrecursive java.io.File.delete, which can delete anything except a real nonempty directory.  internalDelete only recurses if that first attempt fails, so it avoids deleting through symlinks.  There's a separate issue when local history is being kept; see bug 174492.

(In reply to comment #46)
> Also, I did not test what happens with the DeleteVisitor and
> CopyVisitor you mentioned in comment #33 since I did not know how to invoke
> them.

You can invoke them simply by deleting and copying folders using the Navigator.
Comment 50 Martin Oberhuber CLA 2007-04-30 01:28:31 EDT
Ah, that's smart... thanks for the info! Does the DeleteVisitor then need to use the UnifiedTree after all? Or does it delete using the FileSystem implementation and then use the UnifiedTree to see whether any resources remain that could not be deleted?
Comment 51 Martin Oberhuber CLA 2007-04-30 04:06:37 EDT
(In reply to comment #46)
> (In reply to comment #44)
> > 1) Why are the parentPath and childPath both inserted into 
> >    pathPrefixHistory? Won't we eventually find all loops if we only
> >    insert the childPath?

Based on your suggestion of reducing the number of canonical path queries, I made some experiments with a modified algorithm that was more optimistic: 
  - follow any symbolic links until getting to a place seen before
  - only check the "seen before" at junctions where links fork off
It turned out that this algorithm worked well for the synthetic cases, but failed for the real world test4_udev test case. 

In the udev case, the link cycle is very long (up to 10 nodes), and while following the cycle the tree grows very fast in breadth because many children fork off. Because of that, detecting the link cycle just one level later leads to exponential growth in the number of nodes -- leading to 30 levels, 84000 nodes and 14000 canonical path queries in 25 seconds execution time on my PC. The current patch refreshes the same tree in < 2 seconds because it detects cycles more eagerly.

What I conclude from this experiment, is that it is essential to detect the link cycle as early as possible, i.e. at the time the symbolic link forks off. This is exactly what the current patch does. It seems, that "eventually get to... again later" is just not good enough.

I think a really good solution to the symbolic link refresh problem requires correctly modeling symbolic links as links in the resource tree. This does not only give better visual feedback to users, but also always suppress duplicate resources during traversal (since clients will be able to see those resources by following the link in the resource model). I'm not sure if explicit new API for symbolic links is required (as requested by bug 44107) or the existing "linked resource" notion could be used.

Anyways, it looks like adding symbolic links to the resource model is out of scope for Eclipse 3.3, so I think the algorithm of the current patch is the best we can do for now. There is room for minor improvements:
 - remember the previous parent path so the canonical path for the parent
   is not computed again unnecessarily;
 - compute the parent canonical path ourselves by looking at the non-symlinked
   path segment between the current node and a previously computed canonical
   path for a parent higher up in the tree
 - defer computation of symlink children until all non-symlinks are computed
But none of these would change the structure of the algorithm, or change its efficiency by orders of magnitude.

> > 4) In initLinkHistoriesIfNeeded, it adds both the root resource's location,
> > and the location of the root resource's project. I think this doesn't 
> > account for linked resources in the workspace.

Please disregard my previous comment about more roots adding more false positives. I think this is not true. The more "spanning file system tree roots" are known upfront, the better - so adding tree roots brought in through linked
resources would be a good thing provided that they are in the same EFS file
system.

But I still think that the positive effect of this would be very small, since the current patch computes the spanning tree root lazily when the first symbolic link points to it. So I'd recommend not adding linked resource roots for now, but keeping the project root (since this is the most common known spanning tree root).

ONE LAST IMPORTANT NOTE: On second thought, I'm afraid that we cannot contribute the test4_udev files under the EPL, since this file system comes from an embedded linux product (most probably GPL but not sure). Please keep this test on bugzilla only for now and do not commit it as automated test case. All other tests are written by myself and contributed under the EPL.
Comment 52 John Arthorne CLA 2007-04-30 10:31:44 EDT
Created attachment 65390 [details]
Minor tweaks, code formatting, etc

This is an entirely cosmetic change from Martin's patch 65205. Just attaching it here so I can test it out on other machines.
Comment 53 John Arthorne CLA 2007-04-30 11:37:17 EDT
In my testing on a slow Linux box, I found that test4 doesn't complete. It just seems to refresh forever.  This is because the job that performs background refresh (RefreshJob), will time-out after two seconds, and switch to only refreshing two levels of the tree at a time. See bug 74392 comment 22 for summary, and UnifiedTree.accept method with int argument.  Since it only refreshes two levels at once, it doesn't catch recursive link structures that span more than two levels.  It looks like solving this will require persisting the prefix history in RefreshJob until the refresh request queue is empty. I'm going to release the current patch anyway because this is no worse than the previous behaviour.
Comment 54 John Arthorne CLA 2007-04-30 11:58:17 EDT
Patch released for 1pm EDT integration build. I'll leave this open for now to sort out remaining details.
Comment 55 Martin Oberhuber CLA 2007-04-30 14:34:16 EDT
(In reply to comment #53)
> In my testing on a slow Linux box, I found that test4 doesn't complete.

I guess the simplest fix for this would be to set suppressLinkedDuplicates = true. Perhaps this should be a UI preference after all. Code for a UI Preference is still attached to comment #21, I think this could be re-used.

The other option would be if there were no Preference, but isRecursiveLink would always act as if suppressLinkedDuplicates were true; but the suppressed links would be put into a queue for later checking after all other resources from the queue are handled (at that time we can see more easily if they are recursive).

Or, always act as if suppressLinkedDuplicates were true, but inside isRecursiveLink register the link pointing to a duplicate resource as a linked resouces. This would be the best solution from a user's perspective - but I don't understand the nature and lifecycle of linked resources enough to implement it...
Comment 56 John Arthorne CLA 2007-04-30 15:38:11 EDT
> I guess the simplest fix for this would be to set suppressLinkedDuplicates = 
> true.

Unless I'm missing something, I don't think suppressLinkedDuplicates is relevant here.  The problem is that certain calls to UnifiedTree.accept only perform refresh of a fixed number of levels in the tree.  If the cycle is not detected within that span, then it will go undetected because we throw away the path/prefix history across invocations of the accept method.
Comment 57 Martin Oberhuber CLA 2007-05-01 00:34:39 EDT
(In reply to comment #56)
> I don't think suppressLinkedDuplicates is relevant here.

It is relevant. Becuase if it is set, any symbolic link that points to a resource inside the same project is suppressed. So you need much less contextual information - knowing the "spanning file system roots" is sufficient, and these are set before the rest of the algorithm runs. Even when looking at a SINGLE level, all symbolic links that remin in the same project (which is all of them in the test4_udev case) will be suppressed.

Only in the test cases that produce symbolic link loops by pointing to resource trees outside the project, it will take more than 1 level to detect the loop.
Comment 58 Martin Oberhuber CLA 2007-05-01 03:57:07 EDT
Excuse me if this is ignorant, but what if symbolic links were treated as just another form of persistence for linked Resources?

Linked resources have the important property that they mask the underlying file system resource, so this would shield us from potentially recursive symbolic links. Much code deals with linked resources already in exactly the manner needed for symbolic links.

Because they look the same for the user, it should be possible to keep the changes needed all inside the resources plugin:
  - a boolean bit with each linked resource indicating whether it has
    persistence in .project or is backed by the file system (as symlink)
  - Read&write of .project file to ignore the links backed by the file system
  - Different handling of delete for links backed by the file system
  - UnifiedTree to create/update links backed by the file system

This would mean no API change, correct visual feedback for symbolic links, simple handling, no more problems with any recursive links ever, correct
representation of the file system even if links introduce duplicates.

Or am I missing any important obstacles?
Comment 59 John Arthorne CLA 2007-05-02 15:37:08 EDT
Re comment #58: It's an interesting thought exercise, but linked resources are fairly different from sym-links. Linked resources are a pointer from the resource tree to the file system tree - it spans across two different "layers". Thus, there is no such thing as a cycle introduced by a linked resource - there is no link from the file system back up to the resource tree in order to incur a cycle. A given linked resource is only associated with one file system location. A resource corresponding to a symlink has two applicable resources - the location of the link, and the location of the target of the link. It's not clear how this would be treated in the API, for example, what the return value of IResource.getLocation() would be.

Also,  I don't see how the property of linked resources masking filesystem resources would help here.  A linked resource represented as a symbolic link could not mask the symbolic link. If the symbolic link was modified to point to a different file, we would have to update the linked resource as well.
Comment 60 John Arthorne CLA 2007-05-02 18:15:12 EDT
Marking fixed for M7.  I have opened bug 185247 for adding JUnit tests to cover the scenarios described in comment #40.
Comment 61 Martin Oberhuber CLA 2007-05-03 10:48:19 EDT
Created attachment 65775 [details]
Patch adding performance counters (development-time only)

I'm still slightly concerned about the problems on a slow Linux box that you described in comment #53. Could these have been due to the unexpectedly missing symlink-enabled library for x86_64 (bug 184433 comment 19)?

I tested with some performance counters (attached) and got the following results for the test4_complex_udev testcase:

* #nodes in file system according to "find": 440 (13 levels)
* #nodes found by RefreshLocalVisitor: 664 (14 levels) in 391 msec

it's hard to imagine that your machine is 5 times slower than mine, going from 391 msec to 2 seconds thus hitting the timeout. FYI, regarding the previous discussion about suppressLinkedDuplicates - when suppressLinkedDuplicates is enabled, the numbers are

* #nodes found with suppressLinkedDuplicates: 440 (13 levels) in 132 msec

so in this case, not a single symbolic link is followed, giving exactly the number of nodes found by "find" and being 3 times faster; because getCanonicalPath() is called only 74 times instead of 184 times.
Comment 62 Martin Oberhuber CLA 2007-05-03 10:52:21 EDT
Created attachment 65778 [details]
Patch adding a UI Preference for suppressLinkedDuplicates

Attached patch adds a UI Preference for suppressing duplicate resources introduced by symbolic links in all cases, if anybody is interested. This is derived from John's patch in attachment 48580 [details], so I have not updated the Copyright headers.
Comment 63 John Arthorne CLA 2007-05-03 10:57:13 EDT
I'm quite sure I had the correct library, because all of the other test cases worked fine. I'll try with this instrumented version and see what the times look like.
Comment 64 Martin Oberhuber CLA 2007-05-04 05:51:16 EDT
Verified the fix with I20070503-1400 on Linux x86_64 and Solaris 10 gtk.

Created new bug 185509 for following up with additional improvements, for addressing the issue found in comment #53 on a slow Linux box.

I did not carry over the (long) CC list to the new bug, so anybody interested in the follow-up please put yourself on CC of the new bug yourself.
Comment 65 Martin Oberhuber CLA 2007-05-16 13:19:16 EDT
The folk watching this bug may be interested to hear that another issue with cyclic symbolic links has been found in the "Import Existing Project" wizard. See bug #187318, put yourself on CC and vote for it if you are interested.

A patch has been provided for that issue, let's see if it gets accepted.
Comment 66 Markus Schorn CLA 2008-05-28 09:49:27 EDT
*** Bug 215047 has been marked as a duplicate of this bug. ***
Comment 67 Andrey Loskutov CLA 2018-07-30 09:03:30 EDT
(In reply to Martin Oberhuber from comment #41)

> Because this history is not stored after traversal is done, and because
> traversal is done in a breadth-first-search manner, the fix has two
> "strange" effects:
> 
> 1.) In case of mutually recursive a symbolic links, refreshing on level 0 has
>     a different result than refreshing on lower levels - lower levels reveal
>     one level more of the recursive link loop, while refreshing on level 0
>     reveals only the first one. This can be shown by refreshing on various 
>     levels in project test1.
> 
> 2.) There is one specific case where even non-recursive symbolic links show
>     fewer duplicate resources now, than it was before the fix. Look at test6
>     from the attachment to comment #40 for details.
> 
> Both these "strange" results appear in very rare scenarios, so I think these
> are acceptable taking the benefit of the fix into account.

Unfortunately we run into this rare scenario in bug 537449 :-(
It is really hard to explain customer *without* recursive links in file system that Eclipse does not show random directories just because customer makes use of links.

> Fixing them would
> require a lot more effort, most probably including things like keeping track
> of previous locations while following links and keeping this history in sync
> with workspace changes; resource tree lookups during tree traversal;
> deferring symbolic link evaluation in the BFS queue until all
> non-symbolic-links are evaluated.

We plan to do *something* in context of bug 537449, and especially we plan to finally add known test cases via bug 185247.