Some Eclipse Foundation services are deprecated, or will be soon. Please ensure you've read this important communication.
Bug 271508 - [repository] CompositeRepo failures can cause bad artifact results
Summary: [repository] CompositeRepo failures can cause bad artifact results
Status: RESOLVED FIXED
Alias: None
Product: Equinox
Classification: Eclipse Project
Component: p2 (show other bugs)
Version: unspecified   Edit
Hardware: PC Windows XP
: P3 critical (vote)
Target Milestone: 3.5 M7   Edit
Assignee: Matthew Piggott CLA
QA Contact:
URL:
Whiteboard:
Keywords:
Depends on:
Blocks:
 
Reported: 2009-04-07 14:40 EDT by Andrew Niefer CLA
Modified: 2009-04-13 17:15 EDT (History)
4 users (show)

See Also:


Attachments
CompositeArtifact.patch (9.85 KB, patch)
2009-04-07 17:19 EDT, Matthew Piggott CLA
no flags Details | Diff
CompositeArtifact.patch (21.20 KB, patch)
2009-04-08 11:29 EDT, Matthew Piggott CLA
no flags Details | Diff
CompositeArtifact.patch (21.42 KB, patch)
2009-04-08 11:37 EDT, Matthew Piggott CLA
no flags Details | Diff
updated for cancel (21.87 KB, patch)
2009-04-09 13:44 EDT, Andrew Niefer CLA
no flags Details | Diff
Refactoring of the retry logic in one method (11.24 KB, patch)
2009-04-10 13:55 EDT, Pascal Rapicault CLA
no flags Details | Diff
CompositeArtifactRepositoryPatch (24.37 KB, patch)
2009-04-13 16:52 EDT, Matthew Piggott CLA
no flags Details | Diff
CompositeArtifactReopsitory patch (24.39 KB, patch)
2009-04-13 17:02 EDT, Matthew Piggott CLA
pascal: iplog+
Details | Diff

Note You need to log in before you can comment on or make changes to this bug.
Description Andrew Niefer CLA 2009-04-07 14:40:06 EDT
The CompositeArtifactRepository will iterate over its children to get artifacts.
If there is a problem getting from one child, we try again with the next.  There are problems recovering from errors and we end up writing to the same stream over and over.  

(In the case of many children, if the first repo doesn't contain the artifact but the rest do, we end up with N-1 copies of the artifact being written to the stream).

CompositeArtifactRepository.getArtifact & getRawArtifact

1) We should check that the child repo actually contains the requested descriptor before attempting to download from that child

2) If the child contains the artifact, but download fails part way through, we should mark the child bad and return a retry code. (The same as we do for mirrors).  We can't just try the next child because we've already written stuff to the stream.

3) Each child status is added to the multistatus and the multistatus isOK is checked.  If the first child failed, then the multistatus always returns false.  (This can cause the artifact to be downloaded and written to the stream multiple times).  This probably goes away once (1) and (2) are fixed.
Comment 1 Matthew Piggott CLA 2009-04-07 17:19:34 EDT
Created attachment 131184 [details]
CompositeArtifact.patch

I'm fairly sure this will fix the problem, I'm working on some test cases to verify.
Comment 2 Andrew Niefer CLA 2009-04-07 17:44:57 EDT
Minor comment about the patch, I don't believe the local variable "i" is necessary, you could just check childIterator.hasNext().

I'm not sure if the retry message "Retry another mirror" is quite correct since we aren't talking about mirrors but just children.

Henrik, do you have any comments?
Comment 3 Henrik Lindberg CLA 2009-04-08 10:26:05 EDT
(In reply to comment #2) 
> Henrik, do you have any comments?
> 
Have not looked at the patch yet (only have
Limited Internet access right now. Looked at current code and it
needs to be reviewed for exceptions (op-cancel) etc. CompositeRepo should not use
the output stream it gets directly, it needs to give the transport a temp stream, since it may write something and then fail - don't know if the patch does this. 

I can work on this on Monday (I am travelling until then).
Comment 4 Matthew Piggott CLA 2009-04-08 11:29:55 EDT
Created attachment 131303 [details]
CompositeArtifact.patch

I've updated the patch with Andrew's suggestions. I've also included two tests for proper behaviour.  (The first to ensure a retry request returned, the second that the stream is closed before retrying)

The CompositeArtifactRepository doesn't touch the OutputStream, merely passes it to a child repository. 

Did this problem occur using an older revision of the mirroring application?  If not, I don't know why the OutputStream wouldn't be closed before a retry, the MirrorRequest should always close the stream and request a new one first.
Comment 5 Kim Moir CLA 2009-04-08 11:35:53 EDT
I was using the mirroring application in the bundles that were included in last week's i-build. (I20090331-0901)
Comment 6 Matthew Piggott CLA 2009-04-08 11:37:24 EDT
Created attachment 131304 [details]
CompositeArtifact.patch

Added to one of the tests, of course the idea occurred moments after hitting the last submit!
Comment 7 Andrew Niefer CLA 2009-04-09 13:44:17 EDT
Created attachment 131422 [details]
updated for cancel

Attached patch adds some handling for cancel. (Return the cancel, don't keep trying and marking repos as bad).

RE comment #3, as Matthew said. In the case of failure we return retry. We expect the caller to close the stream and try again with a new one.  We were using this strategy to avoid using a temporary stream.

 For exceptions, we are just delegating to the children repositories, they are expected to handle exceptions so we just deal with status objects.
Comment 8 Pascal Rapicault CLA 2009-04-10 13:55:09 EDT
Created attachment 131519 [details]
Refactoring of the retry logic in one method

I have merged the getRawArtifact and getArtifact methods to share the retry logic.
Now the problem I have with the code is that none of the other method of the class honor the "good" flag. I think this should be done. Ideally creating an iterator that is aware would be cleaner.
Also I would make good be a getter.
Reassigning back to Matt
Comment 9 Matthew Piggott CLA 2009-04-13 16:52:48 EDT
Created attachment 131711 [details]
CompositeArtifactRepositoryPatch

I believe I've added checks for isGood() to all locations where it would be useful.

For the moment I haven't added a check to the getArtifacts() method, we might mark a child as bad if its missing a file, but that isn't to say all other files might be valid.  In any case, if an ArtifactRequest returns a failure it will retry other children (if another has the artifact).
Comment 10 Matthew Piggott CLA 2009-04-13 17:02:09 EDT
Created attachment 131712 [details]
CompositeArtifactReopsitory patch

Minor fix
Comment 11 Pascal Rapicault CLA 2009-04-13 17:14:36 EDT
Fix released in HEAD.