Community
Participate
Working Groups
Created attachment 133849 [details] patch that remembers status code + test Currently, the unavailableRepos in AbstractRepositoryManager does not remember the reason for repo being unavailable. This means that remembering AUTH failure will be reported as a NOT FOUND while the repo is in the unavailable state. This is confusing to a user. The current solution is to not make repos with AUTH failure unavailable. This has the drawback that a user that enters the wrong password, will be prompted ad naseum for the password. I have made an implementation where the code for managing the list is broken out to a separate class, and this class supports associating an integer code with the object (i.e. the location). It is then possible to remember either NOT FOUND, or AUTH failed (or any such code in the future). I also made the changes to the abstract repository manager to make use of the new class as well as handling the AUTH fail correctly. A small change was also needed to ensure that remove repository always clears the status (the new Auth test will fail on second run otherwise). I can release this, but wants to check that it looks ok first. (esp the change of doing the removal of remembered unavailability in the removeRepo method earlier).
The patch looks ok to me. One small thing I don't like is this coding pattern: try { int code = unavailableRepositories.getStatus(location); fail(location, code); } catch (NoSuchElementException e) { // ignore } I.e., the exception is thrown on the "success" case, and failure occurs only if no exception is thrown. I think it would be cleaner for getStatus() to return IStatus.OK when no status is known, and then have: if (code != IStatus.OK) fail(location, code); This just seems easier to read to me. My preference would be to wait until early RC1 to release this, so we have more time for testing it. The current state is annoying but not the end of world (considering next milestone is only two weeks away anyway).
(In reply to comment #1) > The patch looks ok to me. One small thing I don't like is this coding pattern: > > try { > int code = unavailableRepositories.getStatus(location); > fail(location, code); > } catch (NoSuchElementException e) { > // ignore > } > > I.e., the exception is thrown on the "success" case, and failure occurs only if > no exception is thrown. I think it would be cleaner for getStatus() to return > IStatus.OK when no status is known, and then have: > > if (code != IStatus.OK) > fail(location, code); > > This just seems easier to read to me. > > My preference would be to wait until early RC1 to release this, so we have more > time for testing it. The current state is annoying but not the end of world > (considering next milestone is only two weeks away anyway). > ok, on when to release. The coding style is because the getStatus throws the exception if there is no code - it is not aware of the meaning of any codes. It is just an int. Could change the API to "can remember any code but -1" or something. I.e. the lack of a code is the success in this case... Or, indeed store an IStatus instead of an int - that is perhaps better, and then IStatus.OK can be returned on missing elements. I can change it to that, and then release it early RC1.
Yes, I was just suggesting that a return value of 0 or -1 be used indicate no status is recorded. However, holding onto a status object works too, and is perhaps a bit clearer than just recording an int.
(In reply to comment #3) > Yes, I was just suggesting that a return value of 0 or -1 be used indicate no > status is recorded. However, holding onto a status object works too, and is > perhaps a bit clearer than just recording an int. > And it makes it possible to keep the full details of the original problem.
Created attachment 134051 [details] Alternate impl that uses IStatus instead of an int This is an alternative implementation that instead of an int, remembers an IStatus. It s slightly more complicated, but perhaps more useful in the future. The alternative (as suggested earlier) is to use one value (like -1) to mean "not remembered".
If this patch is ready, please ask for a review
The alternate impl is ready for review. If that impl is deemed too complex, there is a small change required to the int based version as it needs to reserve one int (-1, or 0) to mean "no status code remembered".
Created attachment 134668 [details] Simpler status remembering implementation Now that you're storing IStatus rather than int, you don't need all that complexity of entry objects, etc. This new patch just uses a map of (Object->IStatus) to remember the status objects. I also switched to a thread-safe collection to handle concurrent access.
After more testing, I'm not entirely happy with the new behaviour. In particular this case: 1) Open Install new software dialog 2) Add a new site that requires authentication 3) Enter the wrong password three times until the failure dialog 4) Close Install new software dialog 5) Reopen dialog -> An error dialog pops up immediately with the authentication failure. This error dialog now appears every time I open the install software wizard. Perhaps the background loading job should ignore authentication failure and not report it as an error (since password prompting is behind suppressed).
Seems like we want "open install new software dialog" to clear the remembered status when it closes. If you waited a while, it would have gone away anyway... Isn't the purpose to not plague the user with repeated errors, and trying to connect multiple times during "a run". If authentication is simply ignored - the user may be confused what is really going on. It would mean that things just don't show up...
I'm no longer sure why the old behaviour is wrong. If I enter my password incorrectly, and then later come back and remember my password, I want that password prompt to appear. If we remember auth failure, we will never prompt for the password again. Henrik, can you give steps for the case where you think the current behaviour is bad?
John asked me to comment on this one. I haven't looked at the patch. All I did was try the behavior in the current build (without the patch) to see what the problem is and whether we should fix. My take is that the current behavior is ok. here's what I tried. 1 - added http://localhost:8080/never to my list of sites in the preferences 2 - Help->Install New Software 3 - no prompt (good) 4 - if I select All Available Sites or the specific site, I get a prompt. I can cancel (good) or I can fail after three times (as in 3.4). 5 - I can go to other sites, etc. If I pick the site again, prompted again (seems expected to me, otherwise we will need to make sure the status in the list explains why the list is empty.) 6 - If I have [x] Contact all update sites during install to find required software, then I get the login prompt again while resolving. Surprising. To me, it's an "in your face" version of the question, "why are you talking to every site when I chose software from only one?" 7 - If I cancel, I can't move to the next page. (That is a bit odd, I would have expected cancel to cancel the authentication but not necessarily propagate up to the parent task.) 8 - If I don't cancel, and just fail authentication, it's possible to see the prompts again during the sizing phase. I had a lot of sites selected, so it took quite some time for this to happen. Bottom line: - The prompting during sizing is annoying but I don't know if it's worth fixing with special handling at the manager level. The user can avoid it by disabling the repo or unchecking the [] Contact all update sites checkbox. And actually, we can suppress it using the same technique used in bug 274097. I opened bug 275229 to discuss that. - In step 7 I would hope cancelling authentication would simply cancel that repo and keep resolving. I opened bug 275227 to look into this. - When we actually go to download artifacts, the wizard is closed, so the prompts would still occur in that case if the user has checked [x] contact all sites.
(In reply to comment #11) > I'm no longer sure why the old behaviour is wrong. If I enter my password > incorrectly, and then later come back and remember my password, I want that > password prompt to appear. If we remember auth failure, we will never prompt > for the password again. Henrik, can you give steps for the case where you think > the current behaviour is bad? > The primary need to remember auth error was when 401, 403, and 407 where lumped together as auth error. Now that we treat 403 and 407 as failure to read, there is no need to remember auth error for the sake or reporting it to the user - an auth error is always about logging in (and not about errors with proxy or forbidden access). The only remaining case is when performing a run and the same repository needs to be consulted multiple times during a run - i.e, if I canceled out of the password dialog once in a run, I don't want to be prompted again. So - I think our options are: a) do not remember auth error at all - it can effect some users in some cases, but may not be worth fixing. b) if auth errors are remembered, they need to be cleaned when the current "run" is finished. Whatever a "run" is. In any case, I still think the change to remember the proper IStatus should be released.
After reading susan's comment. I think we should still remember the IStatus, but not remember failed authentication. That would give us the current behavior wrt. login prompts, but would fix the issue of loosing error information.
I think we should defer this one to 3.6. I'm not seeing a big problem in the current behaviour that would warrant fixing at this late stage in 3.5. Although remembering the cause of the failure makes sense, and I think the current patch achieves this, there may be other side-effects of this change that we would need to work through. We've got enough other bugs to focus on right now. Note also Susan entered some bugs to investigate specific problems with excessive authentication prompting that should improve the 3.5 behaviour.
(In reply to comment #15) > I think we should defer this one to 3.6. I'm not seeing a big problem in the > current behaviour that would warrant fixing at this late stage in 3.5. Although > remembering the cause of the failure makes sense, and I think the current patch > achieves this, there may be other side-effects of this change that we would > need to work through. We've got enough other bugs to focus on right now. Note > also Susan entered some bugs to investigate specific problems with excessive > authentication prompting that should improve the 3.5 behaviour. > I don't agree, it means that operations give unpredictable result - the details are lost the second time the same thing is loaded. i.e. calling "loadRepository" for the same location twice does not return a useful status the second time. (Easily demonstrated). The only thing that should be done is to remove remembering the auth errors as this preserves the current behavior.
I'm not saying this isn't the right answer, I'm just saying it isn't the right time for a change like this in my opinion. Our focus right now is on serious defects and highly visible polish bugs, and having a more accurate reason for connection failure being surfaced on subsequent access to a location doesn't seem worth the risking of changing it right now. Pascal or Susan, if you feel otherwise please chime in.
Created attachment 134834 [details] Simplified version that does not remember auth error - includes test The new patch is what I think we should include in RC1.
(In reply to comment #17) > Pascal or Susan, if you feel > otherwise please chime in. I don't feel otherwise but felt guilty ignoring the comments. So I am chiming in anyway. This is a tough one. I didn't carefully review the latest patch, just looked at the amount of code and decided the reward has to be high for the amount of code changing. This feels like one of those implementation-driven bugs (we know we are losing detail!) where we need to step back and assess the actual cost of not fixing. Correct me if I'm wrong, but if we do a "Test Connection" (which is a refresh underneath), won't we always get the detailed error? If so, then I think we can live with it as is, because we will always have a way in bug reports and in the field to get the detailed error when we are ready to diagnose it. We also should have the original error in the log.
The subtle change here is that currently we always return a REPOSITORY_NOT_FOUND status on subsequent attempts. We have code in the UI and various other places that treat REPOSITORY_NOT_FOUND differently from other errors. For example we often silently skip NOT_FOUND repositories to behave nicely in offline scenarios, but may abort or surface the error in other cases. Thus our current code has the effect of reporting the true error the first time, but muffling/silencing the error on subsequent access. Maybe our current behaviour is wrong in some of these cases, but it's possible the fix will introduce places where we are suddenly doing excessive error prompting or logging (such as the new authentication prompting in comment #9). For this reason I don't think it's a low risk change, and the benefit doesn't seem great (the user is told the repository is not available, when in fact it is available but malformed, for example).
(In reply to comment #20) > For this > reason I don't think it's a low risk change, and the benefit doesn't seem great > (the user is told the repository is not available, when in fact it is available > but malformed, for example). > Then, if the user waits a while, it is malformed again, then not found, etc. Those that use the provisional API will also need to know that they must call refresh before doing a load if they want to make certain they get the real status back, I can agree on the risk that things may happen elsewhere. But surely, isn't there time to address such problems if they were to occur? I guess I place a higher cost on not fixing - I am concerned about the user's perception of overall p2 quality when getting "an error has occurred" type of messages. I find them embarrassing. I would be more comfortable if the returned/remembered status said something else than just "not found" - like "Repository not contacted because of earlier errors. See log, or use Test Connections to find out more". Then it is just an API issue to defer to 3.6.
I don't see this happening in 3.6 - the issues at this level in the transport stack are really more about optimizing throughput - not getting stuck on a slow mirror / bad server because of short outages on a much closer/better server. Spending time on this very small improvement does not seem worth while. (And I have not seen complaints about the bad behaviour that the issue being discussed here can cause). Taking it off 3.6, but keeping it as a reminder to when someone tries to improve downloading throughput.
This bug hasn't had any activity in quite some time. Maybe the problem got resolved, was a duplicate of something else, or became less pressing for some reason - or maybe it's still relevant but just hasn't been looked at yet. If you have further information on the current state of the bug, please add it. The information can be, for example, that the problem still occurs, that you still want the feature, that more information is needed, or that the bug is (for whatever reason) no longer relevant. -- The automated Eclipse Genie.