| Summary: | [transport][ui] Details error message not shown in the UI | ||||||||||
|---|---|---|---|---|---|---|---|---|---|---|---|
| Product: | [Eclipse Project] Equinox | Reporter: | Pascal Rapicault <pascal> | ||||||||
| Component: | p2 | Assignee: | Henrik Lindberg <henrik.lindberg> | ||||||||
| Status: | RESOLVED FIXED | QA Contact: | |||||||||
| Severity: | normal | ||||||||||
| Priority: | P3 | CC: | john.arthorne, matthew, susan | ||||||||
| Version: | 3.5 | ||||||||||
| Target Milestone: | 3.5 M7 | ||||||||||
| Hardware: | PC | ||||||||||
| OS: | Mac OS X - Carbon (unsup.) | ||||||||||
| Whiteboard: | |||||||||||
| Bug Depends on: | 272604 | ||||||||||
| Bug Blocks: | 267887 | ||||||||||
| Attachments: |
|
||||||||||
|
Description
Pascal Rapicault
Created attachment 131341 [details] Surface Error Messages This should surface more specific error messages for transport errors from SimpleArtifactRepositories and CompositeArtifactRepositories. I've also made a slight change to the RepositoryStatus class to change text associated with UnknownHost errors, previously the error was: Unknown Host: http://somehost.com/artifacts.xml This is changed to simply: Unknown Host: somehost.com The display of the message is still a bit wonky, the Status output is duplicated (bug 267887) and the text of the Exception appears. So for an Unknown Host the dialog is: Unknown Host: somehost.com Expanded Details: Unknown Host: somehost.com Unknown Host: somehost.com somehost.com fyi, I'm using bug 267887 to track any UI changes needed. For the most part I am just using the status manager to report any statuses or exceptions returned from underneath but it's possible I'm building statuses with redundant messages... Actually, I looked at the duplicated messages yesterday as well, and it appeared to be the way the StatusHandler displays errors. I believe the status message is used for the text, and the details section has an entry for the status message and the exception.getCause(). I noticed that if the Status were to have an INFO or WARNING severity then a MessageDialog would be used instead which lacks a Details button/section. Would it make more sense to display test connection errors in a MessageDialog.openError()? The status would presumably not be in the Error log, but is it useful to have test connection failures in the log? Matthew, could you please update the patch, there has been some changes that makes it difficult to apply. Henrik, please don't commit the UI-part of the patch, I'd like to take a look at that part separately once we've decided what error codes are surfaced by core. The patch as is will suppress any future reporting of problems with the problem sites and while we do want to do that for not-found, I'm not sure we want to do it for the other cases. There is another way to fix the problem on the UI side. Also, I'd like to understand the difference between REPOSITORY_NOT_FOUND and REPOSITORY_INVALID_LOCATION and decide how to treat those. Once you've validated that the core-side stuff is working as expected, can you attach the remaining UI part of the patch to bug 267887 and I will modify as needed...this also relates to UI bug 269507. For certain errors I'd like to offer the user the chance to fix a bogus URL or delete a bad site, etc...but I need to understand what the codes mean. (In reply to comment #5) > Henrik, please don't commit the UI-part of the patch, I'd like to take a look > at that part separately once we've decided what error codes are surfaced by > core. I think the UI should be prepared to show meaningful info for all of the specified status codes. Right now, not all of the codes are used, and in some cases the preferred code can't be used as the caller does not handle the status correctly. (I think REPOSITORY_NOT_FOUND is overused). > The patch as is will suppress any future reporting of problems with the problem > sites and while we do want to do that for not-found, I'm not sure we want to do > it for the other cases. There is another way to fix the problem on the UI > side. > I am not going to commit the UI code. I am also not certain about several other things suggested in the patch. > Also, I'd like to understand the difference between REPOSITORY_NOT_FOUND and > REPOSITORY_INVALID_LOCATION and decide how to treat those. > Me too :) We now have: REPOSITORY_INVALID_LOCATION REPOSITORY_NOT_FOUND REPOSITORY_FAILED_READ REPOSITORY_UNKNOWN_TYPE REPOSITORY_FAILED_AUTHENTICATION as well as these that deal with creation and writing REPOSITORY_EXISTS INTERNAL_ERROR REPOSITORY_FAILED_WRITE REPOSITORY_READ_ONLY and the general INTERNAL_ERROR There are two issues - what the codes actually mean, and layering/interpretation. As I see it, the cases are: a) The address itself is invalid (not a valid URI, or an un-supported URI form (like relative)). This is caused by a typo - user should edit. b) There is no handler for the scheme (user typed in a CVS "pserver" URI). User either made a typo, or there is something wrong with the ECF configuration - impossible to say which without checking scheme against a number of non supported schemes. c) Connection errors (host unknown, wrong port, not connected at all) - can not determine cause, either a typo, network error, or problems at the host end d) Authentication errors - either wrong address, or user entered wrong credentials (or has none) e) It is not possible to determine if the address is a reference to a repository because of errors when trying to get the files that tells what sort of repository it is (for example: request for content.jar times out so we don't know if it is a 404-not found, a host problem (say 500), or a network issue). f) The (marker) file is not found (we get a 404) g) The (marker) file is found, but can not be completely read (network, io, authentication issues) h) The repository (marker) file was found, but there is no factory for the type (or version of that type). The configuration is wrong - a factory for the type should either have been in the configuration from the start, or installed from a "built in" repository type using InstallHandlers. i) A marker file is found, and can be read, there is a factory, but there are syntax/semantic/versioning errors in the read meta data j) The repository was successfully loaded. Then consider layering: - The lowest level can only tell if the (marker) file could be read, and if not, why it could not be read. It should not set codes that have meaning higher up (like retry something else) - the lowest level simply can not know, - A repository factory can determine if i) is a loadable instance ii) not possible to determine if it is a loadable instance or iii) (almost) certain that it is not a loadable instance or iv) a loadable instance with errors - the level above factories needs to decide based on the result from each factory in turn if it should try another factory. The factory itself should not decide this. - A composite factory sort of acts like the level above factories, but should return as if it was a single factory (as there is a level above). I think it is not strange, that "REPOSITORY_NOT_FOUND" is overused in the code.... Anyway - matching the scenarios with error codes... (*= codes we do not have) a) INVALID_LOCATION b) * NO_HANDLER_FOR_SCHEME c) * CAN_NOT_CONNECT d) FAILED_AUTHENTICATION e) * COMMUNICATION_ERROR f) NOT_FOUND g) FAILED_READ h) UNKNOWN_TYPE i) * SYNTAX_ERROR j) OK (c and e, could be combined - from a users perspective the difference is probably not meaningful). If we don't add the "missing codes" the mapping is a bit difficult: a-b) INVALID_LOCATION d) FAILED_AUTHENTICATION c, e, g, i) FAILED_READ f) NOT_FOUND h) UNKNOWN_TYPE j) OK A repo factory could return OK FAIL - it is this type of repo, there was something wrong when loading it (can not be used) WARNING - it is this type of repo, there were warnings when loading it (can be used) NOT_LOADED - it was not this type of repo I think these codes should be combinable with the lower level codes Search for a factory is done when !NOT_LOADED is returned. When a NOT_LOADED is returned, a check of the detail code determines if it is meaningful to continue the search (i.e. not meaningful to continue after connection errors and authentication errors. The overall search returns: OK FAIL WARNING NOT_LOADED with suitable detail code, and possibly with detailed exception. > Once you've validated that the core-side stuff is working as expected, can you > attach the remaining UI part of the patch to bug 267887 and I will modify as > needed...this also relates to UI bug 269507. > > For certain errors I'd like to offer the user the chance to fix a bogus URL or > delete a bad site, etc...but I need to understand what the codes mean. > So do I :) It seems like what we are getting at in this bug is that there are all these specific things that can go wrong underneath (different exceptions, bad returns, timeouts, etc.), and then we are mapping these to some generic codes, and then the UI is deciding what to do with those codes and what the user might do about it. Three layers. So I completely agree with what you say about layering and intepretation. In order to shed some light on which codes might be meaningful, here are the kinds of errors that would cause us to do different things in the UI. I think I'm basically restating what you said above, but by putting it in terms of the user corrections, I'm hoping it helps. The UI is only going to check codes based on what it can suggest/do, whereas the underlying suggestion could provide much more detail to a support person. I'm not suggesting these are the actual codes, but here is the ideal way that the UI would classify errors: - NETWORK_IS_DEAD. If there is a REPOSITORY_NOT_FOUND because the network is dead, the possible user correction (use only local sites, disable networked sites for now, etc.) is different than if the URI was wrong. - BOGUS HOST. There is a network, but the host is not there. This is useful to know because it could just be a typo. This has nothing to do with whether a particular marker was there, the host itself is not there and the user should try something different. - REPOSITORY_NOT_FOUND - the host is there, but there was not a repository found. At this point, the user doesn't really care which marker was gone, which factory had a problem. If the host is there but the file names are not present, that is probably the true case of "repo not found." - BOGUS REPOSITORY - a repository was found but there was some problem with it. From the user's point of view, this is pretty close to REPOSITORY_NOT_FOUND, but it's helpful to know that there was *supposed* to be one there and that it's not a typo. The message could tell them to contact the provider and provide the detailed spew. (In reply to comment #7) > - REPOSITORY_NOT_FOUND > - the host is there, but there was not a repository found. At this point, the > user doesn't really care which marker was gone, which factory had a problem. > If the host is there but the file names are not present, that is probably the > true case of "repo not found." Yes, this is the "trivial case". (i.e positive proof that location is not referencing a repository), > - BOGUS REPOSITORY > - a repository was found but there was some problem with it. From the user's > point of view, this is pretty close to REPOSITORY_NOT_FOUND, but it's helpful > to know that there was *supposed* to be one there and that it's not a typo. > The message could tell them to contact the provider and provide the detailed > spew. > This is also trivial (i.e. positive proof - there is something there, but it is somehow "broken"). The rest are all some sort of "error" from invalid location syntax to not having an internet connection. It is useful for the user to know where they problem may be: - what user entered (the address, username, password, etc) - local environment (is the ethernet cable plugged in) - "the network" - can I talk to this host - "the remote host" - is it responding > - NETWORK_IS_DEAD. > If there is a REPOSITORY_NOT_FOUND because the network is dead, the possible > user correction (use only local sites, disable networked sites for now, etc.) > is different than if the URI was wrong. > - BOGUS HOST. > There is a network, but the host is not there. This is useful to know because > it could just be a typo. This has nothing to do with whether a particular > marker was there, the host itself is not there and the user should try > something different. Unfortunately, it is difficult to differentiate - I think that the codes I proposed would give the user a clue to what is going on. The typical message to the user would be "Could not load/use repository because...." (as opposed to "repository not found") and then one of "authentication issues", "connection issues", "errors communicating with the server"... I believe we are saying the same thing. Created attachment 132091 [details]
Surface Errors from Artifact Repositories
> I believe we are saying the same thing. > Yep...so are you going to take a stab at this? I could add the UI handling after the fact per bug 267887. One thing to consider is that the repo manager currently has special handling for REPOSITORY_NOT_FOUND (rememberNotFound...), so that it doesn't constantly try to communicate with a missing repo. For which error codes would this happen... I think this is why so many errors underneath currently get mapped to NOT_FOUND at the repo manager level... Created attachment 132094 [details]
Surface Errors from Metadata Repositories
The previous should surface problems using TestConnection (once the UI is updated) as it first checks Artifact repositories.
This patch should surface errors when adding repositories through the UI, it modifies the CacheManager to surface errors. Unfortunately there seems to be a timing problem in FileInfoReader which can result in most problems becoming simply Not Found.
In the FileInfoReader#sendBrowseRequest() method, if the transfer exception is returned via listener (as is the case for UnknownHost) it does not always occur prior to the check for null. This also leads to breaking out of the loop before the maximum retry attempts.
(In reply to comment #11) > In the FileInfoReader#sendBrowseRequest() method, if the transfer exception is > returned via listener (as is the case for UnknownHost) it does not always occur > prior to the check for null. This also leads to breaking out of the loop > before the maximum retry attempts. > Nice catch! Thanks Matthew- I created Bug 272604 to track fixing that part of the issue. Released patch to HEAD. The presentation is not perfect, but now erros do surface to the UI. Removing iplog+ from bug - this indicates an IP contribution in a comment rather than a patch. http://wiki.eclipse.org/Development_Resources/Automatic_IP_Log |