We have a problem with redistributing DTDs and XSDs from third parties such as Sun and W3C. However, these files are freely accessible on the Web. Instead of redistributing them, we should create entries in our XML catalog and use a URI resolver that downloads and caches them. That way, as long as you have been connected to the network once, you can get these files and use them offline. The caching function should work like that in a browser. The cache should periodically check for a newer version (using HTTP HEAD) but use the old one if disconnected. The user should be able to set preferences and control the cache, as in a Web browser. There could even be a Refresh XML Catalog command to load everything in the catalog. Maybe run this as a background task when the user opens the XML editor.
We should also create a low priority job to pre-cache all the URIs listed in the XML catalog. This would ensure that the URIs were cached as long as you were connected once.
I've started investigating this request.
I've created a plugin, org.eclipse.wst.internet.cache, which extends the URI resolver and provides caching facilities to WTP. Some notes about the cache plugin:

1. The cache respects cache values set for remote resources. If a cache value is not set, a cached entry defaults to a lifetime of 1 day. I welcome comments/concerns about the 1 day default expiry.
2. The cache plugin provides a preference page which allows a user to view the entries in the cache, delete selected entries, clear the entire cache, and disable the caching facility. See the cache preference page under the Internet category.
3. The cache plugin provides a low priority job (the priority is set to DECORATE, the lowest possible priority in Eclipse) which will retrieve resources that a) could not be retrieved earlier (possibly because there was no network connection) and b) are prespecified via an Eclipse extension point.

A resource to cache can be specified as follows:

<extension point="org.eclipse.wst.internet.cache.cacheresource">
  <cacheresource uri="RESOURCE_URI"/>*
</extension>

The cache plugin should be in WTP builds from 20050418 and later.
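To make the expiry rule in note 1 concrete, here is a minimal sketch of how it could be expressed. It is not the plugin's actual code: the class and method names (CacheExpiry, expiresAt, isStale) are hypothetical, and it assumes the server-announced expiry, if any, has already been parsed into an Instant.

```java
import java.time.Duration;
import java.time.Instant;

/** Hypothetical helper (not part of org.eclipse.wst.internet.cache). */
final class CacheExpiry {
    /** Fallback lifetime used when the remote resource sets no cache value. */
    static final Duration DEFAULT_LIFETIME = Duration.ofDays(1);

    /**
     * Returns the instant at which a cached copy should be considered expired.
     *
     * @param fetchedAt    when the resource was downloaded
     * @param serverExpiry expiry announced by the server, or null if none was set
     */
    static Instant expiresAt(Instant fetchedAt, Instant serverExpiry) {
        return serverExpiry != null ? serverExpiry : fetchedAt.plus(DEFAULT_LIFETIME);
    }

    /** True once 'now' has reached or passed the expiry instant. */
    static boolean isStale(Instant fetchedAt, Instant serverExpiry, Instant now) {
        return !now.isBefore(expiresAt(fetchedAt, serverExpiry));
    }
}
```

A stale entry would normally trigger a re-check rather than an immediate delete, so that the old copy remains usable while disconnected.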
The extension point to precache resources MUST include a URL to any applicable licence associated with the resource. The Eclipse legal guidance is that we cannot precache anything that has licence terms without the user's explicit acceptance. I suggest the following change:

<extension point="org.eclipse.wst.internet.cache.cacheresource">
  <cacheresource uri="RESOURCE_URI" licenceuri="LICENCE_URI" />*
</extension>

where licenceuri is optional, but MUST be provided if a licence applies. The background task should check for any licences, and prompt the user to accept them. The UI should display a list of the URIs to be precached, and link them to the licences. Each URI should have a check mark. The UI should have some buttons:

1. Select All - checks all URIs
2. Unselect All - unchecks all URIs
3. Accept - signals user acceptance of the licences for checked URIs
4. Cancel - cancels the precaching job

This dialog should only be displayed automatically once. After the first time, the user can launch it via the Preferences page.
Can resources requested by the user via some operation, not the precache, still be cached silently or will we have to try and display some licence for these resources as well?
If the user requests a resource then we don't have to present a licence. It's just for the resources that we plan to cache in the background task without user initiation.
Should the dialog prompting the user to agree to the licences be launched on Eclipse startup? If so, I think we should try to reduce the number of dialogs we hit the users with where possible. How about including these licence requirements in the same dialog as the third party requirements?
Yes, we don't want lots of dialogs. Perhaps we could make this launchable only from the preferences page. Add a button like "Download Resources". That way the user is in control. We'd need to make sure this function was adequately documented.
Looking more closely at the precache feature (as I started implementing it), this feature doesn't seem to fit naturally in the cache. I'd like to suggest that this feature belongs in the XML catalog. The nature of a cache is that it stores local copies of remote resources that are used. The XML catalog has the facility to allow the user to add entries and includes entries for resources that are already available locally. I suggest that instead of adding a download resources button and dialog to the cache preference page, a new option be added to the new entry button on the XML catalog preference page. The entry will allow users to enter their own catalog entries or select from prelisted entries. I think this may be easier for users to understand. Comments?
Changed Version field given new release numbering.
Lawrence, I agree with your last comment, that the XML Catalog is the natural place to provide both the definition and the "fetch and cache" capability you mention. Since it is "user directed" there should not be any issues. Perhaps a checkbox that says "fetch and cache when http: protocol given" would suffice.
Save the user response for accepting a licence in the area common to the installation, i.e. not local to each workspace. This avoids asking the user the same question for each new workspace. It's annoying enough as it is, so let's make life a little easier for users by just asking them once (per install at least). David Williams knows where to store configuration preferences.
While testing the current I-build I noticed long time delays for Web projects and Web services. The problem turned out to be that the project builders trigger the validators, which try to get the J2EE schemas from the Web. If network throughput is slow, this causes a very noticeable time delay (last night around 10 seconds). The problem went away when I enabled caching. The current default is to disable caching. Not many users will know about the caching preference page and they'll just think that WTP sucks. I strongly recommend that caching be on by default. The user will get the prompt and become aware of the situation. However, we need to understand the implications for unattended JUnit tests. They will trigger this dialog. I recommend that we create a JVM property, e.g. -Dwtp.quiet=true, to let our code know that it is running unattended. The caching code can check this property and automatically accept the licence during testing. Lawrence, BTW, shouldn't you change the status on this bug to ASSIGNED?
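A minimal sketch of how the caching code could check such a property, assuming the property name wtp.quiet suggested above. The class UnattendedMode and the promptUser callback are hypothetical stand-ins; the real licence dialog is not shown.

```java
import java.util.function.BooleanSupplier;

/** Hypothetical helper: detects unattended runs such as automated JUnit builds. */
final class UnattendedMode {
    /** Property name as proposed in this comment; an assumption, not a shipped API. */
    static final String QUIET_PROPERTY = "wtp.quiet";

    /** True when the JVM was started with -Dwtp.quiet=true (case-insensitive). */
    static boolean isQuiet() {
        return Boolean.getBoolean(QUIET_PROPERTY);
    }

    /**
     * Auto-accepts the licence in quiet mode; otherwise defers to the prompt.
     *
     * @param promptUser stand-in for the licence dialog; returns true on Accept
     */
    static boolean licenceAccepted(BooleanSupplier promptUser) {
        return isQuiet() || promptUser.getAsBoolean();
    }
}
```

Because the prompt is only evaluated when the property is not set, an unattended run never opens the dialog.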
Changing status to assigned. I agree that it makes sense to enable caching by default and will turn this back on. I will also implement the JVM property to allow JUnit tests to run unattended.
I disagree about turning it on by default, since having it on precipitated bug 96824. Unless we can throttle or somehow reduce the resources used by the cache, turning it on by default could be disastrous to performance depending on how many lookups are being performed at once. The cache should make only one request per URL no matter how many times the extension is asked to resolve that URL. It would also be a good idea to not block the caller of the lookup while we're downloading the resource to be cached. The download should be spun off to a queuing Job and only after it's completed should the cached location be returned as a result to the lookup query.
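The two requirements above (one request per URL, and no blocking of the caller during download) can be sketched with plain java.util.concurrent types. This is only an illustration, not the cache plugin's code and not an Eclipse Job: SingleFlightCache and the downloader function are hypothetical, and the download runs on the common thread pool.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Function;

/** Hypothetical sketch: at most one download is started per URL. */
final class SingleFlightCache {
    private final ConcurrentHashMap<String, CompletableFuture<String>> entries =
            new ConcurrentHashMap<>();
    /** Maps a URL to the local cached location; stands in for the real download. */
    private final Function<String, String> downloader;

    SingleFlightCache(Function<String, String> downloader) {
        this.downloader = downloader;
    }

    /**
     * Returns immediately with a future for the cached location. Concurrent and
     * repeated lookups of the same URL share one future, so only one download runs.
     */
    CompletableFuture<String> resolve(String url) {
        return entries.computeIfAbsent(url,
                u -> CompletableFuture.supplyAsync(() -> downloader.apply(u)));
    }
}
```

Note that in this sketch a failed download also stays in the map, so a later lookup would not retry it; a real implementation would need to evict failed entries.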
Nitin, the cache should behave itself as you describe. That's a bug IMHO and needs to be fixed asap. However, most users will have no clue that we even have a cache, so shipping WTP with it disabled will just generate a lot of bad performance comments.
I would like to tweak Nitin's requirement of "only one request per URL no matter how many times the extension is asked to resolve that URL" ... I think that should be something like "only one request per URL per second" (or similar) ... there are occasions when networks come and go, so what could not be resolved one time might be resolvable a few seconds later, after a cable is plugged in, or wireless comes back in range, etc. (CVS support does a similar "retry but not too often", I believe). Plus, I've opened a "help wanted" enhancement to ensure WTP is well behaved in this regard. See bug 102350.
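The "one request per URL per second" idea could look roughly like the following. This is a sketch under assumptions: RetryThrottle is a hypothetical name, the interval is configurable rather than fixed at one second, and the current time is passed in so the behaviour is easy to test.

```java
import java.util.concurrent.ConcurrentHashMap;

/** Hypothetical sketch: allows at most one request per URL per interval. */
final class RetryThrottle {
    private final long minIntervalMillis;
    /** Time of the last permitted attempt for each URL, in milliseconds. */
    private final ConcurrentHashMap<String, Long> lastAttempt = new ConcurrentHashMap<>();

    RetryThrottle(long minIntervalMillis) {
        this.minIntervalMillis = minIntervalMillis;
    }

    /**
     * Returns true if a request for the URL may be sent at nowMillis, and records
     * the attempt. Returns false if the previous attempt was too recent.
     */
    boolean tryAcquire(String url, long nowMillis) {
        boolean[] allowed = {false};
        // compute() runs atomically per key, so two threads cannot both pass.
        lastAttempt.compute(url, (u, last) -> {
            if (last == null || nowMillis - last >= minIntervalMillis) {
                allowed[0] = true;
                return nowMillis;
            }
            return last;
        });
        return allowed[0];
    }
}
```

A caller would pass System.currentTimeMillis() and fall back to the stale cached copy, or report the resource as unavailable, whenever tryAcquire returns false.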
Lawrence, regarding your notification of enabling the cache by default, I strongly disagree that that's the right thing to do at this time. Unless a limiter is in place, we're knowingly going to cause bug 96824 to happen.
The cache should be on by default since users will experience performance problems if it's off.
Nitin, I disagree and think the cache should be on by default. If the cache is probing the same resource repeatedly this is a bug and needs to be addressed for 0.7. I don't think this bug is a reason to disable the cache by default.
Created attachment 24837: Screen shot of web.xml validation error message detail. The validator was unable to read a schema while caching was off. It could read it when caching was enabled. Why?
I just attached a screen shot. I had a very negative RC1 experience. Why isn't caching enabled by default? Caching was disabled when I started RC1. When I created a Web project I got the validation error in web.xml because http://www.ibm.com/webservices/xsd/j2ee_web_services_client_1_1.xsd couldn't be read. Then I enabled caching and it could be read. I could also read it via a Web browser. Is this caching interfering with resource resolution? Why are we using the IBM site? Is it official (guaranteed to be there)?
Arthur,

1. Caching is not enabled by default in RC1 as this caused a problem in the build's automated tests. See bug 103614. (Specifically, the license dialog caused the build to hang.) I have a fix for this that is ready to go for RC2. Caching will be enabled in RC2.
2. Caching should not interfere with resource resolution when it is disabled. This function works fine for me on RC1. You may have had intermittent network connectivity which caused this failure. Please try this again and report back whether you have the same problem.
3. The schema http://www.ibm.com/webservices/xsd/j2ee_web_services_client_1_1.xsd is included in http://java.sun.com/xml/ns/j2ee/j2ee_1_4.xsd:

<xsd:include schemaLocation="http://www.ibm.com/webservices/xsd/j2ee_web_services_client_1_1.xsd"/>

It seems this schema, although hosted by IBM, is part of the J2EE set of schemas.
Resolving to fixed. The cache was in 0.7. There are other open defects for requests related to the cache. Please open new defects for other cache related problems and requests.
Verified.
Closing bug.