| Summary: | ant content type over defined for <project> tag | ||||||
|---|---|---|---|---|---|---|---|
| Product: | [Eclipse Project] Platform | Reporter: | David Williams <david_williams> | ||||
| Component: | Ant | Assignee: | Darin Wright <darin.eclipse> | ||||
| Status: | VERIFIED FIXED | QA Contact: | |||||
| Severity: | normal | ||||||
| Priority: | P3 | CC: | Darin_Swanson, eclipse, manderse, mark.melvin, Sam.Mesh, stephen.duncan, thatnitind | ||||
| Version: | 3.1 | ||||||
| Target Milestone: | 3.1.1 | ||||||
| Hardware: | PC | ||||||
| OS: | Windows XP | ||||||
| Whiteboard: | |||||||
| Bug Depends on: | |||||||
| Bug Blocks: | 104140 | ||||||
| Attachments: |
|
||||||
|
Description
David Williams
Created attachment 19218 [details]
brief exchange from newsgroup discussing the problem
I just wanted to attach this to better document this is an end-user reported
problem, not just a hypothetical project-to-project issues.
I am aware and have some understanding of the problem...but I really doubt that this problem can be addressed in the Ant component short of removing the project designation as you suggested. That would solve this problem...till someone else defines a similar content type. As with bug 64707...logging these in Ant "land" seems to be counter productive. The resource framework and extension providers need to feel the pain and come up with some real solutions. Especially since the 3.1 API window has pretty much already closed... Sorry...I know I am just sounding critical and providing no solutions. I agree some improvements can be made by infrastructure, but the reason I put this against ant component is that it seems, to me, to really be a case of overdefining, or, perhaps another way to say it, its trying to use 'contentType' in a way that's not appropriate. The use of 'project' tag is in no way is an indication that the contents are an ant script. If there is not a more more reliable way to tell something is an ant script, by examing the contents (or file name) then its not a good oandidate for "contentType". To give a hypothetical, if there as a processing instruction in the file thet said "ant-script", that would be a pretty unambiguous way to conclude the contents was indeed an ant script. Perhaps this is something your ant tools could automatically to at least help some users? In our WTP case, we're just (trying) to say the contents are XML, which is accurate (and, is not based on presence or abscense of any particular tag, except the XML Declaration). If this was a conflict of an xml content type conflicting with another xml content type, then I would agree, that requires a fix from infrastructure to allow user choices. If you rely on user choice to disambiguate 'project', then users will be basically forced to choose between disabling ant association, or xml editing ... not a good or required choice if there is a better way to infer true ant contents. And don't worry, doesn't sound critical to me. :) Sounds like important discussion. I know of no other way to infer that the contents of an XML file are an Ant buildfile. The only hard requirement is that there is a top level project element. Really looks like we are moving towards putting the work on the poor user via some content type UI assignment. XML has become much too prevelant of a definition language :-) This type of problem might be resolved if the content describer is more strict about the contents that are accepted by its content type. For instance, Ant build script's <project> root element has "default" as mandatory attribute. In other cases there may be a second level element that must be there too. Being more strict for accepting contents as "valid" should reduce the chances of conflicts. Actually the default attribute is no longer required in the newer Ants... Ant is now very very open...as long as you have a well formed project element you may have a valid Ant buildfile. You no longer even need target elements. Unless we actually get into parsing I do not believe we can tell. I will keep thinking/investigating but I do not think that a content type descriptor could actually distinguish an Ant buildfile. Bummer... the latest Ant manual says it is still required: http://ant.apache.org/manual/using.html#projects Well, even if the attribute not being mandatory, I believe there is hope. The API for content describers allows them to return three different values: - VALID ("this is mine") - INDETERMINATE ("it may be mine but there is no strong evidence") - INVALID ("somebody else's, not mine") I would suggest content describers for XML documents with a common root element should check for additional evidence (besides the project root element), like the default attribute or a target sub element. If the root element is missing, return INVALID. If additional evidence can be found, return VALID. Otherwise, return INDETERMINATE. > Unless we actually get into parsing I > do not believe we can tell. Not sure I understood this, but I wonder if you are aware that we already do XML parsing in XMLRootElementContentDescriber. I logged and they have fixed this manual bug for the Ant 1.6.3 release. I meant parsing beyond simple XML validation...parsing towards validating that the XML is an Ant buildfile. But I currently do not believe this is possible to determine except at Ant runtime due to the extensibility of Ant. I will investigate the content describer suggestion...but we may be returning a fair number of INDETERMINATE. And if 2 or more content describers return VALID what happens? Back to the same problem? > And if 2 or more content describers return VALID what happens? > Back to the same problem? Yes. But the likelihood is much lower. Also, nature/content type associations should help managing the number of conflicts (bug 69640). The Ant manual has been updated. http://ant.apache.org/manual/ So, if I understand the implications of the manual change, I think the implication is that there is no way to reliably tell that any particular set of xml tags is an ant file. Right? I'd suggest, then, that for eclipse tool use, you drop the dependancy on '<project>' tag (which does not imply 'ant'), and rely instead on the presence of a special Processing Instruction, such as <? eclipse-ant-script ?> so the eclipse tools (and end users) would have a reliable way of identifying something as ant content, without interfering with other content types. I realize, BTW, this would mean having your own content describer instead relying on that handy XMLRootContent describer. Perhaps someone should right a general perpose "XMLPIContentDescriber" to detect special PI tags. And, BTW, I also realize there would have to be changes to your new file templates, existing users might have to 'migrate', etc. ... but, I see no alternatives without interfering with upstream "projects" which also use '<project>'. I believe there is no way to destinguish an Ant buildfile from other XML. Ouch...I would see this as a pretty major breaking change for the Eclipse Ant integration. This leads to a negative first impression for users of Eclipse's Ant integration --> you have to modify your buildfile to work with Eclipse. I can already hear the bug reports pouring in. But I agree something needs to be done as your user experience is much less than positive as well. I think we really need to try to come up with a better solution that generalizes beyond this particular Ant problem. We need to somehow involve the user; generally the user knows what the content type of the file is but I don't think we should force them to annotate this in the content of the file. Yes, I'm sure other "xmlish" content will want to do more in this area, and ant is just leading the way. :) [and, it get's my interest since its ant that interfers with our upstream WTP normal xml] Seems we agree ant can't be distinguished by its (normal) content, then its not a great candidate for "contentType". Unless ... combining mine and Rafael's suggestions, the content type describer should just return "indeterminant" if its xml and contains a <project> element. But, could return "valid" if contained a processing instruction identifing it as <? eclipse-ant-script ?>. (that is, BTW, what processing instructions are for, to identify application or tool specific information, to be ignored by other apps and tools). Only other approach I know of is to have some file property that could be set to be "ant script", but I think that would require even more user intervention. Do you know of other cases of ambiguous content, that we could verify if the processing instruction approach would work? I do not understand why what was suggested in comment 7 would not work. The problem here is that multiple content types share the same root element, and so far the content describer used only takes the root element into account. The solution proposed would allow disambiguating the majority of cases. The cases where it would not work would be: - the empty file case (already a problem today) - the incomplete content case (<project/>#EOF) Of course, this requires all content types having "project" as the root element to provide a more precise content describer, not only Ant. I agree, Rafael, that comment #7 is the right appoach: indeterminant if only 'project' element, and, I'd suggest, valid if <?eclipse-ant-script?> PI found (from what I can see, there is nothing else that would "prove" something is ant script content, from the content alone). But, I want to emphasize the current cross-project issue isn't that there's mulitple <project> element competetion, its that the ant <project> definition always "wins" over "plain xml" definition. All I'm requesting is that the mere presence of '<project>' not be "stolen" as ant-script. There *might* be cases we want to also make use of '<project>' element, and its fair we would have to provide custom content describer that wouldn't steal just any 'ol <project> element, but the problem at hand is that we want the generic case to be handled as generic xml. There's no way we can predict how all our users will use the '<project>' tag (that is, can't provide content describer in advance for all of them) so need those generic cases to be treated first as XML, and only treated as ant script if unambiguously ant script content. Since you guys have teamed up on me :-) I will spend some time tomorrow investigating / implementing along the lines of comment #7. Most Ant buildfiles will include targets and will designate a default target. Thanks for all the feedback. From comment 15: "I'd suggest, valid if <?eclipse-ant-script?> PI found (from what I can see, there is nothing else that would "prove" something is ant script content, from the content alone)." I would take a commonly used attribute ("default") for the root element or a common sub-element (e.g. "taskdef"/"target") as enough evidence for accepting the contents as valid. The benefit here (at the expense of more complex detection code) is that it should work with any regular Ant build script. W.r.t. the requirements spec'd in the second paragraph in comment 15, bug 86915 has to be addressed as well for that to work. Related to the comment 16: (1) *default* is required for Ant builds by Ant Editor. Is it required by Ant? (2) Ant Editor contributes to the Problems View when *opening* "buggy" Ant file. Probably, there is already an RFE for (2)? 1) default is not required by the Ant editor for some time: see bug 81367. It is also not required by Ant. Try out M6. 2) I don't understand your comment [OFFTOPIC] Easy scenario related to the comment 18 and comment 19 (2): - Open *not-Ant-build-file* file with Ant Editor. - Problems View shows the problems created when file was opened with Ant Editor. - Close Ant Editor. - Problems remain. - Open the file in the Text Editor. - Problems remain. IMHO, problem generation when opening the Editor is wrong pattern. (In reply to comment #20) > [OFFTOPIC] Easy scenario related to the comment 18 and comment 19 (2): > - Open *not-Ant-build-file* file with Ant Editor. > - Problems View shows the problems created when file was opened with Ant Editor. > - Close Ant Editor. > - Problems remain. > - Open the file in the Text Editor. > - Problems remain. > IMHO, problem generation when opening the Editor is wrong pattern. bug 73629 Thanks for the reference to the bug 73629. OFFTOPIC closed. (In reply to the comment 21) I have coded up an AntBuildfileContentDescriber that looks for a top level project element. If found and if there is a default attribute associated or sub elements named target then returns VALID. If project found with no default or targets then INDETERMINATE Else INVALID. Problems/Comments I subclass XMLContentDescriber to check for valid XML syntax. This is an internal class. The AntBuildfileContentDescriber is "asked" 4 times to describe the changed contents of the buildfile. That is, for every save (resource change), the buildfile is parsed 4 times resulting from 2 calls to File.getContentDescription(). For each call to ContentTypeCatalog.getDescriptionFor() the file is parsed twice: once determining the content types for the buffer and then again getting the description from the retrieved content type. Am I missing caching a result somewhere...but this appears to be the same behavior with the XMLRootElementContentDescriber. Is this a known issue. Seems to be less than optimal. Released changes to: org.eclipse.ant.core plugin.xml to reference the new content describer, AntBuildfileContentDescriber AntCorePlugin: to pass on the bundle context to AntCoreUtil AntCoreUtil: provide the plugin bundle context (used to get the SAX parser factory service reference) Created: AntHandler: parses as required to determine if content is buildfile content AntBuildfileContentDescriber: uses the AntHandler to correctly describe if XML files are Ant buildfile content Re: comment 23 The two calls to the content type describer are expected - the first is called on every eligible content type, the second on the elected one only. Will consider the possibility of merging the two stages. The second call to IFile#getDescription during save seems to be caused by LastSaveReferenceProvider#readDocument, fired by a background job, for quick diff. Since the file has just been saved, its content type might have changed, so we have to do all the work once more. We have bug 57137 on performance tuning in the context of content type detection. Thanks Rafael. I will follow the discussion / changes in bug 57137. Kevin please verify. David it would be great if you could double check that this removes the bad interaction between Ant and WTP. This change is in the Eclipse integration build of this week. Thanks for your efforts to improve this. I will try and take a look, but the problem actually originates in our own clients making use of xml files with <project> in them, so not sure I myself can do it justice, but, will at least give some look at it. Thanks again. verified Guys, we should rather have this content type describer provided by runtime. The main reason is that it might be useful to other content types. Another reason is that as it stands, org.eclipse.ant.core is being activated when the Ant describer is loaded. Having it in runtime would avoid that. Otherwise, the class should be moved to its own package and that package should be marked as an exception in the Eclipse-AutoStart header in the bundle manifest, for example: Eclipse-AutoStart: true; exceptions="org.eclipse.ant.internal.donotactivate" Please don't move the "ant content type describer" to runtime, if that's what you meant ... sure seems it would set a bad example. I'm hoping Ant will continue to lead the way and show me the right way to do it in WTP :) Rafael...I am with David on this one. I doubt the describer would be useful for other content types (it is pretty specific to Ant now). I have logged bug 92546 to address this new issue from comment #29 *** Bug 93482 has been marked as a duplicate of this bug. *** Looks like "project" and "target" are not quite enough. We, in WTP, have received bug 104140 where a well used type of file (for "maven") has similar structure to ant files. Example below. Suggestions for workarounds we in WTP could do for our .7 release (we are in RC mode now) would be most welcome. = = = = = Example of "maven" file (which is not ant content, and desired to be treated as "xml" in WTP). <?xml version="1.0" encoding="UTF-8"?> <project xmlns="http://maven.apache.org/POM/4.0.0" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/maven-v4_0_0.xsd"> <build> <plugins> <plugin> <configuration> <target>1.3</target> </configuration> </plugin> </plugins> </build> </project> I will try I don't think the Ant content type is over defined now. This is happening because there is no content type for Maven. However, note that Ant is a type of XML, so any actions available to XML files should be available to Ant files as well. I suggest closing this PR and reporting a new issue against the component not honoring that, or against Platform/Runtime if it turns out it is a more fundamental issue. On not over defined: It reports "valid" for something which is pretty clearly not an ant file. I do not think it "fair" to say that every xml file that uses 'project' and 'target' must also have to define a custom content describer that specifies exactly how it is different from an ant file ... in fact, if maven defined one which also (just) looked at 'project' and 'target' which would win? (It turns out Maven files do have a pretty small list of file names so this bug can be worked around that way, but I do think that is a work around and relying on only 'project' having a child 'target' is a pretty loose definition of ant). Unless the ant team is willing to take on support of Maven and properly edit the attached sample file :) -- conceptually that's what they are saying by saying its valid 'ant' script. On "XML should work for Ant" -- I agree, but this is editors, not actions, and there are many issue related to ambiguity with how editors and documents are contructed and associated, for which there's several bugs open already. I just noticed the "target" element in the sample Maven file occurs deep in the document element tree. wouldn't it be enough if the Ant content describer checked only "target" elements occurring immediately under the "project" element? Does it make sense for targets to occur at other levels? yes and no. We could consider for 3.1.1 tweaking the Ant content type describer so that only "top" targets are considered which is the correct form for Ant buildfiles. With the global use of XML I really see no general win with content type describers especially since the uses are very generalized themselves (the structure of an Ant build file is basically impossible to discern conclusively). I just realized this was still marked "major" from when it *was* a major conflict with just 'project'. I think the current state is less serious and is just a matter of tweaking the describer to rule out obvious cases as they are found. [Perhaps I should have opened a new bug!? But, will just change severity, if that's acceptable.] The "deep target" is one good one to rule out. Also, as in this one, if schema or non-ant DTD is referenced in file that could also "rule it out". I know ant can have a DTD (and will even generate it for you!) ... is there a form of ant content that can have a schema? Not a huge deal but in the future it would likely be better to log a new bug. "basic" Ant buildfiles can have a dtd...but since the definition of a build file is extendible by the user, a general dtd can not exist. Can you help me out...how would I determine if a dtd or schema is non-Ant? Well, I was hoping "non-ant" would be obvious, as in tha maven example in comment #33 .. its had a relatively fixed, well known and documented "namespace" that includes POM or maven ... and ant, if it had one, would have included "ant". But since ant does not have a standard, fixed dtd, much less a standard namespace, there's not much you can do ... if a dtd used. Maybe you could detect that a schema or namespace was being used (the "special" xmlns and xmlns:xsi attributes of the project element in comment #33, and rule out those cases). In the original report from the WTP client, the expectation was that the use of the schema would be enough to get it recognized as "non-ant" (or .. as xml, ... or as maven content type, eventually?). Released into the 3.2 stream for more testing. Changes to AntHandler. Darin, is this bug being considered for release into 3.1.1? If not, then the Target field should be changed. Thanks. The bug is tagged correctly...it will be delieved for 3.1.1 The AntHandler will now return indeterminate for a file which has a top level project element but only has a deeply nested target element (before it returned valid for this content). Released to both 3.2 and to 3.1.1. Please verify DarinW. The problem statement and "solution" are just an ugly hack to work around a fundamental problem in the text API. It should be possible for multiple editors to detect their "own" content type without conflicting, as discussed in Bug 107682. Please refer to bug 111313 (In reply to comment #46) > The AntHandler will now return indeterminate for a file which has a top level > project element but only has a deeply nested target element (before it > returned valid for this content). > > Released to both 3.2 and to 3.1.1. I have tested the following with 3.2 and 3.1.1 and there is still a problem wrt xml files with a <project> tag in them. If you create an xml document with <project></project> in it and save it as maven.xml, eclipse uses the default txt editor for it which is correct. If you edit it again and change the <project> to <project default="abc"> it then thinks its an ant build file which in this case it isn't as maven uses this structure too to specify the default goal to execute e.g. <?xml version="1.0" encoding="UTF-8"?> <project default="buildAll" xmlns:m="jelly:maven"> <goal name="buildAll"> <m:reactor basedir="${basedir}" includes="**/project.xml" goals="all" banner="Building applications" ignoreFailures="false"/> </goal> </project> Is it possible to stop using the ant editor for files matching maven.xml? The Ant content type definition does not say it wants maven.xml files, only *.xml files. The problem is that there is no content type for Maven. If there was one, it would work as you want. What you have to do in this case is to associate the maven.xml file name to the XML content type (see Window > Preferences..., General > Content Type). Then, Maven files will be treated as any other XML file. You can also use the Ant editor to edit maven.xml files but turn off the error reporting. Window>Preferences>Ant>Editor>Problems Set the names of files that will not be checked for problems in the Ant editor. *** Bug 113896 has been marked as a duplicate of this bug. *** Verified. |