Some Eclipse Foundation services are deprecated, or will be soon. Please ensure you've read this important communication.

Bug 89195

Summary: ant content type over defined for <project> tag
Product: [Eclipse Project] Platform Reporter: David Williams <david_williams>
Component: AntAssignee: Darin Wright <darin.eclipse>
Status: VERIFIED FIXED QA Contact:
Severity: normal    
Priority: P3 CC: Darin_Swanson, eclipse, manderse, mark.melvin, Sam.Mesh, stephen.duncan, thatnitind
Version: 3.1   
Target Milestone: 3.1.1   
Hardware: PC   
OS: Windows XP   
Whiteboard:
Bug Depends on:    
Bug Blocks: 104140    
Attachments:
Description Flags
brief exchange from newsgroup discussing the problem none

Description David Williams CLA 2005-03-27 15:44:39 EST
in WTP we've had a dozen or so complaints that our XML editor doesn't work. 
Investigations reveal that its because the XML file contained the <project> tag, 
so an "ant document" was being created, instead of one of our strucutred
documents. If/when the editor-content-type support is added, this will still be
a problem since then the ant editor would open on files which are not ant files. 

1. <project> is an extrememly popular root tag, so I don't think it should be
associated with 'ant' at all. 

Short of removing it from ant component's definition, other solutions might include

2. give it a 'low' priority (with might allow our's to "win" if WTP installed (I
haven't actually tested this, its just an idea). 

3. Ultimately, there may need to be some UI that allows end-users to better
decide how to dis-ambiguate cases like this. (I don't have time to do to for
this release, not sure if anyone from ant team does). But ... <project> is used
in lots of files besides ant files, web projects, maven projects, etc, so, is
likely to never be "either/or" case ... in other words, <project> always a bad
choice to use to determine content type. 

I've marked this a 'major' because I don't think Exlipse 3.1 or WTP 1.0 can be
released with this conflict. (Is 'critical' more appropriate? .. help links are
broken so I can't check definitions right now). 

Other suggestions welcome.
Comment 1 David Williams CLA 2005-03-27 15:47:35 EST
Created attachment 19218 [details]
brief exchange from newsgroup discussing the problem

I just wanted to attach this to better document this is an end-user reported
problem, not just a hypothetical project-to-project issues.
Comment 2 Darin Swanson CLA 2005-03-27 18:48:16 EST
I am aware and have some understanding of the problem...but I really doubt that
this problem can be addressed in the Ant component short of removing the project
designation as you suggested. That would solve this problem...till someone else
defines a similar content type.

As with bug 64707...logging these in Ant "land" seems to be counter productive.
The resource framework and extension providers need to feel the pain and come up
with some real solutions. Especially since the 3.1 API window has pretty much
already closed...

Sorry...I know I am just sounding critical and providing no solutions.
Comment 3 David Williams CLA 2005-03-27 21:34:26 EST
I agree some improvements can be made by infrastructure, but the reason I put
this against ant component is that it seems, to me, to really be a case of
overdefining, or, perhaps another way to say it, its trying to use 'contentType'
in a way that's not appropriate. The use of 'project' tag is in no way is an
indication that the contents are an ant script. If there is not a more more
reliable way to tell something is an ant script, by examing the contents (or
file name) then its not a good oandidate for "contentType". 

To give a hypothetical, if there as a processing instruction in the file thet
said "ant-script", that would be a pretty unambiguous way to conclude the
contents was indeed an ant script. Perhaps this is something your ant tools
could automatically to at least help some users?

In our WTP case, we're just (trying) to say the contents are XML, which is
accurate (and, is not based on presence or abscense of any particular tag,
except the XML Declaration). 

If this was a conflict of an xml content type conflicting with another xml
content type, then I would agree, that requires a fix from infrastructure to
allow user choices. 

If you rely on user choice to disambiguate 'project', then users will be
basically forced to choose between disabling ant association, or xml editing ...
not a good or required choice if there is a better way to infer true ant contents. 

And don't worry, doesn't sound critical to me. :) Sounds like important discussion.
Comment 4 Darin Swanson CLA 2005-03-27 23:43:27 EST
I know of no other way to infer that the contents of an XML file are an Ant 
buildfile. The only hard requirement is that there is a top level project 
element.

Really looks like we are moving towards putting the work on the poor user via 
some content type UI assignment. XML has become much too prevelant of a 
definition language :-)
Comment 5 Rafael Chaves CLA 2005-03-28 13:18:53 EST
This type of problem might be resolved if the content describer is more strict
about the contents that are accepted by its content type. For instance, Ant
build script's <project> root element has "default" as mandatory attribute. In
other cases there may be a second level element that must be there too. Being
more strict for accepting contents as "valid" should reduce the chances of
conflicts.
Comment 6 Darin Swanson CLA 2005-03-28 18:43:21 EST
Actually the default attribute is no longer required in the newer Ants...

Ant is now very very open...as long as you have a well formed project element
you  may have a valid Ant buildfile. You no longer even need target elements.

Unless we actually get into parsing I do not believe we can tell.

I will keep thinking/investigating but I do not think that a content type
descriptor could actually distinguish an Ant buildfile.
Comment 7 Rafael Chaves CLA 2005-03-28 18:59:53 EST
Bummer... the latest Ant manual says it is still required:

http://ant.apache.org/manual/using.html#projects

Well, even if the attribute not being mandatory, I believe there is hope.

The API for content describers allows them to return three different values:
- VALID ("this is mine")
- INDETERMINATE ("it may be mine but there is no strong evidence")
- INVALID ("somebody else's, not mine")

I would suggest content describers for XML documents with a common root element
should check for additional evidence (besides the project root element), like
the default attribute or a target sub element. If the root element is missing,
return INVALID. If additional evidence can be found, return VALID. Otherwise,
return INDETERMINATE.

> Unless we actually get into parsing I 
> do not believe we can tell. 

Not sure I understood this, but I wonder if you are aware that we already do XML
parsing in XMLRootElementContentDescriber.
Comment 8 Darin Swanson CLA 2005-03-28 19:05:47 EST
I logged and they have fixed this manual bug for the Ant 1.6.3 release.

I meant parsing beyond simple XML validation...parsing towards validating that 
the XML is an Ant buildfile. But I currently do not believe this is possible to 
determine except at Ant runtime due to the extensibility of Ant.

I will investigate the content describer suggestion...but we may be returning a 
fair number of INDETERMINATE.
And if 2 or more content describers return VALID what happens? Back to the same 
problem?
Comment 9 Rafael Chaves CLA 2005-03-28 19:16:34 EST
> And if 2 or more content describers return VALID what happens? 
> Back to the same problem?

Yes. But the likelihood is much lower. Also, nature/content type associations
should help managing the number of conflicts (bug 69640).
Comment 10 Darin Swanson CLA 2005-04-01 12:37:19 EST
The Ant manual has been updated. http://ant.apache.org/manual/
Comment 11 David Williams CLA 2005-04-01 12:55:27 EST
So, if I understand the implications of the manual change, I think the
implication is that there is no way to reliably tell that any particular set of
xml tags is an ant file. Right? 

I'd suggest, then, that for eclipse tool use, you drop the dependancy on
'<project>' tag (which does not imply 'ant'), and rely instead on the presence
of a special Processing Instruction, such as <? eclipse-ant-script ?> so the
eclipse tools (and end users) would have a reliable way of identifying something
as ant content, without interfering with other content types. 

I realize, BTW, this would mean having your own content describer instead
relying on that handy XMLRootContent describer. Perhaps someone should right a
general perpose "XMLPIContentDescriber" to detect special PI tags. 

And, BTW, I also realize there would have to be changes to your new file
templates, existing users might have to 'migrate', etc. ... but, I see no
alternatives without interfering with upstream "projects" which also use
'<project>'. 
Comment 12 Darin Swanson CLA 2005-04-02 11:27:55 EST
I believe there is no way to destinguish an Ant buildfile from other XML.

Ouch...I would see this as a pretty major breaking change for the Eclipse Ant 
integration.
This leads to a negative first impression for users of Eclipse's Ant 
integration --> you have to modify your buildfile to work with Eclipse. I can 
already hear the bug reports pouring in.

But I agree something needs to be done as your user experience is much less 
than positive as well.

I think we really need to try to come up with a better solution that 
generalizes beyond this particular Ant problem.

We need to somehow involve the user; generally the user knows what the content 
type of the file is but I don't think we should force them to annotate this in 
the content of the file.
Comment 13 David Williams CLA 2005-04-02 12:08:27 EST
Yes, I'm sure other "xmlish" content will want to do more in this area, and ant
is just leading the way. :) 

[and, it get's my interest since its ant that interfers with our upstream WTP
normal xml]

Seems we agree ant can't be distinguished by its (normal) content, then its not
a great candidate for "contentType". 

Unless ... combining mine and Rafael's suggestions, the content type describer
should just return "indeterminant" if its xml and contains a <project> element.
But, could return "valid" if contained a processing instruction identifing it as
<? eclipse-ant-script ?>.  (that is, BTW, what processing instructions are for,
to identify application or tool specific information, to be ignored by other
apps and tools). 

Only other approach I know of is to have some file property that could be set to
be "ant script", but I think that would require even more user intervention. 

Do you know of other cases of ambiguous content, that we could verify if the
processing instruction approach would work?
Comment 14 Rafael Chaves CLA 2005-04-03 13:44:12 EDT
I do not understand why what was suggested in comment 7 would not work. The
problem here is that multiple content types share the same root element, and so
far the content describer used only takes the root element into account. The
solution proposed would allow disambiguating the majority of cases. The cases
where it would not work would be:
- the empty file case (already a problem today)
- the incomplete content case (<project/>#EOF)

Of course, this requires all content types having "project" as the root element
to provide a more precise content describer, not only Ant.
Comment 15 David Williams CLA 2005-04-03 14:44:44 EDT
I agree, Rafael, that comment #7 is the right appoach: indeterminant if only
'project' element, and, I'd suggest, valid if <?eclipse-ant-script?> PI found
(from what I can see, there is nothing else that would "prove" something is ant
script content, from the content alone). 

But, I want to emphasize the current cross-project issue isn't that there's
mulitple <project> element competetion, its that the ant <project> definition
always "wins" over "plain xml" definition. All I'm requesting is that the mere
presence of '<project>' not be "stolen" as ant-script. There *might* be cases we
want to also make use of '<project>' element, and its fair we would have to
provide custom content describer that wouldn't steal just any 'ol <project>
element, but the problem at hand is that we want the generic case to be handled
as generic xml. There's no way we can predict how all our users will use the
'<project>' tag (that is, can't provide content describer in advance for all of
them) so need those generic cases to be treated first as XML, and only treated
as ant script if unambiguously ant script content. 
Comment 16 Darin Swanson CLA 2005-04-03 23:09:58 EDT
Since you guys have teamed up on me :-) I will spend some time tomorrow 
investigating / implementing along the lines of comment #7.

Most Ant buildfiles will include targets and will designate a default target.

Thanks for all the feedback.
Comment 17 Rafael Chaves CLA 2005-04-04 11:13:10 EDT
From comment 15: "I'd suggest, valid if <?eclipse-ant-script?> PI found
(from what I can see, there is nothing else that would "prove" something is ant
script content, from the content alone)." 

I would take a commonly used attribute ("default") for the root element or a
common sub-element (e.g. "taskdef"/"target") as enough evidence for accepting
the contents as valid. The benefit here (at the expense of more complex
detection code) is that it should work with any regular Ant build script.

W.r.t. the requirements spec'd in the second paragraph in comment 15, bug 86915
has to be addressed as well for that to work.

Comment 18 Sam Mesh CLA 2005-04-04 17:58:52 EDT
Related to the comment 16:
(1) *default* is required for Ant builds by Ant Editor. Is it required by Ant?
(2) Ant Editor contributes to the Problems View when *opening* "buggy" Ant file.

Probably, there is already an RFE for (2)?
Comment 19 Darin Swanson CLA 2005-04-04 18:12:35 EDT
1) default is not required by the Ant editor for some time: see bug 81367.
It is also not required by Ant. Try out M6.
2) I don't understand your comment
Comment 20 Sam Mesh CLA 2005-04-04 18:28:54 EDT
[OFFTOPIC] Easy scenario related to the comment 18 and comment 19 (2):
- Open *not-Ant-build-file* file with Ant Editor.
- Problems View shows the problems created when file was opened with Ant Editor.
- Close Ant Editor.
- Problems remain.
- Open the file in the Text Editor.
- Problems remain.

IMHO, problem generation when opening the Editor is wrong pattern.
Comment 21 Darin Swanson CLA 2005-04-04 18:39:27 EDT
(In reply to comment #20)
> [OFFTOPIC] Easy scenario related to the comment 18 and comment 19 (2):
> - Open *not-Ant-build-file* file with Ant Editor.
> - Problems View shows the problems created when file was opened with Ant 
Editor.
> - Close Ant Editor.
> - Problems remain.
> - Open the file in the Text Editor.
> - Problems remain.
> IMHO, problem generation when opening the Editor is wrong pattern.

bug 73629
Comment 22 Sam Mesh CLA 2005-04-04 19:09:03 EDT
Thanks for the reference to the bug 73629.
OFFTOPIC closed.

(In reply to the comment 21)
Comment 23 Darin Swanson CLA 2005-04-04 19:13:43 EDT
I have coded up an AntBuildfileContentDescriber that looks for a top level 
project element.
If found and if there is a default attribute associated or sub elements named 
target then returns VALID.
If project found with no default or targets then INDETERMINATE
Else INVALID.

Problems/Comments
I subclass XMLContentDescriber to check for valid XML syntax. This is an 
internal class.

The AntBuildfileContentDescriber is "asked" 4 times to describe the changed 
contents of the buildfile. That is, for every save (resource change), the 
buildfile is parsed 4 times resulting from 2 calls to 
File.getContentDescription(). 

For each call to ContentTypeCatalog.getDescriptionFor() the file is parsed 
twice: once determining the content types for the buffer and then again 
getting the description from the retrieved content type. 
Am I missing caching a result somewhere...but this appears to be the same 
behavior with the XMLRootElementContentDescriber.

Is this a known issue. Seems to be less than optimal.
Comment 24 Darin Swanson CLA 2005-04-04 20:43:47 EDT
Released changes to:
org.eclipse.ant.core plugin.xml to reference the new content describer, 
AntBuildfileContentDescriber
AntCorePlugin: to pass on the bundle context to AntCoreUtil
AntCoreUtil: provide the plugin bundle context (used to get the SAX parser 
factory service reference)

Created:
AntHandler: parses as required to determine if content is buildfile content
AntBuildfileContentDescriber: uses the AntHandler to correctly describe if XML 
files are Ant buildfile content
Comment 25 Rafael Chaves CLA 2005-04-05 11:59:05 EDT
Re: comment 23 

The two calls to the content type describer are expected - the first is called
on every eligible content type, the second on the elected one only. Will
consider the possibility of merging the two stages.

The second call to IFile#getDescription during save seems to be caused by
LastSaveReferenceProvider#readDocument, fired by a background job, for quick
diff. Since the file has just been saved, its content type might have changed,
so we have to do all the work once more.

We have bug 57137 on performance tuning in the context of content type detection.
Comment 26 Darin Swanson CLA 2005-04-05 12:05:00 EDT
Thanks Rafael. I will follow the discussion / changes in bug 57137.

Kevin please verify.

David it would be great if you could double check that this removes the bad 
interaction between Ant and WTP. This change is in the Eclipse integration 
build of this week.
Comment 27 David Williams CLA 2005-04-05 12:21:00 EDT
Thanks for your efforts to improve this. I will try and take a look, but the
problem actually originates in our own clients making use of xml files with
<project> in them, so not sure I myself can do it justice, but, will at least give 
some look at it. 

Thanks again. 
Comment 28 Kevin Barnes CLA 2005-04-05 17:03:38 EDT
verified
Comment 29 Rafael Chaves CLA 2005-04-18 15:36:09 EDT
Guys, we should rather have this content type describer provided by runtime. The
main reason is that it might be useful to other content types. Another reason is
that as it stands, org.eclipse.ant.core is being activated when the Ant
describer is loaded. Having it in runtime would avoid that. Otherwise, the class
should be moved to its own package and that package should be marked as an
exception in the Eclipse-AutoStart header in the bundle manifest, for example:

Eclipse-AutoStart: true; exceptions="org.eclipse.ant.internal.donotactivate"
Comment 30 David Williams CLA 2005-04-18 15:54:00 EDT
Please don't move the "ant content type describer" to runtime, if that's what
you meant ... sure seems it would set a bad example. I'm hoping Ant will
continue to 
lead the way and show me the right way to do it in WTP :)
Comment 31 Darin Swanson CLA 2005-04-24 22:19:52 EDT
Rafael...I am with David on this one.
I doubt the describer would be useful for other content types (it is pretty 
specific to Ant now).

I have logged bug 92546 to address this new issue from comment #29
Comment 32 David Williams CLA 2005-05-09 12:17:42 EDT
*** Bug 93482 has been marked as a duplicate of this bug. ***
Comment 33 David Williams CLA 2005-07-16 14:26:37 EDT
Looks like "project" and "target" are not quite enough. We, in WTP, have
received bug 104140 where a well used type of file (for "maven") has similar
structure to ant files. Example below. Suggestions for workarounds we in WTP
could do for our .7 release (we are in RC mode now) would be most welcome. 
 = = = = =
Example of "maven" file (which is not ant content, and desired to be treated as
"xml" in WTP). 

<?xml version="1.0" encoding="UTF-8"?>
<project xmlns="http://maven.apache.org/POM/4.0.0"
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
    xsi:schemaLocation="http://maven.apache.org/POM/4.0.0
http://maven.apache.org/maven-v4_0_0.xsd">
    <build>
        <plugins>
            <plugin>
                <configuration>
                    <target>1.3</target>
                </configuration>
            </plugin>
        </plugins>
    </build>
</project>
Comment 34 Darin Swanson CLA 2005-07-17 00:34:37 EDT
I will try 
Comment 35 Rafael Chaves CLA 2005-07-18 09:58:11 EDT
I don't think the Ant content type is over defined now. This is happening
because there is no content type for Maven. However, note that Ant is a type of
XML, so any actions available to XML files should be available to Ant files as
well. I suggest closing this PR and reporting a new issue against the component
not honoring that, or against Platform/Runtime if it turns out it is a more
fundamental issue.
Comment 36 David Williams CLA 2005-07-18 10:29:29 EDT
On not over defined: It reports "valid" for something which is pretty clearly
not an ant file. I do not think it "fair" to say that every xml file that uses
'project' and 'target' must also have to define a custom content describer that
specifies exactly how it is different from an ant file ... in fact, if maven
defined one which also (just) looked at 'project' and 'target' which would win?
(It turns out Maven files do have a pretty small list of file names so this bug
can be worked around that way, but I do think that is a work around and relying
on only 'project' having a child 'target' is a pretty loose definition of ant).
Unless the ant team is willing to take on support of Maven and properly edit the
attached sample file :) -- conceptually that's what they are saying by saying
its valid 'ant' script. 

On "XML should work for Ant" -- I agree, but this is editors, not actions, and
there are many issue related to ambiguity with how editors and documents are
contructed and associated, for which there's several bugs open already. 
Comment 37 Rafael Chaves CLA 2005-07-18 11:09:23 EDT
I just noticed the "target" element in the sample Maven file occurs deep in the
document element tree. wouldn't it be enough if the Ant content describer
checked only "target" elements occurring immediately under the "project"
element? Does it make sense for targets to occur at other levels?
Comment 38 Darin Swanson CLA 2005-07-18 12:43:20 EDT
yes and no.
Comment 39 Darin Swanson CLA 2005-07-18 12:50:44 EDT
We could consider for 3.1.1 tweaking the Ant content type describer so that 
only "top" targets are considered which is the correct form for Ant buildfiles.

With the global use of XML I really see no general win with content type 
describers especially since the uses are very generalized themselves (the 
structure of an Ant build file is basically impossible to discern 
conclusively).
Comment 40 David Williams CLA 2005-07-18 16:15:48 EDT
I just realized this was still marked "major" from when it *was* a major
conflict with just 'project'. I think the current state is less serious and is
just a matter of tweaking the describer to rule out obvious cases as they are
found. [Perhaps I should have opened a new bug!? But, will just change severity,
if that's acceptable.]

The "deep target" is one good one to rule out.

Also, as in this one, if schema or non-ant DTD is referenced in file that could
also "rule it out". I know ant can have a DTD (and will even generate it for
you!) ... is there a form of ant content that can have a schema?
Comment 41 Darin Swanson CLA 2005-07-18 16:23:40 EDT
Not a huge deal but in the future it would likely be better to log a new bug.

"basic" Ant buildfiles can have a dtd...but since the definition of a build 
file is extendible by the user, a general dtd can not exist.

Can you help me out...how would I determine if a dtd or schema is non-Ant?
Comment 42 David Williams CLA 2005-07-18 16:40:34 EDT
Well, I was hoping "non-ant" would be obvious, as in tha maven example in
comment #33 .. its had a relatively fixed, well known and documented "namespace"
that includes POM or maven ... and ant, if it had one, would have included "ant". 

But since ant does not have a standard, fixed dtd, much less a standard
namespace, there's not much you can do ... if a dtd used. Maybe you could detect
that a schema or namespace was being used (the "special" xmlns and xmlns:xsi
attributes of the project element in comment #33, and rule out those cases). In
the original report from the WTP client, the expectation was that the use of the
schema would be enough to get it recognized as "non-ant" (or .. as xml, ... or
as maven content type, eventually?). 
Comment 43 Darin Swanson CLA 2005-08-16 00:26:16 EDT
Released into the 3.2 stream for more testing.

Changes to AntHandler.
Comment 44 Karice McIntyre CLA 2005-08-29 16:31:08 EDT
Darin, is this bug being considered for release into 3.1.1?  If not, then the 
Target field should be changed.  Thanks.
Comment 45 Darin Swanson CLA 2005-08-29 16:35:45 EDT
The bug is tagged correctly...it will be delieved for 3.1.1
Comment 46 Darin Swanson CLA 2005-09-01 10:29:06 EDT
The AntHandler will now return indeterminate for a file which has a top level 
project element but only has a deeply nested target element (before it 
returned valid for this content).

Released to both 3.2 and to 3.1.1. 
Comment 47 Darin Swanson CLA 2005-09-01 10:29:44 EDT
Please verify DarinW.
Comment 48 Bob Foster CLA 2005-09-01 12:45:53 EDT
The problem statement and "solution" are just an ugly hack to work around a
fundamental problem in the text API. It should be possible for multiple editors
to detect their "own" content type without conflicting, as discussed in Bug 107682.
Comment 49 Missing name CLA 2005-10-03 18:16:01 EDT
Please refer to bug 111313
Comment 50 Stephen Lynch CLA 2005-10-20 06:54:45 EDT
(In reply to comment #46)
> The AntHandler will now return indeterminate for a file which has a top level 
> project element but only has a deeply nested target element (before it 
> returned valid for this content).
> 
> Released to both 3.2 and to 3.1.1. 

I have tested the following with 3.2 and 3.1.1 and there is still a problem wrt
xml files with a <project> tag in them.

If you create an xml document with <project></project> in it and save it as
maven.xml, eclipse uses the default txt editor for it which is correct.

If you edit it again and change the <project> to <project default="abc"> it then
thinks its an ant build file which in this case it isn't as maven uses this
structure too to specify the default goal to execute e.g.

<?xml version="1.0" encoding="UTF-8"?>
	<project default="buildAll" xmlns:m="jelly:maven">
    <goal name="buildAll">
    	<m:reactor basedir="${basedir}"
               includes="**/project.xml"
               goals="all"
               banner="Building applications"
               ignoreFailures="false"/>
               
    </goal>
</project>

Is it possible to stop using the ant editor for files matching maven.xml?
Comment 51 Rafael Chaves CLA 2005-10-20 08:27:07 EDT
The Ant content type definition does not say it wants maven.xml files, only
*.xml files. The problem is that there is no content type for Maven. If there
was one, it would work as you want. 

What you have to do in this case is to associate the maven.xml file name to the
XML content type (see Window > Preferences..., General > Content Type). Then,
Maven files will be treated as any other XML file.
Comment 52 Darin Swanson CLA 2005-10-20 11:12:04 EDT
You can also use the Ant editor to edit maven.xml files but turn off the error 
reporting.
Window>Preferences>Ant>Editor>Problems
Set the names of files that will not be checked for problems in the Ant editor.
Comment 53 Darin Swanson CLA 2005-10-26 17:19:15 EDT
*** Bug 113896 has been marked as a duplicate of this bug. ***
Comment 54 Darin Wright CLA 2006-01-18 15:11:50 EST
Verified.