Some Eclipse Foundation services are deprecated, or will be soon. Please ensure you've read this important communication.
Bug 338494 - [xpath2] prohibiting xpath expressions starting with / or // to be parsed
Summary: [xpath2] prohibiting xpath expressions starting with / or // to be parsed
Status: RESOLVED FIXED
Alias: None
Product: WTP Source Editing
Classification: WebTools
Component: wst.xpath (show other bugs)
Version: unspecified   Edit
Hardware: All All
: P3 enhancement (vote)
Target Milestone: ---   Edit
Assignee: Project Inbox CLA
QA Contact: Jesper Moller CLA
URL:
Whiteboard:
Keywords:
Depends on:
Blocks:
 
Reported: 2011-02-28 19:48 EST by Mukul Gandhi CLA
Modified: 2013-01-22 17:19 EST (History)
2 users (show)

See Also:


Attachments

Note You need to log in before you can comment on or make changes to this bug.
Description Mukul Gandhi CLA 2011-02-28 19:48:20 EST
Build Identifier: I20090313-0100

XML Schema 1.1 assertions requires that the assert XDM tree must be rooted at a parentless element (this was recently clarified on XML Schema WG comments list). This means that schema 1.1 assertions XPath expressions starting with / or // must not be allowed (for e.g /x, //x, x[/a] etc) while writing schema 1.1 assertions. I'm providing an implementation of this feature and would be committing it asap with a test case. I made sure that this addition (via a new constructor in JFlexCupParser and a new class) to code base doesn't fail any existing stuff.

Reproducible: Always
Comment 1 Mukul Gandhi CLA 2011-02-28 20:06:58 EST
i've committed the changes (along with few test cases) for this bug report both on HEAD location and the service branch. i'm marking this bug resolved.
Comment 2 Mukul Gandhi CLA 2011-02-28 22:30:30 EST
a slight correction.
(In reply to comment #0)

> via a new constructor in JFlexCupParser
the constructor is still same in JFlexCupParser, but i've added an overloaded "parse" method in JFlexCupParser.

Thanks.
Comment 3 David Carver CLA 2011-03-01 17:56:51 EST
(In reply to comment #2)
> a slight correction.
> (In reply to comment #0)
> 
> > via a new constructor in JFlexCupParser
> the constructor is still same in JFlexCupParser, but i've added an overloaded
> "parse" method in JFlexCupParser.
> 
> Thanks.

One thing to note here, there may not be another maintenance release for WTP.  3.2.3 was probably the last one, and that just happened recently.
Comment 4 Jesper Moller CLA 2011-03-01 19:12:39 EST
There are a number of other "things" that the host language may define, e.g. which axes are available.
We should add API for that as well.
Comment 5 Jesper Moller CLA 2011-04-16 19:39:10 EDT
Mukul, I'm looking at the implementation right now, and I've noticed two concerns:

1) Are you sure you are covering all the problematic issues this way - what about the different axis, the parent (..) step, fn:root, fn:doc function, and many other constructs?
The new API has the required ways to fix this, please see org.eclipse.wst.xml.xpath2.api.XPath2Expression. The idea is to allow parsing of the offending expression, but allow the caller to inspect certain aspects of the expression (without using the internal classes).

2) Hard-coding the grammar production numbers into Java is not a good idea, since they may change on grammar changes. In the interest of backwards compatibility, I'm changing the implementation to fix this.
Comment 6 Mukul Gandhi CLA 2011-04-17 10:25:10 EDT
(In reply to comment #5)
> 1) Are you sure you are covering all the problematic issues this way - what
> about the different axis, the parent (..) step, fn:root, fn:doc function, and
> many other constructs?

Can you please elaborate, how the above cases relate to the use case I was trying to solve with this fix?

> The new API has the required ways to fix this, please see
> org.eclipse.wst.xml.xpath2.api.XPath2Expression.

I'm not sure how this new interface and APIs exposed in it will help with this use case. Can you please elaborate?
It seems the following methods in interface XPath2Expression (and it's implementation with org.eclipse.wst.xml.xpath2.processor.ast.XPath)

getFreeVariables()
getResolvedFunctions()
getAxes()

are still unimplemented. Do you intend to provide some kind of implementation in these?

> The idea is to allow parsing
> of the offending expression, but allow the caller to inspect certain aspects > of the expression (without using the internal classes).

I think that's a good idea.

> 2) Hard-coding the grammar production numbers into Java is not a good idea,
> since they may change on grammar changes. In the interest of backwards
> compatibility, I'm changing the implementation to fix this.

I agree. If you're doing an improvement for this, kindly ensure that the existing functionality provided in this fix is retained with your modifications related to the changes, you would be doing (since this fix is currently used by Xerces).

Ideally I was looking for something like this in PsychoPath implementation,
i.e implementation of following dynamic context error (quoted from XPath 2.0 spec) with the supporting XDM processing model.
XPDY0050 : It is a dynamic error if the dynamic type of the operand of a treat expression does not match the sequence type specified by the treat expression. This error might also be raised by a path expression beginning with "/" or "//" if the context node is not in a tree that is rooted at a document node. This is because a leading "/" or "//" in a path expression is an abbreviation for an initial step that includes the clause treat as document-node().

This is currently unimplemented in PsychoPath, which led me to write this fix. I would be inclined to revert this current fix, provided we have an implementation of the error code XPDY0050 with corresponding processing model implementation.

Thanks.
Comment 7 Jesper Moller CLA 2011-04-17 13:49:12 EDT
(In reply to comment #6)
> (In reply to comment #5)
> > 1) Are you sure you are covering all the problematic issues this way - what
> > about the different axis, the parent (..) step, fn:root, fn:doc function, and
> > many other constructs?
> 
> Can you please elaborate, how the above cases relate to the use case I was
> trying to solve with this fix?

As I've read the XML Schema 1.1 spec, it is not allowed to look outside of the subtree which is handled in an assert. / and // can cause this, but so can .. and a number of axis traversals.
So I tried to fix this in the XPath2Expression interface.

> > The new API has the required ways to fix this, please see
> > org.eclipse.wst.xml.xpath2.api.XPath2Expression.
> 
> I'm not sure how this new interface and APIs exposed in it will help with this
> use case. Can you please elaborate?
> It seems the following methods in interface XPath2Expression (and it's
> implementation with org.eclipse.wst.xml.xpath2.processor.ast.XPath)
> 
> getFreeVariables()
> getResolvedFunctions()
> getAxes()
> 
> are still unimplemented. Do you intend to provide some kind of implementation
> in these?

Whoops. I left the implementation of these on a workspace and forgot the commit. I'll fix this ASAP!

> > The idea is to allow parsing
> > of the offending expression, but allow the caller to inspect certain aspects > of the expression (without using the internal classes).
> 
> I think that's a good idea.

Thought you would!

> > 2) Hard-coding the grammar production numbers into Java is not a good idea,
> > since they may change on grammar changes. In the interest of backwards
> > compatibility, I'm changing the implementation to fix this.
> 
> I agree. If you're doing an improvement for this, kindly ensure that the
> existing functionality provided in this fix is retained with your modifications
> related to the changes, you would be doing (since this fix is currently used by
> Xerces).
> 
> Ideally I was looking for something like this in PsychoPath implementation,
> i.e implementation of following dynamic context error (quoted from XPath 2.0
> spec) with the supporting XDM processing model.
> XPDY0050 : It is a dynamic error if the dynamic type of the operand of a treat
> expression does not match the sequence type specified by the treat expression.
> This error might also be raised by a path expression beginning with "/" or "//"
> if the context node is not in a tree that is rooted at a document node. This is
> because a leading "/" or "//" in a path expression is an abbreviation for an
> initial step that includes the clause treat as document-node().

Ok, I'll fix this.
 
> This is currently unimplemented in PsychoPath, which led me to write this fix.
> I would be inclined to revert this current fix, provided we have an
> implementation of the error code XPDY0050 with corresponding processing model
> implementation.

Cool. I'll leave this is a backwards compatible manner, but I'm hoping you'll move to the new API for Xerces as soon as possible, as per our previous discussions.
Comment 8 Jesper Moller CLA 2011-04-17 14:02:29 EDT
(In reply to comment #7)
> (In reply to comment #6)
> > Ideally I was looking for something like this in PsychoPath implementation,
> > i.e implementation of following dynamic context error (quoted from XPath 2.0
> > spec) with the supporting XDM processing model.
> > XPDY0050 : It is a dynamic error if the dynamic type of the operand of a treat
> > expression does not match the sequence type specified by the treat expression.
> > This error might also be raised by a path expression beginning with "/" or "//"
> > if the context node is not in a tree that is rooted at a document node. This is
> > because a leading "/" or "//" in a path expression is an abbreviation for an
> > initial step that includes the clause treat as document-node().
> 
> Ok, I'll fix this.

In fact, this is what DynamicContext.getLimitNode is also for. It signifies that the expression is limited to this node, rather than a document node. This functionality will be committed before this weeks coming I-build.

For instance, when validating an assertion on a document, you can set the LimitNode to the element being validated, and the XPath2 engine will not allow reaching nodes above this.
Comment 9 Jesper Moller CLA 2013-01-22 17:19:42 EST
I just re-fixed the original implementation (on master) since the original implementation relied on disallowing certain grammar reductions, mentioned by number (i.e. magic constants).
Will be released shortly.