Bug 370771

Summary: XSD2Ecore transformation does not support complex type restrictions for xs:any
Product: [Modeling] EMF
Reporter: Nicolas Rouquette <nicolas.f.rouquette>
Component: XSD
Assignee: Ed Merks <Ed.Merks>
Status: CLOSED FIXED
QA Contact:
Severity: normal
Priority: P3
CC: nicolas.f.rouquette
Version: unspecified   
Target Milestone: ---   
Hardware: Macintosh   
OS: Mac OS X - Carbon (unsup.)   
Whiteboard:
Attachments:
QualifiedTest1.xsd (flags: none)
UnqualifiedTest1.xsd (flags: none)
Same as UnqualifiedTest1.xsd except with elementFormDefault="qualified" (flags: none)

Description Nicolas Rouquette CLA 2012-02-06 16:15:00 EST
Build Identifier: M20110210-1200

I have reduced the problem to a pair of small schemas with small differences between them.

The desired schema looks like this:

  
   <xs:complexType abstract="true" ecore:mixed="true" name="IAbstractComponent">
      <xs:sequence>
         <xs:any maxOccurs="unbounded" minOccurs="0"/>
      </xs:sequence>
      <xs:attribute name="name" type="xs:string" use="required"/>
   </xs:complexType>

   <xs:complexType abstract="true" ecore:mixed="true" name="IComponent">
      <xs:complexContent>
         <xs:restriction base="ex:IAbstractComponent">
            <xs:sequence>
               <xs:element form="qualified" maxOccurs="unbounded" minOccurs="0" name="VariableElement" type="ex:IVariableIdentifier"/>
            </xs:sequence>
         </xs:restriction>
      </xs:complexContent>
   </xs:complexType>

The problem is that the generated Ecore class does not have an attribute: IComponent.variableElement.

If we remove qualification, then the XSD2Ecore transformation produces the desired attribute:

   <xs:complexType abstract="true" ecore:mixed="true" name="IAbstractComponent">
      <xs:sequence>
         <xs:any maxOccurs="unbounded" minOccurs="0"/>
      </xs:sequence>
      <xs:attribute name="name" type="xs:string" use="required"/>
   </xs:complexType>

   <xs:complexType abstract="true" ecore:mixed="true" name="IComponent">
      <xs:complexContent>
         <xs:restriction base="ex:IAbstractComponent">
            <xs:sequence>
               <xs:element maxOccurs="unbounded" minOccurs="0" name="VariableElement" type="ex:IVariableIdentifier"/>
            </xs:sequence>
         </xs:restriction>
      </xs:complexContent>
   </xs:complexType>

The difference between them is that the former yields a schema that supports robust validation of XML documents.
In the latter case, the validation is weaker because of the unqualified elements used in the restriction.




Reproducible: Always

Steps to Reproduce:
1. Load the qualified schema, QualifiedTest1.xsd
2. New > EMF Genmodel
3. The generated ecore file, qualified1.ecore, is missing an attribute: IComponent.variableElement

Compare this with:

1. Load the unqualified schema, UnqualifiedTest1.xsd
2. New > EMF Genmodel
3. The generated ecore file, unqualified1.ecore, does have an attribute: IComponent.variableElement
Comment 1 Nicolas Rouquette CLA 2012-02-06 16:16:27 EST
Created attachment 210619 [details]
QualifiedTest1.xsd

This is the intended schema that, unfortunately, does not work with the XSD2Ecore transformation
Comment 2 Nicolas Rouquette CLA 2012-02-06 16:20:22 EST
Created attachment 210621 [details]
UnqualifiedTest1.xsd

This is the schema that we currently have to write to get support from the XSD2Ecore transformation.

Unfortunately, this schema doesn't validate cleanly:

Engine name: Saxon-EE 9.3.0.5
Severity: warning
Description: There is no global element declaration named {VariableElement}, so the strict wildcard in the base type can never be satisfied
Start location: 30:0

Engine name: Saxon-EE 9.3.0.5
Severity: warning
Description: There is no global element declaration named {Source}, so the strict wildcard in the base type can never be satisfied
Start location: 46:0

More importantly, the unqualified schema poses compatibility problems with other tools that expect XML documents to be valid w.r.t. their XSD using qualified namespaces.
Comment 3 Ed Merks CLA 2012-02-07 06:35:06 EST
The logic for dealing with the features of a restriction intentionally ignores redundant features.

            boolean isRedundant = false;
            if (isRestriction)
            {
              isRedundant =
                extendedMetaData.getElement
                  (baseClass, xsdElementDeclaration.getTargetNamespace(), xsdElementDeclaration.getName()) != null;

              if (!isRedundant)
              {
                group =
                  extendedMetaData.getElementWildcardAffiliation
                    (baseClass, xsdElementDeclaration.getTargetNamespace(), xsdElementDeclaration.getName());
              }

But the element is only redundant because it's globally declared as well, rather than already being declared in the actual inheritance hierarchy.  So maybe we shouldn't consider such cases redundant.

The Saxon warning seems bogus though, because if the wildcard is allowed to resolve to local element declarations, then it should have noticed there was a derived type with elements that match.  Are your global element declarations there only to make SAXON happy?
Comment 4 Nicolas Rouquette CLA 2012-02-07 12:26:29 EST
(In reply to comment #3)
> The logic for dealing with the features of a restriction intentionally ignores
> redundant features.
> 
>             boolean isRedundant = false;
>             if (isRestriction)
>             {
>               isRedundant =
>                 extendedMetaData.getElement
>                   (baseClass, xsdElementDeclaration.getTargetNamespace(),
> xsdElementDeclaration.getName()) != null;
> 
>               if (!isRedundant)
>               {
>                 group =
>                   extendedMetaData.getElementWildcardAffiliation
>                     (baseClass, xsdElementDeclaration.getTargetNamespace(),
> xsdElementDeclaration.getName());
>               }
> 
> But the element is only redundant because it's globally declared as well,
> rather than already being declared in the actual inheritance hierarchy.  So
> maybe we shouldn't consider such cases redundant.

Yes!

This may be related to a bizarre behavior I've noticed when we turn on elementFormDefault="qualified" in the header.
With qualification by default, the XSD2Ecore transformation ignores all references to global element declarations.

Is there a way to prototype a fix for this?

I'd be happy to try it. 
 
> The Saxon warning seems bogus though, because if the wildcard is allowed to
> resolve to local element declarations, then it should have noticed there was a
> derived type with elements that match.  Are your global element declarations
> there only to make SAXON happy?

The global element declarations are there because they are referenced in several places in the schema.

- Nicolas.
Comment 5 Nicolas Rouquette CLA 2012-02-07 12:34:51 EST
Created attachment 210667 [details]
Same as UnqualifiedTest1.xsd except with elementFormDefault="qualified"

With global element qualification by default, XSD2Ecore fails to generate the following attributes:

IAbstractComponent.source
IAbstractComponent.variableElement

IComponent.variableElement

- Nicolas.
Comment 6 Ed Merks CLA 2012-02-07 12:47:17 EST
The guard is there only for restrictions, and in general EMF doesn't do much with restrictions because they don't correspond to anything that maps in a meaningful way to something in Ecore or in Java. So I don't see this as something that causes a general problem.

Why do you define this locally when you have a global element you can reference?


   <xs:element form="qualified" maxOccurs="unbounded" minOccurs="0" name="VariableElement" type="ex:IVariableIdentifier"/>
 

Note that whether EMF generates the features (and accessors in Java) for the restricted class's elements or attributes doesn't really affect what content you'll be able to parse.  In any case, the "elements" will appear in the wildcard and will be accessible the same way as anything else allowed by the wildcard.  Generated accessors will be at best a convenience in place of getAny().list(XyzPackage.Literals.DOCUMENT_ROOT__VARIABLE_ELEMENT) and nothing will prevent clients from adding things to your wildcard that your restriction is trying to exclude.

Hopefully I can find time in the coming weeks to take a closer look.
Comment 7 Ed Merks CLA 2012-02-07 12:49:52 EST
IAbstractComponent doesn't have the features you say are missing.


   <xs:complexType abstract="true" ecore:mixed="true" name="IAbstractComponent">
      <xs:sequence>
         <xs:any maxOccurs="unbounded" minOccurs="0"/>
      </xs:sequence>
      <xs:attribute name="name" type="xs:string" use="required"/>
   </xs:complexType>
Comment 8 Nicolas Rouquette CLA 2012-02-07 15:43:42 EST
(In reply to comment #7)
> IAbstractComponent doesn't have the features you say are missing.
> 
> 
>    <xs:complexType abstract="true" ecore:mixed="true"
> name="IAbstractComponent">
>       <xs:sequence>
>          <xs:any maxOccurs="unbounded" minOccurs="0"/>
>       </xs:sequence>
>       <xs:attribute name="name" type="xs:string" use="required"/>
>    </xs:complexType>

Sorry, what's missing are:

Analysis.source
Analysis.variableElement

IComponent.variableElement

- Nicolas.
Comment 9 Nicolas Rouquette CLA 2012-02-07 15:59:39 EST
(In reply to comment #6)
> The guard is there only for restrictions and in general EMF doesn't do much
> with restrictions because they don't correspond to anything that maps in a
> meaningful way to something in Ecore or in Java. So I don't see this as
> something that causes a general problem.

There is a subtlety here.

In the qualified version, Analysis & IComponent will both accept nested ex:VariableElement elements of type ex:IVariableIdentifier.

However, Analysis allows a nested ex:Source element that must be before any ex:VariableElement. That is, the restriction specifies an important ordering constraint amongst the possible nested elements. 

> Why do you define this locally when you have a global element you can
> reference?
>
>    <xs:element form="qualified" maxOccurs="unbounded" minOccurs="0"
> name="VariableElement" type="ex:IVariableIdentifier"/>

This is a simplified version of a third-party schema where the authors
like this idiom because it enables them to change the definition of the element (ex:VariableElement) in one place instead of everywhere the type would be referenced (ex:IVariableIdentifier).

> Note that whether EMF generates the features (and accessors in Java) for the
> restricted class's elements or attributes doesn't really affect what content
> you'll be able to parse. 

The problem is in the other direction; i.e., we create an Ecore model whose metamodel is the XSD2Ecore-generated metamodel. Then we serialize this Ecore model in XML, not XMI.

Since the XSD2Ecore transformation ignores the fact that the elements are in a sequence in the complex type, it cannot guarantee valid-by-transformation for the XML documents.

With the unqualified schema, we can easily produce the following invalid XML document:

<ex:Analysis ...>
    <VariableElement .../>
    <Source .../>
    <VariableElement .../>
</ex:Analysis>

If we had support for qualification, then we would get:

<ex:Analysis ...>
    <ex:VariableElement .../>
    <ex:Source .../>
    <ex:VariableElement .../>
</ex:Analysis>

In either case, such XML documents aren't valid w.r.t. the XSD because the XSD2Ecore transformation didn't preserve the ordering constraints implied by the complex type extension/restriction with sequencing.
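
The ordering constraint discussed here can be checked with a schema validator. The following is a minimal sketch using only the JDK's javax.xml.validation API. The schema in it is a hypothetical, simplified stand-in for ex:Analysis (a plain sequence of Source followed by VariableElement, with an assumed namespace http://example.org/ex); it is not the attached schema and omits the wildcard base type and the restriction.

```java
import java.io.IOException;
import java.io.StringReader;
import javax.xml.XMLConstants;
import javax.xml.transform.stream.StreamSource;
import javax.xml.validation.Schema;
import javax.xml.validation.SchemaFactory;
import javax.xml.validation.Validator;
import org.xml.sax.SAXException;

class SequenceOrderCheck {
    // Hypothetical simplified schema: one Source, then any number of VariableElement.
    static final String XSD =
        "<xs:schema xmlns:xs='http://www.w3.org/2001/XMLSchema'"
        + " targetNamespace='http://example.org/ex' elementFormDefault='qualified'>"
        + "<xs:element name='Analysis'><xs:complexType><xs:sequence>"
        + "<xs:element name='Source' type='xs:string'/>"
        + "<xs:element name='VariableElement' type='xs:string'"
        + " minOccurs='0' maxOccurs='unbounded'/>"
        + "</xs:sequence></xs:complexType></xs:element>"
        + "</xs:schema>";

    // Source first, then VariableElement: matches the sequence.
    static final String ORDERED =
        "<ex:Analysis xmlns:ex='http://example.org/ex'>"
        + "<ex:Source/><ex:VariableElement/><ex:VariableElement/>"
        + "</ex:Analysis>";

    // A VariableElement before Source: violates the sequence.
    static final String OUT_OF_ORDER =
        "<ex:Analysis xmlns:ex='http://example.org/ex'>"
        + "<ex:VariableElement/><ex:Source/><ex:VariableElement/>"
        + "</ex:Analysis>";

    /** Returns true if the document validates against XSD, false if validation reports an error. */
    static boolean isValid(String xml) {
        try {
            Schema schema = SchemaFactory
                .newInstance(XMLConstants.W3C_XML_SCHEMA_NS_URI)
                .newSchema(new StreamSource(new StringReader(XSD)));
            Validator validator = schema.newValidator();
            try {
                // With no ErrorHandler set, validation errors are thrown as SAXException.
                validator.validate(new StreamSource(new StringReader(xml)));
                return true;
            } catch (SAXException invalid) {
                return false;
            }
        } catch (SAXException | IOException e) {
            throw new IllegalStateException("schema could not be compiled or input could not be read", e);
        }
    }

    public static void main(String[] args) {
        System.out.println("ordered valid: " + isValid(ORDERED));
        System.out.println("out-of-order valid: " + isValid(OUT_OF_ORDER));
    }
}
```

A check of this kind could be run on serialized output to catch documents whose element order the generated model did not enforce.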

> In any case, the "elements" will appear in the
> wildcard and will be accessible the same way as anything else allowed by the
> wildcard.  Generated accessors will be at best a convenience in place of
> getAny().list(XyzPackage.Literals.DOCUMENT_ROOT__VARIABLE_ELEMENT) and nothing
> will prevent clients from adding things to your wildcard that your restriction
> is trying to exclude.

Yes except that in this case, ex:Analysis is really saying:

- there can be arbitrary nested elements, including ex:Source & ex:VariableElement
- there must be one nested ex:Source element
- any nested ex:VariableElement must be after the ex:Source element

> Hopefully I can find time in the coming weeks to take a closer look.

ok.

- Nicolas.
Comment 10 Ed Merks CLA 2012-02-07 16:07:07 EST
As I said though, little to nothing about the restriction is enforced.  Even with the features in the restricted class, they still delegate to the feature map and will not enforce a specific order as a result of that.
Comment 11 Nicolas Rouquette CLA 2012-02-07 16:44:29 EST
For the restriction, I can live without the ordering constraint.

The problem for us is the fact that we don't get the generated features when we use qualification (QualifiedTest1.xsd or QualifiedTest3.xsd).
This is particularly problematic for model transformation purposes because the Ecore metamodel does not expose EFeatureMap.Entry as a metaclass;
it's only an API class.

This means that there is no model-level equivalent for what you described at the API level:

getAny().list(XyzPackage.Literals.DOCUMENT_ROOT__VARIABLE_ELEMENT) 

To see this, open the Metamodel Explorer view and look at what is available about the ecore metamodel.
You'll see that EFeatureMap and EFeatureMapEntry are opaque types.

This means that whenever we get an EClass with a feature map-based implementation, it is very important to have the operations to access known features; e.g.:

Analysis.source
Analysis.variableElement

IComponent.variableElement


- Nicolas.
Comment 12 Ed Merks CLA 2012-04-05 00:17:37 EDT
I changed the logic to this:

EStructuralFeature element = extendedMetaData.getElement(baseClass, xsdElementDeclaration.getTargetNamespace(), xsdElementDeclaration.getName());
isRedundant = element != null && !extendedMetaData.isDocumentRoot(element.getEContainingClass());

As such, a feature is redundant only if it appears in the actual class hierarchy, not if it's present as a global element, i.e., not if it is in a document root.
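
The effect of that change can be illustrated with a toy model. This is only a sketch of the decision described above, not the EMF implementation: the Feature class and both method names are hypothetical stand-ins, and the `found` argument plays the role of the result of extendedMetaData.getElement (null when nothing matches).

```java
/** Toy model of the redundancy decision; these types are hypothetical, not EMF classes. */
class RedundancyCheckSketch {

    /** Stand-in for a looked-up feature and where it is declared. */
    static final class Feature {
        final String name;
        // true when the containing class is a document root, i.e. a global element
        final boolean inDocumentRoot;

        Feature(String name, boolean inDocumentRoot) {
            this.name = name;
            this.inDocumentRoot = inDocumentRoot;
        }
    }

    /** Old rule: any match at all makes the restriction's element redundant. */
    static boolean isRedundantBefore(Feature found) {
        return found != null;
    }

    /** New rule: a match counts only if it is not a global element in a document root. */
    static boolean isRedundantAfter(Feature found) {
        return found != null && !found.inDocumentRoot;
    }

    public static void main(String[] args) {
        Feature global = new Feature("VariableElement", true);
        Feature inherited = new Feature("VariableElement", false);

        // A globally declared element no longer suppresses the generated feature.
        System.out.println("global, before: " + isRedundantBefore(global));
        System.out.println("global, after: " + isRedundantAfter(global));
        // An element declared in the actual class hierarchy is still redundant.
        System.out.println("inherited, after: " + isRedundantAfter(inherited));
    }
}
```

Under this model, only the global-element case changes outcome, which matches the symptom reported for IComponent.variableElement.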
Comment 13 Ed Merks CLA 2012-04-05 00:38:02 EDT
The changes are committed to git for 2.8.
Comment 14 Ed Merks CLA 2012-06-11 00:58:10 EDT
The changes are available in recent builds and will be part of the Juno release.