http://weblogs.java.net/blog/kohsuke/archive/2006/03/canonicalizatio.html
http://www.w3.org/TR/xml-c14n

* The document is encoded in UTF-8
* Line breaks normalized to #xA on input, before parsing
* Attribute values are normalized, as if by a validating processor
* Character and parsed entity references are replaced
* CDATA sections are replaced with their character content
* The XML declaration and document type declaration (DTD) are removed
* Empty elements are converted to start-end tag pairs
* Whitespace outside of the document element and within start and end tags is normalized
* All whitespace in character content is retained (excluding characters removed during line feed normalization)
* Attribute value delimiters are set to quotation marks (double quotes)
* Special characters in attribute values and character content are replaced by character references
* Superfluous namespace declarations are removed from each element
* Default attributes are added to each element
* Lexicographic order is imposed on the namespace declarations and attributes of each element
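The last rule in the list above (lexicographic order of namespace declarations and attributes) is the one that matters most when comparing marshalled output. Below is a minimal sketch of that ordering using only the JDK DOM API. The class and method names (`C14nAttributeOrder`, `orderedAttributeNames`, `parseRoot`) are hypothetical and exist only for illustration; this is not a canonicalizer and it covers only this one rule. It assumes a namespace-aware parse and uses `String.compareTo`, which matches code point order only for characters in the Basic Multilingual Plane.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Attr;
import org.w3c.dom.Element;
import org.w3c.dom.NamedNodeMap;

public class C14nAttributeOrder {

    // A namespace declaration is an attribute named "xmlns" or "xmlns:prefix".
    static boolean isNamespaceDecl(Attr a) {
        return a.getName().equals("xmlns") || a.getName().startsWith("xmlns:");
    }

    // Sort key for a namespace declaration: the declared prefix,
    // with the empty string standing for the default namespace.
    static String declaredPrefix(Attr a) {
        return a.getName().equals("xmlns") ? "" : a.getName().substring("xmlns:".length());
    }

    // Unqualified attributes have no namespace URI; treat that as the empty string.
    static String namespaceUri(Attr a) {
        return a.getNamespaceURI() == null ? "" : a.getNamespaceURI();
    }

    // Namespace declarations come first (ordered by prefix), then attributes
    // ordered by namespace URI (primary key) and local name (secondary key).
    static final Comparator<Attr> C14N_ORDER = (a, b) -> {
        boolean aNs = isNamespaceDecl(a);
        boolean bNs = isNamespaceDecl(b);
        if (aNs != bNs) {
            return aNs ? -1 : 1;
        }
        if (aNs) {
            return declaredPrefix(a).compareTo(declaredPrefix(b));
        }
        int byUri = namespaceUri(a).compareTo(namespaceUri(b));
        return byUri != 0 ? byUri : a.getLocalName().compareTo(b.getLocalName());
    };

    // Returns the qualified attribute names of an element in the order sketched above.
    static List<String> orderedAttributeNames(Element e) {
        NamedNodeMap map = e.getAttributes();
        List<Attr> attrs = new ArrayList<>();
        for (int i = 0; i < map.getLength(); i++) {
            attrs.add((Attr) map.item(i));
        }
        attrs.sort(C14N_ORDER);
        List<String> names = new ArrayList<>();
        for (Attr a : attrs) {
            names.add(a.getName());
        }
        return names;
    }

    // Parses a string with a namespace-aware parser and returns the document element.
    static Element parseRoot(String xml) throws Exception {
        DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
        factory.setNamespaceAware(true);
        return factory.newDocumentBuilder()
                .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)))
                .getDocumentElement();
    }

    public static void main(String[] args) throws Exception {
        Element root = parseRoot(
                "<e xmlns:b='urn:b' xmlns:a='urn:a' b:x='1' z='2' a:y='3' xmlns='urn:d'/>");
        System.out.println(orderedAttributeNames(root));
    }
}
```

Note that the unqualified attribute `z` sorts ahead of the prefixed ones because its namespace URI is empty, and that `a:y` precedes `b:x` because of the URIs bound to the prefixes, not the prefixes themselves.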
Using Apache security canonicalizer:

    // Marshal the object into an empty DOM document
    DocumentBuilder db = DocumentBuilderFactory.newInstance().newDocumentBuilder();
    Document doc = db.newDocument();
    marshaller.marshal(obj, doc);
    doc.normalize();

    // Initialize the Apache XML Security library before using Canonicalizer
    Init.init();
    byte[] c14nOutputbytes = Canonicalizer.getInstance(
            Canonicalizer.ALGO_ID_C14N_WITH_COMMENTS)
        .canonicalizeSubtree(doc.getDocumentElement());

    // Re-parse to get attributes in alpha order
    Document canonical = db.parse(new ByteArrayInputStream(c14nOutputbytes));
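For reference, the `doc.normalize()` step in the snippet above merges adjacent text nodes and drops empty ones, so the tree handed to the canonicalizer looks like one that came from a parser. A minimal JDK-only sketch of that effect follows; the class and method names (`NormalizeDemo`, `childCountsBeforeAndAfter`) are hypothetical, and no JAXB marshaller or Apache library is involved.

```java
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;

public class NormalizeDemo {

    // Builds an element with three text-node children (one of them empty)
    // and returns the child count before and after Document.normalize().
    static int[] childCountsBeforeAndAfter() throws Exception {
        Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder().newDocument();
        Element root = doc.createElement("root");
        doc.appendChild(root);
        root.appendChild(doc.createTextNode("Hello, "));
        root.appendChild(doc.createTextNode("world"));
        root.appendChild(doc.createTextNode(""));

        int before = root.getChildNodes().getLength();
        doc.normalize(); // merges adjacent text nodes, removes empty ones
        int after = root.getChildNodes().getLength();
        return new int[] { before, after };
    }

    public static void main(String[] args) throws Exception {
        int[] counts = childCountsBeforeAndAfter();
        System.out.println("before=" + counts[0] + ", after=" + counts[1]);
    }
}
```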
Created attachment 216127: Test case
The EclipseLink project has moved to GitHub: https://github.com/eclipse-ee4j/eclipselink