Bug 368068 - [epub] HTML parsers should use jsoup instead of SAX
Status: CLOSED MOVED
Alias: None
Product: z_Archived
Classification: Eclipse Foundation
Component: Mylyn
Version: unspecified
Hardware: All All
Importance: P3 enhancement
Target Milestone: ---
Assignee: Torkild Resheim CLA
QA Contact:
URL:
Whiteboard:
Keywords:
Depends on: 398103
Blocks:
Reported: 2012-01-06 18:00 EST by Torkild Resheim CLA
Modified: 2013-04-16 16:21 EDT (History)
Description Torkild Resheim CLA 2012-01-06 18:00:24 EST
Although HTML for EPUB is required to be well formed, this is not always the case. The part of the EPUB tooling that generates the table of contents and detects referenced resources fails if the HTML is not well formed. **jsoup** (http://jsoup.org/) could be used instead, as it is much better at handling bad HTML and has the additional benefit of being able to clean up the HTML. Options for enabling these features could be added to the EPUB generator to ensure that the final EPUB is correct.
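
To illustrate the failure mode described above, here is a minimal sketch that uses only the JDK's SAX parser. The class name `WellFormedCheck`, the helper `isWellFormed`, and the sample markup are hypothetical and are not taken from the EPUB tooling; the actual parsers in Mylyn Docs may be configured differently.

```java
import java.io.IOException;
import java.io.StringReader;
import javax.xml.parsers.ParserConfigurationException;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical helper, for illustration only (not part of the EPUB tooling).
class WellFormedCheck {

    // Returns true if the markup parses as well-formed XML with SAX,
    // false if the parser rejects it.
    static boolean isWellFormed(String html) {
        try {
            SAXParserFactory factory = SAXParserFactory.newInstance();
            factory.newSAXParser().parse(
                    new InputSource(new StringReader(html)),
                    new DefaultHandler());
            return true;
        } catch (SAXException e) {
            // Covers SAXParseException, raised on the first well-formedness error.
            return false;
        } catch (ParserConfigurationException | IOException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        // Well-formed XHTML: every element is closed.
        String good = "<html><body><h1>Chapter 1</h1><p>Text<br/></p></body></html>";
        // Typical "tag soup": <br> and <p> are never closed.
        String bad = "<html><body><h1>Chapter 1</h1><p>Text<br></body></html>";

        System.out.println("good: " + isWellFormed(good));
        System.out.println("bad:  " + isWellFormed(bad));
    }
}
```

SAX aborts at the first unclosed element, so nothing after that point (headings for the table of contents, references to images or stylesheets) can be collected. A lenient parser such as jsoup accepts the second input and still builds a document tree from it, which is the motivation for the change proposed here.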

See also bug 357294.
Comment 1 David Green CLA 2012-02-02 13:56:36 EST
Mylyn Docs is now free to use jsoup, based on the following CQ:

5978: jsoup Version: 1.6.1 (ATO CQ5559)
https://dev.eclipse.org/ipzilla/show_bug.cgi?id=5978

Also, jsoup has just been added to Orbit (available in the latest stable build: http://download.eclipse.org/tools/orbit/downloads/drops/S20120123151124/).
Comment 2 Torkild Resheim CLA 2012-05-11 03:12:26 EDT
Moving to new EPUB component.
Comment 3 Eclipse Webmaster CLA 2022-11-15 11:45:08 EST
Mylyn has been restructured, and our issue tracking has moved to GitHub [1].

We are closing ~14K Bugzilla issues to give the new team a fresh start. If you feel that this issue is still relevant, please create a new one on GitHub.

[1] https://github.com/orgs/eclipse-mylyn