| Summary: | HtmlStreamTokenizer.unescape..() don't properly handle entities | ||||||||||
|---|---|---|---|---|---|---|---|---|---|---|---|
| Product: | z_Archived | Reporter: | Eugene Kuleshov <ekuleshov> | ||||||||
| Component: | Mylyn | Assignee: | Steffen Pingel <steffen.pingel> | ||||||||
| Status: | RESOLVED FIXED | QA Contact: | |||||||||
| Severity: | normal | ||||||||||
| Priority: | P3 | CC: | robert.elves, shawn.minto, steffen.pingel | ||||||||
| Version: | unspecified | ||||||||||
| Target Milestone: | 2.3 | ||||||||||
| Hardware: | PC | ||||||||||
| OS: | All | ||||||||||
| Whiteboard: | |||||||||||
| Attachments: |
Description
Eugene Kuleshov
Created attachment 81991 [details]
test case showing the issue
Here is a simple test case showing the issue.
Created attachment 81992 [details]
mylyn/context/zip
The easiest way to fix this is probably to use org.apache.commons.lang.StringEscapeUtils.unescapeHtml() from commons-lang.

Also, HtmlStreamTokenizer has not been updated in 5 years and recognizes ~114 entity names. StringEscapeUtils (Entities) was changed 3 months ago and currently recognizes ~250 entity names.

Eugene, George: is Commons Lang really the best library for unescaping HTML? That functionality seems a bit misplaced in Lang, so I wonder if we can get it from another library that we're already approved for. Steffen: let me know if you're familiar with anything. Shawn: it's interesting that this class of yours has not been changed for 5 years! Probably time for us to move on ;) Eugene: thanks for the test case, that's helpful.

(In reply to comment #6)
> Eugene: thanks for the test case, that's helpful.
Thank George. That is his testcase from bug 208073.

Steffen: if this isn't already supported by our addition of commons-lang, consider for 2.3.

Thanks Eugene. I have deprecated the escaping methods in HtmlStreamTokenizer. From now on, StringEscapeUtils from the commons lang library should be used instead. Rob, I'll leave the cleanup of the Bugzilla deprecation warnings to you.

Created attachment 86742 [details]
mylyn/context/zip
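
For readers without the attachments, here is a minimal, self-contained sketch of what handling entities properly involves: named entities, decimal and hexadecimal numeric references, and leaving unknown or malformed sequences untouched. It is not the HtmlStreamTokenizer code and not the Commons Lang implementation. The class name `EntityUnescapeSketch`, the method `unescape`, and the six-entry table are hypothetical and chosen only for illustration. Real code should call `StringEscapeUtils.unescapeHtml(String)`, as discussed above.

```java
import java.util.HashMap;
import java.util.Map;

/** Illustrative sketch only; not the Mylyn or Commons Lang implementation. */
final class EntityUnescapeSketch {

    // Deliberately tiny table (an assumption for this example);
    // real libraries recognize a few hundred entity names.
    private static final Map<String, Integer> NAMED = new HashMap<>();
    static {
        NAMED.put("amp", (int) '&');
        NAMED.put("lt", (int) '<');
        NAMED.put("gt", (int) '>');
        NAMED.put("quot", (int) '"');
        NAMED.put("nbsp", 0x00A0);
        NAMED.put("copy", 0x00A9);
    }

    /** Replaces recognized entities in a single pass; anything else is copied verbatim. */
    static String unescape(String text) {
        StringBuilder out = new StringBuilder(text.length());
        int i = 0;
        while (i < text.length()) {
            char c = text.charAt(i);
            // Only look for a terminating ';' when we are at an '&'.
            int semi = (c == '&') ? text.indexOf(';', i + 1) : -1;
            Integer codePoint = (semi > i + 1) ? resolve(text.substring(i + 1, semi)) : null;
            if (codePoint == null) {
                out.append(c);      // not an entity we know: keep the character
                i++;
            } else {
                out.appendCodePoint(codePoint);
                i = semi + 1;       // skip past the whole "&...;" sequence
            }
        }
        return out.toString();
    }

    /** Maps the text between '&' and ';' to a code point, or null if unrecognized. */
    private static Integer resolve(String body) {
        try {
            int value;
            if (body.matches("#[0-9]+")) {
                value = Integer.parseInt(body.substring(1), 10);
            } else if (body.matches("#[xX][0-9a-fA-F]+")) {
                value = Integer.parseInt(body.substring(2), 16);
            } else {
                return NAMED.get(body);
            }
            return Character.isValidCodePoint(value) ? value : null;
        } catch (NumberFormatException e) {
            return null; // number too large for an int
        }
    }
}
```

With this sketch, `EntityUnescapeSketch.unescape("&lt;p&gt; &#65;&#x42;")` yields `<p> AB`, while input such as `&unknown;` or a bare `&` passes through unchanged. The input is decoded in one pass, so `&amp;amp;` becomes `&amp;` rather than `&`.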