| Summary: | [parser] Incorrect parsing of XML/HTML escape symbols after editing | ||||||||||||||
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Product: | [WebTools] WTP Source Editing | Reporter: | Yahor Radtsevich <yahorr> | ||||||||||||
| Component: | wst.xml | Assignee: | Salvador Zalapa <zalapa> | ||||||||||||
| Status: | RESOLVED FIXED | QA Contact: | Nitin Dahyabhai <thatnitind> | ||||||||||||
| Severity: | normal | ||||||||||||||
| Priority: | P3 | CC: | nsand.dev | ||||||||||||
| Version: | unspecified | Flags: | nsand.dev: review+ | ||||||||||||
| Target Milestone: | 3.4.2 | ||||||||||||||
| Hardware: | PC | ||||||||||||||
| OS: | Windows 7 | ||||||||||||||
| Whiteboard: | |||||||||||||||
| Attachments: |
|
||||||||||||||
|
Description
Yahor Radtsevich
Created attachment 203364 [details]
Several screenshots to reproduce
Screenshots demonstrating this bug are attached. I used the HTML Page Designer for illustration purposes, but the result is the same for any XML/HTML editor.
When the EntityName region is broken up, it becomes an XML_Content region. So, when the heuristics are triggered (in order to avoid reparsing the region), the XMLContentRegion.UpdateRegion() method assumes that if the change's length == 0 (whitespace deleted) it can handle the change by itself, just by updating the region indexes (without any parse action).
This initial patch adds a new heuristic to the XML_Content.UpdateRegion() method: if the content's length is between 4 and 10, it could be a potential EntityName region, so it should be parsed. The patch covers the reported scenario; however, I am still facing an issue when I try to join the following back together (because I get 2 regions here): &Aac ute;
I was wondering whether this patch is sufficient and adequate to cover this bug. I am still trying to figure out a solution for the other scenario.
Created attachment 217380 [details]
Initial Patch
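For readers following the thread, the length heuristic described for the initial patch could look roughly like the sketch below. It is only an illustration and not the code from the attachment: the class and method names (LengthHeuristicSketch, mightBeBrokenEntity) are hypothetical, and treating both bounds as inclusive is an assumption, since the comment only says "between 4 and 10".

```java
// Hypothetical sketch; the names here are illustrative and are not taken from the WTP sources.
final class LengthHeuristicSketch {
    // Bounds quoted in the comment; treating them as inclusive is an assumption.
    static final int MIN_LENGTH = 4;
    static final int MAX_LENGTH = 10;

    private LengthHeuristicSketch() {}

    /**
     * Returns true when the text of a content region is short enough that it
     * might be a broken-up entity reference, so a reparse should be requested
     * instead of only shifting the region offsets.
     */
    static boolean mightBeBrokenEntity(String regionText) {
        if (regionText == null) {
            return false;
        }
        int length = regionText.length();
        return length >= MIN_LENGTH && length <= MAX_LENGTH;
    }
}
```

With these bounds, a 9-character text such as "&Aac ute;" would trigger a reparse, while a long run of ordinary text would still take the cheap path of only updating the indexes. Any short piece of ordinary content also falls in the range, so the check is deliberately over-inclusive.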
Attaching a second version of the patch, which covers all the scenarios. As I said in comment#2, the proposed heuristic detects whether the length of the region is between 4 and 10; if so, it is a potential EntityName region, and a reparse is triggered. This new patch also handles the scenario: &Aac ute; For this I added an extra rule to the XMLTokenizer so that a broken-up entity region is handled as a single XML_Content region instead of two or more (so that just one region is cached and the reparse can be performed properly).
Created attachment 218036 [details]
Second proposed patch
Created attachment 218050 [details]
Second proposed patch (fixed)
In the last patch I forgot to delete some System.out.println statements.
Created attachment 218284 [details]
Patch (head version)
My mistake again: the last patch was made against an old WTP version; this one is against the head version.
Hi Chava, is there any way to accomplish this without depending on the length of the region being between 4 and 10 characters? Maybe something based on the text contents or region type.
https://github.com/zalapa/webtools.sourceediting/commit/425850e9d8f296d0c0ea156eba27e251f3c49386
Adding the new version; this one filters on text that starts with "#" and ends with ";".
Patch from the remote repository looks good. Thanks, Chava. |
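As a rough illustration of the content-based check mentioned in the last comments (deciding from the text itself rather than from its length), a sketch could look like the following. It is not the code from the linked commit: the class and method names are hypothetical, and the "#" and ";" delimiters are taken literally from the comment's wording and exposed as parameters so that a different prefix can be tried.

```java
// Hypothetical sketch; the names here are illustrative and are not taken from the linked commit.
final class ContentHeuristicSketch {
    private ContentHeuristicSketch() {}

    /**
     * Returns true when the region text starts with the given prefix and ends
     * with the given suffix, which marks it as a candidate for a full reparse.
     */
    static boolean looksLikeEntityText(String regionText, String prefix, String suffix) {
        if (regionText == null) {
            return false;
        }
        // The length guard keeps the prefix and suffix from matching the same characters.
        return regionText.length() >= prefix.length() + suffix.length()
                && regionText.startsWith(prefix)
                && regionText.endsWith(suffix);
    }

    /** Convenience overload using the delimiters quoted in the comment ("#" and ";"). */
    static boolean looksLikeEntityText(String regionText) {
        return looksLikeEntityText(regionText, "#", ";");
    }
}
```

Compared with a fixed length window, a check like this does not fire on short runs of ordinary text, which seems to be the concern raised in the review question.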