| Summary: | DOT grammar does not accept unicode characters in string ID | ||
|---|---|---|---|
| Product: | [Tools] GEF | Reporter: | Alexander Nyßen <nyssen> |
| Component: | GEF DOT | Assignee: | Alexander Nyßen <nyssen> |
| Status: | RESOLVED FIXED | QA Contact: | |
| Severity: | normal | ||
| Priority: | P3 | ||
| Version: | 0.2.0 | ||
| Target Milestone: | 4.0.0 (Neon) M7 | ||
| Hardware: | All | ||
| OS: | All | ||
| Whiteboard: | |||
I pushed the following changes to origin/master:
- Changed the grammar rule to accept characters from the Unicode range \u0080 to \u00FF within IDs. This corresponds to octal 200-377, which is the range specified in the DOT language definition.
- Adjusted sample_input.dot to contain a special character in an unquoted ID.

Resolving as fixed in 4.0.0 M7.
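The correspondence between the octal range and the Unicode escapes mentioned in this comment can be checked with a few lines of Python. This is an illustration only, not part of the GEF code base; the names `low` and `high` are made up for the example.

```python
# Octal 200-377, the range given in the DOT language definition.
low, high = 0o200, 0o377

# The same bounds written as the Unicode escapes named in the comment.
assert chr(low) == "\u0080"
assert chr(high) == "\u00FF"

print(hex(low), hex(high))  # 0x80 0xff
```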
The DOT language definition specifies that an ID may be "Any string of alphabetic ([a-zA-Z\200-\377]) characters, underscores ('_') or digits ([0-9]), not beginning with a digit;". Up to now, our DOT grammar only accepts the following:

terminal STRING: ('a'..'z' | 'A'..'Z'| '_') ('a'..'z' | 'A'..'Z' '_' | '0'..'9')*;

We need to adjust the grammar to accept everything that DOT allows.
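To make the gap concrete, here is a minimal Python sketch that compares an ASCII-only pattern with one extended by the range \u0080 to \u00FF. It is not the Xtext grammar itself: `ASCII_ID`, `EXTENDED_ID`, and `is_dot_id` are hypothetical names, and the ASCII pattern is only an approximation of what the terminal rule above is meant to accept.

```python
import re

# Approximation of the ASCII-only rule: a letter or underscore first,
# then letters, underscores, or digits.
ASCII_ID = re.compile(r"[a-zA-Z_][a-zA-Z_0-9]*")

# Same shape, with the range \u0080-\u00FF (octal 200-377) added to the
# alphabetic characters, as the DOT definition allows.
EXTENDED_ID = re.compile(
    r"[a-zA-Z\u0080-\u00FF_][a-zA-Z\u0080-\u00FF_0-9]*"
)


def is_dot_id(text, pattern=EXTENDED_ID):
    """Return True if the whole string matches the given ID pattern."""
    return pattern.fullmatch(text) is not None


for candidate in ["node_1", "Nyßen", "1abc"]:
    print(candidate, is_dot_id(candidate, ASCII_ID), is_dot_id(candidate))
```

An ID such as `Nyßen` is rejected by the ASCII-only pattern and accepted by the extended one, while an ID starting with a digit is rejected by both.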