| Summary: | [generator] JavaIoFileSystemAccess always uses default char encoding | ||
|---|---|---|---|
| Product: | [Modeling] TMF | Reporter: | Jan Koehnlein <jan> |
| Component: | Xtext | Assignee: | Jan Koehnlein <jan> |
| Status: | CLOSED FIXED | QA Contact: | |
| Severity: | normal | ||
| Priority: | P3 | CC: | sebastian.zarnekow, stephan.herrmann, tmf.xtext-inbox |
| Version: | 2.2.1 | Flags: | jan: juno+ |
| Target Milestone: | M5 | ||
| Hardware: | All | ||
| OS: | All | ||
| Whiteboard: | |||
|
Description
Jan Koehnlein
When writing files we should use the encoding provider of the written file type (if available) and not of the language that is contributing the generator.

Otherwise the files could probably never be read correctly by the language implementation, could they?

(In reply to comment #1)
Yes, but usually there is no provider for the target language. Otherwise we'd need IResourceServiceProviders for all target languages.

(In reply to comment #2)
> (In reply to comment #1)
> > Yes, but usually there is no provider for the target language. Otherwise we'd
> > need IResourceServiceProviders for all target languages.

I think it's reasonable to assume that the target language will be read with the default encoding of the system / of the containing eclipse project / folder if no ResourceServiceProvider is available.

Pushed to MASTER.

(In reply to comment #3)
> (In reply to comment #2)
> > (In reply to comment #1)
> > > Yes, but usually there is no provider for the target language. Otherwise we'd
> > > need IResourceServiceProviders for all target languages.
> >
> > I think it's reasonable to assume that the target language will be read with
> > the default encoding of the system / of the containing eclipse project / folder
> > if no ResourceServiceProvider is available.

Using the default encoding of the containing eclipse project would be fine, but looking at the corresponding commit, that doesn't seem to be implemented? Any recommendations on how to achieve that effect?

The encoding is determined like this:
protected String getEncoding(IFile fileToBeWritten) throws CoreException {
return fileToBeWritten.getCharset(true);
}
What problems do you face with that approach?
(In reply to comment #6)
> The encoding is determined like this:
>
> protected String getEncoding(IFile fileToBeWritten) throws CoreException {
> return fileToBeWritten.getCharset(true);
> }
>
> What problems do you face with that approach?

Sorry, I missed the actual subject of the ticket. I'm afraid you'll have to provide an IEncodingProvider that tries to read the encoding from the .settings folder. I don't see another option.

I think the following is pretty common:
- workspace has platform dependent default encoding, nobody cares because ...
- all projects properly define UTF-8 as their encoding

Now, shouldn't all files be generated in UTF-8 by default? But they aren't, they still use the workspace default. I had hoped that this bug would fix the common problems once and for all :)

Currently, I worked around this by sub-classing JavaIoFileSystemAccess and letting the generator call myFileSystemAccess.setEncoding(enc). I have no idea about the impact of providing my own IEncodingProvider, but I'm afraid doing relevant work there will further degrade performance when *reading* lots of resources, whereas I only want to influence the *writing* of resources created by the file system access. Maybe I'll try to make my file system access smarter to infer the encoding from any enclosing folder inside the workspace ...

Reopening because I have a solution that seems to deliver what was already promised in comment 3 :)

(In reply to comment #3)
> I think it's reasonable to assume that the target language will be read with
> the default encoding of the system / of the containing eclipse project / folder
> if no ResourceServiceProvider is available.

Here's what I came up with: In the JavaIoFileSystemAccess implement getEncoding like so:

    protected String getEncoding(URI fileURI) {
        IResourceServiceProvider resourceServiceProvider = registry.getResourceServiceProvider(fileURI);
        if (resourceServiceProvider != null)
            return resourceServiceProvider.getEncodingProvider().getEncoding(fileURI);
        String containerEncoding = ResourceUtil.findEncodingFromContainer(fileURI);
        if (containerEncoding != null)
            return containerEncoding;
        return encodingProvider.getEncoding(fileURI);
    }

where ResourceUtil.findEncodingFromContainer is this:

    public static String findEncodingFromContainer(URI uri) {
        try {
            IWorkspaceRoot wsRoot = ResourcesPlugin.getWorkspace().getRoot();
            IContainer container = null;
            if (uri.isPlatformResource()) {
                IPath path = new Path(uri.toPlatformString(true));
                container = wsRoot.getProject(path.segment(0));
                for (int s = 1; s < path.segmentCount() - 1; s++)
                    container = container.getFolder(new Path(path.segment(s)));
            } else if (uri.isFile()) {
                IPath path = new Path(uri.toFileString());
                container = wsRoot.getContainerForLocation(path);
            }
            if (container != null)
                return container.getDefaultCharset();
        } catch (Throwable t) {
            // nop
        }
        return null; // no encoding found
    }

Sorry about the layout ...

I'm afraid your solution does not work in general since it assumes a workspace to be present. Btw: What your utility does is pretty much the same as what IFile.getCharset(true) does. Injecting an own IEncodingProvider that deals with the file extension of the generated artifacts is the way to go. There you could do the WS specific stuff but it won't work for Xtext in general.

(In reply to comment #10)
> I'm afraid your solution does not work in general since it assumes a workspace
> to be present.

Sorry, my code snippet isn't beautiful, but if no workspace is opened it just catches the IllegalStateException and continues with the original strategy. So all is clean in fact.

> Btw: What your utility does is pretty much the same as what
> IFile.getCharset(true) does.

Thanks for the hint.

> Injecting an own IEncodingProvider that deals with the file extension of the
> generated artifacts is the way to go. There you could do the WS specific stuff
> but it won't work for Xtext in general.

Can I register an IEncodingProvider just for the *writing* case??

I'll stick to my special file system access. I just wanted to help so that future users of Xtext wouldn't even have to worry about this. I understood that your goal was exactly to respect the encoding of the enclosing folder / project. Why not implement this for the case where a workspace exists? I'm sure that's what people typically expect.

(In reply to comment #11)
> (In reply to comment #10)
> > I'm afraid your solution does not work in general since it assumes a workspace
> > to be present.
>
> Sorry, my code snippet isn't beautiful, but if no workspace is opened
> it just catches the IllegalStateException and continues with the
> original strategy. So all is clean in fact.

Well, if you have a workspace available, you could and should use the EclipseResourceFileSystemAccess2 instead of the JavaIoFileSystemAccess. It's not a good idea to introduce a dependency on the resources abstraction from a component that only depends on Java io files.

> Can I register an IEncodingProvider just for the *writing* case??

Why would you want to do that? If I'm not mistaken, you have a project setting for the encoding which says 'UTF-8' or something. Reading and writing files to that project should all be performed with that encoding, shouldn't it?

Requested via bug 522520. -M. |
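
For readers following the thread, the lookup order under discussion (a provider for the written file type first, then a setting of the enclosing container, then a configured default) can be illustrated without any Eclipse or Xtext dependency. The following is only a minimal sketch in plain Java: the class `EncodingFallbackSketch`, its methods, and the two stand-in lookups are hypothetical names made up for illustration and are not part of Xtext or the Eclipse resources API.

```java
import java.io.IOException;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.List;
import java.util.Optional;
import java.util.function.Function;

/** Hypothetical illustration only; none of these names are Xtext or Eclipse API. */
public final class EncodingFallbackSketch {

    /** Returns the encoding from the first lookup that yields one, else the fallback. */
    static Charset resolve(Path target,
                           List<Function<Path, Optional<Charset>>> lookups,
                           Charset fallback) {
        for (Function<Path, Optional<Charset>> lookup : lookups) {
            Optional<Charset> found = lookup.apply(target);
            if (found.isPresent()) {
                return found.get();
            }
        }
        return fallback;
    }

    /** Writes text with an explicit charset instead of relying on the platform default. */
    static void write(Path target, String contents, Charset charset) throws IOException {
        Files.write(target, contents.getBytes(charset));
    }

    public static void main(String[] args) throws IOException {
        // Stand-in for a per-file-type provider: it only knows ".properties" files.
        Function<Path, Optional<Charset>> byFileType = p ->
                p.toString().endsWith(".properties")
                        ? Optional.of(StandardCharsets.ISO_8859_1)
                        : Optional.empty();
        // Stand-in for a project or folder setting (assumed here to be UTF-8).
        Function<Path, Optional<Charset>> byContainer =
                p -> Optional.of(StandardCharsets.UTF_8);

        Path out = Files.createTempFile("generated", ".txt");
        Charset charset = resolve(out, List.of(byFileType, byContainer),
                Charset.defaultCharset());
        write(out, "K\u00f6hnlein", charset);
        System.out.println(charset + " used for " + out);
    }
}
```

In this sketch the order of the list encodes the priority, so a caller that has no workspace can simply leave out the container lookup and still fall through to the default.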