
Bug 108668

Summary: Default text encoding should be set to UTF-8 for all text-based files
Product: [Eclipse Project] Platform
Reporter: Alex Blewitt <alex.blewitt>
Component: Resources
Assignee: Platform-Resources-Inbox <platform-resources-inbox>
Status: RESOLVED WONTFIX
QA Contact:
Severity: normal
Priority: P3
CC: cdtdoug, codex69, contact, cwiscount, daniel_megert, david_williams, denis.roy, devether, dimitar.giormov, d_a_carver, eclipse.org, eclipse, eclipseBugs, gautier.desaintmartinlacaze, jxue01, kapitel29, kitlo, kokojie, kon, lbarbareau, lrozenblyum, marcel, Martin.Spamer, matthieuparet69, mistria, mknauer, mober.at+eclipse, olivier.croisier, paduffy, pwebster, robin, rupert.thurner, sptaszkiewicz, stas, StijnDeWitt, sxenos, tieskey, tlroche, udo.hafermann, wayne.beaton
Version: 4.5.1   
Target Milestone: ---   
Hardware: All   
OS: All   
See Also: https://git.eclipse.org/r/57036
https://git.eclipse.org/r/57040
https://bugs.eclipse.org/bugs/show_bug.cgi?id=516583
https://bugs.eclipse.org/bugs/show_bug.cgi?id=479450
Whiteboard:
Bug Depends on:    
Bug Blocks: 421702    
Attachments:
  Screen shot (flags: none)
  Default Encoding in IntelliJ Community Edition 14 (flags: none)
  UTF-8 from a system in Eclipse (flags: none)

Description Alex Blewitt CLA 2005-09-02 09:16:20 EDT
I was reading
http://www-128.ibm.com/developerworks/xml/library/x-utf8/?ca=dnt-635 today, and
it got me thinking; why does Eclipse default to a platform-specific encoding for
its file types when creating new files? In the majority of cases, source files
will already be in US-ASCII (which is upwards compatible with UTF-8) and these
days, operating systems are already capable of dealing with UTF-8 documents in
editors (Windows has notepad and wordpad, Mac OS X has TextEdit, and Linux has a
variety of tools (vim, emacs) that also support UTF-8).

This would also avoid potential problems with creating HTML pages, by
disallowing certain windows-codepage characters (those tending to be ` and '
quotes) that don't show up on other platforms.

Obviously Eclipse can be configured to define the default encoding type, but
Eclipse is such a leading player in the IDE market that it makes sense for
Eclipse to take the lead in making UTF-8 the default encoding for text files.
Comment 1 Dani Megert CLA 2005-09-02 09:26:32 EDT
-1 to do this for all kind of text files. Note we already define UTF-8 as
default for files with content-type XML.

Moving to JDT Core to decide whether they want to define a default encoding for
Java source files.

Comment 2 Alex Blewitt CLA 2005-09-02 09:28:48 EDT
Why -1 for all types of files? It will only affect newly created files within
Eclipse, and the default can be changed by users afterwards.

Given that Eclipse is well set up for distributed environments, and indeed,
works on many platforms, then UTF-8 is the only sensible default encoding type.

Granted, there may be good reasons; but can you explain them here for others
interested in the reasoning behind the decision?
Comment 3 Dani Megert CLA 2005-09-02 09:38:25 EDT
Because Eclipse is not just an IDE and, more importantly, we should not override
the platform (OS) encoding that the user has chosen.
Comment 4 Elliotte Rusty Harold CLA 2005-09-02 10:54:39 EDT
The user does not normally choose the platform OS encoding. They install an
operating system and say they're in the U.S./Japan/Quebec/Denmark/ wherever and
a default character set is chosen for them based on that information. On Windows
and the Mac, this encoding is likely to be a local, platform dependent,
non-standard character set. Linux is a little better. You're at least likely to
get a genuine standard character set. However, it still may not be Unicode.

The user has not made an explicit choice of the default encoding at the
operating system level, and generally cannot make that choice. I think the
vendors should also change their defaults to UTF-8 and Unicode, but until they
do, there's no reason for Eclipse to respect their defaults.
Comment 5 Alex Blewitt CLA 2005-09-02 11:01:20 EDT
For that matter, Windows users don't get to choose what their encoding is
either. The regional options specify a 'Language for non-Unicode platforms' that
should be used as a fall-back when you have a program that doesn't know what
Unicode is. But Eclipse knows what Unicode is, and can deal with it nicely; even
Windows 2000 supported UTF-8.

Given that all OS vendors are moving towards supporting UTF-8 as a default
option, I think it's time to give the shackles of codepages a rest and move
forwards rather than looking backwards. It doesn't really matter whether you're
looking at Eclipse as an IDE or the Eclipse platform; I'm writing a Rich Client
Application and it's just as important for that that the default text format is
a cross-platform rather than platform-specific format. After all, I'm developing
it as a Rich Client app because of the cross-platform support.
Comment 6 David Williams CLA 2005-09-12 00:40:49 EDT
I'll voice my 2 cents that I do not possibly see how UTF-8 as the default for 
all files could possibly work. Seems this would mean files created with Eclipse 
could not be interoperable with other applications not making that assumption. 
Perhaps the originator is assuming that all UTF-8 is identified with a 3 byte 
BOM, which I do not think is true. Even if so, Java, by itself, does not even 
handle that 3 byte BOM well (does not handle well on 'read', does not produce 
during 'write'). Of course, it makes sense for XML, etc. HTML and JSP's all have 
their own spec'd encoding rules (well, HTML doesn't, that I know of). But as a 
general rule, if the encoding is not identified in the content (or spec'd rules 
for the content), you pretty much have to assume platform default. 
Comment 7 Bob Foster CLA 2005-09-12 02:03:20 EDT
Much as I would like to see some sanity in this area, I agree with David. The
description is correct - an increasing number of applications can deal with
UTF-8. The fact that Windows adds the UTF-8 BOM helps a lot. But other platforms
still don't write a UTF-8 BOM, and until there is a reliable,
platform-independent, content-independent way to detect UTF-8 encoding, it
doesn't make sense as the Eclipse default. Too bad, really, but easy workaround.
The user can set UTF-8 as the default encoding.
Comment 8 Elliotte Rusty Harold CLA 2005-09-12 04:08:24 EDT
Autodetection of encoding would be nice. However, without a lot of effort it
can't be done for all types of files. However, this does not mean we should
accept the platform default. The platform default is just one other encoding
that cannot be autodetected. There is no reason that encoding is more likely to
be correct than UTF-8. In 2005 files are routinely moved between platforms and
locales. I often start a project by checking existing code out of a source
repository. What encoding the files are in, depends only on what encoding they
were checked in as. It has nothing to do with the platform default.

One option that UTF-8 offers (and single byte platform defaults do not) is to
attempt to read a file as UTF-8 and, if it fails, to try again with the platform
default. A file that is not UTF-8 is unlikely to be read as UTF-8 without
detectable error. The reverse is not true. If, for instance, you attempt to read
a file as Latin-1, then all files will seem to be legal Latin-1 without
exception, even if that's wrong. Non-UTF-8 can normally be detected through
invalid byte sequences. However, all byte sequences are legal in Latin-1 and most
other single-byte character sets.
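
The "try UTF-8, then fall back" idea can be sketched in a few lines of Java. This is only an illustration under stated assumptions: the class and method names are made up here, nothing in this thread says Eclipse works this way, and the fallback charset is whatever the caller passes in. The key detail is that the decoder is told to report malformed input rather than silently substitute replacement characters.

```java
import java.nio.ByteBuffer;
import java.nio.charset.CharacterCodingException;
import java.nio.charset.Charset;
import java.nio.charset.CodingErrorAction;
import java.nio.charset.StandardCharsets;

// Hypothetical helper, not part of Eclipse.
final class Utf8FirstDecoder {

    /** Decodes the bytes as UTF-8 when they are well-formed, otherwise with the fallback charset. */
    static String decode(byte[] bytes, Charset fallback) {
        try {
            // REPORT makes the decoder throw instead of substituting U+FFFD.
            return StandardCharsets.UTF_8.newDecoder()
                    .onMalformedInput(CodingErrorAction.REPORT)
                    .onUnmappableCharacter(CodingErrorAction.REPORT)
                    .decode(ByteBuffer.wrap(bytes))
                    .toString();
        } catch (CharacterCodingException notUtf8) {
            // A single-byte charset accepts any byte sequence, so this step cannot fail.
            return new String(bytes, fallback);
        }
    }

    public static void main(String[] args) {
        byte[] latin1 = {'c', 'a', 'f', (byte) 0xE9};             // e-acute as one Latin-1 byte
        byte[] utf8 = {'c', 'a', 'f', (byte) 0xC3, (byte) 0xA9};  // e-acute as two UTF-8 bytes
        String a = decode(latin1, StandardCharsets.ISO_8859_1);
        String b = decode(utf8, StandardCharsets.ISO_8859_1);
        System.out.println(a.equals(b));
    }
}
```

Note that a pure ASCII file decodes identically either way, so the fallback only matters once a file contains bytes above 0x7F.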
Comment 9 Elliotte Rusty Harold CLA 2005-09-12 04:19:06 EDT
The fact is no single encoding will work as the default for all files. This
includes the platform default. The current system does not work. The question is
not whether UTF-8 will work for all files. It won't. The question is whether
assuming UTF-8 as the default will work better than the current, failing system.
It will. 

Java is a cross-platform language. Teams routinely use different platforms and
increasingly the same platform but set to different locales. Even if everyone on
a team is using Windows, the developers in Japan, Israel, the U.S., India, and
China are all likely to have different default character sets. Unicode is the
only character set that has any hope of working for them all, and UTF-8 is the
right encoding for Unicode. 
Comment 10 Alex Blewitt CLA 2005-09-12 04:36:58 EDT
Just to be clear, I didn't raise this with the expectation that all UTF-8 files are marked with the BOM, or 
assume that such encodings can be automatically detected.

However, just because one encoding cannot automatically be detected does not mean that another 
choice is therefore the correct answer. Consider what Eclipse (including RCP) is likely to be used for:

1) Editing files that other Eclipse installs will read (e.g. private data to an RCP application, or others 
specific to a feature e.g. Java source files)

2) Editing files that will be stored in some kind of shared repository, potentially globally

3) Editing files as a souped-up editor for the filesystem

Of these three possibilities, it's way more likely that Eclipse will be used as one of the first two options. 
Even Eclipse's assumptions about all files being stored under some particular workspace/project 
combination (for the IDE, at least) is likely to rule out Eclipse as a general purpose editor, unlike Emacs 
which happily can edit files in any location. For example, I wouldn't use Eclipse to edit /etc/hosts 
because (a) I don't want to have to set up a .project in /etc just to look at the config files, and (b) I don't 
want to create linked resources for every file I want to edit in Eclipse -- I'll just use Emacs or Vi (both of 
which support UTF-8, by the way).

The point is that with any choice, there are pros and cons. In this case, if files are created/assumed to 
be UTF-8, then you'll end up with a file that is editable on any Unicode-savvy operating system. This 
includes Windows, where UTF-8 files are supported by the OS (and the encoding reported by Java is the 
'fallback encoding' for non-Unicode aware systems). On the other hand, if you use RandomOS' choice of 
character encoding for files, then it's only RandomOS that will be able to read that file correctly. All 
other non-RandomOS systems will load the file transparently with errors, possibly mangling the data in 
the process.

Eclipse is supposed to be about platform-neutral development, so that development is independent of 
the OS that is being used to create the content. This simply isn't true when using RandomOS' character 
set encoding. In fact, by using RandomOS' encoding, you are explicitly limiting those file(s) to only be 
usable on RandomOS.

Yes, it may break obscure cases where Eclipse is being used as an editor for platform-specific files, like 
/etc/hosts. But it will fix a lot more cases where files are developed by distributed team members 
around the globe on a variety of different operating systems.

So there is no one this-absolutely-works-for-all-cases. But UTF-8 is a much, much better choice as a 
default than RandomOS' encoding, especially when compared with the target uses of Eclipse outlined 
above.

This bug was also raised against the Core Text component, rather than JDT itself. To reiterate, this is a 
bug on the text handling of *all* text files, not just .java files. This may be currently assigned to 
jdt-inbox for their comments, but the bug should still remain a Core Text bug.
Comment 11 Dani Megert CLA 2005-09-12 04:48:26 EDT
>This may be currently assigned to
>jdt-inbox for their comments, but the bug should still remain a Core Text bug.
It's not Platform Text: the default encoding is provided by Platform Resources.
Comment 12 Jerome Lanneluc CLA 2005-09-15 10:17:45 EDT
The encoding for .java files is not spec'ed by the JLS.
Moving to Platform Resources for comment on the general resolution of this request.
Comment 13 Alex Blewitt CLA 2005-09-15 10:43:49 EDT
I beg to differ re: Java files:

http://java.sun.com/docs/books/jls/third_edition/html/lexical.html#95413

"Programs are written in Unicode (section 3.1), but lexical translations are
provided (section 3.2) so that Unicode escapes (section 3.3) can be used to
include any Unicode character using only ASCII characters."

"3.1 Unicode
Programs are written using the Unicode character set. ... "

It doesn't explicitly say which encoding of Unicode should be used (UTF-8,
UTF-16 etc.) but it *does* say that it is Unicode. Furthermore, it says that
programs may also be written in ASCII with Unicode escape sequences, and UTF-8
is the only encoding that also has the property that the first 128 characters
are ASCII, so the implicit conclusion is that the only UTF encoding that can be
used is UTF-8.

Note that the statement further on:

"Except for comments (section 3.7), identifiers, and the contents of character
and string literals (section 3.10.4, section 3.10.5), all input elements
(section 3.5) in a program are formed only from ASCII characters (or Unicode
escapes (section 3.3) which result in ASCII characters). ASCII (ANSI X3.4) is
the American Standard Code for Information Interchange. The first 128 characters
of the Unicode character encoding are the ASCII characters."

may be misleading due to the English, but it is saying that all of the
punctuation, white-space and other characters in a file are ASCII (also the same
character in UTF-8) -- but (importantly) the comments, identifiers, and string
literals (i.e. everything except keywords and punctuation) *is* Unicode. It's
just that as well as Unicode, it can also be represented using \u notation, but
does not have to be.
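
To make the \u notation concrete, here is a small sketch that rewrites every non-ASCII character of a string as a Unicode escape, so the result can be stored in a pure US-ASCII source file. The helper name is invented for this illustration; it is not a JDK or Eclipse API.

```java
// Hypothetical helper showing the ASCII-only representation the JLS allows.
final class UnicodeEscapes {

    /** Replaces each UTF-16 code unit outside US-ASCII with a backslash-u escape. */
    static String toAsciiEscapes(String text) {
        StringBuilder out = new StringBuilder();
        for (int i = 0; i < text.length(); i++) {
            char c = text.charAt(i);
            if (c < 0x80) {
                out.append(c);
            } else {
                // Four lowercase hex digits per code unit, e.g. e-acute becomes 00e9.
                out.append(String.format("\\u%04x", (int) c));
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        // The literal below is itself written with an escape, so this file stays ASCII.
        System.out.println(toAsciiEscapes("caf\u00e9"));
    }
}
```

A source file written this way compiles the same under any ASCII-compatible encoding, which is why the escapes sidestep the encoding question at the cost of readability.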
Comment 14 John Arthorne CLA 2005-09-15 11:21:08 EDT
The only reasonable default encoding is the one supplied by the operating
system.  If the user is running in a different locale and has an encoding to
match that locale, it needs to be honoured by Eclipse.  Interoperability with
the local operating system and other local programs is more important than
cross-platform interoperability.  If you want to set the encoding used by
Eclipse to UTF-8, you can do so.
Comment 15 Alex Blewitt CLA 2005-09-15 11:29:19 EDT
This isn't just a cross-platform issue. It's a cross-locale issue. Developers
writing code/documentation/files in a Locale on one side of the globe should be
able to have files shared with those on the other side of the globe, even on the
same platform.

Further, there's no way of setting the default locale as picked up by Java on
Windows systems. The Cp1252 reported on windows (when running in England) is the
fallback encoding for when UTF-8 isn't supported.

I also feel this bug needs a wider audience (and reasoned discussion) than an
assertion that 'the only sensible default is the OS locale'. As is noted in
comment #4, the user often doesn't have this choice of encoding; they just
select from a generic regional location and a locale-specific non-global one is
picked randomly without any user intervention.

I also strongly disagree with the statement that 'Interoperability with
the local operating system and other local programs is more important than
cross-[locale] interoperability'. I invite you to submit an example of any
Eclipse application -- JDT or otherwise -- that edits operating system files
instead of ones that are destined for UTF-8 capable systems (web browsers,
version control systems etc.) And please note, this is about cross-locale
interoperability, not just cross-platform interoperability.
Comment 16 Rafael Chaves CLA 2005-09-15 12:13:08 EDT
Note that for cross-locale interoperability users are expected to set the
default encoding (whatever it is) at the project level (instead of at the
workspace level). This setting is stored in the project content area, thus being
shared through the team repository (all users will end up with the same setting).
Comment 17 Alex Blewitt CLA 2005-09-15 12:28:18 EDT
I'd also like to point out that whilst it's possible to override the default
Java setting (using -Dfile.encoding=UTF-8), this hides what any original
platform setting may be at any level. Having Eclipse default to UTF-8 by
default, whilst still allowing it to be changed back to any locale-specific
encoding, is a way of having a locale- and platform- portable default that is
overridable by the user to be locale-specific.
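
As a quick way to see which encoding a given JVM has picked up, with or without the -Dfile.encoding override, the following sketch prints both the system property and the charset Java actually uses by default. The class name is made up for this illustration.

```java
import java.nio.charset.Charset;

// Hypothetical diagnostic, not part of Eclipse.
final class DefaultEncodingReport {

    /** Returns the file.encoding property and the JVM's effective default charset. */
    static String describe() {
        return "file.encoding=" + System.getProperty("file.encoding")
                + ", defaultCharset=" + Charset.defaultCharset().name();
    }

    public static void main(String[] args) {
        // Run once normally and once with -Dfile.encoding=UTF-8 to compare.
        System.out.println(describe());
    }
}
```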

I don't necessarily believe that a per-project setting is the best workaround,
as there are RCP apps that don't necessarily use .projects for data interchange
(they may choose to work with WebDAV or similar). Having a default accessible
may make sense for these kinds of applications as well.
Comment 18 Bob Foster CLA 2005-09-15 22:26:23 EDT
I completely agree with John. UTF-8 is still a minority encoding; most files are
in national character sets. The setting most likely to correspond to the user's
national character set is the operating system default. Given that the user can
change the default encoding with one preference setting, I'm surprised this
discussion (reasoned or not) has dragged on this long.
Comment 19 Alex Blewitt CLA 2005-09-16 03:54:58 EDT
Because it's about changing the *default*. You know, what Eclipse comes with. Yes, it's trivial for me to 
change my preference setting, but I'm building RCP applications and I don't want users across Europe 
(who use a variety of slightly different locales) wondering why they can't exchange RCP documents.

o Eclipse uses GIFs instead of BMPs, because they're more portable
o Eclipse uses HTML instead of Word or TROFF, because they're more portable

And yet you're arguing that using a less-portable character set encoding is the right thing to do?

Eclipse isn't used as a general-purpose text editor to edit operating system files. Even if it was, current 
operating systems can deal with UTF-8 character set encodings natively and this 'codepage' thing is a 
fallback for applications that can't, or in this case, won't deal with UTF-8 encodings.

As has already been pointed out, Java files are already UTF-8, and it's also currently the default for XML 
documents. It should also be the default for any HTML or JSP document to avoid non-printing 
characters showing up when the page is viewed on a platform where the encoding is different.

Eclipse is a very good cross-platform product. It's already used in global development (the Eclipse 
committers do a great job of making that happen). However, you have situations where developers in 
one locale will be creating files with one encoding, and developers the other side of the world using 
another encoding. Tell me why it's not sensible that we should all be using one encoding?
Comment 20 John Arthorne CLA 2005-09-16 09:59:27 EDT
This discussion is closed as far as I'm concerned.  I think the discussion has
had adequate exposure in the various newsgroup postings Alex has made, and there
clearly isn't community consensus. Changing the default encoding is a drastic
enough change that we would need broad support from both the community and the
committers on affected projects, and the -1's above from the platform and WTP
text leads alone are enough for me to consider this closed.
Comment 21 P Duffy CLA 2005-11-07 16:47:03 EST
My dev team ran into this unexpected issue.  A cut/paste from a Windows doc into
an Eclipse Java file editor was then checked into the source control system.  A
Linux user was then unable to compile or open the file because it contained
cp1252 characters, which are illegal under the Linux default of UTF-8.

Developers' expectations are that the tools are going to protect them from such
situations.  We have people developing product on three platforms: Windows,
Linux, and Solaris. Cross platform is a big issue for us.  I assume the best
suggestion is that we manually configure character encoding to be UTF-8 across
all platforms?

Cheers,
Comment 22 John Arthorne CLA 2005-11-09 09:27:11 EST
Yes, you need to use an encoding that is shared across all your development
platforms, or restrict yourself to the range that those encodings have in
common.  cp1252 and UTF-8 share a significant subset (the 128 ASCII characters, and
more). The development of Eclipse itself is done across many platforms within
that shared subset of encodings.
Comment 23 Elliotte Rusty Harold CLA 2005-11-09 10:22:11 EST
That comment is incorrect. The only characters Cp1252 and UTF-8 share in the
context of Eclipse are the 128 ASCII characters. While all 256 Cp1252 characters
are available in UTF-8, 128 of them do not share the same code points. Since,
unlike XML, Java files do not carry any information about their own encoding,
this needs to be externally specified by the IDE. Thus a Cp1252 file loaded into
a UTF-8 environment will be reported as malformed. The Java editor does not
autodetect and account for the different mappings from code points to
characters, as an XML editor might be able to do. 

This is a flaw in the design of Java. We can't fix that. Currently the best
solution is indeed to manually configure for UTF-8 across all platforms.
However, since that is the best solution it should be the default as well. 
Comment 24 John Arthorne CLA 2005-11-09 10:53:50 EST
cp1252 and UTF-8 only differ in the range 0x80-0x9F, the remaining 224
characters are the same. Here is a mapping table from unicode.org:

http://www.unicode.org/Public/MAPPINGS/VENDORS/MICSFT/WINDOWS/CP1252.TXT
Comment 25 P Duffy CLA 2005-11-09 11:07:50 EST
Please confirm or correct the following statement.  

If a dev team is developing on Linux, Solaris, and Windows, then it is
recommended that Eclipse file encoding be set to UTF-8 for all platforms.
Comment 26 Elliotte Rusty Harold CLA 2005-11-09 11:10:37 EST
The Cp1252 character set is a proper subset of the Unicode character set. Every
character in Cp1252 has a corresponding Unicode code point.

The problem is that Eclipse doesn't work at the level of characters. It does not
know that Cp1252 é (the byte 0xE9) is equivalent to the two UTF-8 bytes
0xC3 0xA9. Thus at the level Eclipse works, the different character sets are not
compatible.
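
The byte-level difference is easy to check. The sketch below encodes the same characters in both charsets and prints the bytes in hex; the class and method names are invented for this illustration, and it assumes the running JVM provides the "windows-1252" charset (mainstream JDKs do, but it is not one of the charsets every Java platform is required to support).

```java
import java.nio.charset.Charset;

// Hypothetical demo class, not part of Eclipse.
final class Cp1252VersusUtf8 {

    /** Encodes text with the named charset and returns the bytes as space-separated hex. */
    static String hexBytes(String text, String charsetName) {
        byte[] bytes = text.getBytes(Charset.forName(charsetName));
        StringBuilder sb = new StringBuilder();
        for (byte b : bytes) {
            if (sb.length() > 0) {
                sb.append(' ');
            }
            sb.append(String.format("%02X", b & 0xFF));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // U+00E9 (e-acute): one byte in Cp1252, two bytes in UTF-8.
        System.out.println(hexBytes("\u00e9", "windows-1252"));
        System.out.println(hexBytes("\u00e9", "UTF-8"));
        // U+201C (left curly quote, typical of text pasted from Word).
        System.out.println(hexBytes("\u201c", "windows-1252"));
        System.out.println(hexBytes("\u201c", "UTF-8"));
    }
}
```

So the characters overlap, but outside ASCII the bytes on disk never do, which is why a Cp1252 file containing such characters is malformed when read as UTF-8.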

There is a missing layer of indirection in Java. XML has this additional layer
of indirection between bytes and characters. Java doesn't. If Java had it, we
wouldn't be having this discussion.

Given that Java does not include in-file metadata about the character encoding,
the question becomes what Eclipse should do to handle character set
identification. No solution will be perfect. However in the long term I think
the current platform specific approach is clearly inferior to a
platform-independent UTF-8 default. 
Comment 27 Elliotte Rusty Harold CLA 2005-11-09 11:14:58 EST
My correction: "It is recommended that the Eclipse file encoding be set to UTF-8
for all platforms." No "if" is necessary. :-)

Even a mono-platform environment will not be harmed by using UTF-8, and may well
be improved by it if characters from outside the current locale are needed. In
today's international world, we cannot assume that just because I am typing this
message in the U.S. that I only require characters from the Roman alphabet. I
may well need Cyrillic or Japanese or other character sets. 

At worst UTF-8 does no harm. At best it avoids numerous problems of character
set interoperability between programmers on a team.
Comment 28 P Duffy CLA 2005-11-09 11:19:12 EST
It would have saved my team some grief had all the platform defaults been
set to UTF-8.  This character encoding issue is not something most developers
want or need to be bothered with.

P.S.  Why did not the Windows cp1252 editor complain when illegal characters
were cut and pasted from a Word doc?  Is this a bug?  Had the editor
detected/prevented the illegal chars, we would not have had a problem.
Comment 29 John Arthorne CLA 2005-11-09 11:26:17 EST
Good point about catching this on paste - I suggest entering a separate bug
report against the Platform Text component.
Comment 30 P Duffy CLA 2005-11-09 11:45:11 EST
Where exactly does the file encoding setting get persisted and to which
property?  I'm playing with the file encoding setting now and exporting a new
preference file, but can't locate the encoding change in the preferences file. 
Further, when I import our old preferences file, the encoding remains set to my
change to UTF8, not back to the original default of cp1252.

?
Comment 31 P Duffy CLA 2005-11-09 12:19:36 EST
And here is the bug report response...


daniel.megert@eclipse.org changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|NEW                         |RESOLVED
         Resolution|                            |INVALID




------- Additional Comments From daniel.megert@eclipse.org  2005-11-09 12:12 -------
We have no idea under which encoding someone will checkout that file in the
future. The one who checks out the file could have his workspace encoding set to
Chinese or whatever.

If you share files across platforms you have the following choices:
1. have a policy that users watch for such problems and don't release such files
2. have a policy that users must set their workspace encoding to UTF-8
3. set the project encoding to UTF-8. This might be the best solution because
   anyone who checks out the project will get the correct encoding.


Comment 32 John Arthorne CLA 2005-11-09 15:11:25 EST
re comment 30: The best place to set the encoding is by right clicking on any
resource (project, folder, file), and selecting Properties > Info. When the
encoding is set on a project or folder, it sets the *default* encoding for all
files in that container.  I.e., if the file does not have an explicit encoding
stored, Eclipse looks for the encoding on the containing folder recursively
until an encoding setting is found.  When set this way, the encoding information
will be persisted in the project content area in the .settings directory along
with the project contents.
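
The lookup described above (explicit setting on the file, else the nearest containing folder, else the project, else the workspace default) can be modelled roughly as follows. This is a simplified sketch with invented names and plain string paths; it is not the actual Platform Resources implementation.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical model of the container-based encoding lookup.
final class EncodingLookup {

    private final Map<String, String> explicit = new HashMap<>();
    private final String workspaceDefault;

    EncodingLookup(String workspaceDefault) {
        this.workspaceDefault = workspaceDefault;
    }

    /** Records an explicit encoding for a file, folder, or project path such as "/proj/src". */
    void set(String path, String charset) {
        explicit.put(path, charset);
    }

    /** Walks from the resource up through its parents until an explicit encoding is found. */
    String resolve(String path) {
        String current = path;
        while (true) {
            String charset = explicit.get(current);
            if (charset != null) {
                return charset;
            }
            int slash = current.lastIndexOf('/');
            if (slash <= 0) {
                // Reached the project level without a setting: use the workspace default.
                return workspaceDefault;
            }
            current = current.substring(0, slash);
        }
    }

    public static void main(String[] args) {
        EncodingLookup lookup = new EncodingLookup("Cp1252");
        lookup.set("/proj", "UTF-8");
        System.out.println(lookup.resolve("/proj/src/Main.java"));
    }
}
```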
Comment 33 P Duffy CLA 2005-11-09 15:21:50 EST
I can not locate a .settings folder.  Where exactly is it located?
Comment 34 John Arthorne CLA 2005-11-09 15:26:28 EST
The .settings folder is only created as needed... it is a sibling of the .project
file in your project's top level directory. If you set the encoding of a
resource in that project (or for the entire project), it will be stored in that
directory.
Comment 35 P Duffy CLA 2005-11-09 15:34:02 EST
I just changed the encoding on the project to UTF-8, then closed the project. 
No sign of a .settings folder.

?
Comment 36 John Arthorne CLA 2005-11-09 15:56:21 EST
Created attachment 29639 [details]
Screen shot

For illustration, here is a screen shot of a simple project that has its
encoding set to UTF8.  You can see the .settings folder in the Navigator, and
it contains a file called "org.eclipse.core.resources.prefs" that stores the
encoding details for the project.  I assume you are using Eclipse 3.0 or
greater, and that you don't have filter on your view that hides .* files?
Comment 37 P Duffy CLA 2005-11-09 17:42:03 EST
If I do this exactly as you describe on a new sample test project, I get exactly
the result you describe.  If I try this on my existing project, no .settings
folder is created.

?
Comment 38 P Duffy CLA 2005-11-09 17:48:03 EST
Actually, I get an "internal error setting encoding" dialog.  Don't know if this
is relevant, but we are using the CCRC plugin for Eclipse.  The project is under
ClearCase source control.

Comment 39 John Arthorne CLA 2005-11-09 17:56:49 EST
Are there more error details in the log file? (workspace/.metadata/.log)?
Comment 40 P Duffy CLA 2005-11-10 10:28:41 EST
!ENTRY org.eclipse.core.runtime 4 2 2005-11-09 17:47:00.367
!MESSAGE An internal error occurred during: "Setting encoding".
!STACK 0
java.lang.IllegalArgumentException: Attempted to beginRule: R/, does not match outer scope rule: P/bac-nova
	at org.eclipse.core.internal.runtime.Assert.isLegal(Assert.java:58)
	at org.eclipse.core.internal.jobs.ThreadJob.illegalPush(ThreadJob.java:117)
	at org.eclipse.core.internal.jobs.ThreadJob.push(ThreadJob.java:211)
	at org.eclipse.core.internal.jobs.ImplicitJobs.begin(ImplicitJobs.java:59)
	at org.eclipse.core.internal.jobs.JobManager.beginRule(JobManager.java:190)
	at org.eclipse.core.internal.resources.WorkManager.checkIn(WorkManager.java:96)
	at org.eclipse.core.internal.resources.Workspace.prepareOperation(Workspace.java:1674)
	at org.eclipse.core.internal.resources.Folder.create(Folder.java:88)
	at org.eclipse.core.internal.resources.ProjectPreferences$2.run(ProjectPreferences.java:304)
	at org.eclipse.core.internal.resources.ProjectPreferences.save(ProjectPreferences.java:315)
	at org.eclipse.core.internal.preferences.EclipsePreferences.flush(EclipsePreferences.java:351)
	at org.eclipse.core.internal.resources.ProjectPreferences.flush(ProjectPreferences.java:585)
	at org.eclipse.core.internal.preferences.EclipsePreferences.flush(EclipsePreferences.java:339)
	at org.eclipse.core.internal.resources.ProjectPreferences.flush(ProjectPreferences.java:585)
	at org.eclipse.core.internal.resources.CharsetManager.setCharsetFor(CharsetManager.java:280)
	at org.eclipse.core.internal.resources.Container.setDefaultCharset(Container.java:255)
	at org.eclipse.ui.ide.dialogs.ResourceEncodingFieldEditor$1.run(ResourceEncodingFieldEditor.java:134)
	at org.eclipse.core.internal.jobs.Worker.run(Worker.java:76)
Comment 41 John Arthorne CLA 2005-11-11 17:02:26 EST
Re comment #40 - can you enter a new bug report for that error?  In the report,
include what version/build of Eclipse you are using (build id is in Help >
About...).
Comment 42 P Duffy CLA 2005-11-14 10:57:53 EST
Re: 41, under which category should this bug be filed?
Comment 43 John Arthorne CLA 2005-11-14 11:03:56 EST
You can log it under Platform Resources.
Comment 44 John Arthorne CLA 2007-01-22 10:42:20 EST
*** Bug 171087 has been marked as a duplicate of this bug. ***
Comment 45 Dani Megert CLA 2007-12-18 02:46:51 EST
*** Bug 213251 has been marked as a duplicate of this bug. ***
Comment 46 koko CLA 2008-12-16 11:45:21 EST
+1. UTF-8 should be the obvious default character set for all text files. I had a problem with Eclipse when I was using an English Windows XP system and trying to open a file containing Chinese characters: as you can imagine, the display was completely messed up, and Eclipse didn't tell me what I needed to do. I had to spend time googling for answers, and I had to put -Dfile.encoding=UTF-8 in eclipse.ini so that it behaves correctly.

If Eclipse had this on by default, or at least detected the file encoding correctly, a user like me wouldn't have to go through all this trouble to get a file to display as it should. Every other text editor I use, such as PsPad, ultraedit and notepad++, displays the file properly.
Comment 47 David Williams CLA 2008-12-16 13:50:38 EST
(In reply to comment #46)

> 
> If eclipse had this on as default or at least detect file correctly, a user
> like me wouldn't have to go through all this trouble to get a file to display
> as it should have been displayed. Every other text editor I use such as PsPad,
> ultraedit, notepad++, display the file properly.
> 

Making UTF-8 the default is not the right solution for the problem you were having. I'd suggest you open a separate bug describing the details and how Eclipse didn't meet your expectations compared to the other editors. You'd have to attach a sample file (I recommend zipping it up, so it doesn't get "changed" by attachments and browsers). You should also attach your "configuration" (obtained from about box) so someone could maybe see what the problem is. Also, the .settings folder from the project might also be important. 

There may be "nothing we can do" for an automatic fix, but ... that'd be the right approach, not making UTF-8 the default. 

Thanks, 
Comment 48 Alex Blewitt CLA 2009-05-28 17:43:57 EDT
I'm going to bump this up to 4.0 and re-open. Considerations about distributed resources (where the client OS and server OS may be in different locales/character sets) re-emphasise the need for a universal text encoding.
Comment 49 David Williams CLA 2009-05-28 17:54:32 EDT
(In reply to comment #48)
> I'm going to bump this up to 4.0 and re-open. Considerations about distributed
> resources (where the client OS and server OS may be in different
> locales/character sets) re-emphasise the need for a universal text encoding.
> 

I'd suggest waiting to re-open until all the client OS's and server OS's agree to a universal encoding. :) Until then, anything else is going to break someone. 

Comment 50 Alex Blewitt CLA 2009-05-28 18:49:47 EDT
If the distributed EFS representation can't come up with some kind of encoding (or at least, demarcating what encoding the resources should be in) then E4 isn't really going to stand much of a chance. Even HTTP resources announce a Content-Type with a charset encoding; this could easily be used to work with a UTF-8 representation.

It may not be universal, but it's a darn sight more universal than any other encoding you can name (unless you go with other UTF-* encodings, or one of its subsets like ASCII) 
Comment 51 David Williams CLA 2009-05-29 01:32:07 EDT
(In reply to comment #50)
> If the distributed EFS representation can't come up with some kind of encoding
> (or at least, demarkating what encoding the resources should be in) then E4
> isn't really going to stand much of a chance. Even HTTP resources announce a
> Content-Type with a charset encoding; this could easily be used to work with a
> UTF-8 representation.
> 
> It may not be universal, but it's a darn sight more universal than any other
> encoding you can name (unless you go with other UTF-* encodings, or one of its
> subsets like ASCII) 
> 

I don't know much (nothing really) about "distributed EFS" but if you are saying that e4 should use a protocol that contains the encoding in the data stream itself (like HTTP) then I wholeheartedly agree with that. 

See also bug 210704. 

Comment 52 Martin Oberhuber CLA 2009-05-29 03:48:22 EDT
+1 for embedding encoding in the character stream wherever we can (like XML,
   HTTP, some kinds of file systems). Encoding is meta-info for the data and
   belongs to the data, not to a separate user-changeable setup.

For the actual data in the workspace, the main problem is that this often needs to be interoperable with legacy tools. People want to use their Eclipse editor and legacy editor interchangeably. Encoding is really owned by the data (which may be legacy) and not by Eclipse.

There may be some projects (like Java) where UTF-8 is the obvious default choice. In other cases (old C, Makefiles, some Webpages) a very conservative ISO-8859-1 may be the best choice which inhibits accidentally entering "odd" characters from the start. To add to complexity, default encoding may also be specified by the underlying OS/Platform or Country -- although that's often not really desired, especially when data is meant to be shared across geo boundaries like we see more and more.

I'm in favor of having Eclipse auto-detect the proper encoding in more places than it does today, but too much magic in a toolset is always a slippery slope, and the problem is a tough one to solve.

Perhaps the simplest thing that could possibly work is this:
(1) Always accept encoding as specified inside data stream.
(2) Use UTF-8 default encoding unless otherwise specified.
(3) Have project creation wizards / natures override that default as appropriate.
Comment 53 Szymon Brandys CLA 2009-10-19 10:59:45 EDT
*** Bug 284637 has been marked as a duplicate of this bug. ***
Comment 54 Matthieu Paret CLA 2011-05-20 04:00:36 EDT
This bug is a real pain. Ever tried to generate Javadoc with MacRoman?
And for French people ...
Please remove MacRoman. It would be good for newbies ...
Comment 55 Tilman Potthof CLA 2011-12-18 07:12:47 EST
This bug is so annoying and kills productivity. It has happened to me a hundred times; working with other people gets disrupted by this silly behavior.

Please just do UTF-8 as default.
Comment 56 David Williams CLA 2011-12-22 20:09:35 EST
(In reply to comment #55)
> This bug is so annoying and kills productivity. It happened a hundred times to
> me, the working with other people is distracted by that silly behavior.
> 
> Please just do UTF-8 as default.

I agree with comment 18, made in 2005, "I'm surprised this discussion (reasoned or not) has dragged on this long." 

A "fix" will have to be something other than changing the default, for the many reasons mentioned in the many comments over the past 6 years, so it's not constructive just to keep suggesting that. Perhaps someone would want to work on a new feature, say, to "provide a better warning UTF-8 is not being used" or something ... but seriously ... 2005 ... changing the default discussed for 6 years?! 

Perhaps leaving this bug as opened and "new" gives the wrong impression ... perhaps it should be closed as "won't fix"? And interested parties could open more specific feature requests for new behavior or features that wouldn't break existing users and data?
Comment 57 Dani Megert CLA 2012-01-03 11:39:38 EST
> Perhaps leaving this bug as opened and "new" gives the wrong impression ...
> perhaps it should be closed as "won't fix"?
+1.
Comment 58 Stijn de Witt CLA 2012-02-07 18:29:48 EST
Or maybe now that it's 2012 (!!) we can suppose that Unicode support has now progressed far enough for Eclipse to make UTF-8 the default for new projects?

Funny thing is that because all these tools use these very conservative defaults, new files get created in legacy encodings every day, reinforcing the need to be conservative... a vicious circle, perpetuating itself. If these tools would just embrace Unicode, within a year or two we could forget about those legacy encodings.

The default is important because the majority of programmers don't know or care about encodings and will use the default no matter how bad it may be.
Comment 59 Martin Spamer CLA 2012-05-28 10:07:56 EDT
(In reply to comment #58)
> Or maybe now that it's 2012 (!!) we can suppose that Unicode support has now
> progressed far enough for Eclipse to make UTF-8 the default for new projects?
> 
> reenforcing the
> need for being conservative... in a vicious circle, perpetuating itself. 
+1
Comment 60 Laurent Barbareau CLA 2012-09-02 13:01:49 EDT
Probably, Alex Blewitt, you were too far ahead of your time...

Others were certainly afraid of the potential bugs and complaints that this would have caused.

OK guys! Here we go now! Ready for the next release? It's just a property to switch, isn't it?

And what about displaying the current encoding of the active editor, in the status bar for instance?

Thank you.
Comment 61 Stijn de Witt CLA 2013-06-20 16:14:27 EDT
Maybe I'm getting this all wrong, but this discussion is about the creation of *new* files right?

The original problem statement was this:

"Why does Eclipse default to a platform-specific encoding for
its file types when creating new files?"

So maybe I'm dumb, but can someone explain why creating a new file as UTF-8, from within Eclipse, would be a problem for anyone?

Now I can give many examples of what I consider to be very plausible use cases where the current setting of platform default gives problems, but I'd be repeating what has been said before.

Somehow it looks to me like the majority of the people voting against this are mostly thinking of scenarios involving the opening of *existing* files.
Comment 62 Robin Stocker CLA 2013-10-09 10:49:12 EDT
Another user having a problem with the current situation has led me here:

http://stackoverflow.com/questions/19251180/encoding-issues-in-eclipse-for-mac-and-for-windows

As this is still open, please note that NetBeans uses UTF-8 as default:

http://wiki.netbeans.org/FaqI18nProjectEncoding

UTF-8 being a minority encoding: This is no longer true, at least for the web:

http://w3techs.com/technologies/overview/character_encoding/all

(In reply to Dani Megert from comment #3)
> Because Eclipse is not just an IDE

Maybe this should be another one of the defaults which should be overriden for the IDE packages then.
Comment 63 David Williams CLA 2013-11-23 13:59:32 EST
(In reply to Stijn de Witt from comment #61)
> Maybe I'm getting this all wrong, but this discussion is about the creation
> of *new* files right?
> 
> The original problem statement was this:
> 
> "Why does Eclipse default to a platform-specific encoding for
> its file types when creating new files?"
> 
> So maybe I'm dumb, but can someone explain why creating a new file as UTF-8,
> from within Eclipse, would be a problem for anyone?
> 
> Now I can give many examples of what I consider to be a very plausible use
> case for the current setting of platform default to give problems, but I'd
> be repeating was has been said before.
> 
> Somehow it looks to me like the majority of the people voting against this
> are mostly thinking of scenarios involving the opening of *existing* files.

Yes, but ... imagine a user has thousands of files created under the old default assumption ... and then they work for a while and create a couple of hundred new files under the new assumed default ... and then someone else on the team checks out that project (let's say, for the first time) ... how is it known at that point which were the "old, existing" files ... and which were the newly created files? (I actually think I know of one answer to this, but just wondering if you do ... or, what you had in mind). 

Similarly, what if a user had a few thousand files that already existed, let's say created with plain ol' text editors or some old tools that used the platform encoding ... and the user wants to "import" those into Eclipse. I think from Eclipse's point of view, those are (sort of) "new files" ... but still, either way, data could be lost if Eclipse made some other assumption or tried to do some "automatic conversion". 

Keep in mind, there are some file encodings, such as for various Japanese, Chinese, or Arabic languages that can not be properly encoded using UTF-8, so a simple "automatic conversion" would not (always) work. Well, I'm 99% sure of that :) ... native language developers are free to correct me -- I'm certainly not knowledgeable enough to know the exact list. 

I'm sure all these problems are "solvable" (to some extent) ... but, it would take more work than simply "changing the default workspace preference".
Comment 64 Stanislav Spiridonov CLA 2013-11-25 09:23:29 EST
Maybe for the same reasons we need to go back to some 7-bit encoding? Imagine I have a thousand files in that encoding. And of course I use a pretty old-school 7-bit editor (why not?). And now I try to open my files in Eclipse. Wow! It doesn't work! Why?!
Comment 65 Dani Megert CLA 2013-11-25 09:35:55 EST
(In reply to Stijn de Witt from comment #61)
> Maybe I'm getting this all wrong, but this discussion is about the creation
> of *new* files right?

It said so in the initial comment, but that's not the whole story about how it currently works in Eclipse. Most files don't tell you what their encoding is and hence that preference is also important/used when reading files. The encoding is detected
1. from the file contents, if possible
2. from the file's encoding setting, if available
3. from the file's (parent) folder's encoding setting, if available
4. from the file's project encoding setting, if available
5. from the workspace encoding preference

To be independent of workspaces, it is therefore recommended to set the project specific encoding, so that it can be shared via repository.
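
The lookup order listed above can be sketched as a simple "first setting that is present wins" chain. This is only an illustration in plain Java: the class and method names are hypothetical and are not Eclipse API, and the real Platform/Resources implementation is certainly more involved.

```java
import java.util.Arrays;
import java.util.Optional;

// Hypothetical illustration of a "most specific setting wins" lookup.
// None of these names are Eclipse API.
final class EncodingLookup {

    /**
     * Returns the first encoding that is present, checking the candidates in
     * the order given (file contents, file setting, folder setting, project
     * setting), and falls back to the workspace preference otherwise.
     */
    @SafeVarargs
    static String resolve(String workspaceDefault, Optional<String>... mostSpecificFirst) {
        return Arrays.stream(mostSpecificFirst)
                .filter(Optional::isPresent)
                .map(Optional::get)
                .findFirst()
                .orElse(workspaceDefault);
    }

    public static void main(String[] args) {
        // Nothing detected from contents, file or folder; the project says UTF-8.
        String withProjectSetting = resolve("Cp1252",
                Optional.empty(), Optional.empty(), Optional.empty(), Optional.of("UTF-8"));
        // Nothing set anywhere: the workspace preference decides.
        String withoutAnySetting = resolve("Cp1252",
                Optional.empty(), Optional.empty(), Optional.empty(), Optional.empty());
        System.out.println(withProjectSetting + " / " + withoutAnySetting);
    }
}
```

The second call shows why a project-level setting matters: without it, the result depends entirely on whatever the workspace preference happens to be.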
Comment 66 Tiger Shark CLA 2014-01-26 20:35:27 EST
Why was the only "reason" to keep it as is "the only sensible choice" or something like that?


Just another issue. I have a Java app in which I have a hardcoded regex with an accented character (á). Eclipse warned me the file should be saved as UTF-8 and so I did.

After some months, I had to re-import the project into another Eclipse instance on the same computer. The file was imported with the Windows locale encoding or something, and thus my "á" got corrupted. Since it was a regex, I could not detect the issue until it was too late...
Even changing the file to UTF-8 again wouldn't fix the corrupted character. I had to edit the source. So... a project breaks just by re-importing it if you save a file as UTF-8 without changing the property for the entire project/environment???????


Please enumerate real reasons for not changing the default to UTF-8 and please, stop assuming that because there is an option somewhere to change it, everyone will find it right away without losing hours of valuable time.
Comment 67 Stanislav Spiridonov CLA 2014-01-27 01:54:35 EST
While the Eclipse team is trying to please everyone (impossible IMHO), you can set UTF-8 as the default for the whole JDK by setting the system (or user) environment variable JAVA_TOOL_OPTIONS=-Dfile.encoding=UTF8. After that the Java editor will use UTF-8 by default.

But even if you set that property, you will still need to update the default encoding for JSP and .properties files (#@!$#@!$) in Eclipse...
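
To check whether a -Dfile.encoding setting like the one above actually reached a given JVM, a small standalone program can print what that JVM reports. This is only a sketch; the class name is made up for illustration, and it says nothing about what Eclipse itself does with the value.

```java
import java.nio.charset.Charset;

// Hypothetical helper: prints what the running JVM considers its default charset.
final class DefaultCharsetCheck {

    /** Returns the JVM default charset together with the raw file.encoding property. */
    static String describe() {
        return "defaultCharset=" + Charset.defaultCharset().name()
                + ", file.encoding=" + System.getProperty("file.encoding");
    }

    public static void main(String[] args) {
        System.out.println(describe());
    }
}
```

Running it once with and once without the environment variable set makes the difference visible on the machine in question.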
Comment 68 David Williams CLA 2014-01-27 02:45:39 EST
(In reply to Tiger Shark from comment #66)

> 
> Please enum real reasons for not changing the default to utf8 and please,
> stop assuming that because there is an option somewhere to change it,
> everyone will find it right away without losing hours of valuable time.

I think the real reasons have been stated in this bug (though, I admit, it is a lot to read), and I think it's unfair to say they have not been. It basically comes down to this: many people have different requirements, and many people have decades of existing data that was created and continues to operate under one set of assumptions, and we don't want to break them. Plus, I have to emphasize, there are many places where "encoding can go wrong" and making one of them the default all the time will not magically solve all those problems. Those who use encodings must understand them to some degree. 

I still think the path to improvement is to make the "user education" better. Such as instead of the editor warning you "the file should be saved as UTF-8", perhaps it should have also given you the choice to change the project settings to UTF-8? 

Further, from your description, I have a feeling your character would have gotten corrupted anyway ... going to a Windows machine that was not using UTF-8 ... works in cases where you export/import projects from/to an SCM system, but it almost sounds like you copied the single file to the file system? (Otherwise, the file would have still been "ok", going from Eclipse to Eclipse via SCM projects.) If it was XML, it would have been handled fine if you had a correct DECL, but perhaps it was a bash script, which does not have any means to "self document" its encoding? I only mention all these to try to convey the fact that "it's complicated".
Comment 69 Stanislav Spiridonov CLA 2014-01-27 06:12:53 EST
Hmm... "Windows machines" have used UTF-8 as default for at least several years :) 

For some reason the Oracle JDK on Windows treats the "Language for non-Unicode programs" as the system locale, and so each Java application on Windows works in non-Unicode mode. Fortunately you can force Java to use UTF-8, but Eclipse still uses the local encoding even in that case.
Comment 70 Tiger Shark CLA 2014-01-27 08:35:43 EST
(In reply to David Williams from comment #68)
> (In reply to Tiger Shark from comment #66)
> 
> > 
> > Please enum real reasons for not changing the default to utf8 and please,
> > stop assuming that because there is an option somewhere to change it,
> > everyone will find it right away without losing hours of valuable time.
> 
> I think the real reasons have been stated in this bug (though, admit, it is
> a lot to read) but think its unfair to say they have not been. It basically
> comes down to many people have different requirements, many people have
> decades of existing data that was created and continues to operate under one
> set of assumptions and we don't want to break them. Plus, I have to
> emphasize, there are many places where "encoding can go wrong" and making
> one of them the default all the time, will not magically solve all those
> problems. Those that use encodings must understand them to some degree. 
> 
> I still think the path to improvement is to make the "user education"
> better. Such as instead of the editor warning you "the file should be saved
> as UTF-8", perhaps it should have also given you the choice to change the
> project settings to UTF-8? 
> 
> Further, from your description, I have a feeling your character would have
> gotten corrupted anyway ... going to a windows machine, that was not using
> UTF-8, ... works in cases if you export/import projects from/to an SCM
> system, but it almost sounds like you copied the single file to the file
> system? (Otherwise, the file would have still been "ok", going from Eclipse
> to Eclipse via SCM projects.) If it was XML, it would have been handled fine
> if you had a correct DECL, but, perhaps it was a bash script, which does not
> have any means to "self document" its encoding? I only mention all these to
> try an convey the fact that "its complicated".

Not really. I did not move the files. I'm still using the very same machine and OS on which the file was originally created. It's a Java source file, nothing else.
The problem is, the file was in UTF-8, but after re-importing the project, Eclipse assumed the system locale. So when I opened the file after the re-import, the character became corrupted, and since it was no longer a Unicode character, the locale encoding was used on every subsequent save. 
I agree the option to switch the entire project default to UTF-8 should be displayed, but that alone is not enough if Eclipse won't detect it on a re-import of an existing project.

Legacy code... ok, so this change will never come, as there will always be locale-encoded files since it's the default XD
Comment 71 Stijn de Witt CLA 2014-01-28 12:40:06 EST
"Legacy code... ok, so this change will never come as there will always be locale encoded files since its the default XD"

This kinda sums it up. Eclipse and dozens of other programs insist on continuing to create new files in the clearly inferior legacy encodings of old. The reason is that there are so many files out there in those encodings... It's a self-fulfilling prophecy. Or a vicious cycle or whatever. The point is that those legacy encoded files will never go away until we start changing the defaults.

This is also the reason why I (and, judging from the comments, many others) am not happy with just having the ability to change the setting on our local machines. It does not stop the mass creation of new files in legacy encodings that keeps us all locked in this never-ending story.

Since this bug has been open so long and there are so many comments, I think the people at Eclipse should think of ways to give the community at least something... Maybe for NEW projects, set the encoding setting to Unicode/UTF-8? Or make it an important setting in New Project wizards? So people explicitly have the option to set the encoding, possibly with some explanation?
Comment 72 Marcel Stör CLA 2014-01-28 14:21:36 EST
(In reply to Stijn de Witt from comment #71)
>  It's a self-fulfilling prophecy.

Nicely put... Eclipse 4.x would have been (another) perfect opportunity to "modernize" the default encoding. We should accept that at some point the past is just that - the past.
Comment 73 Lars Vogel CLA 2014-02-24 10:41:28 EST
*** Bug 428892 has been marked as a duplicate of this bug. ***
Comment 74 Szymon Ptaszkiewicz CLA 2014-02-24 12:01:36 EST
There is really nothing we can do at the Platform/Resources level.
Comment 75 Stanislav Spiridonov CLA 2014-02-25 02:09:28 EST
Ten years to find out that the issue has the wrong parameters?! And just close it?! Why not correct the Product/Component to the right ones?
Comment 76 Doug Schaefer CLA 2014-02-25 09:37:45 EST
WONTFIX is not acceptable for this bug. There's been a pretty intense outcry from leaders in the community on this. Let's keep it open and work out a proper solution.
Comment 77 Szymon Ptaszkiewicz CLA 2014-02-25 10:12:11 EST
(In reply to Doug Schaefer from comment #76)
> WONTFIX is not acceptable for this bug. There's been a pretty intense outcry
> from leaders in the community on this. Let's keep it open and work out a
> proper solution.

I don't know why everyone thinks that "proper solution" is to change component default values. As stated numerous times in this bug, the default value cannot change at the Platform/Resources level (component level).

However, it is always possible to change the default per product, so for example one can ask to change it for a certain EPP package via the pluginCustomization file. The same thing was done in many other cases, e.g. lightweight refresh - it is still disabled by default at the component level, but enabled via the pluginCustomization file for EPP packages (see bug 384104). If you want to have different defaults than component defaults, that's the way forward.

Marking WONTFIX, because there is really nothing we can do at the Platform/Resources level. Feel free to move this bug to EPP.
Comment 78 Matthieu Paret CLA 2014-02-25 10:30:54 EST
 @all you opened the bug in the wrong component, and nobody told you for ten years. As we don't care about your problem here and are already happy building the best IDE in the world, we're closing it. Love you <3
Comment 79 Wayne Beaton CLA 2014-02-25 14:28:26 EST
(In reply to Matthieu Paret from comment #78)
>  @all you opened the bug in the wrong component, nobody tells you during ten
> years. As we don't care your problem here and we are already happy to build
> the best IDE in the world, we're closing it. Love you <3

In defense of the Platform team, the EPP project did not exist 10 years ago. 

(In reply to Doug Schaefer from comment #76)
> WONTFIX is not acceptable for this bug. There's been a pretty intense outcry
> from leaders in the community on this. Let's keep it open and work out a
> proper solution.

+1

But let's move it to EPP and see if we can convince a package maintainer to implement this.
Comment 80 Doug Schaefer CLA 2014-02-25 15:34:13 EST
Who's going to come over here and fix my plug-in customization file? If fixing everything that's wrong with the Platform in places not in the Platform is the direction we want to take, then let's plan that carefully and do it right.
Comment 81 Dani Megert CLA 2014-02-26 05:27:10 EST
(In reply to Doug Schaefer from comment #80)
> If
> fixing everything that's wrong with the Platform

You can claim it was wrong not to use UTF-8 when we started. Fair enough. But changing it now would definitely be wrong and break clients. Cp1252 is *not* a subset of UTF-8. This means if a user wrote a file in a current Eclipse Windows workspace with the following content:
Diese Nüsse sind geröstet.
and then opens, edits and saves it with a workspace where the default is now UTF-8, he will end up with a corrupted file. At least for the Platform this is not an option. If a certain EPP, RCP or product does not see this as a problem for their clients, then they are free to change the default.
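
The corruption described above can be reproduced outside Eclipse. The following sketch writes the sentence as Cp1252 bytes and then reads those bytes back as UTF-8; the class and method names are invented for illustration, and the non-ASCII letters are written as \u escapes so that the example does not itself depend on the source file's encoding. It assumes the JVM provides the windows-1252 charset.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical demo: bytes written as Cp1252 and then decoded as UTF-8.
final class Cp1252Demo {

    /** Encodes text with one charset and decodes the resulting bytes with another. */
    static String reinterpret(String text, Charset writtenAs, Charset readAs) {
        byte[] bytes = text.getBytes(writtenAs);
        return new String(bytes, readAs);
    }

    public static void main(String[] args) {
        Charset cp1252 = Charset.forName("windows-1252");
        // "Diese Nüsse sind geröstet." with ü and ö as Unicode escapes.
        String original = "Diese N\u00fcsse sind ger\u00f6stet.";
        String misread = reinterpret(original, cp1252, StandardCharsets.UTF_8);
        System.out.println(misread);
        // The single bytes for ü and ö are not valid UTF-8, so the decoder
        // substitutes U+FFFD; saving the text again makes the loss permanent.
        System.out.println(misread.contains("\uFFFD"));
    }
}
```

Pure ASCII text survives the same round trip unchanged, which is why the problem tends to go unnoticed until the first umlaut or accent appears.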
Comment 82 Laurent Barbareau CLA 2014-02-26 06:43:44 EST
We do know that this is a difficult change that could cause problems for many people if it is not well managed.

But do you really think that we can continue like this, seriously?

Legacy character encodings are a recurring problem that we must eradicate.

So let's go now!

Let's lay out the issues to resolve, so that a patch can be produced whose side effects on existing installations will be as small as possible.

Some strategies:

1 - Make the modification and, when deploying it, have a kind of popup window warn users of the difficulties they might encounter and what to do about them. That solution could be more or less clever in the way it detects the current context and what to warn about...

2 - Announce, long before the target release, that it will include a core modification that could affect existing installations. Work out different procedures to fix the side effects.

3 - Make a sophisticated patch that tries to fix most of the common cases, avoiding side effects as far as possible.

4 ...

These are only starting points.

Your turn, and be positive and constructive! Thank you.
Comment 83 Martin Oberhuber CLA 2014-02-26 09:29:08 EST
Would it be an option to set UTF-8 by default for *new* workspaces only, 
but keep the current behavior for existing workspaces ?
Comment 84 Paul Webster CLA 2014-02-26 09:36:17 EST
(In reply to Martin Oberhuber from comment #83)
> Would it be an option to set UTF-8 by default for *new* workspaces only, 
> but keep the current behavior for existing workspaces ?

I like this option as well, although there's still a risk: whenever I create a new workspace it's to check out existing projects :-)  But that could be part of the migration guide "if you have projects that aren't UTF-8 and you create a new workspace you have to either specify the encoding in the project settings or flip the setting back to default in your new workspace".

PW
Comment 85 Dani Megert CLA 2014-02-26 09:39:16 EST
(In reply to Martin Oberhuber from comment #83)
> Would it be an option to set UTF-8 by default for *new* workspaces only, 
> but keep the current behavior for existing workspaces ?

That would only protect the existing workspace, but not the case where you check out code from a repository or import an existing project. A more durable solution could be to add a new option that allows specifying the encoding to use *and set* when creating a new project. That would also make sure that the project can be opened in all workspaces, since the encoding would be set on the project.
Comment 86 Martin Oberhuber CLA 2014-02-26 11:03:40 EST
(In reply to Dani Megert from comment #85)
> durable solution could be to add a new option that allows to specify the
> encoding to use *and set* when creating a new project. 

Great suggestion. Encoding should be associated with the project anyways, and not with the workspace. That way, (new projects created with UTF-8 by default), more and more projects would convert to UTF-8 over time. 

At one point, a warning dialog could come up when importing a project that doesn't have the encoding specified.
Comment 87 Nikolas Grottendieck CLA 2014-03-12 08:43:54 EDT
Elaborating on the suggestions already made. Eclipse already has a default welcome screen for new workspaces, what about adding information on that very screen about the currently used character encoding and an option to change it then and there? Alternatively a small (~5ish steps) setup at the very first startup of Eclipse itself to set a small number of default settings, say encoding, line numbers, etc.

This would serve to educate the users about existing problems and capabilities as well as offer an easy and mostly painless way to address the issue at hand. Such a welcome screen / setup tool could of course be used for projects instead/as well, too.
Comment 88 a e CLA 2014-06-07 08:03:00 EDT
+1 for changing to UTF-8 as the default text file encoding on all platforms.

Currently, the default for Swedish Eclipse users on Windows is Cp1252 (Windows code page 1252).

I recently began work on a 5-year-old medium-sized project (~100 ppl), where nobody knew or cared about character encoding in the IDE at the time the project started.
So now everyone in this project still has to use Cp1252 in the IDE.
And yes, since this is a governmental system, all code comments and logging have to be in Swedish.
One "funny" thing is that the system runs on UNIX, where logging using Cp1252 is not optimal...
Comment 89 Laurent Barbareau CLA 2014-06-07 08:27:51 EDT
@a e do vote for this issue ;)

One might wonder how many votes (given the age of this issue) are needed to trigger any study of a fix, but what else can we do?
Comment 90 Stijn de Witt CLA 2014-06-15 05:36:01 EDT
"You can claim it was wrong not to use UTF-8 when we started. Fair enough. But changing it now would definitely be wrong and break clients. Cp1252 is *not* a subset of UTF-8. This means if a user wrote a file in a current Eclipse Windows workspace with the following content:
Diese Nüsse sind geröstet.
and then opens, edits and saves it with a workspace where the default is now UTF-8, he will end up with a corrupted file."

And how is this different from a user that already has set the default encoding to UTF-8 in his workspace? Or even worse, a user that did not make any changes, but is running on an OS that has a different encoding set as the default?

Your use case is based on the assumption that people work alone, on the same machine. Only in these cases would something break which does not break in the current situation. 

But look at it from this perspective: in the current situation people's projects will *always* break when they are shared across different machines with different default encodings. Stuff is *already* broken. That is why this issue has existed for ten years and people are still taking the trouble to add comments to it.

How about this:

New files:           UTF-8 with BOM
Existing files:      Auto-detect
Fallback:            Platform encoding.

Basically rename the existing option to 'Fallback' and add a new option for New files.

If a file being opened does not contain a BOM (and so encoding can not be determined reliably), use the fallback encoding when opening. Otherwise use the encoding detected from the BOM. When creating new files, create them as UTF-8 with BOM so that they will be opened correctly even on machines that have different defaults set.
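
A minimal sketch of that proposal in plain Java, assuming only the UTF-8 BOM (EF BB BF) is checked; UTF-16 BOMs and any content-based detection are left out. The class and method names are hypothetical and not part of any Eclipse API.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical sketch of "BOM if present, otherwise fallback encoding".
final class BomFallback {

    private static final byte[] UTF8_BOM = {(byte) 0xEF, (byte) 0xBB, (byte) 0xBF};

    /** Returns UTF-8 if the bytes start with the UTF-8 BOM, otherwise the given fallback. */
    static Charset detect(byte[] content, Charset fallback) {
        if (content.length >= UTF8_BOM.length
                && content[0] == UTF8_BOM[0]
                && content[1] == UTF8_BOM[1]
                && content[2] == UTF8_BOM[2]) {
            return StandardCharsets.UTF_8;
        }
        return fallback;
    }

    /** Encodes text the way the proposal suggests for new files: UTF-8 with a leading BOM. */
    static byte[] encodeNewFile(String text) {
        byte[] body = text.getBytes(StandardCharsets.UTF_8);
        byte[] out = new byte[UTF8_BOM.length + body.length];
        System.arraycopy(UTF8_BOM, 0, out, 0, UTF8_BOM.length);
        System.arraycopy(body, 0, out, UTF8_BOM.length, body.length);
        return out;
    }
}
```

Whether other tools in a given toolchain tolerate a BOM at the start of a source file is a separate question that this sketch does not address.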

If you look at all the ideas suggested for this issue that would at least improve on the situation you can't keep saying that 

"As stated numerous times in this bug, the default value cannot change at the Platform/Resources level (component level)."

If you guys really wanted this you *could* and you would change it.
Comment 91 Laurent Barbareau CLA 2014-09-23 05:06:27 EDT
Stijn, thank you for those reminders and explanations.

Of course, this remains a really tricky migration, but Eclipse has to switch to UTF-8 as the default encoding. There is no other option: alternative encodings are a heavy hindrance and will eventually disappear.

Every day I'm personally facing encoding problems, because of Eclipse, because of my development environment that I can't change, because of gaps in plugins' encoding support, and because in my language we can't make do with ASCII characters.

So now, let's establish steps to make that migration as smooth as possible!

First of all: users have to be prepared, and warned each time, that UTF-8 could/should be considered instead of the OS encoding or any other one. This should occur when installing Eclipse, creating a new workspace, creating each new project, RCP, plug-ins...
The encoding of the file open in the editor that has the focus could/should be displayed somewhere (in the status bar, for instance).
...
Comment 92 David Carver CLA 2015-01-12 12:34:48 EST
Created attachment 249872 [details]
Default Encoding in IntelliJ Community Edition 14

Attached a screenshot from IntelliJ Community Edition 14, which shows what the IDE's encoding is set to, and that projects use the system default. So IntelliJ also defaults to whatever the system uses. For certain files, like IDE files and XML files, it defaults to UTF-8, but it is still the project's responsibility to set a default encoding if it needs to.
Comment 93 Stanislav Spiridonov CLA 2015-01-12 12:47:11 EST
Created attachment 249873 [details]
UTF-8 from a system in Eclipse

My system default is UTF-8, but Eclipse still uses a legacy encoding for certain file types, e.g. for .properties it is ISO-8859-1. So I need to check these types each time I set up a new workspace.
Comment 94 Markus Knauer CLA 2015-01-12 12:53:43 EST
(In reply to Stanislav Spiridonov from comment #93)
> e.g. for .properties it is ISO-8859-1.

This is one of the exceptions... the default encoding of a Java .properties file is in fact ISO 8859-1, not UTF-8, see the following JavaDoc:

http://docs.oracle.com/javase/6/docs/api/java/util/Properties.html#load%28java.io.InputStream%29
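
A small sketch of what that JavaDoc means in practice, assuming a properties file that was saved as UTF-8. The class and method names are hypothetical; `Properties.load(InputStream)` and `Properties.load(Reader)` are the standard JDK calls. The stream variant decodes bytes as ISO 8859-1, while the reader variant lets the caller pick the charset.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.Properties;

// Hypothetical demo class: the same bytes loaded two different ways.
final class PropertiesEncodingDemo {

    /** Loads via InputStream: the bytes are decoded as ISO 8859-1. */
    static String loadFromStream(byte[] raw) {
        Properties p = new Properties();
        try {
            p.load(new ByteArrayInputStream(raw));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return p.getProperty("greeting");
    }

    /** Loads via Reader: the caller chooses the charset, here UTF-8. */
    static String loadFromUtf8Reader(byte[] raw) {
        Properties p = new Properties();
        try {
            p.load(new InputStreamReader(new ByteArrayInputStream(raw), StandardCharsets.UTF_8));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return p.getProperty("greeting");
    }

    public static void main(String[] args) {
        // "caf\u00e9" saved as UTF-8: the e-acute becomes the two bytes C3 A9.
        byte[] raw = "greeting=caf\u00e9\n".getBytes(StandardCharsets.UTF_8);
        System.out.println(loadFromStream(raw).length());     // two chars for the e-acute
        System.out.println(loadFromUtf8Reader(raw).length()); // one char for the e-acute
    }
}
```

Frameworks that read .properties files through their own reader can therefore treat them as UTF-8, which is consistent with the editor default being ISO 8859-1 for plain Java usage.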
Comment 95 Stanislav Spiridonov CLA 2015-01-12 13:14:08 EST
Thank you, I see the reason. I am working with GWT, and it recognizes UTF-8 in properties files. But anyway, there are still the JSP content types, which also have a legacy encoding by default.
Comment 96 Dani Megert CLA 2015-01-28 08:18:07 EST
*** Bug 458618 has been marked as a duplicate of this bug. ***
Comment 97 Mickael Istria CLA 2015-09-29 10:08:48 EDT
I've read someone complaining about it, again.
I have a dummy question on this topic: is the encoding inherited between content-types? In preferences, the view is a tree, and the "root" text content-type doesn't specify an encoding. If we only set this one to UTF-8, does that mean that all children content-types that don't override the setting will be UTF-8 ?
If yes, it seems to be a minor change for a reasonable gain in user satisfaction.
Comment 98 Olivier Croisier CLA 2015-09-29 17:17:52 EDT
Hi, I am the complaining guy Mickael Istria refers to in comment #97.

Every time, this happens: the development team works on Windows, unzips Eclipse, and works for a few weeks; then the product is deployed on a production server that runs on anything but Windows (Solaris, RedHat...) and thus doesn't use an MS proprietary charset, but rather UTF-8.
And then all web pages display those lovely diamonds with question marks, stating that we yet again have an encoding issue. So now everyone has to take time to reconfigure Eclipse in all the places it manages file encoding, verify hundreds of files for re-encoding issues, etc.

I agree it is the developer's duty to know his tools and configure them properly, but not everyone is a senior, encoding-problems-aware developer (at least not in any team I worked with so far), so a sane default would prevent sooo many problems.

Add to that, that nowadays most Java applications are web applications, thus very likely to be i18n'd in "funky" languages like Arabic, Japanese, Korean or Chinese (with glyphs not covered by WIN-CP1252 nor ISO-8859-1)...

To conclude, I would be very grateful if the Eclipse team would correct this simple but always painful issue!
Comment 99 David Williams CLA 2015-09-29 19:22:13 EDT
(In reply to Olivier Croisier from comment #98)

> Add to that, that nowadays most Java applications are web applications, thus
> very likely to be i18n'd in "funky" languages like Arabic, Japanese, Korean
> or Chinese (with glyphs not covered by WIN-CP1252 nor ISO-8859-1)...

What filetype are you complaining about? If you have webtools installed, it should pick up the correct encoding from the file content itself (as well as XML and many other types of files).

If you have your code under source code control, you should have set the encoding for your projects as you like ... then all the "new guys" will automatically get your preferred encoding set on those projects. 

> To conclude, I would be very grateful if the Eclipse team would correct this
> simple but always painful issue !

And you don't mind breaking many other people/projects who have different assumptions, eh? :)
Comment 100 Doug Schaefer CLA 2015-09-29 21:32:42 EDT
I'm sorry. UTF-8 is the industry standard. I can't see how anyone can deny that. I don't mind breaking people. Make the few who this impacts change the preference back.

Sometimes you have to take a hit to do the right thing for the vast majority of your users. We need to do the math. I'm tired of sucking just to keep the status quo to satisfy the few. Be brave and take the hit and do the right thing.
Comment 101 Mickael Istria CLA 2015-09-30 01:56:11 EDT
The webtools editors use the best strategy to *detect* the encoding when possible. But in case there is not enough to detect the encoding, using UTF-8 as the fallback seems to be the best approach from a user perspective.

I second Doug here. I believe that more people will be happy with the move to UTF-8 than unhappy with it, and that those who are using funky alternative conventions and encodings should be the ones having to take the extra step of setting their encoding if it doesn't match.
Telling most users to change a preference is not as good a user experience as setting this preference as the default.
Maybe this can be part of a future poll, such as the one that happened about line numbers some time ago?
Comment 102 Alex Blewitt CLA 2015-09-30 03:06:56 EDT
A future implementation note: if you set -Dfile.encoding=UTF-8 then you lose the ability to switch back to the native filesystem encoding (because you are effectively saying that the native file encoding is UTF-8). If you change the Eclipse preference from OS native to UTF-8 then it will at least permit those who want to switch back to do so.
Comment 103 Alex Blewitt CLA 2015-09-30 03:07:44 EDT
PS happy ten year bugaversary for this bug earlier this month :-)
Comment 104 Stanislav Spiridonov CLA 2015-09-30 03:15:33 EDT
(In reply to Alex Blewitt from comment #102)
> A future implementation note: if you set -Dfile.encoding=UTF-8 then you lose
> the ability to switch back to the native filesystem encoding 

Using JAVA_TOOL_OPTIONS=-Dfile.encoding=UTF8 is correct for Windows in 99% of cases, because Java "incorrectly" defines the native filesystem encoding. It takes the Windows fallback setting ("Language for non-Unicode programs") as the default.

So with JAVA_TOOL_OPTIONS=-Dfile.encoding=UTF8 you just set the CORRECT native filesystem encoding for Windows.
Comment 105 Mickael Istria CLA 2015-09-30 03:28:01 EDT
Note that the default workspace encoding is not a preference, it's computed in Platform and stored as a metadata on the workspace (so the issue seems to be in Platform, not in EPP).
See org.eclipse.ui.WorkbenchEncoding#getWorkbenchDefaultEncoding. It relies on the file.encoding of the JVM and falls back to UTF-8.
As Stanislav mentioned just above, advising users to revise their JVM settings on Windows rather than telling them to tweak the workspace is a good idea. However, it would be nice if users could figure this out by themselves in the IDE. I suggest we replace the "Default (...)" label for encoding in Preferences > General > Workbench by "JVM Default (...)" and add a tooltip such as "Relies on the 'file.encoding' JVM property. For development and execution consistency, it's recommended that you configure your JVM property rather than overriding the workspace configuration."
WDYT?
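
The lookup described in comment 105 can be sketched as below. This is an illustration of the described behaviour only, not the actual Eclipse source: the class `WorkbenchEncodingSketch` and its method are hypothetical, and the JVM properties are passed in as a parameter so the fallback can be exercised without restarting the JVM.

```java
import java.util.Properties;

// Hypothetical stand-in for the described lookup: JVM file.encoding, else UTF-8.
final class WorkbenchEncodingSketch {

    /** Returns the value of file.encoding if set, otherwise "UTF-8". */
    static String defaultEncoding(Properties jvmProperties) {
        return jvmProperties.getProperty("file.encoding", "UTF-8");
    }

    public static void main(String[] args) {
        // On a real JVM this prints whatever -Dfile.encoding (or the OS locale) produced.
        System.out.println(defaultEncoding(System.getProperties()));
    }
}
```

Under this reading, starting the JVM with -Dfile.encoding=UTF-8 changes the workspace default as a side effect, which is the trade-off discussed in comments 102 and 104.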
Comment 106 Tilman Potthof CLA 2015-09-30 03:53:25 EDT
Sorry, but this is so ridiculous. The platform may have a shitty default, so we have to stick to that shitty default. Of course, in an ideal world every dev should know that problem and configure their environment correctly, BUT you always have new developers in a team, or you have a new computer and forget that little tiny thing, and your encodings get messed up.

What was the benefit taking the platform default? I probably missed that point.
Comment 108 Mickael Istria CLA 2015-09-30 04:18:44 EDT
(In reply to Mickael Istria from comment #105)
> See org.eclipse.ui.WorkbenchEncoding#getWorkbenchDefaultEncoding. It relies
> on the file.encoding of the JVM and falls back to UTF-8.

Or maybe simply override the ResourcesPlugin.getEncoding method to return UTF-8 instead of checking the property.
Indeed, there is no strong relationship between the JVM that is used to run Eclipse IDE on the workstation, and the target environment (it may not even be Java), so inferring Resource encoding from underlying JVM settings seems irrelevant.
Comment 109 Eclipse Genie CLA 2015-09-30 04:34:13 EDT
New Gerrit change created: https://git.eclipse.org/r/57036
Comment 110 Eclipse Genie CLA 2015-09-30 04:51:43 EDT
New Gerrit change created: https://git.eclipse.org/r/57040
Comment 111 Denis Roy CLA 2015-09-30 10:52:33 EDT
(In reply to Olivier Croisier from comment #98)
> Hi, I am the complaining guy Mickael Istria refers to in comment #97.

Thank you for taking the time to register for an Eclipse account and for posting your comments. The entire community benefits from more input.
Comment 112 Stijn de Witt CLA 2015-10-01 16:57:47 EDT
"Note that the default workspace encoding is not a preference, it's computed in Platform and stored as a metadata on the workspace (so the issue seems to be in Platform, not in EPP)."

Ha ha yes this is what everyone has been saying for ten years now.

"And you don't mind breaking many other people/projects who have different assumptions, eh? :)"

David, could you please describe such a project? 

This is a sincere question, because I believe it actually to be *impossible* to have a non-breaking setup without setting an encoding on the project level, because of the current default. Let me explain.

1. If you do NOT set encoding at the project level (or file level), Eclipse uses the platform default.
2. Because of how Java works, the platform default is *never* compatible across different machines and operating systems. On Windows machines it will assume CP-1252, whereas on Macs and Linux boxes it will use different (incompatible) encodings.
3. Even on Windows, the platform default will actually vary across locales. There are dozens of different encodings for e.g. Germany, Poland, Japan, France, Spain etc.

So the 'many other people/projects' that would break would have to be groups of people that:

* Are all on the same OS
* Are all within the same locale (or compatible at least)

I'm not sure where these people are that never work with people from different countries, or with people using a different OS etc., but are they really the people that should be protected? Every day developers are losing time because of this default. Developers that want to create *interoperable* software that works in *every* country and on *every* OS. Is their life really being made more difficult for the sake of these legacy-encoding-dependent people that are creating software that is *by definition* NOT interoperable?

I know I come on strong with my arguments... But imho, in 2005 when this bug was created it made *some* sense to argue against changing the defaults. Unicode was still pretty new then. But today, in 2015 it's becoming totally ridiculous if you ask me. UTF-8 is *the* de-facto standard encoding and has been for years now.
Comment 113 David Williams CLA 2015-10-01 18:30:05 EDT
(In reply to Stijn de Witt from comment #112)

> David, could you please describe such a project? 

> So the 'many other people/projects' that would break would have to be groups
> of people that:
> 
> * Are all on the same OS
> * Are all within the same locale (or compatible at least)

This is the use-case I'm aware of. (Japanese developers, developing Japanese web applications, specifically). 

Admittedly, I've not worked with those development groups for a long time, but I'd think if nothing else, they could have assets still in use. 

I know, for them at least, possibly others, it's even more complicated than the complications you mentioned, since there's often special hardware, and special versions of Java made for such cases. (That I do not really keep track of.)

I don't mind you, and others, repeatedly asking for this ... but, many alternatives have been suggested, over the years, and I have yet to hear why none of those alternatives would be feasible. So, it does get tiresome. 

The easy alternatives: make sure your files that allow self-documenting encoding are properly self-documented, and make sure your project encodings are set properly. Beyond that, there were suggestions for someone with a vested interest to contribute "user aids" that would remind users, say during "New Project ...", to specify a better encoding than "workspace default".

Those things seem better to me than risk messing up someone's existing data. 

Which reminds me: that's how this case is different from, say, "voting on line number preferences". Here we are talking about the possibility of damaging someone's existing data, or, I think it was suggested, "make them invest in converting all their existing data".

These types of things (damage, and "forced investments") are not open to "majority rule", IMHO. I feel an obligation to protect the minority, in such cases.
Comment 114 Dani Megert CLA 2015-10-02 02:48:50 EDT
(In reply to David Williams from comment #113)
> Those things seem better to me than risk messing up someone's existing data. 
> 
> Which, reminds me, that's how this case is different than, say "voting on
> line number preferences". Here we are talking about the possibility of
> damaging someone's existing data, or, I think it was suggested "make them
> invest in converting all their existing data". 
>  
> These type of things (damage, and "forced investments") are not open to
> "majority rule", IMHO. I feel an obligation to protect the minority, in such
> cases.

+1. I'm definitely also against such a change. I will take this into our next PMC call to see whether other PMC members have a different opinion on this.
Comment 115 Laurent Barbareau CLA 2015-10-02 08:09:43 EDT
(In reply to comment #114)
> +1. I'm definitely also against such a change. I will take this into our next
> PMC call to see whether other PMC members have a different opinion on this.
Good.

Nobody wants to rot the environment of others. There are certainly different strategies to achieve this without causing anger and disappointment.

If you decide to do nothing, do you realize that you'll have more and more complaints about this issue ?
Comment 116 Mickael Istria CLA 2015-10-02 08:11:32 EDT
(In reply to Laurent Barbareau from comment #115)
> If you decide to do nothing, do you realize that you'll have more and more
> complaints about this issue ?

Or less and less, since users may prefer other IDEs ;)
Comment 117 Dani Megert CLA 2015-10-02 08:17:39 EDT
(In reply to Laurent Barbareau from comment #115)
> Nobody wants to rot the environment of others.

Right! Well, some seem to.


> There are certainly different
> strategies to achieve this without causing anger and disappointment.

I think a partial solution would be to set the encoding to UTF-8 for empty workspaces. It won't solve all issues (see my comment 85) but solve the 80% problem.
Comment 118 Mickael Istria CLA 2015-10-02 08:22:31 EDT
(In reply to Dani Megert from comment #117)
> I think a partial solution would be to set the encoding to UTF-8 for empty
> workspaces. It won't solve all issues (see my comment 85) but solve the 80%
> problem.

I believe that's a (the?) good solution. It's more or less what I was trying to do with the suggested patches, but I didn't manage to do that. Is the alternative of just setting the default value for PREF_ENCODING to UTF-8 a good way to implement that behaviour?
Comment 119 Dani Megert CLA 2015-10-02 08:31:50 EDT
(In reply to Mickael Istria from comment #118)
> (In reply to Dani Megert from comment #117)
> > I think a partial solution would be to set the encoding to UTF-8 for empty
> > workspaces. It won't solve all issues (see my comment 85) but solve the 80%
> > problem.
> 
> I believe that's a (the?) good solution. 

Good path forward. I think we only need to do two things:

1. Add org.eclipse.core.resources.ResourcesPlugin.getDefaultEncoding() that returns UTF-8 if the workspace is empty (detecting whether it's a completely new workspace is hard) and returns the current default otherwise.
2. Call that method in ResourcesPlugin.getEncoding() and WorkbenchEncoding.getWorkbenchDefaultEncoding() and other places where appropriate.
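
The two steps above could look roughly like this. It is a sketch under stated assumptions, not the proposed patch: `EmptyWorkspaceDefault` and its method are hypothetical, "empty" is taken to mean "contains no projects", and the project list and current default are passed in as plain values so the logic runs without the Eclipse runtime.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

// Hypothetical model of the proposed getDefaultEncoding() decision.
final class EmptyWorkspaceDefault {

    /**
     * Returns "UTF-8" for a workspace without projects, otherwise the
     * current (JVM-derived) default, so existing data keeps its encoding.
     */
    static String defaultEncoding(List<String> projectNames, String currentDefault) {
        return projectNames.isEmpty() ? "UTF-8" : currentDefault;
    }

    public static void main(String[] args) {
        System.out.println(defaultEncoding(Collections.<String>emptyList(), "Cp1252"));
        System.out.println(defaultEncoding(Arrays.asList("legacy-app"), "Cp1252"));
    }
}
```

Keeping the old default for non-empty workspaces is what makes this a compromise: existing files are never reinterpreted, while fresh workspaces start on UTF-8.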


BTW: Didn't like your threat in your previous comment ;-).
Comment 120 Laurent Barbareau CLA 2015-10-02 10:03:29 EDT
(In reply to comment #117)
> (In reply to Laurent Barbareau from comment #115)
> > Nobody wants to rot the environment of others.
> 
> Right! Well, some seem to.
No... what we/people want is to get rid of those encoding issues as soon as possible. UTF-8 as the default encoding in Eclipse is just a little step. Eventually, everything in the Eclipse ecosystem has to converge towards UTF-8, but today neither Eclipse nor its plug-ins know how to deal properly with encoding. There is always a component that fails to determine the right (or most appropriate) encoding for a specific situation, even if the same encoding is set at every level (workspace, project properties, files...).

But before initiating any action, you should take the time to prepare people for that transition. You must not just decide to change something without warning users (as is too often the case, in my opinion, in Eclipse or elsewhere too). And when I'm talking about warning people, I mean explicitly describing what's going to change for users, each time it may be useful or necessary. Not everybody knows which encoding is best for their situation.

Globally, I suggest ensuring that the encoding information is displayed or asked for everywhere it may be useful. For instance:
 - by adding that information to the status bar (as we can see in some other software), updated each time you click on an element it applies to.
 - when you create or copy a workspace, a project, a file... you're asked to choose the encoding (accompanied by a guide (in a popup) to choose the most appropriate one for you).
 - when you first start Eclipse, you're asked to choose the encoding (accompanied by a guide (in a popup) to choose the most appropriate one for you).
 - ...
But always pushing UTF-8 as the best choice if you start from scratch.

Simultaneously, regarding what I was saying about the encoding issues encountered with Eclipse or its plug-ins, I hope it would be possible to guide developers to take more care when determining the encoding...
Comment 121 Dani Megert CLA 2015-10-09 12:13:10 EDT
(In reply to Dani Megert from comment #114)
> I will take this into our
> next PMC call to see whether other PMC members have a different opinion on
> this.

We discussed this on Wednesday in our Eclipse top-level PMC call, and the PMC unanimously agreed that we will not change the default. You can find more details in the PMC Meeting Minutes from October 7:
https://wiki.eclipse.org/Eclipse/PMC#Meeting_Minutes


To repeat the reasons for the decision:
-  changing the encoding to 'UTF-8' on Windows causes lots of troubles:
  - encoding on Windows (including Windows 10) is 'Cp1252' in most countries
    around the globe
  - all Windows tools (including compilers) read and write files with that 
    encoding
  - characters will no longer be readable when copying or importing files from 
    disk
  - characters will be destroyed without warning when saving the file


To go forward the following things can improve the workflow:
- make sure that the encoding is set on the project when creating it (bug 479450)
- add a warning when a project does not have the encoding set and provide a
  quick fix to set it (bug 479451)
- revisit the decision regarding the Welcome Questionnaire
- explicitly set the encoding when creating a resource where the encoding 
  differs from the parent (e.g. when using drag & drop or import)

Please use the mentioned bugs to comment on the individual ideas, rather than comment here.
Comment 122 Mickael Istria CLA 2015-10-09 12:20:05 EDT
Thanks for the input, Dani. What you suggest is quite good from a user POV: although it doesn't change the default, it will give the same user experience in most cases.
Comment 123 a e CLA 2015-10-09 12:44:36 EDT
(In reply to Dani Megert from comment #121)
>   - encoding on Windows (including Windows 10) is 'Cp1252' in most countries
>     around the globe

Yeah, so?
That looks like a pro-argument, not a con-argument to me.

>   - all Windows tools (including compilers) read and write files with that 
>     encoding

Again, so??

>   - characters will no longer be readable when copying or importing files
> from 
>     disk

What? How ...?

>   - characters will be destroyed without warning when saving the file

LOL, seriously?


There must be more reasons than this, right?
Comment 124 Doug Schaefer CLA 2015-10-09 22:02:42 EDT
(In reply to Mickael Istria from comment #122)
> Thanks for the input Dani. What you suggest is quite good from user POV,
> despite it's not changing default, it will give the same user experience in
> most cases.

Thanks Mickael and Dani. If you can make it so that all new/empty workspaces are UTF-8, that would be a reasonable compromise.

Remember when you make decisions like this, you are making them on behalf of the community, the entire community, not just your employer and certainly not only yourself, and that you are doing it with respect for the opinion of those who've commented on this bug and elsewhere. Then hopefully the respect would be mutual. There are millions of users out there. Our products are successful because Eclipse in the large is successful. We need to make sure we protect that.
Comment 125 Stefan Xenos CLA 2015-12-02 14:17:33 EST
FYI, there is precedent for using UTF-8 by default on Windows. Visual SourceSafe (a Microsoft tool) uses UTF-8 by default:

https://msdn.microsoft.com/en-US/library/5fdkw2w1(v=vs.80).aspx
Comment 126 Stefan Xenos CLA 2015-12-02 19:53:17 EST
Re: comment 123

While I strongly agree that using UTF-8 by default is the right thing to do, insults and sarcasm won't help convince anyone of this.