Bug 337711

Summary: Unicode Characters are not put out correctly on console
Product: [Eclipse Project] Platform
Reporter: Christian <ChristianOrtolf>
Component: Debug
Assignee: Platform-Debug-Inbox <platform-debug-inbox>
Status: CLOSED DUPLICATE
QA Contact:
Severity: normal
Priority: P3
CC: Michael_Rennie, pawel.1.piech, prakash, pwebster
Version: 4.1
Target Milestone: ---
Hardware: PC
OS: Windows 7
Whiteboard:

Description Christian CLA 2011-02-21 07:20:04 EST
Build Identifier: M20100909-0800

Sometimes Unicode characters printed to the console are replaced with replacement characters.

import java.util.Random;

public class SystemOutTest {

	// Fixed seed so every run prints the same pattern
	static Random rand = new Random(0);

	public static void main(String[] args) {
		// Print 10 blocks of 100 lines with 70 characters each
		for (int k = 0; k < 10; k++) {
			StringBuilder sb = new StringBuilder();
			for (int i = 0; i < 100; i++) {
				for (int j = 0; j < 70; j++) {
					// About 10% black squares (U+25A0), the rest white squares (U+25A1)
					sb.append(rand.nextFloat() < 0.1f ? '\u25A0' : '\u25A1');
				}
				sb.append("\n");
			}
			System.out.println(sb);
		}
	}

}

Reproducible: Sometimes

Steps to Reproduce:
Compile and run the provided code from within Eclipse.
Most of the time, at least one line sticks out of the block.
In such a line, one character was replaced with two replacement boxes, so the line becomes longer and breaks the block.
Comment 1 Michael Rennie CLA 2011-02-22 15:46:06 EST
Have you set the console encoding properly? On Windows the default encoding will usually not display Unicode characters properly.

You have to either set it on the Common tab using the console encoding options, or use a VM argument (for example, -Dfile.encoding=UTF-8).
Comment 2 Christian CLA 2011-02-22 16:41:23 EST
The vast majority of the characters are shown correctly, so I assume the settings there are correct.

(Although I don't know which settings you mean.)
Comment 3 Michael Rennie CLA 2011-02-23 11:24:49 EST
(In reply to comment #2)
> The vast majority of the characters are shown correctly, so I assume the settings
> there are correct.
> 
> (Although I don't know which settings you mean.)

In the launch configuration you use to run the program, there are two tabs where you can set encoding options:

1. The Common tab, which has an 'Encoding' option group
2. The Arguments tab, which has a text area called 'VM arguments' where you can enter the '-Dfile.encoding=UTF-8' VM argument
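
To see which encoding a launched program actually ends up with, something like the following sketch could be run from the same launch configuration. It is only an illustration: the class name EncodingCheck and the helper utf8Stream are made up for this example, and it assumes that wrapping System.out in a UTF-8 PrintStream is acceptable for the program under test.

```java
import java.io.ByteArrayOutputStream;
import java.io.OutputStream;
import java.io.PrintStream;
import java.io.UnsupportedEncodingException;
import java.nio.charset.Charset;

public class EncodingCheck {

    /** Hypothetical helper: wraps a stream in a PrintStream that always encodes as UTF-8. */
    public static PrintStream utf8Stream(OutputStream target) {
        try {
            return new PrintStream(target, true, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            // Every Java platform is required to support UTF-8
            throw new IllegalStateException(e);
        }
    }

    /** Returns the bytes that utf8Stream writes for the given text. */
    public static byte[] encode(String text) {
        ByteArrayOutputStream sink = new ByteArrayOutputStream();
        PrintStream out = utf8Stream(sink);
        out.print(text);
        out.flush();
        return sink.toByteArray();
    }

    public static void main(String[] args) {
        // Values the JVM picked up from the launch configuration or the OS
        System.out.println("file.encoding   = " + System.getProperty("file.encoding"));
        System.out.println("default charset = " + Charset.defaultCharset());

        // Print the two test characters without relying on the default charset
        PrintStream out = utf8Stream(System.out);
        out.println("\u25A0\u25A1");
    }
}
```

If the reported values do not match the encoding selected on the Common tab, the console and the program are likely interpreting the bytes differently.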
Comment 4 Christian CLA 2011-02-23 12:23:07 EST
Common was set to UTF-8 (inherited from the workspace). No settings were changed on the Arguments tab.

Also, I doubt that any setting could make the majority of the characters come out correctly while only sometimes failing on the very same characters.

After all, the example uses only two characters, which were taken from the program where this first happened.
The bug might already appear with a single character, but I haven't tried that.

I was able to reproduce this problem on two different Windows 7 machines.
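
The report does not state a root cause, so the following is only a hypothetical illustration of how the same character can come out correctly most of the time and occasionally turn into several replacement characters. It assumes the output is UTF-8 and that a reader decodes it in fixed-size chunks, each on its own; the class SplitDecodeDemo and both methods are invented for this sketch and are not taken from Eclipse.

```java
import java.nio.ByteBuffer;
import java.nio.CharBuffer;
import java.nio.charset.CharsetDecoder;
import java.nio.charset.CodingErrorAction;
import java.nio.charset.StandardCharsets;

public class SplitDecodeDemo {

    /** Decodes every chunk independently, so a character cut by a chunk boundary is damaged. */
    public static String decodePerChunk(byte[] data, int chunkSize) {
        StringBuilder sb = new StringBuilder();
        for (int off = 0; off < data.length; off += chunkSize) {
            int len = Math.min(chunkSize, data.length - off);
            sb.append(new String(data, off, len, StandardCharsets.UTF_8));
        }
        return sb.toString();
    }

    /** Feeds the same chunks to one decoder and carries incomplete bytes over to the next chunk. */
    public static String decodeStreaming(byte[] data, int chunkSize) {
        if (data.length == 0) {
            return "";
        }
        CharsetDecoder decoder = StandardCharsets.UTF_8.newDecoder()
                .onMalformedInput(CodingErrorAction.REPLACE)
                .onUnmappableCharacter(CodingErrorAction.REPLACE);
        // Room for one chunk plus the leftover bytes of an unfinished character
        ByteBuffer in = ByteBuffer.allocate(chunkSize + 4);
        // UTF-8 never decodes to more chars than it has bytes
        CharBuffer out = CharBuffer.allocate(data.length);
        for (int off = 0; off < data.length; off += chunkSize) {
            int len = Math.min(chunkSize, data.length - off);
            boolean last = off + len >= data.length;
            in.put(data, off, len);
            in.flip();
            decoder.decode(in, out, last);
            in.compact(); // keep undecoded trailing bytes for the next round
        }
        decoder.flush(out);
        out.flip();
        return out.toString();
    }

    public static void main(String[] args) {
        // Four characters, three UTF-8 bytes each
        String line = "\u25A1\u25A1\u25A0\u25A1";
        byte[] bytes = line.getBytes(StandardCharsets.UTF_8);

        // A chunk size of 4 does not line up with the 3-byte characters
        System.out.println("per-chunk intact: " + decodePerChunk(bytes, 4).equals(line));
        System.out.println("streaming intact: " + decodeStreaming(bytes, 4).equals(line));
    }
}
```

Under these assumptions, whether a character is damaged depends only on where a chunk boundary happens to fall, not on the encoding setting, which would fit the observation that the failures are intermittent.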
Comment 5 Pawel Piech CLA 2011-06-08 15:02:21 EDT

*** This bug has been marked as a duplicate of bug 266658 ***