Some Eclipse Foundation services are deprecated, or will be soon. Please ensure you've read this important communication.

Bug 352850

Summary: [TCP/IP Monitor] Improper handling of national characters
Product: [WebTools] WTP ServerTools Reporter: Krzysztof Daniel <krzysztof.daniel>
Component: wst.serverAssignee: Steven Hung <sghung>
Status: RESOLVED FIXED QA Contact: Elson Yuen <eyuen7>
Severity: normal    
Priority: P3 CC: ccc, eyuen7, sghung
Version: unspecifiedFlags: eyuen7: review+
Target Milestone: 3.2.5 P   
Hardware: PC   
OS: Windows XP   
See Also: https://git.eclipse.org/r/109098
Whiteboard:
Bug Depends on:    
Bug Blocks: 373442    
Attachments:
Description Flags
TestCase
none
Fix proposition
none
Patch as fragment
none
Deployable fragment for testing
none
Patch v2
none
Patch v2 as fragment (source)
none
Patch v2 as fragment (binary for testing)
none
Patch v1.3 (modifications to Christopher's patch v1.2)
none
New approach (Patch v1.4)
eyuen7: iplog+
New apprach (1.5) with a viewer
none
Plugins for testing with Eclipse 3.4
none
New approach (v1.6)
eyuen7: iplog+
Plugins for testing with Eclipse 3.4
none
Patch v1.7
eyuen7: iplog+
Patch v1.8 none

Description Krzysztof Daniel CLA 2011-07-22 07:52:41 EDT
Created attachment 200171 [details]
TestCase

Reproduction steps:
1. Import the test case, fix dependencies
2. Set up monitor on local port 9083
3. Right click on /jwsAddressBookClient/WebContent/WEB-INF/wsdl/AddressBookService.wsdl -> Web Services -> Test with Web Services Explorer
4. In the explorer, click FindAddress, type a name with national characters (f.e. japanese). Click Go.
5. Check the request in the tcp monitor. The name is unreadable.
Comment 1 Krzysztof Daniel CLA 2011-07-22 08:43:08 EDT
Created attachment 200177 [details]
Fix proposition

While I understand that byte view is a byte view, it is justified to want to read text with national characters. I have added a new view which displays the text using proper encoding. It is called a String view.
Comment 2 Krzysztof Daniel CLA 2011-07-27 03:10:59 EDT
Created attachment 200420 [details]
Patch as fragment

Functionality wrapped in a plugin fragment (source)
Comment 3 Krzysztof Daniel CLA 2011-07-27 03:14:28 EDT
Created attachment 200422 [details]
Deployable fragment for testing

Put this file in dropins folder to test it. You should have an additional view in TCP monitor called "String" which shows properly encoded content.
Comment 4 Krzysztof Daniel CLA 2011-08-01 03:21:18 EDT
Created attachment 200642 [details]
Patch v2
Comment 5 Krzysztof Daniel CLA 2011-08-01 03:27:37 EDT
Created attachment 200643 [details]
Patch v2 as fragment (source)
Comment 6 Krzysztof Daniel CLA 2011-08-01 03:28:02 EDT
Created attachment 200644 [details]
Patch v2 as fragment (binary for testing)
Comment 7 Steven Hung CLA 2011-08-03 14:38:15 EDT
Christopher, your patch looks good, I did some testing with national characters. I have two comments:

1. Is there a reason for the changes in patch v2 to MonitorUIPlugin? The method encodingAwareParse is not referenced anywhere, nor is it hit when a debug breakpoint is added to it. I believe this method could be removed, unless this is a convenience method. If it is a convenience method, then there should be a javadoc to indicate its purpose to avoid confusion.

2. In org.eclipse.wst.internet.monitor.ui.internal.viewers.StringViewer.parseWithEncoding(byte[]), the e.printStackTrace() should be replaced with the logger.
Comment 8 Steven Hung CLA 2011-08-03 14:46:26 EDT
Created attachment 200846 [details]
Patch v1.3 (modifications to Christopher's patch v1.2)

A revised patch to the patch Christopher had provided based on my comments from comment #7. 

Tests done:
1. Used the TCP/IP monitor to test national characters. Verified data was garbled in the default "Byte view" and then switched to  "String view" to ensure message was displayed correctly
2. Switched between the two views (Byte and String) 

One usability improvement could be to add a preference as part of the TCP/IP monitor to allow the user to choose which view the request and response should be. Right now, the default view (which is Byte view) will always be displayed for all new TCP/IP monitor entries. If the user wants the ungarbled String view, they need to switch from Byte view to String view each time, which could get tedious. Another bugzilla could be opened for this usability improvement.
Comment 9 Elson Yuen CLA 2011-08-04 15:03:04 EDT
I reviewed the code and the base logic looks good.  A couple of potential changes on the fix:
1. StringViewer and ByteViewer are pretty much the same.  The only difference between the two is the detection logic during the parse and creation of the the reader with the proper encoding.  It will be easier to maintain the code if we combine both of them:
  a) Having the ByteViewer inherit the StringViewer and add a boolean switch on StringViewer to turn on the encoding flag or not.
  b) Move the parse method parseWithEncoding() to MonitorUIPlugin and change the parse() method to parse(boolean) where the boolean is specifying whether encoding is needed or not.

2. if(Trace.SEVERE) check is on later version of the product only but not available on 305 P.  A separate patch needed to be created with the new style tracing.
Comment 10 Krzysztof Daniel CLA 2011-10-04 04:42:23 EDT
Created attachment 204495 [details]
New approach (Patch v1.4)

Unfortunately earlier fixes do not work for HTTPRequests. This is due to the fact that the code path is different for successful http connections (HTTPRequest) and unsuccessful ones (Request). The problem is that HTTPRequest splits the request into header and content, and passes only content to the viewer, without any information about encoding. 

It is necessary to pass also header to the viewer. I've attached changes that needs to be done in the monitor plugins, the String viewer is not ready yet.
Comment 11 Krzysztof Daniel CLA 2011-10-20 07:31:43 EDT
Created attachment 205606 [details]
New apprach (1.5) with a viewer

I am a platformUI commiter. There is no need to check iplog every time.

I have added String viewer to the pathc.
Comment 12 Krzysztof Daniel CLA 2011-10-20 07:35:15 EDT
Created attachment 205607 [details]
Plugins for testing with Eclipse 3.4
Comment 13 Krzysztof Daniel CLA 2011-10-25 06:58:53 EDT
Created attachment 205902 [details]
New approach (v1.6)

Previous patch did not work for responses, only for requests
Comment 14 Krzysztof Daniel CLA 2011-10-25 06:59:32 EDT
Created attachment 205903 [details]
Plugins for testing with Eclipse 3.4

New plugins for testing
Comment 15 Steven Hung CLA 2012-03-06 15:33:55 EST
Created attachment 212163 [details]
Patch v1.7

The method of reading the encoded response and request is based on the previous patches. The view is different when compared to the previous patch, as the ByteViewer was extended, instead of adding a new view.

In addition, the previous patches relied on detecting encoding. However, unless the response or request has the line "Content-type" at the beginning of a line followed by "charset=", it will not be able to detect this encoding. This automatic detection is therefore quite limited, since it cannot deal with encodings such as a simple JSP page with national characters.

A new design was used, to modify the ByteViewer to have an encoding combo box. This combo box has preset values with basic encodings (encodings from:
http://docs.oracle.com/javase/1.4.2/docs/guide/intl/encoding.doc.html)

The user can also enter a user defined value.

Whenever an invalid encoding is entered into the combo box (or the default option of <None>) is selected, the default encoding will be used. This encoding is the same result if the user had used the old ByteViewer (i.e. there is no behavioral change).

If the user changes the encoding, the request or response content will be re-encoded. This is similar to when the user switches between different viewers, the content is updated without having the user re-run the scenario.

Tests ran:
- Ensure Request and Response for ByteViewer work: Used a webservice to test Request and Response with national characters. Changed the encodings and ensure the national characters show up correctly
- Test encoding: Test adding the new encodings attribute into the extension, to ensure that the way encodings are extracted from this attribute are valid and show up properly in the combo box. Test with encodings as:
1. "UTF-8"
2. ""
3. " UTF-8 ,UTF-16, ASCII" - ensure that it is trimmed correctly
4. "UTF-8,UTF-16,ASCII,Cp1250,Cp1251,Cp1252,Cp1253,Cp1254,Cp1257,ISO8859_1,ISO8859_2,ISO8859_4,ISO8859_5,ISO8859_7,ISO8859_9,ISO8859_13,ISO8859_15,KOI8_R,UnicodeBigUnmarked,UnicodeLittleUnmarked,UnicodeBig,UnicodeLittle"
- Ensure other viewers can be selected and there are no errors
- Test different types of encodings for Responses (using JSPs with different encodings). In all cases, use other encodings to see garbled message and then switch to correct encoding to test that content was encoded:
1. UTF-8
2. UTF-16
3. US-ASCII
Comment 16 Elson Yuen CLA 2012-03-06 15:40:01 EST
Comment on attachment 205902 [details]
New approach (v1.6)

Marking patch v1.6 as obsolete since Patch v1.7 replaces v1.6.
Comment 17 Elson Yuen CLA 2012-03-06 15:42:16 EST
Patch v1.7 looks good.
Comment 18 Steven Hung CLA 2012-03-06 15:54:03 EST
Created attachment 212165 [details]
Patch v1.8

Updated minor version of the plug-in for patch 1.7
Comment 19 Elson Yuen CLA 2012-03-06 15:55:12 EST
Steven, thanks for the new patch with the plugin version updated.
Comment 20 Elson Yuen CLA 2012-03-06 17:39:38 EST
Changes released to HEAD
Comment 21 Elson Yuen CLA 2012-04-03 10:12:30 EDT
Changes released to 325P on 20120308
Comment 22 Eclipse Genie CLA 2017-10-11 16:38:02 EDT
New Gerrit change created: https://git.eclipse.org/r/109098