Bug 347090 - A utf8 javascript is sent as content-type charset iso-8859-1
Summary: A utf8 javascript is sent as content-type charset iso-8859-1
Status: CLOSED INVALID
Alias: None
Product: Jetty
Classification: RT
Component: server
Version: unspecified
Hardware: PC Linux
Importance: P3 normal
Target Milestone: 7.2.x
Assignee: Greg Wilkins CLA
Reported: 2011-05-24 17:36 EDT by josvazg CLA
Modified: 2011-08-15 05:06 EDT

Description josvazg CLA 2011-05-24 17:36:47 EDT
Build Identifier: 7.4.0

I have some static JavaScript files encoded in UTF-8, but Jetty serves them with a Content-Type whose charset is iso-8859-1 instead.

As a result, text coming from the JavaScript that uses non-ASCII characters is displayed incorrectly.

 

Reproducible: Always

Steps to Reproduce:
1. Use a UTF-8 encoded static JavaScript file and make it display some interesting text, like "EspaÑa".
2. Load it with a script tag from an HTML page.
3. The text coming from the script reaches the browser with charset iso-8859-1 in the Content-Type header and is displayed incorrectly.
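
The garbling described in these steps can be reproduced outside a browser. The following is a minimal sketch, not taken from the report: it encodes the sample string as UTF-8 and then decodes the same bytes as ISO-8859-1, which is in effect what a client does when the Content-Type names the wrong charset. The class and method names are made up for illustration.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical helper, not part of Jetty or the reporter's application.
class CharsetMismatchDemo {

    // Encode text as UTF-8 (how the .js file is stored), then decode the
    // same bytes as ISO-8859-1 (how the client is told to read them).
    static String readUtf8AsLatin1(String text) {
        byte[] utf8Bytes = text.getBytes(StandardCharsets.UTF_8);
        return new String(utf8Bytes, StandardCharsets.ISO_8859_1);
    }

    public static void main(String[] args) {
        // "EspaÑa", written with an escape so the source file encoding does not matter
        String original = "Espa\u00D1a";
        String garbled = readUtf8AsLatin1(original);
        // The non-ASCII letter is two bytes in UTF-8, so it becomes two characters.
        System.out.println(original.length() + " chars -> " + garbled.length() + " chars");
        System.out.println(garbled);
    }
}
```

Pure ASCII text survives this round trip unchanged, which is why only the accented characters look wrong.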
Comment 1 Greg Wilkins CLA 2011-05-25 22:54:44 EDT
Does your JSP include 
<%@ page contentType="text/html; charset=UTF-8" %>

to set the content type? JSP cannot autodetect content type.


If your JSP does have this, then please attach an example and reopen the issue.
Comment 2 josvazg CLA 2011-05-26 03:22:27 EDT
Yes, the decorator for ALL JSP pages in my app starts like this:
<%@ page language="java" contentType="text/html; charset=UTF-8" pageEncoding="UTF-8"%>
...

Some example:

<html><head>
<title>Franco Bingo Caja-Consultas</title>

<meta http-equiv="Content-Type" content="text/html; charset=UTF-8">
<link href="/modcaja/favicon.ico" rel="shortcut icon" type="image/x-icon">
<link href="/modcaja/styles/styles.css" rel="stylesheet" type="text/css">
<link href="/modcaja/styles/chromestyle.css" rel="stylesheet" type="text/css">

<script type="text/javascript" src="/modcaja/js/prototype.js"></script>
... // Javascript texts here in UTF8 are OK!!
//-->
</script>

<link href="/modcaja/styles/formulario.css" rel="stylesheet" type="text/css">

	<link href="/modcaja/js/calendar-system.css" rel="stylesheet" type="text/css">
	<script src="/modcaja/js/calendar.js" type="text/javascript"></script>

        <!-- Texts loaded below are received with no encoding at all and are interpreted as ISO-8859-1 by the browser -->
	<script src="/modcaja/js/calendar_es.js" type="text/javascript"></script> 

	<script src="/modcaja/js/calendar-setup.js" type="text/javascript"></script>
</head>

When I check the Server Response headers for http://tys14ubu:8080/modcaja/js/calendar_es.js with Chrome I get this:
Date:Thu, 26 May 2011 07:13:38 GMT
Server:Jetty(7.4.0.v20110414)

NO Content-Type and no Encoding header at all, even though the file is UTF-8 and the OS is configured as UTF8.

The same Javascript works fine in JBoss 4, where it is loaded as UTF-8. The funny thing is that I don't understand why, because the server headers are NOT much better than the Jetty ones:
Date:Thu, 26 May 2011 07:19:48 GMT
ETag:W/"3943-1259327252000"
Server:Apache-Coyote/1.1
X-Powered-By:Servlet 2.4; JBoss-4.2.3.GA (build: SVNTag=JBoss_4_2_3_GA date=200807181417)/JBossWeb-2.0
Comment 3 Alfredo Osorio CLA 2011-06-16 17:39:36 EDT
Apparently Jetty has nothing to do with this problem. The one responsible for this is the Struts 2 framework. I have the same problem. This is what happens:

In a Struts 2 application you have to declare the following filters in order to let it handle the requests:
org.apache.struts2.dispatcher.ng.filter.StrutsPrepareFilter
org.apache.struts2.dispatcher.ng.filter.StrutsExecuteFilter

So this is what happens when a static resource is requested:
www.mydomain.com/myApp/scripts/utils.js

1. Even though Struts 2 is not going to handle the request (utils.js is a static resource file), org.apache.struts2.dispatcher.ng.filter.StrutsPrepareFilter calls prepare.setEncodingAndLocale(request, response);
which sets the request encoding and the response locale.
2. Jetty's Response.setLocale obtains the character encoding corresponding to that locale and assigns it to the _characterEncoding attribute.
3. The Struts execute filter doesn't handle the request, so chain.doFilter(request, response); is called.
4. Once all filters in the chain have been called, Jetty's DefaultServlet is called to handle the request.
5. When HttpConnection.Output.sendContent(Object content) is called, its call String enc = _response.getSetCharacterEncoding(); returns
the already assigned value (the one set earlier as a result of the Struts 2 call), which is appended to the Content-Type response header:

_responseFields.put(HttpHeaders.CONTENT_TYPE_BUFFER,
                                        contentType+";charset="+QuotedStringTokenizer.quoteIfNeeded(enc,";= "));
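
The sequence above can be modelled without a servlet container. The sketch below is a simplified stand-in, not Jetty's code: ModelResponse and its locale-to-charset table are hypothetical, and ISO-8859-1 as the charset for the Spanish locale is an assumption. It follows the Servlet API rule that setLocale only implies a character encoding when none has been set explicitly, and shows how that value ends up in the Content-Type of a static file.

```java
import java.util.HashMap;
import java.util.Locale;
import java.util.Map;

// Simplified, hypothetical model of a servlet response; not Jetty's Response class.
class ModelResponse {
    // Assumed locale-to-charset table; real containers ship their own defaults.
    private static final Map<String, String> LOCALE_CHARSETS = new HashMap<>();
    static {
        LOCALE_CHARSETS.put("es", "ISO-8859-1");
    }

    private String characterEncoding; // null until something sets it
    private boolean explicitlySet;    // true once setCharacterEncoding was called

    // Like ServletResponse.setLocale: implies a charset unless one was set explicitly.
    void setLocale(Locale locale) {
        if (!explicitlySet) {
            String implied = LOCALE_CHARSETS.get(locale.getLanguage());
            if (implied != null) {
                characterEncoding = implied;
            }
        }
    }

    // Like ServletResponse.setCharacterEncoding: an explicit value always wins.
    void setCharacterEncoding(String charset) {
        characterEncoding = charset;
        explicitlySet = true;
    }

    // What a static-file servlet would send for the given MIME type.
    String contentTypeHeader(String mimeType) {
        return characterEncoding == null
                ? mimeType
                : mimeType + ";charset=" + characterEncoding;
    }

    public static void main(String[] args) {
        // No filter touched the response: no charset is appended.
        ModelResponse untouched = new ModelResponse();
        System.out.println(untouched.contentTypeHeader("application/javascript"));

        // A filter set the locale first: the implied charset leaks into the header.
        ModelResponse afterLocale = new ModelResponse();
        afterLocale.setLocale(Locale.forLanguageTag("es"));
        System.out.println(afterLocale.contentTypeHeader("application/javascript"));

        // A later explicit call replaces the implied charset.
        afterLocale.setCharacterEncoding("UTF-8");
        System.out.println(afterLocale.contentTypeHeader("application/javascript"));
    }
}
```

In this model an explicit setCharacterEncoding call made after setLocale replaces the implied charset, which is the kind of override a servlet filter placed later in the chain could apply.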
Comment 4 Alfredo Osorio CLA 2011-06-16 18:05:28 EDT
I found that this only happens in Struts 2 when you set a default locale using struts.locale in the struts.properties file, because that is when response.setLocale(locale) is called.
Comment 5 josvazg CLA 2011-06-17 05:18:23 EDT
(In reply to comment #4)
> I found that this only happens in Struts 2 when you set a default encoding
> using struts.locale in the struts.properties file because it's when
> response.setLocale(locale) is called.

I couldn't confirm this. But I could see the problem was related to Struts2, as the same .js that was being interpreted as ISO-8859-1 on Struts2 + Jetty was interpreted correctly as UTF-8 in some other apps on ApacheClick+Jetty.

Following your explanations on character and locale setting problems under Struts2 I created a Filter that corrects the problem:

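import java.io.IOException;
import java.util.ArrayList;
import java.util.List;

import javax.servlet.Filter;
import javax.servlet.FilterChain;
import javax.servlet.ServletException;
import javax.servlet.ServletRequest;
import javax.servlet.ServletResponse;
import javax.servlet.http.HttpServletRequest;

// Log and LogFactory are assumed to come from Apache Commons Logging
import org.apache.commons.logging.Log;
import org.apache.commons.logging.LogFactory;

public class CharEncodingFilter implements Filter {

	private static Log l=LogFactory.getLog(CharEncodingFilter.class);

	// Encoding to force on matching static resources
	private String charEncoding="UTF-8";

	// Extensions of the static resources to correct
	private List<String> extensions;

	public CharEncodingFilter() {
		super();
		extensions=new ArrayList<String>();
		extensions.add(".js");
		extensions.add(".css");
	}

	@Override
	public void doFilter(ServletRequest req, ServletResponse res,
			FilterChain chain) throws IOException, ServletException {
		HttpServletRequest rq=(HttpServletRequest)req;
		String uri=rq.getRequestURI().toLowerCase();
		for(String ext:extensions) {
			if(uri.endsWith(ext)) {
				// Override whatever encoding an earlier filter left on the response
				if(!charEncoding.equals(res.getCharacterEncoding())) {
					String oldCharEncoding=res.getCharacterEncoding();
					res.setCharacterEncoding(charEncoding);
					l.debug("Response for "+uri+": charEncoding " +
							oldCharEncoding+" => "+res.getCharacterEncoding());
				}
				break; // no need to check the remaining extensions
			}
		}
		chain.doFilter(req, res);
	}
...
}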

The filter is set up in the web.xml AFTER the struts2 (and sitemesh) filters, like this:
<filter>
	<filter-name>charEncodingFilter</filter-name>
	<filter-class>com.rfranco.filter.CharEncodingFilter</filter-class>
</filter>
<filter-mapping>
	<filter-name>charEncodingFilter</filter-name>
	<url-pattern>/*</url-pattern>
</filter-mapping>

This filter looks for request URIs ending with certain extensions, like .js or .css. If the response character encoding for such a request differs from the correct one (in my case UTF-8), the filter changes it to the correct one.

This works correctly for my apps. Don't know if there is a better or simpler solution.

But one question remains, why is JBoss NOT affected by this Struts2 bug?
Comment 6 Greg Wilkins CLA 2011-06-22 03:15:55 EDT
josvazg,

you have <meta http-equiv="Content-Type" content="text/html; charset=UTF-8">, which is entirely different to <%@ page contentType="text/html; charset=UTF-8" %>

The former tells the client side and the latter tells the server side.
Comment 7 Jan Bartel CLA 2011-08-15 03:48:40 EDT
I'll close this issue again, as it seems like this is more to do with struts2 than jetty.

FYI, there have been a couple of changes in the handling of encodings in the upcoming 7.5.0 and jetty-8.0.0 releases:

http://jira.codehaus.org/browse/JETTY-1153
https://bugs.eclipse.org/bugs/show_bug.cgi?id=354204


The last one may affect the encoding used when no encoding is explicitly set - in the case of json it will (now) be UTF-8.

Jan
Comment 8 josvazg CLA 2011-08-15 05:06:33 EDT
I could not use those headers as the problem DOES NOT happen on HTML content but only on Javascript .js files.

(In reply to comment #6)
> josvazg,
> 
> you have <meta http-equiv="Content-Type" content="text/html; charset=UTF-8">,
> which is entirely different to <%@ page contentType="text/html; charset=UTF-8"
> %>
> 
> The former tells the client side and the later tells the server side.