Thursday, 29 August 2013

Legacy Java web apps and Unicode support

I had a problem with a web app that was not handling Unicode strings (Japanese, Korean) correctly.
The cause was that the Tomcat (7.0.4x) container I was deploying the app into was (incorrectly) using the ISO-8859-1 encoding for request parameters.
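
To make the symptom concrete, here is a minimal sketch in plain Java (no servlet container involved) of what happens when UTF-8 bytes are decoded as ISO-8859-1. The class name EncodingDemo and the sample value (the two characters 日本, percent-encoded as a browser would send them) are my own illustration, not part of the app.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLDecoder;

public class EncodingDemo {

    // Decodes a percent-encoded parameter value with the given charset name.
    public static String decode(String raw, String charset) {
        try {
            return URLDecoder.decode(raw, charset);
        } catch (UnsupportedEncodingException e) {
            throw new IllegalArgumentException("Unknown charset: " + charset, e);
        }
    }

    public static void main(String[] args) {
        // Two Japanese characters (U+65E5 U+672C) as percent-encoded UTF-8 bytes.
        String raw = "%E6%97%A5%E6%9C%AC";

        String right = decode(raw, "UTF-8");       // bytes read as UTF-8
        String wrong = decode(raw, "ISO-8859-1");  // each byte becomes one char

        System.out.println(right.equals("\u65E5\u672C")); // prints "true"
        System.out.println(right.length());               // prints "2"
        System.out.println(wrong.length());               // prints "6"
    }
}
```

With ISO-8859-1 every byte turns into its own character, so a two-character string comes out as six characters of garbage, which is roughly what the app was showing.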

According to the Tomcat documentation, this can be overridden in conf/server.xml for each Connector with the URIEncoding="UTF-8" attribute.

I tried it, but it had no effect ;( I also found comments online from other people for whom it didn't work either.

So I went for setting the encoding manually with:

  request.setCharacterEncoding("UTF-8");

The problem, though, is that this has to be called BEFORE the first request.getParameter() call; otherwise it has no effect.
Unfortunately, I had to build my app on top of a crappy webapp the customer had, and I couldn't modify its core code. And, of course, this 'core' was calling request.getParameter() very early in the servlet's lifecycle.

So the solution I came up with was to create a filter that sets the encoding and runs early in the filter chain (before my app and the customer's core code):

web.xml:

...
    <filter>
        <filter-name>fixEncodingFilter</filter-name>
        <filter-class>pack.age.FixEncodingFilter</filter-class>
    </filter>
    <filter-mapping>
        <filter-name>fixEncodingFilter</filter-name>
        <url-pattern>/io/*</url-pattern>
    </filter-mapping>
...
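
The web.xml fragment above refers to pack.age.FixEncodingFilter. The class itself is not shown here, so below is a minimal sketch of what such a filter could look like, assuming the javax.servlet API that ships with Tomcat 7 (Servlet 3.0). The body is my reconstruction of the idea, not the exact code from the project.

```java
package pack.age;

import java.io.IOException;

import javax.servlet.Filter;
import javax.servlet.FilterChain;
import javax.servlet.FilterConfig;
import javax.servlet.ServletException;
import javax.servlet.ServletRequest;
import javax.servlet.ServletResponse;

public class FixEncodingFilter implements Filter {

    @Override
    public void init(FilterConfig filterConfig) throws ServletException {
        // Nothing to configure in this sketch.
    }

    @Override
    public void doFilter(ServletRequest request, ServletResponse response, FilterChain chain)
            throws IOException, ServletException {
        // Set the encoding before anything further down the chain
        // gets a chance to call request.getParameter().
        request.setCharacterEncoding("UTF-8");
        chain.doFilter(request, response);
    }

    @Override
    public void destroy() {
        // No resources to release.
    }
}
```

Because the filter runs before the servlet, the encoding is already in place when the 'core' code reads its first parameter. As far as I know, setCharacterEncoding() only affects how the request body (e.g. POST form data) is decoded, so query-string parameters may still depend on the Connector settings.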


This fixed the problem, nice and clean.