encoding - What is the point of Tomcat's setting URIEncoding?


In Apache Tomcat, the parameter URIEncoding tells Tomcat how to interpret incoming URIs:

URIEncoding

This specifies the character encoding used to decode the URI bytes, after %xx decoding the URL. If not specified, ISO-8859-1 will be used.

Apache Tomcat 7 - HTTP Connector
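To make the quoted description concrete, here is a minimal sketch of the two steps it names: percent-decoding to bytes, then interpreting those bytes in a chosen charset. It uses only the JDK (`URLDecoder.decode(String, Charset)`, available since Java 10) and is not Tomcat's own code; the class name `UriDecodingDemo` and the sample input are hypothetical, and the input is assumed to come from a client that percent-encodes as UTF-8.

```java
import java.net.URLDecoder;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical demo class; it only illustrates the effect of the charset choice.
final class UriDecodingDemo {

    // Percent-decodes the input, then interprets the resulting bytes in the given charset.
    // URLDecoder also maps '+' to a space (form decoding), so the sample avoids '+'.
    static String decode(String percentEncoded, Charset charset) {
        return URLDecoder.decode(percentEncoded, charset);
    }

    public static void main(String[] args) {
        // "caf" followed by e-acute, percent-encoded as UTF-8 (bytes 0xC3 0xA9).
        String encoded = "caf%C3%A9";

        // Matching charset: the two bytes become one character.
        System.out.println(decode(encoded, StandardCharsets.UTF_8));

        // Mismatched charset: each byte becomes its own character.
        System.out.println(decode(encoded, StandardCharsets.ISO_8859_1));
    }
}
```

With UTF-8 the result is "café"; with ISO-8859-1 the same bytes come out as "cafÃ©", which is the kind of mismatch the URIEncoding default can cause.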

However, as explained for example in "What is the proper way to URL encode Unicode characters?", non-ASCII characters in URIs should be encoded in UTF-8, following current standards (RFC 3986, RFC 3987).

So:

  • Why is there a setting for something that is mandated by a standard?
  • Why is the default different from what the standard mandates (ISO-8859-1 instead of UTF-8)?

Is this because the Tomcat setting predates the standard and was retained for backwards compatibility? Or is there any situation where a value other than UTF-8 makes sense?

The description of the parameter URIEncoding in Tomcat 8 (Apache Tomcat 8 - HTTP Connector):

This specifies the character encoding used to decode the URI bytes, after %xx decoding the URL. If not specified, UTF-8 will be used unless the org.apache.catalina.STRICT_SERVLET_COMPLIANCE system property is set to true in which case ISO-8859-1 will be used.

Thus the description has changed from that of Apache Tomcat 7. The default value of org.apache.catalina.STRICT_SERVLET_COMPLIANCE is false in Apache Tomcat 8, so UTF-8 is the default value of URIEncoding in Apache Tomcat 8. This means Tomcat now follows the standard (and common usage).
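The selection rule described in the Tomcat 8 documentation can be sketched as follows. This is an illustration of the documented behavior, not Tomcat's actual implementation; the class `DefaultUriCharset` and the method `resolve` are hypothetical names, and only the system property name is taken from the documentation.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical helper that mirrors the documented defaulting rule for URIEncoding.
final class DefaultUriCharset {

    // System property named in the Tomcat 8 connector documentation.
    static final String STRICT_PROP = "org.apache.catalina.STRICT_SERVLET_COMPLIANCE";

    // configured: the value of the URIEncoding attribute, or null if it is not set.
    // strictCompliance: whether the strict-compliance property is set to true.
    static Charset resolve(String configured, boolean strictCompliance) {
        if (configured != null && !configured.isEmpty()) {
            // An explicit setting always wins.
            return Charset.forName(configured);
        }
        // No explicit setting: UTF-8 unless strict servlet compliance is requested.
        return strictCompliance ? StandardCharsets.ISO_8859_1 : StandardCharsets.UTF_8;
    }

    public static void main(String[] args) {
        // Boolean.getBoolean returns true only if the property exists and equals "true".
        boolean strict = Boolean.getBoolean(STRICT_PROP);
        System.out.println(resolve(null, strict));
    }
}
```

Running it with `-Dorg.apache.catalina.STRICT_SERVLET_COMPLIANCE=true` would select ISO-8859-1 under this sketch; without the property it selects UTF-8.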


As to why Tomcat used ISO 8859-1 as the default URI encoding until Tomcat 7:

That seems to be because the Tomcat developers believed that the servlet specification requires it (as the name of the setting STRICT_SERVLET_COMPLIANCE indicates).

As a matter of fact, the servlet spec does not explicitly mention URI encoding in any version. It does, however, mention that POST data must be parsed as ISO 8859-1 if the Content-Type HTTP header does not specify an encoding via charset (Servlet Specification v2.5, "Request data encoding"). This was apparently interpreted to mean that query parameters (and the whole URI) should also be decoded as ISO 8859-1 by default.
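The POST-data rule mentioned above can be illustrated with a small sketch: take the charset from the Content-Type header if one is given, otherwise fall back to ISO 8859-1. This is not code from any servlet container; the class `RequestBodyCharset` and the method `fromContentType` are hypothetical, and the header parsing is deliberately simplified.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.util.Locale;

// Hypothetical helper illustrating the "charset or ISO 8859-1" fallback for request bodies.
final class RequestBodyCharset {

    // Returns the charset named in the Content-Type header, or ISO 8859-1 if none is usable.
    static Charset fromContentType(String contentType) {
        if (contentType != null) {
            // Parameters follow the media type, separated by ';'.
            for (String part : contentType.split(";")) {
                String p = part.trim();
                if (p.toLowerCase(Locale.ROOT).startsWith("charset=")) {
                    String name = p.substring("charset=".length()).trim();
                    // Strip optional surrounding double quotes.
                    if (name.length() >= 2 && name.startsWith("\"") && name.endsWith("\"")) {
                        name = name.substring(1, name.length() - 1);
                    }
                    try {
                        return Charset.forName(name);
                    } catch (IllegalArgumentException e) {
                        // Illegal or unsupported charset name: fall through to the default.
                    }
                }
            }
        }
        return StandardCharsets.ISO_8859_1;
    }

    public static void main(String[] args) {
        System.out.println(fromContentType("application/x-www-form-urlencoded"));
        System.out.println(fromContentType("application/x-www-form-urlencoded; charset=UTF-8"));
    }
}
```

Note that this header only accompanies a request body. Nothing comparable travels with the URI itself, which is why a container needs some other rule for it.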

The root problem is arguably that the servlet specification does not specify which default encoding to use for decoding URIs, let alone a way to change this encoding. This is in turn because the URI spec originally did not allow non-ASCII characters in URIs; this was only standardized with the introduction of IRIs, see RFC 3987 from January 2005. Therefore every servlet container had to come up with its own default value and configuration parameter, such as URIEncoding in Apache Tomcat.
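As a rough illustration of how a non-ASCII identifier maps onto an ASCII-only URI, the JDK's `java.net.URI.toASCIIString()` percent-encodes non-ASCII characters using UTF-8. `java.net.URI` is not a full RFC 3987 implementation, so this is only a sketch; the class name `IriToUriDemo` and the `example.com` address are placeholders.

```java
import java.net.URI;

// Hypothetical demo: non-ASCII characters are percent-encoded as UTF-8 bytes.
final class IriToUriDemo {

    // Returns an ASCII-only form of the given identifier.
    static String toAscii(String iri) {
        return URI.create(iri).toASCIIString();
    }

    public static void main(String[] args) {
        // \u00e9 is e-acute; its UTF-8 bytes are 0xC3 0xA9.
        System.out.println(toAscii("http://example.com/caf\u00e9"));
    }
}
```

The e-acute is emitted as `%C3%A9`, so a server that decodes the path as UTF-8 recovers the original character, while one that assumes ISO 8859-1 does not.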

These two problems have been reported as bugs against the servlet specification.

Maybe the servlet specification will be amended one day...

