encoding - What is the point of Tomcat's setting URIEncoding?
In Apache Tomcat, the parameter URIEncoding tells Tomcat how to interpret incoming URIs:
URIEncoding
This specifies the character encoding used to decode the URI bytes, after %xx decoding the URL. If not specified, ISO-8859-1 will be used.
Apache Tomcat 7 - HTTP Connector
However, as explained for example in "What is the proper way to URL encode Unicode characters?", non-ASCII characters in URIs are encoded in UTF-8, following current standards (RFC 3986, RFC 3987).
So:
- Why is there a setting for something that is mandated by a standard?
- Why is the default different from what the standard mandates (ISO-8859-1 instead of UTF-8)?
Is this because the Tomcat setting predates the standard and is retained for backwards compatibility? Or is there a situation where a value different from UTF-8 makes sense?
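To make the effect of the setting concrete, here is a minimal sketch of the two-step decoding the documentation describes (%xx decoding, then interpreting the resulting bytes in a charset). It uses the JDK's java.net.URLDecoder rather than Tomcat's own decoder, so it only approximates what the connector does; the class name UriDecodingDemo and the helper decode are made up for illustration.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLDecoder;

public class UriDecodingDemo {
    // Turns %xx escapes into bytes, then interprets those bytes in the given charset.
    // Note: URLDecoder also maps '+' to a space (form encoding); the sample has no '+'.
    static String decode(String escaped, String charset) {
        try {
            return URLDecoder.decode(escaped, charset);
        } catch (UnsupportedEncodingException e) {
            throw new IllegalArgumentException("Unknown charset: " + charset, e);
        }
    }

    public static void main(String[] args) {
        // "café" with the e-acute percent-encoded as its two UTF-8 bytes (0xC3 0xA9)
        String escaped = "caf%C3%A9";

        // Same bytes, two interpretations
        String asUtf8 = decode(escaped, "UTF-8");        // one character: U+00E9
        String asLatin1 = decode(escaped, "ISO-8859-1"); // two characters: U+00C3 U+00A9

        System.out.println(asUtf8.length());   // 4
        System.out.println(asLatin1.length()); // 5
    }
}
```

With the ISO-8859-1 interpretation, each UTF-8 byte of the accented letter becomes a separate character, which is the familiar "cafÃ©" mojibake.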
The description of the parameter URIEncoding
in Tomcat 8 (Apache Tomcat 8 - HTTP Connector) reads:
This specifies the character encoding used to decode the URI bytes, after %xx decoding the URL. If not specified, UTF-8 will be used unless the org.apache.catalina.STRICT_SERVLET_COMPLIANCE system property is set to true, in which case ISO-8859-1 will be used.
Thus the description has changed from that of Apache Tomcat 7. The default value of org.apache.catalina.STRICT_SERVLET_COMPLIANCE is false in Apache Tomcat 8, so UTF-8 is the default value of URIEncoding in Apache Tomcat 8, which means that Tomcat now follows the standard (and common usage).
As for why Tomcat used ISO 8859-1 as the default URI encoding until Tomcat 7:
That seems to be because the Tomcat developers believed that the Servlet specification requires it (as the name of the setting STRICT_SERVLET_COMPLIANCE indicates).
As a matter of fact, the Servlet spec does not explicitly mention URI encoding in any version. It does, however, mention that POST data must be parsed as ISO 8859-1 if the Content-Type HTTP header does not specify an encoding via charset (Servlet specification v2.5, "Request data encoding"). Apparently this was interpreted to mean that query parameters (and the whole URI) should be decoded as ISO 8859-1 by default.
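A practical consequence of the ISO 8859-1 default is that UTF-8 encoded parameters arrive garbled but not destroyed, because ISO 8859-1 maps every byte to exactly one character. The sketch below shows a commonly used repair under that assumption; it is not something the Tomcat documentation prescribes, and the names RedecodeDemo and latin1ToUtf8 are hypothetical.

```java
import java.nio.charset.StandardCharsets;

public class RedecodeDemo {
    // Undoes an ISO-8859-1 decode by recovering the original bytes,
    // then reinterprets those same bytes as UTF-8.
    static String latin1ToUtf8(String misdecoded) {
        byte[] raw = misdecoded.getBytes(StandardCharsets.ISO_8859_1);
        return new String(raw, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        // What a servlet would see for "caf%C3%A9" when the URI is decoded as ISO-8859-1
        String seenByServlet = "caf\u00C3\u00A9";

        String repaired = latin1ToUtf8(seenByServlet);
        System.out.println(repaired.equals("caf\u00E9")); // true
    }
}
```

This only works if the container really decoded as ISO 8859-1; applied to a string that was already decoded as UTF-8, it would corrupt any character outside ISO 8859-1.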
The root problem is arguably that the Servlet specification does not specify a default encoding to use for decoding URIs, let alone a way to change the encoding. This in turn is because the URI spec did not allow non-ASCII characters in URIs - this was only standardized by introducing IRIs, see RFC 3987 of January 2005. Therefore every servlet container had to come up with its own default value and configuration parameter, such as URIEncoding in Apache Tomcat.
These two problems have been reported as bugs against the Servlet specification:
- SERVLET_SPEC-145 - Specify default URL encoding
- SERVLET_SPEC-146 - Add ability to specify URL encoding
Maybe the Servlet specification will be amended one day...