Visual C++ - std::regex fatal error
I'd think this isn't a bug in the standard library, but I'm running out of places to look.
The statement std::regex(expression), where expression is a std::string, causes a fatal memory access error.
expression is declared by this statement:
    std::string expression = std::string("^(") +
        std::string("[\x09\x0a\x0d\x20-\x7e]|") +             // ASCII
        std::string("[\xc2-\xdf][\x80-\xbf]|") +              // non-overlong 2-byte
        std::string("\xe0[\xa0-\xbf][\x80-\xbf]|") +          // excluding overlong
        std::string("[\xe1-\xec\xee\xef][\x80-\xbf]{2}|") +   // straight 3-byte
        std::string("\xed[\x80-\x9f][\x80-\xbf]|") +          // excluding surrogates
        std::string("\xf0[\x90-\xbf][\x80-\xbf]{2}|") +       // planes 1-3
        std::string("[\xf1-\xf3][\x80-\xbf]{3}|") +           // planes 4-15
        std::string("\xf4[\x80-\x8f][\x80-\xbf]{2}") +        // plane 16
        ")*$";

This regex is taken from http://www.w3.org/international/questions/qa-forms-utf-8 and tests whether a byte sequence is UTF-8.
Is this a bug in the library, or am I missing something tiny?
Compiled with VS2015 C++, if that happens to make a difference.
Edit: I forgot to mention that there is one specific line that breaks the code:

    std::string("[\xe1-\xec\xee\xef][\x80-\xbf]{2}|") +   // straight 3-byte

That is the line that breaks. If I comment it out, everything works fine. That line on its own creates the memory access error.
So, if you use escapes in string literals without using the raw string syntax,
you have to escape the escapes.
For example, the new string:
    std::string expression = std::string("^(") +
        std::string("[\\x09\\x0a\\x0d\\x20-\\x7e]|") +            // ASCII
        std::string("[\\xc2-\\xdf][\\x80-\\xbf]|") +              // non-overlong 2-byte
        std::string("\\xe0[\\xa0-\\xbf][\\x80-\\xbf]|") +         // excluding overlong
        std::string("[\\xe1-\\xec\\xee\\xef][\\x80-\\xbf]{2}|") + // straight 3-byte
        std::string("\\xed[\\x80-\\x9f][\\x80-\\xbf]|") +         // excluding surrogates
        std::string("\\xf0[\\x90-\\xbf][\\x80-\\xbf]{2}|") +      // planes 1-3
        std::string("[\\xf1-\\xf3][\\x80-\\xbf]{3}|") +           // planes 4-15
        std::string("\\xf4[\\x80-\\x8f][\\x80-\\xbf]{2}") +       // plane 16
        ")*$";

When you don't escape them, the compiler tries to interpret each one as a
special character. In this case it is interpreting the hex escapes as binary characters,
so the regex engine receives raw bytes instead of the text \xNN.
And, while the regex engine may still get the right character,
it is better to pass the hex escape through, so the engine can see which character
might break it (if one does).
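To make the difference concrete, here is a minimal sketch, assuming a compiler with C++11 raw string literals. The helper names (cooked_size, escaped_size, raw_size, is_printable_ascii) are made up for illustration and are not part of the original code. It compares what ends up in the string for each spelling, then builds a small regex from a raw literal, using only the ASCII range so the result does not depend on how the platform treats bytes above 0x7f.

```cpp
#include <cstddef>
#include <regex>
#include <string>

// "\x41": the compiler resolves the escape, so the string holds the single byte 'A'.
inline std::size_t cooked_size() { return std::string("\x41").size(); }

// "\\x41": the backslash is escaped, so the string holds the four characters \ x 4 1.
inline std::size_t escaped_size() { return std::string("\\x41").size(); }

// R"(\x41)": a raw literal keeps the backslash without any doubling.
inline std::size_t raw_size() { return std::string(R"(\x41)").size(); }

// Hypothetical helper: true when every character of `text` is printable ASCII.
// The regex engine, not the compiler, interprets \x20 and \x7e here.
inline bool is_printable_ascii(const std::string& text) {
    static const std::regex pattern(R"(^[\x20-\x7e]*$)");
    return std::regex_match(text, pattern);
}
```

With a raw literal the pattern reads exactly as it would in the W3C page, which may be easier to check by eye than the doubled backslashes. Calling is_printable_ascii("hello world") should accept the input, while a string containing a tab should be rejected.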