Visual C++ - std::regex fatal error


I'd like to think this isn't a bug in the standard library, but I'm running out of places to look.

The statement std::regex(expression), where expression is a std::string, causes a fatal memory access error.

expression is declared with this statement:

    std::string expression = std::string("^(") +
        std::string("[\x09\x0a\x0d\x20-\x7e]|") +            // ASCII
        std::string("[\xc2-\xdf][\x80-\xbf]|") +             // non-overlong 2-byte
        std::string("\xe0[\xa0-\xbf][\x80-\xbf]|") +         // excluding overlong
        std::string("[\xe1-\xec\xee\xef][\x80-\xbf]{2}|") +  // straight 3-byte
        std::string("\xed[\x80-\x9f][\x80-\xbf]|") +         // excluding surrogates
        std::string("\xf0[\x90-\xbf][\x80-\xbf]{2}|") +      // planes 1-3
        std::string("[\xf1-\xf3][\x80-\xbf]{3}|") +          // planes 4-15
        std::string("\xf4[\x80-\x8f][\x80-\xbf]{2}") +       // plane 16
        ")*$";

This regex is taken from http://www.w3.org/international/questions/qa-forms-utf-8 and tests whether a byte sequence is valid UTF-8.

Is this a bug in the library, or am I missing something tiny?

Compiled with VS2015 C++, if that happens to make a difference.

Edit: I forgot to mention that there is one specific line that breaks the code. The std::string("[\xe1-\xec\xee\xef][\x80-\xbf]{2}|") + // straight 3-byte line is the one that breaks. Comment it out and everything works fine. That line on its own creates the memory access error.

So, if you use escapes in string literals without using the raw string syntax,
you have to escape the escapes.

For example, the new string:

    std::string expression = std::string("^(") +
        std::string("[\\x09\\x0a\\x0d\\x20-\\x7e]|") +             // ASCII
        std::string("[\\xc2-\\xdf][\\x80-\\xbf]|") +               // non-overlong 2-byte
        std::string("\\xe0[\\xa0-\\xbf][\\x80-\\xbf]|") +          // excluding overlong
        std::string("[\\xe1-\\xec\\xee\\xef][\\x80-\\xbf]{2}|") +  // straight 3-byte
        std::string("\\xed[\\x80-\\x9f][\\x80-\\xbf]|") +          // excluding surrogates
        std::string("\\xf0[\\x90-\\xbf][\\x80-\\xbf]{2}|") +       // planes 1-3
        std::string("[\\xf1-\\xf3][\\x80-\\xbf]{3}|") +            // planes 4-15
        std::string("\\xf4[\\x80-\\x8f][\\x80-\\xbf]{2}") +        // plane 16
        ")*$";

When you don't escape them, the compiler tries to interpret each one as a
special character. In this case it is converting the hex escapes into raw binary characters before the regex engine ever sees them.
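To make the difference concrete, here is a minimal sketch comparing what ends up in the string in each case. The function names compiler_decoded and engine_decoded are made up for illustration and do not come from the question.

```cpp
#include <string>

// Single backslash: the compiler replaces \x80 and \xbf with one byte each,
// so the string holds 5 characters: '[', 0x80, '-', 0xbf, ']'.
inline std::string compiler_decoded() { return "[\x80-\xbf]"; }

// Double backslash: the literal keeps '\', 'x', '8', '0' as four separate
// characters, so the string holds 11 characters and the regex engine
// receives the escape sequence itself.
inline std::string engine_decoded() { return "[\\x80-\\xbf]"; }
```

Printing compiler_decoded().size() and engine_decoded().size() is a quick way to check which form a given literal produced.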

And, while the regex engine may still get the right character,
it is better to pass the hex escape so the engine can see which character
might break it (if it does).
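The "raw syntax" mentioned above is the C++11 raw string literal, which avoids the doubled backslashes altogether. The sketch below is an illustration only: the helper names are hypothetical, and the match test deliberately uses an ASCII range so it does not depend on how a particular implementation handles high-bit char values. It does not claim to reproduce or fix the VS2015 crash.

```cpp
#include <regex>
#include <string>

// Raw string literal: backslashes are kept verbatim, no doubling needed.
inline std::string raw_pattern() { return R"([\x80-\xbf]{2})"; }

// The same pattern as an ordinary literal with each backslash escaped.
inline std::string escaped_pattern() { return "[\\x80-\\xbf]{2}"; }

// Hypothetical helper: the regex engine decodes \x41-\x43 itself,
// so this matches strings made only of 'A', 'B' and 'C'.
inline bool matches_abc(const std::string& text) {
    static const std::regex re(R"([\x41-\x43]+)");
    return std::regex_match(text, re);
}
```

Both literal forms produce identical strings, so the choice between them is a matter of readability.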

