escaping - Character code pages: control code page assignment that means "the next rendered character (in this source code) is escaped?" -


I acknowledge that this question may be unanswerable, or extremely difficult to answer.

Also, notwithstanding that I expect the audience here is familiar with what escape sequences in e.g. scripting languages are, for reasons of clarity that you'll see later in this post, I'll review the concept:

By "escaped," I mean, for example, printable characters that are interpreted as "do not use the next character as usual; interpret it in context." Contexts include characters that are intended not to be interpreted as code but as literal printed characters, or conversely, characters that might be interpreted as literal characters but that you want interpreted as code instead. My examples (more confusingly, I realize) use the latter case.

A specific example: a regex used with 'nix sed, which, when not escaped for sed, is this:

([^0-9]*)(20[0-9]{2})([^0-9]{1,2})([0-9]{1,2}) 

But when it is escaped so that the shell will pass the regex to sed such that sed knows to interpret the characters not as literal characters but as regex code, the whole string becomes uglier (and less human-readable):

\([^0-9]*\)\(20[0-9]\{2\}\)\([^0-9]\{1,2\}\)\([0-9]\{1,2\}\)

Escape characters (or sequences) are one of the banes of programming. This is especially true of long strings (or code lines), where the only practical options are to pay extreme attention and/or to use tools to create and remove escape sequences.

I've looked around and have not encountered the solution I'll propose. Not knowing what it may be named if it exists, and not being an expert, my search has been futile.

Where I say "control code page assignment," I'm talking about code pages in the sense of tables of printable (and non-printable) characters that computers use to render and control the layout of text, etc., as explained in the Wikipedia article on "code pages". I (loosely) call these "computer alphabets," if you will. By "code page assignment," I mean an entry in a computer's "alphabet" that is interpreted either as a rendered glyph (printable character) or as an unprinted control code (non-printable character).

The idea is to designate a specific, unprinted control code page assignment to mean "interpret the next character as escaped," which a text renderer could "read" and indicate to the programmer by changing e.g. the color and/or brightness of the escaped character that follows the control code. And/or the control code page assignment could be a printable glyph, for example a standardized, non-intrusive accent glyph that doesn't conflict with other accents in alphabets related to the Roman alphabet.

This unprinted code page assignment would be read by interpreters and compilers similarly.
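As a rough illustration of the proposal, the following Python sketch models both readers of such a marker: a renderer that hides it and brightens the next character, and an interpreter that treats it like today's backslash. Everything here is an assumption made for illustration: no such code point is standardized, `ESCAPE_MARK` arbitrarily borrows U+001F (the ASCII unit separator) as a stand-in, the function names are hypothetical, and the renderer uses the ANSI "bold" sequence as its way of showing brightness.

```python
ESCAPE_MARK = "\x1f"  # assumption: arbitrary stand-in for the proposed code point

BOLD = "\x1b[1m"   # ANSI sequence: bold / increased intensity
RESET = "\x1b[0m"  # ANSI sequence: reset attributes


def mark(pattern, specials="(){}"):
    """Put the unprinted marker in front of each character in `specials`."""
    return "".join(ESCAPE_MARK + ch if ch in specials else ch for ch in pattern)


def render(marked):
    """What an editor might show: marker hidden, marked character in bold."""
    out = []
    chars = iter(marked)
    for ch in chars:
        if ch == ESCAPE_MARK:
            nxt = next(chars, "")
            if nxt:
                out.append(BOLD + nxt + RESET)
        else:
            out.append(ch)
    return "".join(out)


def to_backslashes(marked):
    """What an interpreter might do: read the marker as a backslash."""
    return marked.replace(ESCAPE_MARK, "\\")


marked = mark("(20[0-9]{2})")
print(render(marked))          # on an ANSI terminal: ( { } ) appear bold
print(to_backslashes(marked))
```

The stored text carries one extra, invisible character per escape, so the file is no shorter than the backslash version; only the rendering differs.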

Suppose this is the rendered version of the longer regex I gave above:

[image: unescaped, fugly regex]

If we had an unprinted code page assignment that means "the next character is escaped," the escaped characters could, for example, be rendered brighter to indicate that they are escaped:

[image: less fugly control code escaped regex]

That is far easier for a human to interpret (albeit difficult to begin with, being regex) than the following, which instead uses printed characters for escape sequences:

[image: the regex with printed escape characters]

The predominant if not universal situation is that we write and use printed characters in escape sequences, not unprinted code page assignments.

An attendant problem with the proposed solution would be ensuring conformity to the escaped code page assignment among the many tools programmers use. Programmers would have to know which utilities support the escaped code page assignment and which don't. Also, it would be best for tools adopting such a code page assignment to be explicit about whether they are backward compatible (whether you can use both printed characters and the unprinted code page assignment for escape sequences).
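A backward-compatible tool in this sense could be sketched as a reader that accepts either escape form and reports the same result for both. This is only an illustration under assumptions: `ESCAPE_MARK` is again an arbitrary stand-in (U+001F) for a code point that does not exist as a standard, and `read_escapes` is a hypothetical name.

```python
ESCAPE_MARK = "\x1f"  # assumption: arbitrary stand-in for the proposed code point


def read_escapes(text):
    """Return (character, is_escaped) pairs, accepting either escape form."""
    pairs = []
    chars = iter(text)
    for ch in chars:
        if ch in (ESCAPE_MARK, "\\"):
            # Either the unprinted marker or a printed backslash escapes
            # the character that follows it.
            nxt = next(chars, None)
            if nxt is None:
                raise ValueError("dangling escape at end of input")
            pairs.append((nxt, True))
        else:
            pairs.append((ch, False))
    return pairs


old_style = "\\(a\\)"
new_style = ESCAPE_MARK + "(a" + ESCAPE_MARK + ")"
print(read_escapes(old_style) == read_escapes(new_style))
```

A tool that is not backward compatible would drop `"\\"` from the tuple in the `if` test, at which point existing sources with printed backslashes would change meaning.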

I would not prefer a programming language or tool that accomplishes this by means other than an escape control code page assignment. All the same, I'd be curious about tools that do this.

So after all of that, my question is: do any programming languages exist that do this, and/or is there a code page assignment for this?

  • As far as I'm aware, pretty much all programming languages stick to printable ASCII characters*.
  • There is a special escape control character in ASCII, called, unsurprisingly, Escape or ESC (the similarity to the Esc key is not accidental), with code 27 or 0x1B. This character is not used that way anymore.
  • I think you can get pretty close to what you want with syntax highlighting.
  • If you're willing to break the direct correspondence between the bytes in the file you're editing and the characters you see on screen, I think \ can stay as the escape character. You just need to find an editor that's configurable enough and configure it the way you want.

* The two main exceptions I can think of are not that interesting here: APL, with its own set of symbols, and languages supporting Unicode in identifiers.
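The last two suggestions can be sketched together in Python: the file keeps its backslashes, and only the display layer hides them and highlights the character that follows. This is a minimal sketch, not how any particular editor is configured; `display` is a hypothetical name, and it assumes output on a terminal that understands ANSI sequences, which happen to be introduced by the ESC character (code 27) mentioned above.

```python
ESC = chr(27)        # the ASCII Escape character, 0x1B
BOLD = ESC + "[1m"   # ANSI sequence: bold / increased intensity
RESET = ESC + "[0m"  # ANSI sequence: reset attributes


def display(source):
    """Hide each backslash and show the character after it in bold."""
    out = []
    chars = iter(source)
    for ch in chars:
        if ch == "\\":
            nxt = next(chars, "")
            # A trailing backslash has nothing to escape, so show it as-is.
            out.append(BOLD + nxt + RESET if nxt else ch)
        else:
            out.append(ch)
    return "".join(out)


source = r"\(20[0-9]\{2\}\)"   # the bytes stored in the file are unchanged
print(display(source))         # what the screen shows
```

The design trade-off is the one named in the answer: what you see no longer corresponds byte for byte to what is stored, so cursor movement and editing of the hidden backslashes would need their own handling in a real editor.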

