python - Extract block from CSS by regular expression or any other method?
I have to extract the text between /*custom-d start*/ and /*custom-d end*/. There may be spaces after /* and before */.
I did it in two steps:
- Use a regular expression to extract the start and end text.
- Use the string find method to get the text between the start and end text.
The code is as follows:
import re

data = """/* highlighting edits text on tib , not on pdf output.
 .main-igp selector makes style apply tib. */
.main-igp .edit1 {color: rgb(235, 127, 36)}
.main-igp .edit2 {color: rgb(0, 0, 180);}
/*custom-d start */
.main-igp .edit3 {color: rgb(0, 180, 180);}
.main-igp .edit6 {color: rgb(200, 200, 0);}
/* custom-d end */
/* production note ===== */
p.production-note-rw {
 display: none;}
/* production note end ===== */"""

def extractcustomd():
    """Extract the custom-d block from the CSS data.

    The starting text is /*custom-d start*/ and the ending text is
    /*custom-d end*/. There may be spaces after /* and before */.
    """
    try:
        # Raw strings keep the backslashes intact for the regex engine.
        # Each pattern ends with / so the whole marker is matched.
        start_text = re.findall(r"/\* *custom-d start *\*/", data)[0]
        end_text = re.findall(r"/\* *custom-d end *\*/", data)[0]
    except IndexError:
        # One of the markers is missing.
        return ""
    return data[data.find(start_text) + len(start_text):data.find(end_text)]
Can we extract the target content with a single regular expression? Or is there another way to do this?
Edit: the following is working for me:
>>> re.findall(r"/\* *custom-d start *\*/([\s\S]*)/\* custom-d end \*/", data)
['\n.main-igp .edit3 {color: rgb(0, 180, 180);}\n.main-igp .edit6 {color: rgb(200, 200, 0);}\n']
Currently, you extract the /*custom-d start */ and /* custom-d end */ substrings themselves. However, you need the text between them.
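To illustrate the difference, here is a minimal sketch on a shortened, made-up stylesheet (the `css` string and variable names are assumptions for the example, not the data from the question). Without a capture group, re.findall returns the markers; with one group, it returns only the text the group captured.

```python
import re

# Hypothetical, shortened stylesheet used only for this illustration.
css = "/*custom-d start */\n.a {color: red;}\n/* custom-d end */"

# No capture group: findall returns the whole match, i.e. the marker itself.
markers = re.findall(r"/\* *custom-d start *\*/", css)

# One capture group: findall returns only what the group captured.
bodies = re.findall(
    r"/\* *custom-d start *\*/(.*?)/\* *custom-d end *\*/", css, re.DOTALL
)

print(markers)  # ['/*custom-d start */']
print(bodies)   # ['\n.a {color: red;}\n']
```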
You can use one regex to extract that substring:
/\* *custom-d start *\*/\s*(.*?)/\* *custom-d end *\*/
See the regex demo. Use the re.S modifier (also spelled re.DOTALL) so that . also matches line breaks.
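A small sketch of why the flag matters, using a made-up two-rule block (the `css` string is an assumption for the example). Without re.S the dot stops at the first line break, so a block that spans several lines is not matched at all.

```python
import re

pattern = r"/\* *custom-d start *\*/\s*(.*?)/\* *custom-d end *\*/"

# Hypothetical block that spans more than one line.
css = "/*custom-d start */\n.a {color: red;}\n.b {color: blue;}\n/* custom-d end */"

without_flag = re.search(pattern, css)     # . cannot cross the line break
with_flag = re.search(pattern, css, re.S)  # . matches line breaks too

print(without_flag)              # None
print(repr(with_flag.group(1)))  # '.a {color: red;}\n.b {color: blue;}\n'
```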
See the IDEONE demo:

import re

p = re.compile(r'/\* *custom-d start *\*/\s*(.*?)/\* *custom-d end *\*/', re.DOTALL)
test_str = (
    "/* highlighting edits text on tib , not on pdf output. \n"
    " .main-igp selector makes style apply tib. */\n"
    ".main-igp .edit1 {color: rgb(235, 127, 36)}\n"
    ".main-igp .edit2 {color: rgb(0, 0, 180);}\n"
    "/*custom-d start */\n"
    ".main-igp .edit3 {color: rgb(0, 180, 180);}\n"
    ".main-igp .edit6 {color: rgb(200, 200, 0);}\n"
    "/* custom-d end */\n"
    "/* production note ===== */\n"
    "p.production-note-rw {\n display: none;}\n"
    "/* production note end ===== */"
)
m = p.search(test_str)
if m:
    # Group 1 holds the text between the two markers.
    print(m.group(1))
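The lazy quantifier in (.*?) matters once a stylesheet contains more than one custom-d block. The sketch below assumes a made-up stylesheet with two blocks (not the data from the question) and compares a greedy capture with the lazy one.

```python
import re

# Hypothetical stylesheet with two custom-d blocks.
css = (
    "/* custom-d start */\n.a {color: red;}\n/* custom-d end */\n"
    ".x {color: black;}\n"
    "/* custom-d start */\n.b {color: blue;}\n/* custom-d end */\n"
)

# Greedy: runs from the first start marker to the LAST end marker.
greedy = re.findall(
    r"/\* *custom-d start *\*/\s*(.*)/\* *custom-d end *\*/", css, re.S
)

# Lazy: stops at the first end marker after each start marker.
lazy = re.findall(
    r"/\* *custom-d start *\*/\s*(.*?)/\* *custom-d end *\*/", css, re.S
)

print(len(greedy))  # 1
print(lazy)         # ['.a {color: red;}\n', '.b {color: blue;}\n']
```

With a single block, as in the question, both forms return the same text.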
Note that you can unroll the lazy dot matching to
/\* *custom-d start *\*/\s*([^/]*(?:/(?!\* *custom-d end *\*/)[^/]*)*)
This version is faster than the one with lazy dot matching.
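As a rough way to check this on your own input, the sketch below compiles both patterns, confirms they capture the same text on a made-up stylesheet (the `css` string and variable names are assumptions), and times them with timeit. Timings depend on the machine and the input, so no numbers are claimed here.

```python
import re
import timeit

lazy = re.compile(r"/\* *custom-d start *\*/\s*(.*?)/\* *custom-d end *\*/", re.S)
# The unrolled form needs no re.S: [^/] already matches line breaks.
unrolled = re.compile(
    r"/\* *custom-d start *\*/\s*([^/]*(?:/(?!\* *custom-d end *\*/)[^/]*)*)"
)

# Hypothetical stylesheet; the url() value puts a "/" inside the block.
css = (
    "/* a comment */\n.x {color: black;}\n"
    "/*custom-d start */\n.a {color: red;}\n.b {background: url(img/b.png);}\n"
    "/* custom-d end */\n/* production note */\n"
)

lazy_body = lazy.search(css).group(1)
unrolled_body = unrolled.search(css).group(1)
print(lazy_body == unrolled_body)  # True

# Compare the two on this input; results will vary.
print("lazy:    ", timeit.timeit(lambda: lazy.search(css), number=10000))
print("unrolled:", timeit.timeit(lambda: unrolled.search(css), number=10000))
```

One difference to keep in mind: the unrolled pattern as written does not include the closing marker, so if /* custom-d end */ is missing it captures everything up to the end of the input, whereas the lazy pattern finds no match. Appending /\* *custom-d end *\*/ to the unrolled pattern should make the end marker required.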