regex - Remove simple HTML tags from a string in Oracle via RegExp, explanation needed


I do not understand why the columns reg1 and reg2 remove the "bbb" string, while only reg3 works as expected.

with t as (select 'aaa <b>bbb</b> ccc' teststring from dual)
select
  teststring,
  regexp_replace(teststring, '<.+>') reg1,
  regexp_replace(teststring, '<.*>') reg2,
  regexp_replace(teststring, '<.*?>') reg3
from t

teststring            reg1        reg2        reg3
aaa <b>bbb</b> ccc    aaa  ccc    aaa  ccc    aaa bbb ccc

Thanks a lot!

Because regex quantifiers are greedy by default, i.e. the expressions .* and .+ try to take as many characters as possible. Therefore <.+> spans from the first < to the last >. Make it lazy by using the lazy operator ?:

regexp_replace(teststring, '<.+?>') 

or

regexp_replace(teststring, '<.*?>') 

Now the search for > stops at the first > encountered.

Note that . matches > as well, so the greedy variant (without ?) swallows every > except the last one.
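
The difference between the greedy and lazy variants can be reproduced outside the database. The following is only a sketch in Python, on the assumption that its `re` module handles these particular patterns the same way Oracle's `regexp_replace` does; the variable names mirror the column aliases from the question. The last pattern, `<[^>]*>`, is not part of the answer above. It is a common alternative that avoids the lazy operator by forbidding `>` inside the tag.

```python
import re

teststring = "aaa <b>bbb</b> ccc"

# Greedy: .+ and .* run from the first '<' to the last '>',
# so the text between the two tags is removed as well.
reg1 = re.sub(r"<.+>", "", teststring)
reg2 = re.sub(r"<.*>", "", teststring)

# Lazy: .*? stops at the first '>' it can, so each tag is matched separately.
reg3 = re.sub(r"<.*?>", "", teststring)

# Alternative (assumption, not from the answer): a negated character class
# cannot cross a '>', so it behaves like the lazy variant here.
reg4 = re.sub(r"<[^>]*>", "", teststring)

for name, value in [("reg1", reg1), ("reg2", reg2), ("reg3", reg3), ("reg4", reg4)]:
    print(name, repr(value))
```

Only the surrounding spaces remain in reg1 and reg2, while reg3 and reg4 keep "bbb".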

