regex - Remove simple HTML-Tags from String in Oracle via RegExp, Explanation needed
I do not understand why the columns reg1 and reg2 remove the "bbb" string, while reg3 works as expected.
    with t as (select 'aaa <b>bbb</b> ccc' as teststring from dual)
    select teststring,
           regexp_replace(teststring, '<.+>') as reg1,
           regexp_replace(teststring, '<.*>') as reg2,
           regexp_replace(teststring, '<.*?>') as reg3
    from t

    teststring          reg1     reg2     reg3
    aaa <b>bbb</b> ccc  aaa ccc  aaa ccc  aaa bbb ccc
Thanks a lot!
Because regex is greedy by default, i.e. the expressions .* or .+ try to take as many characters as possible. Therefore <.+> will span from the first < to the last >. Make it lazy by using the lazy operator ?:
regexp_replace(teststring, '<.+?>')
or
regexp_replace(teststring, '<.*?>')
Now the search for > will stop at the first > encountered.
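The same effect can be reproduced outside Oracle. The following is only a minimal sketch in Python, assuming that its re module treats the greedy and lazy quantifiers the same way for this input; the variable names mirror the column names from the question and are not Oracle code.

```python
import re

# Same test string as in the question.
teststring = "aaa <b>bbb</b> ccc"

# Greedy quantifiers: the match runs from the first '<' to the last '>'.
reg1 = re.sub(r"<.+>", "", teststring)
reg2 = re.sub(r"<.*>", "", teststring)

# Lazy quantifier: each match stops at the first '>' it reaches.
reg3 = re.sub(r"<.*?>", "", teststring)

print(repr(reg1))  # 'aaa  ccc' (both tags and "bbb" removed, two spaces remain)
print(repr(reg2))  # 'aaa  ccc'
print(repr(reg3))  # 'aaa bbb ccc'
```

Printing with repr makes the two spaces left behind by the greedy variants visible.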
Note that . matches > as well, so the greedy variant (without ?) swallows every > except the last one.
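To see which part of the string each variant actually consumes, one can list the matches instead of replacing them. Again this is a hedged illustration in Python rather than Oracle; greedy_matches and lazy_matches are names chosen here for the example.

```python
import re

teststring = "aaa <b>bbb</b> ccc"

# Greedy: '.' also matches '>', so a single match covers both tags
# and the text between them.
greedy_matches = re.findall(r"<.+>", teststring)

# Lazy: the match ends at the first '>', so each tag is matched separately.
lazy_matches = re.findall(r"<.+?>", teststring)

print(greedy_matches)  # ['<b>bbb</b>']
print(lazy_matches)    # ['<b>', '</b>']
```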