Why does this JavaScript regex crash in the browser?
To my knowledge, [ab] and (a|b) should be equivalent in purpose when trying to match against a set of characters. Now, consider these two regexes:
/^(\s|\u00a0)+|(\s|\u00a0)+$/g
/^[\s\u00a0]+|[\s\u00a0]+$/g
They should both match whitespace at the beginning and end of a string (see the section on the polyfill here for more info on the regex itself). When using square brackets, things work well, but when I switch to parentheses, even the simplest of strings causes the browser to run seemingly indefinitely. This happens on the latest Chrome and Firefox.
This jsfiddle demonstrates the problem:
a = "a b";
// doesn't work
// alert(a.replace(/^(\s|\u00a0)+|(\s|\u00a0)+$/g,''));
// works
alert(a.replace(/^[\s\u00a0]+|[\s\u00a0]+$/g,''));
Is this a crazy quirk of the browser's implementation of the regex engine, or is there something else in the regex's algorithm that causes this?
The problem you are seeing is called catastrophic backtracking, as explained here.
First of all, let me simplify and clarify your test case:
a = Array(31).join("\u00a0") + "b"; // string with 30 consecutive \u00a0 (Array(n).join yields n - 1 copies)
s = Date.now();
t = a.replace(/^(\s|\u00a0)+$/g, '');
console.log(Date.now() - s, a.length);
The trouble is in the second part of the expression: ^(\s|\u00a0)+$. Note that \s matches a number of whitespace characters, including \u00a0 itself. This means that both \s and \u00a0 match each of the 30 \u00a0 characters.
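The overlap between the two alternatives can be checked directly. This is a minimal sketch, assuming an ES5-or-later engine where \s covers the non-breaking space; the variable names are only illustrative.

```javascript
// Check that both alternatives of (\s|\u00a0) accept a non-breaking space.
const nbsp = "\u00a0";

const matchedByWhitespaceClass = /^\s$/.test(nbsp); // the \s alternative
const matchedByLiteral = /^\u00a0$/.test(nbsp);     // the \u00a0 alternative

console.log(matchedByWhitespaceClass, matchedByLiteral);
```

Because both tests succeed, the engine has two distinct ways to consume every \u00a0, which is what makes the alternation ambiguous.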
Therefore, if you try to match the string against /(\s|\u00a0)+/, you will find that each of the 2^30 different combinations of 30-character whitespace patterns results in a match. When the regular expression matcher has matched the first 30 characters, it tries to match the end of the string ($) and fails, so it backtracks and ends up trying all 2^30 combinations.
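The doubling can be made concrete with a toy model. The sketch below is not how Chrome's or Firefox's regex engine is actually implemented; it is a hypothetical, naive backtracking matcher for ^(\s|\u00a0)+$ (the name countAttempts is made up for illustration) that simply counts how many times an alternative is tried.

```javascript
// Toy backtracking matcher for ^(\s|\u00a0)+$ that counts alternative attempts.
// Hypothetical illustration only; real engines are far more sophisticated.
function countAttempts(input) {
  const alternatives = [
    (ch) => /\s/.test(ch),   // models the \s alternative
    (ch) => ch === "\u00a0", // models the \u00a0 alternative
  ];
  let attempts = 0;

  function matchFrom(pos) {
    // After at least one iteration, reaching the end of input satisfies "$".
    if (pos > 0 && pos === input.length) return true;
    if (pos >= input.length) return false;
    for (const alt of alternatives) {
      attempts++;
      // If this alternative consumes the character, try to match the rest;
      // on failure, fall through and retry with the other alternative.
      if (alt(input[pos]) && matchFrom(pos + 1)) return true;
    }
    return false;
  }

  matchFrom(0);
  return attempts;
}

// Strings of n non-breaking spaces followed by "b", so "$" can never match.
for (const n of [4, 8, 12, 16]) {
  console.log(n, countAttempts("\u00a0".repeat(n) + "b"));
}
```

In this model each additional \u00a0 roughly doubles the number of attempts when the match ultimately fails, whereas a string made only of \u00a0 characters succeeds after one attempt per character.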
Your original string (in the jsfiddle; the one on Stack Overflow has "normalized" spaces) is a \u00a0 \u00a0 ... \u00a0 b with 30 \u00a0 characters, so it took the browser on the order of 2^30 steps to complete. It does not hang the browser, but it will take a few minutes to complete.
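For comparison, here is a minimal sketch of the character-class version from the question applied to a similar string. The helper name trimWhitespace is hypothetical. Since the class [\s\u00a0] gives the engine only one way to consume each character, a failed match backtracks by giving up one character at a time rather than by exploring every combination.

```javascript
// Trim leading and trailing whitespace using the character-class form
// from the question. trimWhitespace is an illustrative name only.
function trimWhitespace(str) {
  return str.replace(/^[\s\u00a0]+|[\s\u00a0]+$/g, "");
}

// 30 non-breaking spaces between "a" and "b": nothing to trim,
// and the call returns promptly instead of backtracking exponentially.
const padded = "a" + "\u00a0".repeat(30) + "b";
console.log(trimWhitespace(padded) === padded);

// Leading and trailing whitespace (including \u00a0) is removed.
console.log(JSON.stringify(trimWhitespace("\u00a0 hello \u00a0")));
```

In engines where \s already includes \u00a0, the class could arguably be reduced to \s alone; the explicit \u00a0 is kept here only to mirror the regex in the question.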