redundancy - Simplifying the regex "ab|a|b" -
(How) can the following regex be simplified:
ab|a|b
?
I'm looking for a less redundant one, i.e. with only one a and only one b. Is that possible?
Some tries:
a?b? # matches the empty string, while it shouldn't
ab?|b # still 2 b's
Note that the real regex has more complicated a and b parts, i.e. they are not single-character inner subregexes, let's say.
If you are using Perl or a PCRE engine (like PHP's preg_ functions), you can refer to previous groups in the pattern, like this:
/(a)(b)|(?1)|(?2)/
The main purpose of this feature is to support recursion, but it can be used for pattern reuse as well.
Note that in this case you cannot get around capturing a and b in the first alternation, which incurs (possibly) unnecessary overhead. To avoid this, you can define the groups inside a conditional that is never executed. The canonical way to do this is to use the (?(DEFINE)...) group (which checks whether a group named DEFINE matched anything, but of course that group doesn't exist):
/(?(DEFINE)(a)(b))(?1)(?2)|(?1)|(?2)/
If your engine doesn't support this (EDIT: since you are using Java, no, this feature is not supported), the best you can do in a single pattern is indeed
ab?|b
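As a quick sanity check of the candidates discussed so far, here is a minimal Java sketch that compares them on a few inputs using whole-string matching. The class name SimplifiedAlternation and the helper fullMatch are made up for illustration; only java.util.regex.Pattern and Matcher.matches() are standard API.

```java
import java.util.regex.Pattern;

class SimplifiedAlternation {
    // The original pattern from the question and the two rewrites that were tried.
    static final Pattern ORIGINAL = Pattern.compile("ab|a|b");
    static final Pattern OPTIONAL_BOTH = Pattern.compile("a?b?");
    static final Pattern SIMPLIFIED = Pattern.compile("ab?|b");

    // Hypothetical helper: true when the pattern matches the entire input.
    static boolean fullMatch(Pattern pattern, String input) {
        return pattern.matcher(input).matches();
    }

    public static void main(String[] args) {
        String[] samples = {"", "a", "b", "ab", "ba", "abb"};
        for (String s : samples) {
            // Print one row per sample so the three patterns can be compared side by side.
            System.out.println("\"" + s + "\""
                    + "  ab|a|b=" + fullMatch(ORIGINAL, s)
                    + "  a?b?=" + fullMatch(OPTIONAL_BOTH, s)
                    + "  ab?|b=" + fullMatch(SIMPLIFIED, s));
        }
    }
}
```

On these samples, ab?|b agrees with ab|a|b everywhere, while a?b? differs only on the empty string, which it accepts.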
Alternatively, you can build the ab|a|b version manually with string concatenation/formatting, like:
String a = "a";
String b = "b";
String pattern = a + b + "|" + a + "|" + b;
This avoids the duplication as well. Or you can use 3 separate patterns, ab, a and b, against the subject string (where the first one is again a concatenation of the latter two).
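Since the real a and b are said to be more complicated subregexes, the concatenation approach may need each part wrapped in a non-capturing group so that an alternation inside a part cannot change the meaning of the combined pattern. The following is only a sketch under that assumption: the class PatternParts, its methods, and the two sample subregexes are hypothetical stand-ins, not taken from the question.

```java
import java.util.regex.Pattern;

class PatternParts {
    // Wrap a sub-regex in a non-capturing group so that any "|" inside it
    // stays local to that part once the pieces are concatenated.
    static String group(String subRegex) {
        return "(?:" + subRegex + ")";
    }

    // Build the equivalent of "ab|a|b" from two arbitrary sub-regexes,
    // with each sub-regex written only once in the source code.
    static Pattern eitherOrBoth(String a, String b) {
        String ga = group(a);
        String gb = group(b);
        return Pattern.compile(ga + gb + "|" + ga + "|" + gb);
    }

    public static void main(String[] args) {
        // Hypothetical stand-ins for the "more complicated" parts.
        Pattern p = eitherOrBoth("\\d+|x", "[a-z]{2}");
        for (String s : new String[] {"42ab", "x", "ab", "", "ab42"}) {
            System.out.println("\"" + s + "\" -> " + p.matcher(s).matches());
        }
    }
}
```

Without the grouping, the sample parts would concatenate to \d+|x[a-z]{2}|\d+|x|[a-z]{2}, which no longer accepts an input such as 42ab as a whole. The compiled regex still contains each part twice, but the source defines it once, which is usually what matters for maintenance.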