Skip to content

Archives

Links for 2011-04-20

  • demerphq on “perl’s regexps are slow” : His classic response to the Russ Cox DFA-over-NFA regular expressions paper. ‘A general purpose regex engine like that required for perl has to be able to do a lot, and has to balance considerations ranging from memory footprint of a compiled object, construction time, flexibility, rich feature-sets, the ability to accomodate huge character sets, and of course most importantly matching performance. And it turns out that while DFA engines have a very good worst case match time, they dont actually have too many other redeeming features. Construction can be extremely slow, the memory footprint vast, all kinds of trickery is involved to do unicode or capturing properly and they aren’t suitable for patterns with backreferences.’ — Also interesting to note that he mentions an approach I’ve used in several SpamAssassin speedup add-ons, too ;)
    (tags: performance perl regular-expressions perlmonks demerphq regexps dfa nfa state-machines)