Greedy vs. non-greedy
I am making several regex substitutions in Python along the lines of \w\s+\w over many large documents. Obviously, making the regex non-greedy (with a ?, as in \w\s+?\w) won't change what it matches, since \w and \s never match the same character. But will it make the code run any faster? In other words, with non-greedy regexes, does Python work its way forward from the first character matched, rather than from the end of the document back to that character, or is this a naive view?
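The premise that both forms match the same text can be checked directly. The following is a minimal sketch, assuming a made-up sample string (`text` is hypothetical, not taken from the documents in question). In both cases \s+ and \s+? start at the current position and move forward through the whitespace run; neither starts from the end of the document.

```python
import re

# Hypothetical sample text; any string with runs of whitespace would do.
text = "some text with \tspaces   between\nlines"

greedy = re.compile(r"(\w)(\s+)(\w)")
lazy = re.compile(r"(\w)(\s+?)(\w)")

# The whitespace run has to end at the next word character either way,
# so both patterns should find the same spans.
greedy_spans = [m.span() for m in greedy.finditer(text)]
lazy_spans = [m.span() for m in lazy.finditer(text)]
same_spans = greedy_spans == lazy_spans

# The substitution collapses each whitespace run to a single space.
greedy_result = greedy.sub(r"\1 \3", text)
lazy_result = lazy.sub(r"\1 \3", text)
same_result = greedy_result == lazy_result

print(same_spans, same_result)
print(repr(greedy_result))
```

If the two patterns ever disagreed on a given input, `same_spans` or `same_result` would be False, which would be a sign that the pattern is not the one described above.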
Is this the pattern you implied?
In [14]: import re
In [15]: s = 'some text with \tspaces between'
In [16]: timeit re.sub(r'(\w)(\s+)(\w)', r'\1 \3', s)
10000 loops, best of 3: 30.5 us per loop
In [17]: timeit re.sub(r'(\w)(\s+?)(\w)', r'\1 \3', s)
10000 loops, best of 3: 24.9 us per loop
The difference here seems pretty small: only about 5 microseconds in favor of the non-greedy version.
Using a 500-word lorem ipsum, with multiple mixed whitespace characters between every word, I get an 8 ms difference.
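For anyone who wants to repeat the larger test outside IPython, here is a rough sketch using the standard `timeit` module. The document it builds is an assumption: a hypothetical stand-in of 500 words with mixed whitespace runs, not the lorem ipsum text used above, and `best_time` is a helper name made up for this example. Absolute numbers will vary by machine and Python version.

```python
import re
import timeit

# Hypothetical stand-in for the 500-word test document: 500 words
# separated by mixed runs of spaces, tabs, and newlines.
words = ["lorem", "ipsum", "dolor", "sit", "amet"] * 100
separators = [" ", "  \t", "\n\n ", " \t \n"]
doc = "".join(w + separators[i % len(separators)] for i, w in enumerate(words))

greedy = re.compile(r"(\w)(\s+)(\w)")
lazy = re.compile(r"(\w)(\s+?)(\w)")


def best_time(pattern, number=50, repeat=5):
    """Return the best observed time in seconds for one substitution pass."""
    runs = timeit.repeat(
        lambda: pattern.sub(r"\1 \3", doc), number=number, repeat=repeat
    )
    return min(runs) / number


greedy_s = best_time(greedy)
lazy_s = best_time(lazy)

print(f"greedy:     {greedy_s * 1e6:.1f} us per pass")
print(f"non-greedy: {lazy_s * 1e6:.1f} us per pass")
```

Taking the minimum over several repeats mirrors the "best of 3" figure that IPython's timeit reports, which makes the two sets of numbers easier to compare.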