
Are preg_match() and preg_replace() slow?

Posted 2020-08-13 08:27

Question:

I've been coding in PHP for a while and I keep reading that you should only use preg_match and preg_replace when you have to, because they slow down performance. Why is this? Would it really be bad to use 20 preg_match calls in one file instead of using another PHP function?

Answer 1:

As Mike Brant said in his answer: There's nothing wrong with using any of the preg_* functions, if you need them.
You want to know if it's a good idea to have something like 20 preg_match calls in a single file. Well, honestly: I'd say that's too many. I've often stated that "if your solution to a problem relies on more than 3 regexes at any given time, you're part of the problem". I have occasionally sinned against my own mantra, though.

If you are using 20 preg_match calls, chances are you can halve that number simply by taking a closer look at the actual regular expressions. Regexes, especially Perl-compatible ones, are incredibly powerful and well worth the time it takes to get to know them. The reason they tend to be slower is simply that the regex has to be parsed and "translated" into a considerable number of branches and loops at some low level. If, say, you want to replace all lower-case a's with an upper-case A, you could use a regular expression, sure, but in PHP this would look like this:

preg_replace('/a/','A',$string);

Look at the expression, the first argument: it's a string that is passed as an argument. This string will be parsed: the delimiters are checked and a match string is created. Then the subject string is iterated, each char is compared to the pattern (in this case a), and if the substring matches, it's replaced.
Seems like a bit of a hassle, especially considering that the last step (comparing substrings and replacing matches) is all we really want.

$string = str_replace('a','A',$string);

Does just that, without the additional checks performed when a regular expression is parsed and validated.
Don't forget that preg_match also constructs an array of matches when you pass it a third argument, and constructing an array isn't free either.
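To make the comparison concrete, here is a minimal sketch (the input string is made up for illustration) showing that both calls produce the same result, and that preg_match only fills a matches array when you hand it a third argument:

```php
<?php
// Hypothetical input, just for illustration.
$string = 'banana bread';

// Literal replacement: there is no pattern to parse or compile.
$viaStr = str_replace('a', 'A', $string);

// Same replacement, but routed through the regex engine.
$viaPreg = preg_replace('/a/', 'A', $string);

var_dump($viaStr === $viaPreg); // both variables hold the same string

// Without a third argument, preg_match only reports whether it matched.
$found = preg_match('/a/', $string);

// With a third argument, it also populates $matches.
$foundToo = preg_match('/a/', $string, $matches);
```

If all you need is the yes/no answer, leaving out the third argument avoids building that array.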

In short: regexes are slower because the expression is parsed, validated and finally translated into a set of simple, low-level instructions.
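As for halving the number of preg_match calls: one common way is to merge several patterns into a single alternation. The input and the three patterns below are hypothetical, so treat this as a sketch of the idea rather than a recipe:

```php
<?php
// Hypothetical input and patterns, for illustration only.
$input = 'Order #4521 shipped';

// Three separate calls, each one a trip through the regex engine...
$separate = preg_match('/shipped/', $input)
    || preg_match('/delivered/', $input)
    || preg_match('/returned/', $input);

// ...versus a single call using an alternation.
$combined = preg_match('/shipped|delivered|returned/', $input, $m) === 1;

var_dump($separate === $combined); // both approaches agree
```

Whether the merged pattern is actually faster depends on the patterns and the input, so it's still worth measuring.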

Note that in some cases people use explode and implode for string manipulation. This, too, creates an array, which is (again) not free, especially considering that you're imploding that very same array shortly thereafter. Perhaps another option is more desirable (and in some cases preg_replace can be faster here).
Basically: regexes need additional processing that simple string functions don't require. But when in doubt, there's only one way to be absolutely sure: set up a test script...
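A small sketch of the explode/implode case, assuming a made-up comma-separated string whose separators need swapping. All three routes give the same result; only the first one builds a throwaway array:

```php
<?php
// Hypothetical input: swap the comma separators for semicolons.
$csv = 'red,green,blue';

// explode/implode builds a temporary array and then joins it again.
$viaArray = implode(';', explode(',', $csv));

// str_replace works on the string directly.
$viaStr = str_replace(',', ';', $csv);

// preg_replace gives the same result via the regex engine.
$viaPreg = preg_replace('/,/', ';', $csv);

var_dump($viaArray === $viaStr && $viaStr === $viaPreg);
```

Which of these is quickest for your data is exactly the kind of thing a test script should tell you.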



Answer 2:

Don't worry about optimization unless you have a problem.

Don't look for areas to optimize without measuring first, with something like Xdebug (http://xdebug.org).

If your code takes 100ms to run with preg_match() and 110ms via some other method, do you really care about the difference?

Write for correctness and clarity first, then consider speed.



Answer 3:

It really depends on your use case. There is nothing inherently "bad" about using regex. Sometimes it is your only available solution to a particular problem. However, there are times when simple string manipulation functions will work just fine. These tend to be faster than the preg_* functions, so if you have a script that is run very frequently and/or performs a large number of string manipulations, the impact of using regex can begin to be felt.

As is the case for anything, you should test in your application and environment and decide what works best for you.



Answer 4:

Check how much time it needs (display times when STARTED and ENDED):

var_dump( microtime(true) );

//...............  your function executions here.............

var_dump( microtime(true) );
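If you'd rather see the elapsed time directly, one possible variation is to subtract the two timestamps and repeat the call in a loop so the difference is large enough to read. The time_it helper, the sample data and the iteration count are all made up for this sketch:

```php
<?php
// Hypothetical helper: runs $fn $iterations times and returns elapsed seconds.
function time_it(callable $fn, int $iterations = 100000): float
{
    $start = microtime(true);
    for ($i = 0; $i < $iterations; $i++) {
        $fn();
    }
    return microtime(true) - $start;
}

// Hypothetical sample data.
$string = str_repeat('banana ', 50);

$strTime = time_it(function () use ($string) {
    return str_replace('a', 'A', $string);
});
$pregTime = time_it(function () use ($string) {
    return preg_replace('/a/', 'A', $string);
});

printf("str_replace:  %.4f s\npreg_replace: %.4f s\n", $strTime, $pregTime);
```

The numbers will differ between machines and PHP versions, so run it several times in your own environment before drawing conclusions.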


Answer 5:

Depends on what you're doing. For complex patterns, just go with the preg_ functions. If you only need simple substitutions or similar, go with other, more specific functions like str_replace(), strpos(), strstr()...
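A rough illustration of that split, using a made-up e-mail string. The regex here is deliberately loose and is not meant as a real validator:

```php
<?php
// Hypothetical input, for illustration only.
$email = 'user@example.com';

// Simple "does it contain X?" check: strpos is enough.
// Note the strict !== false, because a match at offset 0 is falsy.
$hasAt = strpos($email, '@') !== false;

// strstr returns the rest of the string, starting at the first match.
$domainPart = strstr($email, '@');

// A structural check is where a regex earns its keep.
$looksLikeEmail = preg_match('/^[^@\s]+@[^@\s]+\.[a-z]{2,}$/i', $email) === 1;

var_dump($hasAt, $domainPart, $looksLikeEmail);
```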

The web is full of discussions, like http://www.simplemachines.org/community/index.php?topic=175031.0