I'm Home. Yay.
And now, I become one with my bed. ZZZ.
copy("/tmp/index.html", "compress:zlib://" , $tmp_name);
apc_store($key, $contents, $ttl);
$contents = apc_fetch($key);
apc_delete($key);
The first session I attended today was on improving the performance of PHP applications, presented by Ilia Alshanetsky. It was pretty informative, but he spent a significant amount of time talking about optimizations that, while relevant, aren't PHP-specific, which makes them useful for all websites, not just PHP applications.
The interesting PHP-related bits, though, are:
Using an optimizer without an opcode cache can be a net loss. (Though, this is obvious if you think about it - in order to generate optimal executable code, the optimizer may spend more time generating that code than can be saved in a single execution.)
You can have a separate ini file for commandline PHP (php-cli.ini; php-<SAPI>.ini), so you can do things like disable register_argc_argv on the webserver where it's never needed, and enable it for the cli when it is.
Instead of using time(), consider $_SERVER['REQUEST_TIME']. In my own tests, this only appears to be a 15% difference, so I wouldn't be in too much of a hurry to make this change for existing code. There are a few other functions whose values are duplicated in constants, so they can be fetched with even better gains.
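Here's a minimal sketch of what that swap might look like. The helper name request_time() is my own invention for illustration, and the one-hour expiry is just an example value. Keep in mind that REQUEST_TIME is fixed at the start of the request, so it's no good for measuring elapsed time within a script.

```php
<?php
// Hypothetical helper: prefer the timestamp PHP recorded when the request
// started, and fall back to time() if it is unavailable.
function request_time()
{
    return isset($_SERVER['REQUEST_TIME'])
        ? (int) $_SERVER['REQUEST_TIME']
        : time();
}

$now = request_time();
$expires = $now + 3600; // e.g. a cache entry that is valid for one hour
```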
preg_* is generally faster than ereg_*; but in any case, don't use regular expressions when there's a PHP API function that does specifically what you need. (This I already knew, but it's worth repeating, since I see a lot of people making this mistake.)
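As a small illustration of that last point, here's a sketch of the same check written both ways. The function names and the "http://" prefix test are hypothetical examples of mine, not something from the talk.

```php
<?php
// Hypothetical check: does $url begin with "http://"?

// Regular-expression version: works, but invokes the regex engine.
function starts_with_http_regex($url)
{
    return preg_match('#^http://#', $url) === 1;
}

// Plain string function version: compares only the first 7 bytes.
function starts_with_http($url)
{
    return strncmp($url, 'http://', 7) === 0;
}
```

Both return the same answer for the same input; the second just gets there without compiling or running a pattern.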
This one's counter-intuitive, since it seems like you're creating more work. When doing string replaces, it's often advantageous to do the replace conditionally: In the case that there's no match, if(strpos() !== false) str_replace(); is significantly faster than blindly calling str_replace(), and barely any slower in the case that there's a text match. (This is due to the fact that str_replace has to duplicate the search, replacement, and source strings.) My tests show that it's about 1.7 times faster for an empty file (non-matching, of course), 2.3 times faster for a non-matching 95 KB file, and 3.2 times faster for a 2 MB file (non-matching). For a matching replace in the 2 MB file, it appears to be about a 1% difference. I'm wondering if it might be possible to modify str_replace so that it can do this by itself - first search for the replacement, and then if it finds one, create the new copies. A quick glance at the str_replace code suggests that there might be potential for some improvement, but it's going to take some time for me to do more research.
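The conditional replace can be wrapped up in a small function. This is only a sketch: replace_if_present() is a name I made up, it assumes a single non-empty string for $search (not the array forms str_replace also accepts), and whether it's still a win will depend on the PHP version and the data.

```php
<?php
// Hypothetical wrapper: only call str_replace() when $search actually occurs.
function replace_if_present($search, $replace, $subject)
{
    if (strpos($subject, $search) === false) {
        return $subject; // no match: skip str_replace() entirely
    }
    return str_replace($search, $replace, $subject);
}

echo replace_if_present('cat', 'dog', 'a cat sat'), "\n";
echo replace_if_present('bird', 'dog', 'a cat sat'), "\n";
```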
Then there's the @ (error-suppression) operator, which I had already decided was evil on the grounds that you shouldn't be writing code that emits errors. (Though I use @ for those PHP functions which are brain-dead enough to emit a warning when the code is perfectly valid but an error occurs - e.g. mysql_connect() to a server that is down.) Ilia, however, says it's evil because it's amazingly slow. And he's right. For a call into a function that does nothing, it's slower by a factor of four. For a no-op (@0; vs. 0;), 100 million iterations of @0; takes about 80 seconds. I can't actually time how long it takes to do 100 million iterations of 0;, though, because it executes so quickly that the noise of the background processes on my computer results in meaningless benchmark values - about half the time, I'm getting a negative value for time elapsed. In any case, @ is really slow, and should be used as close to never as is practical.
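Where the failure condition can be tested up front, the @ can often be dropped altogether. Here's a sketch for one such case, reading a file that may not exist. The helper name is hypothetical, and there's still a small window in which the file could disappear between the check and the read, so this isn't a drop-in replacement everywhere.

```php
<?php
// Hypothetical helper: read a file that may not exist, without using @.
function read_if_readable($path)
{
    // Test the precondition explicitly instead of suppressing the warning
    // that file_get_contents() would emit for a missing file.
    if (!is_readable($path)) {
        return false;
    }
    return file_get_contents($path);
}
```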
I'm really happy Ilia touched on accessing array indexes with unquoted strings (e.g. $foo[bar] = 1). This is a major pet peeve of mine, and while I knew it was slower, I hadn't really realized how much slower it was. (It involves one call to strtolower, two hashtable lookups, an E_NOTICE error, and the creation of a temporary string; none of which is necessary if the index is enclosed in quotes.) Ilia's benchmarks show a 700% difference "on average", depending on the length of the key, but my tests are showing an 1100%+ difference, since I'm factoring out the cost of loop iteration.
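Speed aside, there's a correctness reason to quote the key: a bare word is looked up as a constant first, so the two forms can end up pointing at different elements. A small sketch, in which the constant and the array contents are made up for illustration (the constant is defined on purpose so the unquoted form is valid code here):

```php
<?php
// A constant that happens to share its name with an array key (hypothetical).
define('bar', 'other');

$foo = array('bar' => 1, 'other' => 2);

echo $foo['bar'], "\n"; // quoted: the string key 'bar', prints 1
echo $foo[bar], "\n";   // unquoted: the constant bar, i.e. key 'other', prints 2
```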
I asked Ilia whether using echo with multiple parameters was faster than giving it concatenated strings (e.g. echo $one, $two, $three vs. echo $one . $two . $three). He said it was, but I'm not so sure this is true in all cases. I did some rather pathological benchmarks, and found that when outputting/concatenating two empty strings, two-parameter echo is twice as fast. But increasing the two strings to one character results in concatenation being twice as fast with output buffering disabled, and about 10-15% faster with output buffering enabled. Clearly, this is deserving of more extensive benchmarks.
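For anyone who wants to try this at home, here's a rough sketch of the kind of timing loop involved. It isn't the exact script I used; time_echo() and its parameters are hypothetical, and because it captures output with ob_start() it only covers the output-buffering-enabled case.

```php
<?php
// Rough timing sketch; the function name and parameters are hypothetical.
// Output is captured in a buffer, so this only covers the buffered case.
function time_echo($useCommas, $iterations, $one, $two)
{
    ob_start();
    $start = microtime(true);
    if ($useCommas) {
        for ($i = 0; $i < $iterations; $i++) {
            echo $one, $two;
        }
    } else {
        for ($i = 0; $i < $iterations; $i++) {
            echo $one . $two;
        }
    }
    $elapsed = microtime(true) - $start;
    $bytes = strlen(ob_get_clean()); // discard the output, keep its size
    return array($elapsed, $bytes);
}

list($commaTime, $commaBytes) = time_echo(true, 100000, 'a', 'b');
list($concatTime, $concatBytes) = time_echo(false, 100000, 'a', 'b');
printf("commas: %.4fs, concatenation: %.4fs\n", $commaTime, $concatTime);
```

Returning the byte count alongside the time is a cheap sanity check that both variants really wrote the same amount of output.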
Ilia explains that it's important to fix code that's generating errors that aren't displayed by default (E_NOTICE and E_STRICT), because they result in time spent generating the error message, even if it's not displayed, but he didn't say what the speed penalty was. My tests show that accessing an array element that does not exist, for example, takes about 10 times longer than accessing an array element that does exist. I've written a lot of code that runs into this. Often, I'll have
if($array['someProperty']) {do something }
when I should have
if(!empty($array['someProperty'])) { ... }
instead (which is about 7.5 times faster). I don't think it looks as pretty (which is why I've been stubbornly doing the former), but I think I can justify the "uglier" code now in new development. (I'm not sure I could justify going back and changing existing code though, except as an exercise to remove the warnings so that when E_NOTICE/E_STRICT are enabled, they don't drown out any new warnings.)
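To convince myself the two forms really are interchangeable, here's a small sketch with a made-up array. For keys that exist, !empty() gives the same answer as a plain truthiness test; for a missing key it quietly returns false instead of raising an error.

```php
<?php
$array = array('someProperty' => 'yes', 'zero' => 0);

// Key exists and is truthy: both forms agree.
$a = (bool) $array['someProperty'];
$b = !empty($array['someProperty']);

// Key exists but is falsy: !empty() is false, just like if($array['zero']).
$c = !empty($array['zero']);

// Key is missing: !empty() is false and no error is generated, whereas
// reading $array['missing'] directly would generate one.
$d = !empty($array['missing']);

var_dump($a, $b, $c, $d);
```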
A reminder that PHP5 passes variables with copy-on-write, so it's not necessary (and actually bad for performance) to pass a variable to a function by reference unless you need the function to modify the original variable.
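In practice that boils down to something like the following sketch. The two functions are hypothetical examples: one only reads its argument, the other has to change the caller's variable.

```php
<?php
// Read-only use: pass by value. PHP shares the underlying data and only
// copies it if the function writes to the parameter.
function total(array $values)
{
    return array_sum($values);
}

// The function must change the caller's variable: pass by reference.
function append_total(array &$values)
{
    $values[] = array_sum($values);
}

$numbers = array(1, 2, 3);
$sum = total($numbers);   // $numbers is unchanged
append_total($numbers);   // $numbers now ends with its previous sum
```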
There are a number of other things that were brought up that are good to know, but which I'm not going to ponder in any great detail here. These include using full pathnames in include|require(_once) calls and using references for loop invariants with multidimensional arrays (e.g. for($x = 0; $x < 5; $x++) $arr['a']['b'][$x] = $x;).
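For the loop-invariant case, my understanding of the suggestion is roughly the sketch below. The variable name $inner is mine, and the array is pre-initialized here just to keep the example self-contained. It only covers the reference trick, not the include paths.

```php
<?php
$arr = array('a' => array('b' => array()));

// Resolve the invariant part of the path ($arr['a']['b']) once...
$inner = &$arr['a']['b'];
for ($x = 0; $x < 5; $x++) {
    $inner[$x] = $x; // ...so each iteration does a single array lookup
}
unset($inner); // drop the reference so later use of $inner can't touch $arr
```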