| John Bafford @ 2006-11-01 15:45:00 |
| Current location: | Zend Conference, Doubletree, San Jose, CA |
| Entry tags: | php, zendconference2006 |
ZendCon Session Notes - Caching Systems
Presented by Ilia Alshanetsky.
Ilia presented a number of different caching approaches and talked about their pros and cons:
Complete Page Content Caching
This can be implemented simply: create a cache() function that tries to read in a cache file. If the file is too old or nonexistent, call your init_cache() function, which turns on output buffering and registers a shutdown function with register_shutdown_function(). That shutdown function gets the content out of the output buffer, echoes it, and writes out the cache file.
When writing the cache file out, you will want to use tempnam(), file_put_contents(), and then rename(), so that you don't run into issues with multiple connections attempting to write to the same file at the same time.
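A minimal sketch of that approach, reconstructed from my notes rather than copied from Ilia's slides. The helper names cache_write() and cache_read(), the $path and $ttl parameters, and the 300-second default are my own assumptions:

```php
<?php
// Sketch only: helper names, parameters, and defaults are hypothetical.

/** Atomically write $contents to $path: fill a temp file, then rename() it into place. */
function cache_write(string $path, string $contents): bool
{
    // Create the temp file in the target directory so rename() stays on one filesystem.
    $tmp = tempnam(dirname($path), 'cache');
    if ($tmp === false) {
        return false;
    }
    if (file_put_contents($tmp, $contents) === false || !rename($tmp, $path)) {
        @unlink($tmp);
        return false;
    }
    return true;
}

/** Return the cached page if it exists and is at most $ttl seconds old, else null. */
function cache_read(string $path, int $ttl): ?string
{
    clearstatcache(true, $path);
    if (!is_file($path) || time() - filemtime($path) > $ttl) {
        return null;
    }
    $contents = file_get_contents($path);
    return $contents === false ? null : $contents;
}

/** Serve from the cache when possible; otherwise buffer the page and save it at shutdown. */
function cache(string $path, int $ttl = 300): void
{
    $cached = cache_read($path, $ttl);
    if ($cached !== null) {
        echo $cached;
        exit;
    }
    ob_start();
    register_shutdown_function(function () use ($path) {
        $page = ob_get_clean(); // everything the script printed
        if ($page !== false) {
            echo $page;
            cache_write($path, $page);
        }
    });
}
```

Calling cache() at the top of a script is then enough: a cache hit ends the request immediately, and a miss lets the page run normally and be saved when it finishes.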
This is fast, but it requires that the entirety of your page be cacheable, which is not often the case.
Compressed Page Cache
If the browser accepts gzip, we can send Content-Encoding: gzip and serve a compressed version of the cached page. Thanks to the magic of PHP file streams, the compressed copy can be created really easily with:
copy("/tmp/index.html", "compress.zlib://" . $tmp_name);
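To show how the pieces might fit together, here is a rough sketch that creates the compressed copy and picks which version to serve. The function names and the ".gz" suffix are my own assumptions, and the Accept-Encoding check is deliberately simplistic (it ignores q-values such as "gzip;q=0"):

```php
<?php
// Sketch only: function names and the ".gz" naming scheme are hypothetical.

/** Write a gzip-compressed copy of $source next to it and return the copy's path. */
function cache_compress(string $source): string
{
    $gz = $source . '.gz';
    // Data written through compress.zlib:// is gzip-encoded on the way to disk.
    copy($source, 'compress.zlib://' . $gz);
    return $gz;
}

/** Simplified check of the Accept-Encoding request header (q-values are ignored). */
function client_accepts_gzip(string $acceptEncoding): bool
{
    return (bool) preg_match('/\bgzip\b/i', $acceptEncoding);
}

/** Send the compressed copy when the client can handle it, the plain file otherwise. */
function send_cached_page(string $source, string $acceptEncoding): void
{
    $gz = $source . '.gz';
    if (client_accepts_gzip($acceptEncoding) && is_file($gz)) {
        header('Content-Encoding: gzip');
        header('Vary: Accept-Encoding');
        readfile($gz);
    } else {
        readfile($source);
    }
}
```

In a real page this would be called as send_cached_page($file, $_SERVER['HTTP_ACCEPT_ENCODING'] ?? '').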
Content Pre-Generation
Here the cache generation code can be simpler than on-the-fly full-content caching, because we trigger the generation manually and create the entire website all at once. This means we don't have to handle locking for the case where several simultaneous requests try to write the same page's cache. However, it may generate pages that no one ever visits, the disk space used may be very large, and the time to generate an entire site's worth of content may also be very large.
On-Demand Caching
Instead of creating all the content of the site all at once, we can instead create it on-the-fly, by implementing a 404 error handler which generates the page the user was trying to access. On a 404 to an .html file, the error handler generates the page, then writes it out to that file. Future accesses to the page hit the static content, and are very fast.
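Roughly, such a handler could look like the sketch below. This is my own illustration, not code from the talk: target_path(), handle_404(), and the $render callback are hypothetical names, and the URL pattern is an assumption chosen to keep "../" and other odd paths out of the document root:

```php
<?php
// Sketch only: all names here are hypothetical.

/**
 * Map a request URI such as "/articles/42.html" to a file under $docRoot,
 * or return null if the request is not for a cacheable .html page.
 */
function target_path(string $docRoot, string $uri): ?string
{
    $path = parse_url($uri, PHP_URL_PATH);
    // Only plain path segments are allowed, so "../" can never escape $docRoot.
    if (!is_string($path) || !preg_match('#^(/[A-Za-z0-9_-]+)+\.html$#', $path)) {
        return null;
    }
    return rtrim($docRoot, '/') . $path;
}

/**
 * Generate the missing page with $render, store it as a static file, and
 * return the HTML. Returns null when the request is a genuine 404.
 */
function handle_404(string $docRoot, string $uri, callable $render): ?string
{
    $file = target_path($docRoot, $uri);
    if ($file === null) {
        return null;
    }
    $html = $render(parse_url($uri, PHP_URL_PATH));
    if (!is_dir(dirname($file))) {
        mkdir(dirname($file), 0755, true);
    }
    // Write to a temp file and rename() so readers never see a half-written page.
    $tmp = tempnam(dirname($file), 'gen');
    file_put_contents($tmp, $html);
    chmod($tmp, 0644); // tempnam() creates the file readable by its owner only
    rename($tmp, $file);
    return $html;
}
```

With Apache, I'd expect this to be wired up with something like an ErrorDocument 404 directive pointing at a script that calls handle_404() with $_SERVER['REQUEST_URI'], sends a 200 status, and echoes the result. Treat that wiring as an assumption to verify against your own server setup.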
Partial Page Caching with APC
APC also has the ability to create and manage shared memory regions, which provides easy access to a shared memory cache:
apc_store($key, $contents, $ttl);
$contents = apc_fetch($key);
apc_delete($key);
However, APC isn't a built-in extension, which limits its availability.
SQL Query Caching
When a search is slow, you can store the ids of the search query's results in a database table keyed by date and search id. Ilia also suggested limiting query results to 1000 items. More than that indicates that the user is probably doing too broad a search. (If it's good enough for Google, it's good enough for you.)
In-Memory Caching without APC
If APC is not available, you can use the built-in shmop module to gain the same benefits. However, shmop requires a little more code to use.
Browser Caching
Finally, you can have the browser cache the content by sending Expires, Last-Modified, and ETag headers (and also returning 304 responses to tell the browser that it can continue to use its cached copy). This reduces the data sent and server resource use to next to zero, but it's not guaranteed to work and you have only flimsy control over the cache age, since you're at the mercy of the browser's cache algorithm.
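As an illustration of the 304 logic, here is a small sketch of my own (not from the session). The function name conditional_response(), the md5-based ETag, and the one-hour default are assumptions, and the If-None-Match handling is simplified to a single exact tag:

```php
<?php
// Sketch only: the function name, ETag scheme, and default max age are hypothetical.

/**
 * Decide how to answer a request for $body, last changed at Unix time $mtime.
 * $server is an array shaped like $_SERVER. Returns [status code, headers];
 * the caller sends the headers and, for a 200, the body.
 */
function conditional_response(string $body, int $mtime, array $server, int $maxAge = 3600): array
{
    $etag = '"' . md5($body) . '"';
    $headers = [
        'ETag'          => $etag,
        'Last-Modified' => gmdate('D, d M Y H:i:s', $mtime) . ' GMT',
        'Expires'       => gmdate('D, d M Y H:i:s', time() + $maxAge) . ' GMT',
    ];

    $ifNoneMatch     = $server['HTTP_IF_NONE_MATCH'] ?? null;
    $ifModifiedSince = $server['HTTP_IF_MODIFIED_SINCE'] ?? null;

    if ($ifNoneMatch !== null) {
        // When If-None-Match is present it takes precedence over If-Modified-Since.
        $fresh = trim($ifNoneMatch) === $etag;
    } elseif ($ifModifiedSince !== null) {
        $since = strtotime($ifModifiedSince);
        $fresh = $since !== false && $since >= $mtime;
    } else {
        $fresh = false;
    }

    return [$fresh ? 304 : 200, $headers];
}
```

A page would call this with $_SERVER, send each header with header(), and on a 304 set the status (for example with http_response_code(304)) and skip the body entirely.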