Optimizing APC for Drupal

APC is the Alternative PHP Cache, an opcode cache for PHP, or as its developers describe it: "APC is a free, open, and robust framework for caching and optimizing PHP intermediate code."  If your eyes just glazed over or you can't make sense of that last quote, you may now walk away.

Note: This article is specifically about APC's function as an opcode cache. As has been noted in the comments, APC can also be used as an object cache; please stay tuned for an article about that usage.

PHP is a high level language.  Like all high level languages, it has to be translated into a lower-level form before the machine can run it.  Content management systems like Drupal are made up of hundreds, if not thousands, of PHP scripts, and the modular nature of such systems means that each page load might actually use hundreds of these scripts.  Unlike languages that are compiled once ahead of time, PHP is compiled on the fly; that is to say, each script is read and converted into bytecode (opcodes) every time it is executed.  This is where APC comes in: APC can keep a copy of the compiled script in memory and ready to go, greatly reducing the amount of resources needed to (for example) render a Drupal page.  (It's worth noting that some languages do this automatically: Python, for example, saves a compiled version of an imported module to disk alongside the source and reuses it on later loads.)

I'm going to jump ahead to the configuration of APC.  If you need to install APC, there are myriad tutorials elsewhere on the internet.  I definitely recommend installing it via PECL.  For the rest of this article, I'm assuming that you have a strong command of server administration and PHP configuration, and that you've installed APC correctly.

A Word of Warning
The default settings for APC may or may not be appropriate for your particular server.  You might decrease your server's performance dramatically if you don't configure APC correctly.  I once had a client who was experiencing 15-second page load times on a fresh install of Apache; an incorrectly configured APC was found to be the culprit.

Things that can cause APC to degrade performance:

  • Not enough memory:  If APC has only 8MB of memory allocated to it, and every page loads 12MB of PHP scripts, then APC will have to flush its cache on every page load, causing extremely high overhead and very slow page loads. See the fine-tuning information later in this article for hints on how to allocate enough memory.
  • Memory Fragmentation:  When APC releases memory from the cache, it leaves odd-sized little holes in memory that may be difficult for the system to reuse.  APC does not manage memory fragmentation in any way.  See the fine-tuning information later in this article for hints on how to manage fragmentation via setting TTL.

To fine-tune your installation of APC, you'll need to find the apc.php file that ships with APC and copy it somewhere you can reach with your web browser.  Edit the file and set a username and password so you can access the advanced features.

Important configuration directives (these live in your php.ini, or perhaps in apc.ini, depending on your install.  Again, I'm assuming you have a handle on this.  If you can't find your APC configuration, you will find clues in phpinfo() output):

apc.enabled=1
This enables APC.  Pretty obvious, but very important.  :-)  Handy for disabling APC if you've caused performance degradation on your live web server.  (You are doing this on a test server, aren't you?)

apc.shm_segments=1
apc.shm_size=32
These values control how much memory is allocated to APC.  The above values will allocate one 32MB segment of memory for PHP caching.

A method for determining how much memory to allocate goes like this:  Set APC for a fairly high amount of memory (say, 64MB) and restart your web server.  In your browser, open every PHP based web application available on your server (more on this later if you only want to use APC for certain applications), and use as many functions of each application as you can.  For a Drupal installation, load the modules page, the content types admin page and a few other admin screens.  Then visit your apc.php and it will tell you how much memory has been used.  You want to set APC so it has slightly more memory allocated than your applications will ever use.  If APC runs out of memory, it will flush its cache and rebuild, and the unfortunate websurfer who unwittingly causes that situation will have to wait for it, and will experience painfully slow page load times.

On some shared hosts, you will only be able to allocate a small amount of memory with apc.shm_size, and this is where apc.shm_segments comes in handy:  you can allocate four 8MB segments to get 32MB of cache memory.
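
As a rough sketch of the two sizing situations described above (the numbers come from this article, not from measurements of your server), the measuring pass and the shared-host workaround might look like this.  The values are written as bare megabyte counts to match the rest of this article; newer APC releases expect a unit suffix such as 64M, so treat the exact syntax as an assumption to check against your version.

```ini
; Measuring pass: deliberately generous, then check apc.php for real usage
apc.shm_segments=1
apc.shm_size=64

; Shared host that caps the segment size: four 8MB segments, 32MB in total.
; Use these two lines instead of the pair above, not in addition to them.
;apc.shm_segments=4
;apc.shm_size=8
```

Once apc.php shows how much memory your applications really use, shrink the measuring value to slightly more than that figure.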

apc.cache_by_default=1
apc.filters=
These settings are the key to managing which applications actually use the APC cache.  By default, APC will cache every PHP script it encounters, which is appropriate for many situations.  I administer a web server that hosts many sites for many different clients.  The clients are free to install whatever PHP scripts they desire, and some even install their own content management systems.  This, of course, means I can't know in advance what will be running.  My goal on that server is to provide the best Drupal performance for my clients who are paying a premium for Drupal hosting.  I also don't want to cause poor performance by overflowing the cache memory every time I use phpMyAdmin.

The apc.filters directive is much misunderstood, as the documentation is a little bit vague on how it works.  As with almost all filtering activities, I prefer an inclusive rather than an exclusive approach:  Set apc.cache_by_default to 0 and then specify which scripts can be cached, rather than telling APC "cache all scripts except the ones specified here".  apc.filters accepts a comma-separated list of POSIX style regular expressions, each preceded by a plus or minus sign which denotes "include" or "exclude".  My simple configuration, which instructs APC to only cache scripts in directories whose names include the string "drupal6", is as follows:

apc.cache_by_default=0
apc.filters="+drupal6"

And it's that simple.  Now my clients can't affect APC memory usage by installing other PHP scripts (unless they install them in a directory whose name contains "drupal6"!)
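
For comparison, the exclusive approach that I chose not to use might look like the sketch below.  The pattern "phpmyadmin" is only an illustration and not a path from my server; the minus prefix marks a pattern whose matches are not cached.

```ini
; Cache every script except those whose filename matches a "-" pattern
apc.cache_by_default=1
apc.filters="-phpmyadmin"
```

The drawback on a shared server is that every newly installed application is cached until someone remembers to add an exclusion for it, which is why the inclusive form above is easier to keep under control.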

apc.stat=1
If there is one configuration directive that can really impact performance positively, it's this one.  apc.stat=1 directs APC to check every PHP script at execution time to see whether the file has been updated since it was last read off the disk.  If you set apc.stat=0, you will realize a noticeable performance gain, as APC no longer has to "stat" each file to make sure it hasn't been updated.  This is highly recommended for production servers, but you must remember to reset the cache (or restart Apache) whenever any PHP script is updated; otherwise the updates will not take effect.  In particular, when updating modules or Drupal core, you'll need to reset APC before the new scripts will be recognized.  Using this directive in conjunction with apc.filters can be highly effective: just don't cache scripts that are in development areas.  Remember that your hosting clients who edit their PHP scripts won't see their updates if you set apc.stat=0 and their web applications are being cached!
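
The trade-off above can be summed up in a minimal sketch, assuming you keep separate configuration files for a production server and a development server (that split is an assumption for illustration, not something APC requires):

```ini
; Production: skip the per-request stat.
; Reset APC or restart Apache after every code update.
apc.stat=0

; Development: keep the check so that edits show up immediately.
;apc.stat=1
```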

apc.ttl=0
This is another very important configuration directive, which controls the Time To Live of cached entries (in seconds).  Some packaged configurations ship with 7200 (2 hours), although the PHP manual lists the built-in default as 0, so check what your install actually uses.  With a value of 7200, any file that has sat in the cache for more than two hours since its last access can be released.  This expiration is a major cause of memory fragmentation in APC.  You can set this to zero to completely disable expiration functionality.  This may not be appropriate for your setup, as scripts will never expire from the cache, but it works well for my setup, as I restart my Apache server once per day for backups.  When used in conjunction with apc.stat=0, each PHP script is read off disk only once and resides in memory permanently, or at least until you reset APC or restart Apache.
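
Pulling the directives discussed in this article together, a production configuration along these lines is one possibility.  Treat it as a sketch that suits a server which is restarted daily, not as recommended values: the memory size and the "drupal6" pattern are specific to the examples above, and newer APC releases may want the size written with a unit suffix (for example 32M).

```ini
apc.enabled=1

; One 32MB segment; size this from what apc.php reports for your applications
apc.shm_segments=1
apc.shm_size=32

; Inclusive filtering: cache only scripts whose filename matches "drupal6"
apc.cache_by_default=0
apc.filters="+drupal6"

; Never stat files and never expire entries;
; reset APC or restart Apache after any code change
apc.stat=0
apc.ttl=0
```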