agaskar.com

admin page memory issue / WSOD with free tag taxonomies in Drupal

If you're using free tagging, you will eventually WSOD (White Screen Of Death) your admin "Manage Content" page. I say eventually because one of our projects has 64M of memory allocated to each PHP thread (a non-trivial amount, imho -- if you're wondering, it's set that high to permit large uploads) and it's WSOD-ing at 32K tags. Now, 32 thousand tags might seem high in the context of creating a useful taxonomy, but consider that flickr hit twenty million (presumably unique) tags in January 2008 after about 4 years of operation, so it's not absurd to think that this issue might affect you if you use free tagging.

After the jump, an explanation of why this page white-screens, and how to fix it.

Here's *why* it white-screens:
1. When you open the "Manage Content" page, Drupal tries to build a form select for ALL valid terms in the term_data table.
2. The taxonomy_form_all function starts calling the taxonomy_get_tree function for each and every one of these tags. taxonomy_get_tree caches its results so we don't pummel our database. Unfortunately, in this instance we're both hitting the DB with a ton of queries AND filling up child process memory like crazy -- Drupal makes a DB query for each and every term in our term_data table AND sticks those results in memory. Again, if you're using free tagging quite liberally, at some point this is just not going to work. On a dead stock install of PHP where memory_limit is set at 8M, my guess is you'd hit the limit at around 4000 records. Not a lot of tags, really, but already about 3950 too many for a useful drop-down select box.
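That 4000-record guess can be sanity-checked with some back-of-the-envelope arithmetic. This is only a rough sketch: it assumes memory use grows linearly with the number of tags and ignores whatever Drupal itself needs before it builds the select, so treat the result as an order-of-magnitude figure. The variable names are mine.

```shell
# Observed: the page white-screens at roughly 32,000 tags with memory_limit = 64M.
limit_bytes=$((64 * 1024 * 1024))
tags_at_failure=32000

# Rough per-tag cost (assumption: linear growth, no baseline overhead).
bytes_per_tag=$((limit_bytes / tags_at_failure))

# Apply the same per-tag cost to a stock 8M memory_limit.
stock_limit_bytes=$((8 * 1024 * 1024))
tags_at_stock_limit=$((stock_limit_bytes / bytes_per_tag))

echo "approx. bytes per tag: $bytes_per_tag"
echo "approx. tags that fit in 8M: $tags_at_stock_limit"
```

That works out to roughly 2 KB per tag, and about 4,000 tags before an 8M limit is exhausted.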

Now, we've got two obvious options. The first is to not cache the results of taxonomy_get_tree, which isn't the greatest idea, because (it's been a while, but) I believe this would adversely affect other taxonomy operations, forcing the script to hit the database again and again for taxonomy data that we might be working with on any particular view. If you've used Drupal at all, you know that you want to minimize database queries as much as possible, since Drupal is already pretty query-hungry.

The other option is just to not build a select field with thousands and thousands of options in it. That seemed the most sensible choice to me in this case. Accomplishing this is fairly simple:

1. Do a quick:

grep -r "module_invoke('taxonomy'," *

and look for "module_invoke('taxonomy','form_all',1)". Make sure that it's OK to skip free-tag vocabularies for these occurrences. It probably is, but you should consider your particular situation. The only places I was calling the taxonomy_form_all function were in the stock node.module that comes with Drupal. On my 5.2 version they were on lines 1364 (the "Manage Content" index) and 2571 (the "Advanced Search" page). Since 1) a select with 32000 elements was not going to be useful to admins using the "Manage Content" page and 2) I have tag search for my free tags instead of an "Advanced Search" page, it's not a big deal to exclude free tags from my taxonomy select forms.

2. If you've decided you don't need free tags in your taxonomy select field, add the following line to taxonomy_form_all (in modules/taxonomy/taxonomy.module), right after the function declaration:

function taxonomy_form_all($free_tags = 0) {
  // Added line: always exclude free-tag vocabularies, whatever the caller passed.
  $free_tags = 0;
  ...
}

By adding this assignment, we override whatever the caller passed in, so no script can request a form field that grabs all the free tags. This is much easier to maintain than changing every instance of "module_invoke('taxonomy','form_all',1)". After adding that line, you should only see non-free-tag taxonomies in your search drop-downs, and your pages should no longer WSOD.
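If you still want to audit the call sites from step 1 on another Drupal version, note that the line numbers will differ and the spacing inside module_invoke(...) may not match your search string exactly. Here's a slightly more forgiving grep as a sketch. It runs against a throwaway sample directory whose contents I made up for the demonstration, not a real Drupal tree, so point it at your own checkout instead.

```shell
# Throwaway sample standing in for a Drupal checkout (file contents are invented).
demo_dir=$(mktemp -d)
mkdir -p "$demo_dir/modules/node"
cat > "$demo_dir/modules/node/node.module" <<'EOF'
<?php
$a = module_invoke('taxonomy', 'form_all', 1);
$b = module_invoke('taxonomy','form_all',1);
$c = module_invoke('taxonomy', 'get_tree', 2);
EOF

# -r recurses, -n prints line numbers, -E enables extended regexes.
# " *" allows zero or more spaces after the comma.
matches=$(cd "$demo_dir" && grep -rnE "module_invoke\('taxonomy', *'form_all'" modules)
echo "$matches"

# Clean up the sample directory.
rm -rf "$demo_dir"
```

Each hit is printed as file:line:text, which makes it easy to check every caller before deciding that dropping free tags is safe.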

It would've been nice to have an 'include free tags' switch in core so that end-users could fix this on their own without hacking core, but realistically this is a change you're only going to make once.

Ideally, the taxonomy search field should be split into a drop-down for non-free-tag categories and a search field with auto-complete for free-tag categories. That way you'd get the best of both worlds without overly taxing your server for what's an edge case for many people (I've yet to use the category search filter).