Here are some performance tuning tips and instructions for setting up a very performant Drupal 8 Commerce site using Varnish, Redis, Nginx and MySQL. I’ve got this setup running nicely for at least 13,000 concurrent users and it should scale well past that.
FYI, we also have a Drupal 7 Commerce Performance Tuning guide here.
For Varnish, you’ll need some Drupal-specific config as well as some extra config to work nicely with BigPipe caching. These are standard for Varnish and Drupal and not specific to Commerce.
You’ll want to set up the Purge and Varnish Purge modules to handle tag-based cache invalidation. Nothing here is unique to Commerce, so you can follow the standard instructions. You will, however, want to make sure your pages actually are cached, as modules or small misconfigurations can often make a page uncacheable. To work nicely with Varnish, you want the entire page to be cacheable so your web server doesn’t even get hit. An underused module that I find very helpful is Renderviz, which shows you a 3D breakdown of which cache tags are attached to which parts of the page and can help you identify the problem parts. I run renderviz('max-age', '0') to show me anything that can’t be cached. Usually the parts you find can be corrected and made cacheable.
For example, in a recent round of performance testing, I found that a newsletter signup form at the bottom of every page had an overly aggressive Honeypot setting, which rendered every page uncacheable. Changing the settings to apply only to the forms that need them, as well as correcting a language selector, turned tons of uncached pages into cacheable ones. Now these pages return in under 10 ms and put zero load on my web servers or database.
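A rough way to spot uncacheable pages from the outside is to look at the response headers. The sketch below is only an illustration: `parse_cache_control` and `looks_cacheable` are hypothetical helper names, the sample header values are typical of what Drupal sends but are assumptions here, and the rules are a simplified, conservative approximation of how a shared cache such as Varnish decides what to store.

```python
def parse_cache_control(value):
    """Split a Cache-Control header into a dict of lowercase directives."""
    directives = {}
    for part in value.split(","):
        part = part.strip().lower()
        if not part:
            continue
        name, _, arg = part.partition("=")
        directives[name.strip()] = arg.strip().strip('"')
    return directives


def looks_cacheable(headers):
    """Return True if a shared cache could plausibly store this response."""
    # Header names are case-insensitive, so normalise them first.
    normalised = {k.lower(): v for k, v in headers.items()}
    # A response that sets a cookie is normally not stored by a shared cache.
    if "set-cookie" in normalised:
        return False
    directives = parse_cache_control(normalised.get("cache-control", ""))
    if any(d in directives for d in ("private", "no-cache", "no-store")):
        return False
    # s-maxage takes precedence over max-age for shared caches.
    # A missing TTL is treated as uncacheable, which is deliberately strict.
    ttl = directives.get("s-maxage", directives.get("max-age", "0"))
    return ttl.isdigit() and int(ttl) > 0


# Example header values (assumptions, not captured from a real site).
cached_page = {"Cache-Control": "max-age=3600, public"}
uncached_page = {"Cache-Control": "must-revalidate, no-cache, private"}
print(looks_cacheable(cached_page))
print(looks_cacheable(uncached_page))
```

Feeding it the headers of your busiest URLs gives a quick list of pages worth inspecting more closely with Renderviz.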
Use the most modern version of PHP you can, preferably the latest stable. Never ever ever use PHP 5, which is terrible, terrible, terrible. Otherwise, make sure you have sufficient memory and allowed threads, and that will cover most of your PHP tuning. This is almost certainly the most resource-heavy part of your Drupal stack, but it is also easy to scale horizontally, pretty much indefinitely. Also, the more you can make use of Varnish, the less this will get used.
For Nginx, most of the tuning is just making sure you can handle the number of connections. You may need to up the file limit...
...of your web user to allow for more than 1024 connections per nginx instance.
A Commerce site is usually more write-heavy than your standard site, as your users create lots of "content" (aka carts and orders). This will usually change your MySQL config a bit, although the majority of your queries will still be reads. A pretty simple way to tune your site is to run...
...against it after getting some real traffic data for at least a couple of days, or simulating high traffic. Its recommendations will get you a pretty good setup.
There is one other VERY important thing you need to do: change your transaction isolation level from REPEATABLE-READ to READ-COMMITTED (in my.cnf, that is transaction-isolation = READ-COMMITTED under [mysqld]). REPEATABLE-READ, the MySQL default, is much too aggressive at locking to work well with most Drupal sites, especially anything write-heavy. Without this change you will suffer from constant deadlocks, even at fairly low traffic levels. Frankly, I think this should be a flag in the status page, but my patch hasn’t gotten any traction.
For the cache backend, there is nothing special here, but you are going to want to use a separate caching option. It could be Memcache, Redis or even just a separate MySQL database. Redis is nice and fast, but the biggest gain is just splitting your cache away from the rest of your database so you can scale each of them more easily.
There are a few specific patches that will be a great help to your performance.
_list cache tag invalidation
Every entity type has an entity_type_list cache tag, which gets invalidated any time an entity of that type is added or changed, so that lists of those entities get rebuilt. This happens a LOT, but the invalidation itself is a relatively simple query:
UPDATE cachetags SET invalidations = invalidations + 1 WHERE tag = 'my_entity_list';
This is an update, which is a blocking query: nothing else can edit this row while it is running. That wouldn’t be so bad, except...
This query often gets run as part of a larger task, in our case placing an order. A big task like this runs in a transaction, which basically means none of its queries are final until the whole thing commits, so everything can be rolled back if something goes wrong. It also means this row stays locked for the whole duration of the transaction, not just the short time it takes this little query to run. If the invalidation happens near the start of the transaction, a query that would take 0.002 seconds can end up holding its lock for 0.500 seconds, for example. Now, if more than 2 of these happen per second, they start to back up and build a queue, which just keeps getting longer and longer until we start returning timeouts. Since this query is part of the bigger order transaction, it stops the whole order from being processed and can bring your checkout flow to a halt.
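To make the arithmetic concrete, here is a back-of-the-envelope sketch in Python. The function name `backlog_after` and the rate of 3 invalidations per second are made up for illustration. It assumes a single row lock, so queries for the same tag run strictly one at a time, and evenly spaced arrivals, which is a deliberate simplification of real traffic.

```python
def backlog_after(seconds, arrivals_per_second, lock_hold_seconds):
    """Estimate how many invalidation queries are waiting after `seconds`.

    Assumes one row lock (queries for the same tag run one at a time)
    and evenly spaced arrivals.
    """
    service_rate = 1.0 / lock_hold_seconds  # queries completed per second
    growth_per_second = arrivals_per_second - service_rate
    # The queue cannot be negative: spare capacity just means no backlog.
    return max(0.0, growth_per_second * seconds)


# Lock hold times from the text: 0.002 s on its own, 0.500 s inside a
# long transaction. The arrival rate of 3 per second is an assumption.
for hold in (0.002, 0.500):
    waiting = backlog_after(60, arrivals_per_second=3, lock_hold_seconds=hold)
    print(f"lock held {hold:.3f}s: about {waiting:.0f} queries waiting after a minute")
```

With the lock held for 0.500 seconds, the row can serve only 2 queries per second, so every extra arrival per second adds to a queue that never drains while the load lasts.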
Thankfully, the above-listed patch allows these cache invalidations to be deferred so they don’t block large transactions. I think the update query for invalidating cache tags is still a bottleneck, as you could eventually hit it even without these long transactions, but at this point that problem is more hypothetical than something you will practically encounter.
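The general idea of deferring can be sketched in a few lines of Python using an in-memory SQLite database. This is not the patch's actual implementation: `DeferredInvalidator` is a hypothetical class, and the cachetags and orders tables are toy stand-ins for illustration only. The point is simply that the counter update runs in its own short transaction after the order has been committed, so the row is locked only briefly.

```python
import sqlite3


class DeferredInvalidator:
    """Collects cache tags during a transaction and bumps them after commit."""

    def __init__(self, connection):
        self.connection = connection
        self.pending = set()

    def invalidate(self, tag):
        # Only remember the tag; no row is touched (or locked) yet.
        self.pending.add(tag)

    def flush(self):
        # Run the short counter updates in their own brief transaction.
        with self.connection:
            for tag in sorted(self.pending):
                self.connection.execute(
                    "UPDATE cachetags SET invalidations = invalidations + 1 "
                    "WHERE tag = ?",
                    (tag,),
                )
        self.pending.clear()


# Toy schema standing in for the real tables (an assumption for the demo).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE cachetags (tag TEXT PRIMARY KEY, invalidations INTEGER)")
conn.execute("CREATE TABLE orders (id INTEGER PRIMARY KEY, total REAL)")
conn.execute("INSERT INTO cachetags VALUES ('my_entity_list', 0)")
conn.commit()

invalidator = DeferredInvalidator(conn)

# The long "place order" transaction: the invalidation is requested early,
# but the cachetags row is not written until the order has been committed.
with conn:
    invalidator.invalidate("my_entity_list")
    conn.execute("INSERT INTO orders (total) VALUES (?)", (49.99,))
invalidator.flush()

count = conn.execute(
    "SELECT invalidations FROM cachetags WHERE tag = 'my_entity_list'"
).fetchone()[0]
print(count)
```

The trade-off in a design like this is that a cached list can stay stale for the short window between the order commit and the flush.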
Add index to profiles
As you start getting more and more customers and orders, you will get more profiles. Loading them, especially for anonymous users, will really start to slow down and become a bottleneck. The listed patch simply adds an index to prevent that. Please note, this is a patch for the Profile module, not Commerce itself.
Make language switcher block cacheable
This issue is unfortunately on hold pending some large core changes, but once it does land, it will allow the language switcher block to be used without worrying about it blocking full page caching.
You should be able to scale well above 10,000 concurrent users with these tips. If you encounter any other bottlenecks or bugs, I’d love to hear about them. If you want help with some performance improvements from Acro Media and yours truly, feel free to contact us.