Why we switched from Apache to LiteSpeed


Posted on Aug 11, 2018 | By Hosting4Real

Earlier this year we decided to start switching our infrastructure from Apache to LiteSpeed. The change happened over several months, with our first trial in November last year.

At the end of April and the beginning of May, after extensive testing and very few compatibility problems, we enabled the new web server across our infrastructure.

This post is about why we made the switch, which issues we saw along the way, and which issues the switch resolved.

Why we wanted to switch

One of the main reasons we wanted to switch from Apache to LiteSpeed was an ongoing issue with Apache when running mpm_event and lsphp in a CloudLinux environment.

Whenever we restarted the web server, connections could in some cases be blocked for 5-15 seconds. As you can imagine, that is a long time, and we wanted to solve it by switching to a web server that does true graceful restarts.

Another issue we saw under Apache came from using a large set of ModSecurity rules in our web application firewall: we would sometimes see high CPU consumption in the web server processes themselves. During large POST requests, the CPU could go to 100% on a single core while the request was inspected against our security rules.

The issue was more apparent on some servers compared to others, and this was mainly due to their traffic pattern and the type of websites they would host.

The third issue is scalability, especially for high-traffic websites.

At Hosting4Real we host some rather big websites in terms of traffic. Some of these keep very long-running connections open towards end-users, and Apache usually isn't a great solution for that.

Apache processes requests by assigning a new "worker" (thread) that handles the connection for its whole lifetime. This works great in small-scale environments, or where your connections are rather "normal" in terms of time spent.

LiteSpeed is event-driven, which makes it a lot better at handling thousands of concurrent connections. That is one of the reasons we opted for LiteSpeed over other web server solutions.

Why we chose LiteSpeed

There are plenty of alternatives to Apache; some are free, and others are paid software packages.

We went with LiteSpeed because it acts as a drop-in replacement for Apache (to some extent) and because it's officially supported by cPanel.

It offers features such as QUIC, TLSv1.3, HTTP/2 Push, Brotli support and ESI caching, and has caching plugins for systems like WordPress, Joomla, PrestaShop, Magento, XenForo, Drupal, and MediaWiki.

Features such as QUIC and Brotli allow us to deliver websites even faster to visitors - and this alone brings a lot of benefits to our customers.

The main reason we switched to LiteSpeed was that it would resolve two ongoing issues:

WAF consuming a lot of CPU

Even though we're using the same ruleset, we see far fewer resources being used to inspect the requests to our servers. This greatly lowers the resource utilization of the web server itself.

Blackouts during restarts

Apache had the issue of dropping connections during a graceful restart - this happens because Apache, in reality, can't do graceful restarts. LiteSpeed solves the problem by spawning a new LiteSpeed process first, and only once that is done does it start shutting down the old one.

What issues we experienced during/after the switch

LiteSpeed is a drop-in replacement for Apache, but there are specific features that aren't supported for various reasons.

Of all the sites we host, only a very few actually experienced a problem from the switch. These problems come from bugs in the website software itself, but since the bugs are not triggered under Apache, because Apache has some fundamental design problems, they are errors that would be hard to catch.

Bug when using [OR] flag in rewrite rules

When you use rewrite rules on your website, you sometimes want to validate specific conditions. One example could be that the country is Denmark and the IP is not 8.8.8.8, which you'd write something like this:

RewriteCond %{ENV:GEOIP_COUNTRY_CODE} ^(DK)$
RewriteCond %{REMOTE_ADDR} !=8.8.8.8
RewriteRule ^(.*)$ /en/ [L,R=302]

If you want a different condition, such as the country being Denmark or the IP being 8.8.8.8, you'd change the RewriteCond lines to something like this:

RewriteCond %{ENV:GEOIP_COUNTRY_CODE} ^(DK)$ [OR]
RewriteCond %{REMOTE_ADDR} =8.8.8.8
RewriteRule ^(.*)$ /da/ [L,R=302]

In this case, only one of the conditions has to be true to cause the 302 redirect to /da/.

Now this works fine in both Apache and LiteSpeed. However, in PrestaShop, if you happen to use a single store with media servers enabled, the generated rules contain a bug that only triggers when using LiteSpeed.

The code that PrestaShop generates looks something like this:

RewriteCond %{HTTP_HOST} ^cdn-static.mediaservers.prestashop-domain.com$ [OR]
RewriteRule ^([0-9])(\-[_a-zA-Z0-9-]*)?(-[0-9]+)?/.+\.jpg$ %{ENV:REWRITEBASE}img/p/$1/$1$2$3.jpg [L]
RewriteCond %{HTTP_HOST} ^cdn-static.mediaservers.prestashop-domain.com$ [OR]
RewriteRule ^([0-9])([0-9])(\-[_a-zA-Z0-9-]*)?(-[0-9]+)?/.+\.jpg$ %{ENV:REWRITEBASE}img/p/$1/$2/$1$2$3$4.jpg [L]
RewriteCond %{HTTP_HOST} ^cdn-static.mediaservers.prestashop-domain.com$ [OR]
RewriteRule ^([0-9])([0-9])([0-9])(\-[_a-zA-Z0-9-]*)?(-[0-9]+)?/.+\.jpg$ %{ENV:REWRITEBASE}img/p/$1/$2/$3/$1$2$3$4$5.jpg [L]
RewriteCond %{HTTP_HOST} ^cdn-static.mediaservers.prestashop-domain.com$ [OR]
RewriteRule ^([0-9])([0-9])([0-9])([0-9])(\-[_a-zA-Z0-9-]*)?(-[0-9]+)?/.+\.jpg$ %{ENV:REWRITEBASE}img/p/$1/$2/$3/$4/$1$2$3$4$5$6.jpg [L]
RewriteCond %{HTTP_HOST} ^cdn-static.mediaservers.prestashop-domain.com$ [OR]
RewriteRule ^([0-9])([0-9])([0-9])([0-9])([0-9])(\-[_a-zA-Z0-9-]*)?(-[0-9]+)?/.+\.jpg$ %{ENV:REWRITEBASE}img/p/$1/$2/$3/$4/$5/$1$2$3$4$5$6$7.jpg [L]
RewriteCond %{HTTP_HOST} ^cdn-static.mediaservers.prestashop-domain.com$ [OR]
RewriteRule ^([0-9])([0-9])([0-9])([0-9])([0-9])([0-9])(\-[_a-zA-Z0-9-]*)?(-[0-9]+)?/.+\.jpg$ %{ENV:REWRITEBASE}img/p/$1/$2/$3/$4/$5/$6/$1$2$3$4$5$6$7$8.jpg [L]
RewriteCond %{HTTP_HOST} ^cdn-static.mediaservers.prestashop-domain.com$ [OR]
RewriteRule ^([0-9])([0-9])([0-9])([0-9])([0-9])([0-9])([0-9])(\-[_a-zA-Z0-9-]*)?(-[0-9]+)?/.+\.jpg$ %{ENV:REWRITEBASE}img/p/$1/$2/$3/$4/$5/$6/$7/$1$2$3$4$5$6$7$8$9.jpg [L]
RewriteCond %{HTTP_HOST} ^cdn-static.mediaservers.prestashop-domain.com$ [OR]
RewriteRule ^([0-9])([0-9])([0-9])([0-9])([0-9])([0-9])([0-9])([0-9])(\-[_a-zA-Z0-9-]*)?(-[0-9]+)?/.+\.jpg$ %{ENV:REWRITEBASE}img/p/$1/$2/$3/$4/$5/$6/$7/$8/$1$2$3$4$5$6$7$8$9$10.jpg [L]
RewriteCond %{HTTP_HOST} ^cdn-static.mediaservers.prestashop-domain.com$ [OR]
RewriteRule ^c/([0-9]+)(\-[\.*_a-zA-Z0-9-]*)(-[0-9]+)?/.+\.jpg$ %{ENV:REWRITEBASE}img/c/$1$2$3.jpg [L]
RewriteCond %{HTTP_HOST} ^cdn-static.mediaservers.prestashop-domain.com$ [OR]
RewriteRule ^c/([a-zA-Z_-]+)(-[0-9]+)?/.+\.jpg$ %{ENV:REWRITEBASE}img/c/$1$2.jpg [L]

What happens here is that each RewriteCond matches the host of the media server but ends with an [OR], even though no further condition follows it.

Apache blindly ignores this; basically, the rule triggers as if the RewriteCond didn't exist at all. LiteSpeed, on the other hand, actually reads the rule as it's supposed to, so the rules never match if you request an image via the URL of the shop itself. The rewrites only trigger when the host is cdn-static.mediaservers.prestashop-domain.com.

A simple solution would be to not add the RewriteCond at all. However, since PrestaShop uses it in a multistore setup, you could just as well add the shop's domain to the RewriteCond list, which is what happens when multistore is enabled.
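
As a rough sketch of that second option (not the actual patch, and with prestashop-domain.com standing in as an assumed shop domain), one of the rules could look something like this once the [OR] has a second condition to chain to:

```apache
# Sketch only: the shop domain on the second line is a placeholder,
# not taken from PrestaShop's generated output.
# The [OR] now chains to a real second condition instead of dangling.
RewriteCond %{HTTP_HOST} ^cdn-static.mediaservers.prestashop-domain.com$ [OR]
RewriteCond %{HTTP_HOST} ^prestashop-domain.com$
RewriteRule ^c/([a-zA-Z_-]+)(-[0-9]+)?/.+\.jpg$ %{ENV:REWRITEBASE}img/c/$1$2.jpg [L]
```

With both hosts listed, the rewrite applies to requests on either the media server or the shop domain, and the last condition in the chain carries no [OR].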

We submitted a pull request to the PrestaShop core explaining the issue, and it has been merged.

We decided to patch the PrestaShop codebase since the code was invalid even under Apache; it only worked because Apache ignored the stray condition. The best solution is therefore to fix the software that caused the problem, in this case PrestaShop.

Parsing [L] flags

The [L] flag in Apache is supposed to stop processing rewrite rules, but the functionality differs (unintentionally) based on which context you're operating in. If you put your rewrite rules in your Apache vhost, they work differently than if you put them in your .htaccess file. It is a bug within the Apache software that the two are handled differently; however, Apache currently does not have a bugfix for it (hopefully some day?).

The issue is that the spec of the rewrite flags explains that processing should stop when it meets an [L]. Since LiteSpeed follows the spec, the bug triggers in the .htaccess context just as it would in the vhost context on Apache.
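
To illustrate the difference, here is a minimal sketch with made-up paths (/old, /new and /final are placeholders, not rules from PrestaShop), assumed to sit in an .htaccess file in the document root:

```apache
RewriteEngine On

# A request for /old is rewritten to /new, and [L] ends this round of processing.
RewriteRule ^old$ /new [L]

# In .htaccess context Apache runs the rules again on the rewritten URL,
# so this rule then matches and /old ends up at /final.
# A server that stops at the first [L], as described above for LiteSpeed,
# would serve /new instead.
RewriteRule ^new$ /final [L]
```

On Apache 2.4, rules that must stop for good can use the [END] flag instead of [L], which also prevents further rewrite processing in .htaccess context. Whether LiteSpeed treats [END] the same way is something to verify rather than assume.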

Now, one could argue that this is the web server's fault for behaving differently despite being a drop-in replacement. However, the spec defines how this should function, and LiteSpeed follows it as it should. So, in reality, the issue is that Apache does not adhere to its own spec and, more importantly, that software, in this case PrestaShop, relies on broken functionality within the web server.

We're trying to work out a patch for PrestaShop to see if we can get it merged into one of the newer releases - and at the same time, we've asked LiteSpeed if they can add a feature to "break" the spec in the .htaccess context for people who really want this broken behaviour.

Conclusion

Generally the switch went smoothly. It has lowered the overall resource usage of our servers and at the same time improved performance for all of our customers.

It allows us to bring better caching capabilities to our customers and to offer features such as HTTP/2 Push, QUIC and TLSv1.3.

Looking at stability, overall uptime has improved over the last few months, because we avoid the blackouts and no longer face the bug of lsphp sometimes crashing the main process.

At the same time, it allowed us to contribute to open source projects such as PrestaShop.

The switch did introduce a few issues, and as much as we hate that this might affect a single website, it brings benefits to the remaining sites we host - and we have to look at the bigger picture in this case.
