We're making your website a little faster!


Posted on Sep 6, 2020 | By Hosting4Real

At Hosting4Real we're always trying our best to push performance even further, and we're excited to have begun the migration of all hosting accounts towards new infrastructure.

Over the years, our hardware requirements have grown from servers with dual-core CPUs, 8 gigabytes of memory and 2x500GB spinning drives to servers with E5-1650v4, E-2136, EPYC 7351P or EPYC 7371 CPUs, 128GB RAM and 2x1.92TB NVMe SSDs, with the help of OVH.

As time moved on, we started to invest in some of our own hardware, mainly for our backup system as well as a Proxmox cluster for our own systems such as our website, billing system, analytics etc.

Recently we decided to take a bigger step and invest heavily in new equipment to power our web hosting infrastructure.

Our new servers come with the following specs:

Model: GIGABYTE R152-Z31
CPU: AMD EPYC 7402P (24 cores, 48 threads)
RAM: 128GB DDR4 3200MHz ECC memory
Disks: 2x1.92TB Samsung PM983 NVMe SSDs
NIC: dual 10G/25G Mellanox NIC

That gives us a total of 480 CPU threads, 1280GB of RAM and roughly 18TB of usable NVMe SSD storage. By comparison, our old servers (server8 to server16) had a total of 168 threads combined.
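As a quick sanity check, the totals above can be reproduced from the per-server specs. The post does not state how many new servers there are, so the count of 10 below is inferred from 480 / 48, and the mirrored-disk layout is likewise an assumption rather than something confirmed here.

```python
# Sanity check of the fleet totals quoted in the post.
# ASSUMPTION: 10 servers (inferred from 480 threads / 48 threads per server).
# ASSUMPTION: the two NVMe drives in each server are mirrored.
SERVERS = 10
THREADS_PER_SERVER = 48
RAM_GB_PER_SERVER = 128
DISK_TB = 1.92

total_threads = SERVERS * THREADS_PER_SERVER
total_ram_gb = SERVERS * RAM_GB_PER_SERVER
# One drive's worth of capacity per mirrored pair, before filesystem overhead.
mirrored_tb = SERVERS * DISK_TB

OLD_THREADS = 168  # server8 to server16 combined, as stated in the post
thread_ratio = total_threads / OLD_THREADS

print(total_threads, total_ram_gb, round(mirrored_tb, 1), round(thread_ratio, 2))
```

Under those assumptions the raw mirrored capacity comes out slightly above the "roughly 18TB usable" figure, which is plausible once formatting and reserved space are taken off, and the new fleet has a bit under three times the thread count of the old one.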

While we won't utilize all the servers from the beginning, we at least have the available capacity whenever required.

Some servers go from a 6-core/12-thread config to a 24-core/48-thread config, while others "only" go from 16 cores/32 threads to 24 cores/48 threads.

The core count in itself is exciting because it allows for better concurrency and scalability on a given server. It also allows us to consolidate some systems, improving resource utilization while still playing it safe.

So how do we consolidate our servers?

We generally keep the number of customers relatively low on a given system, but at the same time, we don't want to under-utilize our hardware too much, because that would mean paying for hardware that isn't used optimally, and thus throwing away money (and increasing our costs).

In our specific case, we have some older 6-core/12-thread systems (server8 to server12). If we allocated 5 of our new physical machines to these 5 servers, we wouldn't really benefit from the additional core count.

For that specific reason, we're consolidating the 5 old servers into 3 new ones:

nlcp01: server8 + new customers
nlcp02: server9 + server11
nlcp03: server10 + server12

At this moment, server8 and server9 have been migrated to nlcp01 and nlcp02. As a reference, the CPU usage on those two machines (including a deep malware scan) looks like this:

[Graph: resource utilization of the new servers]

As you can see on nlcp02, the bump later in the graph is due to the malware scanner running! (We do real-time malware scanning and one deep scan every week.)

When we select which two servers will be migrated into one, we look at the overall utilization of the servers over a longer period of time, and we'll make the right "fit" so we do not end up with one server being utilized a lot more than another.

In case we see a big difference between systems, we may shift individual customers around during migrations to balance out the load even more.
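The "fit" described above can be illustrated with a small sketch. This is not our actual tooling: the function name, the server names and the load values below are all hypothetical placeholders, and the sketch only covers the simple case of splitting four old servers into two balanced pairs.

```python
def best_split(loads):
    """Split four servers into two pairs whose combined loads are as close as possible.

    `loads` maps a server name to its long-term average load, in any
    consistent unit (for example, average busy threads).
    """
    names = sorted(loads)
    if len(names) != 4:
        raise ValueError("this sketch only handles exactly four servers")
    first, rest = names[0], names[1:]
    best = None
    # The first server must be paired with one of the other three,
    # so there are only three possible splits to compare.
    for partner in rest:
        pair_a = (first, partner)
        pair_b = tuple(n for n in rest if n != partner)
        gap = abs(loads[first] + loads[partner] - sum(loads[n] for n in pair_b))
        if best is None or gap < best[0]:
            best = (gap, pair_a, pair_b)
    return best[1], best[2]


# HYPOTHETICAL placeholder values, not measurements from our servers.
example_loads = {"old_a": 2.0, "old_b": 4.0, "old_c": 3.0, "old_d": 2.5}
pair_a, pair_b = best_split(example_loads)
print(pair_a, pair_b)
```

With these placeholder numbers, the lightest server ends up paired with the heaviest one, which is the intuition behind not putting the two busiest servers on the same machine.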

As mentioned earlier, nlcp03 will contain the servers server10 and server12, and nlcp02 will get server11 as well.

If we look at the resource utilization on server10 to 12:

[Graph: resource utilization of server10 to server12]

We can see in this particular example (and it repeats pretty much daily) that the usage on server11 is slightly higher than on server12. That is why we're pairing server10 with server12, rather than, for example, server11 with server12.

The average CPU usage of server11 is 33% of 12 threads throughout the day, while during the last 12 hours the average hits 41% of 12 threads.

Once server11 has been migrated onto nlcp02, nlcp02 should see an average CPU usage of about 15% of 48 threads.
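To make the thread arithmetic concrete, here is a minimal sketch that converts a utilization percentage from one thread count to another. The helper name is made up for illustration, and the conversion deliberately ignores the per-core speed difference between the old and new CPUs.

```python
def rescale_utilization(percent, from_threads, to_threads):
    """Express a CPU utilization percentage on a machine with a different thread count."""
    busy_threads = percent / 100 * from_threads
    return busy_threads / to_threads * 100


# server11's figures from the post, re-expressed on a 48-thread machine.
daily = rescale_utilization(33, 12, 48)
last_12_hours = rescale_utilization(41, 12, 48)
print(round(daily, 2), round(last_12_hours, 2))
```

On its own, server11 accounts for roughly 8 to 10% of a 48-thread machine. The rest of the expected 15% would presumably be the load already running on nlcp02 from server9, although that split is an assumption and is not broken out in the figures above.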

So while we consolidated servers, we still doubled the core count and thread count on the particular server which in the end benefits the users because we can allocate a higher resource limit to each individual customer.

The benefit of our new CPUs

We ended up choosing AMD EPYC 7402P CPUs because they offer an excellent price/performance ratio.

The 7402P has a turbo frequency of 3.35GHz. While this is lower than the E5-1650v4, which has an all-core turbo frequency of 3.8GHz, we still see a 7-11% performance improvement on a core-per-core basis, based on some benchmarks we've performed on both systems (both under load and idle).

Additionally the standardization of core-count allows us to generally balance the load better on our systems when we select where a specific customer will end up when they sign up for hosting.

Some WP benchmark numbers

To verify that the newer CPUs perform better despite the lower frequency, we decided to do a rather simple test:

We installed WordPress with a bunch of plugins, set up the two installations in exactly the same way, and performed a load test using ApacheBench (ab) to see what the requests per second would hit.

We ran the test with a concurrency of 10 clients and a total of 500,000 requests being performed against WordPress.

E5-1650v4: 251.74 requests per second average for 500k requests

EPYC 7402P: 281.37 requests per second average for 500k requests

While this test is far from scientific, it shows an 11.77% increase in requests per second handled. Bearing in mind that the frequency went from 3.8GHz to 3.35GHz (a decrease of 11.84%), the CPU architecture improvements over the years have clearly delivered a lot in terms of performance: we still see an improvement despite the lower clock speed!
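For anyone who wants to check the percentages, the arithmetic is sketched below using only the numbers quoted above. The last figure, requests per second per GHz, is a crude normalization added purely for illustration; it assumes throughput scales with clock speed, which real workloads only approximate.

```python
# Benchmark results and turbo frequencies as quoted in the post.
old_rps, new_rps = 251.74, 281.37  # E5-1650v4 vs EPYC 7402P
old_ghz, new_ghz = 3.8, 3.35

rps_gain = (new_rps / old_rps - 1) * 100    # increase in requests per second
clock_drop = (1 - new_ghz / old_ghz) * 100  # decrease in turbo frequency

# ILLUSTRATIVE ONLY: throughput per GHz, a rough proxy for work done per clock cycle.
per_ghz_gain = ((new_rps / new_ghz) / (old_rps / old_ghz) - 1) * 100

print(round(rps_gain, 2), round(clock_drop, 2), round(per_ghz_gain, 1))
```

Seen this way, the new CPU handles noticeably more requests per GHz than the old one, which is how a lower-clocked chip can still come out ahead.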

We also looked at some of the actual customer sites, both smaller and bigger ones, and we saw improvements of anywhere between 5 and 25% in the time it takes PHP to process a given request - since all sites work differently, the benefits may differ from site to site.

Conclusion

We want to give a better experience to everyone. One of the steps towards this is to increase the performance of individual requests, but we also need to increase the concurrency that can be handled during busy periods such as Black Friday or Christmas sales.

The migrations will take place over the coming months: customers on server8 to server12 will be migrated before the end of September, and customers on server13 to server16 will be migrated at the beginning of 2021.

We continue to look for ways to improve performance even further, and we're always eager to try to help our customers solve possible performance issues!

We also want to say thank you to all customers over the years for supporting us, and allowing us to grow together with you!

P.S. As we migrate servers with other CPU architectures to the new hardware, we'll be sure to provide more details on the improvements there!

Posted in: General