We have always tried to make 4RoadService fast to use, and especially so since our current design came online in 2015, but “fast” is subjective, and changes over time, so when Smashing Magazineannounced their Front-End Performance Challenge at the end of October, we knew we wanted to level-up our performance, so we set to work.
The homepage is usable 45% sooner than it used to be, and downloads 23%* less data. The most dramatic improvement was the homepage, but the other pages we tested had improvements as well.
* This number removes the banner for the “Featured Listing” at the bottom of the homepage. It changes frequently so so including it gives us inconsistent numbers that either hide or over-emphasize the improvements on the rest of the page.
The rest of this post is more technical than anything we’ve written before, and is aimed at web developers & performance geeks. Proceed at your own risk!
We’re optimizing for three views that are the most important to someone looking for repair service: the homepage, search results, and the listing detail page.
Figure 1: Performance before Optimization (Lower is better)
||First Meaningful Paint
||Perceptual Speed Index
||Total Download Size
Being a Cloudflare customer made some things pretty quick & easy:
- We turned on WebP conversion & Lossy compression for images in Cloudflare. While this shrinks image files, it could be better:
- PNGs and Gifs are converted to lossless WebP images, (only JPEGs get lossy compression from Cloudflare). Using lossy compression would save more bandwidth.
- Only filenames ending in .jpg, .jpeg, .png, .gif, and .img are WebP-ized, so images we manipulate on the fly with a PHP file need more trickery
- Turn on auto-minifaction of HTML, (CSS and JS are minified in our build system).
Cache Static Assets
Part 1: Assets that we create
We gave all static assets on 4RoadService.com a versioned URL using this strategy, (but not on WordPress), then set caching headers to tell browsers, (and other caches), to cache static assets for at least a year.
Part 2: User-Uploaded Assets
Repair companies with paid listings on 4RoadService can upload their company logo, a banner, and some photos. Since they don’t change at the same time as we make changes to the 4RoadService.com code they need their own versioning scheme. In our old system new versions of images didn’t get new URLs, so we changed that so whenever a new image is uploaded it gets a unique URL, and set far-future expiry headers on those as well.
Optimize the Critical Rendering Path
The Critical Rendering Path is the work the web browser does to render a web page, and it turns out we can tweak our web pages to make it happen quickly, but the “normal” way of building websites is pretty slow.
Async all the JS
asyncattribute to the script tag cleared that up and shaved 200ms off the time it took to reach the DOMContentLoaded event.
HTTP/2 Server Push things that block rendering, (hint: it’s styles)
The next biggest blocker to our CRP being speedy was our main CSS file. Even with HTTP/2’s multiplexing, there was still a period of time between when we started sending the HTML and when the browser requested the CSS file. HTTP/2 Server Push to the rescue!
HTTP/2 Server Push is fantastically simple to set up. If the web server supports it all that’s needed is adding a HTTP response header and the web server handles the rest for you. While it helped our rendering time, the difference wasn’t as dramatic as we had hoped, but it was an incremental improvement of 50-100ms.
Everything discussed up to this point improved the DOMContentLoaded event, and Lighthouse‘s First Meaningful Paint, First Interactive, and Consistently Interactive values but about a second. Depending on the page that’s loading that’s a 15% to 33% improvement. Not bad!
Once the browser starts receiving the web page the first several steps now progress very quickly. Most room for improvement still exists at both ends of the page load: our time-to-first-byte could be improved, (it’s just under 1/2 second), and on some pages there are large images that take a while to finish loading, (this is especially true on very large screens). To tackle these problems we can make our resized-on-the-fly images, (user-uploaded images), cacheable, and WebP-able, by Cloudflare, optimize our source images better, and speed up our server code. Because of the deadline on the Smashing Magazine challenge we focused on the first two.
What doesn’t seem worth doing (right now)
We could also break up our styles by media query and use several <link> tags with different media attributes. When a browser encounters a <link> to a stylesheet with a media query that’s not applicable it doesn’t block rendering for that file, so there may be gains, especially for people on mobile devices, however because the styles are fairly quick, and we would have to make guesses about which CSS files to send with a HTTP/2 Push, we’ll explore this later.
Optimize Those Last Few Images
Remember how we thought using lossy WebP would improve the download size of some images? Cloudinary’s Website Speed Test says that the gains could be big, especially for large PNGs like the orange truck on our homepage. Since the large images were taking 5-7 seconds to finish downloading on a very large screen we went ahead and optimized these in 2 steps.
Step 1: Optimize non-WebP images
Our build process already compresses our images, but we can do better. By tweaking the algorithms used for image optimization, then waiting forever while they run, (ahem, Guetzli), we shrunk the file sizes for everyone who can’t accept WebP images.
Step 2: Lossy WebP, and special URLs
We also added a step to our build process to create lossy WebP versions of every image. The size differences are incredible, for example, for the largest version of the orange truck on our homepage, the unoptimized PNG is 4.2MB, the optimized PNG is 1.1MB, and the WebP is 244KB. If you can use Lossy WebP, do.
These magical WebP images cause a problem, though: Not all browsers understand them. We could solve that by serving a WebP image if the browser sends images/web in its Accept HTTP header, but since CloudFlare caches our images it would cache the WebP version of a file and serve it to non-WebP-capable browsers, so we have to have separate URLs for WebP images, which means some image URLs change if the browser sends an image/webp Accept header. Once that happens both our optimized non-WebP images and our tiny, lossy, WebP images are cached in Cloudflare’s CDN for ultimate image speed.
Step 3: Compress user-supplied images as much as possible
Compressing the user-supplied images is more tricky. We resize some of them on-the-fly in PHP, and the compression tools in PHP don’t seem to be as good as the tools we use in our node-based build process. However, we can take advantage of Cloudflare’s compression. Yes, it’s not as good as the lossy WebP images we’re serving for our own assets, but it’s better than nothing. There was a hurdle, where Cloudflare doesn’t apply image compression to URLs ending in .php, so we re-wrote the URLs to look like image files and Cloudflare started compressing them for us. Problem solved-ish.
This is all we had time to do before the deadline for the challenge, but we have identified a few things to keep working on:
- Improve our time-to-first-byte: We’re working on one thing that should improve this a lot. There may also be ways to improve it by caching more, but some of our pages are customized for logged-in users, so we’ll have to work on caching pages while keeping customizations for logged-in users.
- Improve compression of user-provided images: We need to spend some time researching PHP compression systems.
- Resource Hints: We can tell the browser what outside services we’re going to connect to in HTTP headers, then it can get started on the connection without waiting to parse the HTML.
- Break up CSS and JS: This time around we decided there are bigger fish to try than breaking up our CSS & JS files to take advantage of HTTP/2’s parallelization, but doing this should yield a small reduction in load time.
- Service Workers, and Progressive-Web-App-izing 4RoadService: Some of 4RoadService relies on getting accurate, fresh, data from the server, but some things can be sped up with Service Workers and locally-cached data.
That’s as much as we snuck in under Smashing Magazine’s deadline. So, how did we do?
Figure 2: Performance After Optimization (Lower is better)
||First Meaningful Paint
||Perceptual Speed Index
||Total Download Size
That’s not too bad. This effort took less than half a week of developer time and produced very real improvements to our page load times, and reductions to the number of bytes we’re sending to our largely-mobile audience. In addition we have identified things that we can keep improving. Participating in the Smashing Magazine Front-End Performance challenge was a worthwhile exercise, not just to bring our skills up to speed on some new front-end performance practises, but to improve the experience of people using 4RoadService and let them find help quickly.