Bad Corporate Engineering Blogging: Image Comparison Edition

As a general rule, I like when companies keep an active engineering blog, particularly when they’re working in a space I’m interested in or have expertise in! Dan Luu has a great article on some companies with great blogs and how they’ve cultivated them.

Unfortunately, not every post will be a winner. A recurring theme on this blog is that I’m motivated to write when someone is Very Wrong on the Internet in a space I’m familiar with, particularly if I see them receiving accolades on social media for it. In this particular case, Netflix has produced not one but two stinkers this year. To avoid picking on just Netflix, I’m also going to highlight two from the Crunchyroll blog since they keep making a mistake that annoys me immensely and this will hopefully shame them into not doing it.

Netflix publishing two bad articles in one year is particularly disappointing to me. They have brilliant engineers working on incredible, cutting-edge technology, so when they publish deceptive posts it makes me sad. There’s no need for this! Netflix, please just highlight the genuinely great work going on.

Crunchyroll is, in comparison, a much smaller operation that just happens to be in a space I’m particularly familiar with. With that said, they keep making the same basic error and should definitely know better. They, too, have some great engineers, so it would be nice if the posts could avoid cheapening their work.

AVIF for Next-Generation Image Coding

First off, on February 13th, 2020, Netflix published an article entitled AVIF for Next-Generation Image Coding. The article was intended to highlight the AV1 Image File Format (AVIF), which uses the AV1 encoding standard for still images. Fairly straightforward, and I’m sure this offers some potential compression gains over JPEG, which is their point of comparison.

To demonstrate this, they opted to compare images from a standard Kodak dataset at a roughly fixed size, with the goal of highlighting the quality improvement offered by AVIF. Some of their comparisons are quite stark, even at reasonable file sizes:

JPEG 444 @ 19,787 bytes

AVIF 444 @ 20,120 bytes

Unfortunately, this actually sets off a red flag for me. At that file size, the JPEG looks far worse than it should. And sure enough, nowhere in the post does Netflix actually explain how they produced the JPEGs, so I have no idea what encoder or settings they used here. Checking the Docker container reveals the answer: the JPEG-XT reference encoder, using default settings. This produces significantly worse results than even libjpeg, let alone MozJPEG.

If you take the same original image and run it through MozJPEG, you get far better results¹:

MozJPEG 444 @ 19,858 bytes

The sad part is, AVIF still looks better in most cases, and at higher resolutions the difference is even more pronounced. So why cheat? Why do this? I don’t get it, I really don’t. All the comparisons in their post use the worse encoder, and every single image looks way better as a JPEG with a reasonable choice of encoder. Further down, they offer only a few different metrics and some data (with some of the excluded metrics raising an eyebrow), but all of this is basically junk because the JPEGs are far worse than they should be. Very sad.
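For what it’s worth, size-matching encoders fairly is not much work. Here is a minimal sketch of the idea, not what Netflix or anyone else actually ran: binary-search each encoder’s quality setting until the output lands as close as possible to the target file size, then compare those outputs. The names `match_file_size` and `encode` are made up for illustration, and the search assumes output size never shrinks as quality goes up.

```python
from typing import Callable, Tuple


def match_file_size(
    encode: Callable[[int], bytes],
    target_bytes: int,
    lo: int = 1,
    hi: int = 100,
) -> Tuple[int, bytes]:
    """Binary-search the quality setting whose output is closest to target_bytes.

    `encode` is a hypothetical callable: encode(quality) -> encoded image bytes.
    Assumes output size does not decrease as quality increases.
    """
    best_q, best_data = lo, encode(lo)
    while lo <= hi:
        mid = (lo + hi) // 2
        data = encode(mid)
        # Keep whichever candidate lands closest to the target size.
        if abs(len(data) - target_bytes) < abs(len(best_data) - target_bytes):
            best_q, best_data = mid, data
        if len(data) < target_bytes:
            lo = mid + 1   # too small: try a higher quality
        elif len(data) > target_bytes:
            hi = mid - 1   # too large: try a lower quality
        else:
            break          # exact match
    return best_q, best_data
```

Each encoder under test gets its own `encode` wrapper, and every one of them is tuned toward the same byte budget before anyone looks at the images. Running one side at untuned defaults is precisely the problem described above.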

Optimized shot-based encodes for 4K: Now streaming!

Next up is Optimized shot-based encodes for 4K: Now streaming!, published by Netflix on August 28th, 2020. This article goes over their process of backporting work the company has done on per-title and per-shot encoding optimizations to their ‘premium bitstreams’ for 4K content and how it compares under VMAF. I have various complaints about their optimization work and VMAF in general, but the details of that are largely irrelevant here as their sin is conflating what is good for customers with what is good for the company.

If you’re a company serving digital video, you are highly incentivized to cut down on the size of your video streams. The smaller the video, the less storage is required and the less bandwidth is used. At Netflix’s scale, being able to lower the bitrate of video translates to huge cost savings, so it’s understandable that they want to do this. Unfortunately, this is not always a win for the customer.

If you’re streaming on a poor connection, it may very well be a win: better quality at lower bitrate means an improvement if your limitation is your connection. For a lot of their customers, however, the limitation is not the network but the hardware. Netflix chooses not to give customers any control over what stream they receive, instead selecting it dynamically based on what they think the network and hardware can handle. The hardware is the key part here: if your screen is 720p, that is the highest resolution you will get. Similarly, if your desktop/phone/TV doesn’t have the magical hardware and browser combination required, you won’t be going above 1080p regardless of your screen size or network connection. This means that for a lot of customers, their resolution, and thus their quality, will be locked to 1080p.

Now, why does this matter? In that article, all of Netflix’s charts use the bitrate as the x-axis and their comparisons are done at a roughly similar bitrate, rather than at the same resolution. However, if you’re stuck at 1080p, this isn’t a fair comparison! You can’t get the 4K stream, so a more useful comparison is looking at a 1080p stream on the old encoding pipeline versus the newer one. This is a problem because, well, it looks worse when you do this. This is a downgrade for a lot of their customers. Even if you don’t trust my eyes and prefer their ‘objective’ metrics, look at their chart here. You are absolutely taking a quality hit, according to VMAF, if you’re limited to 1080p streams. This is not a customer win. It’s a win for Netflix, because the file size is smaller, but that’s it. And maybe that’s fine, but at least frame it that way! Or perform an apples-to-apples comparison, and pit your old pipeline against the new one at the same resolution.
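To make the apples-to-apples point concrete, here is a rough sketch of the comparison I mean. It is purely illustrative: `Rung`, `best_rung`, and `quality_change` are hypothetical names, the ladder structure is my own simplification, and none of this reflects how Netflix actually selects streams. The idea is to pick the best stream a device can really play from each pipeline, then compare those two.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Rung:
    """One stream in a bitrate ladder (illustrative structure, not Netflix's)."""
    height: int        # vertical resolution, e.g. 1080 or 2160
    bitrate_kbps: int  # stream bitrate
    vmaf: float        # quality score measured for this stream


def best_rung(ladder: List[Rung], max_height: int, bandwidth_kbps: int) -> Optional[Rung]:
    """Highest-scoring rung the device can actually play, or None if there is none."""
    playable = [
        r for r in ladder
        if r.height <= max_height and r.bitrate_kbps <= bandwidth_kbps
    ]
    return max(playable, key=lambda r: r.vmaf, default=None)


def quality_change(
    old: List[Rung],
    new: List[Rung],
    max_height: int,
    bandwidth_kbps: int,
) -> Optional[float]:
    """VMAF delta (new minus old) for a device capped at max_height.

    Returns None when either ladder has nothing the device can play.
    """
    before = best_rung(old, max_height, bandwidth_kbps)
    after = best_rung(new, max_height, bandwidth_kbps)
    if before is None or after is None:
        return None
    return after.vmaf - before.vmaf
```

A negative result for `max_height=1080` sitting next to a positive one for `max_height=2160` is exactly the situation described above: a win on paper that is a downgrade for everyone whose hardware caps them below 4K.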

Improving Video Quality for Crunchyroll and VRV

Okay, on to Crunchyroll. On March 16, 2017, they put out the post Improving Video Quality for Crunchyroll and VRV in response to customer backlash. They were clearly in the wrong, and this post was sleazy because it pretended this was a mistake rather than a deliberate attempt to lower file sizes (which they later admitted to), but that is not the core problem here. When you compare frames across different encodes, and you choose to crop the images, please use the same frame and cropping each time. You would think this is a no-brainer, but alas…

Multiple people complained about this at the time, so I figured the message was received. Their blog mostly died after that, so I never saw another example either way.

Scaling up Anime with Machine Learning and Smart Real Time Algorithms

Until this year!

Specifically on March 28, 2020, when they published Scaling up Anime with Machine Learning and Smart Real Time Algorithms. The post discusses how they chose to try and improve the image quality available to customers with both server-side and client-side upscaling improvements, with Waifu2X on the server and Anime4K on the client. Once again, I have various quibbles with their technical decisions, particularly on the Anime4K front, but the larger problem is that they somehow still fail to actually crop comparison images correctly.

Come on, guys. Why do this?

Crunchyroll image cropping

There are a billion ways to invalidate image comparisons, but this is definitely among the dumbest. Commenters on even the orange site manage to get this right most of the time, so there is truly no excuse.
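Getting this right takes only a few lines. The sketch below is my own illustration, not anything from Crunchyroll’s tooling: define the crop box once, and if the two encodes differ in resolution (as they will in an upscaling comparison), scale the box so it covers the same region in both. The helper names are hypothetical, frames are modeled as plain lists of pixel rows to keep it dependency-free, and it assumes you have already pulled the same frame from each encode.

```python
from typing import Tuple

Box = Tuple[int, int, int, int]  # (left, top, right, bottom) in pixels
Size = Tuple[int, int]           # (width, height)


def scale_box(box: Box, src_size: Size, dst_size: Size) -> Box:
    """Map a crop box from one resolution to the same region at another."""
    sx = dst_size[0] / src_size[0]
    sy = dst_size[1] / src_size[1]
    left, top, right, bottom = box
    return (round(left * sx), round(top * sy), round(right * sx), round(bottom * sy))


def crop(frame, box: Box):
    """Crop a frame stored as a list of pixel rows."""
    left, top, right, bottom = box
    return [row[left:right] for row in frame[top:bottom]]


def matched_crops(frame_a, frame_b, box_a: Box):
    """Crop the same region out of two encodes of the same frame.

    box_a is given in frame_a's coordinates and rescaled for frame_b.
    """
    size_a = (len(frame_a[0]), len(frame_a))
    size_b = (len(frame_b[0]), len(frame_b))
    box_b = scale_box(box_a, size_a, size_b)
    return crop(frame_a, box_a), crop(frame_b, box_b)
```

The point of the design is that the box is chosen once and derived everywhere else, so there is no opportunity to eyeball a second crop and drift by a few pixels or a few frames.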

Riot

As a final note, since this post is basically dumping out some random complaints previously relegated to my Twitter account, I want to give a special “shoutout” to Riot’s blog. Every time I’m linked to an article there, it seems to have some insufferable tone that makes me immediately want to close the tab. Some quick examples from searching Discord logs: 1 2. Please don’t do this. The anti-cheat one is particularly bad, since it essentially snarks at people concerned about them installing a fucking kernel driver. Those complaints, by the way, turned out to be totally valid when the driver started causing performance issues in unrelated games. If you’re going to push an unpopular technology on people, the least you can do is avoid talking down to them in the process. Then again, this is the company that makes you write essays on how much you love League if you want to work there, so I’m not sure what I was really expecting.

Footnote:

1: For additional examples, and because I’m too lazy to generate the images myself, see here. Be sure to click the slowpics links directly as Medium will sometimes modify the image once it enters their CDN.