Greg Kroah-Hartman, the stable Linux kernel maintainer, could have prefaced his Open Source Summit Europe keynote speech, “MDS, Fallout, Zombieland, and Linux,” by paraphrasing Winston Churchill: I have nothing to offer but blood, sweat, and tears for dealing with Intel CPUs’ security problems.
Or as a Chinese developer told him recently about these problems: “This is a sad talk.” The sadness is that the same Intel CPU speculative execution problems, which led to Meltdown and Spectre security issues, are alive and well and causing more trouble.
The problem with how Intel designed speculative execution is that, while anticipating the next action for the CPU to take does indeed speed things up, it also exposes data along the way. That’s bad enough on your own server, but when it breaks down the barriers between virtual machines (VMs) in cloud computing environments, it’s a security nightmare.
Kroah-Hartman said, “These problems are going to be with us for a very long time, they’re not going away. They’re all CPU bugs, in some ways they’re all the same problem,” but each has to be solved in its own way. “MDS, RIDL, Fallout, Zombieland: They’re all variants of the same basic problem.”
And they’re all potentially deadly for your security: “RIDL and Zombieload, for example, can steal data across applications, virtual machines, even secure enclaves. The last is really funny, because [Intel Software Guard Extensions (SGX)] is what’s supposed to be secure inside Intel chips [but, it turns out it’s] really porous. You can see right through this thing.”
To fix each problem as it pops up, you must patch both your Linux kernel and your CPU’s BIOS and microcode. This is not a Linux problem; any operating system faces the same problem.
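To see where a given machine stands, recent Linux kernels report the mitigation status of each known CPU flaw as small text files under /sys/devices/system/cpu/vulnerabilities. The following is a minimal sketch, not something shown in the keynote, that reads those files. The function names are made up for illustration, and the check for entries starting with “Vulnerable” assumes the status wording current kernels use.

```python
from pathlib import Path

# Standard sysfs location on Linux kernels that report CPU vulnerability status.
VULN_DIR = Path("/sys/devices/system/cpu/vulnerabilities")


def read_mitigations(vuln_dir=VULN_DIR):
    """Return {vulnerability_name: status_text} for every file in vuln_dir."""
    vuln_dir = Path(vuln_dir)
    if not vuln_dir.is_dir():
        return {}  # older kernel or non-Linux system: nothing to report
    status = {}
    for entry in sorted(vuln_dir.iterdir()):
        if entry.is_file():
            status[entry.name] = entry.read_text().strip()
    return status


def still_vulnerable(status):
    """Names whose status text starts with 'Vulnerable' (assumed wording)."""
    return [name for name, text in status.items() if text.startswith("Vulnerable")]


if __name__ == "__main__":
    report = read_mitigations()
    for name, text in report.items():
        print(f"{name}: {text}")
    exposed = still_vulnerable(report)
    if exposed:
        print("Unmitigated:", ", ".join(exposed))
```

An entry that still reads “Vulnerable” after a kernel update is often a hint that the matching microcode update has not been applied yet.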
Kroah-Hartman freely admits that OpenBSD, a BSD Unix devoted to security first and foremost, was the first to come up with what’s currently the best answer for this class of security holes: Turn Intel’s simultaneous multithreading (SMT) off and deal with the performance hit. Linux has adopted this method.
But it’s not enough. You must secure the operating system as each new way to exploit hyper-threading appears. For Linux, that means flushing the CPU buffers every time there’s a context switch (e.g., when the CPU stops running one VM and starts another).
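On Linux, the current SMT state is visible in sysfs, in the files /sys/devices/system/cpu/smt/control and /sys/devices/system/cpu/smt/active. Here is a small read-only sketch for checking it; the helper name `smt_state` is hypothetical, and the files may simply be missing on older kernels or other architectures, in which case the values come back as None.

```python
from pathlib import Path

# sysfs directory where Linux exposes the SMT (hyper-threading) controls.
SMT_DIR = Path("/sys/devices/system/cpu/smt")


def smt_state(smt_dir=SMT_DIR):
    """Return the SMT control setting and whether sibling threads are active."""
    smt_dir = Path(smt_dir)

    def read(name):
        try:
            return (smt_dir / name).read_text().strip()
        except OSError:
            return None  # file missing or unreadable on this system

    control = read("control")  # e.g. "on", "off", "forceoff", "notsupported"
    active = read("active")    # "1" when SMT siblings are online, else "0"
    return {
        "control": control,
        "active": None if active is None else active == "1",
    }


if __name__ == "__main__":
    state = smt_state()
    print(f"SMT control: {state['control']}, active: {state['active']}")
```

Actually switching SMT off is a separate, privileged step, typically done by writing “off” to the control file as root or by booting with the nosmt kernel parameter.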
You can probably guess what the trouble is. Each buffer flush takes a lot of time, and the more VMs, containers, whatever, you’re running, the more time you lose.
How bad are these delays? It depends on the job. Kroah-Hartman said he spends his days writing and answering emails. That activity only takes a 2% performance hit. That’s not bad at all. He is also constantly building Linux kernels. That takes a much more painful 20% performance hit. Just how bad will it be for you? The only way to know is to benchmark your workloads.
Of course, it’s up to you, but as Kroah-Hartman said, “The bad part of this is that you now must choose: Performance or security. And that is not a good option.” It’s also, he reminded the developer-heavy crowd, a question of which choice your cloud provider has already made for you.
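One rough way to do that is to time the same command several times under each configuration (say, once on a fully patched kernel and once with mitigations relaxed on a test box) and compare the medians. The sketch below is only an illustration of that idea: `time_command` and `slowdown_percent` are hypothetical helpers, and the default command is a stand-in for your real workload.

```python
import statistics
import subprocess
import sys
import time


def time_command(cmd, runs=5):
    """Run cmd `runs` times and return the median wall-clock time in seconds."""
    samples = []
    for _ in range(runs):
        start = time.perf_counter()
        subprocess.run(
            cmd,
            check=True,
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
        )
        samples.append(time.perf_counter() - start)
    return statistics.median(samples)


def slowdown_percent(baseline_s, mitigated_s):
    """Percentage increase in run time relative to the baseline."""
    return (mitigated_s - baseline_s) / baseline_s * 100.0


if __name__ == "__main__":
    # Placeholder workload; pass your own command on the command line instead.
    cmd = sys.argv[1:] or [sys.executable, "-c", "sum(range(10**6))"]
    print(f"median: {time_command(cmd):.3f}s")
```

Record the median under each configuration and feed the two numbers to `slowdown_percent`; a build that goes from 100 seconds to 120 seconds, for example, works out to the 20% hit Kroah-Hartman described.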
But wait! The bad news keeps coming. You must update your Linux kernel and patch your microcode as each Intel-related security update comes down the pike. The only way to be safe is to run the latest Canonical, Debian, Red Hat, or SUSE distros, or the newest long-term support Linux kernel. Kroah-Hartman added, “If you are not using a supported Linux distribution kernel or a stable/long term kernel, you have an insecure system.”
So, on that note, you can look forward to constantly updating your operating system and hardware until the current generation of Intel processors is in antique shops. And you’ll be stuck with poor performance if you elect to put security ahead of speed. Fun, fun, fun!