Intel processors offer the AVX-512 instruction set to speed up vectorized workloads. It is tempting, and reasonable, to use it for applications and databases deployed in the cloud.

However, there is a flip side to it.

Running AVX-512 instructions causes the entire processor to be clocked down! This has huge implications.
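To get a feel for what a clock-down means for code that never touches AVX, here is a rough back-of-the-envelope sketch. It assumes a purely CPU-bound task whose runtime scales inversely with clock frequency, which is a simplification. The function name `slowdown_factor` and the frequencies in the usage note are hypothetical, not measurements from any real processor.

```python
def slowdown_factor(normal_mhz, reduced_mhz):
    """Estimate how much longer a CPU-bound task takes at a reduced clock.

    Assumption: runtime is inversely proportional to clock frequency.
    Real workloads (memory-bound, I/O-bound) will be affected less.
    """
    if normal_mhz <= 0 or reduced_mhz <= 0:
        raise ValueError("frequencies must be positive")
    return normal_mhz / reduced_mhz


if __name__ == "__main__":
    # Hypothetical frequencies, chosen only to illustrate the arithmetic.
    factor = slowdown_factor(3000, 2400)
    print(f"A task that took 10.0 s would take about {10.0 * factor:.1f} s")
```

With these made-up numbers, a drop from 3000 MHz to 2400 MHz stretches a 10-second job to roughly 12.5 seconds, even though the job itself uses no AVX instructions at all.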

Effect on Cloud VMs

The AVX slowdown doesn't care about VM boundaries. When you rent a VM on AWS, GCP, etc., you are getting access to just a few of the many cores of a physical processor.

Let's say a processor on AWS has 4 cores, and you request 2 for your VM. Another account, B, spins up a VM on AWS and gets assigned the 2 remaining cores of that same processor. Now B starts running some AVX-heavy workload. Well, what do you know, it results in your VM getting slowed down too!
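A first practical question is whether your VM exposes AVX-512 at all. On Linux, the CPU feature flags are listed in /proc/cpuinfo, and AVX-512 features show up with names starting with `avx512` (for example `avx512f`). The following is a minimal sketch, assuming a Linux guest; the helper name `avx512_flags` is made up for this example. Note that it only tells you what your own VM can use, not what a neighbor on the same physical processor is running.

```python
def avx512_flags(cpuinfo_text):
    """Return the sorted AVX-512 feature flags found in /proc/cpuinfo-style text."""
    found = set()
    for line in cpuinfo_text.splitlines():
        key, sep, value = line.partition(":")
        # Each logical CPU has its own "flags" line; collect from all of them.
        if sep and key.strip() == "flags":
            found.update(f for f in value.split() if f.startswith("avx512"))
    return sorted(found)


if __name__ == "__main__":
    try:
        with open("/proc/cpuinfo") as fh:
            flags = avx512_flags(fh.read())
    except OSError:
        # Not Linux, or /proc is not readable in this environment.
        flags = []
    print("AVX-512 flags:", ", ".join(flags) if flags else "none found")
```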

Effect on Kubernetes clusters

It means your own Docker containers running AVX workloads can slow down your other containers, despite the resource limits being set. Not only that, a different account's Kubernetes cluster whose pods are scheduled on a different VM, but on the same physical processor as your VM, can impact your containers!

This was pointed out by Kelly Sommers yesterday.

No easy way out

I went ahead and filed Kubernetes bug #67355 for this. They do seem to be aware of the issue but currently have no good answers for it.

Even Intel has no solution for it currently.

This is a Catch-22 situation all around. Cloud vendors want to offer VMs with AVX-512 instructions enabled to allow their users to get better performance. It is in the best interest of the individual user to use it. However, doing so may not only impact their own VMs/containers but even another account's VMs.