How to defend against side-channel attacks on SGX

Our last blog post explained the concept of secure enclaves. In this blog post, we will specifically focus on hardware enclaves and discuss just how secure they are. 

Introduction

But first, a refresher on hardware enclaves for our new readers. Hardware enclaves provide an isolated environment for code and data within an untrusted machine, where privileged software can be compromised by an adversary. A prominent example of hardware enclaves is Intel SGX. SGX provides three crucial security features: memory encryption, remote attestation, and sealing. Memory encryption ensures that only the enclave process can access its own memory: other enclaves, processes, and the operating system cannot access this enclave’s memory. Remote attestation provides an enclave with the ability to prove that the desired code has been loaded and is running securely inside this isolated environment, instead of malicious software. Finally, sealing enables encrypting and authenticating the enclave’s data such that no process other than the same enclave can decrypt or modify it.

However, enclaves are still vulnerable to side-channel attacks. An adversary can launch a side-channel attack by observing a program’s indirect effects on the system during execution, instead of exploiting direct program vulnerabilities. In other words, if an application’s runtime behavior is somehow correlated with the secret input content that it operates on, then it will exhibit different runtime behavior when operating on different secret inputs. For example, a simple program that renders English text can access pre-rendered font images based on the input data, but the image access pattern will differ based on the input text’s content. If this difference in runtime behavior is also observable and quantifiable by the adversary, then the adversary can learn some extra information about the secret input. 
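To make the font example concrete, here is a minimal sketch in Python. It is a toy model, not a real renderer: `ALPHABET`, `GLYPHS`, `render`, and `recover_text` are hypothetical names, and the `trace` list simply stands in for whatever the adversary can observe about which glyph slots are touched.

```python
import string

# Hypothetical glyph table: one pre-rendered image slot per supported character.
ALPHABET = string.ascii_lowercase + " "
GLYPHS = [f"<glyph {ch!r}>" for ch in ALPHABET]

def render(text, trace):
    """Render text one glyph at a time, recording which slot was accessed."""
    rendered = []
    for ch in text:
        slot = ALPHABET.index(ch)
        trace.append(slot)            # models the observable access pattern
        rendered.append(GLYPHS[slot])
    return rendered

def recover_text(trace):
    """What an observer of the access pattern could reconstruct."""
    return "".join(ALPHABET[slot] for slot in trace)

trace_a, trace_b = [], []
render("attack at dawn", trace_a)
render("retreat at six", trace_b)

print(trace_a == trace_b)      # same length and code path, different trace
print(recover_text(trace_a))   # the trace alone is enough to rebuild the input
```

In this idealized model the adversary sees every slot exactly, so the whole input leaks. Real side channels are coarser and noisier, but the principle is the same.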

Side-channel attacks are especially dangerous because there are many different attack vectors: page access patterns, timing behavior, power usage, etc. The existence of such attack vectors also breaks the ideal enclave abstraction that people usually assume. In the rest of this blog post, we will discuss and categorize past side-channel attacks on Intel SGX. While Intel originally specified that SGX was not designed to protect against side-channel attacks, such attacks are a significant threat to the enclave ecosystem, and thus we will explore them in detail in this blog post.

Side-channel attacks on SGX

Page-based attacks. The first prominent side-channel attack on SGX, called the “controlled-channel attack,” was introduced by researchers at Microsoft Research [1]. The authors attack an enclave program running inside SGX from a compromised, malicious operating system. The key observation is that an enclave program that behaves differently on different inputs will exhibit different page access patterns, for both code and data pages. Some pages are accessed more frequently than others, and some pages tend to be accessed together. Since the OS controls page mappings, the attacker can unmap pages and use the resulting page faults to infer which pages the enclave program accesses, and in what order. The authors demonstrate that it is possible to extract text documents from widely used word processing tools and to obtain outlines of JPEG images decompressed by libjpeg, with all of these programs running inside the enclave environment. Note that these attacks don’t break any of SGX’s original security guarantees: the content of enclave memory remains encrypted. However, the adversary learns which pages are accessed, and over time this sequence of accesses enables the adversary to infer information about the secret data.
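The page-fault mechanism can be sketched with a small Python model. This is an illustration under simplifying assumptions, not the attack from [1]: `UntrustedOS`, `victim`, `recover_bits`, and the three addresses are hypothetical, and the model keeps only one page mapped at a time so that every transition between pages faults.

```python
PAGE_SIZE = 4096

# Hypothetical enclave layout: a loop and two handlers on separate code pages.
LOOP_ADDR = 0x1000
HANDLE_ZERO_ADDR = 0x2000
HANDLE_ONE_ADDR = 0x3000

class UntrustedOS:
    """Toy OS that unmaps enclave pages and logs the page number of each fault."""

    def __init__(self):
        self.mapped = set()    # pages the enclave may currently touch
        self.fault_log = []    # page numbers, in the order they faulted

    def access(self, address):
        page = address // PAGE_SIZE
        if page not in self.mapped:
            self.fault_log.append(page)  # the fault reveals the page, not the offset
            self.mapped = {page}         # map this page, unmap all others

def victim(secret_bits, os_model):
    """Enclave code whose control flow depends on each secret bit."""
    for bit in secret_bits:
        os_model.access(LOOP_ADDR)
        os_model.access(HANDLE_ONE_ADDR if bit else HANDLE_ZERO_ADDR)

def recover_bits(fault_log):
    """Map the observed handler pages back to secret bits."""
    page_to_bit = {
        HANDLE_ZERO_ADDR // PAGE_SIZE: 0,
        HANDLE_ONE_ADDR // PAGE_SIZE: 1,
    }
    return [page_to_bit[p] for p in fault_log if p in page_to_bit]

secret = [1, 0, 1, 1, 0, 0, 1]
os_model = UntrustedOS()
victim(secret, os_model)
print(recover_bits(os_model.fault_log) == secret)
```

The OS never reads enclave memory in this model. It only learns page numbers, which is exactly the information the encryption does not hide.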

Cache-based attacks. Cache-based timing attacks are among the most well-known side-channel attacks. In such attacks, the adversary exploits the fact that a memory access is faster when the data is already in the cache than when it must be fetched from memory. For example, a common attack called Prime+Probe identifies which cache sets the victim process uses. In this attack, the adversary first fills a cache set with its own data (the “prime” step). Next, the victim executes for some time, potentially touching the cache set. Finally, the adversary measures the access time to the data it loaded earlier (the “probe” step). If an access is slow, the adversary’s data was evicted, which means the victim used the corresponding cache set. Past work has demonstrated Prime+Probe attacks on AES encryption that can steal the secret key. SGX is also vulnerable to cache timing attacks like Prime+Probe, since enclave programs use caches the same way as non-enclave programs. Even though enclave memory is encrypted and isolated, data in the cache is not encrypted, and the last-level cache is shared with other software on the machine. Prior work uses Prime+Probe to attack AES implementations running inside SGX [2], and separate work shows that a malicious enclave can extract RSA keys from a co-located enclave (Schwarz et al., “Malware Guard Extension,” 2017).
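The three Prime+Probe steps can be sketched against a toy cache model in Python. Everything here is an assumption made for illustration: the cache geometry (`NUM_SETS`, `WAYS`, `LINE_SIZE`), the LRU replacement policy, the base addresses, and the fact that `access` returns hit or miss directly, where a real attacker would have to infer it from timing.

```python
from collections import OrderedDict

LINE_SIZE = 64
NUM_SETS = 16
WAYS = 4
ATTACKER_BASE = 0x100000   # hypothetical addresses, aligned so set index = offset
VICTIM_TABLE = 0x200000

class ToyCache:
    """Set-associative cache with LRU replacement."""

    def __init__(self):
        self.sets = [OrderedDict() for _ in range(NUM_SETS)]

    def access(self, address):
        """Return True on a hit, False on a miss (a miss would be slow)."""
        line = address // LINE_SIZE
        cache_set = self.sets[line % NUM_SETS]
        hit = line in cache_set
        if hit:
            cache_set.move_to_end(line)        # mark as most recently used
        else:
            if len(cache_set) >= WAYS:
                cache_set.popitem(last=False)  # evict least recently used
            cache_set[line] = True
        return hit

def attacker_address(set_index, way):
    return ATTACKER_BASE + (way * NUM_SETS + set_index) * LINE_SIZE

def prime(cache):
    """Fill every cache set with attacker-owned lines."""
    for s in range(NUM_SETS):
        for w in range(WAYS):
            cache.access(attacker_address(s, w))

def victim(cache, secret):
    """Victim performs one table lookup indexed by a secret in [0, NUM_SETS)."""
    cache.access(VICTIM_TABLE + secret * LINE_SIZE)

def probe(cache):
    """Return the sets in which at least one attacker line was evicted."""
    touched = []
    for s in range(NUM_SETS):
        # Probe in reverse order so that probing itself evicts as little as possible.
        misses = sum(not cache.access(attacker_address(s, w))
                     for w in reversed(range(WAYS)))
        if misses:
            touched.append(s)
    return touched

def prime_probe(secret):
    cache = ToyCache()
    prime(cache)
    victim(cache, secret)
    return probe(cache)

print(prime_probe(11))
```

In this model a single victim access is enough to reveal the secret index. On real hardware the signal is noisy and attackers typically repeat the measurement many times.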

Speculative execution attacks. Speculative execution attacks are another class of side-channel attacks that have been widely publicized since they were introduced in 2018. The most famous are the Spectre attacks [3]. The general idea is to leak information through a common hardware optimization called speculative execution, in which the processor performs work before it knows whether that work is needed. The processor proceeds based on a “guess.” If the guess is correct, performance can improve significantly. If not, the extra work is wasted, but its results are discarded so that correctness is not affected. One important form of speculative execution is branch prediction: when the code reaches a branch, the processor guesses the likely outcome and starts executing along the predicted path, and better predictions mean that more of the speculatively executed work is useful. However, this optimization opens up an attack vector, because a mispredicted path can operate on sensitive data. Even though the correct code is eventually executed, the temporary incorrect execution can leave traces, for example in the cache, that leak information. For example, if a bounds check is mispredicted, the code it guards can run speculatively with an out-of-bounds index. Combined with a cache side-channel attack like Prime+Probe, this can leak secret data that the bounds check was supposed to protect. Prior work has shown that SGX is also prone to Spectre-like attacks [4].
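The bounds-check example can be sketched with a deliberately simplified Python model. This does not exercise real speculation: `ToyCpu` is a hypothetical class whose one-bit "predictor" just remembers the last branch outcome, and the `cached` set stands in for cache state that a real attacker would observe through timing (for instance with Prime+Probe).

```python
class ToyCpu:
    """Toy model: transient execution is squashed, but cache state persists."""

    def __init__(self, array1, secret):
        # The secret sits in memory directly after array1 (an assumption).
        self.memory = list(array1) + list(secret)
        self.array1_size = len(array1)
        self.cached = set()          # probe-array lines currently in the cache
        self.predict_taken = False   # one-bit branch predictor

    def victim(self, x):
        """Models: if x < array1_size: y = probe_array[array1[x] * LINE]"""
        in_bounds = x < self.array1_size
        result = None
        if in_bounds or self.predict_taken:
            # Runs architecturally when in bounds, only transiently otherwise.
            value = self.memory[x]
            self.cached.add(value)   # the dependent load fills a cache line
            if in_bounds:
                result = value       # transient results are never committed
        self.predict_taken = in_bounds   # predictor learns the real outcome
        return result

def leak_secret(cpu, secret_length):
    leaked = []
    for i in range(secret_length):
        cpu.victim(0)                            # train: branch is taken
        cpu.cached.clear()                       # flush the probe array
        cpu.victim(cpu.array1_size + i)          # out of bounds, returns None
        (line,) = cpu.cached                     # the one line left in cache
        leaked.append(line)
    return bytes(leaked)

cpu = ToyCpu(array1=b"\x01\x02\x03\x04", secret=b"key!")
print(leak_secret(cpu, 4))
```

Architecturally, every out-of-bounds call returns nothing, so the bounds check "works." The leak comes entirely from the microarchitectural state left behind by the squashed path.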

This section summarized some important attacks, but please take a look at this general survey on recent attacks on SGX for details.

Mitigations

While there are many side-channel attacks, there are also mitigations and defenses against them. They fall into three broad categories:

  • Microcode updates. Microcode is the low-level layer inside the CPU that implements complex instructions and directly controls how the CPU behaves. It can be updated by the CPU manufacturer, and some SGX attacks [4] were patched directly by Intel via microcode updates.
  • Generic software defenses. These are software solutions designed to address a specific side channel without changing the application’s logic. For example, T-SGX [5] uses Intel’s Transactional Synchronization Extensions (TSX) to ensure that the OS can only observe page faults on one specific page, and therefore addresses the controlled-channel attack. Such mitigations can sometimes be applied by the compiler: the authors of T-SGX modified LLVM to automatically transform an enclave program into one that uses the T-SGX mechanism.
  • Oblivious programs. Finally, many side-channel attacks exist because of how applications are written: an implementation can leak information if its behavior changes based on secret data. Therefore, one powerful way to address most of these attacks is to transform existing programs into oblivious programs. In an oblivious program, the same sequence of instructions and memory accesses always executes regardless of the secret input’s content, so the program’s observable behavior does not change when it is given different inputs. A correctly written oblivious program should be safe from:
    • Controlled-channel attacks, since it exhibits the same page access pattern for every input
    • Prime+Probe, since the cache sets it touches do not depend on the secret data
    • Branch misprediction attacks, since it contains no branches that depend on secret data
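As a minimal sketch of what “oblivious” means in code, the Python below replaces a secret-dependent branch with arithmetic and a secret-dependent array index with a full scan. `oblivious_select` and `oblivious_lookup` are hypothetical helper names, and the `trace` argument exists only to make the access pattern visible. Python is used here to show the structure only: it gives no constant-time guarantees (even the `==` comparison below may not be constant-time), so real enclave code would be written in a lower-level language and checked at the machine-code level.

```python
def oblivious_select(condition_bit, a, b):
    """Return a if condition_bit is 1, else b, without branching on the bit."""
    mask = -condition_bit              # 0 stays 0; 1 becomes -1 (all bits set)
    return (a & mask) | (b & ~mask)

def oblivious_lookup(table, secret_index, trace):
    """Read table[secret_index] while touching every entry in the same order."""
    result = 0
    for i, entry in enumerate(table):
        trace.append(i)                          # every index is always visited
        is_match = int(i == secret_index)        # structural placeholder only
        result = oblivious_select(is_match, entry, result)
    return result

table = [10, 20, 30, 40]
trace_x, trace_y = [], []
x = oblivious_lookup(table, 2, trace_x)
y = oblivious_lookup(table, 0, trace_y)

print(x, y)
print(trace_x == trace_y)   # the access pattern no longer depends on the index
```

The price is visible in the loop: each lookup now costs a full pass over the table instead of a single access.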

Of course, there is no such thing as a free lunch: writing such a program is difficult for non-experts and the resulting program can be quite slow. However, there are two ways to address these issues. One could leverage an oblivious programming framework [6, 7] to transform an existing program into an oblivious program. Alternatively, one could leverage an existing oblivious framework that is designed and optimized for specific functionalities. For example, the MC2 project has explored designing efficient oblivious SQL analytics and decision trees. 

Conclusion

In this blog post, we discussed side-channel attacks on hardware enclaves, with a focus on Intel SGX. We also talked about some strategies to mitigate these attacks, with different tradeoffs. However, when should one consider using these mitigation strategies? The answer will depend on where the enclave is located.

If the enclave is located on an arbitrary third party’s computer, then one must assume the worst case scenario where the third party can attempt to do everything in its power to compromise the enclave program. In this scenario, oblivious programs may be necessary to give the best security guarantees. 

However, if the enclave is hosted by a cloud provider with good intentions, then the client can reasonably assume that the provider has an incentive to be honest and to provide the best security for its customers. Enclaves, combined with the provider’s state-of-the-art security mechanisms, make the cloud a much safer place than an arbitrary third party’s machine.

[1] Y. Xu, W. Cui, and M. Peinado. “Controlled-channel attacks: Deterministic side channels for untrusted operating systems.” 2015 IEEE Symposium on Security and Privacy, pp. 640–656. IEEE, 2015. DOI: 10.1109/SP.2015.45. Available: https://ieeexplore.ieee.org/abstract/document/7163052/.

[2] A. Moghimi, G. Irazoqui, and T. Eisenbarth. “CacheZoom: How SGX amplifies the power of cache attacks.” International Conference on Cryptographic Hardware and Embedded Systems. Springer, Cham, 2017.

[3] P. Kocher et al. “Spectre attacks: Exploiting speculative execution.” Communications of the ACM, July 2020. Available: https://cacm.acm.org/magazines/2020/7/245682-spectre-attacks/fulltext

[4] E. M. Koruyeh et al. “Spectre returns! Speculation attacks using the return stack buffer.” 12th USENIX Workshop on Offensive Technologies (WOOT 18). 2018.

[5] M.-W. Shih et al. “T-SGX: Eradicating controlled-channel attacks against enclave programs.” NDSS. 2017.

[6] C. Liu et al. “ObliVM: A programming framework for secure computation.” 2015 IEEE Symposium on Security and Privacy. IEEE, 2015.

[7] S. Sasy, S. Gorbunov, and C. W. Fletcher. “ZeroTrace: Oblivious memory primitives from Intel SGX.” Cryptology ePrint Archive. 2017.