Some of the biggest barriers to cloud adoption are concerns about security, data loss and leakage, and the legal and regulatory implications of storing and processing data off-premises [1]. Several cloud data breach incidents in recent years indicate that these concerns are warranted, given the constant insider and outsider threats.
The challenge with Infrastructure as a Service (IaaS) offerings in the cloud is that customers lack a guarantee that the infrastructure is secure against threats and have to trust that their cloud providers will not inadvertently or purposefully access the data being processed on their infrastructure. Confidential computing is an emerging initiative focused on removing the need for trusting such providers, and one of the most promising approaches for achieving this is secure enclaves or trusted execution environments. Secure enclaves remove the need for trusting IaaS providers by enforcing a provably secure environment that is inaccessible by other applications, users, or processes colocated on the system.
Secure enclaves address the fundamental problem of securing data-in-use, one of the states that digital data can be in. Data-in-use refers to any data that is being actively processed by a system, as opposed to data that is stored for later use or is being moved across a network. Most current approaches to protecting data-in-use rely on access controls and policies, client-side encryption techniques, and data breach detection, but none of them prevents the threat entirely. Other approaches for protecting data-in-use, such as homomorphic encryption, are still far too inefficient for many production workloads [5]. To fully realize the benefits of the cloud, providing confidentiality and integrity guarantees for data at application runtime is vital.
As data becomes increasingly valuable, so do the insights that can be derived from it. Aggregating data from multiple sources for joint analysis, such as rich data analytics and machine learning training, has also become an increasingly powerful technique, as witnessed by the growing interest in federated learning and collaborative analytics. Recent advances in secure enclave technologies finally enable such collaboration to occur securely and confidentially.
Secure enclaves explained
A secure enclave (also known as a trusted execution environment) is a computing environment that isolates code and data from the operating system, either through hardware-based isolation or by isolating an entire virtual machine and placing the hypervisor within the trusted computing base (TCB). Even users with physical or root access to the machines and operating system should not be able to learn the contents of secure enclave memory or tamper with the execution of code inside the enclave.
There are several implementations of secure enclaves available for consumer electronic devices and workstations as well as for data centers. Most notable among them are Intel Software Guard Extensions (SGX), AMD Secure Encrypted Virtualization (SEV), and AWS Nitro Enclaves. Not all secure enclave implementations are created equal – there is a tradeoff between security and convenience, and depending on the complexity of the application, one secure enclave implementation may make more sense than another.
Intel SGX provides integrity and confidentiality guarantees even if all privileged processes on the machine are malicious. Its remote attestation procedure allows clients to verify that known application code is running within the enclave. There is good support for developing applications atop Intel SGX in the form of developer SDKs such as the Open Enclave SDK and the Intel SGX SDK.
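The idea behind remote attestation can be sketched in a few lines of Python. This is only a simulation under loud assumptions: the function names (`measure`, `make_report`, `verify_report`) are hypothetical and are not part of the Intel SGX SDK or the Open Enclave SDK, and a shared HMAC key stands in for the hardware-rooted signing keys that a real attestation flow relies on. It shows only the core check a client performs: is the report authentic, and does the reported code measurement match the code the client expects?

```python
import hashlib
import hmac


def measure(code: bytes) -> str:
    """Hash of the code loaded into the (simulated) enclave."""
    return hashlib.sha256(code).hexdigest()


def make_report(code: bytes, attestation_key: bytes) -> dict:
    """Simulated enclave side: report the measurement and authenticate it.

    Assumption: an HMAC with a shared key replaces the hardware-backed
    signature used by real enclave platforms.
    """
    measurement = measure(code)
    tag = hmac.new(attestation_key, measurement.encode(), hashlib.sha256).hexdigest()
    return {"measurement": measurement, "tag": tag}


def verify_report(report: dict, expected_measurement: str, attestation_key: bytes) -> bool:
    """Client side: accept only an authentic report for the expected code."""
    expected_tag = hmac.new(
        attestation_key, report["measurement"].encode(), hashlib.sha256
    ).hexdigest()
    authentic = hmac.compare_digest(expected_tag, report["tag"])
    matches = hmac.compare_digest(report["measurement"], expected_measurement)
    return authentic and matches


if __name__ == "__main__":
    key = b"demo-attestation-key"          # hypothetical key for the demo
    audited_code = b"def handler(x): return x + 1"
    expected = measure(audited_code)        # the client knows this in advance

    print(verify_report(make_report(audited_code, key), expected, key))
    print(verify_report(make_report(b"malicious code", key), expected, key))
```

The client releases its data only when verification succeeds, so code that differs from the audited version by even one byte is rejected.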
AMD SEV eliminates the need for developer SDKs for building customized enclave-compatible applications by encrypting the entire memory of a VM, thus securing it in the presence of a “benign but vulnerable” hypervisor [9]. However, since the hypervisor controls the VM’s access to all hardware resources, physical memory, and I/O operations, there is a much larger attack surface present and extra precautions must be taken to guard against resulting threats.
AWS Nitro Enclaves are built upon the Nitro System in which the hypervisor is very lightweight since much of the typical hypervisor functionality is modularized and offloaded to different Nitro Cards. These enclaves run alongside an existing AWS EC2 instance but exist as a separate, isolated VM. The Nitro Enclave can only be accessed by an application running in the same EC2 instance. At the time of writing, only one enclave per EC2 instance can be created.
All of the major cloud providers now have secure enclave offerings:
- Microsoft Azure offers machines with both AMD and Intel-based processors
- AWS offers Nitro Enclaves
- Google Cloud Confidential VMs are powered by AMD EPYC™ processors
- IBM offers secure enclaves in its IBM Z® Platform
How is this different from traditional encryption?
Most systems today have adopted standard techniques for protecting data-at-rest and data-in-motion, using SSL/TLS along with symmetric and asymmetric key encryption. When data needs to be used, however, its contents are decrypted in memory, and this is where the problem arises. An application that operates on some data must be able to see the data. If this data is sensitive and contains PII or other confidential information, application users must trust that the application is not leaking data, that the platform on which the application is running is secure, and that malicious insiders or hackers will not gain access to their data. We call this the problem of protecting data-in-use [2].
- Data-at-rest refers to stationary data that is stored in non-volatile storage such as databases, cloud storage, or hard drives.
- Data-in-motion refers to data that is actively being transmitted from one location to another, typically over some network.
- Data-in-use refers to data that is actively being processed by a system for the purposes of some application. Secrets such as keys and passwords, as well as any PII, are examples of data-in-use that may need to be secured.
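A small sketch can make the data-in-use gap concrete. The cipher below is a toy XOR keystream built from SHA-256, chosen only so the example runs with the Python standard library; it is an assumption made for illustration and must not be used for real encryption. All names (`toy_encrypt`, `record_at_rest`, and so on) are hypothetical.

```python
import hashlib


def keystream(key: bytes, length: int) -> bytes:
    """Toy keystream: SHA-256 of key plus a counter (illustration only)."""
    out = b""
    counter = 0
    while len(out) < length:
        out += hashlib.sha256(key + counter.to_bytes(8, "big")).digest()
        counter += 1
    return out[:length]


def toy_encrypt(key: bytes, data: bytes) -> bytes:
    """XOR the data with the keystream. NOT secure; a stand-in for real encryption."""
    return bytes(a ^ b for a, b in zip(data, keystream(key, len(data))))


# XOR is its own inverse, so decryption is the same operation.
toy_decrypt = toy_encrypt

key = b"demo-key"

# Data-at-rest: the stored record is unreadable without the key.
record_at_rest = toy_encrypt(key, b"salary=85000")

# Data-in-use: to compute on the record, the application must decrypt it,
# so the plaintext now sits in ordinary process memory.
plaintext_in_memory = toy_decrypt(key, record_at_rest)
salary = int(plaintext_in_memory.split(b"=")[1])
raise_applied = salary + 5000
```

Anything that can read the process memory at the point where `plaintext_in_memory` exists can read the salary, regardless of how well the record was protected on disk or in transit.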
Other methods of securing data-in-use
Current approaches to protecting data-in-use often rely on enforcing access controls where access to systems or data is restricted to only those who need it; however, this approach is susceptible to human error or access policy misconfigurations which may go undetected. Additionally, managing privileged access does not protect against credential leakage. Unauthorized actors could still gain access to sensitive data in the case of leaked passwords and keys.
Another approach is to use homomorphic encryption (HE), a technique that allows for encrypting data in a way such that computations can be performed directly on the encrypted data. The concept has been around for over 40 years and in that time, several implementations [7] and variations of the scheme ranging from partially homomorphic encryption (PHE) to fully homomorphic encryption (FHE) have emerged [8]. However, there are a couple of pitfalls to homomorphic encryption which have prevented its adoption in the industry:
- Speed: All current FHE implementations are many orders of magnitude slower (currently anywhere from 1,000 to 1,000,000 times slower) than equivalent processing on plaintext data and are thus impractical for many production workloads [5].
- Lack of Integrity: Homomorphic encryption schemes are inherently malleable, a property that allows encrypted data to be transformed into another valid encryption by applying some function to it. An attacker with access to the encrypted data cannot decrypt the underlying data (unless they possess a decryption key), but they could transform and replace the data without detection, hence violating data integrity. Homomorphic encryption would need to be combined with techniques such as zero-knowledge proofs to provide additional guarantees to mitigate this concern.
Secure enclaves provide an alternate solution that is both efficient and less susceptible to human misconfiguration. See our previous blog post here to learn more about the differences between using secure enclaves and homomorphic encryption.
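Both the appeal and the integrity problem can be seen in a tiny example. The sketch below uses textbook (unpadded) RSA with deliberately small primes, which is a partially homomorphic scheme: multiplying two ciphertexts yields a ciphertext of the product. This is an illustration chosen for brevity, not one of the FHE schemes referred to above, and the parameters are far too small to be secure.

```python
# Textbook RSA with toy parameters (illustration only, not secure).
p, q = 61, 53
n = p * q                      # public modulus
phi = (p - 1) * (q - 1)
e = 17                         # public exponent
d = pow(e, -1, phi)            # private exponent (requires Python 3.8+)


def encrypt(m: int) -> int:
    return pow(m, e, n)


def decrypt(c: int) -> int:
    return pow(c, d, n)


# Homomorphic property: compute on ciphertexts without decrypting them.
a, b = 7, 6
product_ciphertext = (encrypt(a) * encrypt(b)) % n
product = decrypt(product_ciphertext)      # equals a * b

# Malleability: an attacker without the private key can still change
# the plaintext in a predictable way, here doubling it.
original_ciphertext = encrypt(100)
tampered_ciphertext = (original_ciphertext * encrypt(2)) % n
tampered_value = decrypt(tampered_ciphertext)   # equals 200
```

The same algebraic property that makes computation on ciphertexts possible is what lets the attacker alter the data undetected, which is why an additional integrity mechanism is needed.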
Does your company need secure enclaves?
Enterprises face a constant onslaught of internal and external threats. According to a recent survey by Pulse sponsored by Arm, using secure enclaves in the enterprise setting is attractive for implementing safeguards for the following scenarios [3]:
- Protect against insider threats – data in the cloud is accessible to the database administrators of the cloud applications or infrastructure via direct access to the database, application logs, and device memory
- Prevent platform software (i.e., a platform hypervisor) from accessing data
- Protect data from adjacent workloads in a multitenant/user environment
- Protect the integrity of crowdsourced ML models
Secure enclave technologies can benefit you if you fall into any of the following categories:
- You are a company that already stores confidential data in the cloud and wishes to analyze and transform that data directly in the cloud
- You are a platform or service provider that wishes to improve the guarantees about your security posture and confirm to partners/customers that their data remains confidential even when passing through your services
- You are a company that needs to comply with customer data and privacy regulations (e.g., GDPR, CCPA) while still being able to extract value from customer data
- You are one of several entities belonging to some group or consortium that wants to collaborate to perform joint computation without any party revealing its private data
Ask your CISO if secure enclaves are right for you!
Despite the clear security advantages of secure enclaves, there are some key concerns that must be addressed before secure enclaves become the de facto compute standard for cloud workloads. The ecosystem around secure enclaves is still nascent and migrating existing applications and processes for use with secure enclaves may incur overhead. Software vendors considering the use of secure enclaves should ask themselves or their CISOs the following questions:
- Do the workloads and applications that we want to secure require the use of specialized hardware beyond CPUs for performing computation?
  - Secure enclave implementations are currently limited to CPUs; aside from a few academic works that have attempted to extend secure enclaves to GPUs and other PCI devices, enclave support on specialized hardware is not yet available. This makes it difficult to run certain types of workloads, for example, training neural networks.
- Do we have the engineering bandwidth to migrate existing applications to use a new SDK and (potentially) be rewritten in a different programming language?
  - Some secure enclave implementations, such as Intel SGX, require rewriting applications by partitioning the code into secure and insecure parts. Any code that lives outside the TCB is only able to interact with enclave code via a narrow interface. Sometimes, performing such partitioning can be very difficult.
  - Some enclave implementations currently only support applications written in specific languages.
- Will application users tolerate slower performance in exchange for stronger security guarantees?
  - Memory limitations, post-verification, and similar steps add computational overhead, which can introduce latency for the client.
- Do we need to worry about enclave memory limitations?
  - The size of the memory available for secure enclave computation affects what kinds of workloads can be performed and how efficient the computation will be compared to an unsecured version of the workload.
  - Memory limitations are not a concern for the AMD-based enclaves or for AWS Nitro Enclaves, but the Enclave Page Cache (EPC) in Intel processors has previously been limited to 256 MB, which restricted the amount of data that could be processed in memory at a time. Recently, however, Intel announced its 3rd Gen Xeon Processors, which increase the EPC size from 256 MB in previous Xeon processor generations to anywhere from 8 GB to 1 TB depending on the processor SKU.
- What level of security is needed?
  - Hardware-based enclaves such as SGX allow for partitioning an application into trusted and untrusted parts, which allows for minimizing the TCB to reduce the potential attack surface; however, this requires rewriting applications using SGX-compatible SDKs. In recent years, several side-channel vulnerabilities have also been found for hardware-based enclaves [7].
  - Secure enclaves that isolate an entire virtual machine instance drastically reduce the amount of application code that needs to be modified to make the app compatible with the enclave, but under the assumption of a malicious hypervisor, there is a much larger attack surface present and a variety of attacks become possible [8].
- Do we want to lock ourselves to a single cloud?
  - Using AWS Nitro Enclaves is only possible on AWS alongside EC2 instances, whereas Google Cloud Confidential VMs only support AMD processors. Choosing which secure enclave implementation to adopt may require locking yourself to a specific cloud.
  - There are several implementations of secure enclaves out there, each with different SDKs and varying threat models. It is unclear if a single approach will prevail or if the variety in application requirements will allow for multiple secure enclave implementations to coexist. Some clouds only support specific implementations, so unless a unifying secure enclave framework emerges or all clouds add support for multiple secure enclave implementations, application developers may need to implement a separate version of their application for each cloud, which seems like an impractical solution.
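The partitioning question above is easier to weigh with a concrete shape in mind. The following is a minimal sketch, assuming a hypothetical application that signs incoming messages: the class `TrustedPart` stands in for code that would live inside the enclave, and `untrusted_host` for everything outside the TCB. Python itself provides no isolation, and none of these names come from a real enclave SDK; the sketch only illustrates what a narrow interface between the two parts looks like.

```python
import hashlib
import hmac


class TrustedPart:
    """Stand-in for code running inside the enclave (the TCB).

    Assumption: in a real enclave the key below would be unreachable
    from outside; here the privacy is by convention only.
    """

    def __init__(self, secret_key: bytes):
        self._secret_key = secret_key   # never returned to callers

    def sign(self, message: bytes) -> str:
        """The single, narrow entry point exposed to untrusted code."""
        return hmac.new(self._secret_key, message, hashlib.sha256).hexdigest()


def untrusted_host(enclave: TrustedPart, request: bytes) -> dict:
    """Untrusted side: parsing, I/O, and logging stay outside the TCB."""
    message = request.strip()
    return {"message": message, "signature": enclave.sign(message)}


if __name__ == "__main__":
    enclave = TrustedPart(secret_key=b"demo-secret")
    print(untrusted_host(enclave, b"  transfer 10 units \n"))
```

The difficulty mentioned above shows up when an existing application has secrets threaded through many functions: every one of those code paths has to be moved behind an interface as small as `sign`.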
As enterprises migrate from on-premises to cloud environments, there is a need for additional confidentiality guarantees. Many industries/use-cases can benefit from the adoption of secure enclaves:
- Financial Services
- Edge computing
- Collaborative computing
The idea of collaborative/multiparty computation existed long before secure enclaves, but until now, the techniques for implementing such computation relied either on purely cryptographic approaches, which are still far too inefficient for anything too complex, or on delegating computation to a trusted third party, which is still less than ideal. With a secure enclave environment, parties can finally collaborate efficiently by contributing individual data to some larger computation without other parties or any third parties ever learning any private data.
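As a rough illustration of that collaboration pattern, the sketch below simulates an enclave that accepts private inputs from several parties and releases only an aggregate. It is an assumption-laden toy: `SimulatedEnclave`, `submit`, and `joint_mean` are hypothetical names, Python offers no actual memory isolation, and a real deployment would also attest the enclave and encrypt inputs in transit.

```python
from statistics import mean


class SimulatedEnclave:
    """Toy stand-in for an enclave running an agreed-upon joint computation."""

    def __init__(self, min_parties: int = 2):
        self._inputs = {}               # private per-party data, never returned
        self._min_parties = min_parties

    def submit(self, party: str, values: list) -> None:
        """Each party contributes its own data."""
        self._inputs[party] = list(values)

    def joint_mean(self) -> float:
        """Release only the aggregate, and only once enough parties joined."""
        if len(self._inputs) < self._min_parties:
            raise ValueError("not enough parties have contributed")
        all_values = [v for vals in self._inputs.values() for v in vals]
        return mean(all_values)


if __name__ == "__main__":
    enclave = SimulatedEnclave()
    enclave.submit("bank_a", [10, 20])
    enclave.submit("bank_b", [30, 40])
    print(enclave.joint_mean())
```

Requiring a minimum number of contributors is one simple design choice that keeps the aggregate from trivially revealing a single party's data.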
At Opaque, we’re building a first-of-its-kind analytics platform that uses secure enclaves to provide a confidential and collaborative environment that makes it easy for organizations to take their existing data and jointly perform secure analytics and machine learning on it with others without anyone (except the data owners) ever seeing the data.