Five Ways to Analyze Sensitive Data

Three major trends are forcing analytics and AI teams to explore more secure ways to store, process and analyze sensitive data. First, a growing share of data is deemed sensitive and confidential, whether entire data sets or select fields within a data set. Second, regulatory and compliance standards for protecting people’s private data are becoming stricter. Third, the number of use cases that involve confidential and sensitive data is growing rapidly across industries, including banking, insurance, AdTech, healthcare, manufacturing supply chains, e-commerce and government. The quandary every analytics and AI team faces is how to securely store, process and analyze sensitive data while delivering critical insights to the business faster and complying with regulatory requirements.

So how are teams responding to these three trends? The good news is that innovation in this area has accelerated over the last 12-15 months. As a result, analysts and organizations are now in a position to overcome these hurdles and deliver insights on confidential and sensitive data faster. Here are the five techniques analytics and AI teams are employing effectively.

1 – Establishing What Data is Sensitive and Processing it in Encrypted Form

Many analytics teams hesitate to encrypt large amounts of data because they fear that analytics and machine learning will become less accessible to analysts who are otherwise well-versed in SQL, Python and popular ML/AI frameworks. The best way to start is by categorizing what data is considered sensitive (e.g. client names, account activity, social security numbers, medical records), what insights are considered sensitive, and which fields within existing data sets hold confidential information. Once that is done, the data can be accessed from disparate sources and stored in encrypted form. For ironclad security, however, the data should be processed in confidential Trusted Execution Environments (TEEs), which use secure enclaves to protect both data and code from unauthorized access, even if the compute infrastructure is compromised. The major public cloud vendors, including Microsoft Azure, Google Cloud Platform and Amazon Web Services (AWS), offer Confidential Computing environments that embed enclave technology.
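The categorization step can be sketched in a few lines of Python. This is a minimal illustration only, not how any particular platform does it: the field-name hints, the social security number pattern and the sample rows are all assumptions made up for the example, and a real catalog would come from the organization’s own data policy.

```python
import re

# Hypothetical field-name hints; a real list would come from the data policy.
SENSITIVE_NAME_HINTS = ("name", "ssn", "social_security", "account", "medical")

# U.S. social security numbers written as 123-45-6789.
SSN_PATTERN = re.compile(r"^\d{3}-\d{2}-\d{4}$")


def is_sensitive(field_name, sample_values):
    """Return True if the field name or any sampled value looks sensitive."""
    lowered = field_name.lower()
    if any(hint in lowered for hint in SENSITIVE_NAME_HINTS):
        return True
    return any(
        isinstance(value, str) and SSN_PATTERN.match(value)
        for value in sample_values
    )


def classify_fields(rows):
    """Label every field found in a list of dict rows as sensitive or public."""
    samples = {}
    for row in rows:
        for key, value in row.items():
            samples.setdefault(key, []).append(value)
    return {
        name: "sensitive" if is_sensitive(name, values) else "public"
        for name, values in samples.items()
    }


# Made-up sample rows for illustration.
rows = [
    {"client_name": "A. Example", "tax_id": "123-45-6789", "region": "EMEA"},
    {"client_name": "B. Example", "tax_id": "987-65-4321", "region": "APAC"},
]
labels = classify_fields(rows)
print(labels)
```

Fields labeled sensitive here would be the ones to store in encrypted form and process inside an enclave. The heuristics are deliberately simple and would miss many cases, so in practice they serve as a starting point for human review.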

2 – Technologies to Process Confidential and Highly Sensitive Data

Organizations are increasingly processing their confidential and sensitive data on confidential computing cloud platforms. These platforms meet compliance standards and encrypt data at rest and in transit. The encryption offered by enclave technology protects data from breaches and attacks and, most importantly, ensures that only authorized users can access and use the data. To share sensitive data securely between authorized users and teams, you’ll need to establish policies and adopt tools that enable secure, auditable data sharing.
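As a rough sketch of what an auditable access policy might look like, the Python below checks each request against a role-based policy and records every attempt, whether allowed or denied. The data set names, roles and users are hypothetical, and a production system would rely on the cloud platform’s own identity and logging services rather than an in-memory list.

```python
from datetime import datetime, timezone

# Hypothetical policy: which roles may read which data sets.
ACCESS_POLICY = {
    "claims_2023": {"fraud_analyst", "compliance_officer"},
    "patient_records": {"clinical_researcher"},
}

# In-memory audit trail for illustration; real systems need durable storage.
audit_log = []


def request_access(user, role, dataset):
    """Check the policy and record every attempt, allowed or denied."""
    allowed = role in ACCESS_POLICY.get(dataset, set())
    audit_log.append({
        "time": datetime.now(timezone.utc).isoformat(),
        "user": user,
        "role": role,
        "dataset": dataset,
        "allowed": allowed,
    })
    return allowed


print(request_access("dana", "fraud_analyst", "claims_2023"))
print(request_access("lee", "marketing", "patient_records"))
for entry in audit_log:
    print(entry["user"], entry["dataset"], entry["allowed"])
```

Logging denied requests alongside granted ones is the design choice that makes the trail useful for compliance reviews, since it shows who tried to reach sensitive data as well as who succeeded.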

3 – Performing Collaborative Analytics and Machine Learning on Sensitive Data

Performing analytics and machine learning on encrypted data has been complex and feasible only for highly specialized security experts. Until recently, analyzing data stored in confidential computing environments in specialized encrypted form required the data to be decrypted during analysis, which is highly problematic because it exposes the data to cyberattacks and malicious users. To prevent malicious use, the optimal way to perform analytics and machine learning on sensitive data is to analyze it in its encrypted form, so that all sensitive data is encrypted at rest, encrypted in transit and encrypted in use for analytics, machine learning and AI.

The massive organizational need for this level of data protection has led to the emergence of new software platforms that enable analysts and machine learning practitioners to securely analyze data at rest, in motion and during processing at scale, without ever having to decrypt the data. These platforms and tools form a new software category called ‘Confidential Analytics and AI’, commonly known as Confidential AI. One such platform originated with world-renowned computer scientists at UC Berkeley who created the open source technology called MC2. The same team founded Opaque Systems, which offers one of the leading Confidential Analytics and AI platforms.

4 – Enabling Secure Data Sharing Between Teams and Organizations

One of the major breakthroughs in Confidential Analytics and AI is secure data sharing between users, between teams within the same organization, and between companies working together on a use case such as detecting money laundering or drug research and discovery. In both the inter-company and intra-company scenarios, it is essential that data be shared only in encrypted form and that only selected fields be shared, for the sole purpose of performing analytics and AI to obtain insights pertinent to each party, while the original data remains protected and cannot be decrypted.
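The idea of sharing only certain fields can be illustrated with a short Python sketch. It covers field selection only: the party names, field names and record are hypothetical, and in the scenario described above the values themselves would stay encrypted and be processed inside an enclave, which this example does not attempt to show.

```python
# Hypothetical agreement: the fields each party may contribute to a joint
# money-laundering analysis. Everything else stays with the owner.
SHARED_FIELDS = {
    "bank_a": {"transaction_id", "amount", "counterparty_hash"},
    "bank_b": {"transaction_id", "amount", "counterparty_hash"},
}


def project_for_sharing(party, record):
    """Keep only the fields this party has agreed to share."""
    allowed = SHARED_FIELDS.get(party, set())
    return {key: value for key, value in record.items() if key in allowed}


# Made-up record for illustration.
record = {
    "transaction_id": "T-1001",
    "amount": 2500.0,
    "counterparty_hash": "9f2c",
    "customer_name": "A. Example",
    "home_address": "1 Example Street",
}
shared = project_for_sharing("bank_a", record)
print(shared)
```

A party with no entry in the agreement gets an empty record back, so the default is to share nothing.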

5 – Enforcing Compliance on Access, Data in Use, and Protecting Insights

One of the major benefits of Confidential Analytics and AI is its ability to enable collaborative analytics and machine learning on encrypted data. Confidential Analytics and AI platforms and tools enable users or entities to securely access encrypted data, execute queries and run machine learning models on it, and gain insights, all while keeping the data and the analytical outcomes private to each party. Many use cases across industries today mandate strict policies, governance and protection of insights derived from confidential and sensitive data.

In summary, accelerated innovation has produced great strides in making sensitive data more accessible to analytics and machine learning while protecting the data in encrypted form end-to-end: from the data sources, into encrypted format in enclaves, and across data at rest, in transit and in use. Further, new Confidential Analytics and AI platforms and tools now enable organizations to analyze this data in its encrypted form, securely share data between parties and generate insights that only authorized users can see, each limited to the insights pertinent to the data they own.

The Confidential Analytics and AI breakthrough has unlocked hundreds of use cases and accelerated insights for critical business priorities such as money laundering, fraud, multi-party supply chains, multi-party drug discovery and research, audience monetization, and analysis and AI on personal data. Today, more than $300 billion worth of locked-up data can now be safely accessed, processed and analyzed by analytics and AI teams.