Your Guide to the EU AI Act

Explore the EU AI Act's impact on AI systems and data governance in 2024.

Introduction

I took advantage of the winter break to read the 108 pages of the EU AI Act published in December 2023. Alright, this is a lie. But I read enough about it to know it will mark a sharp change for AI systems & data governance in 2024. If you're unsure whether this article concerns you, take this quiz to help you decide.

  1. Are you in the business of launching AI systems in the EU market?
  2. Does your organization, while not in the EU, use AI systems that affect stakeholders within the EU?
  3. Are you involved in deploying AI systems within the boundaries of the EU?

Think of the EU AI Act like the GDPR: it's an EU law, but its reach goes well beyond the EU's borders. If you answered yes to any of the questions above, this article likely has something in store for you.

In this piece, we’ll dive into the following:

  1. What is the EU AI Act, and Why Should You Care?
  2. Identifying High-Risk AI Systems
  3. Crafting Your Plan to Stay on the Right Side of Compliance

Let's dive in!

I - What is the EU AI Act?

A - Understanding the EU AI Act

EU lawmakers reached a provisional agreement on the EU AI Act on December 8, 2023. It regulates the use and deployment of AI systems in the European Union. The Act marks a vital step in the EU's journey towards harmonizing AI regulation across the single market. If 2023 was all about ChatGPT and AI blowing up everywhere, 2024 is all about getting these systems under control with some rules.

The EU AI Act works by categorizing AI systems into distinct levels based on the potential risks they pose to society.

Before diving into the different levels of risk, let’s start with the basics. The Act defines an "AI system" as:

"a machine-based system that, for explicit or implicit objectives, infers, from the input it receives, how to generate outputs such as predictions, content, recommendations, or decisions that can influence physical or virtual environments. Different AI systems vary in their levels of autonomy and adaptiveness after deployment." - EU Artificial Intelligence Act, 2023.

Three categories of risk in the EU AI Act - Image courtesy of CastorDoc

Categorizing AI systems based on risk, the EU AI Act identifies:

Prohibited AI Systems: These applications pose unacceptable risks to safety, security, and fundamental rights within the European Union and are banned outright. Examples include AI for social scoring and emotion recognition in the workplace.

High-Risk AI Systems: Subject to stringent compliance within the EU, these systems are vital in sensitive sectors such as healthcare, transportation, and law enforcement. We’ll dive into high-risk systems in more detail in part II.

Minimal Risk AI Systems: These systems, like AI chatbots or AI-generated content, pose limited risks and thus face minimal regulatory requirements under the Act.

When will the Act be implemented?

The AI Act is expected to be approved by the European Parliament and Council and published in the Official Journal at some point during Q2 or Q3 of 2024, after which it will come into force. As an EU regulation (as opposed to a directive), it will be directly effective in Member States without the need for local enabling legislation. The compliance timeline is detailed in the table below.

EU AI Act Implementation calendar - Image courtesy of CastorDoc

B - Why is it important?

The Act's entry into force is a concern for most companies developing AI systems. It applies to AI systems used within the European Union, regardless of where they are developed, making it a concern for global companies.

Although it’s an initiative developed by the European Union, the Act also sets a precedent in AI legislation globally. It outlines a comprehensive approach that could influence AI regulations worldwide, extending well beyond the member states of the EU.

Last but not least, companies that fail to comply with the regulations will be penalized. As you'll notice in the information below, the penalties involved are too significant to overlook.

Penalties in case of non-compliance with the EU AI Act - Image courtesy of CastorDoc

II - High-risk AI systems under the EU AI Act

A - Determining the High-Risk Status of AI Systems

The EU AI Act is especially important for you if your organization is building high-risk AI systems. This is where the regulation is the most binding, which is why we have chosen these systems as the main focus of our article.

The first step is identifying what makes an AI system high-risk. This will be the focus of this section.

The EU AI Act, employing a risk-based approach, specifies which AI systems are considered high-risk in Annex II and Annex III. The European Commission can update these lists to reflect new technological developments and risks. To make it simple, high-risk AI systems fall into two categories: AI systems used as safety components in EU-regulated products, and AI systems bearing health, safety, or fundamental rights risks. The image below illustrates the criteria used to determine high-risk AI systems.

Determining the high-risk status of AI systems - Image courtesy of CastorDoc

On the flip side, your system is not considered high-risk if it meets these conditions:

  1. Performs specific, simple tasks without impacting safety or security directly.
  2. Works to improve human performance as a support tool.
  3. Detects inconsistencies in patterns without actively shaping decision-making processes.
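The criteria above can be sketched as a first-pass triage helper. This is a minimal illustration, not legal advice: the class and field names are hypothetical labels for the two high-risk categories and the three conditions listed in this article, and we assume here that meeting any one condition is enough to warrant a closer legal review rather than an automatic exemption.

```python
from dataclasses import dataclass


@dataclass
class AISystemProfile:
    """Hypothetical self-assessment answers for one AI system."""
    name: str
    is_safety_component: bool = False    # safety component of an EU-regulated product
    bears_rights_risk: bool = False      # bears health, safety, or fundamental rights risks
    narrow_simple_task: bool = False     # condition 1 in the list above
    supports_human_work: bool = False    # condition 2 in the list above
    only_detects_patterns: bool = False  # condition 3 in the list above


def triage(profile: AISystemProfile) -> str:
    """Return a rough first-pass label; a lawyer should confirm the result."""
    in_high_risk_category = profile.is_safety_component or profile.bears_rights_risk
    if not in_high_risk_category:
        return "not high-risk"
    # Assumption: any single condition flags the system for review,
    # it does not settle the question on its own.
    may_be_exempt = (
        profile.narrow_simple_task
        or profile.supports_human_work
        or profile.only_detects_patterns
    )
    if may_be_exempt:
        return "possible exemption, needs legal review"
    return "high-risk"


print(triage(AISystemProfile("cv-screening", bears_rights_risk=True)))
```

A helper like this is only useful for sorting an inventory of systems into "look at this first" buckets; the actual determination depends on the final text of the Act and its annexes.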

A few words on your obligations as a provider of high-risk systems: there is a lot you need to comply with, and this is what we will cover in the next section.

III - Your Compliance Action Plan: Getting Ready for the EU AI Act

“For high-risk AI systems, the requirements of high quality data, documentation and traceability, transparency, human oversight, accuracy and robustness, are strictly necessary to mitigate the risks to fundamental rights and safety posed by AI” - EU AI Act, Article 2.3.

As we edge closer to the anticipated Q2-Q3 2024 rollout of the EU AI Act, companies must start preparing for compliance. The question is: what's the most effective way to do this?

We have a threefold answer to this question: data governance, data governance, data governance.

Joking aside, data governance should be the backbone of your EU AI Act compliance strategy. Data is at the heart of every AI system. It is the raw material that fuels AI, the training ground where AI sharpens its capabilities, and often, the end product that AI delivers.

Effective data governance involves managing data meticulously from its initial collection to its eventual disposal. Practically speaking, this means ensuring the data's high quality, keeping thorough documentation, and upholding stringent privacy standards. We will explore each of these areas in depth, examining the EU AI Act's requirements and outlining steps to achieve compliance.

A - Ensuring Data Quality and Integrity

“High-quality training, validation, and testing data sets require the implementation of appropriate data governance and management practices. Training, validation, and testing data sets should be sufficiently relevant, representative, free of errors and complete in view of the intended purpose of the system” - EU AI Act, section 44.

The cornerstone of any AI system lies in the quality and integrity of its training data. To align with the EU AI Act, the primary focus should be on ensuring that the data powering your AI systems is accurate, unbiased, and representative of its intended use.

To achieve this, you need to have the right processes and tools in place.

On the tooling side, look for systems that help you monitor the origin and the quality of the data. Many data catalogs provide indicators or scores that reflect the quality of data. These can be based on factors like freshness, completeness, or consistency, helping users select the highest quality data for their AI models. Data observability tools should help you achieve the same goal.

This approach not only ensures compliance with the EU AI Act but also enhances the overall reliability and effectiveness of your AI applications.

Data Quality issues flagged in CastorDoc
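To make indicators like completeness and freshness concrete, here is a minimal sketch of how they could be computed by hand. The function names, the sample rows, and the 30-day freshness window are our own assumptions for illustration; this is not how any particular catalog or observability tool calculates its scores.

```python
from datetime import datetime, timedelta, timezone


def completeness(rows, required_fields):
    """Share of required cells that are filled (neither None nor empty string)."""
    total = len(rows) * len(required_fields)
    if total == 0:
        return 0.0
    filled = sum(
        1
        for row in rows
        for field_name in required_fields
        if row.get(field_name) not in (None, "")
    )
    return filled / total


def is_fresh(last_updated, max_age, now=None):
    """True if the data set was updated within the allowed age window."""
    now = now or datetime.now(timezone.utc)
    return now - last_updated <= max_age


# Hypothetical training rows with two gaps out of eight required cells.
rows = [
    {"age": 34, "income": 52000},
    {"age": None, "income": 48000},
    {"age": 41, "income": ""},
    {"age": 29, "income": 61000},
]

print(completeness(rows, ["age", "income"]))  # 0.75
print(
    is_fresh(
        datetime(2024, 1, 1, tzinfo=timezone.utc),
        timedelta(days=30),
        now=datetime(2024, 1, 20, tzinfo=timezone.utc),
    )
)  # True
```

Checks this simple will not tell you whether data is representative or unbiased, which is where the human review discussed next comes in.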

Although tools are important to ensure data quality & integrity, don’t forget to include human oversight when preparing training data. This oversight plays a key role in verifying that the data is free from bias and accurately represents the intended scenarios. Human intervention in reviewing and validating the data adds an essential layer of scrutiny, providing a more nuanced understanding of the data’s context and potential biases.

This approach isn't just our advice: the EU AI Act itself requires human oversight to complement efforts to ensure data quality.

B - Comprehensive Documentation & Traceability

“Requirements should apply to high-risk AI systems as regards the quality of data sets used, technical documentation and record-keeping, transparency and the provision of information to users” - EU AI Act, section 44.

We’ve always been convinced of the vital importance of documentation & traceability of data. It helps stakeholders find and understand the data, eliminating some inefficiencies associated with data teams. In 2024, these two aspects will also become the cornerstone of compliance with the EU AI Act.

Documentation under the EU AI Act means keeping a comprehensive record of your data processes, from the data's sources all the way to the AI system.

Documentation starts with the raw data. Your data should be well-documented and its collection method should be clear. This part of the documentation can be tackled with a data catalog tool. The good news is that most of it can be automated. By maintaining comprehensive records of the data, including its origins, structure, and modifications, data documentation helps in establishing transparency around the data inputs into AI systems. This transparency is vital for understanding how the data might influence AI decision-making.

Example of raw data documentation in CastorDoc
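For teams that want to picture what a record of origins, structure, and modifications might contain, here is a minimal sketch. The `DatasetRecord` class, its fields, and the sample values are all hypothetical; a data catalog would normally capture most of this automatically, and the Act does not prescribe this particular shape.

```python
import json
from dataclasses import asdict, dataclass, field


@dataclass
class DatasetRecord:
    """Hypothetical documentation entry for one raw data set."""
    name: str
    source: str             # where the data comes from
    collection_method: str  # how it was collected
    schema: dict            # column name -> type or description
    modifications: list = field(default_factory=list)

    def log_change(self, date, description):
        """Append a dated note describing a change made to the data."""
        self.modifications.append({"date": date, "description": description})


record = DatasetRecord(
    name="loan_applications_raw",
    source="internal CRM export",
    collection_method="online application form",
    schema={"applicant_age": "integer", "income": "decimal"},
)
record.log_change("2024-01-15", "Removed rows with missing applicant_age")

# Serialize the record so it can be stored or shared with an auditor.
print(json.dumps(asdict(record), indent=2))
```

Keeping the change log next to the schema means the history of a data set travels with its description instead of living in someone's memory.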

However, to ensure full compliance with the Act, your documentation process must extend beyond the raw data. It's crucial to also document the internal workings of your AI systems, including algorithms, logic, and decision-making processes. This documentation should encompass the operational deployment, monitoring, and performance evaluation of the AI systems, often necessitating separate records within the AI development and deployment environment.

C - Privacy and Security Compliance

“This proposal contains certain specific rules on the protection of individuals with regard to the processing of personal data” - EU AI Act, section 2.1.

Under the EU AI Act, it's essential to weave strong privacy and security practices into your data governance strategy. The Act aligns with privacy laws like the GDPR, especially for sensitive personal data, emphasizing the need for robust security to protect data integrity and privacy.

The explosion of AI systems translates to more individuals, including AI system developers, handling sensitive data.

The first step in creating secure AI systems is to implement strict access controls, so that only authorized personnel can access sensitive data. Your governance tool should have solid role-based access control features and automated PII tagging, which marks tables containing personally identifiable information.
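As a rough illustration of how PII tagging and role-based access control fit together, consider the sketch below. The column-name pattern, the role names, and the permission table are assumptions made up for this example; real governance tools typically rely on richer detection than matching column names.

```python
import re

# Assumption: column names hint at PII. This pattern is illustrative, not exhaustive.
PII_PATTERN = re.compile(r"(email|phone|ssn|birth|address|first_name|last_name)", re.IGNORECASE)

# Hypothetical roles and whether each may read tables that contain PII.
ROLE_CAN_READ_PII = {"data_steward": True, "ml_engineer": False}


def tag_pii_columns(columns):
    """Return the set of column names that look like personal data."""
    return {column for column in columns if PII_PATTERN.search(column)}


def can_read(role, table_columns):
    """Allow access unless the table holds PII and the role is not cleared for it."""
    has_pii = bool(tag_pii_columns(table_columns))
    # Unknown roles are denied access to PII tables by default.
    return not has_pii or ROLE_CAN_READ_PII.get(role, False)


customers = ["customer_id", "user_email", "date_of_birth", "order_total"]
print(sorted(tag_pii_columns(customers)))
print(can_read("ml_engineer", customers))
print(can_read("data_steward", customers))
```

Denying unknown roles by default is a deliberate choice here: it is safer for a new role to request access than for a forgotten one to have it.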

The second key step is to maintain thorough data lineage. Not only does this align with the EU AI Act’s transparency and traceability mandates, but it also leaves an audit trail. This is essential for demonstrating your compliance with the Act. Data lineage serves as a critical accountability tool. The EU AI Act demands that providers prove their adherence to its regulations. Without clear evidence of the data's journey and processes, you risk non-compliance and potential fines.

Data Lineage example in CastorDoc

Creating your data's lineage by hand is time-consuming. Using a data catalog or lineage tool is more efficient, as it automatically constructs your data assets' lineage. This offers a transparent and trackable record from origin to endpoint, which you can document and export into formats such as Excel, CSV, or PDF, enhancing your compliance measures.
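To show what a lineage tool is doing under the hood, here is a minimal sketch that walks a lineage graph upstream and exports it to CSV. The asset names and the edge list are hypothetical, and a real tool would build these edges automatically from query logs or pipeline metadata rather than from a hand-written list.

```python
import csv
import io

# Hypothetical lineage edges: (upstream asset, downstream asset).
LINEAGE = [
    ("crm.raw_contacts", "staging.contacts_clean"),
    ("staging.contacts_clean", "features.customer_profile"),
    ("web.raw_events", "features.customer_profile"),
    ("features.customer_profile", "model.churn_training_set"),
]


def upstream_of(asset, edges):
    """Return every asset that feeds into `asset`, directly or indirectly."""
    parents = {}
    for source, target in edges:
        parents.setdefault(target, []).append(source)
    seen = set()
    stack = [asset]
    while stack:
        node = stack.pop()
        for parent in parents.get(node, []):
            if parent not in seen:
                seen.add(parent)
                stack.append(parent)
    return seen


def lineage_to_csv(edges):
    """Serialize the edge list as CSV text for an audit file."""
    buffer = io.StringIO()
    writer = csv.writer(buffer)
    writer.writerow(["upstream", "downstream"])
    writer.writerows(edges)
    return buffer.getvalue()


print(sorted(upstream_of("model.churn_training_set", LINEAGE)))
print(lineage_to_csv(LINEAGE))
```

Being able to answer "which raw sources fed this training set?" with one call is the practical payoff of lineage when an auditor asks for evidence.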

Evaluate Your Compliance with the EU AI Act

It's crucial to understand where your organization stands in terms of compliance with the EU AI Act. To help with this process, we've developed an "AI Act Compliance Audit Template." This resource should guide you through a comprehensive audit of your AI systems, helping identify key areas of focus to ensure compliance with the new regulations. Download the AI Act Compliance Audit Template and start your journey towards full compliance.

Conclusion

The EU AI Act will change a lot of things for AI systems, but more importantly for data, which feeds these systems. Disregarding the Act's impact isn't an option, as it applies to most companies, and the penalties for non-compliance are substantial. If your priority is to ensure compliance once the Act enters into force, then it’s important to put data governance at the forefront. Starting in 2024, maintaining thorough documentation, clear data lineage, and robust privacy measures will transition from best practices to mandatory requirements for any AI system deployed.

If ensuring compliance in 2024 is on your agenda, consider the value of robust governance and data cataloging. These are not just tools but foundational elements that contribute to building reliable AI systems. Should you find yourself making this a priority, we invite you to chat with the team about how strategic governance and effective data cataloging can serve as cornerstones for your AI initiatives.
