Data Sharing Challenges: Privacy & Security Concerns

Navigating Privacy & Security when implementing data sharing

Data sharing can bring many benefits to a company but also comes with its own set of problems. Two major issues that companies often struggle with are Privacy & Security.

No one really likes talking about these topics. I'll be the first to admit that they're not the most exciting things to think about. But trust me, it's worth taking a few minutes to pay attention to them. It can help your company avoid multimillion-dollar fines.

Although data sharing brings tremendous benefits, it may seem at odds with Privacy & Security.

Data sharing is about giving business teams access to data to help them make data-driven decisions. Let’s recall the principles of data sharing:

  • Everyone should have access to the data they need, not just certain roles or job titles.
  • There should be no barriers preventing people from getting the data they need.
  • The data should be organized and structured in a way that makes it easy for anyone to access, understand, and use it.

Therefore, it’s natural to assume that Privacy & Security conflict with these principles. Privacy is about giving individuals control over their personal information. Security is about protecting it from unauthorized access. These two concepts may thus seem at odds with data sharing.

If you think you're not affected by Privacy & Security rules, think again. All companies processing personal data are. What's more, these rules are enforced by stringent regulations such as the General Data Protection Regulation (GDPR) in Europe and the California Consumer Privacy Act (CCPA), a state law in the United States.

According to laws like GDPR and CCPA, it's not enough to just comply: you have to be able to prove it! This is the Accountability principle. If you can't show that you're following the rules, you're deemed non-compliant. And as we know, non-compliance comes with a hefty price tag.

This article explores how to implement data sharing while avoiding the associated Privacy & Security risks. It is divided into three sections:

  1. Addressing Privacy risks and how to manage them.
  2. Security concerns in data sharing and how to mitigate them.
  3. Demonstrating compliance and the accountability principle.

Managing privacy and security risks can be achieved by implementing a few key strategies. To protect personal information, it is important to invest in proper documentation and establish clear data sharing agreements. Additionally, implementing access controls and adhering to data minimization practices can help to mitigate security risks and ensure the safety of sensitive information. We'll take a closer look at these solutions later in the article, but for now, you can get a glimpse of them in the image below.

Ensuring compatibility between Data Sharing, Privacy, and Security — Image from Castor

1. Privacy

When sharing our Personally Identifiable Information (PII), it's only natural to want to keep it private. Privacy is all about having control over what information is shared, who it's shared with, and why.

Personal information includes things like our names, Social Security Numbers, emails, mailing addresses, and IP addresses. It's important to keep this kind of data under wraps to protect ourselves from intrusions that range from inconveniences (like spam advertisements) to real threats (like identity theft).

The most widely recognized Privacy rule in GDPR and CCPA is known as purpose limitation. According to this rule, you should process PII for specified, explicit, and legitimate purposes. You must communicate these purposes to the data subject before collecting the data.

This principle ensures the data collected is always used for its specified purpose.

Let's say that you are a retailer collecting customers' addresses for product delivery. Under the purpose limitation principle, you can only use this data for product delivery. You have no right to use it for another purpose, such as a marketing campaign.

When it comes to sharing data, how can we make sure it's only being used for its intended purpose? When the data is made accessible to a wider audience, it can be difficult to keep track of how it is being used. Oftentimes, stakeholders are unaware of the specific reason why the data was collected. Without this knowledge, it can be hard to follow the rules.

Additionally, data sharing increases the number of potential points of exposure for PII data. This opens the door to potential Privacy violations, such as identity theft, and loss of control over personal information. The more you open access to data, the more opportunities there are for stakeholders to use it for nefarious purposes.

Solution: Documentation & Data Sharing Agreements

Before going into the solution part, it is important to note that data sharing doesn't mean unrestricted access to PII data. PII data should only be exposed to the people who need to see it. We will discuss management of access controls later, in the section of this article that covers security concerns.

This section is about dealing with Privacy concerns, and ensuring those who have access to the data use it for the intended purpose.

Now the solution is… 🥁 documentation. Proper documentation of PII data is a crucial step in ensuring that it is handled in an ethical and legal way.

Documentation involves identifying PII data and flagging it in the database. You should then specify the purpose for which the data was collected, and the specific use to which you will put it.

Enriching each PII field with the right context ensures everyone is aware of its purpose. Various teams accessing the data can thus use it lawfully under the purpose limitation principle.

Let's say you have a column labeled "email address" in a dataset. For this column, it is important to include a detailed explanation of how the data should be used. This might include a statement such as: "email address, to use only for product delivery"

This ensures stakeholders use the data for the intended purpose and not for any other unauthorized activities.
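To make the idea concrete, here is a minimal Python sketch of what a documented PII column could look like, together with a check against its documented purpose. The `ColumnDoc` structure, the purpose labels, and the `is_use_permitted` helper are all hypothetical names chosen for illustration. They are not the format of any particular catalog tool.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class ColumnDoc:
    """Documentation entry for one column (hypothetical structure)."""
    name: str
    description: str
    is_pii: bool = False
    # Purposes communicated to the data subject when the data was collected.
    allowed_purposes: frozenset = field(default_factory=frozenset)


def is_use_permitted(column: ColumnDoc, purpose: str) -> bool:
    """Return True if `purpose` matches one of the documented purposes.

    In this sketch, columns not flagged as PII are not restricted by purpose.
    """
    if not column.is_pii:
        return True
    return purpose in column.allowed_purposes


email = ColumnDoc(
    name="email_address",
    description="Customer email address, to use only for product delivery",
    is_pii=True,
    allowed_purposes=frozenset({"product_delivery"}),
)

print(is_use_permitted(email, "product_delivery"))    # True
print(is_use_permitted(email, "marketing_campaign"))  # False
```

A check like this is only as good as the documentation behind it, which is why flagging PII and recording its purpose comes first.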

Flagging PII information in your documentation tool - Image courtesy of Castor

It is vital to make the documentation accessible and digestible for business teams. Data sharing is about empowering business teams with the data. They should understand how they can lawfully use the data to support their goals.

To ease this, it's important to document the data in an interface that is accessible to business teams. The documentation should thus be in the tools that business teams are familiar with. You cannot expect business teams to dive into technical tools to fetch the information.

A data catalog can help teams use data in a compliant way. It's a central place where important information about the data is stored and easy to access through a user-friendly interface. This includes:

  • Definition of the data
  • Where it came from
  • How it's used
  • Who can access it

When designed with a user-friendly interface, it makes it simple for business teams to search, locate and understand the data. This can include features like search functionality, lineage diagrams, and definitions. This helps stakeholders use the data in compliance with Privacy regulations.

Once you've got your business teams all set up with easy access to well-documented data, another way to keep data usage compliant is to set up a Data Sharing Agreement (DSA). A DSA is a legally binding contract that lays out all the terms and conditions for how data will be shared and used.

It outlines what types of data will be shared, why it's being shared, and how it will be protected. It also lays out everyone's responsibilities, including any limits on how the data can be used, and what happens if things don't go as planned. These agreements are used all the time in research, business, and government. They're a great way to make sure everyone's playing by the rules and using data for its intended purpose.

2. Security

Security is about the measures implemented to protect personal information. PII data needs protection from unauthorized access, use, disclosure, disruption, modification, or destruction.

One of the most critical Security rules in GDPR is the integrity and confidentiality principle. It states that personal data must be protected against unauthorized access, alteration, or destruction.

Implementing data sharing is like opening the floodgates to a wide range of potential threats, such as hacking or malware. The more people who have access to the data, the more opportunities there are for unauthorized parties to access it. Plus, when data is shared, it may also be stored in multiple locations, making it more difficult to keep an eye on.

Even if a company's IT system is like Fort Knox, data sharing can still pose a Security risk. This is because while a robust IT system may be able to withstand external threats, it may not be able to prevent internal threats, such as insider breaches. And let's not forget about human error - an employee accidentally sending sensitive data to the wrong person can be just as dangerous.

Sharing data can be tricky business. The more people who have access to it, the more potential weak spots there are in your company's systems. But it doesn't have to be all doom and gloom. Even with more eyes on the data, it's still possible to keep it safe and sound while staying compliant with regulations. It's just a matter of taking the right steps to protect the data.

Solutions: Access Controls & Data Minimization

When it comes to sharing data, it's all about striking the right balance between access and security. On one hand, you want to make sure that the right people have access to the information they need to do their jobs, but on the other hand, you don't want to leave the door wide open for just anyone to come strolling in. That's where access controls and data minimization come in.

Access controls are about making sure that only the right people have access to the data. This can include things like setting up user roles with different levels of access, using authentication methods like passwords, and monitoring who is accessing the data and when. By putting these controls in place, you can be sure that only the people who are supposed to be looking at the data are able to see it.
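As a rough illustration, the sketch below combines two of the ideas above: user roles with different levels of access, and a record of who requested what and when. The role names, dataset names, and in-memory `access_log` are assumptions made for the example. A real setup would rely on your warehouse's or platform's own permission system and durable logging.

```python
from datetime import datetime, timezone

# Hypothetical roles mapped to the datasets they may read.
ROLE_PERMISSIONS = {
    "support_agent": {"orders", "customers"},
    "marketing_analyst": {"orders"},
}

# In-memory log for the example; a real system would use durable storage.
access_log = []


def can_read(role: str, dataset: str) -> bool:
    """Check whether a role is allowed to read a dataset."""
    return dataset in ROLE_PERMISSIONS.get(role, set())


def request_access(user: str, role: str, dataset: str) -> bool:
    """Decide on a read request and record the decision for later review."""
    granted = can_read(role, dataset)
    access_log.append({
        "user": user,
        "role": role,
        "dataset": dataset,
        "granted": granted,
        "at": datetime.now(timezone.utc).isoformat(),
    })
    return granted


print(request_access("alice", "support_agent", "customers"))    # True
print(request_access("bob", "marketing_analyst", "customers"))  # False
print(len(access_log))                                          # 2
```

Note that denied requests are logged too. Knowing who tried to reach a dataset can be as useful as knowing who succeeded.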

Data minimization is another key part of the puzzle. It's all about keeping the amount of data shared to a minimum. Instead of sharing everything you've got, take a step back and think about what information is truly essential for each team to do its job. In most cases, you can remove or mask PII columns in a dataset without hurting the analyses stakeholders need to run. By sharing only the data that is essential, you keep the amount of PII that's floating around to a minimum.
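Here is one possible way to sketch this in Python: keep only the columns a team asks for, and mask any PII column that still makes the cut. The column classification, the `minimize` helper, and the salted-hash masking are illustrative assumptions, not a prescribed method.

```python
import hashlib

# Hypothetical classification of columns in a customer dataset.
PII_COLUMNS = {"name", "email", "address"}


def pseudonymize(value: str, salt: str) -> str:
    """Replace a value with a salted SHA-256 digest, truncated for readability."""
    return hashlib.sha256((salt + value).encode("utf-8")).hexdigest()[:12]


def minimize(rows, needed_columns, salt="demo-salt"):
    """Keep only the columns a team needs and mask any PII that remains."""
    result = []
    for row in rows:
        slim = {}
        for col in needed_columns:
            value = row[col]
            # PII columns are masked; everything else is passed through as-is.
            slim[col] = pseudonymize(str(value), salt) if col in PII_COLUMNS else value
        result.append(slim)
    return result


customers = [
    {"name": "Ada Lovelace", "email": "ada@example.com", "address": "1 Main St",
     "country": "UK", "order_total": 120.0},
]

shared = minimize(customers, needed_columns=["email", "country", "order_total"])
print(sorted(shared[0].keys()))  # ['country', 'email', 'order_total']
```

Keep in mind that hashing is pseudonymization, not anonymization. Under GDPR, pseudonymized data still counts as personal data, so dropping a column entirely is the safer choice whenever a team doesn't need it.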

When used together, access controls and data minimization can help you share data with more people while still keeping it secure and compliant with Security regulations. Together they can keep your data safe and sound while still making it available to the people who need it.

The best way to put this into practice is by using a data-sharing platform. Think of it like a virtual "filing cabinet" where you can store and share your data with the right people. These platforms often come with built-in access controls, so you can be sure that only the people who are supposed to have access to the data can see it. Plus, they often have robust Security measures in place to keep your data safe from falling into the wrong hands.

A data-sharing platform that handles both access controls and data minimization works a bit like a combination lock: the data stays protected, yet it remains available to the people who need it. A win-win situation for everyone.

3. Accountability: How to Prove Compliance?

As stated earlier, if you can't show that you're following the rules, then you're considered to be breaking them. This is the basic idea behind the accountability principle in data regulations like GDPR and CCPA. Being accountable means being able to prove that you're following all the regulations and keeping personal data safe.

The accountability principle states that companies must be able to demonstrate that they have the appropriate technical and organizational measures in place to meet their obligations under the regulation.

Imagine you're an organization and the regulatory authority is conducting an audit to check who has been accessing sensitive data and what they have been doing with it. Without the proper tools in place, you'll be left scratching your head, trying to figure out how the data has been used.

Two features that can come in extremely handy when it comes to accountability are data lineage and query history. These features can be found in a modern data catalog, which can help you keep track of data access and usage, and quickly identify any potential Security risks.

Data lineage, also known as data genealogy, is the ability to trace the origin and movement of data throughout its lifecycle. It allows you to see where data came from, where it's been, and where it's going. This means you can easily track down every table, report, and dashboard that sensitive data ends up in, which tells you where to look when checking who has been accessing it and what they did with it.

Using data lineage to monitor data usage - Image courtesy of Castor
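Under the hood, lineage can be thought of as a graph of assets. The sketch below assumes a hypothetical `LINEAGE` mapping from each asset to the assets built from it, and walks that graph to list everything derived from a source containing PII. The asset names are invented for the example.

```python
from collections import deque

# Hypothetical lineage edges: each asset maps to the assets built from it.
LINEAGE = {
    "raw.customers": ["staging.customers"],
    "staging.customers": ["analytics.customer_orders", "analytics.churn_features"],
    "analytics.customer_orders": ["dashboard.weekly_sales"],
    "analytics.churn_features": [],
    "dashboard.weekly_sales": [],
}


def downstream_assets(source: str, lineage: dict) -> list:
    """Breadth-first traversal returning every asset derived from `source`."""
    seen = set()
    order = []
    queue = deque(lineage.get(source, []))
    while queue:
        asset = queue.popleft()
        if asset in seen:
            continue  # already visited through another path
        seen.add(asset)
        order.append(asset)
        queue.extend(lineage.get(asset, []))
    return order


print(downstream_assets("raw.customers", LINEAGE))
# ['staging.customers', 'analytics.customer_orders',
#  'analytics.churn_features', 'dashboard.weekly_sales']
```

During an audit, a list like this shows every place where the PII in `raw.customers` may have travelled.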

The query history is like a digital paper trail for your data. It keeps a record of every query that's been run on your data, including who ran the query and when. This means you can see exactly who's been searching for what and when.
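As a small sketch, suppose the query history is available as a list of records. The record fields (`user`, `ran_at`, `tables`, `sql`) and the `who_queried` helper are assumptions for illustration; real warehouses and catalogs expose this information in their own formats.

```python
from datetime import datetime

# Hypothetical query-history records.
QUERY_HISTORY = [
    {"user": "alice", "ran_at": datetime(2023, 1, 10, 9, 30),
     "tables": ["analytics.customer_orders"],
     "sql": "SELECT email FROM analytics.customer_orders"},
    {"user": "bob", "ran_at": datetime(2023, 1, 12, 14, 5),
     "tables": ["analytics.churn_features"],
     "sql": "SELECT * FROM analytics.churn_features"},
    {"user": "carol", "ran_at": datetime(2023, 1, 8, 16, 45),
     "tables": ["analytics.customer_orders", "analytics.churn_features"],
     "sql": "SELECT o.email FROM analytics.customer_orders o "
            "JOIN analytics.churn_features c ON o.customer_id = c.customer_id"},
]


def who_queried(table, history, since=None):
    """Return (user, timestamp) pairs for queries touching `table`, oldest first."""
    hits = [
        (q["user"], q["ran_at"])
        for q in history
        if table in q["tables"] and (since is None or q["ran_at"] >= since)
    ]
    return sorted(hits, key=lambda hit: hit[1])


for user, ran_at in who_queried("analytics.customer_orders", QUERY_HISTORY):
    print(user, ran_at.isoformat())
# carol 2023-01-08T16:45:00
# alice 2023-01-10T09:30:00
```

The optional `since` argument narrows the search to a given period, which is handy when an auditor asks about a specific time window.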

With these features in place, you'll be able to demonstrate accountability and compliance with regulations, because you'll have a clear understanding of who's been accessing your data and what they've been doing with it. It's like having a data Security guard on duty 24/7, keeping an eye on everything, and making sure everything is above board.


While data sharing can bring significant benefits to a company, it also comes with its own set of problems, particularly in regard to Privacy & Security.

Privacy is all about having control over what information is shared, who it's shared with, and why, while Security refers to protecting the data from unauthorized access, alteration, or destruction.

Managing privacy and security risks can be achieved by implementing a few key strategies. To protect personal information, it is important to invest in proper documentation and establish clear data sharing agreements. Additionally, implementing access controls and adhering to data minimization practices can help to mitigate security risks and ensure the safety of sensitive information. We have summarized this information in the image below.

Dealing with Privacy & Security — Image courtesy of Castor

Companies must comply with stringent regulations such as GDPR and CCPA to keep data safe and secure. The accountability principle also states that companies must be able to prove compliance.

However, with the right tools and practices in place, companies can effectively manage data sharing while respecting Security and Privacy rules.
