Where Should Data Documentation Live?

Finding the perfect home for your documentation

Today, we embark on the thrilling finale of our "Data Documentation Demystified" series. In this series, we’ve journeyed through a lot of valuable lessons across different aspects of documentation:

  1. The significance of data documentation
  2. The different forms that documentation can take
  3. The most effective documentation framework
  4. The art of crafting meaningful documentation

The last element, where you keep your documentation, determines who uses it and how often. You can have the most polished data documentation ever, but if it’s in the wrong place, people will not use it.

While there's no one-size-fits-all answer, there are some essential considerations to remember during your search. From prioritizing your users' needs, to picking the right tool, to creating a documentation homepage – this article will be your friendly guide to finding the perfect approach.

Now, let’s uncover the best habitat for your data documentation to prosper!

I - Keep your readers in mind

When it comes to choosing the perfect home for your data documentation, keeping your users in mind is essential.

There are two primary types of data documentation users: information seekers and explorers. Information seekers are on a quest for specific information, while explorers prefer browsing through the documentation to get an overview of the data landscape. Understanding their needs will help you choose the right tools.

When it comes to thinking about the types of users for your documentation, it's easy to default to categorizing them by function, such as marketing, operations, or finance. However, this approach can overcomplicate things. Instead, we suggest using the more straightforward framework suggested above.

This framework is applicable across all functions and can help you understand how any user, regardless of their role, will engage with your documentation. By simplifying your approach, you can better tailor your documentation to meet the needs of your users.

Recognizing the type of user you aim to accommodate will assist you in selecting the most suitable tools in Section II.

Two types of documentation readers - Image courtesy of Castor

A - Search Engines: Equipping Information Seekers

Data consumers seeking specific information require an efficient method to find the necessary details. For these people, search engine capabilities can be a game-changer. Some advanced search engine capabilities include:

  1. Advanced filters: This functionality enables users to narrow down their search results with precision. By applying specific criteria, such as data type, date range, or other relevant attributes, users can quickly locate the information they're looking for. 🔬
  2. Ranking results by popularity: Data catalogs can prioritize search results based on their popularity, making it easier for users to identify the most relevant and useful resources. This feature highlights the information that other users have found valuable, so newcomers can benefit from the collective wisdom. ⭐⭐
  3. Regular expression (regex) search: Regular expressions are a powerful tool that allows users to search for specific patterns in data. With regex search, users can find all tables containing specific patterns, such as those ending with _xxx or starting with yyy_. This advanced feature is especially helpful when looking for specific data structures or naming conventions.
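To make these three capabilities concrete, here is a minimal sketch of a search that combines a filter, popularity ranking, and a regex over table names. The `Table` fields, the catalog entries, and the `search` function are all hypothetical and written for illustration; a real data catalog exposes these features through its own interface.

```python
import re
from dataclasses import dataclass


@dataclass
class Table:
    name: str
    schema: str
    popularity: int  # hypothetical usage score, e.g. a query count


# Hypothetical catalog entries, for illustration only.
CATALOG = [
    Table("stg_orders", "staging", 12),
    Table("stg_customers", "staging", 30),
    Table("orders_daily", "analytics", 85),
    Table("customers_daily", "analytics", 40),
]


def search(tables, pattern, schema=None):
    """Return tables whose name matches `pattern`, most popular first."""
    regex = re.compile(pattern)
    hits = [
        t for t in tables
        if regex.search(t.name) and (schema is None or t.schema == schema)
    ]
    return sorted(hits, key=lambda t: t.popularity, reverse=True)


# Names ending with "_daily", ranked by popularity.
daily = [t.name for t in search(CATALOG, r"_daily$")]
# Names starting with "stg_", filtered to the staging schema.
staging = [t.name for t in search(CATALOG, r"^stg_", schema="staging")]
print(daily)
print(staging)
```

The `$` and `^` anchors are what turn a plain substring match into an "ends with" or "starts with" search.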

While wiki tools offer basic search capabilities, you will only find the advanced features outlined above in data catalog tools. We’ll cover tools in more depth in Section II.

Quick tip: Did you know that Google Chrome offers a customizable site search feature? This allows you to search quickly within your preferred tools, streamlining your search experience and making it even more efficient.

Google Chrome's customizable site search feature - example with Castor
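A site search shortcut in Chrome is defined by a URL in which `%s` stands for whatever you type. The sketch below mimics that substitution so you can check a template before saving it. The `docs.example.com` address and the `expand` helper are hypothetical, and the exact encoding the browser applies may differ.

```python
from urllib.parse import quote_plus

# Hypothetical search URL for an internal documentation tool.
SEARCH_TEMPLATE = "https://docs.example.com/search?q=%s"


def expand(template, query):
    """Substitute the URL-encoded query for the %s placeholder."""
    return template.replace("%s", quote_plus(query))


url = expand(SEARCH_TEMPLATE, "orders daily")
print(url)
```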

B - Hierarchy

Readers who prefer to browse through documentation rather than search for specific information need a clear and well-organized documentation structure. Creating an easy-to-navigate and comprehensive layout is essential to cater to this user group.

To ensure a user-friendly experience, it's crucial to maintain a well-balanced documentation structure. Here's how you can achieve this:

  1. Manage the level of detail: Provide enough information for readers to understand the content without overwhelming them with excessive technical jargon or minutiae. Striking the right balance helps users absorb the material at their own pace while maintaining their interest. ⚖️
  2. Keep related information together: Grouping related topics or data sets ensures that users can navigate the documentation easily and find what they need without getting lost. This approach also helps them see connections and relationships between different pieces of information. 🏘️
  3. Present a clear, big-picture overview: Offer users a comprehensive understanding of your documentation by providing a high-level summary or table of contents. This helps them grasp the overall structure and purpose of the documentation, making it easier to find and engage with the content they're interested in. 🗺️
  4. Ensure users are aware of available documentation: Make it easy for readers to discover and access all the available resources. If they know what documentation is available and where to find it, they are more likely to refer to it when needed. Clear navigation menus, links, or other signposts can guide users to the information they seek. 💡

By considering these factors, you'll create a documentation environment that caters to browsing users, ensuring they have a seamless and informative experience.

Tools like wiki pages and data catalogs offer an intuitive way to create hierarchical documentation that caters to users who prefer browsing over searching. We will delve deeper into these tools and their specific features in the upcoming section.

II - Use the right tooling

As you venture into the world of data documentation, choosing the right tools can significantly impact the effectiveness and accessibility of your information. With various options available, it's crucial to find a solution that aligns with your needs and your users' preferences.

In this section, we'll explore three popular tooling options for data documentation—annotations and markdown, wiki pages, and data catalogs—highlighting their advantages and drawbacks to help you make an informed decision.

A - Annotations and Markdown

Annotations and Markdown are a good fit if you use dbt. With this approach, you describe tables and columns in YAML (.yml) files that live alongside your models.
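As a rough illustration, the sketch below shows the shape such a file takes once a YAML parser has loaded it, along with a small check that lists columns still missing a description. The model, the `owner` key under `meta`, and the `undocumented_columns` helper are hypothetical, and the exact keys available depend on your dbt version.

```python
# Hypothetical content of a dbt properties file (e.g. models/schema.yml),
# shown as the dict a YAML parser would return for it.
schema = {
    "version": 2,
    "models": [
        {
            "name": "orders",
            "description": "One row per customer order.",
            "config": {
                "tags": ["finance"],
                "meta": {"owner": "data-team"},  # assumed convention
            },
            "columns": [
                {"name": "order_id", "description": "Primary key."},
                {"name": "amount", "description": ""},
                {"name": "created_at"},
            ],
        }
    ],
}


def undocumented_columns(schema):
    """Return 'model.column' for every column lacking a description."""
    missing = []
    for model in schema.get("models", []):
        for column in model.get("columns", []):
            if not column.get("description", "").strip():
                missing.append(f"{model['name']}.{column['name']}")
    return missing


missing = undocumented_columns(schema)
print(missing)
```

Because the descriptions sit in version-controlled files, a check like this can run during code review alongside the rest of your tests.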

Advantages:

✅ You can set owners and add tags.

✅ Close to your codebase

✅ Versioned and reviewed.

Issues:

⛔️ Does not let you include pictures and schemas

⛔️ Complicated for non-technical users, not accessible for people who cannot read code

⛔️ Not suited for data browsing, unless combined with portals such as dbt docs

B - Wiki pages

Wiki pages are also a good option for documenting your data.

Advantages:

✅ Wiki pages offer many layout possibilities: pictures, embedded diagrams, and links.

✅ Basic search engine capabilities

✅ Easier to access for non-technical users

Issues:

⛔️ Not versioned

⛔️ Not tied to your data assets

⛔️ Must be updated manually

⛔️ Lack advanced search engine features

C - Data Catalog

If you’re looking for a combination of the two solutions above that maximizes their advantages and minimizes their downsides, your best option is a data catalog. Modern data catalogs refresh your data documentation automatically.

Advantages:

✅ All your documentation in one place

✅ Advanced search engine capabilities

✅ Additional metadata: tags, owners, popularity, PII, …

✅ Pinned assets

✅ Easier to access for non-technical users

✅ Versioned

✅ Automatic lineage for browsing

Castor is a catalog that can fetch descriptions from data tools such as dbt and Looker and push them back. It offers the benefits of data catalogs while keeping descriptions close to your code.

Issues:

⛔️ It’s not free

III - Keep a Homepage

In many cases, especially without a data catalog, your documentation might be scattered across various platforms.

This can include wiki pages for general knowledge and onboarding, code repositories for descriptions and tags (when using code annotations), Jira or GitLab for managing issues and requests, and Slack channels for sharing collective knowledge, among others.

If your documentation is scattered across multiple sources, you can create a centralized homepage to gather all the essential links. This approach offers users a convenient starting point for accessing your documentation, making it easier to navigate the wealth of metadata available.
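A homepage like this can be as simple as a list of labeled links. The sketch below renders one as plain text that you could paste into a wiki page. Every label and URL here is a hypothetical placeholder, as is the `render_homepage` helper.

```python
# Hypothetical links; replace them with your own tools and URLs.
SOURCES = {
    "Onboarding and general knowledge": "https://wiki.example.com/data",
    "Table and column descriptions": "https://git.example.com/analytics",
    "Issues and requests": "https://jira.example.com/projects/DATA",
    "Questions": "https://example.slack.com/channels/data-help",
}


def render_homepage(title, sources):
    """Build a plain-text index with one bullet per documentation source."""
    lines = [title, ""]
    for label, url in sources.items():
        lines.append(f"- {label}: {url}")
    return "\n".join(lines)


homepage = render_homepage("Data documentation", SOURCES)
print(homepage)
```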

A quick tip: Pin your homepage in the channels where data consumers typically ask their questions.

Documentation homepage in Notion - Image courtesy of Castor

Final words

In conclusion, finding the perfect home for your data documentation is a vital aspect of ensuring its effectiveness and accessibility. As we've explored in this article, there are several key considerations to keep in mind when selecting the right location, tools, and structure for your documentation.

Remember to prioritize your users' needs, whether they're searching for specific information or browsing through the documentation to gain a broader understanding. Choose the right tooling that aligns with your requirements, such as annotations and markdown, wiki pages, or data catalogs, each with its unique advantages and challenges. Lastly, create a centralized homepage to make it easy for users to navigate and access the wealth of information you've curated.

With these insights, you're now equipped to provide the ideal environment for your data documentation to thrive. By tailoring your approach to cater to your users' preferences and utilizing the most suitable tools, you can ensure that your documentation is both valuable and accessible to a wide range of readers. Happy documenting!

About us

We write about all the processes involved when leveraging data assets: the modern data stack, data teams composition, and data governance. Our blog covers the technical and the less technical aspects of creating tangible value from data.

At Castor, we are building a data documentation tool for the Notion, Figma, Slack generation.

Or, data-wise, for the Fivetran, Looker, Snowflake, and dbt aficionados. We designed our catalog software to be easy to use, delightful, and friendly.

Want to check it out? Reach out to us and we will show you a demo.
