Today, we embark on the finale of our "Data Documentation Demystified" series, in which we've journeyed through valuable lessons across many aspects of documentation.
The last element, where you host your documentation, largely determines who uses it and how often. You can have the most polished data documentation ever, but if it's in the wrong place, people will not use it.
While there's no one-size-fits-all answer, there are some essential considerations to remember during your search. From prioritizing your users' needs, to picking the right tool, to creating a documentation homepage – this article will be your friendly guide to finding the perfect approach.
Now, let’s uncover the best habitat for your data documentation to prosper!
When it comes to choosing the perfect home for your data documentation, keeping your users in mind is essential.
There are two primary types of data documentation users: information seekers and explorers. Seekers are on a quest for specific information, while explorers prefer browsing through the documentation to get an overview of the data landscape. Understanding their needs will help you choose the right tools.
It's easy to default to categorizing documentation users by function, such as marketing, operations, or finance. However, this approach can overcomplicate things. Instead, we suggest the simpler seeker/explorer framework described above.
This framework is applicable across all functions and can help you understand how any user, regardless of their role, will engage with your documentation. By simplifying your approach, you can better tailor your documentation to meet the needs of your users.
Recognizing the type of user you aim to accommodate will assist you in selecting the most suitable tools in Section II.
Data consumers seeking specific information require an efficient method to find the necessary details. For these people, search engine capabilities can be a game-changer. Some advanced search engine capabilities include:
While wiki tools offer basic search engine capabilities, you will only find the advanced features outlined above in data catalog tools. We'll cover tools in more depth in Section II.
Quick tip: Did you know that Google Chrome offers a customizable site search feature? This allows you to search quickly within your preferred tools, streamlining your search experience and making it even more efficient.
Readers who prefer to browse through documentation rather than search for specific information need a clear and well-organized documentation structure. Creating an easy-to-navigate and comprehensive layout is essential to cater to this user group.
To ensure a user-friendly experience, it's crucial to maintain a well-balanced documentation structure. Here's how you can achieve this:
By considering these factors, you'll create a documentation environment that caters to browsing users, ensuring they have a seamless and informative experience.
Tools like wiki pages and data catalogs offer an intuitive way to create hierarchical documentation that caters to users who prefer browsing over searching. We will delve deeper into these tools and their specific features in the upcoming section.
As you venture into the world of data documentation, choosing the right tools can significantly impact the effectiveness and accessibility of your information. With various options available, it's crucial to find a solution that aligns with your needs and your users' preferences.
In this section, we'll explore three popular tooling options for data documentation (annotations and markdown, wiki pages, and data catalogs), highlighting their advantages and drawbacks to help you make an informed decision.
Annotations and markdown work well if you are using dbt. With this approach, columns and tables are described in YAML (.yml) files.
Advantages:
✅ You can set owners and add tags.
✅ Close to your codebase
✅ Versioned and reviewed.
Issues:
⛔️ Does not let you include pictures and schemas
⛔️ Complicated for non-technical users, not accessible for people who cannot read code
⛔️ Not suited for data browsing, unless combined with portals such as dbt docs
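Because annotations live in files that are versioned and reviewed, they can also be checked automatically. The following is a minimal Python sketch, not a dbt feature: it assumes the YAML file has already been parsed into a dictionary with a `models` list, where each model has a `name`, an optional `description`, and optional `columns`. The model names, column names, and the `missing_descriptions` helper are hypothetical and exist only for illustration.

```python
# A dictionary mirroring what a parsed, dbt-style .yml file might contain.
# All model and column names here are made up for the example.
schema = {
    "models": [
        {
            "name": "orders",
            "description": "One row per customer order.",
            "columns": [
                {"name": "order_id", "description": "Primary key."},
                {"name": "amount"},  # no description yet
            ],
        },
        {
            "name": "customers",  # no table-level description
            "columns": [{"name": "customer_id", "description": ""}],
        },
    ]
}


def missing_descriptions(parsed):
    """Return the tables and columns that have no (or an empty) description."""
    missing = []
    for model in parsed.get("models", []):
        if not model.get("description", "").strip():
            missing.append(model["name"])
        for column in model.get("columns", []):
            if not column.get("description", "").strip():
                missing.append(f"{model['name']}.{column['name']}")
    return missing


if __name__ == "__main__":
    for item in missing_descriptions(schema):
        print(f"Missing description: {item}")
```

A check like this could run during code review, so that undocumented tables and columns are flagged before they are merged.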
Wiki pages are also a good option for documenting your data.
Advantages:
✅ Wiki pages offer many layout possibilities: pictures, embedded diagrams, and links.
✅ Basic search engine capabilities
✅ Easier to access for non-technical users
Issues:
⛔️ Not versioned
⛔️ Not tied to your data assets
⛔️ Must be updated manually
⛔️ Lack advanced search engine features
If you’re looking for a combination of the two solutions above that maximizes the advantages and minimizes the downsides, your best option is a data catalog. Modern data catalogs refresh your data documentation automatically.
Advantages:
✅ All your documentation in one place
✅ Advanced search engine capabilities
✅ Additional metadata: tags, owners, popularity, PII, …
✅ Pinned assets
✅ Easier to access for non-technical users
✅ Versioned
✅ Automatic lineage for browsing
Castor is a catalog that can fetch descriptions from data tools such as dbt and Looker and push them back. It offers the benefits of data catalogs while keeping descriptions close to your code.
Issues:
⛔️ It’s not free
In many cases, especially without a data catalog, your documentation might be scattered across various platforms.
This can include wiki pages for general knowledge and onboarding, code repositories for descriptions and tags (when using code annotations), JIRA or Gitlab for managing issues and requests, and Slack channels for sharing collective knowledge, among others.
If your documentation is scattered across multiple sources, you can create a centralized homepage to gather all the essential links. This approach offers users a convenient starting point for accessing your documentation, making it easier to navigate the wealth of metadata available.
A quick tip: Pin your homepage in the channels where data consumers typically ask their questions.
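As a rough illustration of such a homepage, the Python sketch below renders a short Markdown page of links from a list of sources. Every label and URL in it is a hypothetical placeholder under example.com, and `render_homepage` is an illustrative helper rather than part of any tool mentioned in this article.

```python
# Hypothetical documentation sources: replace the labels and URLs with your own.
DOC_SOURCES = [
    ("Wiki: general knowledge and onboarding", "https://wiki.example.com/data"),
    ("Code repository: descriptions and tags", "https://git.example.com/analytics"),
    ("Issue tracker: issues and requests", "https://issues.example.com/data"),
    ("Chat channel: collective knowledge", "https://chat.example.com/data-help"),
]


def render_homepage(title, sources):
    """Return a Markdown page with a title and one link per documentation source."""
    lines = [f"# {title}", ""]
    for label, url in sources:
        lines.append(f"- [{label}]({url})")
    return "\n".join(lines)


if __name__ == "__main__":
    print(render_homepage("Data documentation home", DOC_SOURCES))
```

Keeping the list of sources in one place means the homepage can be regenerated whenever a link changes, instead of being edited by hand.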
In conclusion, finding the perfect home for your data documentation is a vital aspect of ensuring its effectiveness and accessibility. As we've explored in this article, there are several key considerations to keep in mind when selecting the right location, tools, and structure for your documentation.
Remember to prioritize your users' needs, whether they're searching for specific information or browsing through the documentation to gain a broader understanding. Choose the right tooling that aligns with your requirements, such as annotations and markdown, wiki pages, or data catalogs, each with its unique advantages and challenges. Lastly, create a centralized homepage to make it easy for users to navigate and access the wealth of information you've curated.
With these insights, you're now equipped to provide the ideal environment for your data documentation to thrive. By tailoring your approach to cater to your users' preferences and utilizing the most suitable tools, you can ensure that your documentation is both valuable and accessible to a wide range of readers. Happy documenting!
We write about all the processes involved when leveraging data assets: the modern data stack, data teams composition, and data governance. Our blog covers the technical and the less technical aspects of creating tangible value from data.
At Castor, we are building a data documentation tool for the Notion, Figma, Slack generation.
Or, data-wise, for the Fivetran, Looker, Snowflake, and dbt aficionados. We designed our catalog software to be easy to use, delightful, and friendly.
Want to check it out? Reach out to us and we will show you a demo.