“In an extreme view, the world can be seen as only connections, nothing else. We think of a dictionary as the repository of meaning, but it defines words only in terms of other words. I liked the idea that a piece of information is really defined only by what it’s related to, and how it’s related. There really is little else to meaning. The structure is everything. There are billions of neurons in our brains, but what are neurons? Just cells. The brain has no knowledge until connections are made between neurons. All that we know, all that we are, comes from the way our neurons are connected.”
Our brains fire neurons to connect information, feelings, and logic. In much the same way, our everyday business decisions are informed by the knowledge people, data, and processes create. When those are disconnected, we can’t get accurate and clear answers fast enough to compete.
If this is your reality, you probably experience the following gaps in data work.
The discoverability (and meaning) gap
You can’t find or understand the information you need fast enough to matter, so customers and competitors pass you by.
The relevance (and reusability) gap
Data is disconnected from business concepts and initiatives, so it isn’t understood in context. As a result, you have to start from the ground up on new analysis without building on previous work.
The impact (and reproducibility) gap
Data isn’t democratized if the majority can’t use it without expert help. And even when they can, working with, collaborating on, and packaging the data for decision makers is an afterthought.
Before you embark on a data catalog evaluation, be clear about what you want to accomplish with one so that you get the most value from it.
There are a number of valuable ways to use a data catalog, but our customers tell us the following use cases helped them make critical business decisions with clarity, accuracy, and speed.
P.S. If you’re asking, “What is a data catalog?”, take a quick pause to read this blog and come back.
If you can’t see or understand your data, you can’t use it. If you don’t solve that problem, your most important business decisions have to wait. Or worse, they’ll get made without the context needed to achieve the goal. This happens every day in organizations that don’t have a well-maintained, active inventory of data and analysis.
If you inventory all your data resources, make them easy to find, enrich them with useful metadata (meaning) and validations, and connect them to meaningful business concepts, you’ll vastly reduce the time it takes for your company to ask a question and produce an answer from your data.
By the way, you can use Castor to start doing that right now!
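To make the idea of an inventory concrete, here is a minimal Python sketch of what a catalog entry and a keyword search over it might look like. It is purely illustrative and is not Castor's API: the `CatalogEntry` class, its field names, the `search` helper, and the sample datasets are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class CatalogEntry:
    """One inventoried data resource (all field names are illustrative)."""
    name: str
    description: str          # the "meaning" metadata
    owner: str
    business_concepts: set[str] = field(default_factory=set)
    validated: bool = False   # whether someone has checked the data


def search(entries, term):
    """Return entries whose name, description, or concepts mention the term."""
    needle = term.lower()
    return [
        e for e in entries
        if needle in e.name.lower()
        or needle in e.description.lower()
        or any(needle in c.lower() for c in e.business_concepts)
    ]


# Hypothetical sample inventory.
inventory = [
    CatalogEntry("orders_2023", "One row per customer order", "sales-ops",
                 {"revenue", "customer"}, validated=True),
    CatalogEntry("web_sessions", "Raw clickstream sessions", "marketing",
                 {"acquisition"}),
]

matches = search(inventory, "revenue")
print([e.name for e in matches])
```

Even a structure this small shows why the metadata matters: the search for "revenue" only works because someone connected the dataset to that business concept.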
Searching for the right data for analysis work can feel like being lost in the forest with no compass. You might have some educated guesses as to where to go, but in the end, you’re relying on instinct and every moment counts.
Think like a cartographer, and make a map of your best data with your data catalog tools. By curating data sources and making them accessible, you’ll be able to add them to your library of reusable assets.
Your curated library might be a slice of data from your data warehouse. It could also be your most popular spreadsheets, currently passed around on a shared drive or via email, or the hundreds of datasets your company buys from third parties. Ultimately, the goal is to surface the needle in the haystack and point the company to the 20% (or much less) of assets that provide 80% (or much more) of the value.
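One rough way to find that small, high-value slice is to rank assets by how often they are used and keep the top ones until they account for most of the activity. The Python sketch below assumes that usage counts (queries, opens, downloads) are a reasonable proxy for value, which will not always hold. The function name, the 0.8 threshold, and the sample numbers are hypothetical.

```python
def top_assets_by_usage(usage_counts, coverage=0.8):
    """Return the most-used assets that together reach the coverage share."""
    total = sum(usage_counts.values())
    if total == 0:
        return []
    selected, running = [], 0
    # Walk assets from most to least used, stopping once coverage is reached.
    ranked = sorted(usage_counts.items(), key=lambda kv: kv[1], reverse=True)
    for name, count in ranked:
        selected.append(name)
        running += count
        if running / total >= coverage:
            break
    return selected


# Hypothetical usage counts per data asset.
usage = {
    "orders": 520,
    "customers": 310,
    "web_sessions": 100,
    "legacy_export": 40,
    "tmp_scratch": 30,
}
print(top_assets_by_usage(usage))
```

The resulting list is a starting point for curation, not the final answer: a rarely used dataset can still be critical, so a human should review the shortlist.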
Of course, all of our work with data is for nothing if it doesn’t influence the decisions we make. This is where sharing information with stakeholders ineffectively or incompletely increases risk and slows productivity. Lost cycles may cost hundreds of thousands of dollars, and a poor decision could cost millions.
That risk is why we must ensure IT, data stewards, data engineers, analysts, and business folks are collaborating. When they do, analysis can be documented and shared in a way that is agile, iterative, and easily consumable. Workflows can be reproduced and reused more easily and deliver more consistent answers.
You should now have a good sense of three high-impact data catalog use cases you might want to take on for your company. While you’ll eventually want to do all three, it may be best to start by picking the most pressing problem to solve or gap to close.
If your biggest challenge is understanding what data resources the company has and what they mean, inventory everything. If it’s knowing which data assets are the best and most reusable for a given situation, curate what’s useful. And finally, if you have a recurring analysis or business challenge, encourage your colleagues to analyze, share, and repeat their analyses to make them reusable.
If you’d like to dig deeper into these three tactics, visit our website.