With CastorDoc, I managed to identify and remove 300 tables. This saved me at least 2 to 3 days of work.
This customer story was contributed by Olivier Detriché, Lead Analytics Engineer at PayFit. PayFit is a leading provider of comprehensive payroll and human resources (HR) services, offering tailored solutions to a wide range of industries. The company streamlines processes including payroll management, benefits administration, and employee onboarding.
At PayFit, data plays a vital role in our operations. As the Lead Analytics Engineer, my team and I are responsible for transforming data coming from various source applications and making it reliable and usable for our end users. Acting as the interface between the source applications and the database, we receive data, clean it, transform it, and document it. This ensures that the data we provide to our consumers and clients is accurate, valuable, and ready to use.
My team focuses on creating key performance indicators (KPIs) and documenting the data, allowing us to deliver precise and valuable insights. All the departments at PayFit rely on this data to track metrics such as monthly recurring revenue (MRR), customer service resolution rates, efficiency, and product usage.
In terms of structure, our data organization has approximately 20 people, divided into three teams: data engineering, analytics engineering, and data analysis. These teams operate independently and autonomously, each handling its own topics. Our VP of Data oversees all three teams, while a product owner takes care of the "data as a product" aspect.
Challenge: The documentation gap at PayFit
“We recognized that a system needed to be put in place where all the data was documented, organized, and easily accessible to anyone who needed it.” Olivier Detriché, Lead Analytics Engineer, PayFit.
Data discovery was a constant source of frustration for us at PayFit. The lack of organization and documentation made it feel like searching for a needle in a haystack. We were drowning in scattered, inconsistent data, with no single trusted source to turn to. This not only wasted valuable time but also raised concerns about the accuracy and reliability of the information we were working with.
One incident stands out. Just after my arrival at PayFit, I spent an entire week trying to find our client list. It was a painstaking process of navigating through seventeen different tables, only to be met with inconsistent and frustrating results. This was a wake-up call for us: it was clear that our documentation and data modeling processes needed improvement.
The challenge we faced was twofold: we needed to establish a reliable and centralized data source, and we needed to improve our documentation practices. We recognized that a system needed to be put in place where all the data was documented, organized, and easily accessible to anyone who needed it. This was not just a matter of convenience, but a crucial step towards improving efficiency and ensuring the accuracy and reliability of the data we relied on.
Solution: A well-integrated data catalog
“My main objective in acquiring a data catalog was to seamlessly integrate documentation into our codebase, ensuring that it remained up to date throughout the development process.” Olivier Detriché, Lead Analytics Engineer, PayFit.
To improve our documentation process, I reassessed our tools and explored new options. Although CastorDoc was already implemented at PayFit, we hadn't started using it. So, we decided to evaluate all available catalog tools.
My main objective was to integrate documentation seamlessly into our codebase, avoiding duplication of effort between dbt and the data catalog. CastorDoc stood out because it synced with dbt, and its dbt integration was much more advanced than that of the other catalogs we evaluated. This allowed us to maintain documentation close to the code while making it accessible through the data catalog.
In addition to its integration with dbt, CastorDoc provided other essential features that aligned with our requirements. It offered a comprehensive and automated data lineage feature, enabling us to trace the origin of each table.
Compared to other tools, CastorDoc was simple and user-friendly. Other catalogs had a lot of features but lacked intuitiveness and the basic functionalities of a powerful data catalog search. This is where CastorDoc made a significant difference. The enthusiasm displayed by our team for CastorDoc also influenced our decision-making process.
Once we chose CastorDoc, the implementation process was straightforward. Because the tool was new to all teams, we did not have a point of comparison, but we soon realized how simple and easy to use it was. The whole team adopted it quickly.
Impact: Streamlined documentation, cleaner data warehouse
“With CastorDoc, I managed to identify and remove 300 tables and views. This saved me at least 2 to 3 days of work.” Olivier Detriché, Lead Analytics Engineer, PayFit.
Since we started using CastorDoc, we've seen big changes in how we handle data at PayFit. Our documentation is always up to date and it's easier than ever to find what we need. This has made us more efficient and improved the quality of our decisions. Beyond changing the way the company handles its data, it has improved productivity and optimized costs.
Documentation – Productivity Gains and Better Decision-Making
With the implementation of CastorDoc, PayFit overcame its data management challenges. The user-friendly interface allowed for easy creation and updating of data dictionaries, establishing data relationships, and ensuring data integrity. Collaboration became more effective, with teams able to document data sources, define data quality rules, and provide clear definitions for each data attribute.
This systematic approach to data management improved the ability to find and use data for analysis and decision-making. It also facilitated transparent communication channels between technical teams and end-users, building trust in the data. Integrating documentation into the codebase and synchronizing it with a centralized catalog eliminated time-consuming searches for information.
Storage Cleanup – Cleaner Warehouse, Better Discovery, and Storage Costs Optimization
The second major area of impact is in the cleaning up of PayFit's data warehouse. By identifying and deleting unused databases and models, CastorDoc has enabled the organization to optimize storage costs and enhance the discovery of essential information.
With CastorDoc, I managed to identify and remove 300 tables and views. This saved me at least 2 to 3 days of work, not to mention the noise and potentially the extra work for my team to migrate these tables to the new database. We are now working on identifying dashboards and data models that are not used. I expect similar results.
This cleanup led to a more efficient and cleaner warehouse, freeing up time for more important tasks and providing a reliable foundation for ongoing work. The ability to clean up unused models and databases has been particularly beneficial, as it has allowed for more focused attention on critical areas.
In the future, our vision for PayFit is to extend CastorDoc's usage beyond the data team and make it accessible to business departments company-wide. We work hard on documenting the data, so we want to make sure it can be shared with a wider audience to maximize the impact of this initiative.
To enhance the accessibility of data, we plan to leverage the AI features of CastorDoc. Features like the query explainer and AI assistant will provide users with a more intuitive and efficient way to interact with the data. We think AI-generated explanations can be valuable for users without a strong SQL background.
Furthermore, we plan to turn CastorDoc into a comprehensive repository for all our metrics. This will enable us to centralize and streamline our metric management.