“I like to compare Castor to an iPhone in the sense that it is very easy to use and the interface is extremely intuitive. I can give Castor to anyone in the company and I know that they won’t ask any questions.” Filipe Palma, Data Platform Product Manager, Printify.
Founded in 2015, Printify is a marketplace connecting online merchants to major print-on-demand and dropshipping manufacturers worldwide.
The company serves more than 2,000,000 merchants, most of which are individual businesses selling products to end customers through Printify.
At Printify, the data team is split into four sub-teams: the BI team, the data analysis team, the data science team, and the data platform team.
Castor partnered with Printify starting in 2022 to help the company improve its user experience around data while sustaining its rapid growth.
In a recent interview, we spoke with Printify's data platform product manager, Filipe Palma, about the company's data discovery journey. A year into using Castor, Filipe revealed how data discovery has opened up new possibilities for Printify.
Castor assisted Printify in two key areas: data team efficiency and data collaboration. Now, the company plans to use Castor to align the whole organization around a unified set of metrics.
In our discussion with Filipe, we covered the challenges that led Printify to look for a data catalog, the benchmarking and implementation process, and the results that followed.
I - Challenge: balancing rapid growth with strong data governance
“The quality of work was below our expectations. Data consumers were spending much time trying to understand which data to use, sometimes using the wrong datasets to produce their analysis” Filipe Palma, Data Platform Product Manager, Printify.
During the pandemic, Printify experienced exponential growth in both revenue and company size. New data stakeholders were onboarded monthly, and more employees were accessing the data.
Printify’s data environment grew rapidly. However, the lack of data governance started to degrade the experience of stakeholders working with data.
“We had a lot of information in our data warehouse, but we were not offering context to our consumers on how to use it,” Filipe said.
In practice, Filipe explained that data consumers were spending a lot of time trying to understand which data they should use for their analyses, and what the data meant.
Data stakeholders inundated the data engineering team with daily business inquiries. As Filipe pointed out, this diverted the team's time from their primary role of development.
Stakeholders' poor understanding of the data flooded Slack channels with questions and lowered overall productivity and quality of work.
Moreover, Printify saw the importance of stakeholders grasping the data lineage. The team wanted to understand how data flowed within the company, but providing lineage for its datasets was difficult.
A survey gauging user experience with data confirmed the problem: satisfaction with data documentation was rated 2.3 out of 5, revealing clear discontent with the context surrounding data.
After reviewing the survey results, Printify decided to prioritize the implementation of a data catalog.
II - Choosing and implementing a data catalog
“One of the factors we considered was choosing a data catalog that was also a startup, so it could be more agile compared to legacy vendors” Filipe Palma, Data Platform Product Manager, Printify.
As they searched for a data catalog tool, Printify used Medium and Reddit to get ideas from other companies in the ecosystem.
Printify’s data catalog search was guided by a need for simplicity. The company wanted to avoid some of the more complex data catalog solutions on the market.
Data stakeholders at Printify wanted a simple solution to search the data and obtain the right context to leverage it. They also needed a tool that could provide visibility into the data lineage in a straightforward manner.
“In our opinion, this is where Castor shines, because the search capability is more straightforward than other solutions. People immediately understand how to use the tool because it’s extremely intuitive” Filipe said.
Printify evaluated three data catalog solutions. For the proof of concept, the company chose to test the tools with all its consumer profiles within the data team: data scientists, data analysts, product managers, engineers, and business intelligence specialists. They scored the tools against a predetermined evaluation matrix.
Printify opted for a two-step implementation process:
- The initial step involved generating value within Castor. Printify began by documenting key data assets, making the software more valuable. Castor's popularity feature, which identifies a company's most used data assets, helped Printify decide which assets to document first.
- Printify then organized sessions to introduce the data catalog to a wider stakeholder audience and explain the value it could provide. These sessions raised awareness of the catalog and were effective in improving data literacy.
At Printify, different teams share the responsibility of enhancing the data catalog. The main duty falls on the data steward, who maintains the catalog. They collect context from various teams and document data assets accordingly.
When specific teams create data assets, they are responsible for documenting their work. They collaborate with the data steward to ensure documentation and definitions are standardized throughout the organization, as Filipe explained.
III - Results: productivity and collaboration
“Before Castor, we were getting questions such as ‘Where can I find this data?’ or ‘What does this data mean?’ at least once a day in our Slack channels. Now, we barely have one every two weeks. Stakeholders now use Castor to answer these kinds of questions.” Filipe Palma, Data Platform Product Manager, Printify.
Since implementing Castor as a data catalog solution, Printify has noticed clear improvements in productivity and collaboration.
Implementing a data catalog allowed stakeholders to use data without depending on the data engineering team. This boosted productivity for both stakeholders and the data engineering team. Stakeholders can now work quickly and accurately, while the data engineering team focuses on producing data rather than addressing data requests.
Two indicators suggest a strong increase in productivity at Printify.
- The number of data-related Slack pings to the data engineering team was cut by 90%.
- Stakeholders’ satisfaction with data documentation increased by 87%, from a rating of 2.3/5 to a rating of 4.3/5.
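As a rough check on the two figures above, the percentages can be reproduced as relative changes. This is only an illustrative sketch: the `percent_change` helper is hypothetical, and the ping counts (about 10 questions per two working weeks before Castor, about 1 after) are an assumption derived from Filipe's "at least once a day" and "barely one every two weeks" remarks, not measured data.

```python
def percent_change(before: float, after: float) -> float:
    """Relative change from `before` to `after`, expressed in percent."""
    return (after - before) / before * 100


# Survey ratings quoted in the article (out of 5).
satisfaction_change = percent_change(2.3, 4.3)

# Assumed counts per two working weeks: ~10 questions before, ~1 after.
ping_change = percent_change(10, 1)

print(f"Documentation satisfaction: {satisfaction_change:+.0f}%")
print(f"Data-related Slack pings: {ping_change:+.0f}%")
```

Under these assumptions, the satisfaction figure rounds to an 87% increase and the ping figure to a 90% decrease, consistent with the numbers reported.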
Beyond productivity, Castor also allowed Printify to build a culture of collaboration through two initiatives: identifying experts in specific data, and re-using existing analyses for new projects.
Castor makes it possible to assign ownership over data assets. For a specific dataset, one person or team can be assigned as the owner. At Printify, this feature has made it easy for stakeholders to identify the go-to person for questions about a specific data asset. Through ownership, Castor guides stakeholders to subject matter experts, which improves the quality of everyone’s work.
Printify is also using Castor as a central repository where stakeholders share analyses around specific areas. Everyone is able to find and re-use the information in Castor, such as popular queries. This improves collaboration and the quality of everyone’s work because it gives stakeholders a good starting point when beginning an analysis in a specific area.
Following these improvements around collaboration, Printify decided to take things one step further with Castor. The company is planning to use the tool as a single source of truth for business definitions, with the aim of aligning everyone around company metrics.
“We want Castor to be a dictionary for every definition and concept around the company. For example, if someone is wondering what the concept of ‘merchant’ or ‘order’ means, he will find the answer in Castor,” Filipe explained.
Fantastic tool for data discovery and documentation
“[I like] The easy to use interface and the speed of finding the relevant assets that you're looking for in your database. I also really enjoy the score given to each table, [which] lets you prioritize the results of your queries by how often certain data is used.”
Michal, Head of Data, Printify