How to build your data team

Overview of mid-market data team organization models


As businesses recognize the decisive power of data, most are hoping to use analytics as the driver for their business and product strategies. This entails building a strong data practice which can propagate its insights effectively across different areas of the business. Unfortunately, this is no easy task.

These are questions that heads of data typically grapple with:

  • How big should this practice be?
  • How many data engineers, data analysts and data scientists does it need?
  • How does the team interact with the rest of the organization?
  • Which structure suits the data team: centralised or embedded?

They are right to ask, since a strong data practice is no longer a luxury but essential to staying competitive.

Let's start with the basics, though.

What being Data Enabled means

Data maturity is the journey towards improvement and increased capability in using data. We propose a simple framework for assessing data maturity, in which you measure your ability to understand your past, know your present and predict your future. What does that mean?

Each department has key KPIs that need to be defined, tracked and predicted. Your ability to predict future outcomes rests on a clear knowledge of your present, which in turn builds on a strong understanding of your past. This gives a simple way to assess your data maturity. If you're unable to identify your company's revenue drivers (your past), you need to work on your data maturity by bringing visibility into your business before you try to predict future outcomes. We don't recommend skipping steps. It's like Maslow's pyramid, but for data.

For example:

Marketing ROI. Define your ROI across multiple channels, with an identified attribution model. Then understand how it evolved over the past 12 months, and especially what drove it: the best-performing channels, time of year, product and so on (past). Then track its evolution on a daily, weekly or monthly basis with a reporting suite you trust (present). Finally, forecast your marketing budget using predictive models built on that understanding (future).

Customer satisfaction. What is it: NPS, CSAT? Make sure everyone in the company trusts the way it's computed. As before, compute its evolution over the past 12 months and find its drivers (past). Then track your customers' satisfaction daily with trusted dashboards, and identify actions you can take today to increase it (present). Your understanding of the past and present state of customer satisfaction will allow you to predict churn efficiently (future).
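To make the "define it first" step concrete, here is a minimal sketch of the two KPIs in Python. The NPS function follows the standard definition (share of promoters scoring 9-10 minus share of detractors scoring 0-6). The ROI formula, (revenue - spend) / spend, is only one common definition and is an assumption here, as are the function names and the sample figures; your own definition and attribution model may differ.

```python
from typing import Dict, Iterable, List, Tuple


def nps(scores: Iterable[int]) -> float:
    """Net Promoter Score: % of promoters (9-10) minus % of detractors (0-6)."""
    scores = list(scores)
    if not scores:
        raise ValueError("need at least one survey response")
    promoters = sum(1 for s in scores if s >= 9)
    detractors = sum(1 for s in scores if s <= 6)
    return 100.0 * (promoters - detractors) / len(scores)


def marketing_roi(revenue: float, spend: float) -> float:
    """Assumed definition: attributed revenue net of spend, divided by spend."""
    if spend <= 0:
        raise ValueError("spend must be positive")
    return (revenue - spend) / spend


def roi_by_channel(
    channels: Dict[str, Tuple[float, float]]
) -> List[Tuple[str, float]]:
    """Rank channels by ROI, best first. Values are (revenue, spend) pairs."""
    ranked = [(name, marketing_roi(rev, spend))
              for name, (rev, spend) in channels.items()]
    return sorted(ranked, key=lambda item: item[1], reverse=True)


# Hypothetical figures, for illustration only.
survey = [10, 10, 9, 8, 5]
past_year = {"search": (3000.0, 1000.0), "social": (1200.0, 1000.0)}

print(nps(survey))                # 3 promoters, 1 detractor out of 5
print(roi_by_channel(past_year))  # the "drivers" view: which channel performs
```

Once everyone agrees on functions like these, the same definitions can feed the historical analysis, the dashboards and the forecasts, which is what makes the numbers trustworthy.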

User autonomy and self-service BI

Another good indicator of data maturity is user autonomy. Who is autonomous with data in your company? That is, who can autonomously get a quick answer to their analytical questions?

There are different models for achieving business-user autonomy with data: some rely on a full and extensive suite of dashboards, others require training a majority of employees in SQL. What matters in the end is how fast employees can get answers to their analytical questions.

Whether you have all analysts in your data team or one analyst per department, if the analyst working with marketing can't explain recent changes in ad spend or ROI in less than an hour, don't focus on making all marketing people autonomous on these questions. Start by ensuring that your data people are fully autonomous before you enable other departments to be autonomous with data.

OK, now you're ready to build a data team. Who do you want on it, and why?

Key players on a data analytics team

A data analytics team is usually composed of five core functions, which are detailed below.

  1. Data engineer: Responsible for designing, building and maintaining datasets that can be leveraged in data projects. As such, data engineers work closely with both data scientists and data analysts. We also include the role of analytics engineer here, although it sits between analytics and engineering.
  2. Data scientist: Uses advanced mathematics and statistics, programming tools and machine learning to build predictive models. The roles of data scientist and data analyst are fairly similar, but the data scientist focuses more on engineering than on analysis.
  3. Data analyst: Uses data to perform reporting and direct analysis. Whereas data scientists and engineers typically interact with data in its raw or unrefined state, analysts work with data that's already been cleaned and transformed into more user-friendly formats.
  4. Business analyst/operations analyst: Helps the organisation improve its processes and systems. They are agile and straddle the line between IT and the business, helping to bridge the gap and improve efficiency. They always work with a specific department, and their SQL literacy ranges from dashboard-level to advanced.
  5. Head of data: Oversees the data team. Their goal is to create an environment in which all parties can painlessly access the data they need, to enforce data governance and to ensure data quality. They also act as a bridge between the data team and the main business unit, serving as both a visionary and a technical lead. They ultimately define the data team's roadmap and ensure that data impacts business outcomes.

I - How large should the team be?

Different companies build data teams of different sizes, and there is no ready-made guideline for sizing yours. However, here are a few elements to take into consideration when structuring your analytics team.

  1. As a general rule, aim for about 8% of the employees in your company to be SQL-enabled. Some companies, such as Amazon or Facebook, train a huge portion of their employees, but we exclude them from our stats.
  2. Brand new data teams often start with a lead data engineer and a data analyst. When the first data science needs appear, they are taken care of by the data analyst. When building a larger team, think in terms of the skillset you need. A typical data project requires the following skills: database, software development, machine learning, visualisation, collaboration and communication skills. It is very rare to find individuals who possess all these skills. You should thus be aware of which skill each candidate brings to the table. Regardless of how many people you decide to hire, your team should ideally cover this skillset.
  3. What should ultimately guide the size of your data team is the number of problems you face and the complexity of the most serious ones. Look at the size of your roadmap and establish how many people you need to complete your data projects within a reasonable amount of time. If you realise it would take the data team more than a year to complete its projects, it's probably time to expand the team. A good guideline is to have at least one and a half projects per person. This allows members of your team to make progress on a secondary project when they hit a deadlock on their primary one.
  4. Finally, you might have to make some project-specific hires. If you're a fintech running a fraud detection project, or a logistics company specialising in dispatching, you might want to hire someone who knows the specifics of your industry.
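The rules of thumb above are simple enough to turn into a quick back-of-the-envelope check. The sketch below is illustrative only: the function names, the idea of measuring the roadmap in person-months, and the sample numbers are all assumptions, while the 8% target, the one-year limit and the 1.5 projects-per-person guideline come from the list above.

```python
import math


def sql_enabled_target(headcount: int, share_percent: int = 8) -> int:
    """Number of SQL-enabled employees to aim for (about 8% of headcount)."""
    return math.ceil(headcount * share_percent / 100)


def months_to_finish(roadmap_person_months: float, team_size: int) -> float:
    """Rough roadmap duration, assuming work splits evenly across the team."""
    if team_size <= 0:
        raise ValueError("team_size must be positive")
    return roadmap_person_months / team_size


def should_expand(roadmap_person_months: float, team_size: int) -> bool:
    """True if the current team would need more than a year for the roadmap."""
    return months_to_finish(roadmap_person_months, team_size) > 12


def projects_per_person(active_projects: int, team_size: int) -> float:
    """Guideline from the text: keep this at 1.5 or above."""
    if team_size <= 0:
        raise ValueError("team_size must be positive")
    return active_projects / team_size


# Hypothetical company: 250 employees, 3 data people,
# a roadmap estimated at 48 person-months, 6 active projects.
print(sql_enabled_target(250))
print(months_to_finish(48, 3), should_expand(48, 3))
print(projects_per_person(6, 3))
```

Treat the output as a conversation starter rather than a hiring plan: roadmap estimates are rarely precise, and the skill coverage discussed in point 2 matters as much as the headcount.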

II - Which model to integrate the data practice into the company

There is no perfect structure for an analytics team, and your structure is likely to change many times. No changes in the last few years? Your organisation is probably suboptimal. Why? Because data practices are evolving fast, and so, most likely, is your company. That's two reasons to adapt the structure of your organisation frequently. Also keep in mind that the more static your organisation, the harder the next change will be. For this reason, we won't prescribe a given structure; instead, we present the most common models and how they suit different types of businesses.

The very first step in structuring your data team is to find the data people in your organisation. They might not be the people with the term "data" in their title, but any employee who's not afraid of SQL, such as business analysts or operations analysts. If you don't locate your data people carefully, you will end up with an unplanned structure that is unlikely to fit your business needs.

Centralised model

The centralised model is the most straightforward structure to implement, and it is usually the first step for companies that aim to be data driven. This model usually leads to a centralised data "platform", where the data team has access to all the data and serves the whole organisation across a variety of projects. All data engineers, analysts and scientists are managed directly within this team, with their projects and work defined by the head of data. This flexible model adapts to the continuously evolving needs of a growing business. If you're at the beginning of your data journey, that is, you still struggle to have a clear vision of your past and present, this is the structure we recommend. The data team's first projects will seek to bring visibility to the business, ensuring all departments in your organisation have KPIs and dashboards they can trust. This kind of structure is particularly good for analytics where reusability and data governance are important.


✅ The data team can help with other teams' projects while working on its own agenda.

✅ The team can prioritise projects across the company.

✅ There are more opportunities for talent and skillset development in a centralised team: the data team works on a broader variety of projects, and data engineers, scientists and analysts can benefit from their peers' insights.

✅ The head of data has a centralised view of the company's strategy, and can assign data people to projects that are the most suited to their capabilities.

✅ Encourages career growth, as data engineers, scientists and analysts have a clear path to more senior roles.


❌ High chance of disconnect between the data analytics team and other business units. In this model, data engineers and data scientists are not immersed in the day-to-day activities of other teams, making it difficult for them to identify the most relevant problems to tackle.

❌ Risk for the analytics group to be reduced to a "support" function.

❌ As the data team serves the rest of the business, other business units might feel like their needs are not properly addressed, or that the planning process is too bureaucratic and slow.

The embedded model

In an embedded model, each department hires its "own" data people, alongside a centralised data engineering platform. Data analysts and scientists focus on the problems faced by their specific business unit, with little interaction with data people from other areas of the company.


✅ Embedded teams of data people are agile and responsive, because they are dedicated to their respective business functions and have good domain knowledge.

✅ Product managers can assign data tasks to the people most qualified to work on them.


❌ No single source of truth, and duplicated data content.

❌ Data people end up working on redundant issues due to a lack of communication between different teams.

❌ The creation of silos erodes productivity, since data people can't draw on their colleagues' expertise as they do in the centralised model.

❌ Business managers, who usually lack a data science background, will find it hard to manage data people and assess the quality of their work.

Center of Excellence

In the Center of Excellence (COE) model, we keep the idea of a centralised model with a single coordination centre, but data people are also deployed in other business units that need additional resources. This approach retains advantages of both the centralised and the embedded models: it is a more balanced structure in which the data team's actions are coordinated, with data experts also operating in business units.

This strategy is best suited to larger, enterprise-scale companies with a clear strategy and data roadmap. The Center of Excellence model entails a larger data team, as you need data scientists both in the COE and in the different business branches. If you are a small or medium-sized company, your needs might not require a data team of this size.

Again, it's extremely important to know who your data people are. When building a centralised team at the beginning of your data journey, make sure you don't have business analysts or operations analysts embedded in other departments. Otherwise, you will end up with an unwanted mixed model, creating chaos in your organisation. If you create a COE, make sure it is deliberate and planned.

Although the Center of Excellence model provides the advantages of both the centralised and the embedded models, it still presents some drawbacks.


❌ With this approach, there is no centralised group that can focus on enterprise-level problems. Problems are still mostly solved at the unit level.

❌ Another problem is that this leaves little room for innovation. As data teams here largely focus on day-to-day needs in specific business units, they work less on long-term data initiatives.

Goal of a data team

Ultimately, the data team aims to make your organisation more competitive and to increase customer satisfaction by using data insights to improve your product and your predictions. The engineering platform uses tools to give the rest of the analytics team access to clean data. There are other roles on an analytics team that we haven't mentioned in this article, as we only covered the core functions. For example, you often find product managers on data teams. Product managers ensure the product is actually used by the end customers. The product manager is therefore in constant dialogue with the customer, data scientists and data engineers.

Louise de Leyritz

Growth Analyst Intern

© 2021 Castor. All registered.