Catalog of Data Storage Tools

By Louise de Leyritz from CastorDoc


As data proliferates in modern organizations, the technologies we use to store it have evolved considerably.

This explains the explosion of data storage and data warehousing tools over the past few years. The trend is not going to stop, so we would rather bring visibility and structure to this market sooner than later.

At CastorDoc, we believe the first step toward structuring the data storage tools market is more transparency. For that reason, we put together a list of all the data storage tools we have heard of.

This list is still exploratory and may contain errors or lack information. Please reach out to us if you notice anything wrong: louise@castordoc.com

In-depth analysis and evolution

Read the full breakdown by generation and market analysis of data storage tools here.

Deep dive into data storage tools

What does each column in the benchmark below mean?

Deployment support: Does the solution support a SaaS deployment model, an open-source model, or both?

Solution: Is the solution a modern or a legacy solution? Please refer to the article to understand how we differentiate the two.

Target market: Does the solution cater to enterprise clients, or does it offer a more affordable model, more suited to mid-market organizations? "Universal" solutions are adapted to both enterprise and mid-market clients.

Core use case: Again, to fully understand this criterion, please refer to the article Cloud Data Warehousing: The Past, Present, and Future. The aim of this criterion is to distinguish between pure data warehouses and solutions focusing on real-time analytics.

Support for Standard SQL: Does the solution support Standard SQL for querying the warehouse? Today, most solutions do, as SQL is the most widespread database language, but there are still some exceptions.

Support for semi-structured data: Does the solution support semi-structured data like Avro, JSON and XML?
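
To make the criterion concrete, here is a minimal Python sketch of what "semi-structured" means in practice: records from the same source that do not all share the same fields or nesting. The event records and the get_path helper are hypothetical, made up for illustration, and do not reflect how any tool in the benchmark handles such data.

```python
import json

# Hypothetical event records: same stream, but not the same shape.
raw_events = [
    '{"user": {"id": 1, "country": "FR"}, "action": "login"}',
    '{"user": {"id": 2}, "action": "purchase", "items": [{"sku": "A1", "qty": 2}]}',
]

def get_path(record, path, default=None):
    """Walk a dotted path such as 'user.country' through nested dicts.

    Returns `default` when any key along the path is missing.
    """
    current = record
    for key in path.split("."):
        if not isinstance(current, dict) or key not in current:
            return default
        current = current[key]
    return current

events = [json.loads(line) for line in raw_events]

# The second record has no country, so the lookup falls back to None.
countries = [get_path(event, "user.country") for event in events]
```

A solution that supports semi-structured data lets you store and query records like these without first forcing them into a fixed set of columns.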

Decoupled storage/compute: This means that what you pay to store data is separate from what you pay to run queries on it. This not only brings cost benefits but also lets cloud data warehouses scale compute independently, so they can run hundreds of queries concurrently.
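
As a rough illustration of the cost side, the sketch below bills storage and compute as two independent line items. The prices and workloads are made-up placeholders, not the pricing of any vendor in the benchmark.

```python
def monthly_cost(stored_tb, compute_hours, price_per_tb, price_per_hour):
    """Return a cost breakdown where storage and compute are billed separately."""
    storage = stored_tb * price_per_tb
    compute = compute_hours * price_per_hour
    return {"storage": storage, "compute": compute, "total": storage + compute}

# Hypothetical prices, for illustration only.
PRICE_PER_TB = 20    # per TB stored per month
PRICE_PER_HOUR = 2   # per hour of query compute

# A team that stores a lot but rarely queries pays mostly for storage.
archive_heavy = monthly_cost(100, 10, PRICE_PER_TB, PRICE_PER_HOUR)

# A team that stores little but queries constantly pays mostly for compute.
query_heavy = monthly_cost(1, 500, PRICE_PER_TB, PRICE_PER_HOUR)
```

With a coupled model, both teams would have to size a single cluster for their larger need; with a decoupled model, each pays for the dimension it actually uses.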

Data storage benchmark and key features

Additional Comparison and Benchmark Resources

An in-depth comparison between Snowflake, Redshift, and BigQuery

An in-depth comparison between ClickHouse, PostgreSQL, and TimescaleDB

Detailed comparison between ClickHouse, Druid, and Pinot