Many companies face the challenge of gathering data from disparate sources, organizing these data into a knowledge base, and serving up this information in a relevant way to their customers. With over 19 billion digitized historical records, 80 million family trees, 10 billion people/tree nodes, and three million DNA customers, Ancestry has a trove of data that helps people gain a greater sense of who they are, where they came from, and who else shares their family’s journey.
Companies with big data challenges often integrate Apache Kafka into their data architecture; yet even with a successful Kafka integration, dealing with massive quantities of data can still be overwhelming. How can you make sense of all the data from various sources in your warehouse, meet the needs of a growing business, and remain competitive? One of the ways Ancestry tackled this challenge was by introducing the schema registry into the heart of its data fabric and development process. The results have been transformational, dramatically reducing the time from data source definition to reporting and production. Before the schema registry, a new data set could take months to implement, only to get lost among petabytes of data.
Join this session to learn how Ancestry brought new data sets into production not in months or weeks, but in DAYS, and how the schema registry transformed our data fabric.