Using Confluent Platform to Complete a Massive Cloud Provider Migration and Handle Half a Million Events Per Second

In the past 12 months, games and other forms of content made with the Unity platform were installed 33 billion times reaching 3 billion devices worldwide. Apart from our real-time 3D development platform, Unity has a monetization network that is among the largest in the world.

The centralized data team that supports the Unity development platform and the monetization network (my team) is relatively small. With only about a dozen people, we carry large responsibilities, including managing the data infrastructure that underpins the Unity platform and helping make Unity a data-driven company. As a result, each of us has to be very productive and do quite a bit with the resources we have. That’s one of the reasons we built our data infrastructure on Confluent Platform and Apache Kafka®. Today, this infrastructure handles about half a million events per second on average, with peaks of about a million events per second. It also reliably handles millions of dollars of monetary transactions. In fact, since we went live with Confluent Platform and Kafka a year ago, we have had zero outages that resulted in monetary loss.

A look back at how we got here and our move from Amazon Web Services (AWS) to Google Cloud Platform (GCP)

Though our company-wide data infrastructure is running smoothly now, it was not always so well integrated. Like many companies, Unity has a range of departments (analytics, R&D, monetization, cloud services, and so on), and each had its own data pipelines running on its own technology stack. Some were using Amazon Redshift and Snowflake, while others had been using Kafka for quite some time (dating back to version 0.7). We wanted to bring all of this together and build a common data pipeline for all of Unity. We went with Kafka because our previous experience showed us it was stable, easy to deploy, and high performing.

Then last year, Unity undertook a major cloud provider migration from AWS to GCP. At that point, seeing the benefits of better control, additional support, and guidance on our Kafka architecture and best practices, we reached out to Confluent. Engineers from the Confluent Professional Services team met with us at our offices. We spent about a week going through our use cases and discussing the best way to set up our architecture, both for the migration and for long-term operation. They gave us advice on the type of cluster and nodes that would work best for our specific needs.

We started running a new Kafka cluster on GCP and began mirroring traffic between it and our AWS Kafka cluster with Confluent Replicator, which actively synchronizes and replicates data between public clouds and datacenters. It was critically important for us to finish this migration in a safe way because even one day of outage could cost the company considerably.
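As a rough illustration of what mirroring with Replicator can look like, the sketch below builds the JSON payload that would be posted to a Kafka Connect REST endpoint to create a Replicator connector. This is not our actual configuration: the connector name, broker addresses, topic names, task count, and the AWS-to-GCP direction are all hypothetical placeholders, and the property names reflect the Replicator documentation as best understood here, so check them against the version you run.

```python
import json


def build_replicator_payload(name, src_bootstrap, dest_bootstrap, topics, tasks_max=4):
    """Build a Kafka Connect REST payload for a Replicator connector.

    All argument values are supplied by the caller; nothing here is taken
    from a real deployment.
    """
    byte_converter = "io.confluent.connect.replicator.util.ByteArrayConverter"
    return {
        "name": name,
        "config": {
            "connector.class": "io.confluent.connect.replicator.ReplicatorSourceConnector",
            # Cluster to read from (assumed here to be the old AWS cluster).
            "src.kafka.bootstrap.servers": src_bootstrap,
            # Cluster to write to (assumed here to be the new GCP cluster).
            "dest.kafka.bootstrap.servers": dest_bootstrap,
            # Comma-separated list of topics to copy.
            "topic.whitelist": ",".join(topics),
            # Copy raw bytes so messages are not re-serialized in transit.
            "key.converter": byte_converter,
            "value.converter": byte_converter,
            "tasks.max": str(tasks_max),
        },
    }


# Hypothetical hosts and topics, for illustration only.
payload = build_replicator_payload(
    name="aws-to-gcp-mirror",
    src_bootstrap="kafka-aws.example.internal:9092",
    dest_bootstrap="kafka-gcp.example.internal:9092",
    topics=["game-events", "ad-transactions"],
)
print(json.dumps(payload, indent=2))
```

The resulting JSON would typically be sent with a POST request to the `/connectors` endpoint of a Connect worker. Keeping the payload in a small function makes it easy to review and version the topic list as more pipelines are moved over.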

On the other hand, we needed to move as quickly as was reasonably possible, since every month we spent running a company as large as Unity on both AWS and GCP was essentially doubling our cloud costs. We had a huge incentive to move fast, but an equally huge incentive to make absolutely certain it was done right. With Confluent support, we successfully completed the massive migration over the course of a few months, moving petabytes of data—and our operations—from AWS to GCP.

New opportunities opened

Having a well-proven data infrastructure based on Confluent Platform has opened a lot of new possibilities for product teams across Unity. Many of them have reached out to us for help in building new solutions based on the central Kafka cluster that we support.

The Confluent Platform-based data infrastructure enabled us to move from batch-model thinking to an event streaming model. For example, the previous data lake at Unity had a latency of up to two days: an ETL job ran once a day, and its results were ready the next day. Now, latency is down to 15 minutes. Reducing the time between an event and the point at which an informed decision can be made based on it has led to a variety of business improvement opportunities.
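To make the latency numbers concrete, here is a small worked example. It assumes a simplified model that is not spelled out above: an event that just misses the daily ETL run waits one full interval for the next run, and the results then take about one more day to become available. The function name and the exact one-day figures are illustrative assumptions.

```python
from datetime import timedelta


def worst_case_batch_latency(run_interval, results_delay):
    """Worst-case delay from an event occurring to its results being usable.

    An event arriving just after a run starts waits `run_interval` for the
    next run, then `results_delay` until that run's output is ready.
    """
    return run_interval + results_delay


# Assumed figures: one ETL run per day, results ready a day after the run.
old_latency = worst_case_batch_latency(timedelta(days=1), timedelta(days=1))

# Current end-to-end latency as stated in the text.
new_latency = timedelta(minutes=15)

# Dividing two timedeltas yields a plain float ratio.
improvement = old_latency / new_latency

print(old_latency)   # 2 days, 0:00:00
print(improvement)   # 192.0
```

Under these assumptions, going from two days to 15 minutes is roughly a 192-fold reduction in the time between an event and a decision based on it.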

Aside from reduced latency, the improved reliability we’ve seen since deploying Confluent Platform has also led to significant shifts in the way our internal developers think about our data platform. It’s important to note here that we don’t have anyone whose job it is to simply look after Kafka. We expect it to work reliably on its own, and today it is rock solid. But it wasn’t always that way.

We were on Kafka 0.10 for some time in the past, and although we knew that new versions likely had fixes that we needed, we put off upgrading because we didn’t know if the update would have other unforeseen effects. If something went wrong in our environment back then, Kafka was typically on our list of possible suspects. Now that we are on Confluent Platform and staying current with the latest releases, Kafka is not even on that list. And when we perform an update, we can use Ansible Playbooks from Confluent to test the upgrade. Additionally, we know that Confluent support will respond within the hour if we run into anything unexpected.

Plans for feeding machine learning models with real-time data

The fact that we completed the migration from AWS into GCP with zero downtime and zero data loss was a huge win for Unity, as was changing how Kafka and event streaming are perceived by our teams. Now that those teams have seen how stable it is, they are starting to build more and more event streaming systems on top of Confluent Platform. In fact, we’re exploring usage of the platform to support faster training of machine learning models. Kai Waehner gave an interesting talk on this, and we see several initiatives that would benefit from using Kafka to feed machine learning models in real time.

Interested in more?

If you’d like to know more, you can download the Confluent Platform to get started with a complete event streaming platform built by the original creators of Apache Kafka.

Oguz Kayral is the engineering manager at Unity Technologies. Oguz leads the centralized data team that supports the Unity development platform and the monetization network.
