Getting Started with the Kafka Streams API using Confluent Docker Images


What’s great about the Kafka Streams API is not just how fast your application can process data with it, but also how fast you can get up and running with your application in the first place, regardless of whether you are implementing your applications in Java or other JVM-based languages such as Scala and Clojure. Unlike competing technologies, Apache Kafka® and its Streams API do not require installing a separate processing cluster, and they are equally viable for small, medium, large, and very large use cases.

In fact, it’s pretty common for our users to have their first application or proof-of-concept running in a matter of minutes. Some users, for example, opt to test-drive and develop their applications on their laptops against embedded, in-memory instances of Kafka and related services such as Confluent Schema Registry. They also use the same setup for automated integration testing in CI environments backed by Jenkins or Travis CI. Our own GitHub repo containing the Confluent demo applications uses exactly such a setup.

Docker and Kafka Streams API: A Perfect Match

Many developers love container technologies such as Docker and the Confluent Docker images to speed up the iterative development they’re doing on their laptops: for example, to quickly spin up a containerized Confluent Platform deployment consisting of multiple services such as Apache Kafka, Confluent Schema Registry, and Confluent REST Proxy for Kafka.

Docker is also a very popular choice among Kafka users for containerizing and deploying applications and microservices on platforms such as Kubernetes or in the cloud. And yes, unlike with related technologies such as Apache® Spark™ or Apache Flink®, where you must install and run special processing clusters into which you then submit cluster-specific “processing jobs,” you actually can containerize applications that use the Kafka Streams API because these are standard Java applications. (As a side note, these applications are backwards and forwards compatible with Kafka cluster versions, which makes such deployments flexible enough to accommodate teams working independently across a company.) This also means you can use the same organizational processes and technical tooling for development, testing, packaging, deployment, and monitoring of Kafka Streams applications that you use everywhere else inside your company. For example, if you don’t like containers but prefer deploying to VMs with Puppet or Ansible, no problem. If you do like containers and enjoy deploying to Kubernetes or a cloud service like AWS EC2, no problem either. And, speaking of containers and Docker, this brings us to the focus of this blog post.

To get started with the Kafka Streams API, most users begin with our Confluent demo applications or the Kafka Streams API chapter in the Confluent documentation. To make getting started even easier, we recently added a new Docker-based demo setup. This Docker-based demo is the focus of this blog post and, because the demo is a one-click experience, the remainder of this post will be quite short!

Creating the Kafka Music Demo

We will run the Confluent Kafka Music demo application in a containerized, multi-service deployment, using Docker. The first time you follow these steps, this will take you about five minutes. On subsequent runs, it will take just a few seconds!

Our Kafka Music application demonstrates how to build a music charts application that continuously computes, in real time, the latest charts such as “Top 5 songs” per music genre. It exposes its latest processing results (the latest music charts) through Kafka’s Interactive Queries feature (see our documentation on Interactive Queries) combined with a REST API. The application’s input data is in Avro format and comes from two sources: a stream of play events (think: “song X was just played”) and a stream of song metadata (“song X was written by artist Y”). The corresponding Avro schemas are registered with the Confluent Schema Registry instance because that’s how one creates production-ready data streams.
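To make the computation concrete, here is a minimal sketch in plain Java of the chart logic described above: count plays per song, join the counts with the song metadata, and keep the five most-played songs per genre. It deliberately does not use the Kafka Streams API, Avro, or Interactive Queries, and it works on an in-memory batch rather than a continuous stream. The names `TopFiveSketch`, `Song`, and `topFiveByGenre` are hypothetical and are not taken from the demo's source code.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.stream.Collectors;

public class TopFiveSketch {

    /** Song metadata; the fields are illustrative, not the demo's Avro schema. */
    static final class Song {
        final long id;
        final String name;
        final String genre;

        Song(long id, String name, String genre) {
            this.id = id;
            this.name = name;
            this.genre = genre;
        }
    }

    /**
     * playedSongIds stands in for the stream of play events,
     * songs for the song metadata keyed by song id.
     */
    static Map<String, List<String>> topFiveByGenre(List<Long> playedSongIds,
                                                    Map<Long, Song> songs) {
        // Step 1: aggregate play events into a play count per song id.
        Map<Long, Long> playCounts = new HashMap<>();
        for (long songId : playedSongIds) {
            playCounts.merge(songId, 1L, Long::sum);
        }

        // Step 2: join the counts with the metadata and group song ids by genre.
        Map<String, List<Long>> idsByGenre = new TreeMap<>();
        for (Long songId : playCounts.keySet()) {
            Song song = songs.get(songId);
            if (song == null) {
                continue; // play event without metadata: skipped in this sketch
            }
            idsByGenre.computeIfAbsent(song.genre, g -> new ArrayList<>()).add(songId);
        }

        // Step 3: per genre, sort by play count (descending), break ties by name, keep five.
        Map<String, List<String>> charts = new TreeMap<>();
        for (Map.Entry<String, List<Long>> entry : idsByGenre.entrySet()) {
            List<String> topFive = entry.getValue().stream()
                .sorted(Comparator
                    .comparing((Long id) -> playCounts.get(id)).reversed()
                    .thenComparing(id -> songs.get(id).name))
                .limit(5)
                .map(id -> songs.get(id).name)
                .collect(Collectors.toList());
            charts.put(entry.getKey(), topFive);
        }
        return charts;
    }

    public static void main(String[] args) {
        Map<Long, Song> songs = Map.of(
            1L, new Song(1, "Song A", "rock"),
            2L, new Song(2, "Song B", "rock"),
            3L, new Song(3, "Song C", "jazz"));
        List<Long> plays = List.of(1L, 2L, 2L, 3L, 2L, 1L);

        // Prints {jazz=[Song C], rock=[Song B, Song A]}
        System.out.println(topFiveByGenre(plays, songs));
    }
}
```

In a streaming application the same three steps run continuously as new play events arrive, so the charts are always up to date; the sketch only shows the shape of the aggregation.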

We will run the following containerized services:

If you first want to see a preview of what we will do in the subsequent sections, take a look at the following screencast:

Screencast: Running Confluent Kafka Music demo application (3 mins)


There is only one requirement to meet: you must install recent versions of Docker and Docker Compose on your host machine (e.g., your laptop running macOS, Linux, or Windows) if you haven’t done so already. If you are on a Mac, follow the instructions at Docker for Mac. The Confluent Docker images require Docker version 1.11 or greater.

For reference, I have run the instructions in this blog post on a MacBook Pro with macOS Sierra and the following Docker versions:

Running the Kafka Music demo application

The first step is to clone the Confluent Docker Images repository:

Now we can launch the Kafka Music demo application including the services it depends on, such as Kafka:

After a few seconds, the application and the services are up and running. One of the started containers is continuously generating input data for the application by writing into the application’s input topics. This allows us to look at live, real-time data when using the Kafka Music application.

Now we can use our web browser or a CLI tool such as curl to interactively query the latest processing results of the Kafka Music application by accessing its REST API. In other words, we can play around now!

REST API example 1: list all running application instances of the Kafka Music application

REST API example 2: get the latest Top 5 songs across all music genres

The REST API exposed by the Kafka Music demo application supports further operations. See the top-level instructions in its source code for details (link points to the sources for Confluent 3.2).
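If you prefer to query the REST API from code rather than from a browser or curl, the following is a minimal sketch using the JDK's built-in `java.net.http.HttpClient` (Java 11 or later). Note the assumptions: the base URL `http://localhost:7070/kafka-music` and the paths `/instances` and `/charts/top-five` are not stated in this post and are used here for illustration only, so check them against the instructions in the demo's source code. The class name `KafkaMusicRestSketch` and its helper methods are hypothetical.

```java
import java.io.IOException;
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;
import java.time.Duration;

public class KafkaMusicRestSketch {

    // ASSUMPTION: host, port, and path prefix of the demo's REST API.
    static final String DEFAULT_BASE_URL = "http://localhost:7070/kafka-music";

    /** Joins a base URL and a path with exactly one slash between them. */
    static URI endpoint(String baseUrl, String path) {
        String base = baseUrl.endsWith("/")
            ? baseUrl.substring(0, baseUrl.length() - 1)
            : baseUrl;
        String suffix = path.startsWith("/") ? path : "/" + path;
        return URI.create(base + suffix);
    }

    /** Sends a GET request and returns the response body as a string. */
    static String get(URI uri) throws IOException, InterruptedException {
        HttpClient client = HttpClient.newBuilder()
            .connectTimeout(Duration.ofSeconds(5))
            .build();
        HttpRequest request = HttpRequest.newBuilder(uri).GET().build();
        HttpResponse<String> response =
            client.send(request, HttpResponse.BodyHandlers.ofString());
        return response.body();
    }

    public static void main(String[] args) throws InterruptedException {
        // Optionally pass a different base URL as the first argument.
        String baseUrl = args.length > 0 ? args[0] : DEFAULT_BASE_URL;
        try {
            // ASSUMPTION: path for listing the running application instances.
            System.out.println(get(endpoint(baseUrl, "/instances")));
            // ASSUMPTION: path for the Top 5 songs across all genres.
            System.out.println(get(endpoint(baseUrl, "/charts/top-five")));
        } catch (IOException e) {
            System.err.println("Could not reach " + baseUrl
                + " (is the demo running?): " + e);
        }
    }
}
```

Because the input-generating container keeps writing new play events, running the sketch repeatedly should return different charts over time.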

If you’d like to continue exploring, perhaps by creating new Kafka topics or launching additional demonstrations, take a closer look at our Docker tutorial for Confluent 3.2.1.

Once you’re done you can stop all the services and containers with:

Conclusion and Wrapping Up

What’s great about what we have just done is not the actual Kafka Music example. Rather, it’s that you can do the very same for your own applications! You can containerize your Kafka Streams application, similar to what we have done for the Kafka Music application above, and you can also deploy your application easily alongside other services such as an Apache Kafka cluster (with one or multiple brokers), Confluent Schema Registry, Confluent Control Center, and much more, including your own dockerized services. All you need is Docker and the Confluent Docker images for Apache Kafka and friends. If you need an example or template for containerizing your Kafka Streams application, take a look at the source code of the Docker image we used for this blog post.

Lastly, the image for running the Kafka Music demo application actually contains all of Confluent’s Kafka Streams demo applications. This means you can easily run any of these applications, too. I won’t cover that in this blog post, but we have instructions for how to do so.

Next Steps

If you have enjoyed this article, you might want to continue with the following resources to learn more about Apache Kafka’s Streams API:
