Apache Kafka, Tiered Storage and TensorFlow for Streaming Machine Learning without a Data Lake

Kai Waehner, Confluent

Machine learning (ML) consists of two phases: model training and model inference. ML frameworks typically rely on a data lake such as HDFS or S3 to process historical data and train analytic models. With a modern streaming architecture, however, it is possible to avoid such a separate data store entirely.

This talk compares a modern streaming architecture to traditional batch and big data alternatives. It explains the benefits: a simplified architecture, the ability to reprocess events in the same order when training different models, and the possibility of building a scalable, mission-critical ML architecture for real-time predictions with far fewer headaches and problems.

The talk explains how this can be achieved with Apache Kafka, Tiered Storage, and TensorFlow.