Portable Streaming Pipelines with Apache Beam

Kafka Summit NYC 2017 | Streams Track

Much as SQL stands as a lingua franca for declarative data analysis, Apache Beam aims to provide a portable standard for expressing robust, out-of-order data processing pipelines in a variety of languages across a variety of platforms. By cleanly separating the user’s processing logic from details of the underlying execution engine, the same pipelines will run on any Apache Beam runtime environment, whether it’s on premises or in the cloud, on open source frameworks like Apache Spark or Apache Flink, or on managed services like Google Cloud Dataflow.

In this talk, I will:

  • Briefly introduce the capabilities of the Beam model for data processing and integration with IO connectors like Apache Kafka.
  • Discuss the benefits Beam provides regarding portability and ease of use.
  • Demo the same Beam pipeline running on multiple runners in multiple deployment scenarios (e.g., Apache Flink on Google Cloud, Apache Spark on AWS, Apache Apex on premises).
  • Give a glimpse at some of the challenges Beam aims to address in the future.

Frances Perry, Software Engineer, Google