Apache Flink's Edge in Stream Processing PyData NYC 2024

Apache Flink's Edge in Stream Processing

11-08, 10:55–11:35 (US/Eastern), Central Park East

As real-time data volumes grow, businesses are under increasing pressure to process that data quickly and accurately. Apache Flink is a strong fit for real-time stream processing, offering low-latency data handling and robust support for out-of-order events. This talk is designed for intermediate Python developers who want to explore how Flink's features, such as its event-time and windowing mechanisms and its exactly-once state guarantees, keep results reliable even in demanding scenarios.

We’ll also compare Flink with Kafka, showing where Flink has the advantage in complex event processing and dynamic windowing tasks. Through real-world examples, you’ll see how Flink is used to detect fraud in financial services, optimize ride-sharing routes, and more, all while maintaining high throughput and low latency. The talk will also cover how Flink can simplify and accelerate ETL pipelines, making it a valuable tool for modern data-driven applications.

Ever wondered how businesses handle the deluge of real-time data to make instantaneous decisions? As data streams continue to grow exponentially, the need for efficient, low-latency processing becomes paramount. Apache Flink stands out as a robust solution for real-time stream processing, especially in handling out-of-order events and providing exactly-once semantics. This talk will delve into:

Introduction to Apache Flink and its capabilities: Discover how Flink excels in processing endless streams of data with minimal latency, enabling immediate insights and actions.
Handling out-of-order data and ensuring exactly-once semantics: Learn how Flink's advanced time and windowing capabilities provide accurate and reliable data processing.
Comparing Flink with Kafka for complex use cases: Explore scenarios where Kafka's capabilities may fall short, and Flink's strengths in complex event processing and dynamic windowing shine through.
Real-world examples: From financial services detecting fraud in real time to ride-sharing platforms optimizing routes on the fly, see how Flink handles out-of-order events while keeping processing low-latency and high-throughput.
Building efficient ETL pipelines: Understand how Flink simplifies ETL processes, making data transformations faster and more efficient than traditional batch processing.
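
To make the out-of-order handling topic above concrete, here is a minimal sketch of event-time tumbling windows driven by a watermark. It is a standard-library Python simulation of the general idea, not Flink's API and not code from the talk: the function name `tumbling_window_counts`, its parameters, and the sample events are all hypothetical, and the watermark rule (largest event time seen minus a fixed allowed delay) is a simplified assumption.

```python
from collections import defaultdict


def tumbling_window_counts(events, window_size, max_out_of_orderness):
    """Count events per event-time tumbling window, tolerating bounded disorder.

    events: iterable of (event_time, key) pairs in ARRIVAL order; event_time
            is an integer timestamp (hypothetical units, e.g. seconds).
    Returns (results, late): results is a list of (window_start, count) in the
    order windows were closed; late lists events that arrived after their
    window had already been closed by the watermark.
    """
    open_windows = defaultdict(int)  # window_start -> running count
    results, late = [], []
    watermark = float("-inf")        # "no event older than this is expected"

    for event_time, key in events:
        window_start = event_time - event_time % window_size

        if window_start + window_size <= watermark:
            # The window this event belongs to was already emitted.
            late.append((event_time, key))
        else:
            open_windows[window_start] += 1

        # Simplified watermark: trail the largest timestamp seen so far.
        watermark = max(watermark, event_time - max_out_of_orderness)

        # Emit every window whose end the watermark has reached.
        for start in sorted(open_windows):
            if start + window_size <= watermark:
                results.append((start, open_windows.pop(start)))

    # End of input: flush whatever is still open.
    for start in sorted(open_windows):
        results.append((start, open_windows[start]))
    return results, late


# Hypothetical arrival order: (8, "d") and (3, "f") arrive out of order.
sample = [(1, "a"), (4, "b"), (12, "c"), (8, "d"), (17, "e"), (3, "f"), (25, "g")]
results, late = tumbling_window_counts(sample, window_size=10, max_out_of_orderness=5)
print(results)  # [(0, 3), (10, 2), (20, 1)]
print(late)     # [(3, 'f')]
```

In this run the event at time 8 arrives after the event at time 12 but is still counted in window [0, 10), because the watermark has only reached 7. The event at time 3 arrives after the watermark has passed 10, so it is reported as late. A larger allowed delay accepts more disorder at the cost of emitting results later.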

Join us to uncover how Apache Flink is redefining real-time stream processing and why it's a crucial tool for modern data-driven solutions.

Prior Knowledge Expected –

No previous knowledge expected

Shekhar Prasad Rajak

Shekhar is passionate about open source software and is active in various open source projects. He has contributed to SymPy, NumPy, and SciPy, and to Ruby gems such as daru, daru-view (author), and Bundler. He successfully completed Google Summer of Code in 2016 and 2017, and has also served as an admin for SciRuby and as a mentor. Shekhar was a speaker at RubyConf 2018, PyCon 2017, and ApacheCon 2020, on “Running ML algorithms with ML tools available in Apache Ecosystem” and “Cluster Management in Apache Ecosystem & Kubernetes”.
