The Splendors and Miseries of the Distributed Streams

Java 8 introduced the Stream API - a modern, functional, and very powerful tool for performing bulk operations on collections of data. One of the main benefits of the Stream API is that it hides the details of iteration over the underlying data set, which allows for parallel processing within a single JVM using the fork/join framework.
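As a small illustration of that single-JVM model, the sketch below counts words with a parallel stream. The class name `ParallelWordCount`, the method `countWords`, and the sample data are hypothetical and chosen only for this example; the stream calls themselves are standard `java.util.stream` API.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.function.Function;
import java.util.stream.Collectors;

// Hypothetical example class, not taken from the talk.
public class ParallelWordCount {

    // Counts how often each word occurs, ignoring case.
    // The caller never writes a loop: parallelStream() lets the runtime
    // split the list and process the pieces on the common fork/join pool.
    static Map<String, Long> countWords(List<String> words) {
        return words.parallelStream()
                .map(String::toLowerCase)
                .collect(Collectors.groupingBy(Function.identity(), Collectors.counting()));
    }

    public static void main(String[] args) {
        // Sample input assumed for illustration.
        List<String> words = Arrays.asList("stream", "jet", "Stream", "dag", "jet", "stream");
        System.out.println(countWords(words));
    }
}
```

Switching `parallelStream()` to `stream()` yields the same counts sequentially, which is the point of the abstraction: the pipeline describes what to compute, not how the iteration is scheduled.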

I will talk about Stream API implementations that let you develop parallel programs distributed across many machines and many JVMs on top of the Java 8 Stream API.

You will learn how to use the same API you already know from a single JVM to process massive data sets across large clusters.

I will review distributed implementations (OSS and commercial) of the Java Stream API: Infinispan, Oracle Coherence, and Hazelcast Jet.

Using the internals of the Hazelcast Jet implementation as an example, I will introduce the general design behind stream processing with a DAG (directed acyclic graph) and show how it provides in-memory performance while still leveraging a widely known industry API such as the Java Stream API.

Speakers
Viktor Gamov, Senior Solutions Architect at Hazelcast

Viktor Gamov is a Senior Solutions Architect at Hazelcast, the leading open-source in-memory technologies company. Viktor has comprehensive knowledge and expertise in enterprise application architecture leveraging open-source technologies. He helps companies build low-latency, scalable, and highly available distributed systems. He is a co-organizer of the Princeton JUG and the New York Hazelcast User Group, and a co-author of O'Reilly's «Enterprise Web Development». Viktor presents at conferences (http://lanyrd.com/gamussa/), blogs, and produces a podcast. Follow Viktor on Twitter @gamussa.