Integrating Real-Time Stream Processing and Data-Parallel Analytics Using Digital Twins

Target Audience

This talk is for software architects and application developers who build streaming analytics applications that use data-parallel computations to identify issues and aggregate trends across many incoming data streams. These applications must track and analyze incoming messages from thousands of data sources so that they can (a) maintain a dynamic, in-memory model of each data source’s behavior, (b) provide immediate feedback and alerts when issues are discovered, and (c) perform aggregate analysis to boost overall situational awareness.

Numerous applications can benefit from this type of streaming analytics, especially those with thousands of data sources. Examples include fleet tracking for rental car and trucking companies, asset tracking during disaster recovery, logistics for retail outlets, contact tracing for large companies, fraud detection in banking transactions, ecommerce recommendations, healthcare device tracking, and security and intrusion detection.

Purpose of the Talk

The talk describes a software construct called a real-time digital twin running on an in-memory data grid and its use to integrate streaming analytics and data-parallel computations. This construct provides a highly scalable architecture for simultaneously extracting dynamic information from thousands of data streams and continuously feeding it to MapReduce computations. The results of these computations can then be immediately visualized in real-time charts that identify dynamic trends.
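As a rough illustration of the construct described above, the following Java sketch keeps one in-memory state object per data source and routes each incoming message to it. It is not the ScaleOut API: the names `VehicleTwin`, `TwinRegistry`, `deliver`, and `processMessage` and the 75 mph alert threshold are all assumptions made for this example, and a plain `HashMap` stands in for the in-memory data grid.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical twin: the dynamic state tracked for one data source (a vehicle).
class VehicleTwin {
    static final double SPEED_LIMIT_MPH = 75.0; // assumed threshold for the example

    final String id;
    double lastSpeedMph;
    int speedingEvents;

    VehicleTwin(String id) {
        this.id = id;
    }

    // Updates this twin's state from one message and returns an alert, or null if none.
    String processMessage(double speedMph) {
        lastSpeedMph = speedMph;
        if (speedMph > SPEED_LIMIT_MPH) {
            speedingEvents++;
            return "ALERT " + id + ": speed " + speedMph + " mph";
        }
        return null;
    }
}

// Stand-in for the execution platform: creates twins on demand and routes messages by source id.
class TwinRegistry {
    private final Map<String, VehicleTwin> twins = new HashMap<>();
    private final List<String> alerts = new ArrayList<>();

    void deliver(String sourceId, double speedMph) {
        VehicleTwin twin = twins.computeIfAbsent(sourceId, VehicleTwin::new);
        String alert = twin.processMessage(speedMph);
        if (alert != null) {
            alerts.add(alert);
        }
    }

    VehicleTwin twin(String sourceId) {
        return twins.get(sourceId);
    }

    int twinCount() {
        return twins.size();
    }

    List<String> alerts() {
        return alerts;
    }
}
```

In this sketch, calling `deliver("truck-1", 80.0)` on a `TwinRegistry` creates the twin for `truck-1` if needed, updates its state, and records one alert. The application code only has to express per-source logic, while routing and state lookup are left to the platform stand-in.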

The talk uses code samples and demos to show how real-time digital twins can simplify streaming analytics applications by providing a straightforward framework for organizing application code. This approach offloads to the execution platform key functionality that would otherwise challenge the application developer: managing memory-based state, integrating data-parallel computations, avoiding scalability bottlenecks, and ensuring high availability.

Technologies Covered

The talk describes object-oriented APIs used to build applications in languages such as Java and C# that run on an in-memory data grid (IMDG). It covers their execution model on the IMDG, the use of the IMDG to store and access state information, and the implementation of MapReduce computations within the IMDG. It also covers scalability considerations for the IMDG and how to ensure high availability at all stages of the streaming analytics pipeline. Finally, it compares this approach to pipelined and graph-oriented streaming analytics architectures, such as Apache Flink and Apache Beam.
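The MapReduce step over twin state can be pictured with a minimal sketch in plain Java. It assumes each twin exposes a region and a last reported speed; the names `TwinState` and `RegionalAggregator` are hypothetical, and a parallel stream over a local list stands in for the distributed execution an IMDG would provide. The map phase emits a (region, speed) pair per twin and the reduce phase averages the speeds for each region.

```java
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

// Hypothetical snapshot of the state held by one twin.
class TwinState {
    final String region;
    final double lastSpeedMph;

    TwinState(String region, double lastSpeedMph) {
        this.region = region;
        this.lastSpeedMph = lastSpeedMph;
    }
}

class RegionalAggregator {
    // Map: key each twin by region. Reduce: average the speeds within each region.
    static Map<String, Double> averageSpeedByRegion(List<TwinState> twins) {
        return twins.parallelStream()
                .collect(Collectors.groupingBy(
                        t -> t.region,
                        Collectors.averagingDouble(t -> t.lastSpeedMph)));
    }
}
```

Rerunning such an aggregation periodically over the current twin state is one way to produce the continuously updated, per-region results that a real-time chart could display.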

What the Audience Will Learn

The audience will learn about a new model for streaming analytics and its key benefits in comparison to other approaches in addressing challenges for applications that have thousands of data sources. They will see how this model can be applied in numerous use cases and how to use it to easily create data-parallel computations that continuously access streaming state. They will also see how the model enables individualized feedback for thousands of data sources as well as increased situational awareness from aggregate analysis.

Speakers

William Bain

CEO, ScaleOut Software, Inc.

Dr. William L. Bain is founder and CEO of ScaleOut Software, which since 2003 has been developing software products designed to enhance operational intelligence within live systems using scalable, in-memory computing technology. Bill earned a Ph.D. in electrical engineering from Rice University. Over a 40-year career focused on parallel computing, he has contributed to advancements at Bell Labs Research, Intel, and Microsoft, and holds several patents in computer architecture and distributed computing. Bill founded and ran three companies prior to ScaleOut Software. The most recent, Valence Research, developed web load-balancing software and was acquired by Microsoft Corporation to enhance the Windows Server operating system. As an investor and member of the screening committee for the Seattle-based Alliance of Angels, Bill is actively involved in entrepreneurship and the angel community.

Track: Streaming Data

Schedule: Thu, 10/29/2020 - 12:00 (Pacific Time Zone)