Build and Deploy Digital Twins on an IMDG for Real-Time Streaming Analytics

In use cases ranging from IoT to ecommerce, an ongoing challenge for stream-processing applications is to extract important insights from real-time systems as fast as possible and then generate effective feedback that enhances situational awareness while optimizing operations and avoiding costly failures. The key to extracting these insights and responding in real time is to maintain dynamically evolving state information about each data source and to analyze incoming streaming events using this rich context. In-memory computing techniques make this possible and ensure that events can be processed with low latency. In-memory computing also enables real-time aggregate analytics that can spot important patterns and trends within seconds and then provide an immediate and effective response in rapidly evolving situations.

As described in a previous talk, the “digital twin” model offers a powerful software architecture for organizing stateful stream-processing applications that track the dynamic state of data sources. In-memory data grids (IMDGs) provide a natural platform for hosting real-time digital twins by combining object-oriented data storage with method execution inside the IMDG – where the data lives. Because of their simplicity, low latency, and avoidance of network bottlenecks when accessing state, IMDGs provide a highly appealing alternative to conventional, pipelined stream-processing architectures (such as Apache Storm, Beam, and Flink). Unlike streaming pipelines combined with batch analytics (or other Lambda architectures), IMDGs also enable real-time aggregate analytics across all digital twin instances, vastly improving overall situational awareness.
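To make this concrete, the following is a minimal, self-contained Java sketch of a digital twin model as the talk describes it: a plain object holding evolving per-device state, with a message handler that the IMDG would run in place where the state lives. The base class, event type, and all method names here are illustrative assumptions, not any specific vendor's API.

```java
import java.util.ArrayDeque;
import java.util.Deque;

// Hypothetical contract a hosting IMDG might expect from a twin model.
abstract class DigitalTwinBase<M> {
    abstract void processMessage(M message);
}

// An incoming event from one data source, e.g. a telemetry reading.
class TelemetryEvent {
    final double temperature;
    TelemetryEvent(double temperature) { this.temperature = temperature; }
}

// One twin instance per physical device; its in-memory state provides
// the rich context used when analyzing each incoming event.
class EngineTwin extends DigitalTwinBase<TelemetryEvent> {
    private final Deque<Double> recent = new ArrayDeque<>();
    private boolean alertRaised = false;

    @Override
    void processMessage(TelemetryEvent e) {
        recent.addLast(e.temperature);
        if (recent.size() > 5) recent.removeFirst();  // keep a sliding window
        double avg = recent.stream()
                           .mapToDouble(Double::doubleValue)
                           .average().orElse(0.0);
        if (avg > 90.0) alertRaised = true;           // feedback decision uses context
    }

    boolean isAlertRaised() { return alertRaised; }
}
```

Because the handler runs where the state resides, each event is processed without a network round trip to fetch and write back the twin's state.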

A key challenge in developing and deploying real-time digital twins within IMDGs is to create APIs that make them first-class citizens for application developers. By not forcing developers to combine traditional create/read/update/delete (CRUD) access APIs with method-execution techniques to build ad hoc digital twins, these new APIs, which directly host digital twin models within an IMDG, dramatically simplify application design, transparently handle implementation details, and ensure fast, scalable performance. The net result is that developers can focus on building digital twins for real-time stream-processing while taking full advantage of the underlying power of IMDGs as an execution platform.
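The difference can be sketched as follows: instead of the client performing a CRUD-style read, deserialize, update, and write cycle, the application sends an event to a twin by id and the platform locates (or creates) the instance and invokes its handler in place. This is a minimal Java sketch under that assumption; `sendMessage`, the endpoint class, and the twin model are all hypothetical names, not a real IMDG API.

```java
import java.util.HashMap;
import java.util.Map;

// A simple twin model: counts events received from its data source.
class MeterTwin {
    int eventCount = 0;
    void processMessage(String event) { eventCount++; }
}

// Stand-in for the grid-side endpoint that routes messages to instances.
class MeterEndpoint {
    private final Map<String, MeterTwin> instances = new HashMap<>();

    // One call replaces the client-side read -> update -> write cycle;
    // the twin is created on first message and updated where it lives.
    void sendMessage(String twinId, String event) {
        instances.computeIfAbsent(twinId, id -> new MeterTwin())
                 .processMessage(event);
    }

    MeterTwin get(String twinId) { return instances.get(twinId); }
}
```

In a real deployment the endpoint would also handle partitioning, replication, and handler dispatch across grid nodes, which is exactly the detail the APIs are meant to hide.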

This talk describes a set of digital twin APIs that have been developed to implement real-time digital twins written in C#, Java, and JavaScript. It covers the design considerations that led to specific API features, and it explains how these APIs are implemented using lower-level IMDG APIs, such as CRUD APIs for data storage, ReactiveX for fast event delivery, and parallel method invocations for data-parallel analysis. The talk also covers APIs used to connect digital twins to external data sources such as Kafka, Azure IoT Hub, AWS IoT, and REST. Lastly, it describes how the implementation (both on-premises and cloud-based) ensures transparent throughput scaling and high availability. Several code samples from IoT and intelligent real-time monitoring illustrate these techniques throughout the talk.
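The aggregate-analytics side of this design can also be sketched briefly: a parallel method invocation runs a query method on every twin instance where it lives and merges the partial results. The Java stand-in below uses a local parallel stream in place of a distributed invocation; the `Grid` class and `countOver` method are illustrative assumptions, not a real IMDG API.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// A twin instance whose state is queried by the aggregate analysis.
class TurbineTwin {
    volatile double temperature;
    TurbineTwin(double t) { temperature = t; }
    boolean overThreshold(double limit) { return temperature > limit; }
}

class Grid {
    final Map<String, TurbineTwin> twins = new ConcurrentHashMap<>();

    // Stand-in for a data-parallel method invocation: a real IMDG would
    // ship this evaluation to each node holding twin instances and
    // combine the per-node counts, rather than stream locally.
    long countOver(double limit) {
        return twins.values().parallelStream()
                    .filter(t -> t.overThreshold(limit))
                    .count();
    }
}
```

Running such queries directly against live, in-memory twin state is what lets aggregate patterns surface within seconds rather than waiting for a batch-analytics pass.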

About the Talk

This talk is targeted at application developers who want to explore the use of in-memory computing for streaming analytics. The talk’s goal is to describe new APIs for building real-time digital twin models that run on an in-memory data grid or in the cloud for real-time streaming analytics. The audience should gain an understanding of how this design technique simplifies the use of in-memory data grids (IMDGs) for stream-processing while fully leveraging their ability to process incoming events and access associated state information with low latency, transparent throughput scaling, and high availability. The importance of the talk is that these APIs give developers a compelling new software architecture for stream-processing made possible by in-memory computing platforms.




Regency Ballroom C


ScaleOut Software, Inc.
Dr. William L. Bain founded ScaleOut Software in 2003 to develop in-memory data grid and in-memory computing products. As CEO, he has led the creation of numerous innovations for integrating data-parallel computing with in-memory data storage. Bill holds a Ph.D. in electrical engineering from Rice University. Over a 38-year career focused on parallel computing, he has contributed to advancements at Bell Labs Research, Intel, and Microsoft, and holds several patents in computer architecture and distributed computing. Bill founded and ran three start-up companies prior to ScaleOut Software. The most recent, Valence Research, which developed and distributed Web load-balancing software, was acquired by Microsoft Corporation and is a key feature within the Windows Server operating system. As an investor and member of the screening committee for the Seattle-based Alliance of Angels, Bill is actively involved in entrepreneurship and the angel community. Bill has presented at the prior three IMCS conferences in San Francisco.

Recent talks presented by Bill Bain:
• In-Memory Computing Summit London and Silicon Valley 2018: Integrating Data-Parallel Analytics into Stream-Processing Using an In-Memory Data Grid
• In-Memory Computing Summit London and Silicon Valley 2018: In-Memory Computing Brings Operational Intelligence to Business Challenges
• In-Memory Computing Summit Amsterdam and San Francisco 2017: Stream Processing with In Memory Data Grids: Creating the Digital Twin
• DEVintersection Spring 2017: Supercomputing with Microsoft’s Task Parallel Library
• In-Memory Computing Summit 2016: Implementing User-Defined Data Structures in In-Memory Data Grids
• Database Month New York April 2016: Using Memory-Based NoSQL Data Structures to Eliminate the Network Bottleneck
• IBM POWER8 ISV Testimonial 2015: POWER8 and ScaleOut Software: In-memory computing for operational intelligence
• In-Memory Computing Summit 2015: Implementing Operational Intelligence Using In-Memory, Data-Parallel Computing
• Database Month New York May 2015: Using In-Memory, Data-Parallel Computing for Operational Intelligence
• Big Data Spain 2014: Real Time Analytics with MapReduce And In-Memory
• Strata+Hadoop World 2014: Using Operational Intelligence to Track 10M Cable TV Viewers in Real Time
