Enabling Real-Time Analytics for Hadoop Data Lakes with GridGain

Data lakes, such as those powered by Hadoop, are an excellent choice for analytics and reporting at scale. Hadoop scales horizontally and cost-effectively and handles long-running operations that span big data sets. However, the continual growth of real-time analytics requirements, where operations need to complete in seconds rather than minutes, or milliseconds rather than seconds, has brought new challenges to Hadoop-based solutions.

In this session, Denis Magda describes how Apache Ignite and GridGain, as in-memory computing platforms, can modernize existing data lake architectures, enabling real-time analytics that spans operational, historical, and streaming data sets. In particular, you'll learn the following:

How to choose the right deployment mode and responsibilities when working with GridGain and Hadoop
How to determine which operations should be handled by GridGain and which should be sent to Hadoop
How to use Spark DataFrames to run federated (aka cross-database) queries that span GridGain and Hadoop
How to perform initial data loading from Hadoop to GridGain
How to set up bi-directional synchronization between Hadoop and GridGain

Schedule:

Mon, 06/03/2019 - 14:40

Room:

Edward 5-7

Tracks:

Architecture

Speakers

Denis Magda, Product Manager at GridGain Systems

Denis Magda runs Product Management at GridGain Systems and is the Vice President of the Apache Ignite PMC. He is an expert in distributed systems and platforms who actively contributes to Apache Ignite and helps companies and individuals deploy it for mission-critical applications. You can be sure to come across Denis at conferences, workshops, and other events, where he shares his knowledge of use cases, best practices, and implementation tips and tricks for building efficient applications with in-memory data grids, distributed databases, and in-memory computing platforms, including Apache Ignite and GridGain.

Before joining GridGain and becoming part of the Apache Ignite community, Denis worked for Oracle, where he led the Java ME Embedded Porting Team, helping bring Java to IoT.

Slides & Recordings

Download Slides
