In-Memory Performance at the Cost of Flash

Reducing the number of data-stack layers and incorporating real-time data processing engines that exploit CPU parallelism enables bare-metal application performance: millions of application ops/sec per node at unprecedentedly low latencies. Using flash as if it were memory can cut costs substantially while remaining extremely fast, but we see only a fraction of that speed once we layer OS abstractions, middleware, and applications on top, forcing us to use far more hardware and to settle for high, unpredictable latencies.

Slides

KLM, or Kappa and Lambda Architectures, and My Journey from Legacy to the Next New

Lambda and Kappa architectures, and of course serverless concepts, all lead to a scalable, cost-effective solution that can be used for analytics and machine learning as well.

But what if the project you are working on already represents a few thousand person-years of development work, and you are asked to integrate it with yet another project of similar size, and to make the result 'digitized'?

Slides

SnappyData: Apache Spark meets Embedded In-Memory Database

I will introduce an in-memory big data processing platform that is seamlessly integrated with Apache Spark.

Apache Spark is an excellent distributed computing framework. However, it must read its input from an external store every time a job runs, and it must write the results back to some data store afterwards. The time spent on those reads and writes makes it problematic for real-time analytics.
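
To make that concrete, here is a minimal, hedged sketch of the integration using SnappyData's SnappySession (the trades table and its schema are invented for the example): the table lives in the same cluster memory the Spark executors use, so a query neither re-reads an external store nor writes its result back to one.

    import org.apache.spark.sql.Dataset;
    import org.apache.spark.sql.Row;
    import org.apache.spark.sql.SnappySession;
    import org.apache.spark.sql.SparkSession;

    public class SnappySketch {
        public static void main(String[] args) {
            SparkSession spark = SparkSession.builder()
                .appName("snappy-sketch").master("local[*]").getOrCreate();

            // SnappySession extends SparkSession; tables it creates are
            // held in the cluster's memory alongside the Spark executors.
            SnappySession snappy = new SnappySession(spark.sparkContext());

            // Hypothetical column table for the example.
            snappy.sql("CREATE TABLE trades (symbol STRING, price DOUBLE) USING column");
            snappy.sql("INSERT INTO trades VALUES ('ACME', 42.0)");

            // The query runs directly on in-memory data: no per-job read
            // from HDFS/S3 and no write-back of results to a store.
            Dataset<Row> avg = snappy.sql(
                "SELECT symbol, avg(price) FROM trades GROUP BY symbol");
            avg.show();
        }
    }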

Slides

Big Data for Small Dollars

The defining attribute of Big Data is size. It's big data, and that means it's going to cost money to store it for processing. Hardware isn't free.

So instead, what we're going to look at here is running big data analytics in memory, before the data ever hits storage.
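
As a toy illustration of the idea (the event names and the "persist" step are invented), the sketch below keeps only running per-key aggregates in memory and flushes those, so storage cost scales with the number of keys rather than with the raw event volume:

    import java.util.Map;
    import java.util.concurrent.ConcurrentHashMap;
    import java.util.concurrent.atomic.DoubleAdder;

    public class PreStorageAggregator {
        // Running per-key totals held in memory; raw events are never stored.
        private final Map<String, DoubleAdder> totals = new ConcurrentHashMap<>();

        public void onEvent(String key, double value) {
            totals.computeIfAbsent(key, k -> new DoubleAdder()).add(value);
        }

        // Flush only the compact aggregates to (cheap) storage.
        public void flush() {
            totals.forEach((key, sum) ->
                System.out.printf("persist %s -> %.2f%n", key, sum.sum()));
        }

        public static void main(String[] args) {
            PreStorageAggregator agg = new PreStorageAggregator();
            agg.onEvent("sensor-1", 2.5);
            agg.onEvent("sensor-1", 3.5);
            agg.flush(); // persist sensor-1 -> 6.00
        }
    }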

Slides

When One Minute Can Cost You a Million: Predicting Share Prices in Real-Time with Apache Spark and Apache Ignite

The stock trading world is a harsh reality for many investors: critical decisions often have to be made in a very short window of time. In a constantly changing landscape where prices update continuously and investing at the right moment makes all the difference, having the right tools to collect, process, and analyze large volumes of data quickly becomes very important.
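
The abstract stops short of code, but as a flavor of the Ignite side of such a pipeline, here is a minimal sketch (the tick schema and values are invented) that keeps ticks in an in-memory cache and runs ad-hoc SQL over them; in the talk's setup, Spark would sit on top of data like this for the predictive modelling:

    import java.util.List;
    import org.apache.ignite.Ignite;
    import org.apache.ignite.IgniteCache;
    import org.apache.ignite.Ignition;
    import org.apache.ignite.cache.query.SqlFieldsQuery;
    import org.apache.ignite.cache.query.annotations.QuerySqlField;
    import org.apache.ignite.configuration.CacheConfiguration;

    public class TickSketch {
        public static class Tick {
            @QuerySqlField(index = true) String symbol;
            @QuerySqlField double price;
            Tick(String symbol, double price) { this.symbol = symbol; this.price = price; }
        }

        public static void main(String[] args) {
            try (Ignite ignite = Ignition.start()) {
                CacheConfiguration<Long, Tick> cfg = new CacheConfiguration<>("ticks");
                cfg.setIndexedTypes(Long.class, Tick.class);
                IgniteCache<Long, Tick> ticks = ignite.getOrCreateCache(cfg);

                ticks.put(1L, new Tick("ACME", 101.5));
                ticks.put(2L, new Tick("ACME", 102.5));

                // SQL over live in-memory ticks; no batch export needed.
                List<List<?>> rows = ticks.query(new SqlFieldsQuery(
                    "SELECT symbol, avg(price) FROM Tick GROUP BY symbol")).getAll();
                rows.forEach(r -> System.out.println(r.get(0) + " avg=" + r.get(1)));
            }
        }
    }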

Slides

Hybrid Transactional Analytical Processing (HTAP): the Key to Intelligent Systems

Machines are getting stronger, software is getting faster, and data is getting bigger. These three trends should not be viewed in isolation, but as an evolution of how we integrate our technology. Trends in our data need to be identified and analyzed, and these, in turn, govern our software systems and bring intelligence to transactions. This is the essence of hybrid transactional/analytical processing (HTAP): making intelligent decisions based on real-time feedback from in-flight and historical data.
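
As a toy sketch of that idea (the cache name and values are invented), Apache Ignite is one system that serves both sides at once: below, a transactional write commits and an analytical SQL query immediately sees it, with no ETL hop between separate OLTP and OLAP stores.

    import org.apache.ignite.Ignite;
    import org.apache.ignite.IgniteCache;
    import org.apache.ignite.Ignition;
    import org.apache.ignite.cache.CacheAtomicityMode;
    import org.apache.ignite.cache.query.SqlFieldsQuery;
    import org.apache.ignite.configuration.CacheConfiguration;
    import org.apache.ignite.transactions.Transaction;

    public class HtapSketch {
        public static void main(String[] args) {
            try (Ignite ignite = Ignition.start()) {
                CacheConfiguration<Integer, Double> cfg =
                    new CacheConfiguration<Integer, Double>("orders")
                        .setAtomicityMode(CacheAtomicityMode.TRANSACTIONAL)
                        .setIndexedTypes(Integer.class, Double.class);
                IgniteCache<Integer, Double> orders = ignite.getOrCreateCache(cfg);

                // Transactional side: the write commits atomically.
                try (Transaction tx = ignite.transactions().txStart()) {
                    orders.put(1, 250.0);
                    tx.commit();
                }

                // Analytical side: SQL over the same live data.
                Object total = orders.query(new SqlFieldsQuery(
                    "SELECT sum(_val) FROM Double")).getAll().get(0).get(0);
                System.out.println("Total in-flight order value: " + total);
            }
        }
    }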

Slides

Apache Ignite as MPP Accelerator

In-memory computing technologies have already changed many areas of IT, from relational databases to data science solutions. But can a standalone in-memory grid accelerate a traditional enterprise data warehouse (DWH)?

In this talk we will show how Apache Ignite can be used together with the world's first open-source massively parallel processing database, Greenplum DB, to accelerate queries by 12x; a minimal sketch of the caching pattern follows the subtopic list below.

Subtopics of the talk:

- Why traditional DWHs need an in-memory grid
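
The abstract doesn't spell out the wiring, but the general shape is a grid-as-cache in front of the warehouse. Below is a hedged sketch (the connection string, table, and seven-day "hot slice" are all invented) that pulls frequently queried rows out of Greenplum over its PostgreSQL-compatible JDBC interface into an Ignite cache, so repeated reads are served from memory instead of the MPP cluster:

    import java.sql.Connection;
    import java.sql.DriverManager;
    import java.sql.ResultSet;
    import java.sql.Statement;
    import org.apache.ignite.Ignite;
    import org.apache.ignite.IgniteCache;
    import org.apache.ignite.Ignition;

    public class DwhWarmup {
        public static void main(String[] args) throws Exception {
            try (Ignite ignite = Ignition.start();
                 Connection gp = DriverManager.getConnection(
                     "jdbc:postgresql://gp-master:5432/dwh", "analyst", "secret")) {

                IgniteCache<Long, Double> hot = ignite.getOrCreateCache("hot_sales");

                // Warm the grid once with the frequently queried slice.
                try (Statement st = gp.createStatement();
                     ResultSet rs = st.executeQuery(
                         "SELECT order_id, amount FROM sales " +
                         "WHERE sale_date > now() - interval '7 days'")) {
                    while (rs.next())
                        hot.put(rs.getLong(1), rs.getDouble(2));
                }

                // Subsequent reads hit memory, not the warehouse.
                System.out.println("cached rows: " + hot.size());
            }
        }
    }

In a real deployment the bulk load would go through an IgniteDataStreamer and the cache would carry SQL indexes; the sketch only shows the shape of the warm-up step.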

Slides

Fast and robust complex event processing using Apache Ignite

The requirements of modern business go far beyond traditional offline analytics. Businesses need an infrastructure that fully supports new, interactive models of communication with clients.
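
As a taste of what event processing on Ignite looks like, here is a minimal sketch using Ignite's continuous queries (the cache, keys, and the 100.0 threshold are invented): the listener fires as matching updates arrive, instead of waiting for a later batch job. In a production setup the predicate would be pushed into a remote filter so it runs where the data lives.

    import javax.cache.Cache;
    import javax.cache.event.CacheEntryEvent;
    import org.apache.ignite.Ignite;
    import org.apache.ignite.IgniteCache;
    import org.apache.ignite.Ignition;
    import org.apache.ignite.cache.query.ContinuousQuery;
    import org.apache.ignite.cache.query.QueryCursor;

    public class CepSketch {
        public static void main(String[] args) throws Exception {
            try (Ignite ignite = Ignition.start()) {
                IgniteCache<String, Double> events = ignite.getOrCreateCache("events");

                ContinuousQuery<String, Double> qry = new ContinuousQuery<>();

                // Runs on each batch of cache updates as they happen.
                qry.setLocalListener(evts -> {
                    for (CacheEntryEvent<? extends String, ? extends Double> e : evts)
                        if (e.getValue() > 100.0)
                            System.out.println("alert: " + e.getKey() + " = " + e.getValue());
                });

                try (QueryCursor<Cache.Entry<String, Double>> cur = events.query(qry)) {
                    events.put("sensor-1", 50.0);   // below threshold, ignored
                    events.put("sensor-2", 250.0);  // triggers the listener
                    Thread.sleep(500);              // allow async delivery
                }
            }
        }
    }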

Slides

How to Build an Event-Driven, Dynamically Re-Configurable Micro-Services Platform

In this presentation I will show you how to combine Apache Ignite with Docker to build an event-driven microservice platform that is also dynamically re-configurable without any downtime.
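
The abstract leaves the mechanics to the talk itself; one plausible building block, sketched below with invented topic and message names, is Ignite's topic-based messaging: services subscribe to a topic, and because Ignite nodes discover each other automatically, containers hosting subscribers can join or leave the cluster at runtime without a restart.

    import org.apache.ignite.Ignite;
    import org.apache.ignite.Ignition;

    public class EventDrivenSketch {
        public static void main(String[] args) throws Exception {
            try (Ignite ignite = Ignition.start()) {
                // A "service" subscribes to a topic. More instances can be
                // started in new containers and will receive events too.
                ignite.message().localListen("orders", (nodeId, msg) -> {
                    System.out.println("order event: " + msg);
                    return true; // keep the subscription alive
                });

                // Any node in the cluster can publish an event.
                ignite.message().send("orders", "order-42 created");

                Thread.sleep(500); // allow async delivery
            }
        }
    }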

Slides

When Scientific High Performance Computing (HPC) meets Big Data Tools

The NEC Aurora Tsubasa line is the latest vector processor technology to reach HPC. It represents the culmination of decades of R&D in traditional scientific HPC: the computers used for scientific simulations of weather, fluid dynamics, and the like.

The question is: can those vector processors, optimized for large scale physical and mathematical calculations, contribute to present day Big Data Analytics challenges?

We believe the answer is YES. 

Slides