High Availability and Disaster Recovery for IMDG

Salon B/C/D

An in-memory data grid (IMDG) is generally used as a compute grid and/or a DBMS cache, so data in the grid can be safely lost and then recovered from a persistent store. But what if we want to use the grid as the data store for a core banking platform? This paper describes the GridGain in-memory data fabric add-ons and new features that make the grid a durable distributed data store.


  • The bank's requirements for HA & DR
  • The legacy HA & DR architecture
  • The high-level architecture of Sberbank's new-generation core banking platform
  • The business continuity threat model
  • GridGain SPI implementation
    • “Data cells” and the new affinity function
    • the node metadata-based topology validator
    • the quorum-based topology validator
  • The new features of GridGain in-memory data fabric Ultimate edition
    • Local File Store (LFS) – the new distributed data store on the local disks
    • write-ahead logging
    • grid snapshots
    • the common architecture of the backup/restore subsystem
    • the log mining API and data replication
    • node health self-checks
    • system software/firmware upgrades without grid downtime
Enterprise IT Architect
Has worked at Sberbank since 2010. He implemented the concepts of an operational data store (ODS) and a retail risk data mart as part of the enterprise data warehouse. In 2015 he tested 10+ distributed in-memory platforms for transaction processing. He is now responsible for the architecture of the grid-based core banking infrastructure, including high availability and disaster recovery.
Operations expert & manager
Has worked at Sberbank since 2012. He is responsible for building the infrastructure landscape for major mission-critical applications such as core banking and card processing, including the new grid-based banking platform. He now acts as both expert and project manager in the “18+” core banking transformation program.