Spring boot with Apache ignite persistent durable memory storage plus sql queries over ignite cache

In this post we will show how we can do the following :

  1. Integrate spring boot with Apache Ignite
  2. How to enable and use persistent durable memory feature of Apache Ignite which can persist your cache data to the file disk to survive crash or restart so you can avoid data losing.
  3. How to execute SQL queries over ignite caches
  4. How to unit test and integration test ignite with spring boot
  5. Simple Jenkins pipeline reference
  6. Code repository in GitHub : GithubRepo


what is Ignite durable memory ?

Apache Ignite memory-centric platform is based on the durable memory architecture that allows storing and processing data and indexes both in memory and on disk when the Ignite Native Persistence feature is enabled. The durable memory architecture helps achieve in-memory performance with the durability of disk using all the available resources of the cluster

What is ignite data-grid SQL queries ?

Ignite supports a very elegant query API with support for Predicate-based Scan Queries, SQL Queries (ANSI-99 compliant), and Text Queries. For SQL queries ignites supports in-memory indexing, so all the data lookups are extremely fast. If you are caching your data in off-heap memory, then query indexes will also be cached in off-heap memory as well.

Ignite also provides support for custom indexing via IndexingSpi and SpiQuery class.

more information on : https://apacheignite.readme.io/docs/cache-queries

So to have Apache Ignite server node integrated and started in your spring boot app we need to do the following :

  1. Add the following maven dependencies to your spring boot app pom file

  1. Define ignite configuration via java DSL for better portability and management as a spring configuration and the properties values will be loaded from the application.yml file :

  1. then you can just inject ignite instance as a Spring bean which make unit testing much easier

How to enable Ignite durable memory :

How to use Ignite SQL queries over in memory storage:

How to do atomic thread safe action over the same record via cache invoke API:

How to unit test Apache ignite usage in spring boot service :

How to trigger integration test with Ignite, check test resources as well :

How to run and test the application over swagger rest api :

  • build the project via maven : mvn clean install
  • you can run it from IDEA via AlertManagerApplication.java or via java -jar jarName

Screen Shot 2017-11-17 at 16.28.03.png

  • swagger which contain the REST API and REST API model documentation will be accessible on the URL below where you can start triggering different REST API calls exposed by the spring boot app:


Screen Shot 2017-11-17 at 16.24.11

  • if you STOP the app or restart it and do query again , you will find all created entities from last run so it survived the crash plus any possible restart
  • you can build a portable docker image of the whole app using maven Spotify docker plugin if you wish


References :




Guarantee your single computation task to be finished in case of node failures/crash in apache Ignite


How to guarantee your single computation task is guaranteed to failover in case of node failures in apache Ignite ?

As you know failover support in apache ignite for computation tasks is only covered for master slave jobs where slave nodes will do computations then reduce back to the master node , and in case of any failure in slave nodes where slave jobs are executing , then it that failed slave job will fail over to another node to continue execution .

Ok what about if I need to execute just single computation task and I need to have failover guarantee due may be it is a critical task that do financial data modification or must finished task in an acceptable status (Success or Failure) , how we can do that ? it is not supported out of the box by Ignite but we can have a small design extension using Ignite APIs to cover the same , HOW ?

Code reference is hosted into my github :


Single Job fail over guarantee overview

Here is the main steps from the overview above via the following flow :

1- You need to create 2 partitioned caches , one for single jobs reference and one for node Ids reference , you should make those caches backed by persistence store in production if you need to survive total grid crash

2- Define jobs cache after put interceptor to set the node id which is the primary owner and triggerer of that compute task

3- Define nodes cache interceptor to intercept after put actions so it can query for all pending jobs for that node id then submit them again into the compute grid with affinity

4- Enable event listening for node left and node removal in the grid to intercept node failure

Then let us run the show , imagine you have data and compute grid of 2 server nodes :

a- you trigger a job in node 1 which will do sensitive action like financial action and you need to be sure it is finished with a valid state whatever the case

b- what if that primary node 1 crashed , what will happen to that compute task , without the extension highlighted above it will disappear with the wind

c- but with that failover small extension , Node 2 . will catch an event that Node 1 left , then it will query jobs cache for all jobs that has that node id and resubmit them again for computation , optimal case if you have idempotent actions so it can be executed multiple times or use job checkpointing for saving the execution state to resume from the last saved point

Job data model for Jobs cache where we mark node id an ignite SQL queryable indexed field :

How the ignite failed nodes cache interceptor is implemented :

How the ignite jobs cache interceptor is implemented :

Apache ignite config :

Enable Node removal and failure events listening ONLY as enabling too much events will cause some performance overhead:

Main App tester :


Testing flow :

1- first run the first ignite server node with that code commented out :

Screen Shot 2017-11-15 at 15.20.44

2- then run the second server node but before doing it , uncomment the highlighted code above which simulate creating now jobs for computation by inserting them into the jobs cache

3- once you run the second node , after 5 seconds kill it by shutting it down once you see it started to submit jobs from the code you just uncommented, like:

intercepting for job action triggering and setting node id : f0920c5b-3655–4e85-aa60-f763a9eb1111
Executing computation logic for the request0Key

4- you will see in the first still running node a message that highlight it received and event about the removal of the second node which from it , it will fetch the node id , then insert it on the failed nodes cache where its cache interceptor will intercept the after put action , use the node id and query in jobs cache for still pending jobs that has the same node id and resubmit them again for execution in the compute grid and here we are happy that we caught the non finished jobs from the failed crashed primary node that submitted those jobs

Received Node event [evt=NODE_LEFT, nodeID=TcpDiscoveryNode [id=2da3e806–72e3–415b-acd3–07b7da0eabe0, addrs=[0:0:0:0:0:0:0:1%lo0,,], sockAddrs=[/, /0:0:0:0:0:0:0:1%lo0:47501, /], discPort=47501, order=2, intOrder=2, lastExchangeTime=1510666504589, loc=false, ver=2.3.1#20171031-sha1:d2c82c3c, isClient=false]]

and you will see it is fetching pending jobs and submitting it again, for example you will see the following in the IDEA console:

found a pending jobs for node id: c2a32b7d-1420–4e1a-8ca2-b7080e91dc22 and job id: 19Key
Executing the expiry post action for the request19Key

References :