We have a Spring Batch job that processes around 10 million records. Doing this in a single thread is too slow for us to meet our SLA.
To improve performance, we developed a POC in which the master step creates partitions, with each partition representing one unique prod id. The number of unique prod ids can range anywhere from 500 to 4500; in the POC we have 500. Each partition is given one prod id, and the worker step processes the records for that id. All of this works fine end to end.
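For reference, here is a minimal sketch of the partitioning scheme described above, not our actual code. In the real job this logic would sit in an implementation of Spring Batch's `Partitioner.partition(int gridSize)`, which returns a `Map<String, ExecutionContext>`. To keep the sketch runnable without Spring on the classpath, a plain `Map<String, Object>` stands in for `ExecutionContext`. The class name `ProdIdPartitionSketch`, the key `prodId`, the partition names, and the `PROD-n` sample ids are all assumptions made for illustration.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class ProdIdPartitionSketch {

    // Hypothetical key under which a worker step would look up its prod id.
    static final String PROD_ID_KEY = "prodId";

    /**
     * Builds one partition per prod id (the ids are assumed to be unique).
     * The inner map is a stand-in for Spring Batch's ExecutionContext.
     */
    static Map<String, Map<String, Object>> partition(List<String> prodIds) {
        Map<String, Map<String, Object>> partitions = new LinkedHashMap<>();
        int index = 0;
        for (String prodId : prodIds) {
            Map<String, Object> context = new LinkedHashMap<>();
            context.put(PROD_ID_KEY, prodId);
            // Partition names are illustrative; they only need to be unique.
            partitions.put("partition" + index, context);
            index++;
        }
        return partitions;
    }

    public static void main(String[] args) {
        // Made-up ids matching the POC size of 500 unique prod ids.
        List<String> prodIds = new ArrayList<>();
        for (int i = 1; i <= 500; i++) {
            prodIds.add("PROD-" + i);
        }
        Map<String, Map<String, Object>> partitions = partition(prodIds);
        System.out.println(partitions.size() + " partitions created");
    }
}
```

With 500 prod ids this produces 500 partitions, so the master step has to create and hand off 500 worker step executions before the first one can start.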
What we noticed is that the master step takes more than 5 minutes to hand the partition info over to the worker step executions. In other words, there is a gap of more than 5 minutes between the master step generating the partitions and the worker step starting to execute for the first partition.
What might be causing this slowness? What is the Spring Batch framework doing during these 5 minutes?
Here are the 3 SELECT statements that are executed many times during those 5 minutes:
SELECT JOB_EXECUTION_ID, START_TIME, END_TIME, STATUS, EXIT_CODE, EXIT_MESSAGE, CREATE_TIME, LAST_UPDATED, VERSION, JOB_CONFIGURATION_LOCATION from BATCH_JOB_EXECUTION where JOB_INSTANCE_ID = ? order by JOB_EXECUTION_ID desc;
SELECT JOB_EXECUTION_ID, KEY_NAME, TYPE_CD, STRING_VAL, DATE_VAL, LONG_VAL, DOUBLE_VAL, IDENTIFYING from BATCH_JOB_EXECUTION_PARAMS where JOB_EXECUTION_ID = ?;
SELECT STEP_EXECUTION_ID, STEP_NAME, START_TIME, END_TIME, STATUS, COMMIT_COUNT, READ_COUNT, FILTER_COUNT, WRITE_COUNT, EXIT_CODE, EXIT_MESSAGE, READ_SKIP_COUNT, WRITE_SKIP_COUNT, PROCESS_SKIP_COUNT, ROLLBACK_COUNT, LAST_UPDATED, VERSION from BATCH_STEP_EXECUTION where JOB_EXECUTION_ID = ? order by STEP_EXECUTION_ID;