Could someone please help me understand what Spring Batch is for?
I've looked at an example, which made me think that this framework can divide a big job into several steps. I've also found that it provides a lot of classes for manipulating files (org.springframework.batch.item.file.*, for example).
But what are the real use cases for this framework? What is it designed for? Read file --> process --> write file?
Have you ever worked in an environment where you have to do a lot of batch processing? This can be quite different from a typical web application, which has to be available 24/7. In classic environments it's not unusual to do the heavy lifting during the night, for example, when no regular users are on the system. Batch processing includes typical tasks like reading and writing files, transforming data, reading from or writing to databases, creating reports, importing and exporting data, and things like that. Often these steps have to be chained together, or you have to create more complex workflows in which you define which job steps can run in parallel and which have to run sequentially. That's where a framework like Spring Batch can be very handy.
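To make "steps chained together" a bit more concrete, here is a minimal plain-Java sketch of a job that runs two steps in sequence, each one consuming the output of the previous one. This is not the Spring Batch API; the class and method names (NightlyJobSketch, dropBlankLines, normalize, runNightlyJob) are made up for illustration, and the "file" is just an in-memory list.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Locale;
import java.util.function.UnaryOperator;

public class NightlyJobSketch {

    // Step 1 (hypothetical): clean the raw input by dropping blank lines.
    public static List<String> dropBlankLines(List<String> lines) {
        List<String> kept = new ArrayList<>();
        for (String line : lines) {
            if (!line.trim().isEmpty()) {
                kept.add(line);
            }
        }
        return kept;
    }

    // Step 2 (hypothetical): transform each record by trimming and upper-casing it.
    public static List<String> normalize(List<String> lines) {
        List<String> result = new ArrayList<>();
        for (String line : lines) {
            result.add(line.trim().toUpperCase(Locale.ROOT));
        }
        return result;
    }

    // Runs the steps strictly in order; each step receives the previous step's output.
    public static List<String> runSequentially(List<String> input,
                                               List<UnaryOperator<List<String>>> steps) {
        List<String> data = input;
        for (UnaryOperator<List<String>> step : steps) {
            data = step.apply(data);
        }
        return data;
    }

    // Wires the two steps into one "job".
    public static List<String> runNightlyJob(List<String> rawLines) {
        List<UnaryOperator<List<String>>> steps = new ArrayList<>();
        steps.add(NightlyJobSketch::dropBlankLines);
        steps.add(NightlyJobSketch::normalize);
        return runSequentially(rawLines, steps);
    }

    public static void main(String[] args) {
        // Stand-in for lines read from an input file.
        System.out.println(runNightlyJob(Arrays.asList("alice ", "", " bob")));
    }
}
```

Even this toy version shows where the effort goes once jobs grow: ordering, error handling and bookkeeping between steps, rather than the transformation logic itself.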
Marco Ehrentreich wrote: Have you ever worked in an environment where you have to do a lot of batch processing?
We had a web app which did batch processing during the night, but it was sufficient to use "org.springframework.scheduling.quartz", which executes a job according to a cron expression. Admittedly, there weren't any dependencies between our jobs. But is it really worth pulling in a whole framework just to handle dependencies between jobs?
I think it's not unusual to have more complex environments where a simple scheduler like Quartz is not enough, although it's really a very helpful tool. Maybe you have batch jobs which may only run when another system is available, or a job is only allowed to run after other jobs have finished their work (successfully). Moreover, there are other features which can be very helpful for batch jobs, like processing big volumes of data in chunks, executing jobs in parallel, synchronizing jobs, or being able to restart a job at the last position of its progress if it crashed for some reason. And there are many other issues addressed by Spring Batch, like transaction handling, messaging, processing of different file formats, transforming data, reading and writing data, etc.
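As a rough illustration of "processing in chunks" and "restarting at the last position", here is a small hand-rolled sketch in plain Java. It is not Spring Batch code: the names (ChunkRestartSketch, Checkpoint, failAtIndex) are hypothetical, the checkpoint lives in memory where a real job would persist it, and the crash is simulated.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Locale;

public class ChunkRestartSketch {

    // Stand-in for persistent job state; a real job would store this outside the JVM.
    public static class Checkpoint {
        public int nextIndex = 0; // index of the first item that has NOT been committed yet
    }

    // Processes items in chunks of chunkSize and "commits" after each complete chunk.
    // failAtIndex simulates a crash when that item is reached (use -1 for no crash).
    public static void run(List<String> items, int chunkSize, Checkpoint checkpoint,
                           List<String> output, int failAtIndex) {
        while (checkpoint.nextIndex < items.size()) {
            int start = checkpoint.nextIndex;
            int end = Math.min(start + chunkSize, items.size());
            List<String> chunk = new ArrayList<>();
            for (int i = start; i < end; i++) {
                if (i == failAtIndex) {
                    // The half-built chunk is discarded, much like a rolled-back transaction.
                    throw new IllegalStateException("simulated crash at item " + i);
                }
                chunk.add(items.get(i).toUpperCase(Locale.ROOT));
            }
            output.addAll(chunk);       // "write" the whole chunk at once
            checkpoint.nextIndex = end; // "commit": remember how far we got
        }
    }

    public static void main(String[] args) {
        List<String> items = Arrays.asList("a", "b", "c", "d", "e");
        Checkpoint checkpoint = new Checkpoint();
        List<String> output = new ArrayList<>();
        try {
            run(items, 2, checkpoint, output, 3); // crashes inside the second chunk
        } catch (IllegalStateException e) {
            System.out.println("Job failed: " + e.getMessage());
        }
        run(items, 2, checkpoint, output, -1);    // restart resumes at the checkpoint
        System.out.println(output);
    }
}
```

After the simulated crash only the first chunk has been written, and the second run picks up from the stored position instead of starting over. Doing this reliably with real files, databases and transactions is the kind of plumbing the features listed above are meant to cover.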
Of course you could build all that on top of something like Quartz, but why would you do that if there's an existing framework which offers all these features?