- Polling file:
- In order
- Only those which have (1) failed processing, or (2) never been processed
- Batch processing:
- Process and store changes in database:
- Update: Update timestamp and details
- Create: Create record
- Delete: Update flag
- Persist polling metadata in the database:
- Ensures that on an app restart, we have stored metadata of which files have been polled before.
- Unless the file was last updated since then, it will not be polled again
- However, there will be the scenario where processed files are polled again, regardless of JobStatus, as long as the file was last updated since the previous poll
- Need to import dependency:
<dependency>
<groupId>org.springframework.integration</groupId>
<artifactId>spring-integration-jdbc</artifactId>
</dependency>
- Create a JdbcMetadataStore bean which will store and fetch metadata of files (file name, last modified date), to be used in the FileSystemPersistentAcceptOnceFileListFilter
@Bean
public JdbcMetadataStore jdbcMetadataStore(DataSource dataSource) {
    return new JdbcMetadataStore(dataSource);
}
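The store can then be wired into the accept-once filter. A sketch, assuming a "files-in" key prefix for the metadata entries (the prefix is an arbitrary choice, not from these notes):

```java
// Sketch: wiring the metadata store into the accept-once filter.
@Bean
public FileSystemPersistentAcceptOnceFileListFilter persistentFilter(
        JdbcMetadataStore metadataStore) {
    FileSystemPersistentAcceptOnceFileListFilter filter =
            new FileSystemPersistentAcceptOnceFileListFilter(metadataStore, "files-in");
    // Flush metadata updates as files are accepted, so a restart
    // mid-poll does not lose track of what was already picked up.
    filter.setFlushOnUpdate(true);
    return filter;
}
```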
- A custom filter condition which can filter out those files which have already been processed.
- However, this does not ensure that the processing is done in order
- A concrete implementation of AbstractFileListFilter which calls the BatchMetadataService, which in turn queries the JobRepository
- A file that passes this filter is one whose last job execution is not BatchStatus.COMPLETED, i.e. it has either:
- Failed processing
- Never been processed before
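The filter could look roughly like the following. This is a sketch, not the final implementation: BatchMetadataService and its lastStatusFor method are names from these notes / hypothetical, not a Spring API.

```java
// Sketch of the custom filter described above.
public class JobSuccessfulFileListFilter extends AbstractFileListFilter<File> {

    private final BatchMetadataService batchMetadataService;

    public JobSuccessfulFileListFilter(BatchMetadataService batchMetadataService) {
        this.batchMetadataService = batchMetadataService;
    }

    @Override
    public boolean accept(File file) {
        // lastStatusFor is a hypothetical lookup backed by the JobRepository.
        BatchStatus last = batchMetadataService.lastStatusFor(file.getName());
        // Accept files that failed, or that have never been processed.
        return last == null || !BatchStatus.COMPLETED.equals(last);
    }
}
```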
- We would need to process the files in order. Following the javadocs:
public static FileInboundChannelAdapterSpec inboundAdapter(File directory,
        @Nullable Comparator<File> receptionOrderComparator)
- receptionOrderComparator - the Comparator for ordering file objects.
- However, this approach might not always guarantee that the files are processed in the order of their names. The reason lies in the asynchronous nature of message processing in Spring Integration: when the inbound adapter detects a new file, it creates a Message containing the file and sends it to the downstream components. The processing of these messages can be asynchronous, meaning they might not be handled in the same order they were received.
- As such, I have done the sorting in the JobSuccessfulFileListFilter as described above
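The ordering step done inside the filter is, in plain Java, just a sort of the accepted files by name. A minimal, self-contained sketch:

```java
import java.io.File;
import java.util.Comparator;
import java.util.List;
import java.util.stream.Collectors;

public class FileOrderingSketch {
    // Deterministic, lexicographic ordering by file name, applied to the
    // accepted files before they are handed downstream.
    static List<File> sortByName(List<File> files) {
        return files.stream()
                .sorted(Comparator.comparing(File::getName))
                .collect(Collectors.toList());
    }
}
```

Sorting inside the filter sidesteps the asynchronous-delivery issue above, because the batch of files is ordered before any messages are emitted.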
- What if a file is only transferred halfway when polled?
- Rename strategy?
- Use LastModified filters?
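One common mitigation for half-transferred files is an age check: only accept a file whose last-modified timestamp is at least some quiet period in the past (Spring Integration ships a LastModifiedFileListFilter for this). A sketch of the rule, with the clock passed in for testability; the method names here are illustrative:

```java
import java.io.File;

public class FileAgeCheckSketch {
    // Core rule, separated from the clock/filesystem: a file is "stable"
    // if its last-modified time is at least minAgeMillis before now,
    // i.e. it has not been written to recently.
    static boolean isStable(long lastModifiedMillis, long minAgeMillis, long nowMillis) {
        return nowMillis - lastModifiedMillis >= minAgeMillis;
    }

    // Convenience overload operating on a real file.
    static boolean isStable(File file, long minAgeMillis) {
        return isStable(file.lastModified(), minAgeMillis, System.currentTimeMillis());
    }
}
```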
The following is the testing strategy that I would like to employ:
- Positive flow:
- 2 files
- assert the final output
- Need to ensure that the Spring Batch job metadata tables are fresh
- Negative flow:
- 1 file positive (?), 1 file negative
Resources: https://www.baeldung.com/spring-batch-testing-job

Some thoughts:
- In memory database? - H2
- To mock the database layer, and to test how it was invoked?
- 2 files:
- Both new (not processed)
- 2 files:
- One already processed, one new (not processed)
- 2 files:
- Both failed
- Processing (?)
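The scenario matrix above can be sanity-checked against the filter rule in plain Java. This is a stand-in for the real Spring Batch lookup: the Status enum and the null-means-never-processed convention are illustrative.

```java
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

public class FilterScenarioSketch {
    // Stand-in for Spring Batch's BatchStatus; only the values
    // the scenarios need.
    enum Status { COMPLETED, FAILED }

    // The rule under test: a file is polled unless its last job execution
    // COMPLETED; files with no prior execution (no map entry) are polled.
    static List<String> accepted(List<String> files, Map<String, Status> lastStatus) {
        return files.stream()
                .filter(f -> lastStatus.get(f) != Status.COMPLETED)
                .collect(Collectors.toList());
    }
}
```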