Comments (14)
Dave Syer commented
Changed title to reflect current usage / terminology.
Discussed with Rob: we would prefer the input / output sources to be created as needed, rather than sitting there as Spring beans. This forces a factory/context pattern up the interface layers into the ItemProvider or its implementations.
from spring-batch.
Dave Syer commented
Custom scopes might be the best approach. There is now a StepScope available, but the ThreadLocals still need to be replaced. Keeping the state encapsulated in an object that expresses the class invariants still makes sense though.
Dave Syer commented
Replaced ThreadLocal everywhere with equivalent RepeatContext mechanism.
Actually it might be better to do the TX synchronization at the ItemProvider level anyway. There is no easy way to synchronize access to a file that is open for XML reading, so it's probably better to buffer the items themselves rather than try to mark an input stream. If we did that it would handle Skippable but not Restartable in the ItemProvider. Would also solve BATCH-26 (sort of) - at least it wouldn't be the same problem any more.
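To make the buffering idea above concrete, here is a minimal sketch, not the project's actual implementation. It assumes a hypothetical `BufferingReader` wrapping a plain `Iterator` that cannot be rewound; the `commit()` and `rollback()` method names are illustrative only. Items read since the last commit are kept in memory, so a rollback replays them without having to mark the underlying stream.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;

/** Hypothetical sketch: buffers items read since the last commit so a rollback can replay them. */
class BufferingReader<T> {
    private final Iterator<T> source;                  // underlying input, cannot be rewound
    private final List<T> buffer = new ArrayList<>();  // items handed out since the last commit
    private int position = 0;                          // next buffer index to hand out

    BufferingReader(Iterator<T> source) {
        this.source = source;
    }

    /** Returns the next item, replaying buffered items first; null when the input is exhausted. */
    T read() {
        if (position < buffer.size()) {
            return buffer.get(position++);
        }
        if (!source.hasNext()) {
            return null;
        }
        T item = source.next();
        buffer.add(item);
        position++;
        return item;
    }

    /** Transaction committed: the buffered items are no longer needed. */
    void commit() {
        buffer.clear();
        position = 0;
    }

    /** Transaction rolled back: replay everything read since the last commit. */
    void rollback() {
        position = 0;
    }
}
```

Note that this covers replay within a running process only; nothing in the buffer survives a restart, which matches the remark that buffering would handle Skippable but not Restartable.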
Lucas Ward commented
I agree that buffering may be another approach to transaction synchronization that is more appropriate in certain scenarios (especially XML). However, I'm not sure I understand the advantage of moving it to an ItemProvider. The InputSource already has state about the input, so it still seems natural that it would buffer the input if needed. Moving it to the ItemProvider would then require that the provider have state about what has been read so far, in addition to what the input source is already holding.
I think this also highlights another issue we've discussed before: at times, the line between ItemProvider and InputSource isn't very clear. It gets especially blurry when reading from XML or any other input source that would return a mapped object. Ideally, ItemProviders would contain nothing but developer code handling the business scenarios of loading the data, with no architectural concerns. However, given that interfaces such as Restartable must be called, that is impossible.
Dave Syer commented
The advantage of doing the buffering at the ItemProvider is that it is generic for all InputSources. I guess it could just as easily be done by the InputSources using a helper object to do the buffering. The developer touch point is normally adding a mapper or something to an ItemProvider anyway, so it won't much matter how we do it.
Lucas Ward commented
While reviewing some of the new XML input sources, I had a new thought about buffering the objects returned by an InputSource or ItemProvider. One issue is that either the ItemProvider (in the case of buffering an InputSource) or the ItemProcessor could modify a buffered object after it has been returned. If the transaction is rolled back and that same object is given back to the ItemProcessor or ItemProvider, there could be issues if it depended upon the object being in its initial state.
We could try to tell developers not to modify objects returned from a transactional buffer, or encourage them to make those objects immutable, but there are no guarantees, and these types of errors would be extremely hard to debug.
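The hazard described above can be shown in a few lines. This is an illustrative sketch only: `Trade` is a hypothetical mutable domain object, and the "rollback" is simulated by simply reading the same reference back out of the buffer.

```java
import java.util.ArrayList;
import java.util.List;

class MutableBufferDemo {
    /** Hypothetical mutable domain object. */
    static class Trade {
        double amount;

        Trade(double amount) {
            this.amount = amount;
        }
    }

    /** Simulates: read an item, mutate it while processing, roll back, replay from the buffer. */
    static double replayAfterMutation() {
        List<Trade> buffer = new ArrayList<>();
        Trade item = new Trade(100.0);
        buffer.add(item);       // the reader buffers the same reference it returns
        item.amount *= 2;       // the processor modifies the object in place
        // The transaction rolls back and the buffered item is handed out again.
        Trade replayed = buffer.get(0);
        return replayed.amount; // carries the processor's change, not the initial state
    }
}
```

Because the buffer holds a reference rather than a snapshot, the replayed item reflects the first attempt's modification, and a processor that doubles the amount again would silently produce the wrong result.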
Wayne Lund commented
I don't think it's possible to ask for these to be immutable. Therefore it sounds like you're spot on in saying that we can't buffer. Too bad we always lose something between language generations. Forte Tool actually rolled back instance variables as well as transactions, but I've never seen another language that supports that feature.
Lucas Ward commented
The only way that I can see to still buffer and not have to worry about objects within the buffer being invalidated would be to either A) make a copy of them, which would almost certainly cause a performance issue, not to mention there would need to be some kind of contract on the domain objects to ensure they are 'copyable', or B) buffer the objects getting written out. After all, they're the only things you really care about, since any given output will always be the same given its corresponding input.
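Option A might look roughly like the following sketch. `CopyingBuffer` and its `copier` function are hypothetical names; the copier stands in for the 'copyable' contract that domain objects would have to satisfy, and every buffered or replayed item costs one extra copy.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.UnaryOperator;

/** Hypothetical sketch of option A: keep private copies so callers cannot corrupt the buffer. */
class CopyingBuffer<T> {
    private final List<T> pristine = new ArrayList<>(); // copies never handed to callers
    private final UnaryOperator<T> copier;              // the 'copyable' contract, supplied by the caller

    CopyingBuffer(UnaryOperator<T> copier) {
        this.copier = copier;
    }

    /** Stores a private copy of the item and returns the original to the caller. */
    T add(T item) {
        pristine.add(copier.apply(item));
        return item;
    }

    /** Returns a fresh copy of the stored item, so the stored one stays in its initial state. */
    T replay(int index) {
        return copier.apply(pristine.get(index));
    }
}
```

For example, with `new CopyingBuffer<StringBuilder>(sb -> new StringBuilder(sb))`, appending to an item returned by `add` leaves the value later returned by `replay` unchanged.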
Dave Syer commented
The original impetus for this was the XML input sources, and they have solved the problem in a different way. So buffering would be nice for maintenance / reliability but arguably adds very little to the user's experience. Leaving this issue open but re-prioritised.
Dave Syer commented
Buffering is a non-issue if there is no TX synchronization - it won't be needed when all the retry happens at the chunk level.
Dave Syer commented
The best approach seems to be to put the mark(), reset() (and markSupported()) methods into ItemReader/Writer. Then let the clients of those classes decide what to do with them.
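A rough sketch of what such a contract could look like on the reading side is shown below. Only the method names mark(), reset() and markSupported() come from the comment above; the `MarkableReader` interface, the signatures and the `ListReader` implementation are assumptions for illustration and are not the actual Spring Batch ItemReader API.

```java
import java.util.List;

/** Hypothetical reader contract with the three methods named in the comment. */
interface MarkableReader<T> {
    T read();                // next item, or null when exhausted
    boolean markSupported(); // whether mark() and reset() have any effect
    void mark();             // remember the current position
    void reset();            // return to the last marked position
}

/** Simple in-memory implementation used to illustrate the contract. */
class ListReader<T> implements MarkableReader<T> {
    private final List<T> items;
    private int next = 0;    // index of the next item to return
    private int marked = 0;  // index saved by the last mark()

    ListReader(List<T> items) {
        this.items = items;
    }

    public T read() {
        return next < items.size() ? items.get(next++) : null;
    }

    public boolean markSupported() {
        return true;
    }

    public void mark() {
        marked = next;
    }

    public void reset() {
        next = marked;
    }
}
```

A client would then decide the policy: for instance, call mark() after each successful commit and reset() on rollback, and fall back to some other strategy when markSupported() returns false.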
Dave Syer commented
A lot of work was done on this that is more relevant to BATCH-222 and BATCH-229. This one is finished really, with some tidying up to do (which will be covered there).
Dave Syer commented
ItemStream and mark(), reset() introduced on all the relevant Item*. The buffering is now an option, if someone wants to wrap an existing Item* (e.g. see StagingItemReader in the samples), but not a necessity.
Dave Syer commented
Assume closed as resolved and released