Spring-Batch Without Persisting Metadata to Database

Spring-Batch without persisting metadata to database?

I came back to my own question because the original solution no longer works.
As of Spring Boot 1.5.3, use the following:

@SpringBootApplication(exclude = {DataSourceAutoConfiguration.class})
...
    @Bean
    public PlatformTransactionManager transactionManager() {
        return new ResourcelessTransactionManager();
    }
}

Is there a way to skip persisting metadata for Spring Batch only for particular jobs?

Here is how I achieved what I was looking for, following the suggestions by Mahmoud Ben Hassine (I removed @EnableBatchProcessing because of an unrelated issue).

I have two configuration classes:

@Configuration
public class SpringBatchConfiguration extends DefaultBatchConfigurer {

    @Inject
    public SpringBatchConfiguration(DataSource dataSource) {
        super(dataSource);
    }

    @Bean(name = "persistentJobLauncher")
    public JobLauncher jobLauncher() throws Exception {
        return super.createJobLauncher();
    }

    @Bean
    @Primary
    public StepBuilderFactory stepBuilderFactory() {
        return new StepBuilderFactory(super.getJobRepository(), super.getTransactionManager());
    }

    @Bean
    @Primary
    public JobBuilderFactory jobBuilderFactory() {
        return new JobBuilderFactory(super.getJobRepository());
    }

    @Bean
    public JobExplorer jobExplorer() {
        return super.getJobExplorer();
    }

    @Bean
    public JobRepository jobRepository() {
        return super.getJobRepository();
    }

    @Bean
    public ListableJobLocator jobLocator() {
        return new MapJobRegistry();
    }
}

and the in-memory one:

@Configuration
public class SpringInMemoryBatchConfiguration extends DefaultBatchConfigurer {

    @Inject
    public SpringInMemoryBatchConfiguration() {
    }

    @Bean(name = "inMemoryJobLauncher")
    public JobLauncher inMemoryJobLauncher() throws Exception {
        return super.createJobLauncher();
    }

    @Bean(name = "inMemoryStepBuilderFactory")
    public StepBuilderFactory stepBuilderFactory() {
        return new StepBuilderFactory(super.getJobRepository(), super.getTransactionManager());
    }

    @Bean(name = "inMemoryJobBuilderFactory")
    public JobBuilderFactory inMemoryJobBuilderFactory() {
        return new JobBuilderFactory(super.getJobRepository());
    }
}

To start a "persistent" job, I inject @Qualifier(value = "persistentJobLauncher") JobLauncher launcher; to start an "in-memory" one, I inject @Qualifier(value = "inMemoryJobLauncher") JobLauncher launcher.
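
As a rough sketch of how the two launchers might be consumed, the class below picks one of them per call. The class name JobStarter, the persist flag and the startedAt parameter are made up for illustration and are not part of the answer; the code assumes Spring Batch 4.x on the classpath and the two launcher beans defined above. The job passed in should presumably have been built with the matching builder factory, since the steps record their progress in the repository they were built with.

```java
import org.springframework.batch.core.Job;
import org.springframework.batch.core.JobExecution;
import org.springframework.batch.core.JobParameters;
import org.springframework.batch.core.JobParametersBuilder;
import org.springframework.batch.core.launch.JobLauncher;
import org.springframework.beans.factory.annotation.Qualifier;
import org.springframework.stereotype.Service;

// Hypothetical helper: chooses between the two launchers declared in the configurations.
@Service
public class JobStarter {

    private final JobLauncher persistentLauncher;
    private final JobLauncher inMemoryLauncher;

    public JobStarter(@Qualifier("persistentJobLauncher") JobLauncher persistentLauncher,
                      @Qualifier("inMemoryJobLauncher") JobLauncher inMemoryLauncher) {
        this.persistentLauncher = persistentLauncher;
        this.inMemoryLauncher = inMemoryLauncher;
    }

    // persist == true records the execution in the database-backed repository,
    // persist == false keeps it in the in-memory one.
    public JobExecution start(Job job, boolean persist) throws Exception {
        // A changing parameter (assumption) so that each launch creates a new job instance.
        JobParameters parameters = new JobParametersBuilder()
                .addLong("startedAt", System.currentTimeMillis())
                .toJobParameters();
        JobLauncher launcher = persist ? persistentLauncher : inMemoryLauncher;
        return launcher.run(job, parameters);
    }
}
```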

How to avoid Spring batch persistence of metadata in DB

As far as I know, disabling metadata persistence is not possible. A possible workaround that avoids setting up a 'proper' database is to use an in-memory database for the metadata:

pom.xml

<dependency>
    <groupId>com.h2database</groupId>
    <artifactId>h2</artifactId>
</dependency>

However, Spring Batch would then use the default Spring datasource for its job metadata repository. Here is a complete workaround that defines a secondary datasource for it:

application.properties

###################################
### JOBS DATASOURCE PROPERTIES. ###
###################################
## URL used to connect to the jobs database.
spring.secondDatasource.url=jdbc:h2:mem:jobsdb;DB_CLOSE_DELAY=-1;DB_CLOSE_ON_EXIT=FALSE
## Driver class used to connect to the jobs database (it will depend on datasource).
spring.secondDatasource.driver-class-name=org.h2.Driver
## User name
spring.secondDatasource.username=xxx
## Password
spring.secondDatasource.password=xxx
## Datasource configuration for jobs database.
spring.jpa.hibernate.ddl-auto=create-drop
spring.secondDatasource.initialize=true
spring.secondDatasource.test-on-borrow=true
spring.secondDatasource.validation-query=select 1

Spring Configuration (I) Datasources

/**
 * Config class holding two datasources: one for business data, the other for the Spring Batch jobs.
 */
@Configuration
public class DataSourceConfiguration
{
    @Bean
    @Qualifier("businessDataSource")
    @ConfigurationProperties(prefix = "spring.datasource")
    public DataSource primaryDataSource()
    {
        return DataSourceBuilder.create().build();
    }

    @Bean
    @Primary
    @Qualifier("jobsDataSource")
    @ConfigurationProperties(prefix = "spring.secondDatasource")
    public DataSource secondaryDataSource()
    {
        return DataSourceBuilder.create().build();
    }
}

Spring Configuration (II) - Spring batch

import org.springframework.batch.core.configuration.annotation.DefaultBatchConfigurer;

// Another @Configuration class...

@Autowired
@Qualifier("jobsDataSource")
private DataSource dataSource;

@Bean
public BatchConfigurer configurer()
{
    // Required so that Spring Batch stores its job metadata in the dedicated jobs datasource.
    return new DefaultBatchConfigurer(dataSource);
}

With this setup, Spring Batch uses the in-memory datasource for its metadata, while you are free to use the default datasource for your own purposes.

Spring boot + spring batch without DataSource

I got around this problem by extending the DefaultBatchConfigurer class so that it ignores any DataSource. As a consequence, it configures a map-based JobRepository.

Example:

@Configuration
@EnableBatchProcessing
public class BatchConfig extends DefaultBatchConfigurer {

    @Override
    public void setDataSource(DataSource dataSource) {
        // This BatchConfigurer ignores any DataSource
    }
}

Spring Batch - How to prevent batch from storing transactions in DB

The following seems to have done the job for me:

@Bean
public DataSource dataSource() {
    EmbeddedDatabaseBuilder builder = new EmbeddedDatabaseBuilder();
    EmbeddedDatabase db = builder
            .setType(EmbeddedDatabaseType.HSQL)
            .build();
    return db;
}

Now Spring is not creating tables in our production database, and when the JVM exits, the state is lost, so nothing seems to be left hanging around.

UPDATE: The above code caused concurrency errors for us. We addressed this by abandoning the EmbeddedDatabaseBuilder and declaring the HSQLDB data source this way instead:

@Bean
public BasicDataSource dataSource() {
    BasicDataSource dataSource = new BasicDataSource();
    dataSource.setDriverClassName("org.hsqldb.jdbcDriver");
    dataSource.setUrl("jdbc:hsqldb:mem:testdb;sql.enforce_strict_size=true;hsqldb.tx=mvcc");
    dataSource.setUsername("sa");
    dataSource.setPassword("");
    return dataSource;
}

The primary difference is that we can specify MVCC (multiversion concurrency control) in the connection string, which resolves the issue.
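
To make the structure of that connection string easier to see, here is a small plain-Java sketch that assembles it from its parts: the database name followed by key=value settings separated by semicolons. The HsqlMemoryUrl class and its methods are hypothetical helpers written for this illustration; they are not part of HSQLDB or Spring.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.stream.Collectors;

// Hypothetical helper: builds an in-memory HSQLDB JDBC URL from named settings.
final class HsqlMemoryUrl {

    private HsqlMemoryUrl() {
    }

    // Appends each setting as key=value, separated by ';', after the database name.
    static String build(String dbName, Map<String, String> settings) {
        String base = "jdbc:hsqldb:mem:" + dbName;
        if (settings.isEmpty()) {
            return base;
        }
        String suffix = settings.entrySet().stream()
                .map(e -> e.getKey() + "=" + e.getValue())
                .collect(Collectors.joining(";"));
        return base + ";" + suffix;
    }

    // The two settings used in the answer, kept in the same order.
    static String mvccUrl(String dbName) {
        Map<String, String> settings = new LinkedHashMap<>();
        settings.put("sql.enforce_strict_size", "true");
        settings.put("hsqldb.tx", "mvcc"); // selects the MVCC transaction model
        return build(dbName, settings);
    }
}
```

With this helper, HsqlMemoryUrl.mvccUrl("testdb") yields the same string that is passed to dataSource.setUrl(...) above.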


