Question
One of the requirements in my project is to place the Spring Batch schema on an Amazon Redshift DB.
I plan to start from schema-postgresql.sql as the baseline, since Redshift is based on PostgreSQL.
Looking at the Spring Batch source code, it looks like a few things are needed to make this work:
- Extending JobRepositoryFactoryBean and DefaultDataFieldMaxValueIncrementerFactory.
- Adding my own RedshiftMaxValueIncrementer that extends AbstractSequenceMaxValueIncrementer.
Looking at the Redshift data types, it does not look like I will have any issues converting the schema script, aside from the sequences used to create the job, job execution, and step execution ids.
What do you suggest as the best workaround for the missing sequences?
- Specifying those columns as IDENTITY columns. This looks like the easiest way from the Redshift point of view, but it can be problematic: DataFieldMaxValueIncrementer.nextLongValue() returns a primitive long, not a Long, so we cannot return null and let IDENTITY do the job for us.
- An implementation based on something like SELECT MAX(STEP_EXECUTION_ID) FROM BATCH_STEP_EXECUTION, doing something similar to MySQLMaxValueIncrementer, which extends AbstractColumnMaxValueIncrementer.
- Creating the sequences in Java code only, using tools similar to the ones Hibernate uses.
- An approach not mentioned above
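For reference, the IDENTITY option maps directly onto Redshift DDL. A sketch of what one of the tables might look like, with the column list abbreviated and the table/column names taken from Spring Batch's schema-postgresql.sql (verify against the version of the schema you are using):

```sql
-- Sketch only: replacing the Postgres sequence with a Redshift IDENTITY column.
-- IDENTITY(seed, step) is Redshift's auto-generated column syntax.
CREATE TABLE BATCH_JOB_INSTANCE (
    JOB_INSTANCE_ID BIGINT IDENTITY(0,1) NOT NULL PRIMARY KEY,
    VERSION BIGINT,
    JOB_NAME VARCHAR(100) NOT NULL,
    JOB_KEY VARCHAR(32) NOT NULL
);
```

As noted above, the catch is on the Java side: Spring Batch expects the incrementer to hand it the id before the insert, so letting IDENTITY generate it does not fit the DataFieldMaxValueIncrementer contract cleanly.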
Answer 1:
Here's how I got at least that part to (apparently) work:
In my subclass of DefaultBatchConfigurer, I added this code:
@Override
protected JobRepository createJobRepository() throws Exception
{
    JobRepositoryFactoryBean factory = new JobRepositoryFactoryBean();
    factory.setDataSource(dataSource);
    factory.setTransactionManager(getTransactionManager());
    factory.setIncrementerFactory(new RedshiftIncrementerFactory(dataSource));
    factory.afterPropertiesSet();
    return factory.getObject();
}
The factory class looks like this:
import static org.springframework.batch.support.DatabaseType.POSTGRES;

import javax.sql.DataSource;

import org.springframework.batch.item.database.support.DataFieldMaxValueIncrementerFactory;
import org.springframework.jdbc.support.incrementer.DataFieldMaxValueIncrementer;

public class RedshiftIncrementerFactory implements DataFieldMaxValueIncrementerFactory
{
    private DataSource dataSource;

    public RedshiftIncrementerFactory(DataSource ds)
    {
        this.dataSource = ds;
    }

    @Override
    public DataFieldMaxValueIncrementer getIncrementer(String databaseType, String incrementerName)
    {
        return new RedshiftIncrementer(dataSource, incrementerName);
    }

    @Override
    public boolean isSupportedIncrementerType(String databaseType)
    {
        return POSTGRES.toString().equals(databaseType);
    }

    @Override
    public String[] getSupportedIncrementerTypes()
    {
        return new String[]{POSTGRES.toString()};
    }
}
And then, finally, the incrementer itself:
import java.sql.Connection;
import java.sql.ResultSet;
import java.sql.SQLException;
import java.sql.Statement;

import javax.sql.DataSource;

import org.springframework.dao.DataAccessException;
import org.springframework.dao.DataAccessResourceFailureException;
import org.springframework.jdbc.datasource.DataSourceUtils;
import org.springframework.jdbc.support.JdbcUtils;
import org.springframework.jdbc.support.incrementer.AbstractSequenceMaxValueIncrementer;

public class RedshiftIncrementer extends AbstractSequenceMaxValueIncrementer
{
    public RedshiftIncrementer(DataSource dataSource, String incrementerName)
    {
        super(dataSource, incrementerName);
    }

    // I need to run two queries here, since Redshift doesn't support sequences
    @Override
    protected long getNextKey() throws DataAccessException
    {
        Connection con = DataSourceUtils.getConnection(getDataSource());
        Statement stmt = null;
        ResultSet rs = null;
        try {
            stmt = con.createStatement();
            DataSourceUtils.applyTransactionTimeout(stmt, getDataSource());
            String table = getIncrementerName();
            // Bump the counter, then read the new value back
            stmt.executeUpdate("UPDATE " + table + " SET ID = ID + 1");
            rs = stmt.executeQuery("SELECT ID FROM " + table + " WHERE UNIQUE_KEY='0'");
            if (rs.next()) {
                return rs.getLong(1);
            }
            else {
                throw new DataAccessResourceFailureException("Sequence query did not return a result");
            }
        }
        catch (SQLException ex) {
            throw new DataAccessResourceFailureException("Could not obtain sequence value", ex);
        }
        finally {
            JdbcUtils.closeResultSet(rs);
            JdbcUtils.closeStatement(stmt);
            DataSourceUtils.releaseConnection(con, getDataSource());
        }
    }

    @Override
    protected String getSequenceQuery()
    {
        // Not used; getNextKey() is overridden above
        return null;
    }
}
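This incrementer presupposes a one-row counter table per "sequence" (one for each incrementer name), in the spirit of Spring Batch's MySQL schema, which uses the same update-then-select pattern. A hedged sketch of the backing table, with names assumed to mirror schema-mysql.sql; verify against your own setup:

```sql
-- Assumed backing table for the job-instance "sequence"; the incrementer above
-- relies on the ID and UNIQUE_KEY columns and on the table containing exactly one row.
CREATE TABLE BATCH_JOB_SEQ (
    ID BIGINT NOT NULL,
    UNIQUE_KEY CHAR(1) NOT NULL
);
INSERT INTO BATCH_JOB_SEQ (ID, UNIQUE_KEY) VALUES (0, '0');
```

Equivalent tables would be needed for the other incrementer names the job repository asks for (e.g. the job-execution and step-execution sequences).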
This at least allows the job to start. However, there are other problems with Redshift that I will detail elsewhere.
Source: https://stackoverflow.com/questions/20789731/using-redshfit-as-spring-batch-job-repository-and-alternatives-to-sequence-in-re