Clustering Pentaho C.E. 5.x with Jackrabbit repository

Question

I'm trying to run several Pentaho BI CE server instances on top of a clustered PostgreSQL database.

Pentaho's clustering guide (Cluster the Application Server) says that I should keep the per-node Jackrabbit configuration the same:

Your application nodes all need the same configurations and BA deployments installed already in order for clustering to work.

and that I then only need to configure Jackrabbit's journal with a unique node ID:

<Cluster id="Unique_ID">
    <Journal class="org.apache.jackrabbit.core.journal.DatabaseJournal">
      <param name="revision" value="${rep.home}/revision.log"/>
      <param name="url" value="jdbc:postgresql://HOSTNAME:PORT/jackrabbit"/>
      <param name="driver" value="org.postgresql.Driver"/>
      <param name="user" value="jcr_user"/>
      <param name="password" value="password"/>
      <param name="databaseType" value="postgresql"/>
      <param name="janitorEnabled" value="true"/>
      <param name="janitorSleep" value="86400"/>
      <param name="janitorFirstRunHourOfDay" value="3"/>
    </Journal>
</Cluster>

Jackrabbit's clustering guide, however, lists more requirements (emphasis mine):

In order to use clustering, the following prerequisites must be met:

Each cluster node must have its own repository configuration.

A DataStore must always be shared between nodes, if used.

The global FileSystem on the repository level must be shared (only the one that is on the same level as the data store; only in the repository.xml file).

Each cluster node needs its own (private) workspace level and version FileSystem (only those within the workspace and versioning configuration; the ones in the repository.xml and workspace.xml file).

Each cluster node needs its own (private) Search indexes.

Every cluster node must be assigned a unique ID.

A journal type must be chosen, either based on files or stored in a database.

Each cluster node must use the same (shared) journal.

The persistence managers must store their data in the same, globally accessible location

Does this mean that the FileSystem inside 'Versioning' and 'Workspace' should have a different prefix per node, or point to another (possibly non-shared) location? This contradicts the Pentaho documentation (Use PostgreSQL as Your Repository Database), where everything points to a single database.

Answer 1:

Does this mean that the FileSystem inside 'Versioning' and 'Workspace' should have a different prefix per node, or point to another (possibly non-shared) location?

If you use a shared database, then yes.

This contradicts the Pentaho documentation (Use PostgreSQL as Your Repository Database), where everything points to a single database.

Not necessarily; it depends on the definition of "same". If each node uses a local file system or a local database, then the configuration is identical on every node, which satisfies "Your application nodes all need the same configurations and BA deployments installed already in order for clustering to work."
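As a rough illustration of the two options, the workspace-level FileSystem for one node might look something like the sketch below. This is an assumption-laden example, not a tested Pentaho configuration: the `fs_ws_node1_` prefix is a hypothetical per-node value, HOSTNAME and PORT are placeholders as in the question, and the same idea would apply to the FileSystem inside 'Versioning'.

```xml
<!-- Sketch for node 1 only; HOSTNAME, PORT, and the "node1" prefix are placeholders. -->
<Workspace name="${wsp.name}">
  <!-- Option A: keep the shared database, but give this node its own table prefix. -->
  <FileSystem class="org.apache.jackrabbit.core.fs.db.DbFileSystem">
    <param name="driver" value="org.postgresql.Driver"/>
    <param name="url" value="jdbc:postgresql://HOSTNAME:PORT/jackrabbit"/>
    <param name="user" value="jcr_user"/>
    <param name="password" value="password"/>
    <param name="schema" value="postgresql"/>
    <param name="schemaObjectPrefix" value="fs_ws_node1_"/>
  </FileSystem>
  <!-- Option B (instead of A): a node-local directory.
  <FileSystem class="org.apache.jackrabbit.core.fs.local.LocalFileSystem">
    <param name="path" value="${wsp.home}"/>
  </FileSystem>
  -->
  <!-- PersistenceManager and SearchIndex elements omitted for brevity. -->
</Workspace>
```

With Option A the file differs per node only in the prefix value. With Option B the file can be byte-for-byte identical on every node, because `${wsp.home}` resolves to a local path on each machine.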

Source: https://stackoverflow.com/questions/32076020/clustering-pentaho-c-e-5-x-with-jackrabbit-repository

Tags

pentaho

jackrabbit