Clustering Pentaho Business Analytics Server 5.0.x version

Clustering Pentaho means that two or more instances of Pentaho Business Analytics share a common repository. Pentaho 5.0.x uses Apache Jackrabbit, an implementation of the Java Content Repository (JCR) standard, for the BA Repository. Pentaho stores content about reports that you create, the examples it provides, report scheduling data, and audit data in the BA Repository. The BA Repository resides on the database that you installed and consists of three repositories: Jackrabbit, Quartz, and Hibernate.

– Jackrabbit contains the solution repository, examples, security data, and content data from reports that you use Pentaho software to create.

– Quartz holds data that is related to scheduling reports and jobs.

– Hibernate holds data that is related to audit logging.

You can host the Pentaho Business Analytics repository on a PostgreSQL, MySQL, or Oracle database (by default, Pentaho software is configured to use PostgreSQL). Because every node in the cluster must share the same repository, initialize and configure your solution repository by following these instructions:

Initializing: http://infocenter.pentaho.com/help/topic/install_pdi/task_prepare_rdbms_repository.html

Configuring: http://infocenter.pentaho.com/help/topic/install_pdi/task_configure_rdbms_repository.html

To give each node a shared journal, you need to add a section of code to the repository.xml file found in the biserver-ee\pentaho-solutions\system\jackrabbit directory. Note that each node must have a unique ID. This is explained in detail below.

Configuring Each Node to Have a Shared Journal:

Before configuring the shared journal, delete the files in the following directories:

– Delete the contents of the tomcat\work and tomcat\temp directories.

– Navigate to the biserver-ee\pentaho-solutions\system\jackrabbit\repository directory and remove all files and folders from the repository folder inside it.

– Navigate to the biserver-ee\pentaho-solutions\system\jackrabbit\repository directory and remove all files and folders from the workspaces folder.
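The cleanup steps above could be scripted roughly as follows. This is only a sketch: the function names `empty_directory` and `clean_node` are hypothetical, and the layout it assumes (a Tomcat directory containing work and temp, and a Jackrabbit directory containing repository\repository and repository\workspaces) should be checked against your own installation before running anything that deletes files.

```python
import shutil
from pathlib import Path
from typing import Dict


def empty_directory(directory: Path) -> int:
    """Delete everything inside `directory` but keep the directory itself.

    Returns the number of top-level entries that were removed.
    """
    removed = 0
    if not directory.is_dir():
        return removed  # nothing to clean on this node
    for entry in directory.iterdir():
        if entry.is_dir() and not entry.is_symlink():
            shutil.rmtree(entry)   # remove a whole sub-tree
        else:
            entry.unlink()         # remove a file or a symlink
        removed += 1
    return removed


def clean_node(tomcat_dir: Path, jackrabbit_dir: Path) -> Dict[Path, int]:
    """Empty the four directories named in the steps above.

    The sub-paths below are an assumed layout, not taken from Pentaho code.
    """
    targets = [
        tomcat_dir / "work",
        tomcat_dir / "temp",
        jackrabbit_dir / "repository" / "repository",
        jackrabbit_dir / "repository" / "workspaces",
    ]
    return {target: empty_directory(target) for target in targets}
```

You would call `clean_node` once per node, passing that node's Tomcat directory and its pentaho-solutions\system\jackrabbit directory; the returned mapping shows how many entries were removed from each location.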

Now, to configure the nodes for a shared journal, edit the repository.xml file found in the biserver-ee\pentaho-solutions\system\jackrabbit directory. Add the following section at the end of the file, inside the root Repository element.

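<!--
Run with a cluster journal
-->
<Cluster id="node1">
  <Journal class="org.apache.jackrabbit.core.journal.DatabaseJournal">
    <param name="revision" value="${rep.home}/revision.log"/>
    <param name="url" value="jdbc:postgresql://HOSTNAME:PORT/jackrabbit"/>
    <param name="driver" value="org.postgresql.Driver"/>
    <param name="user" value="USERNAME"/>
    <param name="password" value="PASSWORD"/>
    <param name="databaseType" value="postgresql"/>
  </Journal>
</Cluster>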

Replace the JDBC connection values (URL, user name, password, database type, and so on) to match your specific database, and make sure the Cluster element on each node carries its own unique ID. Jackrabbit journaling is now configured. Quartz also needs to be configured so that duplicate schedules are not created on each node.
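Because every node needs its own ID, it can help to check the repository.xml files of all nodes before starting the cluster. The sketch below assumes that the Cluster element is a direct child of the root element and carries an `id` attribute; the helper names `cluster_id` and `duplicate_ids` are hypothetical.

```python
import xml.etree.ElementTree as ET
from typing import Dict, List


def cluster_id(repository_xml: str) -> str:
    """Return the id attribute of the <Cluster> element in a repository.xml text."""
    root = ET.fromstring(repository_xml)
    cluster = root.find("Cluster")  # direct child of the root element (assumed)
    if cluster is None:
        raise ValueError("no <Cluster> element found")
    node_id = cluster.get("id")
    if not node_id:
        raise ValueError("<Cluster> element has no id attribute")
    return node_id


def duplicate_ids(configs: Dict[str, str]) -> Dict[str, List[str]]:
    """Map every cluster id used by more than one node to the nodes using it.

    `configs` maps a node name to the text of that node's repository.xml.
    An empty result means all ids are unique.
    """
    seen: Dict[str, List[str]] = {}
    for node_name, xml_text in configs.items():
        seen.setdefault(cluster_id(xml_text), []).append(node_name)
    return {cid: nodes for cid, nodes in seen.items() if len(nodes) > 1}
```

Commented-out Cluster blocks are ignored because the XML parser skips comments, so only the active configuration is checked.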

Configuring Quartz for a Cluster:

Navigate to biserver-ee\pentaho-solutions\system\quartz and open the quartz.properties file in a text editor. Make the following changes to configure Quartz for a cluster:

1. org.quartz.scheduler.instanceId = AUTO

Set this to AUTO so that each instance you add to the cluster gets its own scheduler ID. The default value is 1.

2. org.quartz.jobStore.isClustered = true

The default value is false.

3. org.quartz.jobStore.clusterCheckinInterval = 20000

This property is not present by default, so you need to add it to the quartz.properties file explicitly. The value is in milliseconds (20000 is 20 seconds).
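If you maintain several nodes, the three changes above can be applied the same way on each one with a small script. The following is a minimal sketch that works on the text of quartz.properties; the function name `apply_cluster_settings` is hypothetical, and the sketch assumes simple `key = value` lines without line continuations.

```python
CLUSTER_SETTINGS = {
    "org.quartz.scheduler.instanceId": "AUTO",
    "org.quartz.jobStore.isClustered": "true",
    "org.quartz.jobStore.clusterCheckinInterval": "20000",
}


def apply_cluster_settings(properties_text: str) -> str:
    """Return the properties text with the three cluster settings applied.

    Existing active entries are overwritten in place, missing ones are
    appended at the end, and all other lines (including comments) are kept.
    """
    seen = set()
    output = []
    for line in properties_text.splitlines():
        stripped = line.strip()
        is_comment = stripped.startswith(("#", "!"))
        key = stripped.split("=", 1)[0].strip() if "=" in stripped else ""
        if not is_comment and key in CLUSTER_SETTINGS:
            output.append(f"{key} = {CLUSTER_SETTINGS[key]}")
            seen.add(key)
        else:
            output.append(line)
    for key, value in CLUSTER_SETTINGS.items():
        if key not in seen:
            output.append(f"{key} = {value}")  # property was absent; add it
    return "\n".join(output) + "\n"
```

Running the function on its own output changes nothing, so it should be safe to apply more than once to the same file.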

Admin | November 14th, 2014 | Blogs 2014, Pentaho

About the Author: Admin

Technical writer
