CUBA Platform BI Integration with Pentaho


Hi and welcome to our webinar dedicated to Business Intelligence integration add-on which is designed to integrate CUBA applications with Pentaho Platform.
In 20 minutes you will learn how to:

  • Set up Pentaho server and configure CUBA BI add-on
  • Create a star schema for the report data
  • Create a data source and a Mondrian cube for data analysis
  • Use generic screens for registering and running Pentaho reports in CUBA apps
  • And, finally, use the visual component for embedding Pentaho reports to any application screen.

Pentaho setup

I have downloaded Pentaho Community Edition from their official website. Unpack the archive into an empty folder.

Adding Saiku plugin

We will also need Saiku Analytics plugin from Meteorite BI, it is available on Pentaho marketplace. Extract the plugin and move the saiku folder to
/pentaho-server/pentaho-solutions/system.

Next, remove the cpf-core-6.0.0.0-353.jar and cpf-pentaho5-6.0.0.0-353.jar libraries from /pentaho-server/pentaho-solutions/system/saiku/lib folder and replace them with the newer versions from other libraries used in your Pentaho version.

Adding Saiku license

In order to use Saiku, we’ll need a license. If you have never used Saiku before, access the http://licensing.meteorite.bi and sign up for a new account. Once you have validated your account, create a new company and generate the community license for it. Save and download the license. Rename the file to license.lic and then copy it to the
/pentaho-server/pentaho-solutions/system/saiku.

Pentaho ports configuration

We’ll change Pentaho default port to 18081 in the server.xml configuration file to avoid collision with CUBA default port. The same for the shutdown port: change it to 8015, as 8005 will be used by CUBA.

Add CUBA plugin to Pentaho

Download cuba-bi-pentaho-1.2.4.jar from CUBA website and copy it to /pentaho-server/tomcat/webapps/pentaho/WEB-INF/lib.

Then, download cuba-bi-pentaho-1.2.4-plugin.zip from CUBA website, unpack it and copy the saiku-cuba-bi folder to /pentaho-server/pentaho-solutions/system.

SSO in CUBA and Pentaho

We will use Single Sign-On authentication between the CUBA App and Pentaho. To make it work, users should have the same login names in the CUBA application and Pentaho. When a user opens a report, the application generates a ticket for the user and pins username and the ticket to the Pentaho request. The ticket is stored in the user session. CUBA plugin on the Pentaho server detects the ticket coming with the request and calls back the application to check if the ticket is valid. If the ticket is valid, the plugin authorizes the user on the Pentaho server with roles that configured for the user in Pentaho.

You should create a new cuba-pentaho-community-authentication.xml file in /pentaho-server/pentaho-solutions/system folder. The content for this configuration file you can copy from our BI Add-on documentation.

Next, edit the pentaho-spring-beans.xml file in the same folder and add one line of import after the applicationContext-spring-security import.

Pentaho data integration

Download and unpack Pentaho Data Integration from Pentaho website. We will use this tool to create a star schema for our BI report.

CUBA app

Now it’s time to add the main link in this chain - a CUBA application. I will use the Workshop sample app as the example. Workshop is available in Studio, it has the standard generic UI and a simple data model. I suggest using Mechanic and Order entities in our report to make a kind of mechanics load balancing report.

HSQL port

Firstly, let’s change the default HSQL port in CUBA app, again, to avoid collisions with PDI port. Changing it in CUBA Studio is much easier than editing dozens of configs in Pentaho.

Addon dependency

Secondly, let’s add BI add-on to our application. As the project uses the latest version of the platform, I add the latest version of the add-on. You can get the compatible version number in the add-on documentation. As you can see, the BI addon has been added as a dependency to the build.gradle file.

Pentaho tables

For working with multi-dimensional reports it is usually recommended to create additional tables, as querying the main tables of the database on the fly is too expensive. I’ve added 3 scripts to create 3 new empty tables: 2 for the report dimensions and 1 for report measures, or facts. Above I’ve generated insert scripts with random data. You can add some records in the existing tables manually, just to see something in the final report.
Run Create Db Gradle task to apply the scripts and start the application.

Workshop application

A small overview of the application: here are the mechanics and their hourly rates, and here’s the Orders browser, each one contains a link to a mechanic, the number of hours spent and the order amount which is calculated based on the mechanic’s hourly rate and the sum of spare parts used in the order. The Order entity does not have a date attribute explicitly, but all CUBA entities have create_ts system attribute with the timestamp of the order creation date, so, we have a source for a temporal dimension in our future OLAP cube.

PDI

Start Pentaho Data Integration by running Spoon.bat. Here we should firstly establish a connection to our workshop database. Fill in the database credentials, test the connection and save it.
PDI, or Spoon, whatsoever, is a powerful tool for working with data. We will not use all that power, but still, we’ll have to extract the data from workshop tables and put it in additional pentaho tables.
To begin with, drag the Table input node onto the worksheet. Get the base SELECT statement.
This transformation will provide data for the cube dimensions. Let’s now use the timestamp stored in create_ts to prepare data for the dedicated columns in the pentaho tables: year, month and day.
Next, use the Insert/Update node to insert the selected data in the Order dimension table. Map the columns from the SELECT extraction with the permanent table columns.
Finally, use the Update node to finalize the Order transformation.

The second dimension will be quite simple and contain only mechanics names. Usually, we use BI reports to demonstrate several levels in one dimension: for example, customers grouped in some categories by territory or their legal status, and we can drill in the dimensions of the report. In this sense, the Mechanic entity is flat, and changing the application data model is out of the scope of this tutorial. So, we’ll use nothing but mechanics names, all in plane.
But there’s a bit more about Mechanics in the workshop sample: they themselves do not contain their names, instead, they contain a link to a standard CUBA User entity, and this User’s name is used as the Mechanic’s instance name. So, we join the User table in the SELECT statement.
Again, set up the columns mapping in the Insert/Update node and finalize the Mechanics transformation using the Update node.
Save the transformation, you will need it soon.

Let’s move on to the facts transformation. New transformation requires new database connection - in case you want to use data from different databases.
We use the Order entity again, but for facts let’s take its measurable and aggregatable attributes: mechanic’s hourly rate, number of work hours and the total order amount.
Then, update the pentaho fact table and finalize the transformation, as you already know how to do. Save the facts transformation.

Wrap each transformation in the corresponding job. Each job should have a Start node, a Transformation node, and 2 possible outputs: Success and Abort job in case of an error.

Finally, create one main job that will put all these pieces together to create the star schema. Start the job, check DB connections, transform dimensions and facts one by one and add 2 possible outputs. Save the job file.
Try to run this job. If you’ve done everything right, you’ll see something like this.

Pentaho console

Start Pentaho server by running start-pentaho.bat.

Open Pentaho console in browser. Pentaho server is running on localhost on the port you’ve set in server.xml file.
Create new database connection to the workshop database. Next, create a new data source. Select the workshop tables as the source type. Select the additional pentaho tables for the report, define which one should be used for the report facts.
Define how the tables are joined to each other.
Customize the analysis: first of all, the ID columns say nothing to a report user, so remove them. I would also edit the level names to lower case, but it’s only a matter of taste.
Order dimension is a Time dimension, so check the corresponding checkbox. It allows us to select the time display format for each time field: year, month, and date.
The cube data source is done.

Create new Saiku query. Select the workshop cube and say hello to the new report. The functionality of visual analysis designer is out of CUBA scope, so I’ll keep the report as primitive as it is now out-of-the-cube. Save the report in the admin root folder.

Pentaho reports in CUBA apps

By default, the BI addon expects Pentaho server to run at port 8081. We’ve changed the port in Pentaho config files, so we should update the value of the pentahoServerUrl property. It can be done at runtime from the administration menu.

The standard generic screens for BI reports were automatically added to the application menu. Let’s create a report (or, in fact, a link to this report): set the path to a report in Pentaho storage. You can also define users roles and manage the user access to this report, for example, make it available to certain users only. If the list is empty, the report is available to all. But that is the subject of CUBA Security Subsystem tutorials.

Next, open the Run BI reports screen to see the list of created reports. We have only one. Click Run and see the same worksheet as in Pentaho console right in the CUBA application!

Of course, standard screens are not the only option. You can embed BI report to any screens of your app. I’ll create a new screen to demonstrate this.
I add the special visual component to the screen layout and save the screen. XML parse error occurred as I forgot to define the namespace for BI components. Let’s add it.

Pay attention that to add a new screen we haven’t even stopped the application server. The new screen has been hot deployed. To see it added to the Application menu, we only should log out from the system and login back. Open the new screen, and that’s it.

Thank you for your time and attention. Find another hands-on example in the documentation of CUBA BI add-on and don’t hesitate to contact us on CUBA forum.