Pentaho


 Carte cluster not load balancing transformation between the slaves while using the API

Juan Sierra Pons posted 03-29-2022 03:46
Hi,

I have configured a Carte cluster with one master and two slaves. Basically, I launched the sample configurations located in the pwd folder:
./carte.sh pwd/carte-config-master-8080.xml
./carte.sh pwd/carte-config-8081.xml
./carte.sh pwd/carte-config-8082.xml

As far as I know, there is no command-line tool (like pan or kitchen) to launch transformations/jobs directly on the Carte cluster. I have read something about wrapping it inside a .kjb https://forums.pentaho.com/threads/74921-Run-Transformation-or-Job-on-Carte-Server-from-pan-Kitchen/ but I would like to use the API instead of the wrapper.

Slaves are registered successfully as per:

curl -s -L "http://cluster:cluster@localhost:8080/kettle/getSlaves/"
<?xml version="1.0" encoding="UTF-8"?>
<SlaveServerDetections>
  <SlaveServerDetection>
    <slaveserver>
      <name>Dynamic slave [localhost:8081]</name>
      <hostname>localhost</hostname>
      <port>8081</port>
      <webAppName/>
      <username>cluster</username>
      <password>Encrypted 2be98afc86aa7f2e4cb1aa265cd86aac8</password>
      <proxy_hostname/>
      <proxy_port/>
      <non_proxy_hosts/>
      <master>N</master>
      <sslMode>N</sslMode>
    </slaveserver>
    <active>Y</active>
    <last_active_date>2022/03/24 12:00:17.411</last_active_date>
    <last_inactive_date/>
  </SlaveServerDetection>

  <SlaveServerDetection>
    <slaveserver>
      <name>Dynamic slave [localhost:8082]</name>
      <hostname>localhost</hostname>
      <port>8082</port>
      <webAppName/>
      <username>cluster</username>
      <password>Encrypted 2be98afc86aa7f2e4cb1aa265cd86aac8</password>
      <proxy_hostname/>
      <proxy_port/>
      <non_proxy_hosts/>
      <master>N</master>
      <sslMode>N</sslMode>
    </slaveserver>
    <active>Y</active>
    <last_active_date>2022/03/24 12:00:17.416</last_active_date>
    <last_inactive_date/>
  </SlaveServerDetection>

</SlaveServerDetections>
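
For anyone who wants to check the registration from a script rather than by eye, here is a minimal Python sketch that pulls the active slaves out of a getSlaves response. It relies only on the element names visible in the output above (SlaveServerDetection, slaveserver, name, hostname, port, active). The function name active_slaves and the trimmed SAMPLE string are illustrative, not part of Carte.

```python
import xml.etree.ElementTree as ET

# Trimmed copy of the getSlaves output shown above (illustrative sample only).
SAMPLE = """<?xml version="1.0" encoding="UTF-8"?>
<SlaveServerDetections>
  <SlaveServerDetection>
    <slaveserver>
      <name>Dynamic slave [localhost:8081]</name>
      <hostname>localhost</hostname>
      <port>8081</port>
      <master>N</master>
    </slaveserver>
    <active>Y</active>
  </SlaveServerDetection>
  <SlaveServerDetection>
    <slaveserver>
      <name>Dynamic slave [localhost:8082]</name>
      <hostname>localhost</hostname>
      <port>8082</port>
      <master>N</master>
    </slaveserver>
    <active>Y</active>
  </SlaveServerDetection>
</SlaveServerDetections>
"""


def active_slaves(xml_text):
    """Return (name, hostname, port) for every detection marked <active>Y</active>."""
    # Parse from bytes so the encoding declaration in the prolog is honoured.
    root = ET.fromstring(xml_text.encode("utf-8"))
    slaves = []
    for detection in root.iter("SlaveServerDetection"):
        if (detection.findtext("active") or "").strip() != "Y":
            continue  # skip slaves the master considers inactive
        server = detection.find("slaveserver")
        if server is None:
            continue  # malformed entry, nothing to report
        slaves.append(
            (
                server.findtext("name"),
                server.findtext("hostname"),
                int(server.findtext("port")),
            )
        )
    return slaves


for name, host, port in active_slaves(SAMPLE):
    print(f"{name} -> {host}:{port}")
```

This only confirms that the master knows about the slaves. It says nothing about where work is actually executed.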

When I launch a bunch of dummy transformations in parallel against the master (200 in total, 20 at a time), they run only on the master. None of them has run on either slave.

seq 200 |parallel -j20 -n0 'curl -s -L "http://cluster:cluster@localhost:8080/kettle/executeTrans/?rep=myRepository&trans=Juan%2FDummy1&level=Debug"'
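
One way to verify where the transformations actually ran is to ask each Carte server for its status and count the entries. The Python sketch below is a rough illustration under some assumptions: it uses the Carte status endpoint /kettle/status/?xml=Y, and it assumes the response lists each transformation as a transstatus element with a transname child. Please check those element names against what your 9.2 servers return. The SAMPLE_STATUS string is hand-written for the example, not real output, and the helper names are hypothetical.

```python
import urllib.request
import xml.etree.ElementTree as ET
from collections import Counter

# Hand-written sample, NOT real Carte output; element names are an assumption.
SAMPLE_STATUS = """<serverstatus>
  <statusdesc>Online</statusdesc>
  <transstatuslist>
    <transstatus><transname>Dummy1</transname><status_desc>Finished</status_desc></transstatus>
    <transstatus><transname>Dummy1</transname><status_desc>Finished</status_desc></transstatus>
  </transstatuslist>
</serverstatus>
"""


def count_transformations(status_xml):
    """Count the transformations listed in one server's status XML, by name."""
    root = ET.fromstring(status_xml.encode("utf-8"))
    return Counter(t.findtext("transname") for t in root.iter("transstatus"))


def fetch_status(host, port, user, password):
    """Fetch the status XML from one Carte server using HTTP basic auth."""
    url = f"http://{host}:{port}/kettle/status/?xml=Y"
    manager = urllib.request.HTTPPasswordMgrWithDefaultRealm()
    manager.add_password(None, url, user, password)
    opener = urllib.request.build_opener(urllib.request.HTTPBasicAuthHandler(manager))
    with opener.open(url, timeout=10) as response:
        return response.read().decode("utf-8")


def report(ports, host="localhost", user="cluster", password="cluster"):
    """Print how many transformations each server (master and slaves) lists."""
    for port in ports:
        counts = count_transformations(fetch_status(host, port, user, password))
        print(port, dict(counts))


# Offline check against the hand-written sample:
print(dict(count_transformations(SAMPLE_STATUS)))
```

With the three servers from this post running, calling report([8080, 8081, 8082]) after the batch would show whether anything other than port 8080 lists Dummy1.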

Am I missing something?
Andrew Cave
You'll need to set up a wrapper job with a Job or Transformation Executor entry that points to the job you want to run. In the options for the entry, set the run configuration to point to your Carte server.
Juan Sierra Pons

Thanks @Andrew Cave, I have done as suggested.

With the cluster declared this way:

But it is still running only on the master.

What am I missing?

Andrew Cave
Hi Juan

You have only defined a run configuration for the master server, so it's doing what you're telling it to do. Add a run configuration for the slave servers and update the execution step.
Juan Sierra Pons
Hi @Andrew Cave,

I don't fully understand this. As I see it, the good thing about dynamic clusters is that they are transparent to spoon, pan and kitchen. I mean, it shouldn't be necessary to know the cluster's members in advance, so no configuration in advance should be needed.
With your approach I have to configure the slaves in advance.

By the way I have added the two slaves

And updated the execution step

With no luck.

All is still running in the master :(

Thanks for your time
Roberto Velasco Martin
I have a similar problem, but in my case the transformation runs on both slaves in parallel. This is also not the expected behavior: it should run on only one of the two slaves.

My config to execute is


Andrew Cave
Hi Juan et al

Have you set up the files for the dynamic clustering as per this page  https://help.hitachivantara.com/Documentation/Pentaho/8.2/Products/Data_Integration/Carte_Clusters/Setup ?
Juan Sierra Pons
Yes @Andrew Cave, I was using the configuration from that link, but I am using 9.2.

Thanks for your time

Best regards
Andrew Cave
It's really unclear documentation, isn't it?

From this old post by Diethard Steiner http://diethardsteiner.blogspot.com/2013/03/creating-clustered-transformation-in.html it looks like a cluster schema needs to be set up as well, which you can only do from a transformation (weird?).

Another reference is here https://pentaho-community.atlassian.net/wiki/spaces/EAI/pages/374571420/Dynamic+clusters . It is also quite old, but it does look like the cluster schema is very important.

Since you haven't mentioned that step, and the 8.2 documentation says it is for executing in parallel, you might have missed it? But it also says:
Dynamic cluster

If checked, a master Carte server will perform failover operations, and you must define the master as a slave server in the field below. If unchecked, the PDI client will act as the master server, and you must define the available Carte slaves in the field below.


Man, what a cluster-****   : D