I am able to run transformation job using AEL (Spark Engine) in PDI client.
So my question is,
How do I run Pan and Kitchen using AEL (Spark engine)?
I am not 100% sure I am following your question correctly. Pan and Kitchen are command-line execution tools (slimmed-down versions of Kettle: Pan runs transformations, Kitchen runs jobs). If you are using Spark, then there is no need to use Pan or Kitchen. AEL (Adaptive Execution Layer) is an interface that allows you to use other engines (currently Spark) to execute transformations and jobs.
Adaptive Execution Layer - Pentaho Documentation
If you are asking about scheduling, that is a different topic all together.
Thanks for the response. My question is: how do I run a transformation using the AEL Spark engine through Pan? I need the transformation to run through Pan since I am scheduling it through cron. Currently I can run the transformation successfully on the AEL Spark engine through Spoon, but when I run the same transformation through Pan, it doesn't use the Spark engine. I hope my question is clear.
You need a wrapper job with a Transformation job entry that has the Spark run configuration set in its properties. Then you'll use kitchen.sh/bat to call the job, and the transformation will execute in AEL Spark.
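As a sketch of that setup, assuming a wrapper job saved at a hypothetical path /opt/etl/wrapper_job.kjb, whose Transformation entry points at the .ktr and has a Spark-type run configuration selected, the cron-driven Kitchen call might look like this (paths and log level are placeholders to adjust for your install):

```shell
# Hypothetical paths; adjust for your environment.
# The Spark run configuration lives inside the wrapper job's
# Transformation entry, so Kitchen only needs the .kjb file.
cd /opt/pentaho/data-integration
sh kitchen.sh -norep \
  -file=/opt/etl/wrapper_job.kjb \
  -level=Basic
```

The key point is that the run configuration is stored in the job entry, not passed on the command line: Kitchen launches the job with the default engine, and the Transformation entry then hands the transformation off to AEL Spark.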
Loads of thanks. We followed your instructions to run the job using Kitchen; the transformation executed and was processed, but I see there is no output file. Please find the attached snapshot. Please advise if I'm missing something in my approach.
I'm not certain I completely understand. Is the transformation supposed to output a file? If so, to what location? The screenshot provided is a warning message that I've usually seen when attempting to run spoon.sh in a Linux environment where libwebkitgtk-1.0 has not been installed.
Please let me know.
Thanks for the reply.
I tried clearing the Kitchen cache and executed the command; it runs and successfully fetches the records. But I am unable to locate the job (.kjb) in the AEL Spark engine.
Do you have any idea that could help me with this?
My apologies for a very late response here. After Kitchen fires the job, the transformation within it that uses the AEL run configuration gets passed to the Spark daemon and converted to a Spark job, so you won't find a .ktr in Spark. You can, though, monitor the execution via YARN and the Spark History Server if the daemon is configured correctly.
I hope that helps.
Thanks for your suggestion. I followed all the steps you mentioned, and it's working now.
I was going through it in detail, hence it took some time to reply.
The information you shared was really helpful; thanks for that.
As per your guidelines, I executed the job using Spoon. It runs successfully, connecting to the Spark engine; it is up and running in AEL and is also displayed in YARN and the Spark History Server.
However, the same job, when executed via the kitchen command, does not connect to AEL. Note that it runs as a normal job instead.
I have an issue related to the edge node.
Consider an example where I take a .kjb file (job) which internally calls a .ktr file (transformation).
Both the job and transformation files reside in HDFS (data lake).
I am executing the job through Kitchen, and the transformation is configured to execute as a Spark job through AEL.
But the transformation is not executing as a Spark job through AEL; it is executing as a normal job.
On the edge node there is a data-integration folder for AEL.
Another folder with the same name, data-integration, exists for Pan and Kitchen.
The problem occurs when we execute through Kitchen. The following command is used:
sh kitchen.sh -norep -file="hdfs://nameservice1/user/AEL_Trans_Job.kjb" -level=Detailed
Can you help me resolve this? It is a huge blocker for our work.