 Unable to run PDI AEL Spark Jobs

Zach Izzard posted 07-17-2019 22:34

Pentaho Data Integration Version: 8.1

Hadoop Version: 2.7.3

Spark Version: 2.4.3

I am trying to use PDI AEL to submit Spark jobs to a Spark cluster. For my Hadoop configuration, I am using the config files provided by the cluster manager, yet I am still unable to run any jobs. When I attempt to submit a job, it correctly grabs the transformation file; however, once the transformation file is acquired, the AEL Daemon notifies me that the Spark session was killed. I believe the error is due to "No HA service delegation token found for logical URI hdfs://<uri>", but I am not sure, so I have attached the info.log file below. Additionally, the server is secured with Kerberos, but I have tested the keytab and it allows me to connect to the Hadoop server.

If anyone could help me fix this, I would greatly appreciate it. I have been stuck on this for multiple days and have found no solution.

For help reading the info.log file, lines 1 to 37529 are from the AEL Daemon starting up, while the HA service error occurs on line 49595.

Additionally, I have attached the error-contents.log file, which is a truncated version of info.log that highlights errors that I believe could be causing the issue.
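
For anyone who wants to jump straight to the relevant part of the attached log, the following is a minimal Python sketch that skips the daemon startup lines and prints every later line containing the quoted error text. The line boundary (37529) and the error string come from the post above; the function names `find_after_startup` and `scan_file` and the local path `info.log` are hypothetical, chosen only for illustration.

```python
from typing import Iterable, Iterator, Tuple

# From the post: lines 1..37529 of info.log are AEL Daemon startup output.
STARTUP_LAST_LINE = 37529
# Error text quoted in the post (the URI is left out because it was elided there).
NEEDLE = "No HA service delegation token found"


def find_after_startup(
    lines: Iterable[str],
    needle: str = NEEDLE,
    startup_last_line: int = STARTUP_LAST_LINE,
) -> Iterator[Tuple[int, str]]:
    """Yield (1-based line number, text) for post-startup lines containing needle."""
    for number, text in enumerate(lines, start=1):
        if number > startup_last_line and needle in text:
            yield number, text.rstrip("\n")


def scan_file(path: str) -> None:
    """Print matching lines from a log file (path is an assumption, e.g. 'info.log')."""
    with open(path, encoding="utf-8", errors="replace") as handle:
        for number, text in find_after_startup(handle):
            print(f"{number}: {text}")
```

Calling `scan_file("info.log")` on the unzipped attachment should list the matching lines with their line numbers, which can then be compared against the line number mentioned above.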


Attachment: info.log.zip (431 KB)
girish athanikar

The Spark session is usually killed for one of a few reasons. Try the following:

Remove pentaho-kerberos-jaas from the etc-spark directory (data-integration/system/karaf/etc-spark), then restart daemon.sh and check whether the job runs.

 

Thanks,

Girish