Pentaho Data Integration Version: 8.1
Hadoop Version: 2.7.3
Spark Version: 2.4.3
I am trying to use PDI AEL to submit Spark jobs to a Spark cluster. For my Hadoop configuration, I am using the config files provided by the cluster manager, yet I am still unable to run any jobs. When I submit a job, it correctly grabs the transformation file; however, once the transformation file is acquired, the AEL Daemon reports that the Spark session was killed. I believe the error is due to "No HA service delegation token found for logical URI hdfs://<uri>", but I am not sure, so I have attached the info.log file below. Additionally, the cluster is secured with Kerberos, but I have tested the keytab and it allows me to connect to the Hadoop server.
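For context, the keytab test I ran was essentially the following (the principal name and keytab path are placeholders for my actual values):

```shell
# Obtain a Kerberos ticket from the keytab, then confirm HDFS accepts it.
# Principal and paths below are placeholders, not my real values.
kinit -kt /etc/security/keytabs/pdi.keytab pdi-user@EXAMPLE.COM
klist           # confirm a valid TGT was issued for the principal
hdfs dfs -ls /  # confirm the NameNode accepts the ticket
```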
If anyone could help me fix this, I would greatly appreciate it. I have been stuck on this for multiple days and have found no solution.
For help reading the info.log file, lines 1 to 37529 are from the AEL Daemon starting up, while the HA service error occurs on line 49595.
Additionally, I have attached the error-contents.log file, which is a truncated version of info.log that highlights errors that I believe could be causing the issue.
A killed Spark session can have several causes. In your case, try the following: remove pentaho-kerberos-jaas from the etc-spark directory (data-integration/system/karaf/etc-spark), then restart daemon.sh and check again.
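Concretely, the steps would look something like this. The exact config filename and the daemon.sh location are assumptions based on a default PDI 8.1 layout, so adjust them to your install:

```shell
# Paths assume a default PDI 8.1 install; adjust to your layout.
cd data-integration

# Move the Kerberos JAAS config out of the Spark Karaf profile.
# Back it up rather than deleting it, so it can be restored.
# Filename assumed to be pentaho-kerberos-jaas.cfg (standard Karaf .cfg naming).
mv system/karaf/etc-spark/pentaho-kerberos-jaas.cfg \
   /tmp/pentaho-kerberos-jaas.cfg.bak

# Restart the AEL daemon and re-submit the transformation.
# daemon.sh location assumed; it may differ in your install.
./adaptive-execution/daemon.sh stop
./adaptive-execution/daemon.sh start
```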
Thanks,
Girish