Pentaho


 Pass SFTP files from Job to Transformation

posted 06-07-2024 11:22

I have a Pentaho DI job that downloads multiple files from an SFTP server. I need to use those files in a transformation (which is called from that job) to update Salesforce records. Is there a way to do that?

I tried the "Get rows from result" step, but it does not fetch the file data. I also searched the Pentaho documentation for a reference. Kindly suggest a way to achieve this.


Hello,

There are two main approaches to pass SFTP files from a Pentaho DI job to a transformation for updating Salesforce records:


1. Passing filenames as parameters:

Job:

Use the "Get a file with SFTP" job entry to download the files from the SFTP server to a local directory.
For each file, use a "Set variables" job entry to store the filename (including path) in a variable.
Call the transformation and pass that variable as a named parameter on the "Parameters" tab of the Transformation job entry.

Transformation:

Declare the parameter in the transformation settings, then read it with a "Get Variables" step, which returns a single row containing the filename. Alternatively, reference it directly as ${filename_variable} in the filename field of a file input step such as "Text file input".
"Get rows from result" is not suitable here: it only returns rows that an earlier transformation stored with "Copy rows to result", and it does not read variables or file contents.

Salesforce Update:

Use a Salesforce output step such as "Salesforce Update" or "Salesforce Upsert" in the transformation.
The rows read from the file supply the values for the Salesforce fields you map in that step. The filename itself remains available as ${filename_variable} in any field that supports variable substitution.
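
As a rough illustration of the first approach outside the job designer, the sketch below builds one Pan command line per downloaded file and passes the file path as a named parameter. The parameter name FILENAME, the script name pan.sh, and any paths you supply are assumptions for illustration; the parameter must also be declared in the transformation settings for -param to take effect.

```python
from pathlib import Path


def build_pan_commands(download_dir, ktr_path, param_name="FILENAME", pan="pan.sh"):
    """Return one Pan command (as an argument list) per file in download_dir."""
    root = Path(download_dir)
    if not root.is_dir():
        # Nothing was downloaded, so there is nothing to run.
        return []
    commands = []
    for path in sorted(root.iterdir()):
        if not path.is_file():
            continue  # skip sub-directories
        commands.append([
            pan,
            f"-file={ktr_path}",               # transformation to run
            f"-param:{param_name}={path}",     # named parameter holding the file path
        ])
    return commands
```

Each returned list could be handed to subprocess.run, which keeps one transformation run (and one log) per file.
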

2. Dynamically reading files within the transformation:

Transformation:

Use a file input step such as "Text file input" with a wildcard to read all files from the download directory (assuming they are all saved to one known location). In "Text file input" the wildcard is a regular expression, for example .*\.csv.
This approach is simpler but requires the downloaded files to be accessible to the Pentaho server running the transformation.
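
The file selection in the second approach can be pictured with the sketch below. It assumes the input step evaluates its wildcard as a regular expression against the whole filename, so the pattern is written as .*\.csv rather than *.csv. The function name and the example pattern are hypothetical, and this models the selection logic only, not the step itself.

```python
import re
from pathlib import Path


def select_files(directory, wildcard_regex):
    """Return the files in directory whose full name matches the regular expression."""
    pattern = re.compile(wildcard_regex)
    return sorted(
        p for p in Path(directory).iterdir()
        if p.is_file() and pattern.fullmatch(p.name)
    )
```

Calling select_files(download_dir, r".*\.csv") would return only the CSV files, which is a quick way to check a pattern before entering it in the step dialog.
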
Here are some additional points to consider:


Error Handling: Implement proper error handling mechanisms in both the job and transformation to catch any issues during file download, transformation processing, or Salesforce updates.
Security: Ensure secure handling of Salesforce credentials within the Pentaho job or transformation. Consider using secure password stores or environment variables.
Performance: Both approaches read every file in full. With a large number of files, the first approach starts the transformation once per file, so the second is usually lighter; the first is useful when each file needs its own handling or logging.
Resources:


Pentaho documentation on the "Get a file with SFTP" job entry: https://pentaho-public.atlassian.net/wiki/spaces/EAI/pages/371558467/Get+a+file+with+SFTP
Pentaho documentation on the Salesforce steps (this link opens the "Salesforce Input" page; search the same transformation steps reference for "Salesforce Update" or "Salesforce Upsert"): https://docs.hitachivantara.com/r/en-us/pentaho-data-integration-and-analytics/9.5.x/mk-95pdia003/pdi-transformation-steps/salesforce-input

I hope this helps.