Optimal way to join multiple files and output without creating multiple transformations

Question

I have a task where I need to join multiple files and create a specific output file(s).

At the moment I have created around 10 transformations to carry out this task (one for each output file).

To me this seems quite cumbersome (not to mention tedious), so I would like to know whether there is a better way to achieve this.

The joins that create these output files are all very similar; in fact, many consist of the same joins.

Here's an example of two of the transformations I have created:

[Image: pastedimage_1]

[Image: pastedimage_2]

As you can see, some of the joins are repeated. This is a common pattern for the remaining transformations that I intend to create; the only differences are the files to join and the fields selected for the output.

Any help on this is appreciated.

#Pentaho
#Ctools

Answer

The metadata injection step behaves differently from most steps. If you are injecting into a step such as Text file output, which expects output field columns and other per-field metadata, you send multiple rows (one for each field in the output). The other inputs streamed in are single values set on the destination step. Once the injected ktr's steps have received enough to be fully configured, PDI runs the injected transformation. This matters because the step does not loop automatically: you need something to loop for you and send a complete set of step configuration into the injector on each pass.
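To make the row layout concrete, here is a minimal Python sketch of the data you would feed the injector for each output file. It is only a model of the shape described above, not PDI code: `OutputSpec`, `field_rows`, `single_values` and `injection_passes` are hypothetical names, and the column names are made up for illustration.

```python
from dataclasses import dataclass


@dataclass
class OutputSpec:
    """Hypothetical description of one output file to generate."""
    filename: str
    separator: str
    fields: list  # list of (field name, field type) tuples


def field_rows(spec):
    # Multi-row metadata: one row per output field.
    return [
        {"output": spec.filename, "field_name": name, "field_type": ftype}
        for name, ftype in spec.fields
    ]


def single_values(spec):
    # Single-valued settings: one value each for the destination step.
    return {"filename": spec.filename, "separator": spec.separator}


def injection_passes(specs):
    # The injector does not loop by itself, so something outside it has to
    # hand over one complete configuration per output file.
    for spec in specs:
        yield single_values(spec), field_rows(spec)


specs = [
    OutputSpec("customers.csv", ";", [("id", "Integer"), ("name", "String")]),
    OutputSpec("orders.csv", ";", [("order_id", "Integer"),
                                   ("customer_id", "Integer"),
                                   ("total", "Number")]),
]

for settings, rows in injection_passes(specs):
    print(settings["filename"], "->", [r["field_name"] for r in rows])
```

In PDI itself the outer loop would be whatever drives the injector once per output file; the sketch only shows that each pass needs both the single values and the full set of field rows.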

A good way to develop and troubleshoot is to save the injected transformation to a file (an option on one of the step's tabs) and see what was actually injected. Watch for things that tend to slip past Hitachi's QA, such as checkboxes that you set as constants in the injection but that don't show up in the injected transformation. It happens. You get what you inspect, not what you expect. If you run into a bug, please report it at jira.pentaho.com with a ktr and reproduction steps. If you have a support agreement, please submit a support case. Good luck.
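If you end up inspecting a lot of injected transformations, a small script can help. The sketch below assumes the saved ktr is an XML file whose root has `step` children, each with `name` and `type` elements; `summarize_steps` is a hypothetical helper, and the setting tag names you pass in are whatever you see in your own ktr.

```python
import xml.etree.ElementTree as ET


def summarize_steps(ktr_xml, settings=()):
    """List each step in the ktr text with its type and chosen settings.

    A setting that is missing from a step comes back as None, which makes
    values that were never injected easy to spot.
    """
    root = ET.fromstring(ktr_xml)
    summary = []
    for step in root.findall("step"):
        entry = {
            "name": step.findtext("name"),
            "type": step.findtext("type"),
        }
        for tag in settings:
            entry[tag] = step.findtext(tag)
        summary.append(entry)
    return summary
```

Read the saved file into a string, call `summarize_steps(text, ["some_setting"])`, and compare the result with what you meant to inject.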
