Pentaho

 View Only

 How can I calculate the sum of a column, then use that sum to perform another calculation?

  • Pentaho
  • Kettle
  • Pentaho
  • Pentaho Data Integration PDI
James Antony's profile image
James Antony posted 03-28-2019 12:17

I need to perform a number of calculations. The first part is to find the sum of a particular column and then divide the sum by it's single value in the row.

LiabilityBal / Sum of LiabilityBal.

I've created a data grid with some test data, and used a group by step to perform the aggregation. This performs the calculation I need.

The non-populated fields need to be calculated, with WeightedAmtCalc being calculated with the above formula.grid-data

steps

I've tried splitting the streams and using a lookup on the id, but that only matches one row.

group

This is the result I get, Ideally I would like to have the sum on each row. The sum field on row 1 is the sum of all of the fields in the second column.

result

Is there a way for me to put the sum back into data stream where the sum appears on each row, so that I can then perform the calculation?

Thanks for your time


#Pentaho
#Kettle
#PentahoDataIntegrationPDI
James Antony's profile image
James Antony

I managed to resolve this using a modified javascript step; by keeping a record of the last sum found in the row.

Ana Gonzalez's profile image
Ana Gonzalez

You could also use Merge join step without any keys for the join, or the Join rows (cartesian product) product step to achieve the same, depending on the number the size of your data one of those could be more eficient20190329_001

20190329_002

Regards

James Antony's profile image
James Antony

Thank you for this. This might be useful for another issue I'm having at the moment.

Data Conversion's profile image
Data Conversion