
Spoon: performance issue when initializing the Preview Data tab

Question asked by Weekend Warrior on Jan 19, 2018

Hi, I'm using Pentaho 7.0.0.0-25 with a transformation of about 250 steps. It starts with 8 TableInput steps whose streams are joined into a single stream that writes to a result database table.

 

I have now noticed that when I start the transformation (in Spoon with Preview Data active), it stays in the idle state for up to 5 minutes. OK, on the first execution the DbCache has to be built, which is time-consuming because my queries are very complex, but later executions also stay idle for a long time.

 

Checking the implementation, I see that you call

transPreviewDelegate.capturePreviewData( trans, meta.getSteps() );

in prepareTrans of TransGraph, which in turn calls getStepFields() in TransMeta for each step.

 

I'm now wondering:

Why do you clear the stepsFieldsCache when processing a new step?

Why do you not restore the previous step from the cache when getStepFields is called recursively?

 

As a result, the initialization has to start recursively from the root step for every single step, instead of reading the already initialized rowMeta from stepsFieldsCache.
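
To illustrate the cost, here is a minimal sketch that counts how often a step's fields are computed in a simple linear chain of steps. It is not Pentaho code: the class StepFieldsCost, its integer step indices, and the "field count" values are hypothetical stand-ins for TransMeta, StepMeta, and RowMetaInterface, and a real transformation graph is not a plain chain.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical model of recursive field resolution; not the Pentaho implementation.
class StepFieldsCost {
    private final Map<Integer, Integer> cache = new HashMap<>();
    private long visits = 0;

    // Resolves the field count of a step in the chain 0 -> 1 -> ... -> step.
    private int resolve(int step) {
        Integer cached = cache.get(step);
        if (cached != null) {
            return cached;
        }
        visits++; // one real computation for this step
        int fields = (step == 0) ? 1 : resolve(step - 1) + 1;
        cache.put(step, fields);
        return fields;
    }

    // Resolves every step once, as the preview setup does, and returns the
    // number of real computations that were needed.
    static long countVisits(int steps, boolean clearCacheBeforeEachStep) {
        StepFieldsCost model = new StepFieldsCost();
        for (int step = 0; step < steps; step++) {
            if (clearCacheBeforeEachStep) {
                model.cache.clear();
            }
            model.resolve(step);
        }
        return model.visits;
    }

    public static void main(String[] args) {
        System.out.println(countVisits(250, true));  // prints 31375
        System.out.println(countVisits(250, false)); // prints 250
    }
}
```

In this simplified model, clearing the cache for each step makes the work grow quadratically with the number of steps, while keeping the cache makes it linear.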

 

From my point of view, you should store and read not only the previous step and the hop but also the current step in stepsFieldsCache, like this:

 

    String fromToCacheEntry = stepMeta.getName() + ( targetStep != null ? ( "-" + targetStep.getName() ) : "" );

    // Look up the hop-specific entry first, then the entry for the step itself.
    RowMetaInterface rowMeta = stepsFieldsCache.get( fromToCacheEntry );
    if ( rowMeta != null ) {
      return rowMeta;
    }
    rowMeta = stepsFieldsCache.get( stepMeta.getName() );
    if ( rowMeta != null ) {
      return rowMeta;
    }

    // ... rowMeta is determined here as before ...

    // Store the result under both keys.
    stepsFieldsCache.put( fromToCacheEntry, rowMeta );
    if ( !fromToCacheEntry.equals( stepMeta.getName() ) ) {
      stepsFieldsCache.put( stepMeta.getName(), rowMeta );
    }
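
The two-key lookup can be tried in isolation with a small self-contained sketch. Everything in it is an assumption for illustration: TwoKeyFieldsCache and getOrCompute are hypothetical names, a List<String> stands in for RowMetaInterface, and the Supplier stands in for the existing field computation. It also assumes that a step's output layout is the same for every target step, which may not hold for all step types.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Supplier;

// Hypothetical helper, not part of Pentaho: caches a step's fields under the
// "step-target" key and under the plain step name.
class TwoKeyFieldsCache {
    private final Map<String, List<String>> cache = new HashMap<>();

    List<String> getOrCompute(String stepName, String targetStepName,
                              Supplier<List<String>> compute) {
        String hopKey = targetStepName != null ? stepName + "-" + targetStepName : stepName;

        // Try the hop-specific entry first, then the entry for the step itself.
        List<String> fields = cache.get(hopKey);
        if (fields == null) {
            fields = cache.get(stepName);
        }
        if (fields != null) {
            return fields;
        }

        // Cache miss: compute once and store under both keys.
        fields = compute.get();
        cache.put(hopKey, fields);
        if (!hopKey.equals(stepName)) {
            cache.put(stepName, fields);
        }
        return fields;
    }

    public static void main(String[] args) {
        TwoKeyFieldsCache cache = new TwoKeyFieldsCache();
        int[] computations = {0};
        Supplier<List<String>> expensive = () -> {
            computations[0]++;
            return Arrays.asList("id", "name");
        };
        cache.getOrCompute("TableInput 1", "Join", expensive);
        cache.getOrCompute("TableInput 1", null, expensive);   // served from cache
        cache.getOrCompute("TableInput 1", "Other", expensive); // served via step key
        System.out.println(computations[0]); // prints 1
    }
}
```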
