Pentaho


 Loading large files

Ram Ch posted 09-08-2023 21:46

Hi,

I am trying to load large files into a table with some additional steps. I have set the number of copies to 3. Is there a way I can improve the performance further, or is there another way I could design this?

The transformation reads a fixed-width file, adds sequences, cuts one of the fields, sets some variables from stream fields, and loads into a staging table.
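Outside of PDI, the core of that flow can be sketched in plain Python (the field names and offsets here are invented for illustration, not taken from the actual file layout):

```python
# Hypothetical sketch of the transformation: fixed-width parse, add a
# sequence, cut one field. Field offsets below are made up for illustration.

FIELDS = {"id": (0, 5), "name": (5, 15), "amount": (15, 25)}

def parse_fixed_width(line):
    # Slice each field out of the line by its (start, end) offsets
    return {name: line[start:end].strip() for name, (start, end) in FIELDS.items()}

def transform(lines):
    rows = []
    for seq, line in enumerate(lines, start=1):   # "Add sequence" step
        row = parse_fixed_width(line)
        row["seq"] = seq
        row["name"] = row["name"][:4]             # "cut field" step: keep first 4 chars
        rows.append(row)
    return rows

sample = [
    "00001John Smith0000012345",
    "00002Jane Doe  0000067890",
]
for row in transform(sample):
    print(row)
```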

On another point: I also have to do a database lookup to get additional columns, but performance is really bad with those steps. I have tried both Database Join and Database Lookup.

Thanks,

Eshwar

Petr Prochazka

Hi Eshwar,

this is too little information. Which database are you writing to? What is the commit size on the output step?

You can also try a bulk loader step. It's available for some DBMSs, e.g. PostgreSQL and MySQL.
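As a rough illustration of why commit size matters, here is a sketch using Python's built-in sqlite3 (the real bulk loader steps use DB-specific mechanisms such as PostgreSQL's COPY; this only shows the batching idea, and the commit size of 1000 is an arbitrary example):

```python
import sqlite3

# Illustrative sketch: commit once per batch of rows instead of once per
# row, which is the effect a larger commit size has on the output step.
COMMIT_SIZE = 1000

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE stg (id INTEGER, name TEXT)")

rows = [(i, f"name{i}") for i in range(5000)]

for start in range(0, len(rows), COMMIT_SIZE):
    batch = rows[start:start + COMMIT_SIZE]
    conn.executemany("INSERT INTO stg (id, name) VALUES (?, ?)", batch)
    conn.commit()  # one commit per batch, not per row

count = conn.execute("SELECT COUNT(*) FROM stg").fetchone()[0]
print(count)  # 5000
```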

There is a difference between Database Join and Database Lookup. Database Join executes the query for every row of the stream, while Database Lookup can use an in-memory cache for the key fields, or even load the whole lookup table into memory.

Performance also depends on how the DBMS executes the lookup query. You can apply some optimisations, e.g. sort the incoming data by the lookup fields and enable the cache. The query is then executed only for the first row of each key; subsequent rows with the same key check the cached lookup result first before any query is executed.
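The caching behaviour described above can be sketched like this (hypothetical lookup table and query; the counter just shows how many lookups actually reach the database when the input stream is sorted by the key):

```python
import sqlite3

# Hypothetical lookup table standing in for the real reference table.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE ref (key TEXT PRIMARY KEY, extra TEXT)")
conn.executemany("INSERT INTO ref VALUES (?, ?)",
                 [("A", "alpha"), ("B", "beta")])

cache = {}
queries_executed = 0

def lookup(key):
    # Check the in-memory cache first; hit the database only on a miss.
    global queries_executed
    if key not in cache:
        queries_executed += 1
        row = conn.execute("SELECT extra FROM ref WHERE key = ?", (key,)).fetchone()
        cache[key] = row[0] if row else None
    return cache[key]

# Input sorted by the lookup key: each distinct key triggers exactly one query.
stream = ["A", "A", "A", "B", "B"]
results = [lookup(k) for k in stream]
print(results, queries_executed)  # ['alpha', 'alpha', 'alpha', 'beta', 'beta'] 2
```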