Good morning everyone,
I need to choose an ETL tool to perform daily tasks, so I wanted to try Pentaho's Community Edition first to see whether the Enterprise Edition would be worth buying.
So far the product satisfies my needs. However, transformations are very slow when working with a large volume of data, and I was wondering whether the Community Edition is slowed down on purpose to get you to buy the Enterprise one.
Here is an example of a transformation I'm working on:
The CSV contains users with information about them. The purpose is to put those users into a database by sending HTTP requests to a REST API linked to a MySQL database.
It works well with around 1000 rows in my CSV. But as soon as I get above 10000 rows, it gets very slow during the "REST client" step: around 20 rows are processed per second at first, then after 20 minutes it goes down to 5 rows per second.
Processing the whole 10000 rows takes 40 minutes, which is far too long compared to other solutions I tried, which took less than 2 minutes.
Any ideas on how to overcome this problem? Is PDI Community Edition slowed down on purpose?
Thanks in advance, and sorry for any mistakes, as English isn't my native language.
Cordially.
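
To put the figures from the post above side by side, here is a rough back-of-the-envelope calculation in Python. It only uses the numbers quoted in the post (10000 rows, 40 minutes, 2 minutes). The helper name `rows_per_second` is made up for illustration, and "2 minutes" is treated as an upper bound since the other solutions reportedly took less than that.

```python
def rows_per_second(rows: int, minutes: float) -> float:
    """Average throughput for a run that handled `rows` rows in `minutes` minutes."""
    return rows / (minutes * 60)


# Figures quoted in the post: 10000 rows, 40 minutes in PDI,
# under 2 minutes with the other solutions that were tried.
pdi_rate = rows_per_second(10_000, 40)
target_rate = rows_per_second(10_000, 2)
speedup_needed = target_rate / pdi_rate

print(f"PDI run:         {pdi_rate:.1f} rows/s on average")
print(f"2-minute run:    {target_rate:.1f} rows/s on average")
print(f"Speed-up needed: {speedup_needed:.0f}x")
```

So the 40-minute run averages roughly 4 rows per second, which matches the slow end of the rates observed in the "REST client" step, while a 2-minute run needs about 83 rows per second, a gap of around 20x.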
Thanks to your advice, I was able to bring the transformation time down from 38 minutes to 2 minutes!
Now I have another question: when I try to do the same with a 100000-row CSV, Pentaho just stops responding (between the two JSON blocks).
I've tried increasing the "Nr of rows in rowset" setting, but nothing changes.
Any idea?
Thanks in advance,
Robert.