Hello, I am analyzing why our ETL process, built with PDI, gets gradually slower with each execution when it runs under heavy load for hours.
This process has been in place for years and has migrated across multiple versions of PDI (Kettle). I believe it was first created on version 4. This year, we upgraded from version 5.4 to 8.0. Since the upgrade, we see performance go from fast to extremely slow over a few hours of constant processing.
Normally, this process takes 2-5 minutes to complete. However, after hours of back-to-back processing, a single run takes 25-35 minutes. On version 5.4, we did not see this type of performance issue, but there were other issues.
This process has hundreds of steps, and we have not yet tried to improve performance by changing the ktr file. We have ensured the environment is set up correctly and checked as many environment variables as we know to check.
Here are some specifics about our environment:
PostgreSQL DB version 10 on both databases
PostgreSQL JDBC driver 42.2.2
Java 8 (Oracle 1.8.0_181)
Currently, I have VisualVM connected to the server to watch performance. The heap grows over time, and although garbage collection runs, it does not seem to reclaim as much memory as it should.
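A minimal sketch of how that observation could be turned into numbers, using only the standard java.lang.management API. The class name HeapFloorSampler and its methods are hypothetical and not part of PDI or Tomcat. It measures the JVM it runs in, so to be meaningful it would have to run inside the Tomcat JVM (for example from a background thread), and the gc() call is only a request that the JVM may ignore (for instance with -XX:+DisableExplicitGC).

```java
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryMXBean;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper: samples heap usage right after a GC request so that
// a rising "floor" (memory that survives collection) becomes visible.
class HeapFloorSampler {

    /** Requests a GC, then returns the heap bytes still in use in this JVM. */
    public static long usedHeapAfterGc() {
        MemoryMXBean mem = ManagementFactory.getMemoryMXBean();
        mem.gc(); // effectively System.gc(); only a request, not a guarantee
        return mem.getHeapMemoryUsage().getUsed();
    }

    /**
     * Returns true when every sample exceeds the previous one by at least
     * minGrowthBytes, i.e. the post-GC floor rose at every step.
     */
    public static boolean floorIsRising(List<Long> samples, long minGrowthBytes) {
        if (samples.size() < 2) {
            return false; // not enough data to talk about a trend
        }
        for (int i = 1; i < samples.size(); i++) {
            if (samples.get(i) - samples.get(i - 1) < minGrowthBytes) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) throws InterruptedException {
        // Assumed defaults for illustration: 5 samples, 1 second apart.
        int count = args.length > 0 ? Integer.parseInt(args[0]) : 5;
        long intervalMs = args.length > 1 ? Long.parseLong(args[1]) : 1000L;

        List<Long> samples = new ArrayList<>();
        for (int i = 0; i < count; i++) {
            long used = usedHeapAfterGc();
            samples.add(used);
            System.out.println("post-GC heap used (bytes): " + used);
            Thread.sleep(intervalMs);
        }
        // 1 MiB per step is an arbitrary threshold chosen for the example.
        System.out.println("floor rising: " + floorIsRising(samples, 1024L * 1024L));
    }
}
```

If the post-GC floor keeps rising across runs, that would point toward objects being retained between executions rather than toward GC tuning, and a heap dump comparison in VisualVM would be the natural next step.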
Our current workaround is to restart the Tomcat server, which brings performance back into the acceptable range. However, this is not a desirable or sustainable solution.
Has anyone else faced a similar issue?
If so, how did you resolve it?
Any other ideas for pinpointing the issue?
Were there changes from version 5.4 to 8.0 that would cause this type of performance degradation?
Any help is greatly appreciated.