How to make Insert/Update not ignore duplicate values from production table?

Question

I created a Pentaho transformation that parses JSON from a production table and loads the data into a warehouse table. To avoid inserting new duplicates, I used an Insert/Update step as the final step. It works well for that, but it also drops the duplicates that already exist in the production table.

Due to specific use cases, I need to carry those existing duplicates over to the warehouse table, and I'm not sure how to do it. The only approach I could think of was to change the Comparator from = to > in the "The key(s) to look up the value(s)" box. That does include the existing duplicates on the first run, but when I run the transformation a second time, it inserts new duplicates that are not in the production table. In other words, with the Comparator set to >, the Insert/Update step generates duplicates of its own, which I want to avoid.

What I want is to load the existing duplicates from the production table into the warehouse table without the Insert/Update step creating new ones. Is there a different approach I could take here? Thanks!

Answer

Never mind, I finally figured it out. I assigned a unique identifier to each record in the table before running the transformation, and with that in place I was able to retain the existing duplicates.
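For anyone who wants to see why this helps, here is a rough simulation in Python with sqlite3. It is not Pentaho code; it only mimics a lookup-then-insert-or-update with = comparators. The table layout, the column names (row_id, customer, amount), and the assumption that the new identifier is what gets used as the lookup key are all made up for illustration.

```python
import sqlite3


def make_db():
    """Build an in-memory production table that contains one duplicate pair."""
    conn = sqlite3.connect(":memory:")
    conn.row_factory = sqlite3.Row
    # row_id plays the role of the unique identifier assigned to each record
    # (hypothetical column name).
    conn.execute(
        "CREATE TABLE production ("
        "row_id INTEGER PRIMARY KEY AUTOINCREMENT, customer TEXT, amount INTEGER)"
    )
    conn.execute(
        "CREATE TABLE warehouse (row_id INTEGER, customer TEXT, amount INTEGER)"
    )
    conn.executemany(
        "INSERT INTO production (customer, amount) VALUES (?, ?)",
        [("alice", 10), ("alice", 10), ("bob", 5)],
    )
    return conn


def run_load(conn, key_cols):
    """Mimic an insert-or-update keyed on key_cols, all compared with '='."""
    where = " AND ".join(f"{col} = ?" for col in key_cols)
    rows = conn.execute("SELECT row_id, customer, amount FROM production").fetchall()
    for row in rows:
        keys = [row[col] for col in key_cols]
        hit = conn.execute(f"SELECT 1 FROM warehouse WHERE {where}", keys).fetchone()
        if hit is None:
            # No match on the keys: insert the row.
            conn.execute("INSERT INTO warehouse VALUES (?, ?, ?)", tuple(row))
        else:
            # Match found: update the non-identifier fields instead of inserting.
            conn.execute(
                f"UPDATE warehouse SET customer = ?, amount = ? WHERE {where}",
                [row["customer"], row["amount"], *keys],
            )


def warehouse_count(conn):
    return conn.execute("SELECT COUNT(*) FROM warehouse").fetchone()[0]


if __name__ == "__main__":
    by_value = make_db()
    run_load(by_value, ["customer", "amount"])
    run_load(by_value, ["customer", "amount"])  # second run
    print(warehouse_count(by_value))  # 2: the duplicate pair collapsed to one row

    by_id = make_db()
    run_load(by_id, ["row_id"])
    run_load(by_id, ["row_id"])  # second run adds nothing
    print(warehouse_count(by_id))  # 3: duplicates kept, none generated
```

When the lookup key is the data values themselves, the second "alice" row matches the first and is treated as an update, so it never reaches the warehouse. Keying on a per-record identifier makes every source row distinct to the lookup, and reruns still find every row and insert nothing new.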

Pentaho
