I'm trying to keep a large table (> 60M rows) in a local Oracle database in sync with an Azure SQL database. I'm aware of the "Insert/Update" step, and my current transformation is built around it.
The source table doesn't have a unique key, so I'm calculating one by combining several columns into a single field. The process works well with smaller tables (e.g. 50k rows), but with the large table of over 60M rows it takes forever, and I'm trying to find a more efficient design.
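To illustrate what I mean by the calculated key, here is a rough sketch in Python (the column names are made up; in the real transformation the combination happens inside the ETL step):

```python
import hashlib

def synthetic_key(row, key_columns):
    """Build a deterministic key by joining several column values.

    A separator avoids collisions such as ("ab", "c") vs ("a", "bc");
    hashing keeps the key a fixed length regardless of column widths.
    """
    raw = "|".join(str(row[col]) for col in key_columns)
    return hashlib.md5(raw.encode("utf-8")).hexdigest()

# Hypothetical example row with made-up column names
row = {"order_id": 1001, "line_no": 3, "created": "2024-01-15"}
print(synthetic_key(row, ["order_id", "line_no", "created"]))
```

The same combination of column values always yields the same key, which is what the Insert/Update comparison relies on.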
The table contains records spanning more than 10 years. However, I know for a fact that records older than 3 months (the current month plus the 2 preceding months) will never change. Also, new records can only be added within this 3-month time frame.
Would there be an efficient way to simply "ignore" records older than 3 months?
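What I have in mind is something like computing the first day of that 3-month window and only selecting source rows on or after it. A small sketch of the cutoff calculation (assuming the records carry a date column I can filter on):

```python
from datetime import date

def window_start(today):
    """First day of the month two months before `today`,
    i.e. the start of the current-month-plus-2-former-months window."""
    months = today.year * 12 + (today.month - 1) - 2
    return date(months // 12, months % 12 + 1, 1)

print(window_start(date(2024, 1, 15)))  # -> 2023-11-01
```

The source query would then read only `WHERE record_date >= window_start`, which should shrink the 60M-row scan down to the few months that can actually change.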
Would it make sense to create a checksum column? In that case, only records whose checksum differs (i.e. records that have actually changed) would be considered.
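The idea would be something along these lines (a simplified in-memory sketch; `target_checksums` stands in for checksums already stored on the Azure side, and the keys/values are made up):

```python
import hashlib

def checksum(values):
    """Hash the non-key column values of a row."""
    return hashlib.md5("|".join(map(repr, values)).encode("utf-8")).hexdigest()

# Hypothetical map of synthetic key -> checksum stored in the target table
target_checksums = {"k1": checksum([5, 9.99]), "k2": checksum([1, 2.50])}

source_rows = {"k1": [5, 9.99],   # unchanged -> skip
               "k2": [2, 2.50],   # changed   -> update
               "k3": [7, 1.00]}   # new       -> insert

actions = {}
for key, values in source_rows.items():
    stored = target_checksums.get(key)
    if stored is None:
        actions[key] = "insert"
    elif stored != checksum(values):
        actions[key] = "update"
    else:
        actions[key] = "skip"

print(actions)  # {'k1': 'skip', 'k2': 'update', 'k3': 'insert'}
```

Unchanged rows would be skipped outright instead of being compared column by column, which is where I hope the time savings would come from.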
Another option would probably be to split the target table into two: an "old" part that remains static, and a "recent" part that is synced with the source database. My concern with this solution is that once records in the recent table are old enough, they would have to be moved over to the "static" table, and this would need to happen automatically (i.e. nobody should have to move data from one table to another manually).
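The automatic move I'm picturing is essentially an insert-then-delete in one transaction, run on a schedule. A minimal sketch using an in-memory SQLite database (the table and column names are made up; the real thing would run against Azure SQL):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE recent (id INTEGER, rec_date TEXT)")
conn.execute("CREATE TABLE static (id INTEGER, rec_date TEXT)")
conn.executemany("INSERT INTO recent VALUES (?, ?)",
                 [(1, "2023-05-01"), (2, "2024-01-10")])

cutoff = "2023-11-01"  # hypothetical first day of the 3-month window

# Move everything older than the cutoff; the transaction makes the
# copy and the delete succeed or fail together.
with conn:
    conn.execute("INSERT INTO static SELECT * FROM recent "
                 "WHERE rec_date < ?", (cutoff,))
    conn.execute("DELETE FROM recent WHERE rec_date < ?", (cutoff,))

print(conn.execute("SELECT COUNT(*) FROM static").fetchone()[0])  # 1
print(conn.execute("SELECT COUNT(*) FROM recent").fetchone()[0])  # 1
```

Run monthly, this would keep the recent table small (and the sync fast) without anyone touching the data by hand.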
I'd really appreciate some ideas on how to achieve this.