
Bug in CsvInput with multi-character delimiters

Question asked by Xiaoao Wanghu on Sep 27, 2018
Latest reply on Oct 6, 2018 by Xiaoao Wanghu

I have submitted my code here:

7.1--fix the bug which is about multiple chars of delimiter by xiaoaowanghu · Pull Request #5810 · pentaho/pentaho-kettl…

However, I have only run a few simple tests against my code, so I would like the official team to confirm whether it is OK. I don't want other bugs caused by my change to show up once my project is in production.

 

Details below:

When a multi-character delimiter (such as @|@) happens to be split across two read buffers, the CSV Input step reads the data incorrectly.

For example: in the CSV Input step, set the NIO buffer size to 10 and the delimiter to '@|@', and use a file that contains only the following line:


a@|@aaaa@|@aaa@|@aaaa@|@aaaaa@|@

 

Get the fields automatically and click Preview; you will then see the bug.
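
To make the failure mode concrete, here is a minimal Java sketch. It is not the code from the pull request and not Kettle's actual CsvInput implementation; the class and method names are hypothetical, and a String stands in for the file. It assumes the problem comes from searching each buffer for the delimiter independently, and compares that with a version that carries the partially read field (including a partial delimiter) over to the next buffer.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical demo class; none of these names exist in Kettle.
class ChunkedDelimiterDemo {

    // Naive version (assumed failure mode): each buffer is searched on its
    // own, so a delimiter cut in half at a buffer boundary is never matched.
    static List<String> splitPerBuffer(String data, String delimiter, int bufferSize) {
        List<String> fields = new ArrayList<>();
        StringBuilder field = new StringBuilder();
        for (int start = 0; start < data.length(); start += bufferSize) {
            int end = Math.min(start + bufferSize, data.length());
            String buffer = data.substring(start, end);
            int pos = 0;
            int hit;
            while ((hit = buffer.indexOf(delimiter, pos)) >= 0) {
                field.append(buffer, pos, hit);
                fields.add(field.toString());
                field.setLength(0);
                pos = hit + delimiter.length();
            }
            // Whatever is left, including half a delimiter, is kept as data.
            field.append(buffer, pos, buffer.length());
        }
        fields.add(field.toString());
        return fields;
    }

    // Boundary-safe version: the field under construction survives across
    // buffers, and after every character we check whether it now ends with
    // the full delimiter.
    static List<String> splitAcrossBuffers(String data, String delimiter, int bufferSize) {
        if (delimiter.isEmpty()) {
            throw new IllegalArgumentException("delimiter must not be empty");
        }
        List<String> fields = new ArrayList<>();
        StringBuilder field = new StringBuilder();
        int delimiterLength = delimiter.length();
        for (int start = 0; start < data.length(); start += bufferSize) {
            int end = Math.min(start + bufferSize, data.length());
            char[] buffer = data.substring(start, end).toCharArray();
            for (char c : buffer) {
                field.append(c);
                int tail = field.length() - delimiterLength;
                if (tail >= 0 && field.indexOf(delimiter, tail) == tail) {
                    fields.add(field.substring(0, tail));
                    field.setLength(0);
                }
            }
        }
        // The text after the last delimiter is the final (possibly empty) field.
        fields.add(field.toString());
        return fields;
    }

    public static void main(String[] args) {
        String line = "a@|@aaaa@|@aaa@|@aaaa@|@aaaaa@|@";
        System.out.println("per buffer:     " + splitPerBuffer(line, "@|@", 10));
        System.out.println("across buffers: " + splitAcrossBuffers(line, "@|@", 10));
    }
}
```

With a buffer size of 10, the sample line is cut into "a@|@aaaa@|", "@aaa@|@aaa", "a@|@aaaaa@" and "|@", so two of the five delimiters straddle a boundary. The per-buffer version glues those delimiters into the neighbouring fields, while the second version returns a, aaaa, aaa, aaaa, aaaaa and one empty trailing field.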

 


1.png

 

2.png
