Pentaho

 View Only

 Excel Writer (v9.2) parameter Stream XLSX doesn't work correctly

Rob Opdam's profile image
Rob Opdam posted 01-06-2022 08:36
G'day all,

Upgraded from PDI v9.0 to v9.2 to address LOG4J-issues but now I've a problem with my Excel datasets for PowerBI.
I've created the xlsx files with parameter 'Stream XLSX data' on succesfully in release v9.0
Now I had to disable this parameter because PowerBI didn't find any records in 'the table'
When I open the same files in Excel all records are there.

Any ideas on how te fix this?

TTFN, Rob
Gert Wieland's profile image
Gert Wieland

Could you share your transformation and a sample file?

I read Excel files all the time, but not with the "Streaming" option. I use Spread Sheet Type "Excel 2007 XLSX (Apache POI)", which works just fine.

Rob Opdam's profile image
Rob Opdam
Thx Gert.
It would take some time to create an privacy-proof set ;-)
The spreadsheet type you mentioned is no longer available in v 9.2
Only .xls [Excel 97 and above] and .xlsx [Excel 2007 and above]
The last one works, but it takes up a lot of time to generate some bigger spreadsheets.
Minutes versus seconds, hence the wish to use the streaming option (again).
It seams that the streaming option doesn't initiate and/or close the file correctly.
PowerBI doesn't recognize it.
After opening the sheet in Excel first and saving it, than PowerBi can read it.
Carl Messner's profile image
Carl Messner
This problem also appeared in older Pentaho versions and solved later with new releases.
The same issue appeared in Pentaho 9.2, unfortunattly i did not find any solution to the problem.
Only unchecking the "Stream XLSX data" checkbox the job/transformation worked (and begging to the users for being patient for largest excel files).
I found that the issue was something related to java with the correct close of the excel file.
It seems that in Pentaho CE 9.3 the issue was fixed again.