Pentaho

 S3 CSV Input fails when filename has special char (+) on PDI v9.4

Abhishek Sawant posted 02-05-2024 10:21

We are in the process of migrating our existing ETLs from PDI v7.0 to v9.4.

On PDI v9.4, I get the error below when the S3 CSV Input step reads an S3 file whose name contains a "+" (plus) sign. The same file worked on PDI v7.0:

2024/02/05 12:10:39 - S3 CSV Input GB.0 - com.amazonaws.services.s3.model.AmazonS3Exception: The request signature we calculated does not match the signature you provided. Check your key and signing method. (Service: Amazon S3; Status Code: 403; Error Code: SignatureDoesNotMatch; Request ID: 89BBHY8ZNHQJSFMP; S3 Extended Request ID: 8nnRCXTb8kayRgphj7/bCYD218xEEfStXPuS/GA6F9BD34dNX4cPQ3OiwJkQ0UKh0HqWgqvvbf8=), S3 Extended Request ID: 8nnRCXTb8kayRgphj7/bCYD218xEEfStXPuS/GA6F9BD34dNX4cPQ3OiwJkQ0UKh0HqWgqvvbf8=
2024/02/05 12:10:39 - S3 CSV Input GB.0 - The request signature we calculated does not match the signature you provided. Check your key and signing method. (Service: Amazon S3; Status Code: 403; Error Code: SignatureDoesNotMatch; Request ID: 89BBHY8ZNHQJSFMP; S3 Extended Request ID: 8nnRCXTb8kayRgphj7/bCYD218xEEfStXPuS/GA6F9BD34dNX4cPQ3OiwJkQ0UKh0HqWgqvvbf8=)

The filename is "export/cipDailyDetail/32f403/2024/01/28/appleTV+_Detail_GB_32f403_01282024_v1_1-1.csv".

I tried escaping the plus sign by adding \ before it, but got the same error. I also tried URL-encoding the whole key as "export%2FcipDailyDetail%2F32f403%2F2024%2F01%2F28%2FappleTV%2B_Detail_GB_32f403_01282024_v1_1-1.csv", but that gave "The specified key does not exist. (Service: Amazon S3; Status Code: 404; Error Code: NoSuchKey".
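To make the two encoding attempts easier to compare, here is a small Python sketch (standard library only, run outside PDI) that shows what each encoding does to this key. It only illustrates general URL-encoding rules. Whether PDI v9.4 or the AWS SDK mishandles "+" in this way is an assumption on my part and is not confirmed anywhere in this thread.

```python
from urllib.parse import quote, unquote, unquote_plus

key = "export/cipDailyDetail/32f403/2024/01/28/appleTV+_Detail_GB_32f403_01282024_v1_1-1.csv"

# Percent-encode the key for a URL path, keeping "/" as the path separator.
# Only the "+" changes (to %2B).
path_encoded = quote(key, safe="/")

# Encode "/" as well. This reproduces the string tried above; if a client
# treats it as a literal key, it names a different object than the original.
fully_encoded = quote(key, safe="")

# Form-style decoding turns "+" into a space, while plain percent-decoding
# leaves "+" untouched. A mismatch of this kind between client and server
# is one possible (unconfirmed) cause of a signature error.
form_decoded = unquote_plus(key)
path_decoded = unquote(key)

print(path_encoded)
print(fully_encoded)
print(form_decoded)
print(path_decoded == key)
```

If the 404 comes from the encoded string being used as a literal key, then encoding the key by hand in the step's filename field would not be expected to help.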

Attached are the screenshots for more details.

When a filename without "+" is passed, the step runs successfully.

Is there a way to read a filename containing a + (plus) sign? The S3 files are named after the product, so this is a valid filename, and it used to work in PDI v7.0.

Thomas Stutz

Consider renaming files that contain a plus sign before processing (using Pentaho, of course).
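If renaming inside Pentaho is not practical, the rename could also be done as a pre-processing step outside PDI. The sketch below is one hedged way to do that in Python. It assumes `s3_client` is a boto3 S3 client (for example `boto3.client("s3")`), since S3 has no rename operation and the object has to be copied and then deleted. The helper names `plus_free_key` and `rename_plus_keys` and the `_plus` replacement text are hypothetical choices for illustration, not part of Pentaho or AWS.

```python
def plus_free_key(key: str, replacement: str = "_plus") -> str:
    """Return the key with every "+" replaced (replacement text is an assumption)."""
    return key.replace("+", replacement)


def rename_plus_keys(s3_client, bucket: str, keys: list) -> dict:
    """Copy each key containing "+" to a plus-free key, then delete the original.

    s3_client is assumed to be a boto3 S3 client. Returns {old_key: new_key}
    for the objects that were renamed.
    """
    renamed = {}
    for key in keys:
        new_key = plus_free_key(key)
        if new_key == key:
            continue  # nothing to rename
        # S3 "rename" is a copy to the new key followed by a delete.
        s3_client.copy_object(
            Bucket=bucket,
            Key=new_key,
            CopySource={"Bucket": bucket, "Key": key},
        )
        s3_client.delete_object(Bucket=bucket, Key=key)
        renamed[key] = new_key
    return renamed
```

With the file from the question, the new key would end in "appleTV_plus_Detail_GB_32f403_01282024_v1_1-1.csv". Any downstream job that expects the original product name in the filename would need to account for that.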