Pentaho


S3 File Output step: connect timed out (us-west-2)

Vince Popplewell posted 05-10-2022 12:02

The Pentaho S3 File Output step is picking up Region = us-west-2 by default.

How do we force it to read our config instead of defaulting to Region = us-west-2, please?

We are using an instance profile.

The environment's ~/.aws/config is set to region = eu-west-2.
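
For reference, if the step defers to the AWS SDK for Java's standard region settings, we could try forcing the region in the shell script that launches the job. A sketch, assuming a default PDI install path and the v1 SDK's documented overrides (our real paths differ):

# Force the region before Kettle starts; AWS_REGION and the aws.region
# system property are both standard v1 SDK region overrides.
export AWS_REGION=eu-west-2
export PENTAHO_DI_JAVA_OPTIONS="-Daws.region=eu-west-2"
/opt/pentaho/data-integration/kitchen.sh -file=/path/to/s3_test.kjb -level=Debug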

Andrew Cave
Hi Vince

So the file ~/.aws/config reads

[default]
region = eu-west-2

and that file is in the home directory of the account that is running the Pentaho instance?

Try running the transformation with Debug logging on and see where Kettle is trying to read the config from.
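
Something like this, with the paths as placeholders for your own:

# Run the transformation with Debug logging and keep the output, then
# search it for the SDK's region and credential lookups.
cd /opt/pentaho/data-integration
./pan.sh -file=/path/to/s3_test.ktr -level=Debug 2>&1 | tee /tmp/s3_debug.log
grep -iE 'region|credential|amazonaws' /tmp/s3_debug.log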
Vince Popplewell
Yes, that file is in root's home directory (I disguised the hostname as pdi-xxx):

[root@pdi-xxx ~]# cd .aws
[root@pdi-xx .aws]# ls
config
[root@pdi-xxx .aws]# more config
[default]
region = eu-west-2
[root@xx .aws]#

I have -level=Debug set and I am getting the same timeout error:
Caused by: org.apache.http.conn.ConnectTimeoutException: Connect to dat-nonlive-xxx-dev-logs.s3.us-west-2.amazonaws.com:443 [dat-nonlive-xxx-logs.s3.us-west-2.amazonaws.com/52.218.229.1] failed: connect timed out
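
Since it is a connect timeout rather than an access-denied error, I can also probe both regional S3 endpoints directly from the box; a firewall that only allows eu-west-2 would explain the timeout:

# Check raw HTTPS reachability of each S3 regional endpoint.
curl -sS --connect-timeout 5 -o /dev/null -w '%{http_code}\n' https://s3.us-west-2.amazonaws.com/
curl -sS --connect-timeout 5 -o /dev/null -w '%{http_code}\n' https://s3.eu-west-2.amazonaws.com/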

Does the config need more parameters, such as access_key or secret_key? These have been set up in the instance profile.


Stephen Donovan

I believe you are on the right track.  Kettle simply uses the Java SDK from Amazon, so the authentication options are cycled through in the order specified here:

https://docs.aws.amazon.com/sdk-for-java/v1/developer-guide/credentials.html
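
From memory, the v1 chain checks environment variables, then Java system properties, then the shared credentials file, and finally the container/instance profile. You could check which of those sources actually exist on the box with something like this (install path is a guess):

# Inspect each credential source the v1 SDK chain consults, in order.
env | grep -E '^AWS_'                                  # 1. environment variables
grep -e '-Daws\.' /opt/pentaho/data-integration/*.sh   # 2. JVM system properties in the launch scripts
cat ~/.aws/credentials 2>/dev/null                     # 3. shared credentials file
curl -s http://169.254.169.254/latest/meta-data/iam/security-credentials/   # 4. instance-profile role name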

I am not as familiar with CE, but double-check your VFS connection setup.  The region selection is in that dialog and may be overriding the credentials file.

Attachment: image.png (45 KB)
Vince Popplewell
Thank you for your reply, Stephen.

     What is VFS, please?
Stephen Donovan

VFS is Virtual File System.  It allows you to reference S3, Hadoop, and others in a generic way in the step dialogs, and it is enabled in many of the input and output steps.  S3, for example, would be configured as a VFS connection and then referenced as s3://.  Forgive me, as I have always set it up this way; I would have to dig deeper to see what other path you might be trying.

VFS configurations are in the same View tab as the database connections.  Right-click and choose New...
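
Once the connection exists, the step's file field just takes a URL against it, for example (bucket and path made up):

# A VFS-style URL as typed into a step's file/folder field.
s3://my-bucket/path/to/output_file.txt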

Vince Popplewell
Hello Stephen,

     Thank you for replying, but it is not possible to connect through VFS at our installation. Spoon will not connect to AWS S3, so we have to deploy the code to our EC2 Linux dev box and run the job from a shell script.
     I have a simple job and transformation with a Generate Rows step and an S3 File Output step pointing at: s3n://s3n/dat-xxx-dev-logs/s3_xxx_test_file_vp
     When I execute the script, the job fails with: Caused by: org.apache.http.conn.ConnectTimeoutException: Connect to dat-nonlive-xxx-dev-logs.s3.us-west-2.amazonaws.com:443 [dat-nonlive-xxx-logs.s3.us-west-2.amazonaws.com/52.218.229.1] failed: connect timed out

     The job is supposed to pick up the AWS region and credentials from the .aws profile, but it does not. The details are in this post.

     When I run aws configure list, it returns the expected values, but Pentaho ignores them:

aws configure list
      Name                    Value             Type    Location
      ----                    -----             ----    --------
   profile                <not set>             None    None
access_key     ****************MARE         iam-role
secret_key     ****************zsgf         iam-role
    region                eu-west-2      config-file    ~/.aws/config
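
The script runs as root, but I can double-check that the PDI JVM really sees root's HOME, since the SDK looks for the config under the home directory of whatever account owns the process:

# Confirm which user (and therefore which ~/.aws/config) the PDI JVM uses.
pgrep -a java                        # locate the PDI process
ps -o user= -p "$(pgrep -n java)"    # user owning the newest java process
ls -l "$HOME/.aws/config"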

Regards, Vince
Vince Popplewell
Please reply to the new thread used for this discussion.
The old thread is broken!