Pentaho


 Pentaho Data Integration carte - custom API read JSON body

Mini Malne posted 08-26-2022 14:29

Hello community,

I am digging into Carte and the creation of a custom API using Pentaho Data Integration. It works like a charm when I use GET and define the transformation/job parameters in the URL. Unfortunately, I need to read JSON as input (the POST body), process it, and then return the transformed JSON. Is there any way to do this?

Thank you very much for any help.

With kind regards
Jiri

Juan Sierra Pons
Hi @Mini Malne

Can you provide a more detailed example, or more context on the challenge you are facing?

I don't fully understand your problem here.

Regards

Antonio Petrella
Hello, 
Some time ago I investigated sending POST requests through Carte, without success, and eventually fell back on a long list of GET parameters. If you find a way to use Carte with POST requests, please let us know.

As for the JSON return data, you might want to try the JSON Output step configured with the "Pass output to servlet" checkbox checked. By doing so, the Carte call returns a JSON object crafted in the called transformation.

Best regards,

Andrew Cave
Hi Mini

I describe how to use a CDE form to send JSON to a Transformation here

https://bizcubed.com/ctools-using-a-pentaho-cde-form-as-an-input-form-part-1

and  here

https://bizcubed.com/ctools-using-a-pentaho-cde-form-as-an-input-form-part-2

regards

Mini Malne
I think I found the solution. Thanks to @Andrew Cave I realized it is possible. The call is the same as one sent from any standard tool (like Postman).
I had almost the same transformation as the one described in https://bizcubed.com/ctools-using-a-pentaho-cde-form-as-an-input-form-part-2. The important part is something that is not mentioned in Part 1.
You have to use Content-Type: application/x-www-form-urlencoded and set the body to p_json={"data":"value"}. The value {"data":"value"} is then set as the parameter p_json in the transformation. My transformation simply returns the same JSON that was sent in the body, so the result is {"data":"value"}, without the leading "p_json=".

The steps are as follows:
1) Start Carte with your preferred setup.
2) Send a POST to http://my_carte:my_port/kettle/executeTrans/?trans=/home/kettle/example.ktr with Content-Type: application/x-www-form-urlencoded and with p_json={"data":"value"} as the body.

The name of the parameter and the Content-Type are crucial.
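
A minimal curl sketch of the call described in the steps above. The host, port, transformation path, and the CARTE_USER / CARTE_PASSWORD environment variables are placeholders assumed for illustration; p_json has to match a parameter defined in your transformation.

```shell
#!/bin/sh
# Assumed values for illustration; replace with your own Carte host, port, and path.
CARTE_BASE='http://localhost:8097'
TRANS_PATH='/home/kettle/example.ktr'
JSON_PAYLOAD='{"data":"value"}'

# Build the endpoint URL. Note the slash before the question mark.
carte_url() {
  printf '%s/kettle/executeTrans/?trans=%s' "$CARTE_BASE" "$TRANS_PATH"
}

# POST the payload as the form field p_json.
# --data-urlencode percent-encodes the value and makes curl send a POST with
# Content-Type: application/x-www-form-urlencoded.
post_json() {
  curl --silent --show-error \
    --user "${CARTE_USER}:${CARTE_PASSWORD}" \
    --data-urlencode "p_json=${JSON_PAYLOAD}" \
    "$(carte_url)"
}
```

With the credentials exported, calling post_json should print whatever the transformation passes back to the servlet.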

Hope it will help some other users as well.

Jiri
​​
Antonio Petrella
Hello,

I have a follow-up question on this topic.

Is p_json sent as the body of the POST request, or rather as another query parameter of the POST request (just like trans, rep or level)?

In other words, if you had to use curl to send the Carte call, would it be something like:

curl --request POST \
--url 'http://<hostname>:<port>/kettle/executeTrans?trans=<transformation.ktr>&level=Debug&rep=&p_json=%7B%22data%22%3A%22value%22%7D' \
--header 'Authorization: .....' \
--header 'Content-Type: application/x-www-form-urlencoded'

or rather:
curl --request POST \
--url 'http://<hostname>:<port>/kettle/executeTrans?trans=<transformation.ktr>&level=Debug&rep=' \
--header 'Authorization: ....' \
--header 'Content-Type: application/x-www-form-urlencoded' \
--data 'p_json={"data":"value"}'


The second request does not work for me, i.e. the p_json is not being read by the transformation. The first curl works but doesn't look that different from a GET request, given that everything is sent as a query parameter and the payload is not sent as a POST body.

Did you manage to use the POST body for your request?

Thank you!

Andrew Cave

Yeah.  You can send multiple parameters in the body too.   If they are named in the transformation parameters they'll get passed in.

curl --request POST \
--url https://host:port/pentaho/kettle/executeTrans/ \
--header 'Authorization: Basic whatever' \
--header 'Content-Type: application/x-www-form-urlencoded' \
--data '{"P_JSON":"{\"data\":[{\"field1\":\"880\"}]}","P_JOB_GUID":"d6eugsx1ilf-998202c8582c7-pc02urnb1ff"}'

Antonio Petrella
Hello,

P_JSON is named in the transformation's parameter list; however, the transformation still fails to receive the JSON body.

By inspecting the browser's Network tab after sending the call, I've also noticed that the original POST request gets converted to a GET, and therefore I lose the JSON body.

I'm using the plain Carte webservice, not the Pentaho server; I wonder if this makes a difference.

Andrew Cave
I am puzzled as to how you are making the call. Seeing the HTTP method change in the Inspect window is weird.

How exactly are you making the call to Pentaho, and on what page are you reading the request contents? As an aside, loggers by default don't log the body of a POST message.

Antonio Petrella
Hi Andrew,
  These are the steps taken in my test bench:

  1. Start the carte webservice by running carte.sh localhost 8097
  2. Create a very simple test.ktr transformation, with P_JSON in the transformation parameters, a Get Variable step to fetch ${P_JSON} and a Write to log step to print the value of that variable in the log file (see the transformation test.ktr attached to this message).
  3. Call the Carte executeTrans API through the Insomnia client configured for a POST request with a P_JSON body set to {"data":"value"}, with query parameters for the transformation name, repository name and log level. Set Basic Authentication and the Content-Type header to application/x-www-form-urlencoded.

The screenshot (see sshot.png attached to this message) shows the timeline recorded by the Insomnia client after calling the executeTrans API. In it I can read that the initial POST request (first yellow block from the top) is correctly formed, with the query parameters, the authentication, the header and the POST data (line in purple).

Immediately after the first POST request I receive an HTTP/1.1 302 Found response, with the comment "Switch from POST to GET" and a redirection to the URL in the Location response header. As expected from the redirection, a GET request follows with the same query parameters but without the POST data. Eventually the Carte webservice returns an HTTP/1.1 200 OK. At this point the transformation has finished echoing its parameters to the log file, with empty POST data (as it was in fact a GET request).

To exclude Insomnia from the test bench, I then created a simple local web page with just a form posting data to the same executeTrans API (this time the POST data is the string "test post data"). Submitting the form gives the same HTTP 302 response with redirection from POST to GET (see sshot2.png for the timing of this request). In the Firefox web developer tools I can inspect the POST data under the Request tab (look for the Form data in the lower right corner of sshot2.png).

It seems to me that the Carte webservice is configured to redirect POST to GET, but I might have missed something, so thank you for your comments.
Attachments
sshot.png 178 KB
test.ktr 13 KB
sshot2.png 35 KB

Alan Hope

For Antonio Petrella 

http://<hostname>:<port>/kettle/executeTrans?trans=

The URL above will return a 302. You need to try:

http://<hostname>:<port>/kettle/executeTrans/?trans=

Add one more forward slash before the '?'.
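
A small sketch for checking the two URL forms from the command line. The host, port, test.ktr name, and the CARTE_USER / CARTE_PASSWORD variables are assumptions for illustration; add_slash and status_of are hypothetical helper names.

```shell
#!/bin/sh
# Assumed host and port; replace with your own.
CARTE_BASE='http://localhost:8097'

# Insert a slash before the first "?" unless the URL already contains "/?".
add_slash() {
  case "$1" in
    */\?*) printf '%s' "$1" ;;
    *\?*)  printf '%s/?%s' "${1%%\?*}" "${1#*\?}" ;;
    *)     printf '%s' "$1" ;;
  esac
}

# Print only the HTTP status code of a POST to the given URL.
# Redirects are not followed, so a 302 shows up as such.
status_of() {
  curl --silent --output /dev/null --write-out '%{http_code}' \
    --user "${CARTE_USER}:${CARTE_PASSWORD}" \
    --data-urlencode 'P_JSON={"data":"value"}' \
    "$1"
}
```

Calling status_of on "${CARTE_BASE}/kettle/executeTrans?trans=test.ktr" and then on the add_slash version of the same URL makes the difference visible: going by this thread, the first form is the one answered with a 302.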
Juan Sierra Pons
@Alan Hope

The -L option can be used to make curl follow 302 redirections:

-L, --location
(HTTP) If the server reports that the requested page has moved to a different location (indicated with a Location: header and a 3XX response code), this option will make curl redo the request on the new place. If used together with --include or -I, --head, headers from all requested pages will be shown. When authentication is used, curl only sends its credentials to the initial host. If a redirect takes curl to a different host, it will not be able to intercept the user+password. See also --location-trusted on how to change this. You can limit the amount of redirects to follow by using the --max-redirs option.

When curl follows a redirect and if the request is a POST, it will send the following request with a GET if the HTTP response was 301, 302, or 303. If the response code was any other 3xx
code, curl will re-send the following request using the same unmodified method.

You can tell curl to not change POST requests to GET after a 30x response by using the dedicated options for that: --post301, --post302 and --post303.

The method set with --request overrides the method curl would otherwise select to use.

Example:
curl -L https://example.com

See also --resolve and --alt-svc.
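
Applied to the Carte call, the options quoted above could be combined roughly as follows. This is an untested sketch: post_keep_method is a hypothetical helper name, CARTE_USER / CARTE_PASSWORD are assumed environment variables, and P_JSON is assumed to be a parameter defined in the transformation.

```shell
#!/bin/sh
# $1 = executeTrans URL, $2 = JSON payload.
# --location follows the 302, and --post302 asks curl to keep the POST method
# on that redirect instead of switching to GET.
post_keep_method() {
  curl --silent --show-error \
    --location --post302 \
    --user "${CARTE_USER}:${CARTE_PASSWORD}" \
    --data-urlencode "P_JSON=$2" \
    "$1"
}
```

Usage would look like: post_keep_method 'http://localhost:8097/kettle/executeTrans?trans=test.ktr' '{"data":"value"}' (host, port and file name are placeholders).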
Antonio Petrella
@Alan Hope

Excellent, spot on!
By adding the forward slash before the question mark I don't get the redirection and I can read the post data from the target transformation.

Many thanks!