Calling HCI REST API from Pentaho Kettle

Document created by YK Park (Employee) on Apr 2, 2018. Last modified by YK Park (Employee) on May 27, 2018.


In the CARDv3 project, we needed to get a file list from HCP-AW.

Since Pentaho was installed on a Linux machine, I could not connect to HCP-AW as a local directory.

In this example, I used HCI to get the index in JSON format and parsed the file list from the index data that HCI returned.

(I could probably have used the HCP-AW API instead of HCI.)

 

 

This document describes:

  • Configuring HCI self-signed certificate in Pentaho
  • Authentication API
  • Search API

 

Software version

  • Pentaho 7.1
  • HCI 1.1.4

 

Reference: Integrating to Content Intelligence REST APIs

 

1. Configuring certificate in Pentaho

Since HCI uses HTTPS, we need to register its certificate in the Java runtime used by PDI.

First, download the certificate from a browser, then register it in Java using "keytool".

 

(1) Download certificate

The certificate can be downloaded with any browser, such as Firefox, Internet Explorer, or Chrome.

Here, I used Firefox.

Name the certificate and save it to a local directory.

Then copy the certificate to the machine where Pentaho is installed.

 

After copying the certificate, register it in the cacerts keystore of the Java directory used by Pentaho:

keytool -import -alias <alias name> -keystore <java cacerts path> -file <certificate path>

For Windows:
>> keytool -import -alias hci -keystore C:\Pentaho\java\lib\security\cacerts -file my_hci_certificate.crt

For Linux:
>> keytool -import -alias hci -keystore /home/pentaho/Pentaho/java/lib/security/cacerts -file my_hci_certificate.crt

 

2. Using Authentication API in transformation

(1) Calling Authentication API

As a cURL example:

curl -ik -X POST https://<system-hostname>:8000/auth/oauth/ \
-d grant_type=password \
-d username=<your-username> \
-d password=<your-password> \
-d scope=* \
-d client_secret=hci-client \
-d client_id=hci-client \
-d realm=<security-realm-name-for-an-identity-provider-added-to-HCI>
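
The form fields that cURL sends with -d can also be assembled in code. The following is only a sketch in Python: the function name build_auth_params and the sample credentials are hypothetical, and the field names are copied from the cURL example above. It shows how a URL-encoded body such as the one held in the authParams field could be produced.

```python
from urllib.parse import urlencode, parse_qs


def build_auth_params(username, password, realm):
    """Return the URL-encoded form body for the HCI token request.

    Field names and the hci-client values come from the cURL example;
    the function itself is a hypothetical helper.
    """
    fields = {
        "grant_type": "password",
        "username": username,
        "password": password,
        "scope": "*",
        "client_secret": "hci-client",
        "client_id": "hci-client",
        "realm": realm,
    }
    # urlencode percent-encodes special characters such as "*" and "@"
    return urlencode(fields)


if __name__ == "__main__":
    # Sample values are placeholders, not real credentials
    print(build_auth_params("admin", "p@ss word", "Local"))
```

Encoding the values this way avoids problems when a password contains characters such as "&" or "=", which would otherwise break the form body.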

 

In the transformation, use the "REST client" step (in the Lookup category).

The URL field name ("authURL") is defined in the previous step ("Initialize Variables").

  • authURL: URL field
  • authParams: HTTP body
  • authRestResp: Response field that will store the result of REST API

 

(2) Parsing authentication response (JSON)

HCI returns a JSON-formatted response to the authentication request.

{ "access_token" : "eyJr287bjle..." }

To use the token in the transformation, we need to parse the JSON string as follows.

The previous step (REST client) stored the result string in the field "authRestResp".

Add a "JSON input" step and configure it to read from that field.

This step extracts the "access_token" value into the field "authToken".

Now we are ready to call the HCI API with the authentication token that we received.
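
For readers who want to check the parsing logic outside Pentaho, here is a minimal Python sketch of what the "JSON input" step does with the response. The function name extract_access_token is hypothetical; only the "access_token" key comes from the response shown above.

```python
import json


def extract_access_token(auth_rest_resp):
    """Parse the authentication response and return the access token.

    auth_rest_resp is the raw JSON string, as stored in the
    authRestResp field in the transformation.
    """
    data = json.loads(auth_rest_resp)
    token = data.get("access_token")
    if not token:
        # Fail early: a missing token usually means authentication failed
        raise ValueError("response does not contain an access_token")
    return token


if __name__ == "__main__":
    # The token value here is a made-up placeholder
    print(extract_access_token('{ "access_token" : "sample-token" }'))
```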

 

 

3. Querying Search to HCI

(1) Creating Authentication Header

Before calling any HCI REST API, we need to build a header that includes the access token obtained earlier from the authentication API.

 

As a cURL example, it would be written as:

curl -X GET --header "Accept:application/json" https://<system-hostname>:<admin-app-port>/api/admin/instances --header "Authorization: Bearer <your-access-token-here>"

 

In the Pentaho transformation implementation,

iniAuthHeader is "Bearer".

So the output of this step would be authHeader: "Bearer <access-token>"
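
The same concatenation can be sketched in Python. This assumes, following the cURL example above, that the header value is the scheme and the token separated by a single space; the function name build_auth_header is hypothetical.

```python
def build_auth_header(access_token, scheme="Bearer"):
    """Return the value for the HTTP Authorization header.

    scheme plays the role of iniAuthHeader, and access_token the role
    of authToken in the transformation.
    """
    # strip() guards against stray whitespace or newlines around the token
    return f"{scheme} {access_token.strip()}"


if __name__ == "__main__":
    # Placeholder token, for illustration only
    headers = {
        "Accept": "application/json",
        "Authorization": build_auth_header("sample-token"),
    }
    print(headers)
```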

 

(2) Search REST API

As a cURL example:

Request

curl -X POST --header 'Content-Type: application/json' --header 'Accept: application/json' -d '{
  "indexName": "Enron",
  "queryString": "*:*",
  "offset": 0,
  "itemsToReturn": 1
}' 'https://cluster110f-vm3:8888/api/search/query'

Response

{
  "indexName": "Enron",
  "results": [
    {
      "metadata": {
        "HCI_snippet": [
          "Rhonda,\n\nYou need to check with Genia as I have never handled the physical power agreement matters.\n\nSusan \n\n -----Original Message-----\nFrom: \tDenton, Rhonda L.  \nSent:\tTuesday, January 15, 2002 2:17 PM\nTo:\tBailey, Susan\nCc:\tHansen, Leslie\nSubject:\tSouthern Company Netting\n\nHere's Southern.  I never received a copy of the Virginia Electric Master Netting.  We do have netting within the EEI.\n << File: 96096123.pdf >> \n\n"
        ],
        "Content_Type": [
          "message/rfc822"
        ],
        "HCI_dataSourceUuid": [
          "f1c05be1-5947-41e1-a9f1-03a98f0fa036"
        ],
        "HCI_id": [
          "https://ns1.ten1.cluster27d.lab.archivas.com/rest/enron/maildir/bailey-s/deleted_items/25."
        ],
        "HCI_doc_version": [
          "2015-07-02T09:06:02-0400"
        ],
        "HCI_displayName": [
          "RE: Southern Company Netting"
        ],
        "HCI_URI": [
          "https://ns1.ten1.cluster27d.lab.archivas.com/rest/enron/maildir/bailey-s/deleted_items/25."
        ],
        "HCI_dataSourceName": [
          "HCP Enron"
        ]
      },
      "relevance": 1,
      "id": "https://ns1.ten1.cluster27d.lab.archivas.com/rest/enron/maildir/bailey-s/deleted_items/25.",
      "title": "RE: Southern Company Netting",
      "link": "https://ns1.ten1.cluster27d.lab.archivas.com/rest/enron/maildir/bailey-s/deleted_items/25."
    }
  ],
  "facets": [],
  "hitCount": 478
}
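
To get the file list from a response shaped like the one above, each entry in "results" can be reduced to a name and a URI. The sketch below is in Python and is not the Pentaho configuration itself. The function name list_files and the shortened sample data are hypothetical; the keys "results", "metadata", "HCI_displayName", "HCI_URI", "title" and "link" are taken from the response above.

```python
import json


def list_files(search_resp):
    """Return a list of {"name", "uri"} dicts from an HCI search response."""
    data = json.loads(search_resp)
    files = []
    for item in data.get("results", []):
        meta = item.get("metadata", {})
        # Metadata values are arrays, so take the first element if present
        names = meta.get("HCI_displayName", [])
        uris = meta.get("HCI_URI", [])
        files.append({
            "name": names[0] if names else item.get("title"),
            "uri": uris[0] if uris else item.get("link"),
        })
    return files


if __name__ == "__main__":
    # Shortened, made-up sample with the same structure as the response above
    sample = json.dumps({
        "indexName": "Enron",
        "results": [{
            "metadata": {
                "HCI_displayName": ["RE: Southern Company Netting"],
                "HCI_URI": ["https://example.invalid/rest/enron/25."],
            },
            "title": "RE: Southern Company Netting",
            "link": "https://example.invalid/rest/enron/25.",
        }],
        "hitCount": 1,
    })
    for entry in list_files(sample):
        print(entry["name"], entry["uri"])
```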

 

  • Body: searchParams

{"requests": [{
    "indexName": "<index_name>",
    "queryString": "",
    "offset": 0,
    "itemsToReturn": 1000,
    "facetRequests": [{
        "fieldName": "HCI_dataSourceName",
        "minCount": -1,
        "maxCount": -1,
        "displayName": "HCI_dataSourceName",
        "type": "string"
    }],
    "sortFields": [],
    "filterQueries": []
}]}
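
If the body needs to be generated rather than typed by hand, for example to change the offset between calls, it could be built as in the following Python sketch. The function name build_search_params and its default values are hypothetical; the field names and structure are copied from the searchParams body above.

```python
import json


def build_search_params(index_name, query_string="", offset=0, items_to_return=1000):
    """Return the searchParams JSON string for one search request."""
    request = {
        "indexName": index_name,
        "queryString": query_string,
        "offset": offset,
        "itemsToReturn": items_to_return,
        "facetRequests": [{
            "fieldName": "HCI_dataSourceName",
            "minCount": -1,
            "maxCount": -1,
            "displayName": "HCI_dataSourceName",
            "type": "string",
        }],
        "sortFields": [],
        "filterQueries": [],
    }
    # json.dumps handles quoting and escaping of the values
    return json.dumps({"requests": [request]})


if __name__ == "__main__":
    # "myIndex" is a placeholder index name
    print(build_search_params("myIndex", offset=1000))
```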
