Pentaho

 View Only
Expand all | Collapse all

Performance issue with SQL query using Tableinput on Postgresql

This thread has been viewed 9 times
  • 1.  Performance issue with SQL query using Tableinput on Postgresql

    Posted 12-07-2023 05:17

    Hello team

    I am using Pentaho Table Input step to fetch data from a Postgresql database join. The query used takes 56 seconds to finish whereas the same query when executed using the pgadmin finish in less than 7 seconds. I collected the execution plan and could find that the execution plan running from both the tools is exactly the same except one index scan taking 40 more seconds to finish and that is the costliest DB action inside query execution.

    Has anyone faced such issue in the past?

    Any suggestions on connection configuration that will help to improve the performance?

    The query is already in optimized and I can see the issue is with other queries as well so there is no scope to further improve the execution plan to fix the problem.

    Your suggestions/recommendations on this topic is highly appreciated.

    Regards

    Sachin Tappe 



    ------------------------------
    Sachin Tappe
    Others
    BMC Software
    ------------------------------


  • 2.  RE: Performance issue with SQL query using Tableinput on Postgresql

    Posted 12-08-2023 11:44

    I want to clarify what you posted.  You are saying the same query run though pgAdmin SQL window and the table input step do not have the same execution plan?  I can honestly say I have not seen that.  The query is passed along through the jdbc driver, it should not be "different".

    Let's look for differences that could alter the plan and go from there.

    1. It is not just the input but the connection of the input to other buffering steps that changes performance.  Kettle will buffer 10k rows by default, so the input processing can be slowed by buffering downstream.  Disconnect the input from downstream processing and confirm.
    2. Parameters or bind variables.  Because these are unknown at parse time, the plan can be altered to account for all combinations.  When run though pgAdmin these are likely placed in the query thus altering the parse phase performed by the jdbc driver.
    3. Different users with different rights and/or using views.  Sometimes systems will be set up to filter user's DB access rights by hiding their tables behind views and/or synonyms.  Make sure you are using the same users and accessing the physical table in both cases.


    ------------------------------
    Stephen Donovan
    Digital Solutions Architect
    Hitachi Vantara
    ------------------------------