Dynamic SQL Query in PostGres

kmainsworth · January 11, 2021, 9:05pm

Here is what I am trying to do:

Import a CSV that contains a list of views I want to query (some needing a where, some not)
Derive 2 tables from the list - one for the “needs where” and one for those that don’t
Iterate through the “doesn’t need where” table passing the view name as a parameter to a new module that should substitute in the view name into a dynamic query against the PostGres connector prefaced by a "select * from "
Iterate through the “needs where” passing the view name and where clause.as parameters into a dynamic query containing “select * from scheman_name.” & {view_nm} & “where_clause”

The first two steps where easy, but I get an error trying to create the module to query the PostGres views and I think it is because I am not selecting an object to execute against. I couldn’t find anything in the docs about the best way to approach.

Any suggestions?

Thanks,

Keith

dgudkov · January 11, 2021, 10:19pm

Hi Keith,

If you run a custom SQL query composed using parameters then it shouldn’t require an object to execute against. It’s simply an SQL statement to be executed by the database driver.

What kind of error do you get? Can you post a screenshot?

kmainsworth · January 12, 2021, 8:42pm

Hi Dmitry,

I got it working, but I am trying to figure out how to “chunk” the results - I have a table that has 10M rows in it and while I have upped the timeout limits to what we allow, it is still failing. I want to break the cursor down to grabbing 20K rows at a time. Is that possible with the PostGreSQL connector?

Thanks,

Keith

dgudkov · January 12, 2021, 9:16pm

Hi Keith,

the approach is to find an attribute by which the source dataset can be partitioned, then process the dataset partition by partition using iterations.

See this response for details: How to process Large dataset from Database

Let me know if you need any assistance with setting up iterations.

kmainsworth · January 19, 2021, 10:49pm

Hi Dmitry,

Just to be sure I have this here my use case and what I think it translates to in EasyMorph:

Use Case:
I need to extract a set of view data from my Database. Some of these views have very large row counts. I have identified all the views needing “chunking” based upon their row counts and performance.

Here is what I came up with:

A master project that processes the workbook that contains the list of views to extract.
a) Derive the following filtered tables
i) Views that do not need to be chunked
ii) Views that need to chunked
b) Iterate through views that don’t need to be chunked - Need “Non-chunk” project for iteration
c) Iterate through views that need to be chunked to get “partition_ID” values - Need “Get_ID” project
d) Iterate through each view’s “partition_ID” values to create partitioned extract files - Need “Process Partions” project
e) Iterate through views to combine partitioned extract files into one extract dataset - Need “Combine Partitions” project

Am I looking at this right? Can I use one project with multiple modules or is multiple projects the best approach?

Thanks,
Keith

dgudkov · January 19, 2021, 11:09pm

Hi Keith,

The approach looks good to me. If you don’t need the parts “Non-chunk”, “Get_ID” and other parts to be re-usable, then it would be more convenient to keep them as multiple modules in one project. Moving a part of workflow logic into a separate project only makes sense if it will be re-used by 2 or more other projects.