Import a CSV that contains a list of views I want to query (some needing a where, some not)
Derive 2 tables from the list - one for the “needs where” and one for those that don’t
Iterate through the “doesn’t need where” table passing the view name as a parameter to a new module that should substitute in the view name into a dynamic query against the PostGres connector prefaced by a "select * from "
Iterate through the “needs where” passing the view name and where clause.as parameters into a dynamic query containing “select * from scheman_name.” & {view_nm} & “where_clause”
The first two steps where easy, but I get an error trying to create the module to query the PostGres views and I think it is because I am not selecting an object to execute against. I couldn’t find anything in the docs about the best way to approach.
If you run a custom SQL query composed using parameters then it shouldn’t require an object to execute against. It’s simply an SQL statement to be executed by the database driver.
What kind of error do you get? Can you post a screenshot?
I got it working, but I am trying to figure out how to “chunk” the results - I have a table that has 10M rows in it and while I have upped the timeout limits to what we allow, it is still failing. I want to break the cursor down to grabbing 20K rows at a time. Is that possible with the PostGreSQL connector?
Just to be sure I have this here my use case and what I think it translates to in EasyMorph:
Use Case:
I need to extract a set of view data from my Database. Some of these views have very large row counts. I have identified all the views needing “chunking” based upon their row counts and performance.
Here is what I came up with:
A master project that processes the workbook that contains the list of views to extract.
a) Derive the following filtered tables
i) Views that do not need to be chunked
ii) Views that need to chunked
b) Iterate through views that don’t need to be chunked - Need “Non-chunk” project for iteration
c) Iterate through views that need to be chunked to get “partition_ID” values - Need “Get_ID” project
d) Iterate through each view’s “partition_ID” values to create partitioned extract files - Need “Process Partions” project
e) Iterate through views to combine partitioned extract files into one extract dataset - Need “Combine Partitions” project
Am I looking at this right? Can I use one project with multiple modules or is multiple projects the best approach?
The approach looks good to me. If you don’t need the parts “Non-chunk”, “Get_ID” and other parts to be re-usable, then it would be more convenient to keep them as multiple modules in one project. Moving a part of workflow logic into a separate project only makes sense if it will be re-used by 2 or more other projects.