Hi guys,
I need to process a list of EasyMorph datasets (.dset files) stored in a folder to create related DB tables in a database.
Thus, I do get the list of files from the folder
Then iterate this list
The second module does create the table (delete if exist) with Database command (table name passed as a parameter) + populate the table with data from the appropriate.dset file, with Export to database action.
However, on second iteration, I always get errors about missing columns. It looks like Database command action does not refresh target table schema from the EasyMorph table schema (loaded .dset), and keeps in memory schema from previous execution. I tried as well to move my module in a separate project, same result.
Any idea what I am doing wrong ? I would like to avoid to create those tables "one by one"
Thanks and regards.
David
Hi @David_Juras,
Is the "Column mapping" option of the "Export to database" action set to "Automatic"? If so, which database you are exporting your data to?
Hi @andrew.rybka
Thanks for your answer. Yes it is set to "Automatic". We are using MariaDB for this project.
However I did notice that my connector is setup as Type = MySQL. Will test ASAP with proper MariaDB connector. But as they are very close DBs, maybe that won't change anything.
Will update you asap.
@David_Juras, switching to another connector won't help because the main issue here is the "Create table" command of the "Database command" action. That command requires the columns for the created table to be explicitly defined.
As a workaround, you can create a project that will generate "CREATE TABLE" command based on the input dataset using the "Table metadata" action. It's actually pretty straightforward. The only trick here is to distinguish between numeric and Date/DateTime columns. Does you data have both numeric and Date/DateTime columns? If so, does the data in those columns include only recent dates, or does it span multiple decades or centuries?
@andrew.rybka OK, that's what I thought.
Yes, my data contains both numeric and date/date time columns.
Dates are starting from 2007.
Hi @andrew.rybka,
Can you please elaborate "As a workaround, you can create a project that will generate "CREATE TABLE" command based on the input dataset using the "Table metadata" action. It's actually pretty straightforward". I am not sure to understand the workaround. Thanks.
Hi @David_Juras,
I meant creating a project that generates 'CREATE TABLE' SQL commands based on the provided dataset. These generated commands can then be executed via the 'Custom Command' command of the 'Database Command' action.
Here’s an example of such a project. While the generalized version turned out to be less straightforward than I initially anticipated, I believe it’s configurable enough for you to use without requiring modifications. The project also contains the Test
module with a usage example.
GenerateCreateTableCommand.zip (77.4 KB)
The following parameters should be provided to the project:
- TableName - quoted or unquoted name of the created table. When the {Delimiter} parameter is provided and the {TableName} contains an unquoted name, the name will be automatically quoted by the project
- Delimiter - should be provided if you want to generate a command with quoted table and column names
- MinDateThreshold - Used to distinguish between numeric and Date/DateTime columns. Should be set to the smallest possible date in the exported data
- MaxDateThreshold - Used to distinguish between numeric and Date/DateTime columns. Should be set to the greatest possible date in the exported data. When set to an empty value, the current date will be used
- IntegerType, NonIntegerType, DateType, DateTimeType, BooleanType, TextType - database column types for corresponding detected column types (see below)
The project detects column types as follows:
- Column is detected as Datetime when either of the following is true:
- The column has a date or a datetime format and has at least a single datetime value (non-integer number)
- The column has only numeric and empty values, at least a single datetime value (non-integer number) and all the non-empty values are between {MinDateThreshold} and {MaxDateThreshold} (or the current date if the parameter is empty)
- Column is detected as Date when either of the following is true:
- The column has a date or a datetime format, at least a single date value (integer number), and doesn't have datetime values
- The column has only numeric and empty values, at least a single date value (integer number), doesn't have datetime values and all the non-empty values are between {MinDateThreshold} and {MaxDateThreshold} (or the current date if the parameter is empty)
- Column is detected as Non-integer when all of the following is true:
- it wasn't detected as a date or datetime column
- it contains only numeric and empty values
- it contains at least a single non-integer number
- Column is detected as Integer when all of the following is true:
- it wasn't detected as a date or datetime column
- it contains only integer and empty values
- Column is detected as Boolean when it contains only boolean and empty values
- Column is detected as Text when it contains at least a single text value
If a column type can't be detected, the project will fail.
Feel free to ask me any questions regarding this project and the export workflow.
Hi @andrew.rybka, thanks so much !
Doing the job so far with very little tweaks for my specific cases. Once again, thanks a lot.
I will let you know if I have some issues around the topic in a future usage.
Any plans in the future to handle this "iterate table creation with database command" directly with some additional configuration in the Action itself ?
Hi @David_Juras,
You are welcome! I'm glad to hear that it's working for you.
As for incorporating this approach into the action itself, we have some concerns about whether this solution will meet the needs of a wide range of users. For now, we plan to wait for your feedback, as well as input from other customers.