Information about EasyMorph

claudio.sofonio · June 20, 2017, 12:21pm

Good morning,
I’m trying the free version of EasyMorph with a view to buying the Professional version.
After finishing my first project, I have several questions about the use and the features of the software.

1 - Sandbox
How can I make the sandbox data refresh automatically?
I have a table with a filter that I use to send data to a sandbox, but when I change the filter, the sandbox data is not refreshed.
Only a manual “Send to Sandbox” refreshes the data.

2 - Mapping Fields
I’m trying to export data from my sandbox to an external database table.
This is the situation:

My sandbox has 10 columns (item, colour, supplier, customer, etc.).
The external table has about 40 columns.

To export the sandbox to the external table, I had to add every column of the destination table with
the “Calculate new columns” transformation, because the “Export to database” transformation only allows selecting a table
and does not provide column mapping.

Is there a better way to do it?
Is there, for example, a custom object that lets me choose the table columns into which the exported data is written?

3 - Mail
Is it possible to send an email when a procedure fails?

4 - Main procedure/Sub procedure
Is it possible to create a main procedure that calls two or more sub-procedures?
I would like each sub-procedure to start only if the previous sub-procedure completed successfully.

Thank you for any replies
Regards

dgudkov · June 20, 2017, 2:08pm

Hi @claudio.sofonio,

Sandboxes are meant to be static -- this is by design. If you need data to refresh automatically use derived tables (see Tutorial: Derived tables).

You can export only 10 columns into a database table even if it has 40 columns -- no need to create 30 more columns with empty values. Mapping is not needed either -- just rename columns so that column names in EasyMorph match the column names in the database table.
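As a rough illustration of why a 10-column dataset can go into a 40-column table, here is a minimal Python sketch of name-based inserting. It is not how EasyMorph works internally; it only shows the general SQL idea that an INSERT can name a subset of columns. The table `items`, its columns, and the helper `insert_by_column_name` are hypothetical, and the sketch assumes the unnamed columns are nullable or have defaults.

```python
import sqlite3

def insert_by_column_name(conn, table, rows):
    """Insert dict rows, naming only the columns present in the data.

    Columns of the target table that are not named are left to the
    database (NULL or a default), so no filler columns are needed.
    """
    if not rows:
        return 0
    columns = list(rows[0].keys())
    col_list = ", ".join(f'"{c}"' for c in columns)
    placeholders = ", ".join("?" for _ in columns)
    sql = f'INSERT INTO "{table}" ({col_list}) VALUES ({placeholders})'
    conn.executemany(sql, [tuple(r[c] for c in columns) for r in rows])
    return len(rows)

# Hypothetical target table with more columns than the dataset has.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE items (item TEXT, colour TEXT, supplier TEXT, notes TEXT)")

# The dataset has only two columns, named exactly like the table's columns.
rows = [{"item": "A1", "colour": "red"}, {"item": "B2", "colour": "blue"}]
insert_by_column_name(conn, "items", rows)

result = conn.execute(
    "SELECT item, colour, supplier, notes FROM items ORDER BY item"
).fetchall()
```

The matching is done purely by name, which is why renaming the dataset columns to the table's column names is enough in this sketch.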

There is no built-in feature for sending an email when a project fails. There is a workaround but it's a bit complicated and requires running EM projects from the command line, checking STDERR response and using a 3rd party emailing utility. Let me know if you need more information on it.
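The workaround could look roughly like the Python sketch below, which stands in for a batch script plus an emailing utility. Everything here is an assumption for illustration: the command to run a project is passed in by the caller (the actual EasyMorph command line is not shown in this thread and should be taken from its documentation), a non-zero exit code or any STDERR output is treated as a failure, and the SMTP host, port, and addresses are placeholders.

```python
import smtplib
import subprocess
import sys
from email.message import EmailMessage

def run_and_check(command):
    """Run a command and return (ok, stderr_text).

    Assumption: a non-zero exit code or any text on STDERR means failure.
    """
    proc = subprocess.run(command, capture_output=True, text=True)
    stderr_text = proc.stderr.strip()
    ok = proc.returncode == 0 and not stderr_text
    return ok, stderr_text

def build_alert(sender, recipient, command, stderr_text):
    """Build a plain-text alert email describing the failed run."""
    msg = EmailMessage()
    msg["From"] = sender
    msg["To"] = recipient
    msg["Subject"] = "EasyMorph project failed"
    msg.set_content(f"Command: {' '.join(command)}\n\nSTDERR:\n{stderr_text}")
    return msg

def send_alert(msg, host="localhost", port=25):
    """Send the alert through an SMTP server (host and port are placeholders)."""
    with smtplib.SMTP(host, port) as smtp:
        smtp.send_message(msg)

# Stand-in for a failing project run: a Python one-liner that writes to STDERR.
demo_command = [sys.executable, "-c", "import sys; sys.stderr.write('boom'); sys.exit(1)"]
ok, err = run_and_check(demo_command)
if not ok:
    alert = build_alert("etl@example.com", "admin@example.com", demo_command, err)
    # send_alert(alert)  # uncomment once a real SMTP server is configured
```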

Sub-procedures are called using the Call or Iterate transformations. Typically, you can simply place one Call right after another to call two different subroutines. In this case, if the 1st subroutine fails, none of the subsequent calls are executed.

If, for some reason, subroutines are called from different (and possibly independent) branches, then the calls can be put in order using the Synchronize transformation. Read more about the transformation here: Blog: Using Synchronize transformation.
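The "one Call right after another" behaviour described above can be pictured with a small Python sketch. It is only an analogy under stated assumptions: each subroutine is modelled as a plain function that raises an exception on failure, and `run_in_order` and the step names are hypothetical.

```python
def run_in_order(steps):
    """Run (name, callable) pairs in order; stop at the first one that raises.

    Returns the names that completed and, if a step failed,
    a (name, error message) pair for it; otherwise None.
    """
    completed = []
    for name, step in steps:
        try:
            step()
        except Exception as exc:
            # Later steps are never started once one step fails.
            return completed, (name, str(exc))
        completed.append(name)
    return completed, None

log = []

def load():
    log.append("load")

def transform():
    raise RuntimeError("bad input")

def export():
    log.append("export")

completed, failure = run_in_order(
    [("load", load), ("transform", transform), ("export", export)]
)
```

Here `export` never runs because `transform` failed before it, which mirrors the ordering guarantee of consecutive calls.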

claudio.sofonio · June 22, 2017, 12:53pm

Thank you, Dmitry, for the information.
Everything is clear!

Regards

claudio.sofonio · February 7, 2018, 9:22am

Hi Dmitry,
Recently, my company bought EasyMorph Professional and now I’m using it for a complex process.
I’m contacting you because it’s not clear to me when it’s better to use sandboxes instead of derived tables.

I’m asking about this because I usually work with the merge table transformation and I need the data to be up to date.

I read your post on this topic, where you wrote that sandboxes are static by design, but a derived table can be used for the same purpose too, with the added benefit that its data is kept up to date.

Thank you

dgudkov · February 7, 2018, 2:39pm

Hi Claudio,

Sandboxes
Sandboxes are typically used for disconnected one-off calculations with a transformation result, typically for data profiling or ad hoc data analysis. Imagine the following scenario:

At some point of your transformation project you obtained a list of customers. You are curious if there are customers mentioned two or more times in the list. You don’t need to find such customers on every transformation run, it’s just a one time data quality check. Therefore, you press Ctrl+B to create a temporary sandbox with the list, and then add the “Keep duplicates” transformation to the sandbox. The result is empty, no rows. Thus you understand that the list has no duplicates, as expected. You remove the sandbox and continue designing the project.
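For readers who think in code, the duplicate check in this scenario is roughly equivalent to the Python sketch below. It assumes that “Keep duplicates” keeps every row whose value occurs more than once; the function name `keep_duplicates` and the sample customer names are made up for illustration.

```python
from collections import Counter

def keep_duplicates(values):
    """Return only the values that occur more than once, in original order."""
    counts = Counter(values)
    return [v for v in values if counts[v] > 1]

customers = ["Acme", "Globex", "Initech"]
duplicates = keep_duplicates(customers)

# An empty result means the list has no repeated customers.
has_duplicates = bool(duplicates)
```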

Think of a sandbox as a “cutting board” that you can use to examine a dataset – e.g. does it have duplicates? Are there values of a wrong type? What is the total sum/count? Once you have your questions answered you remove the sandbox.

Sandboxes can also be used for one-time data exports. Again, you take a transformation result from the middle of a transformation chain, send it to a sandbox, then add, for instance, the “Export to Excel” transformation. After exporting is complete you remove the sandbox.

Sandboxes are disconnected, and therefore allow doing one-time operations with data without changing the main transformation chain.

All data in sandboxes is temporary. Once you close a project, all its sandboxes become empty. Data is not saved in sandboxes.

Derived tables
Derived tables are for a different purpose. They are used to arrange a permanent (i.e. not one-time) non-linear calculation logic, where a dataset should be used for two or more different kinds of calculations. Our tutorial article about derived tables mentioned above has a few examples.

Recently, derived tables got another use. Now they can be used to arrange a conditional calculation. See this blog article for more details: Blog: Conditional workflows in EasyMorph.

Summary
While sandboxes and derived tables might appear similar they have very different use cases. Typically, you would use derived tables for arranging a permanent, non-linear calculation logic, while occasionally sending data to temporary sandboxes for data quality checks or examining datasets.

claudio.sofonio · February 8, 2018, 8:00am

Thank you, Dmitry, for your availability and your knowledge.

Now the differences are clear to me.
Thank you for everything.
Claudio

dgudkov · February 8, 2018, 5:17pm

You’re welcome!

PwrSrg · October 11, 2018, 8:24pm

I found this thread while searching for a way to MAP disparate tables (columns) for export.

Your suggested workaround (because you definitely can't call it a solution) of renaming every field is not only laughable, but is completely impractical and counterproductive. It is not a solution, it's a hack. Especially for software in the ETL space in which one of its main purposes is easily mapping table data. I tried it on one of our smallest tables of only 21 columns and it was a NIGHTMARE. I could not imagine having to map hundreds of tables without a real mapping function.

Your software is very smart, clean and intuitive, but mapping is absolutely "needed". Without even a basic dedicated table mapping function, it is practically unusable. What I find strange is that you already have nearly the exact same functionality built into many other functions, mainly "Merge another table".

Also, just to verify - does "Export to database" really only export the first 10 columns of a table instead of all of them?

-Sergio

dgudkov · October 12, 2018, 4:42pm

@PwrSrg,

You have a point, and we will think about how to implement proper mapping in EasyMorph.

The "Export to database" action exports all columns of a table, not just the first 10.

PS. I appreciate your valuable feedback. It would be nice if you could tone down your comments a bit. Thanks.

dgudkov · January 16, 2019, 1:23am

@PwrSrg,

Column-to-field mapping is now supported in EasyMorph starting from v3.9.2. You can download it from our website: https://easymorph.com/download.html. Note that it’s not available in the free edition.

Thanks again for your feedback.

etl-mapping

Romain_Dunand · April 6, 2021, 1:35pm

It would be great to have “auto-mapping” of existing fields in the explicit mode,
or an option in the “automatic” mode to ignore missing fields.

Use case:
A third party provides a source file with predefined fields.
You create a batch to process and import the file into an SQL DB.
The third party adds a new column to the source (which you don’t need).
→ Your process stops working.
→ You need to map all fields manually to work around this issue (or add fields you don’t need).

The best solution, IMO, would be an option to allow automatic mapping that ignores missing columns/fields
(kept optional, as the current behavior might be needed in other use cases).

dgudkov · September 8, 2023, 5:20am

We will add the “auto-mapping” button for the explicit mode (also mentioned in this topic: Export to Database auto mapping).

However, the “Ignore missing fields” option doesn’t seem necessary. You can simply add the “Select/remove columns” action (before exporting to DB), and in the action click “Select all” to keep only the columns you really want to export.

The “Ignore missing fields” option can be tricky because its behavior is not transparent and can potentially lead to errors.
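The idea of selecting columns before exporting can be sketched in Python as follows. This is an illustration of the principle only, not EasyMorph's implementation: `select_columns` and the sample data are hypothetical. Extra source columns are dropped, while a column that is expected but absent raises an explicit error instead of being silently ignored.

```python
def select_columns(rows, wanted):
    """Keep only the wanted columns of each row.

    Extra columns in the source are dropped. A wanted column that is
    absent raises KeyError, so the problem is visible rather than hidden.
    """
    selected = []
    for i, row in enumerate(rows):
        missing = [c for c in wanted if c not in row]
        if missing:
            raise KeyError(f"row {i} is missing columns: {missing}")
        selected.append({c: row[c] for c in wanted})
    return selected

# The source gained a new "comment" column that the target table lacks.
source = [{"item": "A1", "colour": "red", "comment": "new column"}]
export_rows = select_columns(source, ["item", "colour"])
```

With the selection step in place, a new column added to the source no longer reaches the export, so the export keeps working without any silent skipping of fields.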