Dynamic Transformation v2

I think this is a feature request, because I haven't seen anything in the docs or in the community that indicates it's possible within the current version of easymorph. It's basically community topic Dynamic Transformation on steroids.

Use case
We need to import tables from SAP DaRT. They are many (200) and they are large (GB). DaRT files come in pairs: tab-separated data files and XML metadata files.

Required functionality
I need to apply the correct data type per column. For select data types, some pre-processing is required, e.g. for dates, 00.00.0000 needs to be replaced with NULL and for years 0000 needs to be replaced with NULL. This pre-processing and the column data type conversion needs to be performed dynamically (I'm not going to manually define the conversions for 200 tables, especially because the information is already available).

Workaround attempt 1 (9.4 KB)

  • parse metadata XML
  • derive metadata table per data type (row value filter)
  • derive data table with only columns of a specific data type (Select columns by lookup)
  • apply data conversion logic to these groups of columns
  • merge them back together

Issues:

  • the merge requires a key for linking the tables back together
  • adding such a key introduces a column that should be exempt from the data conversion. There is no way to specify that logic should apply to "all columns except ..."

Workaround hack 2 (9.9 KB)

  • parse metadata XML
  • reduce metadata to two columns: column name, and combined metadata spec (e.g. "DATS:000008:")
  • pivot the metadata, and append the data so that the metadata resides in the first row
  • iterate over the columns

in a module that

  • extracts the first row as a derived table
  • has derived tables for each data type, with a "Skip on condition in another table" to generate an empty result if the data type does not apply
  • merges the derived tables using a series of "Either table" actions

That works, but is there (or should there be) a less hack-ish way to solve this?

Welcome to the Community, Laurens!

If the order of rows doesn't change (and it seems like it doesn't), then you can use the "Append another table" action in the "Append columns" mode, instead of "Merge". In this case, a key isn't required.

I'd also recommend checking out the "Iterate column" action. It's intended exactly for such cases - when a complex logic needs to be applied to multiple columns, individually.

Can you share an example of the files so that we can better understand and propose a solution? Just a few rows anonymized would be helpful.

Yes, that will actually work!

I love this. I keep thinking I need to take care of the nitty gritty details, but of course easymorph has an index for the dataset behind the scenes, and can handle the reconstitution of a table that it has cut up.

I'll do you one better: I'll share the place I got stuck, and a solution based on @dgudkov's answer. It'll have to wait a few days, though, until I'm back in the office next week.

1 Like