I’m new to EasyMorph and I’m trying to figure out how to do the following:
Split a 10-million-record CSV file (call this the Master file) into smaller files that don’t lock up EM. Then, for each split file, do a fuzzy match against a separate, smaller file (we’ll call this the Matching file). Any matches found in each of the split master files should be written/appended to a single result file.
I figured out how to split the master file (it creates 10 files right now). What I can’t figure out from all the iteration docs and examples is how to iterate the matching between the Matching file and the 10 master files and append their results.
To iterate and append results, you need a child module and a parent module. In the parent module, you list all the file paths you want to iterate through. With the “Iterate” action, these file paths are sent one at a time as parameters to the child module, where the matching is performed for each file. The Iteration mode of the “Iterate” action needs to be set to “Iterate and append results”.
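If it helps to think of the pattern in code, here is a rough Python analogy of the parent/child setup described above. It is only a sketch of the logic, not how EasyMorph works internally. The function names `child_module` and `parent_module` and the file names are made up for illustration, and the data is kept in memory instead of being read from disk.

```python
def child_module(data_rows, matching_rows):
    """Runs once per file: exact lookup on Product Name to obtain Product Code."""
    code_by_name = {row["Product Name"]: row["Product Code"] for row in matching_rows}
    return [
        {**row, "Product Code": code_by_name[row["Product Name"]]}
        for row in data_rows
        if row["Product Name"] in code_by_name  # keep only rows that matched
    ]


def parent_module(files, matching_rows):
    """Iterates over the listed files and appends the result of each iteration."""
    combined = []
    for path, data_rows in files.items():  # one iteration per listed file
        for row in child_module(data_rows, matching_rows):
            combined.append({"Source file": path, **row})
    return combined


# Hypothetical in-memory stand-ins for the matching file and two data files.
matching = [{"Product Name": "Apple", "Product Code": "A1"}]
files = {
    "data1.csv": [{"Product Name": "Apple"}, {"Product Name": "Pear"}],
    "data2.csv": [{"Product Name": "Apple"}],
}
result = parent_module(files, matching)
```

The parent only knows the list of files, and the child only knows how to process one file. The appending happens in the parent, which is the role that “Iterate and append results” plays in the project.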
I have created a sample project that shows how this would work. Please download this zip (don’t move the files or rename the folders, as the project uses relative paths).
Matching sample project.zip (22.2 KB)
In the ZIP, apart from the Morph project, there are one matching file and two data files. In this example, we are attempting to match Product Name between the data and matching files in order to obtain the Product Code.
Data file 1:
Data file 2:
This is what happens in the project:
- List files:
- Result of iterations (note: In this sample project, the match that takes place in the child module is via a Lookup (exact match), so you would have to adapt this for fuzzy matching. This post may help: Partial or fuzzy lookup):
Child module: Load the matching file + perform the lookup with the data file that has been loaded in the current iteration. Example of iteration of Data file 1, loading both the matching file + data file and performing the lookup:
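To give an idea of what changes when the exact lookup is replaced with a fuzzy one, here is a small Python sketch using the standard library's `difflib.get_close_matches`. This is an illustration of the idea only, not the EasyMorph approach from the linked post. The function name `fuzzy_lookup`, the sample product names, and the 0.8 cutoff are assumptions chosen for the example.

```python
from difflib import get_close_matches


def fuzzy_lookup(data_names, code_by_name, cutoff=0.8):
    """For each data name, find the closest matching-file name and return its code.

    cutoff is a similarity threshold between 0 and 1; names whose best
    candidate scores below it are treated as unmatched and skipped.
    """
    candidates = list(code_by_name)
    matches = []
    for name in data_names:
        best = get_close_matches(name, candidates, n=1, cutoff=cutoff)
        if best:
            matches.append((name, best[0], code_by_name[best[0]]))
    return matches


# Hypothetical sample data: one misspelled name and one with no counterpart.
codes = {"Apple Juice": "AJ1", "Orange Juice": "OJ1"}
found = fuzzy_lookup(["Aple Juice", "Zzz"], codes)
```

Note that the comparison is case-sensitive and compares every data name against every candidate, so on large files it is much slower than an exact lookup. That is one more reason to keep the split files reasonably small.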
@roberto - thanks for this - I’ll take a look in detail on this later today. Thanks!