I have seen a lot of Q&As in this community on replacing a substring of text within a cell; e.g. Replace(). I am trying to normalize variants of submissions with normalized values.
I am looking to replace the full text within matching cells. For example I have a reference file with normalized values:
In this case when I try to replace the content I get doubled up text in my cells because EM is only replacing the substring and not the entire cell value. How can I create the normalized values in my first screen capture above.
If you have a reasonable confidence that the Change Type(from Subject) values will contain the string you are looking to detect for, then I would just use the rule action. It will let you replace the cell within the column you are targeting with the rule.
You can just create a rule to capture each of the strings you are looking to detect on, then set it to the normalized value.
If you have a wide range of variance on how the values will be coming that you need to detect on, then you may need to do other cleanup. I am assuming that the column that will be normalized on will always be containing the string in an expected manner, or it is small enough that making a couple of extra rules here and there won't hurt you.
For example: These all contain the string 'Cost Change'
1-Cost Change
1 - Cost Change
Cost Change
So, if you are always getting the expected string 'Cost Change' it is pretty easy.
Here I tried to replicate your list. It may be off by a space here or there but you should get the idea.
Thanks I really appreciate this community coming forward with solutions.
I have tried "replace with lookup", which will reference an external file that can be maintained and added to. I feel like this should work but it doesn't because it's replacing the substring as noted. If i could replace the full string rather than the substring this would work.
I can appreciate your solution, but I intend to use easymorph to normalize thousands of phrases. Writing a unique IF/THEN statements within "Rule" for each one isnt scalable. Having a reference file with incoming values and their approved replacements seems like a common ETL transformation that must be supported in this wonderful tool.
This is the step that I have tried to use. Seems that this should work, except it is looking at substrings (snippets within a cell) rather than the full string within a cell.
As is normally the case, there are many possible solutions....
You can have your lookup table include all the strings found in the main data and the normalized version, then use the Lookup action as Dmitry stated.
If you want to do what I think you originally tried, which is to have your lookup table hold a substring and the normalised value, and then map these to the main data where it contains the substring, you can do it by chaining the Match action with the Lookup action:
If the original data and lookup table are like this:
This is really helpful and shows 2 skills in one process. You and Dmitri are both using the Lookup function, which correct - that works. I was trying to use a single step for this particular type of "replace values" stage in the process but looks like Lookup is the way to go and then clean out unneeded columns. Thank you.