This feature would allow the user to select one of the common classical text similarity algorithms in order to cluster and ultimately merge/normalize similar values.
Here is a good overview of the classic cluster/text difference methods.
In other data tools I have seen options that use these methods to calculate similarity scores for a field/column, cluster similar values based on a user-defined threshold, and find the most common version within each cluster. The groups are then presented to the user, who can confirm that they should be normalized to the most prevalent value, select a different final value, or remove values from a cluster. The process can then be rerun on the same dataset as new data is added: new values that match an existing approved cluster are normalized automatically, and any other new values are sent to a secondary job for review and addition to the ongoing approved clusters.
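
To make the workflow above concrete, here is a rough sketch in Python. It is only an illustration under stated assumptions: it uses the standard library's difflib.SequenceMatcher as a stand-in for whichever similarity algorithm the user would select, the clustering is a simple greedy pass rather than any particular tool's method, and the names similarity, cluster_values, build_approved, and apply_approved are hypothetical.

```python
from collections import Counter
from difflib import SequenceMatcher


def similarity(a: str, b: str) -> float:
    """Score in [0, 1]. Stand-in metric (assumption): difflib's ratio,
    compared case-insensitively with surrounding whitespace ignored."""
    return SequenceMatcher(None, a.strip().lower(), b.strip().lower()).ratio()


def cluster_values(values, threshold=0.85):
    """Greedily group distinct values whose score against a cluster's
    first member meets the user-defined threshold."""
    counts = Counter(values)
    clusters = []  # each cluster is a list of distinct values
    # Visit the most frequent values first so they seed the clusters;
    # the first member of a cluster is therefore its most common value.
    for value, _ in counts.most_common():
        for members in clusters:
            if similarity(value, members[0]) >= threshold:
                members.append(value)
                break
        else:
            clusters.append([value])
    # Only clusters with more than one member need user review.
    return [
        {"canonical": members[0], "members": members}
        for members in clusters
        if len(members) > 1
    ]


def build_approved(confirmed_clusters):
    """Map every member of a user-confirmed cluster to its final value."""
    return {
        member: cluster["canonical"]
        for cluster in confirmed_clusters
        for member in cluster["members"]
    }


def apply_approved(values, approved):
    """Normalize values covered by approved clusters; collect the rest
    so a secondary review job can look at them."""
    normalized, needs_review = [], []
    for value in values:
        if value in approved:
            normalized.append(approved[value])
        else:
            normalized.append(value)
            needs_review.append(value)
    return normalized, needs_review


if __name__ == "__main__":
    column = ["Acme Corp", "Acme Corp", "Acme Corp.", "acme corp", "Globex"]
    proposed = cluster_values(column, threshold=0.85)
    # In the real feature the user would confirm, edit, or reject each
    # proposed cluster here; this sketch approves them all.
    approved = build_approved(proposed)
    print(apply_approved(["acme corp", "Umbrella"], approved))
```

In this sketch the user's review step sits between cluster_values and build_approved: a confirmed cluster could have its "canonical" value replaced or members removed before the mapping is built. On a rerun, apply_approved only does exact lookups against approved members, so values it has not seen before always go to review instead of being merged silently.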
I can provide more details in a call as needed.