Clearing columns from foreign letters

Hello!
I have a file that I read in to our SQL database on a regular basis and want to automate. However, it seems like the author responsible for the file that I get might be a little sloppy and leaves foreign letters and symbols where I don’t want them.

Is there any way for me to only keep numbers and letters from the ISO basic Latin alphabet? I tried to sanitize the text and a few other things, but all I can come up with is to create a lookup table and replace the characters when needed, but if this is already a feature that I don’t know of, or if there is a less time-consuming way to do this it would definitely be helpful.

Thanks in advance!

Hi Adam, and welcome to the Community!

Yes, a lookup table for character replacement together with the “Lookup with another table” action is a working solution to replace/remove unwanted characters.

Another option is the keepchars() function. It can remove unwanted characters (but not replace them).

See also this topic: Map string and diacritical character mapping [DONE]

Thank you!
I ended up using the keepchars() function and it works like a charm. :slight_smile:

Hi,

I faced recently quite a strange issue : exotic characters
some fields had values with some hidden characters, very hard to detect.
So I did this :

  • extract all the values
  • extract one by one each characters
  • test and put aside the exotic characters
  • clean and apply diacritics
  • rebuild the cleaned value

So here is the sample, it’s a little clumsy but it may help some of you.

Cheers

extract illegal char.zip (5.5 KB)

1 Like