Syntax Error 42601 when Exporting to Database (Redshift/PostgreSQL)

Just putting this out there in case anyone else comes across this issue, and how I went about resolving it.

I recently came across this error and it took a while to find the cause. The error I received was:

Action “Export to database” in table “[TableName]” in project “[Project Path]”: Export of row #15420 failed with the following error: 42601: syntax error at or near “41494”

My particular issue was due to the fact that someone in entering data included a backslash at the end of one of their fields. It appears the 42601 error was being thrown by the next column that had data in it given that backslash is used to escape characters.

In the above example, the 41494 is a date type three columns to the right of the data entry issue - the other two columns were blank. I initially tried to hack it away making all columns a Character-Varying(65535), but it persisted. I then tried removing all date columns, same issue, but it was instead throwing the syntax error on a standard text field.

I eventually saved the file out to csv and investigated around line 15420, and there I saw the backslash. The issue was actually on the following line 15421.

I tried to use the Sanitize Text transformation, but this did not fix the issue. I eventually just used a Modify Column transformation with the calc =replace([field],"","") and the export to the database went through with no issues.

I hope this helps someone. I dont know if it’s possible to update the Sanitize Text transformation to handle this?

2 Likes

Hi Aaron,

Thank you for reporting this issue. It should affect not only Redshift/PostgreSQL, but any other database connector. It seems to be an oversight on our end. We will try to fix it in one of the following releases.

As for the “Sanitize Text” action - it’s not intended to fix such issues. But we are thinking about creating an action like “Sanitize value for database” or something like that.

Hi Andrew,

I thought I’d report it in case it would impact other export destinations.

On further analysis, the issue only seems to appear when the end of a column value is a backslash.

The new fix I’ve used is a simple Modify Column transform using a more simple If…endswith…removeend type expression - however the Modify Column transform is per-column, it doesn’t apply over all columns.

Given this and some of the datasets we use, i see a risk that future data loads may break by bad data entry (or bad input verification/programming), even though historical loads all work fine.

The “Sanitize value for database” action sounds good, especially if it could apply over all columns.