Quick Tip: Double Quoted CSV Data

Sometimes I get SAP data that is quoted with both single quotes and double quotes. The data looks something like this:

  • “‘001′”,”’01′”,”‘100123′”,”‘#'”,”‘#'”,”‘#'”,”‘100123′”

CSVFIX will strip out the extra single quotes with the following command:

  • csvfix read_dsv -s “,” -csv Input_File1.csv > Output_File1.csv

Seems unneeded but otherwise your output is double quoted. I blame SAP.

The magic is the ‘-CSV’ switch. Make this your first command and you’re right as rain. I made it the second in a script I had that first stripped the first header row, see today’s previous posting, and results were not as expected. Could be just a fluke as it should not have mattered.

BTW Excel calls this ‘Text Qualifier” on Step 2 of the data import wizard, another way to fix or strip these out but then that eliminates automation which is the whole point of CSVFIX. Excel can do single and double but in the above example will read it correctly ad-hoc analysis and test ETL.

Leave a Reply

Your email address will not be published. Required fields are marked *