.ultrageek. Computers, Music, and Thoughts


Running CSVFIX on Mac

CSVFIX has been my go to for command line ETL but I have always run it on Windows. After running in to some issues with a CSV file loading in to Salesforce I needed something on the mac to 'peak' in to the file and do the usual amount of stuff that I'm used to doing with CSVFIX. A quick google search and I find some handy instructions but nothing is ever easy and because the internet sometimes comes and goes I'm putting those here with the changes I made to work on Mac (vs. Linux) and a couple of the apps I had to update or install.

Here's the list of commands:

  1. wget https://bitbucket.org/neilb/csvfix/get/version-1.6.zip
  2. unzip version-1.6.zip
  3. cd neilb-csvfix-e804a794d175
  4. make lin
  5. cd csvfix/bin
  6. sudo cp csvfix /usr/local/bin

Wget by default is not on the mac. I used "brew install wget" to install it. It is a handy package manager that is specifically for Mac: https://brew.sh/

On step #4 when you try and 'Make' or compile you might get this error:

  • xcrun: error: invalid active developer path (/Library/Developer/CommandLineTools), missing xcrun at: /Library/Developer/CommandLineTools/usr/bin/xcrun

To fix this run: xcode-select --install and that installation will fix the error and allow you to compile/make csvfix.

The last thing that I changed from the original instructions is to update step #6 from /usr/bin to /usr/local/bin which is specific to mac (vs Linux).

Here is the link to the blog post that started this journey: Install csvfix on linux


Tagged as: , No Comments

Quick Tip: Double Quoted CSV Data

Sometimes I get SAP data that is quoted with both single quotes and double quotes. The data looks something like this:

  • "'001'","'01'","'100123'","'#'","'#'","'#'","'100123'"

CSVFIX will strip out the extra single quotes with the following command:

  • csvfix read_dsv -s "," -csv Input_File1.csv > Output_File1.csv

Seems unneeded but otherwise your output is double quoted. I blame SAP.

The magic is the '-CSV' switch. Make this your first command and you're right as rain. I made it the second in a script I had that first stripped the first header row, see today's previous posting, and results were not as expected. Could be just a fluke as it should not have mattered.

BTW Excel calls this 'Text Qualifier" on Step 2 of the data import wizard, another way to fix or strip these out but then that eliminates automation which is the whole point of CSVFIX. Excel can do single and double but in the above example will read it correctly ad-hoc analysis and test ETL.