Mix and Match CSV data with Ruby

robabank

funny

Multiple times a year I have to submit my VAT or ‘BTW’ tax forms. As I have a small business to run, I use good old Excel to calculate the numbers for the last Quarter. Most of the input comes from the payment transactions of my online bank account.

I found myself copy-pasting all relevant transactions from the financial transactions overview screen in the browser to the appropriate excel columns. Very boring and repetitive work. Time consuming as well. Translating dates into the appropriate format, mixing the columns to get the expected order, …

To go with my philosophy to automate everything that is repetitive, I created the following solution that I know will save me time all upcoming VAT rounds.

There is an option to download all transaction data from my account to a comma separated values file, or CSV file. You can download transactions with a start and end date, which makes it easy to get a file with records for the last Quarter only.

Next, with a little Ruby script, extract all relevant rows and columns and put them in the right order.

Then copy-paste the resulting file to the area in my BTW excel sheet and … done.

For this I used the fastercsv library, which made parsing the file quite simple.

require 'rubygems'
require 'faster_csv'
require 'parsedate'

# parse a csv file from Rabobank online banking download
#
# copy-paste or import the resulting output into excel
#
# note: to install the gem for faster_csv, use 'gem install fastercsv' e.g. without underscore!
#
# This is input for the BTW excel file!
whole_file = File.open(ARGV[0]) 

FasterCSV.parse(whole_file) { |row|
  # "tb" seems code between own accounts: skip these
  # filter out the Debit records "D" or Credit records "C"
  if row[3] == "D" and not row[8] == "tb"
	# put together the description fields
    description = "#{row[6]} #{row[10]} #{row[11]} #{row[12]}".strip.gsub(/s+/, ' ').downcase
	# put date in expected format for excel
    date = ParseDate.parsedate("#{row[2]}")
    my_date = "#{date[2]}-#{date[1]}-#{date[0]}"
	# change comma into a dot for price field
    price = row[4].gsub('.', ',')
    puts "#{my_date}|#{description}|#{price}"
  end
}

Note that this is used with the download of the Rabobank (or “Rob-a-bank”, as pitched by Boom Chicago)