The Ruby Transmogrifier, Part II


Getting information into the Transmogrifier

In our last episode, we built a transmogrifier that converts data from one format into another. Now you need to get data into it. We could hard-code the file names in there, but that would come back to haunt us. Let’s make it so we can pass in the definition file, the data file, and the name of the output file.

You’ll need to read the definition file first. How do you read a file line by line in Ruby?

require 'csv'

File.open("def.txt") do |file|
  while line = file.gets
    puts line
  end
end

Save the file. I saved it as transmogrifier.rb.

Now go ahead and run it.

$ ruby transmogrifier.rb
FIELD NAME      FORMAT
NAME            A(50)
ADDRESS1        A(50)
ADDRESS2        A(50)
CITY            A(50)
STATE           A(2)
ZIP             A(10)
CONTACT         A(50)
CONTACTPHONE    A(10)
ACCOUNTOPENED   9(8)
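
As an aside, Ruby also has File.foreach, which yields each line without an explicit while loop. Here is a minimal sketch of that alternative. The helper name read_lines is made up for illustration, and a temp file stands in for def.txt so the snippet runs anywhere.

```ruby
require 'tempfile'

# Hypothetical helper (not part of the article's script): collects the lines
# of a file using File.foreach instead of an explicit while/gets loop.
def read_lines(path)
  lines = []
  File.foreach(path) { |line| lines << line.chomp }
  lines
end

# A temp file stands in for def.txt so the sketch needs no existing file.
Tempfile.create('def') do |tmp|
  tmp.write("FIELD NAME      FORMAT\nNAME            A(50)\n")
  tmp.flush
  puts read_lines(tmp.path)
end
```

Either style works for a file this small. The rest of this article sticks with the while loop.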

It works. We have the definitions; now we need to load the data. In transmogrifier.rb, go ahead and add the code to do that. Remember what to do?

File.open("data.csv") do |file|
  while line = file.gets
    puts line
  end
end

If you’re curious, go ahead and run it.

Now that you have the definitions and the data loaded, you need to send them to the transmogrifier. It takes four things: the data, the length of the field, the type, and the column name. Three of the four come from the definition file.

Let’s write out the type, length, and field name. The field name starts at the beginning of a line, so we could split the line up and put the pieces in an array. Since the file uses spaces, not tabs, for aligning things, we can use split.

require 'csv'

definitions = Array.new

File.open("def.txt") do |file|
  while line = file.gets
    definitions = line.split()
    puts definitions[0]
    puts definitions[1]
  end
end
...

That gets us the field name, but the type and length are still together. Were you thinking of regular expressions to find the type and the length?

require 'csv'

definitions = Array.new

File.open("def.txt") do |file|
  while line = file.gets
    definitions = line.split()
    field = definitions[0]
    type = definitions[1] =~ /9/? '9' : 'A'
    length = definitions[1].slice(1..definitions[1].length).gsub(/[^0-9]/, "")
    puts 'field => ' + field.upcase + ' type => ' + type + ' length => ' + length.to_s
  end
end

Go ahead, run it.

$ ruby transmogrifier.rb
field => FIELD type => A length =>
field => NAME type => A length => 50
field => ADDRESS1 type => A length => 50
field => ADDRESS2 type => A length => 50
field => CITY type => A length => 50
field => STATE type => A length => 2
field => ZIP type => A length => 10
field => CONTACT type => A length => 50
field => CONTACTPHONE type => A length => 10
field => ACCOUNTOPENED type => 9 length => 8
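
If you did reach for a regular expression, one possible version uses capture groups to pull the type and the length out in a single match. This is only a sketch. The names FORMAT_PATTERN and parse_format are made up for illustration, and it assumes the only types are A and 9, as in the definition file shown above.

```ruby
# Hypothetical pattern: one leading type character (A or 9), then the length
# in parentheses, e.g. "A(50)" or "9(8)".
FORMAT_PATTERN = /\A([A9])\((\d+)\)\z/

# Hypothetical helper: returns the type and length for a FORMAT token, or nil
# when the token does not match (as with the header line's "NAME" token).
def parse_format(token)
  match = FORMAT_PATTERN.match(token)
  return nil unless match
  { "type" => match[1], "length" => match[2].to_i }
end

p parse_format("A(50)")
p parse_format("9(8)")
p parse_format("NAME")
```

Because the pattern is anchored to the start of the token, a format like A(9) is still read as type A. The simpler =~ /9/ check would report it as type 9, since it matches a 9 anywhere in the token.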

Now we’re getting somewhere. That’s three of the four values that get passed to the transmogrifier. (Maniacal laugh.)

One question.

How do you think you will get the data into that array?

What if we throw the array into a hash? Then put that hash into another hash, where the key is the field name and the value is the first hash.

All we need to do then is loop through the data, find the key in the hash that goes with each column, and send all that to the transmogrifier. Keeping things simple here.

First, go ahead and put the definition data into a hash. Then stuff that into another hash. For fun, go ahead and output that last hash so we can see it.

require 'csv'

object_name = Hash.new
definitions = Array.new

File.open("def.txt") do |file|
  while line = file.gets
    definitions = line.split()
    field = definitions[0]
    type = definitions[1] =~ /9/? '9' : 'A'
    length = definitions[1].slice(1..definitions[1].length).gsub(/[^0-9]/, "")
    object_formatting = Hash.new
    object_formatting["type"] = type
    object_formatting["length"] = length
    object_formatting["field"] = field
    object_name[field.upcase] = object_formatting
  end
end
puts object_name

Run it and let’s see what we have.

$ ruby transmogrifier.rb
{"FIELD"=>{"type"=>"A", "length"=>"", "field"=>"FIELD"}, "NAME"=>{"type"=>"A", "length"=>"50", "field"=>"NAME"}, "ADDRESS1"=>{"type"=>"A", "length"=>"50", "field"=>"ADDRESS1"}, "ADDRESS2"=>{"type"=>"A", "length"=>"50", "field"=>"ADDRESS2"}, "CITY"=>{"type"=>"A", "length"=>"50", "field"=>"CITY"}, "STATE"=>{"type"=>"A", "length"=>"2", "field"=>"STATE"}, "ZIP"=>{"type"=>"A", "length"=>"10", "field"=>"ZIP"}, "CONTACT"=>{"type"=>"A", "length"=>"50", "field"=>"CONTACT"}, "CONTACTPHONE"=>{"type"=>"A", "length"=>"10", "field"=>"CONTACTPHONE"}, "ACCOUNTOPENED"=>{"type"=>"9", "length"=>"8", "field"=>"ACCOUNTOPENED"}}
...
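
Notice the "FIELD" entry at the front of that hash. It comes from the header line of the definition file, not from a real field. One way to leave it out is to drop the first line before building the hash. A minimal sketch follows; the helper name build_definitions is hypothetical, and it takes an array of lines so it runs without def.txt.

```ruby
# Hypothetical helper: builds the field-name => formatting hash from the
# lines of a definition file, skipping the first (header) line.
def build_definitions(lines)
  lines.drop(1).each_with_object({}) do |line, result|
    field, format = line.split
    next if field.nil? || format.nil?   # ignore blank or malformed lines
    result[field.upcase] = {
      "type"   => format.start_with?("9") ? "9" : "A",
      "length" => format[1..-1].gsub(/\D/, ""),  # kept as a string, like the listing above
      "field"  => field
    }
  end
end

sample = [
  "FIELD NAME      FORMAT",
  "NAME            A(50)",
  "ACCOUNTOPENED   9(8)"
]
p build_definitions(sample)
```

The extra entry is harmless for our purposes, because no column in the data file is called FIELD, so the listings in this article leave it in.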

Lovely. Now you need to process the data file. Let’s get the field name and the data for each column.

CSV.foreach("data.csv", headers: true) do |data|
  data.headers.each do |field|
    puts field + ' => ' + data[field].to_s
  end
end

Hold on! What is that CSV.foreach() call? Let’s take a look at that.
That line processes a CSV file one row at a time and, because of headers: true, treats the first line as the header row rather than as data.

When you run that, your output should look like this:

name => Wonder widgets
address1 => 1600 Vassar Street
address2 =>
city => dallas
state => tx
zip => 75220
contact => Tim Smith
contactPhone => 214-555-1212
accountOpened => 12052001

name => Timmy's Bikes
address1 => 2723 Auburn Street
address2 => Building 3
city => Erie
state => PA
zip => 16508-1234
contact =>
contactPhone => 814-555-4321
accountOpened => 865289
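
To see what headers: true gives you without touching a file, here is a small sketch that parses an inline string with CSV.parse. The constant SAMPLE_CSV and the helper rows_as_hashes are made up for illustration, and the sample only borrows three of the columns shown above.

```ruby
require 'csv'

# Inline sample data so the sketch needs no file on disk.
SAMPLE_CSV = <<~DATA
  name,city,state
  Wonder widgets,dallas,tx
  Timmy's Bikes,Erie,PA
DATA

# Hypothetical helper: returns each data row as a plain Hash keyed by header.
def rows_as_hashes(csv_text)
  CSV.parse(csv_text, headers: true).map(&:to_h)
end

rows_as_hashes(SAMPLE_CSV).each { |row| p row }
```

Each row knows its own headers, which is why data[field] in the loop above can look a value up by column name.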

We’re Almost Home.

Now we have the data and the definitions. Let’s send that information to the transmogrifier.

As we loop through the data, we need to find the matching key from the hash we made earlier. When we find a match, we send everything to the transmogrifier.

CSV.foreach("data.csv", headers: true) do |data|
  data.headers.each do |field|
    key = field.upcase  # definition keys are upper case; the header string itself is left untouched
    next unless object_name.key?(key)
    line = transmogrifier(data[field].to_s, object_name[key]["length"].to_i, object_name[key]["type"], key)
    print line
  end
  puts ' '
  puts ' --------- '
end

That will do all that we talked about. Had we written tests as we did in the first part, we would have caught a bunch of gotchas. What if there is no data for a column? How do we get all the data on one row? How do we save the data? How do we pass the definition and data file names in?
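
To make the first of those questions concrete, here is a sketch of the kind of checks that would flush out such gotchas. It uses plain comparisons rather than any particular test framework. The function fit_to_width is a hypothetical stand-in that covers only padding and truncation: "A" fields are left-justified and everything else is right-justified. The sample values are borrowed from the data shown earlier.

```ruby
# Hypothetical stand-in for the padding and truncation rules only.
def fit_to_width(data, len, type)
  text = data.to_s                      # nil becomes an empty string
  return text[0, len] if text.length > len
  type == "A" ? text.ljust(len) : text.rjust(len)
end

# Each entry: [data, length, type, expected result]
CASES = [
  ["tx",          2, "A", "tx"],
  ["dallas",      8, "A", "dallas  "],
  ["865289",      8, "9", "  865289"],
  ["16508-1234",  5, "A", "16508"],
  [nil,           3, "A", "   "]
]

CASES.each do |data, len, type, expected|
  actual = fit_to_width(data, len, type)
  status = actual == expected ? "ok  " : "FAIL"
  puts "#{status} #{data.inspect} -> #{actual.inspect}"
end
```

The last case is the empty column: converting nil to a string first means it simply becomes padding.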

Here’s what I came up with before refactoring:

require 'csv'

def transmogrifier(data,len,type,column)

  proper_formatted = ''

  unless data.nil?
    case
    when data.length > len
      data = data.slice(0..(len-1))
    when data.length < len
      if type == "A"
        data = data.ljust(len)
      else
        data = data.rjust(len)
      end
    else
      if column == "ACCOUNTOPENED"
        data = data.slice(4..7)+data.slice(0..3)
      else
        data
      end
    end
    proper_formatted += data
  end
  proper_formatted += '|' # added pipes to see where the field ends
  proper_formatted
end

object_name = Hash.new

file = File.new(ARGV[0],"r")

definitions = Array.new

while (line = file.gets)
  definitions = line.split()
  field = definitions[0]
  type = definitions[1] =~ /9/ ? '9' : 'A'
  length = definitions[1].slice(1..definitions[1].length).gsub(/[^0-9]/, "")
  object_formatting = Hash.new
  object_formatting["type"] = type
  object_formatting["length"] = length
  object_formatting["field"] = field
  object_name[field.upcase] = object_formatting
end
file.close

aFile = File.new(ARGV[2], "w")

CSV.foreach(ARGV[1], headers: true) do |data|
  data.headers.each do |field|
    key = field.upcase  # definition keys are upper case; the header string itself is left untouched
    next unless object_name.key?(key)
    line = transmogrifier(data[field].to_s, object_name[key]["length"].to_i, object_name[key]["type"], key)
    aFile.write(line)
    print line  # so you can see something in the terminal window
  end
  aFile.write("\n")  # end the output row with a newline
  puts ' '
  puts ' --------- '
end

aFile.close

If you run

$ ruby transmogrifier.rb def.txt data.csv formated.txt

you should have a new file called formated.txt in the folder you are working in.
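
The listing above trusts that all three file names were passed on the command line. A small guard can make a missing argument easier to spot. This is a minimal sketch; the helper name parse_arguments and the wording of the usage message are assumptions, not part of the script above.

```ruby
# Hypothetical helper: checks that exactly three arguments were given and
# returns them as a hash, or nil when the count is wrong.
def parse_arguments(argv)
  return nil unless argv.length == 3
  { definitions: argv[0], data: argv[1], output: argv[2] }
end

args = parse_arguments(ARGV)
if args.nil?
  # warn writes to standard error, so the message stays out of any piped output
  warn "Usage: ruby transmogrifier.rb <definition file> <data file> <output file>"
else
  puts "Definitions: #{args[:definitions]}, data: #{args[:data]}, output: #{args[:output]}"
end
```

In the real script you would probably stop right after printing the usage message instead of carrying on.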

Now go forth and transmogrify. (Maniacal laugh.)
