A Guide to Ruby Collections IV: Tips and Tricks
This is the last article in the Guide to Ruby Collections series. I want to finish up by providing some miscellaneous tips that you will hopefully find helpful during your Ruby adventures.
More Detailed Array Creation
In the introduction to Arrays, I showed you that you can fill an Array with a default value by passing a second argument to Array.new:
>> Array.new(3, 1)
=> [1,1,1]
So if we wanted to initialize an Array with random numbers, we could just pass in rand(x) instead, right?
>> Array.new(3, rand(100))
=> [28,28,28]
Hmm. That wasn’t exactly random. rand(100) is evaluated only once, before Array.new is called, and that single result is copied into every slot. Fortunately, Array.new can take a block; it runs the block once for each element and assigns the result to that element.
>> Array.new(3) { rand(100) }
=> [10,53,27]
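The same "evaluated once" behavior matters for any object you pass as a default, not just random numbers. Here is a small sketch (the variable names are mine, chosen only for illustration) showing that the second-argument form shares one object across all slots, while the block form builds a fresh object each time and also receives the element's index.

```ruby
# The second-argument form stores the SAME object in every slot.
shared = Array.new(3, [])
shared[0] << :x
p shared      # => [[:x], [:x], [:x]]

# The block form runs once per element, so each slot gets its own object.
distinct = Array.new(3) { [] }
distinct[0] << :x
p distinct    # => [[:x], [], []]

# The block is also passed the index of the element being created.
squares = Array.new(4) { |i| i * i }
p squares     # => [0, 1, 4, 9]
```

For immutable defaults such as integers the sharing is harmless, which is why Array.new(3, 1) behaves as expected.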
Mass Assignment
Similar to Python, Ruby allows assignment to more than one variable, or more than one index in an Array, in the same statement.
>> letter1, letter2 = ["a", "b"]
=> ["a", "b"]
>> letter1
=> "a"
>> letter2
=> "b"
>> transportation = []
>> transportation[0..1] = ["trains", "planes"]
>> transportation[0]
=> "trains"
>> transportation[1]
=> "planes"
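A few more forms of the same feature are worth knowing. This is a short sketch rather than a complete tour; the variable names are placeholders of my own.

```ruby
# Swap two variables without a temporary.
a, b = 1, 2
a, b = b, a
p [a, b]          # => [2, 1]

# A splat on the left collects whatever is left over.
first, *rest = ["trains", "planes", "automobiles"]
p first           # => "trains"
p rest            # => ["planes", "automobiles"]

# Too few values: the extra variables become nil.
x, y, z = [1, 2]
p [x, y, z]       # => [1, 2, nil]

# Too many values: the extras are silently dropped.
m, n = [1, 2, 3]
p [m, n]          # => [1, 2]
```

Because a mismatch in counts never raises an error, it pays to double-check both sides when the right-hand side comes from somewhere you don't control.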
Bounds Checking
By default, Ruby Arrays return nil when given an index that is out of bounds.
>> [1,2,3][99]
=> nil
If you don’t like this behavior (perhaps your Arrays legitimately contain lots of nil, so a nil result is ambiguous), you could redefine #[] in the Array class.
>> class Array
>>   def [](idx)
>>     self.fetch(idx)
>>   end
>> end
>> [1,2,3][99]
=> IndexError: index 99 outside of array bounds: -3...3
!!WARNING!!: Monkeying around with Ruby’s core classes can result in bugs that are very difficult to track down. In fact, even the previous example breaks the two-argument and Range forms of Array#[] (such as arr[0, 2] and arr[0..1]) and appears to cause errors in irb. If you don’t want to research how to redefine Array methods without causing problems, a safer alternative is to create a subclass of Array.
>> class BoundsCheckedArray < Array
>>   def [](idx)
>>     self.fetch(idx)
>>   end
>> end
>> arr = BoundsCheckedArray.new([1,2,3])
>> arr[99]
=> IndexError: index 99 outside of array bounds: -3...3
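If you want the bounds check without losing the other forms of #[], one option is to check only the single-Integer case and hand everything else to super. This is a sketch under my own assumptions, not a hardened implementation; StrictArray is a hypothetical name used only here. It also shows that #fetch by itself can supply a fallback, which may be all you need.

```ruby
# StrictArray is a hypothetical class name for this sketch.
class StrictArray < Array
  def [](*args)
    if args.length == 1 && args.first.is_a?(Integer)
      fetch(args.first)   # raises IndexError when the index is out of bounds
    else
      super               # Range and (start, length) forms keep Array's behavior
    end
  end
end

arr = StrictArray.new([1, 2, 3])
p arr[1]       # => 2
p arr[-1]      # => 3
p arr[0, 2]    # => [1, 2]
p arr[0..1]    # => [1, 2]

# Without any subclass, #fetch accepts a default value or a block.
p [1, 2, 3].fetch(99, :missing)        # => :missing
p [1, 2, 3].fetch(99) { |i| i * 2 }    # => 198
```

Only code that goes through #[] gets the check; methods such as #first and #at still behave as they do on a plain Array.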
Random Array Elements
We’ve seen how to create random numbers in Ruby, but what about anything else? Perhaps your significant other has recently implemented a method like this:
>> def random_restaurant
>>   "I don't care. Where do you want to go?"
>> end
>> 3.times.map { random_restaurant }
=> ["I don't care. Where do you want to go?", "I don't care. Where do you want to go?", "I don't care. Where do you want to go?"]
Choking back tears of frustration, you could put a selection of nearby restaurants into an array and use the #sample method to get one at random.
>> def random_restaurant
>>   ["Process: The Forkeria", "Shenanigans", "Grease Factory"].sample
>> end
>> 3.times.map { random_restaurant }
=> ["Process: The Forkeria", "Grease Factory", "Process: The Forkeria"]
If you call #sample twice, you may get the same element from both calls. If you need more than one unique, random element, pass the number of elements you want to #sample.
>> ["a","b","c"].sample(2)
=> ["c", "a"]
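Random results are awkward to test, so it can help to pass your own generator. The sketch below uses the random: keyword that #sample and #shuffle accept, with an arbitrary seed of 42. I have left out expected output because the picks depend on the seed and the Ruby version.

```ruby
restaurants = ["Process: The Forkeria", "Shenanigans", "Grease Factory"]

# A seeded generator makes the "random" picks repeatable.
picks_a = restaurants.sample(2, random: Random.new(42))
picks_b = restaurants.sample(2, random: Random.new(42))
p picks_a == picks_b    # => true

# Asking for more elements than exist returns every element once.
p restaurants.sample(10).size    # => 3

# #shuffle gives a random ordering of the whole Array.
p restaurants.shuffle(random: Random.new(42))
```

Note that #sample(n) picks distinct positions, not distinct values, so an Array that contains duplicates can still hand you the same value twice.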
Multidimensional and High-Performance Arrays
Like its cousin Python, Ruby lets you declare multidimensional Arrays in an intuitive manner, as Arrays of Arrays.
>> two_d = [[1,2,3],[4,5,6],[7,8,9]]
>> two_d[1][1]
=> 5
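Nested Arrays can also be built programmatically and reshaped with ordinary Array methods. Here is a brief sketch that rebuilds the same 3x3 grid; it assumes Ruby 2.4 or newer for #sum and #dig, and the names rows, cols and grid are my own.

```ruby
rows, cols = 3, 3

# Nested blocks give every row its own inner Array.
grid = Array.new(rows) { |r| Array.new(cols) { |c| r * cols + c + 1 } }
p grid                        # => [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
p grid[1][1]                  # => 5

# Swap rows and columns.
p grid.transpose              # => [[1, 4, 7], [2, 5, 8], [3, 6, 9]]

# Row totals and column totals.
p grid.map(&:sum)             # => [6, 15, 24]
p grid.transpose.map(&:sum)   # => [12, 15, 18]

# grid[5][1] would raise NoMethodError on nil; #dig returns nil instead.
p grid.dig(5, 1)              # => nil
```

This is fine for small grids; each row is a separate Ruby object, so there is no guarantee that rows have equal length.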
Unfortunately, people generally do not look to Ruby when performing expensive mathematical operations. In this case it is often much easier to abandon ship and learn NumPy. However, a Japanese researcher has built the NArray gem, which provides fast vectors and matrices. The price of speed is that you are limited to homogeneous, numerical collections.
$ gem install narray --version 0.6.0.8
Creating n-dimensional arrays with NArray is really easy. Each argument to a constructor such as NArray.int gives the size of one dimension, with the first argument being the x dimension (the row length in the output below).
>> require 'narray'
>> NArray.int(2)
=> NArray.int(2):
[ 0, 0 ]
>> NArray.int(2,2)
=> NArray.int(2,2):
[ [ 0, 0 ],
  [ 0, 0 ] ]
>> NArray.int(2,3)
=> NArray.int(2,3):
[ [ 0, 0 ],
  [ 0, 0 ],
  [ 0, 0 ] ]
>> NArray.int(1,2)
=> NArray.int(1,2):
[ [ 0 ],
  [ 0 ] ]
NArray uses the math-flavoured [x,y] notation instead of [x][y] notation for specifying indices.
>> arr = NArray.int(3,3)
>> arr[1,1] = 7
>> arr
=> NArray.int(3,3):
[ [ 0, 0, 0 ],
  [ 0, 7, 0 ],
  [ 0, 0, 0 ] ]
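This is not compatible with Array's nested indexing: arr[x][y] does not return the element at (x, y). It does not raise an error either. A single index appears to address the flattened data, so arr[1] is the Integer 0, and 0[1] is just a bit of that Integer.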
>> arr[1][1]
=> 0
>> arr[1,1]
=> 7
The NArray gem is not limited to integers or to NArray objects; its English documentation describes the other element types and classes it provides.
Benchmarking
With the seemingly infinite number of ways to access and manipulate Ruby collections, you might be left wondering how to determine whether the methods you are using are efficient. For example, is it really faster to perform a membership test with Set than with Array?
Ruby's standard library provides the Benchmark module for timing code.
>> require "benchmark"
>> require "set"
>> array = ["a", -3.14, 0, []]
>> set = array.to_set
>> Benchmark.bm do |bench|
>>   bench.report("array:") do
>>     1000.times { array.include?(-3.14) }
>>   end
>>   bench.report("set:") do
>>     1000.times { set.include?(-3.14) }
>>   end
>> end
             user     system      total        real
array:   0.000000   0.000000   0.000000 (  0.000200)
set:     0.000000   0.000000   0.000000 (  0.000212)
Although one could hardly call this a complete comparison (the collection holds only four elements and is searched just 1,000 times), the difference appears to be marginal on my Ruby implementation (1.9.3-p448).
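With only four elements, neither collection has much work to do. A fairer test uses a larger collection and looks for an element near the end, which is the worst case for Array#include?. The sketch below does that with Benchmark.realtime; time_lookup is a hypothetical helper of my own, and the sizes are arbitrary. I have not included timings because they vary by machine and Ruby version.

```ruby
require "benchmark"
require "set"

# Hypothetical helper for this sketch: returns the wall-clock seconds
# spent running `iterations` membership tests against `collection`.
def time_lookup(collection, needle, iterations)
  Benchmark.realtime do
    iterations.times { collection.include?(needle) }
  end
end

array  = (1..100_000).to_a
set    = array.to_set
needle = array.last   # Array must scan every element to find this one

array_time = time_lookup(array, needle, 100)
set_time   = time_lookup(set, needle, 100)

puts format("array: %.6f s", array_time)
puts format("set:   %.6f s", set_time)
```

Array#include? scans elements one by one, while Set looks the element up by hash, so I would expect the gap to widen as the collection grows. Measure with your own data before drawing conclusions.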
Conclusion
Well, thus endeth my series on Ruby collections. Here are some additional resources if you would like to continue your education: