A Guide to Ruby Collections, II: Hashes, Sets, and Ranges

This entry is part 2 of 4 in the series A Guide to Ruby Collections

The first article in this series focused on Array and the basics of idiomatic Ruby iteration. Array is a pretty flexible class, but there are better solutions for particular situations. This article covers some of the other collection types that ship with Ruby.

Hashes

Sometimes you need to map one value to another. For example, you might want to map a product ID to an array containing information about that product. If the product IDs were all integers, you could do this with Array, but at the risk of wasting a lot of space in between IDs. Ruby hashes function as associative arrays where keys are not limited to integers. They are similar to Python’s dictionaries.

Creation

Like arrays, hashes have both literal and constructor initialization syntax.

>> colors = {}
>> colors['red'] = 0xff0000

>> colors = Hash.new
>> colors['red'] = 0xff0000

As with arrays, it’s possible to create a hash with starting values. This is where we get to see the idiomatic => (“hash rocket”) operator.

>> colors = {
>>   'red' => 0xff0000,
>>   'blue' => 0x0000ff
>> } 
=> {"red"=>16711680, "blue"=>255}

If you try to access a key that does not exist in the hash, the lookup returns nil. You can change this default value by passing an argument to the constructor.

>> h = Hash.new
>> h[:blah]
=> nil

>> h = Hash.new(0)
>> h[:blah]
=> 0
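A default of 0 is handy for tallying, because every key can be incremented without checking whether it exists first. Here is a minimal sketch; the method name word_counts and the sample sentence are made up for illustration.

```ruby
# Count how often each word appears, relying on the default value of 0.
def word_counts(text)
  counts = Hash.new(0)
  text.split.each { |word| counts[word] += 1 }
  counts
end

counts = word_counts("the cat and the hat")
counts["the"]  # => 2
counts["cat"]  # => 1
counts["dog"]  # => 0, the default for a key that was never counted
```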

Note that accessing a non-existent key does not create it.

>> h = {}
>> h.size
=> 0
>> h[:blah]
>> h.size
=> 0
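One caveat: Hash.new also accepts a block, and if that block assigns into the hash, reading a missing key does store it. The sketch below shows the difference. The helper name group_by_length is hypothetical.

```ruby
# The block runs whenever a missing key is read. Because it assigns
# hash[key], the key is created on first access.
def group_by_length(words)
  groups = Hash.new { |hash, key| hash[key] = [] }
  words.each { |word| groups[word.length] << word }
  groups
end

groups = group_by_length(%w[ox cow pig hen])
groups[2]    # => ["ox"]
groups[3]    # => ["cow", "pig", "hen"]
groups.size  # => 2
groups[10]   # => [], and the key 10 now exists
groups.size  # => 3
```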

Deletion

If you want to remove a key-value pair from a hash, use #delete. It might be tempting to simply set the value to nil, as you would in a Lua table, but the key remains part of the hash and is still included in iteration.

>> colors['red'] = nil
>> colors.size
=> 2

>> colors.delete('red')
>> colors.size 
=> 1
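Since a stored nil and a missing key both read back as nil, #key? is the reliable way to tell them apart, and #delete returns whatever value was stored. A small sketch, with remove_color as a made-up helper:

```ruby
# Hypothetical helper: remove a color and report what happened.
def remove_color(colors, name)
  return "#{name} was not present" unless colors.key?(name)

  value = colors.delete(name)  # delete returns the stored value
  "removed #{name} (#{value.inspect})"
end

colors = { 'red' => nil, 'blue' => 0x0000ff }
remove_color(colors, 'red')    # => "removed red (nil)"
remove_color(colors, 'green')  # => "green was not present"
colors.size                    # => 1
```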

Iteration

Hashes are iterated like Arrays, except that the block receives two values, the key and the value, instead of one.

>> hash = {"Juan" => 24, "Isidora" => 35}
>> hash.each { |name, age| puts "#{name}: #{age}" }
Juan: 24
Isidora: 35

Block variables like name and age in the previous example are just placeholders. They are arbitrary and could be anything, though it is good practice to make them descriptive.
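The same two-variable block form works with other iteration methods. Here is a short sketch using #select and #map; the constant AGES and the helper names_at_least are invented for this example.

```ruby
AGES = { "Juan" => 24, "Isidora" => 35 }

# Hypothetical helper: names of everyone at least min_age years old.
# An underscore prefix marks a block variable that is deliberately unused.
def names_at_least(ages, min_age)
  ages.select { |_name, age| age >= min_age }.keys
end

names_at_least(AGES, 30)                    # => ["Isidora"]
AGES.map { |name, age| "#{name}: #{age}" }  # => ["Juan: 24", "Isidora: 35"]

# When only one half of each pair matters:
AGES.each_key { |name| puts name }   # prints Juan, then Isidora
AGES.each_value { |age| puts age }   # prints 24, then 35
```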

To Hash Rocket or Not to Hash Rocket

It’s popular to use symbols as the Hash keys because they are descriptive like strings but fast like integers.

>> farm_counts = {
>>   :cow => 8,
>>   :chicken => 23,
>>   :pig => 11,
>> }
=> {:cow=>8, :chicken=>23, :pig=>11}

Starting with Ruby 1.9, hashes whose keys are symbols can be built sans hash rocket (=>), looking more like JavaScript or Python.

>> farm_counts = {
>>   cow: 8,
>>   chicken: 23,
>>   pig: 11
>> }
=> {:cow=>8, :chicken=>23, :pig=>11}

Both styles are common, but one thing to keep in mind is that all other key types still use a hash rocket, so using a colon instead in one part of your code might throw newcomers off.
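To make the point concrete, here is a small sketch showing that the two syntaxes build identical hashes and that they can be mixed in one literal. The constant names are made up for illustration.

```ruby
ROCKET_STYLE = { :cow => 8, :chicken => 23 }
COLON_STYLE  = { cow: 8, chicken: 23 }

ROCKET_STYLE == COLON_STYLE  # => true
COLON_STYLE.keys             # => [:cow, :chicken]

# Non-symbol keys still need the rocket, and both styles can be mixed.
MIXED = { cow: 8, 'chicken' => 23, 42 => :answer }
MIXED.keys        # => [:cow, "chicken", 42]
MIXED[:chicken]   # => nil, because :chicken and 'chicken' are different keys
MIXED['chicken']  # => 23
```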

Keyword Arguments With Hashes

Python lets you call functions with keyword arguments. With keywords, it isn’t necessary to pass arguments in a specific order, or to pass any particular argument at all. Ruby versions before 2.0 have no built-in keyword arguments, but a hash can be used to simulate them. If a hash is the last argument in a method call, the curly braces can be left off.

>> class Person
>>   attr_accessor :first, :last, :weight, :height

>>   def initialize(params = {})
>>     @first = params[:first]
>>     @last = params[:last]
>>     @weight = params[:weight]
>>     @height = params[:height]   
>>   end
>> end

>> p = Person.new(
>>   height: 170,
>>   weight: 72,
>>   last: 'Doe',
>>   first: 'John'
>> )

Note that params = {} isn’t strictly necessary, but it protects your code from throwing an ArgumentError if no argument is passed, and it makes the intended argument type clearer.
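Two refinements are worth sketching. Hash#merge can supply defaults for an options hash, and Ruby 2.0 and later support real keyword arguments in the method signature. The class names Animal and Pet and their defaults are invented for this example.

```ruby
# Options hash with defaults: the caller's values win over DEFAULTS.
class Animal
  DEFAULTS = { name: 'unknown', legs: 4 }
  attr_reader :name, :legs

  def initialize(params = {})
    options = DEFAULTS.merge(params)
    @name = options[:name]
    @legs = options[:legs]
  end
end

# True keyword arguments (Ruby 2.0+). Unknown keywords raise ArgumentError.
class Pet
  attr_reader :name, :legs

  def initialize(name: 'unknown', legs: 4)
    @name = name
    @legs = legs
  end
end

Animal.new(legs: 2).legs   # => 2
Animal.new.name            # => "unknown"
Pet.new(name: 'Rex').legs  # => 4
```

The keyword version catches misspelled option names for you, while the hash version silently ignores them.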

Smaller Hashes with Array Fields

Someone got the bright idea of making a lighter hash out of the Array class.

$ gem install arrayfields

>> require 'arrayfields'
>> h = ArrayFields.new
>> h[:lunes] = "Monday"
>> h[:martes] = "Tuesday"
>> h.fields
=> [:lunes, :martes]
>> h.values
=> ["Monday", "Tuesday"]

I’m not ultra-familiar with the arrayfields gem or how it applies across different Ruby implementations, but it’s very popular on Ruby Toolbox, and if you’re going to be serializing a lot of Hash data, it’s probably worth checking out.

Sets

If you need a collection where the order does not matter, and the elements are guaranteed to be unique, then you probably want a set.

Unlike the other collection types, you must add a require statement to make use of the Set class (Ruby 3.2 and later load it automatically).

>> require 'set'

Also, unlike Array and Hash, Set does not have any kind of special literal syntax. However, you can pass an Array to Set.new.

>> s = Set.new([1,2,3])
=> #<Set: {1, 2, 3}>

Alternatively, you can use Array#to_set.

>> [1,2,3,3].to_set
=> #<Set: {1, 2, 3}>
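The closest thing to a literal is the Set[] class method, which builds a set straight from its arguments. The sketch below also shows that equality ignores order and duplicates. The constant names are made up.

```ruby
require 'set'

# Set[] builds a set directly from its arguments.
BRACKET_SET = Set[1, 2, 3]
ARRAY_SET   = Set.new([3, 2, 1, 1])

BRACKET_SET == ARRAY_SET  # => true, order and duplicates are irrelevant
ARRAY_SET.size            # => 3
```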

Set uses the << operator like Array, but #add is used instead of #push.

>> s = Set.new
>> s << 1
>> s.add 2

To remove an element from a set, use the #delete method.

>> s.delete(1)
=> #<Set: {2}>

As with Array, #include? can be used for membership testing.

>> s.include? 1
=> false
>> s.include? 2
=> true

One of the useful features of Set is that it will not add elements that it already includes.

>> s = Set.new [1,2]
=> #<Set: {1, 2}> 
>> s.add 2
=> #<Set: {1, 2}>
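If you need to know whether an element was actually new, Set#add? returns nil when the element is already present. Here is one way that might be used; the helper name duplicates is hypothetical.

```ruby
require 'set'

# Hypothetical helper: return the elements that occur more than once.
def duplicates(items)
  seen = Set.new
  # add? returns the set for a new element (truthy, so it is rejected)
  # and nil for a repeat (falsy, so it is kept).
  items.reject { |item| seen.add?(item) }.uniq
end

duplicates([1, 2, 2, 3, 3, 3])  # => [2, 3]
duplicates([1, 2, 3])           # => []
```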

Earlier I pointed out that Array can perform set operations such as intersection and union. Naturally, Set can do these as well.

>> s1 = [1,2,3].to_set
>> s2 = [2,3,4].to_set

>> s1 & s2
=> #<Set: {2, 3}>

>> s1 | s2
=> #<Set: {1, 2, 3, 4}>

It also can do exclusive-or operations with the ^ operator, unlike Array.

>> [1,2,3] ^ [2,3,4]
=> NoMethodError: undefined method `^' for [1, 2, 3]:Array

>> s1 ^ s2
=> #<Set: {4, 1}>
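A few more set operations round this out: difference and the subset, superset, and disjoint tests. The constants FIRST and SECOND below stand in for s1 and s2.

```ruby
require 'set'

FIRST  = Set[1, 2, 3]
SECOND = Set[2, 3, 4]

(FIRST - SECOND) == Set[1]     # => true, difference keeps what is only in FIRST
Set[2, 3].subset?(FIRST)       # => true
FIRST.superset?(Set[2, 3])     # => true
FIRST.disjoint?(Set[7, 8])     # => true, no shared elements

# Exclusive-or is the union minus the intersection.
(FIRST ^ SECOND) == ((FIRST | SECOND) - (FIRST & SECOND))  # => true
```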

Ranges

I pointed out Ranges before in part I. The Range class is a sort of quasi-collection. It can be iterated like the other collections that use Enumerable, but it’s not a container for arbitrary elements.

>> r = Range.new('a', 'c')
=> 'a'..'c'
>> r.each { |i| puts i }
a
b
c

Before, I showed that Ranges can slice arrays or produce indices for iterating through them.

>> letters = [:a,:b,:c,:d,:e]
>> letters[1..3]
=> [:b, :c, :d]

>> (1..3).map { |i| letters[i].upcase }
=> [:B, :C, :D]
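Ranges come in a few more flavors that are useful when slicing. Three dots exclude the end value, #step skips through the range, and negative indices count from the end of the array. A quick sketch, with LETTERS mirroring the letters array above:

```ruby
LETTERS = [:a, :b, :c, :d, :e]

(1..5).to_a           # => [1, 2, 3, 4, 5]
(1...5).to_a          # => [1, 2, 3, 4], three dots exclude the end
(1..10).step(3).to_a  # => [1, 4, 7, 10]

LETTERS[1...3]   # => [:b, :c]
LETTERS[-2..-1]  # => [:d, :e], negative indices count from the end
```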

In addition to slicing arrays, Ranges can simplify case statement logic.

>> def theme(year)
>>   case year
>>     when 1970..1979 then "War Bad, Black People Not Bad"
>>     when 1980..1992 then "Cocaine, Money, and The Future"
>>     when 1993..2000 then "Gillian Anderson, Sitcoms in The FriendZone, and AOL"
>>     when 2001..2013 then "RIP, Music"
>>   end
>> end

>> theme(1987)
=> "Cocaine, Money, and The Future"
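This works because case compares each when clause to the value using ===, and Range#=== tests whether the value falls inside the range. Here is a sketch with a made-up grade method, using exclusive ranges so the boundaries never overlap.

```ruby
# case/when calls === on each candidate, so a Range matches any value it covers.
(1980..1992) === 1987        # => true
(1980..1992).include?(1993)  # => false

# Hypothetical example: exclusive ranges keep the boundaries unambiguous.
def grade(score)
  case score
  when 90..100 then 'A'
  when 80...90 then 'B'
  when 0...80  then 'C'
  else 'invalid'
  end
end

grade(85)    # => "B"
grade(89.5)  # => "B"
grade(101)   # => "invalid"
```

The first matching clause wins, so overlapping ranges are legal but easy to misread.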

There is also this Stack Overflow question about generating random strings, which got some answers that put ranges to good use.

>> (0...10).map{ ('a'..'z').to_a[rand(26)] }.join
=> "vphkjxysly"
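A variation on the same idea, assuming Ruby 1.9 or later for Array#sample: build the alphabet once and let #sample pick each letter. The method name random_letters is made up for this sketch.

```ruby
# Hypothetical helper: a random lowercase string of the given length.
def random_letters(length)
  alphabet = ('a'..'z').to_a  # expand the range once, not on every pick
  Array.new(length) { alphabet.sample }.join
end

random_letters(10)  # a 10-letter string that differs on every call
```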

Conclusion

That covers Hashes, Sets, and Ranges. In the next post, I’ll discuss Enumerable, Enumerator, and the neat things you can do with such tools.


  • Anonymous

    The Case section is hilarious. Please continue this series ~ great Ruby practice.

  • Andy

    A more literal syntax for creating a set that can be used is: Set[1]