A Guide to Ruby Collections, Part I: Arrays

Key Takeaways

Ruby arrays are dynamic, versatile data structures that can function as sets, stacks, or queues and are the functional equivalent of Python lists.
Arrays in Ruby do not require preallocation of space and can be heterogeneous, allowing a mix of different data types within the same array.
Ruby arrays handle non-existent indices gracefully by returning nil, and support negative indexing which counts back from the end of the array.
Ruby arrays offer powerful methods for adding, removing, and transforming elements, supporting operations like concatenation, intersection, and even more complex manipulations like reversing or rotating elements.
Iteration in Ruby is highly flexible and can be customized using methods like `each`, `map`, and `each_with_index`, allowing for both simple and advanced manipulation of array elements during traversal.

collections_arrays

Programming consists largely of sorting and searching. In an older language like C, you might be expected to write your own data structures and algorithms for these tasks. However, with Ruby these constructs have been abstracted away in favor of the ability to focus on the task at hand.

What follows is a guide to these abstractions. It isn’t completely comprehensive-an entire book could be written on Ruby collections-but I cast a wide net, and I cover what I think you will encounter often as a Ruby programmer. It is broken into 4 parts:

Arrays and Iteration
Hashes, Sets, and Ranges
Enumerable and Enumerator
Tricks and Tips

This is a highly example-driven guide. I think the best way to learn this stuff is to just pop open an irb shell and follow along, creating your own clever variations along the way.

Arrays

Arrays are the workhorses of Ruby collections. Most methods operating on collections will return an Array as a result, even if the original collection was not one. In truth, they aren’t really arrays at all, but a kind of one-size-fits-all data structure. You can make them act like sets, stacks, or queues. They are the functional equivalent of Python lists.

Creation

Ruby arrays are created similarly to those found in other dynamic languages.


>> numbers = [1, 0, 7]
>> numbers[2]
=> 7
>> numbers.size
=> 3

Arrays do not need to consist of the same data type. They can be heterogeneous.


>> data = ["bob", 3, 0.931, true]

Since Ruby is completely object-oriented, arrays are represented as objects rather than merely special interpreter rules. This means you can construct them like other objects.


>> numbers = Array.new([1,2,3])
=> [1, 2, 3]

The array constructor can be passed a starting size, but it might not work like you expect. Since Ruby arrays are dynamic, it isn’t necessary to preallocate space for them. When you pass in a number by itself to Array#new, an Array with that many nil objects is created.


>> numbers = Array.new(3)
=> [nil, nil, nil]

Although nil is a singleton object, it takes up a slot in collections like any other object. So when you add an element to an Array of nil objects, it gets tacked onto the end.


>> numbers << 3
=> [nil, nil, nil, 3]
>> numbers.size
=> 4

If you pass Array#new a second argument, it becomes the fill value instead of nil.


>> Array.new(3, 0)
=> [0, 0, 0] 
>> Array.new(3, :DEFAULT)
=> [:DEFAULT, :DEFAULT, :DEFAULT]

In addition to the standard literal, Ruby provides some other syntax shortcuts through %notation.


>> ORD = "ord"
>> %W{This is an interpolated array of w#{ORD}s}
=> ["This", "is", "an", "interpolated", "array", "of", "words"] 
>> %w{This is a non-interpolated array of w#{ORD}s}
=> ["This", "is", "a", "non-interpolated", "array", "of", "w\#{ORD}s"]

Array Indices

Most languages will throw an exception if you attempt to access an array index that does not yet exist. If you attempt to read a non-existent index, Ruby returns nil.


>> spanish_days = []
>> spanish_days[3]
=> nil

If you write to a non-existent index, Ruby will insert nil into the array up to that index.


>> spanish_days[3] = "jueves"
=> [nil, nil, nil, "jueves"]
>> spanish_days[6] = "domingo"
=> [nil, nil, nil, "jueves", nil, nil, "domingo"]

Most languages will also error out if you try to access a negative array index. Ruby considers negative indices to start at the end of the array, working back towards the beginning as they increase.


>> ["a","b","c"][-1]
=> "c"
>> ["a","b","c"][-3]
=> "a"

If you provide a nonexistent, negative array index, the result is the same as a nonexistent, positive one – nil


>> ["a","b","c"][-4]
=> nil

Array Ranges

Another useful feature of Ruby arrays is the ability to access ranges of elements. However, they can be tricky given that there are many different ways to specify a range of elements in Ruby.


>> letters = %w{a b c d e f}
=> ["a", "b", "c", "d", "e", "f"]
>> letters[0..1]
=> ["a", "b"]
>> letters[0, 2]
=> ["a", "b"]
>> letters[0...2]
=> ["a", "b"]
>> letters[0..-5]
=> ["a", "b"]
>> letters[-6, 2]
=> ["a", "b"]

Here are the reasonings behind these examples:

letters[0..1] – give me elements 0 through 1
letters[0, 2] – starting with index 0, give me 2 elements
letters[0…2] – give me elements 0 until 2
letters[0..-5] – give me elements 0 through -5
letters[-6, 2] – starting with element -6, give me 2 elements

If you are new to Ruby, you might be wondering how this is even possible. It turns out that array accesses are nothing more than calls to the #[] method.


>> letters.[](0..1)
=> ["a", "b"]

In addition, 0..1 is nothing more than a Range object in disguise. You can verify this by using the #class method.


>> (0..1).class
=> Range

So what is really going on is a Range object representing the target range of elements is passed to Array#[].


>> letters.[](Range.new(0,1))
=> ["a", "b"]

This object-oriented nature of Ruby enables us to do some pretty crazy things, if we like. What if we wanted to use numbers in the form of English words?

The numerouno gem can be used to parse English numbers.


$ gem install numerouno
>> require 'numerouno'
>> "one-hundred sixty three".as_number
=> 163

With numerouno you can now make an array class that takes English indices.


class EnglishArray < Array
  def [](idx)
    if String === idx
      self.at(idx.as_number)
    end
  end
end 
>> arr = EnglishArray.new(["a","b","c"])
>> arr["one"]
=> "b"

Transformation

Remember how I said Ruby’s Array is a one-size-fits-all data structure? Here are some examples of the kinds of operations you can perform on Array.

Adding Elements


>> [1,2,3] << "a"
=> [1,2,3,"a"]
>> [1,2,3].push("a")
=> [1,2,3,"a"]
>> [1,2,3].unshift("a")
=> ["a",1,2,3]
>> [1,2,3] << [4,5,6]
=> [1,2,3,[4,5,6]]

Removing Elements


>> arr = [1,2,3]
>> arr.pop
=> 3
>> arr
=> [1,2]
>> arr = ["a",1,2,3]
>> arr.shift
=> "a"
>> arr
=> [1,2,3]
>> arr = [:a, :b, :c]
>> arr.delete(:b)
=> :b
>> arr
=> [:a, :c]
>> arr.delete_at(1)
=> :c
>> arr
=> [:a]

Combining Arrays


>> [1,2,3] + [4,5,6]
=> [1,2,3,4,5,6]
>> [1,2,3].concat([4,5,6])
=> [1,2,3,4,5,6]
>> ["a",1,2,3,"b"] - [2,"a","b"]
=> [1,3]

Boolean Operations


>> [1,2,3] & [2,3,4]
=> [2,3]
>> [1,2,3] | [2,3,4]
=> [1,2,3,4]
>> arr1 = [1,2,3]
>> arr2 = [2,3,4]
>> xor = arr1 + arr2 - (arr1 & arr2)
=> [1,4]

Moving Elements


>> [1,2,3].reverse
=> [3,2,1]
>> [1,2,3].rotate
=> [2,3,1]
>> [1,2,3].rotate(-1)
=> [3,1,2]

Safeguarding


>> arr = [1,2,3]
>> arr.freeze
>> arr << 4
=> RuntimeError: can't modify frozen Array

Combining Elements Into a String


>> words = ["every","good","boy","does","fine"]
>> words.join
=> "everygoodboydoesfine"
>> words.join(" ")
=> "every good boy does fine"

Removing Nesting


>> [1,[2,3],[4,["a", nil]]].flatten
=> [1,2,3,4,"a",nil]
>> [1,[2,3],[4,["a", nil]]].flatten(1)
=> [1,2,3,4,["a", nil]]

Removing Duplicates


>> [4,1,2,1,5,4].uniq
=> [4,1,2,5]

Slicing


  >> arr = [1,2,3,4,5]
>> arr.first(3)
  => [1,2,3]
>> arr.last(3)
=> [3,4,5]

Querying


>> ["a","b","c"].include? "d"
=> false
>> ["a", "a", "b"].count "a"
=> 2
>> ["a", "a", "b"].count "b"
=> 1
>> [1,2,[3,4]].size
=> 3

Iteration

Iteration is an area where Ruby really shines. In many languages iteration feels awkwardly tacked on. However, in Ruby you should never feel the need to write a classical for loop.

The central construct in Ruby iteration is the #each method.


>> ["first", "middle", "last"].each { |i| puts i.capitalize }
First
Middle
Last

Although #each is the core iterator in Ruby, there are many others. For example, you can iterate through a collection backwards by using #reverse_each.


>> ["first", "middle", "last"].reverse_each { |i| puts i.upcase }
LAST
MIDDLE
FIRST

Another handy method is #each_with_index which passes the current index as the second argument to the block.


>> ["a", "b", "c"].each_with_index do |letter, index|
>>   puts "#{letter.upcase}: #{index}"
>> end
A: 0
B: 1
C: 2

How, exactly, does this #each method work? The best way to understand is to make your own #each.


class Colors
  def each
    yield "red"
    yield "green"
    yield "blue"
  end
end
>> c = Colors.new
>> c.each { |i| puts i }
red
green
blue

The #yield method calls the block you pass to #each. This can be difficult for many ruby newcomers to wrap their heads around. Think of #yield as calling an anonymous body of code that you provide to the method #yield is in. In the previous example, yield is called three times, so “{ |i| puts i }” runs three times.

Partial Iteration

One of the nice things about for loops is that beginning and ending points can be specified. #each, however, always iterates through an entire collection.

If you need to iterate through only part of a collection, there are at least a couple ways of doing this:

Slice the collection and then iterate through the slice


>> [1,2,3,4,5][0..2].each { |i| puts i }
1
2
3

Use a Range to generate indices


>> arr = [1,2,3,4,5]
>> (0..2).each { |i| puts arr[i] }
1
2
3

Although it’s a bit uglier, I’m willing to bet that option 2 is more efficient as Array element sizes get larger and take longer to copy.

#each vs. #map/#collect

In addition to #each, you will likely encounter #map quite often as well. #map is like #each except it builds an Array out of the results of each block call. This is useful because #each only returns the caller.


>> [1,2,3].each { |i| i + 1 }
=> [1,2,3]
>> [1,2,3].map { |i| i + 1 }
=> [2,3,4]

If you’re just calling a method on each element, you can use a handy shortcut.


>> [1,2,3].map(&:to_f)
=> [1.0, 2.0, 3.0]

Which is the same as:


>> [1,2,3].map { |i| i.to_f }
=> [1.0, 2.0, 3.0]

Note that #map does not alter the original collection. It merely returns an array based on the result of each block call. If you want the original collection to reflect the changes, use #map! instead.


>> numbers = [1,2,3]
>> numbers.map(&:to_f)
=> [1.0, 2.0, 3.0]
>> numbers
=> [1, 2, 3]
>> numbers.map!(&:to_f)
=> [1.0, 2.0, 3.0]
>> numbers
=> [1.0, 2.0, 3.0]

You may have noticed #collect in Ruby code as well. It’s identical to #map, so
the two are interchangeable.


>> letters = ["a", "b", "c"]
>> letters.map(&:capitalize)
=> ["A", "B", "C"]
>> letters.collect(&:capitalize)
=> ["A", "B", "C"]

Classical Iteration

Ruby provides the classical “for” idiom. Although it looks cleaner and is more familiar to newcomers from other languages, it is un-idiomatic since it is not object-oriented and does not accept a block.


>> animals = ["cat", "dog", "bird", "chuck testa"]
>> for animal in animals
>>   puts animal.upcase
>> end
CAT
DOG
BIRD
CHUCK TESTA

In case being “un-idiomatic” is not enough of a deterrent for you, here is an unintuitive result demonstrated by Nikals B. on stackoverflow:


>> results = []
>> (1..3).each { |i| results << lambda { i } }
>> results.map(&:call)
=> [1, 2, 3]
>> results = []
>> for i in 1..3
>>   results << lambda { i }
>> end
>> result.map(&:call)
=> [3, 3, 3]

Additionally, “temporary” variables created in for loops are not temporary at all. That animal variable used in the animals for loop above is gone now right?

Nope.


>> animal
=> "chuck testa"

Conclusion: Ruby for loops – best used to induce eye rolls or expressions of concern from conference audience members.

Frequently Asked Questions about Ruby Arrays

How can I create a multidimensional array in Ruby?

In Ruby, you can create a multidimensional array by nesting arrays within an array. Here’s an example of a two-dimensional array: multi_array = [[1,2,3],[4,5,6],[7,8,9]]. Each inner array represents a row. So, multi_array[0] would give you the first row: [1,2,3]. To access individual elements, you can use two indices. For example, multi_array[0][1] would give you the second element of the first row: 2.

How can I add elements to an array in Ruby?

You can add elements to an array in Ruby using the push method or the << operator. Both methods add elements to the end of the array. Here’s an example: array = [1, 2, 3], array.push(4) or array << 4 would result in [1, 2, 3, 4].

How can I remove elements from an array in Ruby?

You can remove elements from an array in Ruby using the pop method, which removes the last element, or the shift method, which removes the first element. For example, if array = [1, 2, 3, 4], array.pop would result in [1, 2, 3].

How can I sort an array in Ruby?

You can sort an array in Ruby using the sort method. This method sorts the array in ascending order by default. If you want to sort in descending order, you can use the sort method with the reverse method, like so: array.sort.reverse.

How can I iterate over an array in Ruby?

You can iterate over an array in Ruby using the each method. This method takes a block of code and executes it for each element in the array. Here’s an example: array.each { |element| puts element }. This code will print each element of the array.

How can I find the length of an array in Ruby?

You can find the length of an array in Ruby using the length or size method. Both methods return the number of elements in the array. For example, if array = [1, 2, 3, 4], array.length or array.size would return 4.

How can I check if an array includes a certain element in Ruby?

You can check if an array includes a certain element in Ruby using the include? method. This method returns true if the array includes the element, and false otherwise. For example, if array = [1, 2, 3, 4], array.include?(3) would return true.

How can I convert an array to a string in Ruby?

You can convert an array to a string in Ruby using the join method. This method concatenates the elements of the array into a string. You can specify a separator to be placed between the elements. For example, if array = ["a", "b", "c"], array.join("-") would return "a-b-c".

How can I find the index of a certain element in an array in Ruby?

You can find the index of a certain element in an array in Ruby using the index method. This method returns the index of the first occurrence of the element. If the element is not found, it returns nil. For example, if array = [1, 2, 3, 4], array.index(3) would return 2.

How can I create an array with a range of numbers in Ruby?

You can create an array with a range of numbers in Ruby using the .. operator. This operator creates a range from the start number to the end number. You can convert this range to an array using the to_a method. For example, (1..10).to_a would return [1, 2, 3, 4, 5, 6, 7, 8, 9, 10].