Zip It! Zip It Good with Rails and Rubyzip

Key Takeaways

Rubyzip is a useful gem for managing zip archives within a Ruby or Rails application. It allows developers to easily read and create archives or generate them on the fly.
The gem can be used to create database records from a zip file sent by a user and to send an archive containing all records from a table.
The Rubyzip gem also allows for customization: options control whether existing files are overwritten during extraction or archive creation, how non-English file names are stored for Windows Vista and earlier, whether a warning is displayed when an archive has an incorrect date format, the default compression level, and whether Zip64 support is enabled for writing.
The Rubyzip gem can also generate an archive on the fly, which is then sent as an attachment with the send_data method. This is slightly trickier than writing the archive to disk, but it avoids leaving files behind on disk and the extra work of cleaning them up.

In our day-to-day activities we often interact with archives. When you want to send your friend a bunch of documents, you’d probably archive them first. When you download a book from the web, it will probably be archived alongside accompanying materials. So, how can we interact with archives in Ruby?

Today we will discuss a popular gem called rubyzip that is used to manage zip archives. With its help, you can easily read and create archives or generate them on the fly. In this article I will show you how to create database records from the zip file sent by the user and how to send an archive containing all records from a table.

Source code is available on GitHub.

Before getting started, I want to remind you that different kinds of files compress to different degrees. As such, even if you archive a file, its size might remain more or less the same:

Text files compress very nicely. Depending on their contents, the ratio is about 3:1.
Some images can benefit from compression, but when using a format like .jpg that already has native compression, it won’t change much.
Binary files may be compressed to about half of their original size.
Audio and video are generally poor candidates for compression.
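These differences are easy to observe with Ruby's standard zlib library, which implements DEFLATE, the compression method zip archives typically use. The snippet below is only a rough sketch: the sample data is made up, highly repetitive text compresses far better than ordinary prose, and the exact ratios will vary with real files.

```ruby
require 'zlib'
require 'securerandom'

# Returns how many times smaller the data becomes after DEFLATE compression.
def compression_ratio(data)
  data.bytesize.to_f / Zlib::Deflate.deflate(data).bytesize
end

# Repetitive text, standing in for a text file (made-up sample).
text = "The quick brown fox jumps over the lazy dog.\n" * 200

# Random bytes, standing in for data that is already compressed (e.g. a .jpg).
random = SecureRandom.random_bytes(text.bytesize)

puts format('text:   %.1f:1', compression_ratio(text))
puts format('random: %.1f:1', compression_ratio(random))
```

The random data should come out at roughly 1:1, which is what you can expect from files that are already compressed.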

Getting Started

Create a new Rails app:

$ rails new Zipper -T

I am using Rails 5 beta 3 and Ruby 2.2.3 for this demo, but rubyzip works with Ruby 1.9.2 or higher.

In our scenario today, the demo app keeps track of animals. Each animal has the following attributes:

name (string)
age (integer) – of course, you can use decimal instead
species (string)

We want to list all the animals, be able to add new ones, and download data about them in some format.

Create and apply the corresponding migration:

$ rails g model Animal name:string age:integer species:string
$ rake db:migrate

Now let’s prepare the default page for our app:

animals_controller.rb

class AnimalsController < ApplicationController
  def index
    @animals = Animal.order('created_at DESC')
  end
end

views/animals/index.html.erb

<h1>My animals</h1>

<ul>
  <% @animals.each do |animal| %>
    <li>
      <strong>Name:</strong> <%= animal.name %><br>
      <strong>Age:</strong> <%= animal.age %><br>
      <strong>Species:</strong> <%= animal.species %>
    </li>
  <% end %>
</ul>

config/routes.rb

[...]
resources :animals, only: [:index, :new, :create]
root to: 'animals#index'
[...]

Nice! Proceed to the next section and let’s take care of creation first.

Creating Animals from the Archive

Introduce the new action:

animals_controller.rb

[...]
def new
end
[...]

views/animals/index.html.erb

<h1>My animals</h1>

<%= link_to 'Add!', new_animal_path %>
[...]

Of course, we could craft a basic Rails form to add animals one by one, but instead let’s allow users to upload archives with JSON files. Each file will then contain attributes for a specific animal. The file structure looks like this:

animals.zip
- animal-1.json
- animal-2.json

Each JSON file will have the following structure:

{
  "name": "My name",
  "age": 5,
  "species": "Dog"
}

Of course, you may use another format, like XML, for example.
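Whichever format you pick, each file has to be valid input for its parser. As a small illustration with Ruby's standard json library (the sample strings are made up): keys and string values must be wrapped in double quotes, otherwise parsing fails.

```ruby
require 'json'

valid   = '{"name": "Rex", "age": 5, "species": "Dog"}'
invalid = "{name: 'Rex', age: 5, species: 'Dog'}" # Ruby-style hash, not JSON

# Valid JSON becomes a Hash with string keys.
attributes = JSON.parse(valid)
puts attributes.inspect

# Invalid JSON raises JSON::ParserError.
begin
  JSON.parse(invalid)
  parsed_invalid = true
rescue JSON::ParserError => e
  parsed_invalid = false
  puts "Could not parse: #{e.message}"
end
```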

Our job is to receive an archive, open it, read each file, and create records based on the input. Start with the form:

views/animals/new.html.erb

<h1>Add animals</h1>

<p>
  Upload a zip archive with JSON files in the following format:<br>
  <code>{"name": "name", "age": 1, "species": "species"}</code>
</p>

<%= form_tag animals_path, method: :post, multipart: true do %>
  <%= label_tag 'archive', 'Select archive' %>
  <%= file_field_tag 'archive' %>

  <%= submit_tag 'Add!' %>
<% end %>

This is a basic form allowing the user to select a file (don’t forget the multipart: true option).

Now the controller’s action:

animals_controller.rb

def create
  if params[:archive].present?
    # params[:archive].tempfile ...
  end
  redirect_to root_path
end

The only parameter that we are interested in is :archive. As long as it contains an uploaded file, it responds to the tempfile method, which returns the temporary file holding the upload.

To read an archive we will use the Zip::File.open(file) method that accepts a block. Inside this block you can fetch each archived file and either extract it somewhere by using extract or read it into memory with the help of get_input_stream.read. We don’t really need to extract our archive anywhere, so let’s keep the contents in memory instead.

animals_controller.rb

require 'zip'

[...]

def create
  if params[:archive].present?
    Zip::File.open(params[:archive].tempfile) do |zip_file|
      zip_file.each do |entry|
        Animal.create!(JSON.load(entry.get_input_stream.read))
      end
    end
  end
  redirect_to root_path
end
[...]

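Pretty simple, isn’t it? entry.get_input_stream.read reads the file’s contents and JSON.load parses it. Keep in mind that Ruby’s documentation advises against using JSON.load on untrusted input; JSON.parse is the safer choice there. We are only interested in .json files though, so let’s limit the scope using the glob method: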

animals_controller.rb

[...]
def create
  if params[:archive].present?
    Zip::File.open(params[:archive].tempfile) do |zip_file|
      zip_file.glob('*.json').each do |entry|
        Animal.create!(JSON.load(entry.get_input_stream.read))
      end
    end
  end
  redirect_to root_path
end
[...]

You can also extract part of the code to the model and introduce basic error handling:

animals_controller.rb

[...]
  def create
    if params[:archive].present?
      Zip::File.open(params[:archive].tempfile) do |zip_file|
        zip_file.glob('*.json').each { |entry| Animal.from_json(entry) }
      end
    end
    redirect_to root_path
  end
  [...]

animal.rb

[...]
class << self
  def from_json(entry)
    begin
      Animal.create!(JSON.load(entry.get_input_stream.read))
    rescue => e
      warn e.message
    end
  end
end
[...]
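Rescuing every StandardError keeps the import going, but it also hides which files failed and why. One possible refinement, sketched here in plain Ruby with a hypothetical parse_animal helper and made-up file contents, is to rescue only the parsing error and keep track of the failures:

```ruby
require 'json'

# Hypothetical helper: returns the parsed attributes, or nil if the
# content is not valid JSON.
def parse_animal(content)
  JSON.parse(content)
rescue JSON::ParserError
  nil
end

files = {
  'animal-1.json' => '{"name": "Rex", "age": 5, "species": "Dog"}',
  'animal-2.json' => 'this is not JSON'
}

# Split the files into those that parsed and those that did not.
parsed, failed = files.partition { |_name, content| parse_animal(content) }

puts "Parsed: #{parsed.map(&:first).join(', ')}"
puts "Failed: #{failed.map(&:first).join(', ')}"
```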

I also want to whitelist the attributes that the user can assign, preventing them from overriding the id or created_at fields:

animal.rb

[...]
WHITELIST = ['age', 'name', 'species']

class << self
  def from_json(entry)
    begin
      Animal.create!(JSON.load(entry.get_input_stream.read).select {|k,v| WHITELIST.include?(k)})
    rescue => e
      warn e.message
    end
  end
end
[...]

You may use a blacklist approach instead by replacing select with reject and checking against a list of forbidden keys, but whitelisting is more secure.
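The difference between the two approaches shows up as soon as the input contains a key you did not anticipate. Here is a small sketch in plain Ruby; the BLACKLIST constant and the sample input are made up for illustration.

```ruby
require 'json'

WHITELIST = %w[age name species].freeze
BLACKLIST = %w[id created_at].freeze # hypothetical list of forbidden keys

input = JSON.parse('{"id": 1, "name": "Rex", "age": 5, "admin": true}')

# Whitelisting keeps only the keys we explicitly allow.
allowed = input.select { |key, _value| WHITELIST.include?(key) }

# Blacklisting drops known-bad keys, but unexpected ones slip through.
not_forbidden = input.reject { |key, _value| BLACKLIST.include?(key) }

puts allowed.inspect       # "admin" is gone
puts not_forbidden.inspect # "admin" is still there
```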

Great! Now go ahead, create a zip archive and try to upload it!

Generating and Downloading an Archive

Let’s perform the opposite operation, allowing the user to download an archive containing JSON files representing animals.

Add a new link to the root page:

views/animals/index.html.erb

[...]
<%= link_to 'Download archive', animals_path(format: :zip) %>

We’ll use the same index action and equip it with the respond_to method:

animals_controller.rb

[...]
def index
  @animals = Animal.order('created_at DESC')

  respond_to do |format|
    format.html
    format.zip do
    end
  end
end
[...]

To send an archive to the user, you may either create it somewhere on the disk or generate it on the fly. Creating the archive on disk involves the following steps:

Create an array of files that have to be placed inside the archive:

files = []
files << File.open("path/name.ext", 'wb') { |file| file << 'content' }

Create an archive:

Zip::File.open('path/archive.zip', Zip::File::CREATE) do |z|

Add your files to the archive:


Zip::File.open('path/archive.zip', Zip::File::CREATE) do |z|
  files.each do |f|
    z.add(File.basename(f.path), f.path) # entry names must be unique within the archive
  end
end

The add method accepts two arguments: the file name as it should appear in the archive and the original file’s path and name.

Send the archive:
```
send_file 'path/archive.zip', type: 'application/zip',
      disposition: 'attachment',
      filename: "my_archive.zip"
```
This, however, means that all these files and the archive itself will persist on disk. Of course, you may remove them manually and even try to create a temporary zip file as described here but that involves too much unnecessary complexity.

What I’d like to do instead is generate our archive on the fly and use the send_data method to send the response as an attachment. This is a bit trickier, but there is nothing we can’t manage.

To accomplish this task, we’ll need a method called Zip::OutputStream.write_buffer that accepts a block:

animals_controller.rb

[...]
def index
  @animals = Animal.order('created_at DESC')

  respond_to do |format|
    format.html
    format.zip do
      compressed_filestream = Zip::OutputStream.write_buffer do |zos|
      end
    end
  end
end
[...]

To add a new file to the archive, use zos.put_next_entry while providing a file name. You can even specify a directory to nest your file by saying zos.put_next_entry('nested_dir/my_file.txt'). To write something to the file, use print:

animals_controller.rb

compressed_filestream = Zip::OutputStream.write_buffer do |zos|
  @animals.each do |animal|
    zos.put_next_entry "#{animal.name}-#{animal.id}.json"
    zos.print animal.to_json(only: [:name, :age, :species])
  end
end

We don’t want fields like id or created_at to be present in the file, so by saying :only we limit them to name, age and species.
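One thing to keep in mind: the entry name above is built from animal.name, which comes from user input, and a slash in it would place the file in a nested directory. If that matters for your data, you might normalize the name first. The entry_name helper below is a hypothetical sketch, not part of rubyzip:

```ruby
# Hypothetical helper: builds an archive entry name from a record's name
# and id, replacing anything other than ASCII letters, digits, underscores
# and hyphens so the name cannot introduce directories.
def entry_name(name, id)
  safe = name.to_s.gsub(/[^\w\-]+/, '_')
  "#{safe}-#{id}.json"
end

puts entry_name('Rex', 1)
puts entry_name('../etc/passwd', 2)
puts entry_name('Mr. Whiskers', 3)
```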

Now rewind the stream:

compressed_filestream.rewind

And send it:

send_data compressed_filestream.read, filename: "animals.zip"

Here is the resulting code:

animals_controller.rb

[...]
def index
  @animals = Animal.order('created_at DESC')

  respond_to do |format|
    format.html
    format.zip do
      compressed_filestream = Zip::OutputStream.write_buffer do |zos|
        @animals.each do |animal|
          zos.put_next_entry "#{animal.name}-#{animal.id}.json"
          zos.print animal.to_json(only: [:name, :age, :species])
        end
      end
      compressed_filestream.rewind
      send_data compressed_filestream.read, filename: "animals.zip"
    end
  end
end
[...]

Go ahead and try the “Download archive” link!

You can even protect the archive with a password. This feature of rubyzip is experimental and may change in the future, but it seems to be working currently:

animals_controller.rb

[...]
compressed_filestream = Zip::OutputStream.write_buffer(::StringIO.new(''), Zip::TraditionalEncrypter.new('password')) do |zos|
[...]

Customizing Rubyzip

Rubyzip provides a number of configuration options that can be set either in a block:

Zip.setup do |c|
end

or one-by-one:

Zip.option = value

Here are the available options:

on_exists_proc – Should the existing files be overwritten during extraction? Default is false.
continue_on_exists_proc – Should the existing files be overwritten while creating an archive? Default is false.
unicode_names – Set this to true if you want to store non-English file names and have them open correctly on Windows Vista and earlier. Default is false.
warn_invalid_date – Should a warning be displayed if an archive has incorrect date format? Default is true.
default_compression – Default compression level to use. Initially set to Zlib::DEFAULT_COMPRESSION, other possible values are Zlib::BEST_COMPRESSION and Zlib::NO_COMPRESSION.
write_zip64_support – Should Zip64 support be enabled for writing? Default is false.

Conclusion

In this article we had a look at the rubyzip library. We wrote an app that reads users’ archives, creates records based on them, and generates archives on the fly as a response. Hopefully, the provided code snippets will come in handy in one of your projects.

As always, thanks for staying with me and see you soon!

Frequently Asked Questions (FAQs) about Accepting and Sending ZIP Archives with Rails and Rubyzip

How can I install Rubyzip in my Rails application?

To install Rubyzip in your Rails application, you need to add it to your Gemfile. Open your Gemfile and add the following line: gem 'rubyzip'. After adding this line, run bundle install in your terminal. This command will install Rubyzip along with any other gems listed in your Gemfile.

How can I create a ZIP file using Rubyzip?

Creating a ZIP file using Rubyzip involves initializing a new Zip::File and adding files to it. Here’s a basic example:

require 'zip'

Zip::File.open('my_archive.zip', Zip::File::CREATE) do |zipfile|
  zipfile.add('file.txt', '/path/to/file.txt')
end

In this example, ‘my_archive.zip’ is the name of the ZIP file you’re creating, and ‘file.txt’ is the file you’re adding to the archive.

How can I read a ZIP file using Rubyzip?

Reading a ZIP file with Rubyzip involves opening the ZIP file and iterating over its entries. Here’s a basic example:

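require 'zip'
require 'fileutils'

dest_dir = 'extracted' # directory to extract into

Zip::File.open('my_archive.zip') do |zipfile|
  zipfile.each do |entry|
    puts "Extracting #{entry.name}"
    dest_path = File.join(dest_dir, entry.name)
    FileUtils.mkdir_p(File.dirname(dest_path))
    # In rubyzip 1.x and 2.x, extract takes the destination file path
    entry.extract(dest_path)
  end
end

In this example, ‘my_archive.zip’ is the ZIP file you’re reading, and dest_dir is the directory where the files are extracted. Each entry gets its own destination path, built from the directory and the entry’s name.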

How can I add a directory to a ZIP file using Rubyzip?

To add a directory to a ZIP file, you can use the add method of the Zip::File class. Here’s an example:

require 'zip'

Zip::File.open('my_archive.zip', Zip::File::CREATE) do |zipfile|
  # Note: this pattern only picks up the directory's top-level entries
  Dir.glob('path/to/directory/*').each do |file|
    zipfile.add(file.sub('path/to/directory/', ''), file)
  end
end

In this example, ‘path/to/directory’ is the directory you’re adding to the ZIP file.

How can I handle errors when working with ZIP files in Rubyzip?

Rubyzip provides several exception classes that you can use to handle errors. For example, you can use Zip::Error to catch any ZIP-related errors, or Zip::EntryExistsError to handle cases where you’re trying to add a file to a ZIP archive that already contains a file with the same name. Here’s an example:

require 'zip'

begin
  Zip::File.open('my_archive.zip', Zip::File::CREATE) do |zipfile|
    zipfile.add('file.txt', '/path/to/file.txt')
  end
rescue Zip::EntryExistsError => e
  puts "A file with the same name already exists in the archive: #{e.message}"
rescue Zip::Error => e
  puts "An error occurred: #{e.message}"
end

In this example, if a file with the same name already exists in the archive, or if any other ZIP-related error occurs, the error message will be printed to the console.

How can I update a ZIP file using Rubyzip?

Updating a ZIP file involves opening the ZIP file in write mode and modifying its contents. Here’s a basic example:

require 'zip'

Zip::File.open('my_archive.zip', Zip::File::CREATE) do |zipfile|
  zipfile.get_output_stream('file.txt') { |f| f.puts 'Hello, world!' }
end

In this example, ‘my_archive.zip’ is the ZIP file you’re updating, and ‘file.txt’ is the file you’re modifying. The get_output_stream method opens the file in write mode, and the block passed to it writes ‘Hello, world!’ to the file.

How can I delete a file from a ZIP archive using Rubyzip?

To delete a file from a ZIP archive, you can use the remove method of the Zip::File class. Here’s an example:

require 'zip'

Zip::File.open('my_archive.zip') do |zipfile|
  zipfile.remove('file.txt')
end

In this example, ‘my_archive.zip’ is the ZIP file you’re modifying, and ‘file.txt’ is the file you’re deleting.

How can I rename a file in a ZIP archive using Rubyzip?

To rename a file in a ZIP archive, you can use the rename method of the Zip::File class. Here’s an example:

require 'zip'

Zip::File.open('my_archive.zip') do |zipfile|
  zipfile.rename('old_name.txt', 'new_name.txt')
end

In this example, ‘my_archive.zip’ is the ZIP file you’re modifying, ‘old_name.txt’ is the original name of the file, and ‘new_name.txt’ is the new name of the file.

How can I compress a ZIP file using Rubyzip?

Rubyzip automatically compresses files when you add them to a ZIP archive. You can control the level of compression through the default_compression option described in the Customizing Rubyzip section. Here’s an example:

require 'zip'

Zip.default_compression = Zlib::BEST_COMPRESSION

Zip::File.open('my_archive.zip', Zip::File::CREATE) do |zipfile|
  zipfile.add('file.txt', '/path/to/file.txt')
end

In this example, ‘my_archive.zip’ is the ZIP file you’re creating, ‘file.txt’ is the file you’re adding to the archive, and Zlib::BEST_COMPRESSION is the highest level of compression. The setting is global, so it applies to every archive written afterwards.

How can I encrypt a ZIP file using Rubyzip?

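Rubyzip has experimental support for password protection using the traditional zip encryption scheme: pass a Zip::TraditionalEncrypter instance to Zip::OutputStream.write_buffer, as shown earlier in this article. This feature may change in the future, and the traditional scheme is considered weak by modern standards, so for sensitive data you may want to use a different library or tool. Please be aware that encryption is a complex topic and should be handled with care to ensure the security of your data.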