Digging with TracePoint

digging with spade

This article describes a rare use case, so I feel compelled to give some background explaining my motivations for “digging” with TracePoint. You can skip to the next section if it is not interesting to you.

Note. All examples are not production code! The samples are simply for the sake of this article. Please don’t copy-paste it to your project without a thorough review.

Why did I start playing with TracePoint?

In everyday Ruby programming, it’s common to have some code at the beginning of a method to validate its arguments. Especially if you write an API method that will be called from external code, as you can’t trust anything from outside sources.

This lead me to write code like:

def my_method(arg)
  arg = arg.to_sym if arg.is_a?(String)
  raise unless arg.is_a?(Symbol)
end

I said to myself “Stop.” I’m writing something similar to the above example in every method. This code should be unified and extracted into my everyday usage library. Google didn’t give me any usable gem, so I started the EnsureIt project.

The obvious way to provide this functionality is to monkeypatch (or refine, starting from 2.1.0) generic ruby objects with a set of methods like ensure_symbol or ensure_string and make appropriate validations there. This method gives us the fastest (in some cases, with no validation at all) code.

When I started coding bang methods (ensure_symbol! and so on), I got into trouble with error reporting. Error messages should be informative and contain at least the name of the invalid argument. But I needed to have clear and usable API, so passing the argument name as another argument (something like arg.ensure_symbol!(:arg)) was not going to work. My decision was to get the caller context from within the ensure_symbol! method to inspect it for arguments and local variables.

And that is the start pointing for this article.

Goal

My first goal is to get the caller context from within the called method. This is not very complicated, but I’ll describe my implementation below. What is really interesting, as well as being my second goal, is to raise an error in the called method with a message constructed from the caller context.

Getting Caller Context from Called Method

Let’s start with some code. Taking into account my background form above, monkeypath Object with the ensure_symbol!:

class Object
  def ensure_symbol!
    raise 'argument should be a Symbol or a String'
  end
end

class Symbol
  def ensure_symbol!
    self
  end
end

class String
  def ensure_symbol!
    to_sym
  end
end

def test_method(test_arg)
  test_arg = test_arg.ensure_symbol!
  puts "passed argument: #{test_arg.inspect}"
end

test_method(:symbol)
test_method('string')
test_method(0)

It’s simple. For symbols, return self. For strings, convert the value to a symbol. Raise an error for any other object.

Here is the output:

$ ruby example_01.rb
passed argument: :symbol
passed argument: :string
example_01.rb:3:in `ensure_symbol!': argument should be a Symbol or a String ( RuntimeError)
    from example_01.rb:20:in `test_method'
    from example_01.rb:26:in `<main>'

The problem is that error message is not informative. I want to have the argument name (test_arg) and the method name (test_method) in the message. Something like argument 'test_arg' in method 'test_method' should be a Symbol or a String.

To do this, I need the caller context inside ensure_symbol!. There are similary purposed implementations on the web. Many of them, like binding_of_caller are ruby-platform dependent. I prefer to use ruby native tools anywhere it’s possible. Here is where TracePoint comes into play.

Starting from version 2.0.0 Ruby introduces a new way (the old one was set_trace_func) of tracing program execution – the TracePoint class. It provides a number of ruby internal events, like returning from a method or raising an exception, which we can capture with our own code blocks. Please read the official documentation for a better understanding.

As a first attempt, I bind to the :return event and write out useful messages to see what happened during program execution:

def raise_error(method_name, message: nil)
  message = 'local variable have wrong type' unless message.is_a?(String)
  counter = 0
  TracePoint.trace(:return) do |tp|
    if (counter += 1) > 2
      tp.disable
      puts 'Caller context reached.'
      puts "Caller method name is: #{tp.method_id}"
      method = eval("method(:#{tp.method_id})", tp.binding)
      puts "Caller arguments: #{method.parameters}"
    end
  end
end # <= first call of trace point block

class Object
  def ensure_symbol!
    raise_error(
      :ensure_symbol!,
      message: 'argument should be a Symbol or a String'
    )
  end # <= second call of trace point block
end

def test_method(test_arg)
  test_arg = test_arg.ensure_symbol!
  puts "passed argument: #{test_arg.inspect}"
end # <= last call of trace point block

test_method(0)

Here we create a helper method called raise_error to do the job. We create a TracePoint instance inside it that will be called on return from every method until it is disabled. The initial call to raise_error only creates this trace point, so we increment our counter an move on. We haven’t reached our desired context yet. The next block is called on return from ensure_symbol!. Also skip it. Finally, the last block is called from our desired caller context and we can grab the method name with its arguments. Let’s see output:

$ ruby example_02.rb
passed argument: #<TracePoint:enabled>
Caller context reached.
Caller method name is: test_method
Caller arguments: [[:req, :test_arg]]

Yep! We have our coveted caller context. But first, we still have not raised the error. Also, our ensure_symbol! block executed after the remaining code in the caller method (puts "passed argument... ).

Let’s modify raise_error to execute our trace point block immediately after returning from ensure_symbol!:

def raise_error(method_name, message: nil)
  message = 'local variable have wrong type' unless message.is_a?(String)
  counter = 0
  TracePoint.trace(:return) do |first_tp|
    if (counter += 1) > 1
      first_tp.disable
      TracePoint.trace(:line) do |second_tp|
        second_tp.disable
        puts 'Caller context reached.'
        puts "Caller method name is: #{second_tp.method_id}"
        method = eval("method(:#{second_tp.method_id})", second_tp.binding)
        puts "Caller arguments: #{method.parameters}"
      end
    end
  end
end # <= first call of first_tp block

class Object
  def ensure_symbol!
    raise_error(
      :ensure_symbol!,
      message: 'argument should be a Symbol or a String'
    )
  end # <= second and last call of first_tp block
end

def test_method(test_arg)
  test_arg = test_arg.ensure_symbol!
  puts "passed argument: #{test_arg.inspect}" # <= call of second_tp block
end

test_method(0)

Now, we are tracing only the first two returns (not three, as in the previous example). In the second call, we are at the end of the ensure_symbol! method. Here, we use another trace event type – :line, so our block will be called for each line of code. Since raise_error will only be used internally, we can safely assume that calls to raise_error are always placed as the last line of the method. Under this condition, we can skip one line to reach the caller context. Let’s try to execute the code:

$ ruby example_03.rb
Caller context reached.
Caller method name is: test_method
Caller arguments: [[:req, :test_arg]]
passed argument: #<TracePoint:disabled>

And voila! We have reached our first goal! The caller context is obtained immediately before executing any code after the call to our ensure_symbol! method. However, the second problem still remains – we have not raised an error.

Raising a “Smart” Error

How can we raise our informative error message? Our first idea is to raise the error from the trace point block:

def raise_error(method_name, message: nil)
  message = 'local variable have wrong type' unless message.is_a?(String)
  counter = 0
  TracePoint.trace(:return) do |first_tp|
    if (counter += 1) > 1
      first_tp.disable
      TracePoint.trace(:line) do |second_tp|
        second_tp.disable
        raise "raised from trace point in method #{second_tp.method_id}"
      end
    end
  end
end

executing this gives us:

$ ruby example_04.rb
example_04.rb:9:in `block (2 levels) in raise_error': raised from trace point in method test_method (RuntimeError)
    from example_04.rb:38:in `test_method'
    from example_04.rb:41:in `<main>'

This looks nice enough, but it will not work if the call to ensure_symbol! is placed inside a begin-rescue block. Let’s modify our test_method:

def test_method(test_arg)
  begin
    test_arg = test_arg.ensure_symbol!
  rescue RuntimeError => err
    puts 'error occured while checking the argument'
    raise err
  end
  puts "passed argument: #{test_arg.inspect}"
end

which results in:

$ ruby example_05.rb
example_05.rb:9:in `block (2 levels) in raise_error': raised from trace point in method test_method (RuntimeError)
    from example_05.rb:43:in `test_method'
    from example_05.rb:46:in `<main>'

Same result. But where is the ‘error occured while checking the argument’ message? What happened? Let’s return to the last example in previous section.

Our second trace point skips one line in ensure_symbol! method and then gives us the caller context immedeately after ensure_symbol!. At this point, there is no raised exception yet. This explains why our code above is not working – at the moment of returning from the method there is no exception and ruby just skips the rescue block. Now, the exception in the trace point block is raised and output.

This is but one conclusion – we should raise the exception before we return from ensure_symbol!. OK, but now we have a different problem. The message we want to generate is based on the caller context, but we need to raise the exception before we get said context. I am reminded of the great ‘Back to the Future’ movies. Impossible?

In fact, it is possible! If the exception is raised inside raise_error and then modified once we have the context:

def raise_error(method_name, message: nil)
  message = 'local variable have wrong type' unless message.is_a?(String)
  counter = 0
  error = nil
  TracePoint.trace(:return, :raise) do |first_tp|
    if first_tp.event == :raise
      error = first_tp.raised_exception
      puts "error captured with message: #{error.message}"
    else
      if (counter += 1) > 1
        first_tp.disable
        TracePoint.trace(:line) do |second_tp|
          second_tp.disable
          unless error.nil?
            puts "now we have #{error.inspect} in #{second_tp.method_id} context "
          end
        end
      end
    end
  end
  raise %q{raised from '#raise_error' method}
end

And it gives us:

$ ruby example_06.rb
error captured with message: raised from '#raise_error' method
now we have #<RuntimeError: raised from '#raise_error' method> in test_method context
error occured while checking the argument
example_06.rb:21:in `raise_error': raised from '#raise_error' method (RuntimeError)
    from example_06.rb:26:in `ensure_symbol!'
    from example_06.rb:47:in `test_method'
    from example_06.rb:55:in `<main>'

Great! We are very close to success. The error is being raised in the right way and captured by a begin-rescue block, yet we still have access to the caller context.

Now we just need to modify the error message. Unfortunately, it’s not so simple. The Exception class does not have a message setter. We can create our own Exception class, but I want to let users choose any Exception type they want.

Hold on, what is an exception? It’s just a Ruby object. Thanks to Ruby’s singleton methods, we can redefine any method in a single object without changing anything else!

Here is the final raise_error version:

def raise_error(method_name, message: nil)
  message = 'local variable have wrong type' unless message.is_a?(String)
  counter = 0
  error = nil
  TracePoint.trace(:return, :raise) do |first_tp|
    if first_tp.event == :raise
      error = first_tp.raised_exception
    else
      if (counter += 1) > 1
        first_tp.disable
        TracePoint.trace(:line) do |second_tp|
          second_tp.disable
          unless error.nil?
            new_message = "#{message} in '#{second_tp.method_id}' method"
            error.define_singleton_method :message do
              new_message
            end
          end
        end
      end
    end
  end
  raise %q{raised from '#raise_error' method}
end

And the result:

$ ruby example_07.rb
error occured while checking the argument
example_07.rb:29:in `raise_error': argument should be a Symbol or a String in 'test_method' method (RuntimeError)
    from example_07.rb:34:in `ensure_symbol!'
    from example_07.rb:55:in `test_method'
    from example_07.rb:63:in `<main>'

Ruby magic in action!

Of course, there remains some technical work to create the correct error message. We could use regular expressions to get and analyze the call. The ensure_symbol method might be called for a method argument, a local variable or a literal. Explaining all of that is out of the scope of this article. You can find working source code of the EnsureIt project on github for the details.

Conclusion

Some may find this topic strange and hacky. I can see that point-of-view. However, exercises like this one are good experience with Ruby programming. Also, given the conditions of the project, calls to methods such as ensure_symbol! are very frequent. As such, my goal was to execute the minimum amount of code inside these methods, only for validation and conversion, doing any other work only if an error occurred.

When my tests became green, I looked through my window and saw a beautiful sunrise. :) Digging with ruby is really enthralling!

Good luck!

(Thanks to Alex Skazka and my sister for reviewing this article)

Win an Annual Membership to Learnable,

SitePoint's Learning Platform

No Reader comments

Comments on this post are closed.