String Interpolation in Ruby

In my recent gem, squasher, I faced the problem of inserting dynamic data into output. For example:

"There are no migrations prior %inputed date%"

Of course, at this point every reader could say: “What? That’s easy enough. It’s just one line of code.”

puts "There are no migrations prior #{ date }"

But I decided to keep all messages in an external JSON file. To generate a message, the code calls the Squasher#tell method with a key. This frees the code from long message lines, makes it easy to add color to messages, and keeps all text in one place. View and system logic should always be separated, even if there are only a few messages. After a quick search I found a working solution. The JSON file should contain messages in the following format:

"There are no migrations prior #{ opts[:date] }"

To evaluate this message with actual data, you need the following hack:

message = eval(%Q|"#{ json_message }"|)

Once again, this code does its job. But the solution was not a good one: eval executes whatever Ruby code the message happens to contain. I continued searching for a better one. As you know, Ruby ships with a template engine called ERB. My problem was solved with the following code:

require 'erb'

# opts holds the data that the message refers to
opts = { :date => "2014/02/16" }
json_message = "There are no migrations prior <%= opts[:date] %>"
message = ERB.new(json_message).result(binding)

Using ERB makes message processing better, but it still has a few downsides. First of all, the JSON messages know that they will be evaluated in a Ruby environment with a Ruby hash named opts. Second, this code adds an extra dependency on erb, and using it is not the best experience.
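The first downside can be softened a little. The sketch below assumes Ruby 2.5 or newer, where ERB#result_with_hash is available; the template then refers to a plain local name instead of the opts hash and does not see the caller’s binding. The variable names are only illustrative.

```ruby
require 'erb'

# Illustrative template: it refers to `date` directly, not to an opts hash.
template = "There are no migrations prior <%= date %>"

# result_with_hash (Ruby 2.5+) exposes only the given keys as local variables.
message = ERB.new(template).result_with_hash(:date => "2014/02/16")

puts message
```

The message still has to be written in ERB syntax, so the second downside remains.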

I felt that this problem could be solved in a more elegant way. I recalled that I18n has the same problem, and it offers an easy interface for inserting custom data in the following format:

"There are no migrations prior %{date}"

After opening the source code (thanks, Bundler, for making that possible) I was even more surprised to find that Ruby can do this out of the box via the String#% method. In case you don’t know, Ruby strings have a % method which formats variables and inserts them into a string. For example:

"%d - %f" % [10.0, 10.0] # => "10 - 10.000000"

But if the argument is a hash instead of an array, you can inject a value by its key:

"There are no migrations prior %{date}" % { :date => "2014/02/16" }

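You can also use "%<date>s", which produces the same string here and additionally lets you apply a format type. The Python-style "%(date)", however, is not supported in Ruby. The difference shows up in the errors:

  • "%(date)" raises ArgumentError with “malformed format string - %(”, whatever the hash contains
  • "%{date}", in my opinion, raises a more informative KeyError with “key{date} not found” when the hash doesn’t have the mentioned key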
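Putting this together, a message lookup in the spirit of Squasher#tell could be sketched as follows. MESSAGES and format_message are hypothetical names used only for this illustration, not squasher’s actual code, and the JSON is inlined here instead of being read from a file.

```ruby
require 'json'

# Hypothetical message store; the post keeps messages in an external JSON file.
MESSAGES = JSON.parse('{"no_migrations": "There are no migrations prior %{date}"}')

# Hypothetical helper: finds a message by key and fills in the named placeholders.
def format_message(key, params = {})
  template = MESSAGES.fetch(key.to_s)
  # String#% with a hash looks placeholders up by symbol keys.
  template % params
end

formatted = format_message(:no_migrations, :date => "2014/02/16")
puts formatted

# A missing key surfaces as KeyError instead of silently printing a broken message.
missing_key_error =
  begin
    format_message(:no_migrations)
  rescue KeyError => e
    e.message
  end
puts missing_key_error
```

The JSON file now contains no Ruby code at all, only named placeholders.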

Sum in Rails

Ruby has a rich set of methods that work with arrays. #each, #select and #detect make your code smaller and more readable. With Rails you get some extra handy array methods. One of these methods is #sum. Actually there are two implementations of #sum. The first version is injected into the Enumerable module.

[3, 2, 4, 2].sum # => 11

Like many other Enumerable methods, it accepts a block that is applied to each item before the results are added up.

[2, 3, 4, 5].sum { |number| number ** 2  } # => 54
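To make the block form concrete, the sketch below compares it with the equivalent map-then-add chain. It assumes Ruby 2.4 or newer, where #sum is also built into Ruby itself; on older Rubies the same call needs ActiveSupport loaded.

```ruby
numbers = [2, 3, 4, 5]

# Block form: each item is transformed, then the results are added up.
with_block = numbers.sum { |number| number ** 2 }

# The same calculation spelled out with map and inject.
spelled_out = numbers.map { |number| number ** 2 }.inject(0) { |total, square| total + square }

puts with_block   # 54
puts spelled_out  # 54
```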

The other version of #sum is injected into ActiveRecord::Relation.

Song.where(:genre => "Blue").sum(:cost)

In this case the code does not iterate through each object and add up the #cost attribute. It just makes an SQL request to the database and returns the value calculated by the database. But if a block is given, the behavior changes. ActiveRecord loads all matching records and iterates through the collection to aggregate the desired value.

Song.where(:genre => "Blue").sum(&:cost)

The example above returns the same result, but the system spends much more time. So the first conclusion: always try to move all calculations to SQL and use sum with a block only for logic which can’t be easily transferred to SQL. But it isn’t always true. Let’s review the next example. Suppose we need to display all blues songs in a table, and the last row should show the total cost of all the songs.

<% songs.each do |song| %>
  <tr>
    <td><%= song.name %></td>
    <td><%= number_to_currency song.cost %></td>
  </tr>
<% end %>
<tr>
  <td>Total cost:</td>
  <td><%= songs.sum(:cost) %></td>
</tr>

This code loads all matching records when it executes #each and makes an extra request to the database when it executes #sum. But if we’ve already loaded the records from the database, why should we make another request to our storage? The code already has the needed information. So in this situation it is better to use #sum with a block.

<% songs.each do |song| %>
  <tr>
    <td><%= song.name %></td>
    <td><%= number_to_currency song.cost %></td>
  </tr>
<% end %>
<tr>
  <td>Total cost:</td>
  <td><%= songs.sum(&:cost) %></td>
</tr>
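The difference in database round trips can be illustrated without Rails. The sketch below uses FakeSongs, a hypothetical stand-in that only counts simulated queries; it is not ActiveRecord and merely mimics the two #sum behaviors described in this post. It assumes Ruby 2.4 or newer for Array#sum.

```ruby
Song = Struct.new(:name, :cost)

# Hypothetical stand-in for a relation: counts simulated database queries.
class FakeSongs
  include Enumerable
  attr_reader :queries

  def initialize(rows)
    @rows = rows
    @loaded = nil
    @queries = 0
  end

  # Iterating loads the records once and reuses them afterwards.
  def each(&block)
    load_records.each(&block)
  end

  # With a column name: simulates a separate SELECT SUM(...) query.
  # With a block: adds up the records that are already in memory.
  def sum(column = nil, &block)
    if block
      load_records.sum(&block)
    else
      @queries += 1
      @rows.sum { |row| row[column] }
    end
  end

  private

  def load_records
    return @loaded if @loaded
    @queries += 1
    @loaded = @rows.dup
  end
end

rows = [Song.new("First", 2), Song.new("Second", 3)]

with_sql = FakeSongs.new(rows)
with_sql.each { |song| song.name }    # renders the table rows
sql_total = with_sql.sum(:cost)       # extra simulated query

in_memory = FakeSongs.new(rows)
in_memory.each { |song| song.name }   # renders the table rows
memory_total = in_memory.sum(&:cost)  # reuses the loaded records

puts "sum(:cost):  total #{sql_total}, queries #{with_sql.queries}"
puts "sum(&:cost): total #{memory_total}, queries #{in_memory.queries}"
```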

The final conclusion of this post is:

  • if you need only the summarized value of some column in a table, use #sum without a block.
  • if you also need information from other columns, use #sum with a block.

Fast Rails Tests With Factories

Ruby on Rails, as a part of the Ruby ecosystem, has poor performance. I won’t touch the subject of “Rails is slow! Let’s use Go, Erlang, Node.js, etc.” I know it and continue to use Rails, all because of its rich infrastructure, fast development process, and beautiful and flexible syntax. So every Ruby developer should take care of optimizing their code. The most popular examples:

  • use symbols and numbers instead of strings as keys for hashes
  • use interpolation instead of concatenation in operations with strings
  • use the #pluck method instead of scope.map { |record| record.column_name } # ActiveRecord >= 3.2
  • use eager loading where possible
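As a rough illustration of the second point, the sketch below builds the same string both ways and times them with the standard Benchmark module. The strings and the iteration count are arbitrary choices for this example, and the timings depend on the Ruby version and the machine, so treat the output as indicative only.

```ruby
require 'benchmark'

name = "squasher"
iterations = 100_000

# Concatenation creates intermediate strings on every + call.
concatenated = "Hello, " + name + "! You have " + 3.to_s + " messages."
# Interpolation produces the final string directly.
interpolated = "Hello, #{name}! You have #{3} messages."

concat_time = Benchmark.realtime do
  iterations.times { "Hello, " + name + "! You have " + 3.to_s + " messages." }
end

interp_time = Benchmark.realtime do
  iterations.times { "Hello, #{name}! You have #{3} messages." }
end

puts "same result: #{concatenated == interpolated}"
puts "concatenation: %.4fs, interpolation: %.4fs" % [concat_time, interp_time]
```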

But there are a few places where we cannot easily fix poor performance. Until recently, one of them for me was the test suite. Unfortunately, medium and large projects written in Rails have test suites that run for at least 2 minutes. 2 minutes is a long time, so many developers don’t run tests before each commit or even before a pull request. This leads to two situations:

  • passable. The team starts using CI servers. I’m not saying that using CI is a bad idea. I think every project should have it. But CI gives its result with a delay. The size of the delay depends on the number of available CI servers. If your company, like GitHub, gives every developer a CI server, I’m glad for you. But more often there are one or two CI servers configured for the main branches (master, production, development). That’s why CI will run your changes only when you merge them into a watched branch. So the merge of a new feature is frequently followed by a few “bugfix” commits. It makes the history not as clean as it could be. And if you work with other developers on the project, you cannot easily rewrite your commits with a --force push. But if your tests were run before the merge, you prevent this situation.

  • bad. The team runs the test suite only from time to time. Tests become outdated and worthless. So only the developers themselves guarantee that the code is fine.

That’s why you should try to keep your tests on a diet. When the running time of the test suite becomes too long, it’s time to review your code base. Quite often the reason for a slow test suite is a bad implementation of the tested code. But it isn’t always true. Sometimes the time spent on setting up the test conditions is much higher than the execution time of the tested code.

A test suite adds time overhead for loading the code’s dependencies and creating test data. The issue with loading dependent code may be solved with a properly organized Gemfile or by preloading the environment.

Optimizing the creation of test data is the topic of this post. My suggestions are applicable only to tests which use factories. In the code examples below I will use factory_girl. Before we begin, I suggest you read this nice post about how to organize your factories well. Let’s suppose we have the following models and factories:

#app/models/author.rb
class Author < ActiveRecord::Base
  has_one :profile, :dependent => :delete
  has_many :articles, :dependent => :delete_all

  has_secure_password

  validates :login, :password, :presence => true
  validates :login, :uniqueness => true
end

#app/models/article.rb
class Article < ActiveRecord::Base
  belongs_to :author

  validates :author, :presence => true
  validates :title, :uniqueness => true, :presence => true

  def self.recent_titles
    order("created_at DESC").limit(3).pluck(:title)
  end

  def can_edit?(author)
    self.author == author
  end

  def tags
    title.split(' ').map(&:downcase)
  end
end

#spec/factories.rb
FactoryGirl.define do
  factory :author do
    sequence(:login) { |i| "author_#{ i }" }
    password "123456"

    association :profile
  end

  factory :article do
    sequence(:title) { |i| "title_#{ i }" }
    body "Article body"

    association :author
  end
end

1. Use build where possible

Quite often when I review a new project and its tests I see the following:

it '#tags' do
  article = FactoryGirl.create(:article, :title => "One two three")

  expect(article.tags).to eq(['one', 'two', 'three'])
end

As a result, the spec example runs slower than it should. And if you replace create with build, nothing bad will happen:

it '#tags' do
  article = FactoryGirl.build(:article, :title => "One two three")

  expect(article.tags).to eq(['one', 'two', 'three'])
end

If code like the above doesn’t care about persistence in the database, why should we store the record? Another example may look like this:

it '#can_edit?' do
  author = FactoryGirl.create(:author)
  article = FactoryGirl.create(:article, :author => author)

  expect(article.can_edit?(author)).to eq(true)
end

Once again, what changes if we build the author and the article and then join them? The faster alternative:

it '#can_edit?' do
  author = FactoryGirl.build(:author)
  article = FactoryGirl.build(:article, :author => author)

  expect(article.can_edit?(author)).to eq(true)
end

2. Light factories

Let’s come back to our #tags spec. On every .build or .create, the factory of the Article model saves related associations to the database first. But the #tags example needs only a title and nothing more. The simplest solution to the problem is to manually nullify all associations:

it '#tags' do
  article = FactoryGirl.build(:article, :author => nil, :title => "One two three")

  expect(article.tags).to eq(['one', 'two', 'three'])
end

But if you have a lot of tests, you will end up with a mess of association resets. Also, every time you add or remove an association in the Article model you have to go through all these tests and update them. Too much work for me. A more flexible way is to unleash the power of traits. Define a trait that clears all associations in your factory. For example:

factory :article do
  ...
  trait :pure do
    author nil
  end
end

The spec then requests the trait when building the article:

it '#tags' do
  article = FactoryGirl.build(:article, :pure, :title => "One two three")

  expect(article.tags).to eq(['one', 'two', 'three'])
end

The mess with clearing associations is gone. And you are always sure that the test doesn’t generate unneeded data. I also recommend defining a :pure trait for every model and always building records with it. Even if a model doesn’t have associations now, it may have them a few months later. So you protect your tests from slowdowns in the future.

3. Increase speed when storing records in a database

Any project has tests where it is impossible to avoid the database. For example:

it '#recent_titles' do
  create(:article, :title => "One")
  create(:article, :title => "Two")
  create(:article, :title => "Three")
  create(:article, :title => "Four")

  expect(Article.recent_titles).to eq(["Four", "Three", "Two"])
end

So the setup of the test data makes 20 requests to the database: 4 * (author uniqueness validation + author insert + profile insert + article uniqueness validation + article insert). After seeing this, I often want to write something like this:

ActiveRecord::Base.connection.execute("insert into articles ((..., 'One', ..), (..., 'Two', ..)...)")

But this brings a new problem: converting data to SQL format. Myron Marston wrote a post in the forum saying that his current project uses Sequel to solve this issue. But I don’t want to add another big dependency just for tests. That’s why a few months ago I invented the following helper method:

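def insert(*args)
  # Build the record from the factory arguments
  record = FactoryGirl.build(*args)
  # Override run_callbacks on this single instance: skip the callbacks
  # and only execute the wrapped block (e.g. the actual save)
  def record.run_callbacks(*args, &block)
    if block_given?
      block.arity.zero? ? yield : yield(self)
    end
  end
  record.save!
  record
end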

What does the helper do? First, it builds an instance with the given arguments. Then it turns off all callbacks & validations for the built instance and saves it. This allows skipping the creation of unneeded associations and the validation requests to the database. So the new version will look like:

it '#recent_titles' do
  insert(:article, :pure, :title => "One")
  insert(:article, :pure, :title => "Two")
  insert(:article, :pure, :title => "Three")
  insert(:article, :pure, :title => "Four")

  expect(Article.recent_titles).to eq(["Four", "Three", "Two"])
end

As a result, to generate the needed data we make just 4 plain SQL inserts to the database without losing the power of FactoryGirl & ActiveRecord.

But you should always remember that a record created this way doesn’t run callbacks & validations.
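To see what that means in practice, here is the same singleton-method trick in isolation. TinyRecord is a hypothetical plain-Ruby class written for this sketch; it only loosely imitates how a save wraps itself in run_callbacks and is not ActiveRecord.

```ruby
# Hypothetical minimal model: save! wraps the insert in run_callbacks.
class TinyRecord
  attr_reader :log

  def initialize
    @log = []
  end

  # Records that the hook fired, then runs the wrapped block.
  def run_callbacks(kind)
    @log << "before_#{kind}"
    yield if block_given?
  end

  def save!
    run_callbacks(:save) { @log << "insert" }
    true
  end
end

normal = TinyRecord.new
normal.save!

fast = TinyRecord.new
# Singleton method: only this one object skips the hooks.
def fast.run_callbacks(*args, &block)
  block.call if block
end
fast.save!

puts normal.log.inspect  # hook and insert
puts fast.log.inspect    # insert only
```

Other instances of the class keep their callbacks, which is why the override is safe to use per record inside a spec.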

With great power comes great responsibility.

That’s all. For lazy Rails developers I added a gist for RSpec. Just put it into your spec/support folder and you have all this stuff.