Rails 2.3 upgrade gotcha: aggregate methods are now OrderedHash
Courtenay : May 8th, 2009
In our open source ticket tracker xtt, we have code that looks like this:
has_many :memberships do
def contexts
proxy_owner.memberships.sort.group_by &:context
end
end
Basically, it lets people group their project memberships by a context, so each user can have their own grouping for projects. If you don't have a context for a project, it shows that last, in an implicit 'etc' context. In the tests, we check for this behavior like this:
it "leaves nil context for last" do
@contexts = @user.memberships.contexts
@contexts.last.should == [nil, [memberships(:default)]]
end
However, since rails now returns aggregated association results (such as group_by or total with :group) as an ActiveSupport::OrderedHash, you don't get all those tasty array methods that Rails is so famous for (#last, #second etc).
ActiveSupport::OrderedHash is just some code that duplicates Ruby 1.9's hash functionality.
The easy solution here is to make your results into an array at some point, rewrite your tests, or send a patch to the rails team to mix in the Array monkeypatched code into OrderedHash.
it "leaves nil context for last" do
@contexts = @user.memberships.contexts
@contexts.to_a.last.should == [nil, [memberships(:default)]]
end
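To see the same thing outside Rails, here is a minimal plain-Ruby sketch. It assumes Ruby 1.9 or later, where a plain Hash keeps insertion order much like ActiveSupport::OrderedHash does; the Membership struct and the sample data are made up for illustration and are not xtt code.

```ruby
# Hypothetical stand-in for the Membership model.
Membership = Struct.new(:project, :context)

memberships = [
  Membership.new("xtt", "work"),
  Membership.new("blog", nil),
  Membership.new("splam", "fun")
]

# Sort so that memberships without a context come last, then group.
sorted  = memberships.sort_by { |m| [m.context.nil? ? 1 : 0, m.context.to_s] }
grouped = sorted.group_by(&:context)

# A hash is Enumerable, so #first works, but there is no Hash#last.
has_last = grouped.respond_to?(:last)

# Converting to an array of [key, values] pairs brings #last back.
last_pair = grouped.to_a.last
puts last_pair.first.inspect
```

The grouped result is still keyed and ordered; only the array-style accessors are missing, which is why a single to_a is enough.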
CabooseConf '09
Courtenay : March 12th, 2009
Hey everyone! CabooseConf 09 is on again. For those of you who don’t know, we have been running a free anti-conference at the same time as the O’Reilly Railsconf behemoth. Last year, Chad and Gina approached me after the conference and asked if we would run it again this year, but under the banner of Railsconf itself. Keep your friends close, I guess ;)
I’m a big fan of going to conferences and attending the “hallway track”. So, this year it’s officially sanctioned. Come to Vegas, network, hack, code, and don’t pay a cent. We have conference rooms staked out at the Railsconf hotel, and we’re doing our best to organize similar perks to last year (free energy drinks, etc.).
You do have to sign up at the O’Reilly railsconf site (http://en.oreilly.com/rails2009/) and get a badge. Did I mention it’s free? One thing you need to do – as usual – is go write some open source code! No, you can’t use something you did two months ago. No, you can’t use some code on some PHP project. This is railsconf, so go show it like you mean it, and contribute something to rails. We won’t let you in unless you do. Seriously. (*Merb also accepted)
We hope to be hosting all the hackfests, so bring your open source ruby codes.
Allowing custom CSS in your app
Courtenay : December 31st, 2008
There are a number of good reasons why you don't want your users providing their own CSS (for example, when theming their site). These are: taste (see: myspace) and security.
The former is pretty much your users' problem. The pages don't have to look terrible -- and in fact Myspace charges a LOT of money to do those custom movie or band pages (it's part of the service when you buy their primo ad space).
The latter, well, as it turns out there are a bunch of security vulnerabilities exposed in CSS. These are mainly in IE, related to expressions (you can run javascript from your CSS), which means that users can steal others' sessions. So, while there are some excellent perl libraries out there for this, there hasn't been one for ruby -- until now! (at least that I could find).
So, here's my first attempt.
css_file_sanitize (github)
I stole most of the tests from LiveJournal's css sanitizing library, and rewrote the implementation in Ruby. I'd love to hear your collective feedback. It's a really lazy plugin; in fact, while it does have tests, you're best to just include the module in your model. This is a case of "it works on my machine" so send your patches!
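To give a feel for the kind of thing such a sanitizer has to catch, here is a rough sketch that only flags suspicious declarations. It is not the plugin's API: the method name, the pattern list, and the labels are all assumptions for illustration, and the list is far from complete. A real sanitizer should whitelist known-safe properties rather than blacklist bad ones.

```ruby
# Illustrative, incomplete list of constructs that can execute script from CSS.
RISKY_CSS_PATTERNS = {
  "IE expression()"     => /expression\s*\(/i,
  "javascript: URL"     => /javascript\s*:/i,
  "IE behavior"         => /behavior\s*:/i,
  "Mozilla XBL binding" => /-moz-binding/i
}.freeze

# Hypothetical helper: returns [label, declaration] pairs for anything risky.
def risky_css_declarations(css)
  pieces = css.split(/[;{}]/).map(&:strip).reject(&:empty?)
  pieces.each_with_object([]) do |declaration, found|
    RISKY_CSS_PATTERNS.each do |label, pattern|
      found << [label, declaration] if declaration =~ pattern
    end
  end
end

user_css = "a { color: red; width: expression(alert(1)); background: url(javascript:alert(2)) }"
flagged  = risky_css_declarations(user_css)
flagged.each { |label, declaration| puts "#{label}: #{declaration}" }
```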

A plam for splam
Courtenay : November 24th, 2008
A few weeks ago I wrote a fun plugin to fight spammers everywhere; I call him, Splam. I thought I wrote this up somewhere, but I can’t seem to find the article. So, I must have dreamed it. I soft-launched it on Github, so those of you following my github profile will have seen some commits.
Splam is a “Simple, pluggable, easily customizable score-based spam filter plugin for Ruby-based applications”. I couldn’t find any other Ruby projects outside of Defensio and Akismet, both hosted services, so while you might say, “but those work perfectly well!”, you can run this locally and get instant feedback. Install it as a plugin, and include it into your ActiveRecord (or other PORO) like so:
class Comment < ActiveRecord::Base
include Splam
splammable :body
end
Easy, right? Splam works by looking at the field, and applying a set of rules. Some of these rules are pretty simple; most forum/comment spam is pretty simple, too. For example, it looks for words like "porn" or "erotic" or "viagra", and gives 10 points for each of these. Then it looks in links in the body, and gives another 20 points each time a word appears in the link text. (Actually, I modified the code so it gives 10^ rather than 10*. That means that each time you use a banned word, it's exponentially more likely it's spam).
Some of the other rules target the idiots who try to spam your Ruby forum with bbcode: [url= or [b]. Then it gives you points for each Chinese character, and more points for Russian glyphs. It looks for bad HTML (a href=http://...) as well as extra long lines, sentences with lots of words and no punctuation, too many words with lots of letters, and so on.
Splam gives you a spam? boolean, a splammable_score score, and the splam_reasons why it marked something as spam, along with the points for each infraction.
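For anyone who hasn't seen a score-based filter before, here is a toy version of the idea in plain Ruby. It is not Splam's code or API: the class name, the rules, the weights, and the 100-point threshold are all made up for this sketch, and it uses a simple linear score rather than the exponential one mentioned above.

```ruby
# Toy score-based spam filter; every rule and weight here is illustrative.
class ToySpamScorer
  BAD_WORDS = %w[viagra porn erotic].freeze
  THRESHOLD = 100

  attr_reader :reasons

  def initialize(text)
    @text = text.to_s
    @reasons = []
  end

  # Runs every rule and returns the total number of points.
  def score
    @reasons = []
    bad_words_rule
    bbcode_rule
    long_line_rule
    @reasons.inject(0) { |sum, (_, points)| sum + points }
  end

  def spam?
    score >= THRESHOLD
  end

  private

  # Records a reason only when the rule actually fired.
  def add(reason, points)
    @reasons << [reason, points] if points > 0
  end

  def bad_words_rule
    BAD_WORDS.each do |word|
      hits = @text.downcase.scan(word).size
      add("bad word '#{word}' x#{hits}", 10 * hits)
    end
  end

  def bbcode_rule
    hits = @text.scan(/\[url=|\[b\]/i).size
    add("bbcode x#{hits}", 40 * hits)
  end

  def long_line_rule
    hits = @text.lines.count { |line| line.length > 200 }
    add("very long line x#{hits}", 20 * hits)
  end
end

ham  = ToySpamScorer.new("My deploy fails with a timeout, can you help?")
spam = ToySpamScorer.new("[url=http://x.example]cheap viagra[/url] [b]viagra[/b] porn")
puts ham.spam?
puts spam.spam?
spam.reasons.each { |reason, points| puts "#{points}: #{reason}" }
```

Keeping each rule in its own small method is what makes this style easy to tweak by hand: a false positive usually means adjusting one weight or adding one rule.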
I originally wrote Splam for the new app, Tender Support we've been building at entp -- I've been tweaking it until it gets zero false positives against Defensio, Akismet AND our sturdy human spam checker, Will the Defender, against the complete set of support/help requests in the Lighthouse project. Interestingly enough, I had to add a set of "good words" which takes spam points away (things related to our business).
In this way, you can see splam as this horrible manual system with no training ability outside the code. It's an arms race, but I think we're not up against a particularly clever enemy[1]. I'd really like to add some clever bayesian magic to it, but since it works well enough for me right now, I'm gonna throw it down to you guys. I'd also like to make the points themselves a percentage rating (adding % chance that it's spam) rather than an absolute (>100 points, and it’s spam).
Splam has a test suite, so you can check it out, put some of your corpus in individual text files in test/fixtures/comment, and send me a diff or pull request with anything that gets incorrectly marked as spam or ham.
Splam is at http://github.com/courtenay/splam/tree/master.
[1] This isn’t really meant to be a challenge to spammers; interestingly enough, most of the spam we get is just mass-blasted crap that isn’t really targeted at all. We were talking in the company campfire about how you could really make a bunch of money out of “spam”; if you’re intent on selling shady drugs, peddling nootropics to programmers would work better than sex enhancers, HGH to competitive cyclists, and so on. You could probably build some clever markov chains to interact on forums, leading people back to your own site where you start the pitch.
new plugin: acts_as_git
Courtenay : November 14th, 2008
With the help of Jamie van Dyke at Parfait and Scott Chacon at GitHub, I'm pleased to announce Acts As Git (no, I don't like the name either). It's a simple plugin which stores all changes you make to a text field in a git repository. This is ideal for something like a git-backed wiki.
Look at it here: github or check it out from
git://github.com/courtenay/acts_like_git.git
From the README:
ALG automagically saves the history of a given text or string field. It sits over the top of an ActiveRecord model; after a value is committed to the database, the plugin writes the new value to a text file and commits it to a git repository. This way you get all the advantages of using Git as version-control.
Usage:
class Post < ActiveRecord::Base
versioning(:title) do |version|
version.repository = '/home/git/repositories/postal.git'
version.message = lambda { |post| "Committed by #{post.author.name}" }
end
end
To view the complete list of changes:
>> @post = Post.find 15
<Post:15>
>> @post.title
=> 'Freddy'
>> @post.history(:title)
=> ['Joe', 'Frank', 'Freddy']
>> @post.log
=> ['bfec2f69e270d2d02de4e8c7a4eb2bd0f132bdbb', '643deb45c12982dde75ba71657792a2dbdda83e6',
'1ce6c7368219db7698f4acc3417e656510b4138d']
>> @post.revert_to '1ce6c7368219db7698f4acc3417e656510b4138d'
>> @post.title
=> 'Joe'
It uses the excellent Grit library, and doesn't actually have a checked-out repository. The latest version of your data is still stored in the database. You can actually clone this repo and view the changes; pushing back to it won't do anything useful.
Plugin configuration style?
Courtenay : November 10th, 2008
I’m putting the final touches on a super-sweet versioning plugin, and I’ve discovered that we’re using several different metaphors for configuring the plugin options. I’d like to get some opinions/feedback on your preferred style.
The DSL
Using a DSL and passing blocks in which get instance evalled. I’m normally very scathing of DSLs; I think that they’re Yet Another Language for people to learn to use – it’s usually your very own write-only syntax – but it’s been super-fun implementing the backend to this.
class Monkey < ActiveRecord::Base
versioning do
author do
name { user.current.name }
message { "Committed via #{name}" }
end
repository "Joe's DataStore"
end
end
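For the curious, the backend for a block DSL like this can be quite small. The following is a rough plain-Ruby sketch of the instance_eval approach, not the plugin's actual implementation; the VersioningConfig and AuthorConfig class names and the settings hash are assumptions made up for the example.

```ruby
# Collects author-related settings; blocks are stored, not run, so they can
# be evaluated later against a real record.
class AuthorConfig
  attr_reader :settings

  def initialize(&block)
    @settings = {}
    instance_eval(&block) if block
  end

  def name(&block)
    @settings[:name] = block
  end

  def message(&block)
    @settings[:message] = block
  end
end

# Top-level configuration object; the block passed in is evaluated with
# self set to this object, so bare calls like `repository "..."` land here.
class VersioningConfig
  attr_reader :settings

  def initialize(&block)
    @settings = {}
    instance_eval(&block) if block
  end

  def repository(location)
    @settings[:repository] = location
  end

  def author(&block)
    @settings[:author] = AuthorConfig.new(&block).settings
  end
end

config = VersioningConfig.new do
  author do
    name    { "courtenay" }
    message { "Committed via the DSL" }
  end
  repository "Joe's DataStore"
end

puts config.settings[:repository]
puts config.settings[:author][:name].call
```

The catch with instance_eval is that self changes inside the block, so methods and instance variables from the surrounding class are no longer reachable there, which is part of why DSLs end up feeling like a separate language.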
Hashes
This seems to be the Rails plugin default:
class Monkey < ActiveRecord::Base
versioning :author => { :name => lambda{ |u| user.current.name } }, :repository => "Joe's DataStore"
end
Class vars / methods
Easy to monkeypatch later
class Monkey < ActiveRecord::Base
will_version
@@version_repository = "Joe's DataStore"
def version_author
current_name
end
end
Are there others? Which do you prefer? Currently I’m using all three in this one plugin, and it’s very un-awesome.
Ripping out your mocks
Courtenay : November 6th, 2008
I sat down with David Chelimsky at Rubyconf today to talk about rSpec and an interesting topic came up.
In my mind, there are two reasons to use a mock object: first, when you’re developing TDD style, you physically don’t have the objects yet; and second, so that you can tightly focus your unit tests. Maybe, these two different purposes should use a different mechanism.
His question to me then was, “Do you replace your mocks with the real objects after you’ve implemented those objects?”. I guess I hadn’t thought about that before. Do you? If so, how do you handle the extra complexity, maintaining sane associations and valid data?
On hiring Rubyists and Railsers
Courtenay : November 4th, 2008
We’re launching a new service at work in the next week or so that involves me looking through a lot of job applications: resumes and sample code.
I’d like to tell people right now, upfront, if you’re applying for a Ruby or Rails job, for anyone, there are a few ways of ensuring you get called back. They’re probably fairly simple.
Send some sample code, maybe a link to a project on Github, or a snippet of work you’ve done. Make sure you send the tests for the code. Any tests would be good, and you get bonus points for good tests. If you don’t have any tests, write them.
Don’t worry too much about sending some crazy complex code. Maybe some polymorphic associations (models), some ajax (views), a knowledge of the whole stack (simple controllers), some nested resources. Write a simple todo list application.
It’s not just a silly philosophy. Writing tests – hell, submitting tests with your job application’s code – shows that you’ve actually thought about the code, and that it actually works. You’ve permutated and permeated through the logic, and actually thought about the various ramifications of the design decisions in the code itself.
Just the pure act of sending tests with your sample code will put you above 90% of applicants, I promise.
We've stopped using rSpec ...
Courtenay : November 3rd, 2008
...for new projects.
We upgraded the gems for one of our client projects, and the auto-loading / config.gems managed to completely break all our other projects, requiring upgrades, which caused weird breakages in weird places in some of the specs.
The app would refuse to deploy (rake tmp:create failed, because lib/tasks/rspec.rake was being loaded, and spec wasn't installed on the server). The annoying thing was that just having whatever.11 installed (I don't know the exact version) broke older apps on whatever.4 or whatever.0.2, so those had to be upgraded too. We wasted a day or two (three, maybe four developers) which equates to several thousand dollars in wastage. It was also really infuriating -- the culmination of a few years of frustration with rSpec's weirdnesses.
After that, I found that some of the specs had never run (who knows why). It stopped reading spec.opts and started doing some weirdness with pending options. Finally, Rick just snapped, threw out rSpec and his Model Stubbing library, and now we're playing with a combination of rr, context, and matchy, trying to get a feel for a decent workflow again. It's sad and maybe a bit exciting to be on the edge.
What are you testing with?
The awesomest filter and sort ever
Courtenay : August 26th, 2008
Update 2: seems like only one or two people knew about what can_search does :) I hope we’re all a little better educated.
Update: yes, I’m using these named scopes throughout the app in other places – they aren’t used only in this one controller.
Often you have an index action where you want to sort records, filter by a parameter, and maybe join on some other tables to get a result.
Let’s say you’re looking at a videos controller (where videos are acts_as_taggable) and you want to filter by user_id, filter by tag name, order by video title, or rating.
Maybe later, you’ll add a roles (hm:t) association and need to only show videos viewable by a certain user. How complex!
To solve this, we’re going to play with some things you may know, and finish up with a bam! pow! that’ll take your breath away.
Rather than build up some form of frankenquery with all sorts of conditionals and cases, joins, and other messing about, let’s use a brand-new bleeding edge feature of Rails: named scopes.
First, build up individual named scopes for each axis on which you wish to filter. Make sure to put the table name in each query.
named_scope :by_user, lambda { |user_id|
{ :conditions => ['videos.user_id = ?', user_id] }
}
named_scope :tag_name, lambda { |tag_name|
{ :joins => { :taggable => :tag },
:conditions => ['tags.name = ?', tag_name] }
}
named_scope :rating, lambda { |rating|
{ :conditions => ['ratings_count > ?', rating] }
}
OK, I cheated on the last one, but let’s assume you have a counter_cache on ratings count.
Now, if you have more than one scope with joins in it, you’ll need to apply this patch to your rails installation, or upgrade past 2.1.1. This will allow you to have as many joins as you like in your scopes.
Now, here’s where the magic happens: in the controller. Big shout out to protocool for this method.
Let’s build up a set of all the possible scopes that we might want to use, in an array form like [ named_scope, argument ]
def index
scopes = []
scopes << [ :by_user, params[:user_id] ] if params[:user_id]
scopes << [ :tag_name, params[:tag_name] ] if params[:tag_name]
scopes << [ :rating, params[:rating] ] if params[:rating]
end
Easy, right? Very readable.
How about some ordering?
order = { 'name' => 'videos.name ASC' }[params[:order]] || 'videos.id DESC'
Now, as you know, you can chain named scopes. So you could say Video.by_user(2).tag_name('monkeys'). Let's take advantage of this, building up a chain of scopes dynamically using 'inject', starting from Video, and adding each scope we added to the array above. This is really fun magic, because it doesn't run any of the queries until the whole thing is built. I don't even know how this works, but it does. Swimmingly.
@videos = scopes.inject(Video) {|m,v| m.scopes[v[0]].call(m, v[1]) }.paginate(:all, :order => order)
The final method looks like this:
def index
scopes = []
scopes << [ :by_user, params[:user_id] ] if params[:user_id]
scopes << [ :tag_name, params[:tag_name] ] if params[:tag_name]
scopes << [ :rating, params[:rating] ] if params[:rating]
order = { 'name' => 'videos.name ASC' }[params[:order]] || 'videos.id DESC'
@videos = scopes.inject(Video) {|m,v| m.scopes[v[0]].call(m, v[1]) }.paginate(:all, :order => order, :page => params[:page])
end
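If you want to play with the inject trick without a database, here is a plain-Ruby sketch of the same shape. It is only an analogy: VideoQuery is a made-up chainable object standing in for named scopes, it filters an in-memory array eagerly instead of building a lazy SQL query, and it uses public_send instead of the scopes hash. The ORDERS whitelist mirrors the order lookup in the controller.

```ruby
Video = Struct.new(:title, :user_id, :rating)

# Tiny chainable query object standing in for ActiveRecord named scopes.
class VideoQuery
  def initialize(videos)
    @videos = videos
  end

  def by_user(user_id)
    VideoQuery.new(@videos.select { |v| v.user_id == user_id.to_i })
  end

  def rating(minimum)
    VideoQuery.new(@videos.select { |v| v.rating > minimum.to_i })
  end

  def ordered(key)
    VideoQuery.new(@videos.sort_by(&key))
  end

  def to_a
    @videos
  end
end

# Whitelist of sort keys, so user input never reaches the sort directly.
ORDERS = { 'name' => :title, 'rating' => :rating }.freeze

def filter_videos(all_videos, params)
  # Collect [scope_name, argument] pairs, exactly like the controller does.
  scopes = []
  scopes << [:by_user, params[:user_id]] if params[:user_id]
  scopes << [:rating,  params[:rating]]  if params[:rating]
  scopes << [:ordered, ORDERS[params[:order]] || :title]

  # Apply each pair in turn, threading the query object through inject.
  scopes.inject(VideoQuery.new(all_videos)) do |query, (name, argument)|
    query.public_send(name, argument)
  end.to_a
end

videos = [
  Video.new("Zebra", 2, 5),
  Video.new("Apple", 2, 1),
  Video.new("Mango", 3, 4),
  Video.new("Birds", 2, 4)
]

result = filter_videos(videos, :user_id => "2", :rating => "3", :order => "name")
puts result.map(&:title).inspect
```

Because the scope names in the array are chosen by your code and never taken from params, the dynamic dispatch stays safe; only the arguments come from the request.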
One final caveat. Sometimes :joins doesn’t know where to get the video id from, so if you’re using id in your app, you’ll need a slight workaround involving manually getting the pagination count, and forcing :select => ‘distinct videos.*’ in the paginate call.
If this works for you, it’s really easy to add new filtering, ordering, or even scoping to your query. For example, you can add some form of role hackery to your video
named_scope :viewable_by, lambda { |user|
{ :joins => { :permissions => :roles },
:conditions => [ "roles.user_id = ? AND permissions.role = ?", user.id, "view" ] }
}
In the controller, you replace the first scope definition with this
scopes = [ [ :viewable_by, current_user ] ]
Or, you modify the scope inject statement
@videos = scopes.inject(Video.viewable_by(current_user)) { |m,v| ... }
If you consider this a giant hack, you’re probably at least partly right. However, the alternative in building up a complex query with many possible moving parts is just hideous. And consider this: you can unit test each part of the query on its own, in the model specs.
Sanitize your users' HTML input
Courtenay : August 25th, 2008
The default Rails sanitize helper is actually quite powerful. You can see some of its usage here:
<%= sanitize @article.body, :tags => %w(table tr td), :attributes => %w(id class style) %>
However, as the docs say,
Please note that sanitizing user-provided text does not
guarantee that the resulting markup is valid.
We were having an issue with users providing bad markup and leaving their tags unclosed.
This is <a href="http://foo.com">my dog<a/> and he’s super cool!
We solved it by running Hpricot over their input.
before_save :clean_html
def clean_html
self.body = Hpricot(body).to_html
end
For performance reasons, you should probably run the hpricot and sanitize methods on the way into the database, rather than rendering it in the views, because it’s somewhat slow, and is a calculation that you only need to perform once.
In fact, instead of saving it in a callback, you could overload the accessor like so:
def body=(new_body)
write_attribute :body, Hpricot(new_body).to_html
end
You’ll want to include the ActionView methods from ActionView::Helpers::SanitizeHelper to get ‘sanitize’ available in your model.
Authenticate like SSO with ActiveResource
Courtenay : July 18th, 2008
When you have multiple Rails applications that don’t share a common database and you want to share the user authentication information – or rather, use one app to provide authentication for another – there are a few options. Here’s how I solved it recently. This is the simplest way I could think of to get this working. I couldn’t find a plugin to do this, so here’s the result of my pdi.
Effectively what we’re doing is separating the user’s data - their profile info, if you like - from the credentials, and moving the latter to ActiveResource. This is something you should do in your own apps. Too frequently we stuff a bunch of data (like full name, phone number) into the user model, because it’s there. A more advanced version of this code might use the ‘profile’ as the resource name, updating the local profile with data from remote, and keeping User as a pure credential model.
Let’s assume we have App A which will act as the authenticator master. Our other application, App B, will still hold a User record, but we’ll override the authenticate method to use ActiveResource. We’ll also store some other fields like username and email, and will grab those each time the user logs in. That way, they can set an auth token in App A and they can login from cookies in app B (provided the cookie domain is shared).
class User < ActiveRecord::Base
class Auth < ActiveResource::Base
self.site = "http://app-a.com"
self.format = :json
self.element_name = 'user' # this is the name of the resource in your app
end
def self.authenticate(login, password)
Auth.user = login
Auth.password = password
# Authenticating against the app will actually 'prove' the login/pass details.
# We also want the user's details so we can cache them here.
authed = Auth.find :first, :params => { :login => login }
return false unless authed
# Now, pull the data from remote and store it locally.
user = User.find_or_initialize_by_login(login)
user.attributes = authed.attributes
user.save!
user.activate!
user
rescue ActiveResource::ClientError # 406 error -- bad username/password.
false
end
end
Interestingly enough, find first actually runs the ‘index’ action, and returns the first record. sigh
Now, in your App A: users_controller, you want to set up a filter in the index like so:
def index
if params[:login]
# for single-sign-on.
@users = User.find(:all, :conditions => { :login => params[:login] })
else
@users = User.paginate(:all, :page => params[:page]) #...
end
respond_to do |format|
format.html
format.json { render :json => @users }
end
end
Do you have a better way of doing this?
activerecord benchmarks: how fast is your system?
Courtenay : November 8th, 2007
Over a year ago we published some benchmarks on how fast your computers were running the complete ActiveRecord test suite. I consider this to be a great test for the fastest platform for developing Rails. (Let’s ignore the speed of your IDE or pseudo-IDE—this one’s all about waiting for your autotest. This probably isn’t a good indicator of server status)
It’s time to run this test again. Why? Because I’m buying a new computer, and I want to be as efficient with my money as possible. That means a macbook, rather than macbook-pro.
Check out Rails revision 8117 (trunk at this time), install sqlite if you haven’t already (macports: rb-sqlite), and run rake test_sqlite
Comment here with your platform, and the time reported. If you want to be more accurate, run it a few times. I’m not a professional statistician; don’t tell Zed Shaw about my shoddy procedure.
Factors that may influence your times: disk speed, processor speed, your ruby version, luck …?
| Who | Hardware | Rake time (sec) | OS |
| --- | --- | --- | --- |
| chrissturm | imac core2 | 18.88 | leopard |
| octopod | mbp-sr | 23.45 | tiger |
| technomancy | mbp-sr | 25.74 | ubuntu gutsy |
| defiler | mb1 | 25.772 | leopard |
| form | mb2 2.0 | 26.59 | leopard |
| courtenay | macpro 2×2.6 | 28.49 | tiger |
| mike | Athlon64/3000 | 34.63 | xp |
| courtenay | Sempron64/2600 | 57.49 | fc6 |
| courtenay | powerbook 1.5 | 92.92 | tiger |
Summary
From the looks of it, most current-level professional macs, whether laptop or desktop, run the benchmarks within 15% of each other. This probably isn’t too much of a surprise, since ActiveRecord won’t run on multiple processors; but it’s nice to know that if you’re only really doing rails on your laptop, a macbook is as good as anything out there.
The move to Intel has really helped Apple get a nice standard baseline for performance, that clearly smokes the ‘old’ PPCs.
In fact, ‘ol faithful, my previous fast-rails-box running linux on an amd-64, has dropped to very lowly status of 57 seconds. It’s time to retire my trusty powerbook. I spend more time waiting than coding.
Notes:
- mb1 : MacBook 1 (Core Duo)
- mb2 : MacBook 2 (Core 2 Duo)
- mb3 : MacBook 3 (Santa Rosa)
- mbp-sr : MacBookPro (Santa Rosa)
Premcaching, updated
Courtenay : October 10th, 2007
See my previous article on premcaching, preloading data and stuffing it into memcached in a fork.
Paul McKellar just sent this snippet to re-establish the database connection in your fork.
def fork_with_new_connection(config, klass = ActiveRecord::Base, &block)
fork do
begin
klass.establish_connection(config)
yield
ensure
klass.remove_connection
end
end
end
def fire_and_forget(&block)
config = ActiveRecord::Base.remove_connection
pid = fork_with_new_connection(config) do
begin
yield
ensure
Process.exit!
end
end
ActiveRecord::Base.establish_connection(config)
Process.detach pid
end
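If you just want to see the fork-and-detach part on its own, here is a stripped-down sketch with the database handling removed. It assumes a Unix-like system where fork is available; the temp file simply stands in for whatever the child would be stuffing into memcached.

```ruby
require 'tempfile'

# Runs the block in a forked child and returns immediately in the parent.
def fire_and_forget
  pid = fork do
    begin
      yield
    ensure
      # exit! skips at_exit handlers, so the child doesn't run the
      # parent's shutdown hooks on its way out.
      Process.exit!(true)
    end
  end
  # Detach so the child doesn't linger as a zombie; returns a watcher thread.
  Process.detach(pid)
end

file    = Tempfile.new('premcache')
watcher = fire_and_forget { File.write(file.path, "warmed") }

# Only for the demo: wait for the child so we can look at its output.
watcher.join
puts File.read(file.path)
```

In real use you would not join the watcher thread; the whole point is that the parent carries on serving the request while the child does the slow work.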
Awesome. Is anyone else using techniques like this for their crazy scaling or pagination needs?
skinny controllers, skinnier controller specs
Courtenay : August 24th, 2007
So, you're happily using mocks to remove the database from your skinny™ controller.
The code has been hacked on by about four different people and looks something like
describe CategoriesController, "showing a record" do
before do
@store = mock_model(Store, :categories => mock('categories proxy'))
@product = mock_model(Product)
@store.categories.stub!(:find_by_permalink).and_return @product
@product.stub!(:name).and_return('foo')
end
it "should show successfully" do
get :show
response.should render_template('show')
end
it "should load one record" do
@store.products.should_receive(:find_by_permalink).with('1').and_return @product
get :show
end
end
To be honest, it's pretty nasty, and with rSpec, if it feels nasty it's probably wrong. The controller is quite simple
class CategoriesController < ApplicationController
before_filter :load_store
protected
def load_store
@store = Store.find(session[:store_id])
end
public
def show
@category = @store.categories.find_by_permalink(params[:id])
end
def edit
@category = @store.categories.find_by_permalink(params[:id])
end
def update
@category = @store.categories.find_by_permalink(params[:id])
@category.update_attributes(params[:category])
end
end
Now, there are two ways of DRYing this up. They both involve a "find_category" method. The holy war is over whether you load the data in a before_filter or explicitly set @category in each action. I think the first is much cooler.
class CategoriesController < ApplicationController
before_filter :find_category, :only => [ :show, :edit, :update ]
protected
def store
@store ||= Store.find(session[:store_id])
end
def find_category
@category = store.categories.find_by_permalink(params[:id])
end
public
def show
end
def edit
end
def update
@category.update_attributes(params[:category])
end
end
In the new spec, we can do something like this:
describe CategoriesController, "showing a record" do
before do
controller.stub!(:find_store)
controller.stub!(:find_category)
controller.instance_variable_set(:@category, mock_model(Category))
end
it "should show successfully" do
get :show
response.should render_template('show')
end
it "should load one record" do
controller.should_receive(:find_category)
get :show
end
end
describe CategoriesController, "finding a record" do
before do
@categories = mock('categories proxy')
@store = mock_model(Store, :categories => @categories)
controller.stub!(:store).and_return(@store)
end
it "should find a record by permalink" do
controller.stub!(:params).and_return({ :id => '1' })
@categories.should_receive(:find_by_permalink).with('1')
controller.send(:find_category)
end
end
First, we test the "should show.." logic. Then, in a different context, we test that the "find" works as advertised.
Got a better way?