Keynoting RubyNation

Posted on May 08, 2008 by stu

I will be the closing keynote speaker at RubyNation. I'll be beating my current favorite drum: Ending Legacy Code In Our Lifetime. (And yes, the Ruby community has legacy code. If we don't change our ways soon, we will end up with a lot more of it.)

Intro to Ruby on Rails coming in May

Posted on April 07, 2008 by stu

I (Stu) will be heading to Washington, D.C. the week of May 5-9 to teach a public Introduction to Ruby on Rails course.

We'll start at the very beginning, exploring Ruby from the console shell irb. Then we'll dive into Rails 2.0. By the end of the course we will be talking about how best practice shops use Rails (hint: test and refactor!).

Rails is one important step on the path to a future without legacy code. Isn't it time to take that first step?

Frozen Gems Generator

Posted on March 06, 2008 by glenn

Jay Fields blogged recently (and not for the first time) about managing gems within Rails projects. This is a problem a lot of people have wrestled with; there are close to a dozen plugins, rake tasks, uncommitted patches, and published hacks that attempt to provide a solution (and those are just the ones I know of).

FrozenGemsGenerator is the solution that we’ve been using on some projects at Relevance, and we’re happy enough with it that we’ll be using it more. It’s a Rails generator, packaged as a gem, that gives your Rails app a private gem repository, fully self-contained, and manageable just like your system-wide repository (except using script/gem instead of gem).

$ sudo gem install frozen_gems_generator
$ script/generate frozen_gems
$ script/gem install money

script/gem supports all of the subcommands that the regular gem command does.

I haven’t yet implemented a solution for gems that install binary extensions. I’m very interested in suggestions for how best to solve that problem. Several of the other approaches have at least partial support for architecture-specific gems; the best may be Jeremy Voorhis’ CarryOn plugin, which is also the solution that’s closest in spirit to the FrozenGems approach. If you have ideas or suggestions about how architecture-specific gems should be handled, please add comments here or post them on our Trac instance.

Tarantula and TextMate

Posted on February 28, 2008 by stu

Tarantula is under active development as we use it internally to police our apps. If you grab the bits from this morning (R243 or later in the repository), you will see that stack traces in the log report now link back into TextMate.

We have a ton of features we would like to add, and I bet the community can think of plenty more. Please add comments to this post, or post into Trac, letting us know what features you would like to see next. Here are some possible choices to get you started:

  • a "Johnny Droptables" fuzzer that tries specific SQL injection attacks
  • docs detailing the kinds of errors we have been finding and how to fix them
  • an XSS fuzzer that tries to inject script tags (this is challenging because it isn't obvious how to automatically detect the symptom)
  • CSS validation
  • JS validation
  • UI features to make the reports more navigable and usable (be specific!)
  • integration with RSpec
  • blacklist of files your server should never return
  • Ajax crawling (Tarantula currently simulates plain old web requests)
  • Integration with other IDEs (you'll probably have to send us a tested patch because we're happy with TextMate)

Get your votes in today and we can look at them during open source Friday.

Tarantula vs. your Rails app

Posted on February 26, 2008 by stu

The Tarantula is a fuzzy spider. It crawls your Rails app, fuzzing inputs and analyzing what comes back. We have pointed Tarantula at about 20 Rails applications, both commercial and open source, and have never failed to uncover flaws.

How does your Rails app stand up? It's easy to find out. Install the plugin, and create a Tarantula integration test: (Update: Note that Tarantula integration tests live in test/tarantula so that you can treat them separately in your cruise builds. For a substantial app or fixture set Tarantula can take a while to run!)

 
# somewhere in your test
require 'relevance/tarantula'            

# customize to match your security setup  
def test_with_login
  post '/sessions/create', :password => 'your-pass'
  assert_response :redirect
  assert_redirected_to '/'
  follow_redirect!
  t = tarantula_crawler(self)
  t.crawl '/'
end

Then run rake tarantula:test and start looking through the Failures section of the HTML report.

Tarantula is just a baby now, but we plan to feed it until it is a lot bigger and meaner. Suggestions and contributions are welcome via the Relevance Open Source Trac.

Hat tip to Courtenay, whose SpiderTest plugin inspired me to go down this road. Also congrats to Mephisto, which is the best behaved app under Tarantula to date (only three problems, all minor broken windows).

Troubleshooting LoadErrors in Rails tests

Posted on February 08, 2008 by stu

I am proposing a patch to help cope with the dreaded Rails LoadError:

LoadError: Expected foo.rb to define Foo

In Ruby, loading code is simple: just require it. In script/console:

>> require 'account_controller'
=> ["AccountController"]

Rails extends this to magically find classes just based on their name:

>> AccountController
=> AccountController

Most of the time, if a class does not exist, you get a helpful exception:

>> AccountController
MissingSourceFile: no such file to load -- hpricot_scan

Ah, so my AccountController depends on hpricot, which isn't available for some reason. Solution: go find hpricot.

But once in a while this problem presents a different symptom:

>> AccountController
LoadError: Expected account_controller.rb to define AccountController

This is confusing, since account_controller.rb does define AccountController! Experienced Rails developers know that this cryptic message actually means "Something went wrong in application_controller.rb, but Rails swallowed the real exception." After being bitten by this on three different projects in the last two weeks, I decided to track the issue down. Turns out the problem is in how fixtures get loaded:

begin
  require_dependency file_name
rescue LoadError
  # Let's hope the developer has included it himself
end

After fixtures swallow the real MissingSourceFile for a subdependency such as hpricot, ActiveSupport raises a misleading LoadError for the original dependency (account_controller.rb) that references hpricot.

In a perfect world, I would simply have fixtures stop swallowing LoadErrors. But the comment strongly suggests that some code depends on this behavior. So weaker sauce is to at least log the problem:

def try_to_load_dependency(file_name)
  require_dependency file_name
rescue LoadError => e
  ActiveRecord::Base.logger.warn("Unable to load #{file_name}, underlying cause #{e.message} \n\n #{e.backtrace.join("\n")}")
end

If anybody knows how to distinguish the confusing LoadErrors from the expected ones, please go and improve the patch.
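The log-instead-of-swallow idea can be tried outside Rails. The following is a minimal sketch, not the patch itself: it assumes plain require in place of require_dependency and a standard-library Logger in place of ActiveRecord::Base.logger. The method name try_to_load and the library name are hypothetical.

```ruby
require 'logger'
require 'stringio'

# Hypothetical stand-in for the patched fixture helper.
# Plain `require` replaces Rails' require_dependency in this sketch.
def try_to_load(file_name, logger)
  require file_name
  true
rescue LoadError => e
  # Record the underlying cause instead of silently discarding it.
  logger.warn("Unable to load #{file_name}, underlying cause: #{e.message}")
  false
end

log_output = StringIO.new
logger = Logger.new(log_output)

# A library name that is assumed not to exist on the load path.
loaded = try_to_load('no_such_library_for_this_sketch', logger)
```

The caller still carries on after the failure, as the original fixture code does, but the log now names the file that could not be found.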

Layering and platform choice

Posted on February 04, 2008 by stu

Over the last few weeks I have repeatedly linked to Ola's post about the stable layer. I didn't take the time to go into detail, and I trusted that people (if they wanted to) could follow the link and understand what Ola was talking about.

Well, that didn't work so well. Most responders clearly did not understand Ola. A few informed me that I didn't understand Ola. :-) So I am going to make a clean break, and lay out my own argument in more detail. What follows are my views about how layered architecture affects language and platform choice. First, some ideas that are hopefully uncontroversial:

  • Good design is layered.
  • Leakage between layers should be minimal.
  • Features within a layer should be orthogonal, and should not have to be re-implemented in higher layers.
  • All kinds of programs benefit from this kind of layering, including languages, libraries, frameworks, and application code.

A small leap:

  • The lowest layers are the most important.

This might seem obvious. All other layers depend on the lower layers, so a problem at the bottom affects a lot of code. But if you are working several layers higher, problems at the bottom are part of the air you breathe. The air may smell terrible, but you are acclimated and don't notice.

A big but uncontroversial leap:

  • Java, the VM, is a good VM for the bottom layer.

This is uncontroversial because the majority has chosen. And they are right to do so: the Java VM is well-specified, widely implemented, carefully optimized, and supported by a huge array of tools.

A mistake:

  • Java, the language, is a good language for the bottom layer.

Noooo! Java is a high-ceremony language. At every turn, Java enforces a high busy-work/real-work ratio. Specifically:

  1. Java's checked exceptions bloat code, make components harder to use and maintain, and lead to tons of boilerplate code, each line of which is a bug-in-waiting.
  2. Java's new operator/constructors cannot pick a return type. The amount of code that exists only to work around this is staggering. Two entire cottage industries have sprung up to deal with this single issue: factory patterns and dependency injection.
  3. Java has no metaprogramming features to automate common tasks such as field accessors, standard constructors, and simple delegation.
  4. Primitives, functions, and classes are not first-class objects, leading to huge code bloat to deal with these types specially.
  5. Java's core reflection and interception capabilities are clunky, requiring tons of bolt-on technologies to make them workable, including AOP, annotations, and code generators.

That's a pretty big stink, but if you are used to it you probably can't smell it anymore.

The net result of these problems is that bottom layer code written in the Java language will be bloated and difficult to maintain. These problems multiply if we use the Java language for higher layers as well. What should we do?

Keeping the VM, avoiding the language

For better or worse, Java is already the bottom layer for many businesses. A complete rewrite is impossible, so we need an approach that lets us continue to use our existing Java code. There are two obvious choices:

  • Use a framework to hide Java's most glaring flaws, and continue to use the Java language for development. The most popular option here is Spring. Spring provides framework-level fixes for several problems in Java: dependency injection, unchecked exception wrappers, and a powerful AOP capability, to name a few.
  • Use a better VM language. There are lots of choices, including Clojure, JRuby, Rhino, and Groovy. All of them can interoperate nicely with existing Java code.

My opinion, based on extensive experience with both options, is that the "better VM language" approach is better than the "fix Java with a framework" approach. Smaller is better.

Some advice

In the past few weeks, I have been approached by several organizations to advise them on platform decisions. Every organization is different, but here are some guidelines to consider.

  • Your team matters far more than your language. Pick a platform at random, then take your platform analysis budget and spend it finding good team members and helping them get better.
  • Your process matters far more than your language. If your team is not delivering real business value on a regular, repeating timeframe, stop worrying about your platform and start worrying about things like estimating, agility, testing, and continuous integration.
  • The static/dynamic languages debate is a red herring. The Java language's problem is ceremony, not static typing. Use whatever combination of static and dynamic typing works for you. [1]
  • There can be more than one! The Java VM simplifies the interop and deployment story. So quit trying to decide, and try a few different JVM languages.
  • The hot new JVM languages have different syntaxes, but similar features. They all solve the problems with Java that I enumerated above. Throw a dart at the wall, pick one, and get started coding.
  • Beware "Use the right tool for the job." This is true, but useless without context, and it is becoming the weapon of choice for pundits who write no code. Be a polyglot, but also be articulate about why tool X is the right fit for job Y.
  • Stop writing plain old Java code. Groovy obsoletes plain old Java. We ought to just say "Java 7 = Groovy" and move on.

Keep an open mind. Try several approaches. Judge your choices by how easy they would be to unmake or adapt. Have fun!

Notes

[1] In the past I have had a lot to say about static/dynamic typing. I realize now that I was trying to talk about ceremony. I am still worried about the same problems, but I think I now know them by more accurate names.

Rails plugin authors on OS X, beware!

Posted on January 31, 2008 by stu

This morning I was troubleshooting a production problem with the simple_localization plugin. The code worked fine in development, had 100% passing C0 coverage in test, and worked fine in production on my local box. But on the staging box, we were getting the dreaded load error:

LoadError: Expected /simple_localization/lib/cached_lang_section_proxy.rb to define CachedLangSectionProxy

If you use Rails plugins and ever see this problem, read on...

A little background

In Ruby, you can load a Ruby source file from the load path by requiring it.

require 'my_class'

This is explicit, and easy to understand. But you might get tired of spelling things out all the time. So in Rails you can also load a class implicitly when it is needed:

MyClass

This is somewhat Java-like, in that magic happens to find the code based on some naming conventions, e.g. My::Namespaced::MyClass should be in a file named my/namespaced/my_class.rb somewhere on the load path. It is also Java-like in being difficult to debug, leading to errors like the LoadError above.
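To make the naming convention concrete, here is a rough sketch of the constant-name-to-path mapping in plain Ruby. It is an approximation written for illustration, not the ActiveSupport implementation; underscore_sketch and conventional_path are hypothetical names.

```ruby
# Approximate the CamelCase -> snake_case step of the convention.
def underscore_sketch(word)
  word.gsub(/([A-Z]+)([A-Z][a-z])/, '\1_\2')
      .gsub(/([a-z\d])([A-Z])/, '\1_\2')
      .downcase
end

# Map a (possibly namespaced) constant name to the relative file path
# that implicit loading would expect to find on the load path.
def conventional_path(constant_name)
  constant_name.split('::').map { |part| underscore_sketch(part) }.join('/') + '.rb'
end

simple_path = conventional_path('My::Namespaced::MyClass')
flat_path   = conventional_path('CachedLangSectionProxy')
nested_path = conventional_path('ArkanisDevelopment::SimpleLocalization::CachedLangSectionProxy')
```

Under this mapping a top-level constant and a namespaced constant with the same final name resolve to different paths, so the directory a file lives in matters as much as its file name.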

Workaround: ducking the issue

Knowing that the LoadError is a failed implicit load, the first step is to look at the point of failure in the file cached_lang_section_proxy. Here it is, elided for clarity:

module ArkanisDevelopment
  module SimpleLocalization
    class CachedLangSectionProxy

Ah hah, you say. The error is right on. This file doesn't define CachedLangSectionProxy, it defines CachedLangSectionProxy in the ArkanisDevelopment::SimpleLocalization module. So implicit loading can't work with the code as written. But we have a workaround: we can move this file (and probably several others) into a directory structure that matches Rails conventions. I am not going to do that, because...

Solution: getting deterministic

We can get implicit loading to work, but we still haven't tackled the real problem. Why did the code ever work on my local box to begin with? We know that implicit loading can't work, so somehow my local box must be explicitly loading the files, but in a machine-dependent way that fails on the staging box.

Rails plugins include an init.rb that runs during Rails startup, and is often used to explicitly load configuration and code. Here is that code from simple_localization:

Dir[File.dirname(__FILE__) + '/lib/*.rb'].each do |lib_file|
  require File.expand_path(lib_file)
end

This is broken, but if you develop on Mac OS X you may never notice. The plugin's internal dependencies are arranged in such a way that loading the files in alphabetical order works. In all of my experiments, Ruby's directory traversal APIs on the Mac return files in alphabetical order. However, this ordering is not required by the Ruby language. On Linux, the files can come back in any order.

Given that many Rails developers work on OS X, and deploy to Linux, this leads to an amusing variant of "It works on my box": It works on all developer boxes, and fails on all production boxes.

An easy fix is to sort the files explicitly:

Dir[File.dirname(__FILE__) + '/lib/*.rb'].sort.each do |lib_file|
  require File.expand_path(lib_file)
end

Better would be to organize init.rb so that the dependencies are clear (the fact that alphabetical order happens to work is a fragile coincidence).
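The effect of the explicit sort can be checked in isolation. This sketch creates a few empty files in a temporary directory (the file names are made up for illustration) and confirms that the sorted glob comes back in alphabetical order no matter what order the files were created in or how the filesystem lists them.

```ruby
require 'tmpdir'
require 'fileutils'

sorted_names = Dir.mktmpdir do |dir|
  # Create the files out of alphabetical order on purpose.
  %w[c_widget.rb a_base.rb b_helper.rb].each do |name|
    FileUtils.touch(File.join(dir, name))
  end

  # Same pattern as the init.rb fix: glob, then sort before using the list.
  Dir[File.join(dir, '*.rb')].sort.map { |path| File.basename(path) }
end
```

Sorting only makes the order repeatable. It does not make the order correct unless the dependencies really do happen to line up alphabetically.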

Lessons learned

  1. If you write Rails plugins on Mac OS X, be careful how you use globbing APIs in init.rb. They will work deterministically on your box, but maybe not everywhere else.
  2. If you plan for your Ruby code to be used from Rails, follow the directory and naming conventions.
  3. Loading code is and always will be tricky. Many years ago, I thought that COM had solved many of the problems. I was so enthusiastic about Java's approach that I wrote a book about it. By the time .NET came out with yet another approach, I was a bit jaded and assumed it would have problems. (It did.) It's a hard problem.

On language aesthetics

Java and Ruby both have an explicit and implicit loading story. What is interesting is that in Java this story is implemented in the language, while in Ruby a significant part of the story is in the libraries. It is Rails, not Ruby, that implements implicit loading, and you can read much of that story in this source file (updated link: with syntax highlighting). Understand this file, and you will know much of what is best and worst in Ruby.

Safe by default

Posted on January 29, 2008 by stu

I met Luke Francl at Code Freeze last week, but we only had time to speak for a minute. It was enough to know we are of like mind: security should be on by default. Luke has written a new plugin, xss_terminate. It is inspired by acts_as_sanitized, but it has stricter defaults and more options. Nice.

Rails, The Cookie Store, and Security

Posted on January 27, 2008 by jgehtland

Tobi, Bob and David are all exactly dead spot on right. The Rails cookie store works as designed, data stored there should be tamper-proof and signed, and you are indeed the poorest kind of web programmer if you are assigning strong, valuable data into a cookie. Right, right, and right again.

However, the rub here is that:

  1. Rails is an awfully popular web framework,
  2. used by all kinds of developers, from neck-bearded Unix geeks down to baby-bottom-smooth-chinned highschool geeks,
  3. for all kinds of applications.

Given that the cookie store is the default session store, and that people either accidentally or on purpose store all kinds of goop in the session (often transiently, sometimes for the length of the session), then it behooves people to have a way to default to a more secure version. That’s all the EncryptedCookieStore is for: guaranteeing that if you screw up your app, you don’t also screw up your users.

So, in order to be clear: Relevance in no way suggests that you should store anything of any value in a cookie. In fact, we’ll shake our heads in disgust and drag you out behind the woodshed if we catch you doing it, you’re darn tootin’. But if you want to make sure that you don’t accidentally reveal something through this mechanism, defaulting it away might be useful.

And I did not mean to imply that the Rails team was either negligent, ignorant, or foolish for implementing the cookie store the way they did. I understand the reasoning well; our plugin is a safeguard against accidental misuse, not willful stupidity (we hope ;-) ).

Latest Relevance Open Source: encrypted cookie store

Posted on January 27, 2008 by jgehtland

Aaron just released our latest open source project, EncryptedCookieStore. It turns out that the default cookie session store for Rails is insecure in the worst way: it is simply Base64-encoded (which is French for “plain text”). It is slightly obfuscated, giving the uninitiated a false sense of security, but anybody who has seen Base64 before can probably see the woman in the red dress without even squinting. So, enter the EncryptedCookieStore, a plugin that will truly encrypt the data you are shoving in the cookie. Check it out!

EDIT: Per accurate comments, I need to restate something. The default cookie store is not “insecure in the worst way”; it is, rather, “insecure for the worst developers”. It was not MEANT to be secure (from reads). My point was that some developers might just assume that the default for sessions would be server side storage, and the element of surprise might lead some of them to Do Bad Things™. That is all; sorry for the confusion.
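To see why Base64 offers no protection from reads, here is a small sketch using only core Ruby. The session string is invented for illustration and is not the actual format Rails writes into the cookie; the point is only that encoding is reversible by anyone, with no key involved.

```ruby
# Hypothetical session contents, for illustration only.
session_data = 'user_id=42;role=admin'

# 'm0' is the pack directive for strict Base64 (no line breaks).
encoded = [session_data].pack('m0')

# Anyone holding the cookie can reverse the encoding without a secret.
decoded = encoded.unpack1('m0')
```

A signature on the cookie can detect tampering, but it does nothing to hide the contents. Hiding them takes encryption.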

This week's yak shave: fuzzing dates

Posted on January 26, 2008 by stu

We have been fuzzing our Rails applications to see what breaks, and finding some interesting things. This week, the continuous integration build started to break at random, but only on one application. Pull out the Yak clippers!

By looking at the test logs, we quickly isolated the problem. Rails date helpers let users input bad dates, e.g. February 30, 2008. (In almost all scenarios, we avoid the built-in date helpers anyway. But almost only works for horseshoes and hand grenades, and I don't want special cases that break.)

Obviously our fuzz generator will generate bad dates occasionally. Before you read further, test yourself: How does ActiveRecord handle such dates? How should it? The answer is in ActiveRecord::Base's execute_callstack_for_multiparameter_attributes, and it surprised me:

  • If you are using the Time class, the day overflows into the next month, so 2/30 magically becomes 3/1. Yuck.
  • If you are using Date, Ruby raises an error. Rails then wraps this error, and by the time it gets back to you the offending values are not easily accessible.
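The underlying Ruby behavior can be reproduced without Rails. This sketch calls Time.local and Date.new directly with February 30, 2008; it shows what the core classes do, not what ActiveRecord's multiparameter assignment does on top of them.

```ruby
require 'date'

# Time quietly rolls the impossible day over into the next month.
overflowed = Time.local(2008, 2, 30)

# Date refuses to construct the value at all.
date_error = begin
  Date.new(2008, 2, 30)
  nil
rescue ArgumentError => e
  e
end

# Date can also be asked up front, without raising.
valid = Date.valid_date?(2008, 2, 30)
```

A check like Date.valid_date? is one possible hook for turning a bad date into a validation error rather than an exception or a silently shifted value.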

Neither of these approaches appeals to me. A bad date should be a validation error. Matthew pointed me to a plugin that lets you, on a per model basis, monkey patch ActiveRecord to report bad Dates as validation errors.

I decided that I wanted to have a patch that overrode Rails' behavior for all models, so I have written the first_class_dates plugin. This gets our tests passing again, and it keeps bad data from raising application exceptions, but it is still a hack. A better solution would:

  • work consistently for Time and Date.
  • show the user the bad values they chose. (This is tricky since Date refuses even to enter a bad value state.)

The better solution would require a more substantial patch, and should be done in Rails itself. If others agree with the approach, we will submit the patch.

We're disgusting!

Posted on January 24, 2008 by jgehtland

The RailsEnvy guys mentioned our newest releases on their podcast this week and Jason said our SimpleServices plugin was “disgusting, and has no place in a Rails app”. That might be the best review anything we’ve ever touched has ever gotten.

We’d like to thank the Academy, and all the little people who made it possible. Sniff. Thank you everybody!

Seriously, thanks for the shout out, guys! Keep up the great work!

Announcing: SimpleServices

Posted on January 18, 2008 by jgehtland

During the recent Rails/Grails kerfuffle, one of the memes that came up over and over again was that Grails had a specific feature which Rails lacked and that it was a Big Deal™. Specifically, Grails defines a service layer with automatic transaction demarcation which allows you to remove complex domain manipulation code from your controllers, leaving them to deal with loading resources and redirecting to views.

As a thought experiment, we set out to find out what would need to happen to enable this in a Rails app. By the end, we’d written the plugin we are releasing today: SimpleServices. It amounts to about 12 lines of code, but that isn’t really a boast, as we’ll see in a minute.

Before we look at SimpleServices, we should discuss what makes up the service layer in Grails:

  • Services are classes which are, by default, singletons and are assumed to be stateless.
  • Their methods are wrapped in a transaction by default, though you can turn this behavior off (not sure why you would want to, but whatever)
  • Services are injected into controllers via standard Spring dependency injection methods
  • Services are singletons by default, but can be declaratively scoped to one of many other scopes: request, session, flash, etc. This would enable stateful services.
  • “Transactions”, in this case, are full-fledged container-managed transactions, which can involve database transactions, messaging system transactions, and any other resources that expose a transactional API and can be enlisted in a container-managed transaction.

When we wrote this plugin, we were attempting to get out the most robust implementation of the default use case, which is:

  • singleton, stateless services
  • ActiveRecord transactions
  • encapsulation of service creation details (obscuring their instantiation API)
  • encapsulation of complex domain logic, outside of your controllers

When you install the plugin, you can create a folder in RAILS_ROOT/app called “services”. Within this folder, you can define one or more service classes. A service class has a name that ends with “Service” and derives from Service::Base. The plugin causes these to be auto-loaded in the traditional Rails fashion. Each method in the service is automatically wrapped in a transaction, and any errors raised within the method will cause the transaction to roll back.

Within your controllers, you can declare which services you will be using in your actions. For each service you declare, a method named “#{service_name}_service” is added to the controller, hiding all the service instantiation code.

class AccountController < ApplicationController

  services :account, :user, :security

  def update
    base_account = Account.find(params[:base_id])
    target_account = Account.find(params[:target_id])
    account_service.transfer(base_account, target_account, params[:amount])
  end
end

You aren’t forced to use services from controllers, either. If you want the declarative support, but in your models, you just include SimpleServices::ServiceInjection in whatever class you want. For example, you could include it in ActiveRecord::Base and have it available for all your models. Likewise, you can create instances of services directly, without declarative support, by calling AccountService.instance.
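For readers curious how a declarative services macro could work, here is a plain-Ruby sketch. It is not the plugin's source: the module name ServiceInjectionSketch and the DemoController class are hypothetical, the service uses Ruby's standard Singleton module rather than the plugin's base class, and the transaction wrapping is left out entirely.

```ruby
require 'singleton'

# Hypothetical module: adds a `services` class macro that defines one
# "<name>_service" accessor method per declared service.
module ServiceInjectionSketch
  def services(*names)
    names.each do |name|
      class_name = name.to_s.split('_').map(&:capitalize).join + 'Service'
      define_method("#{name}_service") do
        # Look the class up lazily so declaration order does not matter.
        Object.const_get(class_name).instance
      end
    end
  end
end

# A stand-in service; the real plugin would also wrap this in a transaction.
class AccountService
  include Singleton

  def transfer(from, to, amount)
    "moved #{amount} from #{from} to #{to}"
  end
end

class DemoController
  extend ServiceInjectionSketch
  services :account
end

controller = DemoController.new
result = controller.account_service.transfer('base', 'target', 10)
```

The generated accessor returns the same object as AccountService.instance, which is the direct route mentioned above for code that does not want the declarative support.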

Known Issues

  • The plugin has a dependency on the Aquarium gem. In order to provide the wrapping, we needed a good aspect library. (Aspectr has fallen on hard times, and Aquarium seems to have a lot of activity and a very simple API.) Just install the gem before using SimpleServices.
  • We don’t provide support for scoping the services yet. It seems that most people use the stateless singleton model, and this was a straightforward case to implement. We’ll start looking into scoping the services next, and we’d love to hear from anybody who can make a good case for, say, a stateful, session-scoped service. I honestly can’t think of a good one.
  • We are limited in this 0.1 release to single-database ActiveRecord transactions. We are looking ahead to enlisting the transaction manager when you are running in JRuby, but that will be in a future release.

We’d love to hear if people in the Rails community find this useful. We are playing around with it in a couple of apps, and it does have the benefit of cleaning up the controller codebase and providing zero-thought transaction support. Yay, services! But, we’d love to hear from others on this, and on how it could be improved.

Good luck, and happy servicing.

spec_converter released

Posted on January 18, 2008 by jgehtland

Muness just released spec_converter as part of the Relevance open source library. It converts Test::Unit tests to test/spec style. Check it out!
