I will be the closing keynote speaker at RubyNation. I'll be beating my current favorite drum: Ending Legacy Code In Our Lifetime. (And yes, the Ruby community has legacy code. If we don't change our ways soon, we will end up with a lot more of it.)
I (Stu) will be heading to Washington, D.C. the week of May 5-9 to teach a public Introduction to Ruby on Rails course.
We'll start at the very beginning, exploring Ruby from the console shell irb. Then we'll dive into Rails 2.0. By the end of the course we will be talking about how best practice shops use Rails (hint: test and refactor!).
Rails is one important step on the path to a future without legacy code. Isn't it time to take that first step?
As people spend more time with Ruby 1.9, JRuby, and Rubinius, we are seeing a lot more benchmarks. It has been a while since we published any metrics, so I thought now would be a good time to summarize our recent experience on some real projects. We chose to measure some different things, however:
Here are the numbers. The trends are not subtle.
DPC DPR DP5 CS5
Project A: 100% 100% 100% 0
Project B: 100% 100% 100% 0
Project C: 100% 100% 100% 0
Project D: 100% 100% 100% 0
At this point, even caring about Ruby's language performance would be a premature optimization at the business process level. Ruby's runtime performance is a non-issue for a broad spectrum of applications. In fact, I believe that all of our customers would be even happier with a language that ran 50% slower than Ruby, if it also made the development team a mere 10% more productive.
Jay Fields blogged recently (and not for the first time) about managing gems within Rails projects. This is a problem a lot of people have wrestled with; there are close to a dozen plugins, rake tasks, uncommitted patches, and published hacks that attempt to provide a solution (and those are just the ones I know of).
FrozenGemsGenerator is the solution that we’ve been using on some projects at Relevance, and we’re happy enough with it that we’ll be using it more. It’s a Rails generator, packaged as a gem, that gives your Rails app a private gem repository, fully self-contained, and manageable just like your system-wide repository (except using script/gem instead of gem).
$ sudo gem install frozen_gems_generator
$ script/generate frozen_gems
$ script/gem install money
script/gem supports all of the subcommands that the regular gem command does.
I haven’t yet implemented a solution for gems that install binary extensions. I’m very interested in suggestions for how best to solve that problem. Several of the other approaches have at least partial support for architecture-specific gems; the best may be Jeremy Voorhis’ CarryOn plugin, which is also the solution that’s closest in spirit to the FrozenGems approach. If you have ideas or suggestions about how architecture-specific gems should be handled, please add comments here or post them on our Trac instance.
Tarantula is under active development as we use it internally to police our apps. If you grab the bits from this morning (R243 or later in the repository), you will see that stack traces in the log report now link back into TextMate.
We have a ton of features we would like to add, and I bet the community can think of plenty more. Please add comments to this post, or post into Trac, letting us know what features you would like to see next. Here are some possible choices to get you started:
Get your votes in today and we can look at them during open source Friday.
The Tarantula is a fuzzy spider. It crawls your Rails app, fuzzing inputs and analyzing what comes back. We have pointed Tarantula at about 20 Rails applications, both commercial and open source, and have never failed to uncover flaws.
How does your Rails app stand up? It's easy to find out. Install the plugin, and create a Tarantula integration test: (Update: Note that Tarantula integration tests live in test/tarantula so that you can treat them separately in your cruise builds. For a substantial app or fixture set Tarantula can take a while to run!)
# somewhere in your test
require 'relevance/tarantula'

# customize to match your security setup
def test_with_login
  post '/sessions/create', :password => 'your-pass'
  assert_response :redirect
  assert_redirected_to '/'
  follow_redirect!
  t = tarantula_crawler(self)
  t.crawl '/'
end
Then rake tarantula:test, and then start looking through the Failures section of the HTML report.
Tarantula is just a baby now, but we plan to feed it until it is a lot bigger and meaner. Suggestions and contributions are welcome via the Relevance Open Source Trac.
Hat tip to Courtenay, whose SpiderTest plugin inspired me to go down this road. Also congrats to Mephisto, which is the best behaved app under Tarantula to date (only three problems, all minor broken windows).
See if you can guess what this code will do before you run it in ruby.
upc = Proc.new {|m| $1.upcase}
puts "hello world".gsub(/([aeiou])/, &upc)
puts "hello world".gsub(/(\w)/, &upc)
def doit(str, re, blk)
puts str.gsub(re, &blk)
end
doit "hello world", /([aeiou])/, upc
doit "hello world", /(\w)/, upc
Now try running it in JRuby. Whoa.
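If all you actually need is the upcased match, one way to sidestep the question is to stop relying on $1 inside the proc. The following is a minimal sketch, not part of the original puzzle: it uses the argument that gsub yields to the block (the matched text), which for these two patterns is the same string the capture group holds.

```ruby
# Use the yielded match instead of the $1 global, so the proc carries
# no hidden dependency on where the match variables were set.
upc = Proc.new { |match| match.upcase }

def doit(str, re, blk)
  str.gsub(re, &blk)
end

vowels = "hello world".gsub(/([aeiou])/, &upc)
words  = doit("hello world", /(\w)/, upc)

puts vowels
puts words
```

Because nothing here reads a match global, the result should not depend on how a given Ruby implementation scopes those globals.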
Design patterns are the enemy of agility. They introduce repetition and accidental variation to your codebase. Design patterns encourage you to create "point solutions" throughout your application, instead of cleanly isolating concerns. And they will make your code refactor-proof, no matter how cool your IDE is. But there is hope: Catch your design patterns while they are young, and teach them to be library calls instead. Here's one example:
In Ruby, we often re-open existing classes and add instance methods. One approach is simply to open the class:
class NilClass
def blank?
true
end
end
Or, you could create a new module and mix it in:
module MyNilExtensions
def blank?
true
end
end
class NilClass
include MyNilExtensions
end
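One observable difference between the two variations is where the method appears to come from. The sketch below assumes plain Ruby with no library already defining NilClass#blank?; it shows that with the mixin, the extension appears in the ancestor chain and Ruby can report which module owns the method.

```ruby
module MyNilExtensions
  def blank?
    true
  end
end

class NilClass
  include MyNilExtensions
end

# The mixin is visible in the inheritance chain...
in_chain = NilClass.ancestors.include?(MyNilExtensions)

# ...and the method reports the module as its owner.
owner = NilClass.instance_method(:blank?).owner

puts nil.blank?
puts in_chain
puts owner
```

Had blank? been defined by reopening NilClass directly, the owner would simply be NilClass, with no hint about which file or library added it.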
There are other approaches that are similar but not quite the same. In other words, this is a design pattern. From the Wikipedia entry:
A design pattern is not a finished design that can be transformed directly into code. It is a description or template for how to solve a problem that can be used in many different situations.
The problem with design patterns is the "not a finished design" part of the definition. Rather than a DRY solution, design patterns give you repetition throughout the code. Worse yet, the repetition is not exact. It is repetition with variation, and there is often no evidence whether the variation is intentional or accidental.
I like Ruby because I can eliminate design patterns when they start to annoy me. This "Open Class Add Method" pattern annoyed me for the last time earlier today, when two different libraries defined incompatible versions of Object#metaclass. Enough is enough. Let's make a library call for reopening classes.
Here are my design goals:
These three goals are in conflict (and we could easily come up with more). This illustrates another problem with design patterns. Each time a design pattern is used, a programmer favors some design goals over others. Over time, this leads to a codebase at odds with itself. If the same pattern were captured in a reusable module, then changing design priorities could be handled from that module alone.
Here's a strawman proposal for cleanly adding methods to existing Ruby classes. The following code adds #jump to Object:
embrace{Object}.and_extend do
def jump
puts "jumping"
end
end
The syntax is simple and involves just one block, meeting goal one. Behind the scenes, I use __FILE__ and __LINE__ to define a module, which gives us auditability (goal two):
> puts Object.ancestors
Object
Anonymous module from /Users/stuart/Desktop/temp.rb 56
Kernel
Finally, the code that mixes in the module walks the inheritance hierarchy first, printing a warning whenever a name collision is encountered (goal three).
Warning: /Users/stuart/Desktop/temp.rb 64 is attempting to redefine jump
Originally defined in /Users/stuart/Desktop/temp.rb 56
The complete implementation is included at the bottom of this post. I am sure it can be improved in several ways, but even in its primitive state it beats a design pattern. As long as the API is decent, we can always make the implementation suck less later.
Should I make a gem out of this? What changes would you like to see in the API? How should the handling of method collisions be specified?
require 'set'
module Embrace
class <<self
def check_for_collisions(clazz, module_to_include)
new_methods = Set.new(module_to_include.instance_methods(false))
clazz.ancestors.each do |anc|
anc.instance_methods(false).each do |meth|
collision(anc, module_to_include, meth) if new_methods.member?(meth)
end
end
end
def collision(clazz, module_to_include, method)
puts "Warning: #{module_to_include} is attempting to redefine #{method}"
puts " Originally defined in #{clazz}"
end
end
end
def embrace(&clazz_block)
m = Module.new
file = eval("__FILE__", clazz_block.binding)
line = eval("__LINE__", clazz_block.binding)
clazz = clazz_block.call
meta = class << m; self; end
meta.class_eval do
def and_extend(&blk)
self.class_eval(&blk)
mixin_to_class
self
end
define_method("mixin_to_class") do
Embrace.check_for_collisions(clazz, m)
clazz.class_eval do
include m
end
end
define_method("to_s") do
"Anonymous module from #{file} #{line}"
end
end
m
end
o = Object.new
embrace{Object}.and_extend do
def jump
puts "jumping"
end
end
o.jump
embrace{Object}.and_extend do
def jump
puts "jumping higher"
end
end
o.jump
Facets defines metaclass like this:
def meta_class(&block)
if block_given?
(class << self; self; end).class_eval(&block)
else
(class << self; self; end)
end
end
alias_method :metaclass, :meta_class
RSpec defines it this way:
def metaclass
  class << self; self; end
end
I just spent an hour figuring out why some carefully-tested code went no-op after adding RSpec to a project. As a community we need to commit to a standard definition here. What should it be?
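To see how a clash like this can turn working code into a no-op, here is a minimal sketch. The two class names are hypothetical stand-ins so that both definitions can coexist in one script; in a real project both libraries define the method on Object, and whichever loads last wins.

```ruby
# Hypothetical stand-in for the block-accepting definition.
class WithBlockSupport
  def metaclass(&block)
    if block_given?
      (class << self; self; end).class_eval(&block)
    else
      (class << self; self; end)
    end
  end
end

# Hypothetical stand-in for the definition that takes no block.
class WithoutBlockSupport
  def metaclass
    class << self; self; end
  end
end

a = WithBlockSupport.new
a.metaclass { def greet; "hi"; end }   # block runs, greet is defined on a

b = WithoutBlockSupport.new
b.metaclass { def greet; "hi"; end }   # block is silently ignored

puts a.respond_to?(:greet)
puts b.respond_to?(:greet)
```

Ruby does not complain when a block is passed to a method that never yields, which is why the second call fails silently rather than loudly. Later Ruby versions ship Object#singleton_class, which at least gives the block-free form one agreed-upon name.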
Giles says It's Not Meta, It's Just Programming. Darn straight! The specific example he gives is the ability to add methods to specific instances, instead of to an entire class. As he demonstrates, this is invaluable for isolation when testing.
# written this way to demonstrate eigenclass syntax
class << @response
def body
({:foo => "bar"}.to_xml(:root => "thing"))
end
end
Combine eigenclass methods with open classes, and almost any idiom can be automated. If you regularly add stubbed instance methods for testing purposes, why not write a helper for just that? For many common tasks, including this one, the work is already done. Here is a more literate version of the above code:
# exact syntax depends on your choice of mocking library
@response.stubs(:body).returns(some_canned_response)
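If you wanted to roll such a helper by hand, a minimal sketch might look like the following. The helper name stub_instance_method is hypothetical, and real mocking libraries do considerably more (verification, teardown, argument matching); this only shows the eigenclass idea in isolation.

```ruby
# Hypothetical helper: define a method on one object's eigenclass only,
# returning a canned value. Other instances are left untouched.
def stub_instance_method(object, name, canned_value)
  object.define_singleton_method(name) { canned_value }
  object
end

response = Object.new
other    = Object.new

stub_instance_method(response, :body, "<thing><foo>bar</foo></thing>")

puts response.body
puts other.respond_to?(:body)
```

The stub lives on the single instance, so tests that touch other objects of the same class see no change.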
If you are a n00b, this power is scary. Once you start treating code as data, the elegance of your code is dependent on your skill. You cannot hide behind the limitations of your programming language anymore, because there aren't any.
I am proposing a patch to help cope with the dreaded Rails LoadError:
LoadError: Expected foo.rb to define Foo
In Ruby, it is simple to load code: just require it. In script/console:
>> require 'account_controller'
=> ["AccountController"]
Rails extends this to magically find classes just based on their name:
>> AccountController
=> AccountController
Most of the time, if a class does not exist, you get a helpful exception:
>> AccountController
MissingSourceFile: no such file to load -- hpricot_scan
Ah, so my AccountController depends on hpricot, which isn't available for some reason. Solution: go find hpricot.
But once in a while this problem presents a different symptom:
>> AccountController
LoadError: Expected account_controller.rb to define AccountController
This is confusing, since account_controller.rb does define AccountController! Experienced Rails developers know that this cryptic message actually means "Something went wrong in application_controller.rb, but Rails swallowed the real exception." After being bitten by this on three different projects in the last two weeks, I decided to track the issue down. Turns out the problem is in how fixtures get loaded:
begin
  require_dependency file_name
rescue LoadError
  # Let's hope the developer has included it himself
end
After fixtures swallow the real MissingSourceFile for a subdependency such as hpricot, ActiveSupport raises a misleading LoadError for the original dependency (account_controller.rb) that references hpricot.
In a perfect world, I would simply have fixtures stop swallowing LoadErrors. But the comment strongly suggests that some code depends on this behavior. So weaker sauce is to at least log the problem:
def try_to_load_dependency(file_name)
require_dependency file_name
rescue LoadError => e
ActiveRecord::Base.logger.warn("Unable to load #{file_name}, underlying cause #{e.message} \n\n #{e.backtrace.join("\n")}")
end
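The same idea can be tried outside Rails. The sketch below is an illustration under assumptions, not the patch itself: it swaps require_dependency for plain require, uses a standard-library Logger writing to a StringIO, and the file name is deliberately one that should not exist.

```ruby
require 'logger'
require 'stringio'

# Rescue the LoadError as before, but record the underlying cause
# instead of discarding it.
def try_to_require(file_name, logger)
  require file_name
  true
rescue LoadError => e
  logger.warn("Unable to load #{file_name}, underlying cause #{e.message}")
  false
end

log_output = StringIO.new
logger     = Logger.new(log_output)

# Assumed not to exist anywhere on the load path.
loaded = try_to_require('a_library_that_does_not_exist_anywhere', logger)

puts loaded
puts log_output.string
```

The caller still carries on after the failure, but the log now holds the message from the original exception, which is the piece that was being lost.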
If anybody knows how to distinguish the confusing LoadErrors from the expected ones, please go and improve the patch.
Over the last few weeks I have repeatedly linked to Ola's post about the stable layer. I didn't take the time to go into detail, and I trusted that people (if they wanted to) could follow the link and understand what Ola was talking about.
Well, that didn't work so well. Most responders clearly did not understand Ola. A few informed me that I didn't understand Ola. :-) So I am going to make a clean break, and lay out my own argument in more detail. What follows are my views about how layered architecture affects language and platform choice. First, some ideas that are hopefully uncontroversial:
This might seem obvious. All other layers depend on the lower layers, so a problem at the bottom affects a lot of code. But if you are working several layers higher, problems at the bottom are part of the air you breathe. The air may smell terrible, but you are acclimated and don't notice.
This is uncontroversial because the majority has chosen. And they are right to do so: the Java VM is well-specified, widely implemented, carefully optimized, and supported by a huge array of tools.
Noooo! Java is a high-ceremony language. At every turn, Java enforces a high busy-work/real-work ratio. Specifically:
That's a pretty big stink, but if you are used to it you probably can't smell it anymore.
The net result of these problems is that bottom layer code written in the Java language will be bloated and difficult to maintain. These problems multiply if we use the Java language for higher layers as well. What should we do?
For better or worse, Java is already the bottom layer for many businesses. A complete rewrite is impossible, so we need an approach that lets us continue to use our existing Java code. There are two obvious choices:
My opinion, based on extensive experience with both options, is that the "better VM language" approach is better than the "fix Java with a framework" approach. Smaller is better.
In the past few weeks, I have been approached by several organizations to advise them on platform decisions. Every organization is different, but here are some guidelines to consider.
Keep an open mind. Try several approaches. Judge your choices by how easy they would be to unmake or adapt. Have fun!
[1] In the past I have had a lot to say about static/dynamic typing. I realize now that I was trying to talk about ceremony. I am still worried about the same problems, but I think I now know them by more accurate names.
This morning I was troubleshooting a production problem with the simple_localization plugin. The code worked fine in development, had 100% passing C0 coverage in test, and worked fine in production on my local box. But on the staging box, we were getting the dreaded load error:
LoadError: Expected /simple_localization/lib/cached_lang_section_proxy.rb to define CachedLangSectionProxy
If you use Rails plugins and ever see this problem, read on...
In Ruby, you can load a Ruby source file from the load path by requiring it.
require 'my_class'
This is explicit, and easy to understand. But you might get tired of spelling things out all the time. So in Rails you can also load a class implicitly when it is needed:
MyClass
This is somewhat Java-like, in that magic happens to find the code based on some naming conventions, e.g. My::Namespaced::MyClass should be in a file named my/namespaced/my_class.rb somewhere on the load path. It is also Java-like in being difficult to debug, leading to errors like the LoadError above.
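The naming convention can be sketched in a few lines of plain Ruby. This is a simplified illustration, not the actual Rails implementation, and the function name constant_name_to_path is hypothetical.

```ruby
# Simplified sketch of the convention: each namespace becomes a directory,
# and CamelCase becomes snake_case.
def constant_name_to_path(constant_name)
  parts = constant_name.split('::').map do |part|
    part.gsub(/([A-Z]+)([A-Z][a-z])/, '\1_\2')
        .gsub(/([a-z\d])([A-Z])/, '\1_\2')
        .downcase
  end
  parts.join('/') + '.rb'
end

puts constant_name_to_path('My::Namespaced::MyClass')
puts constant_name_to_path('CachedLangSectionProxy')
puts constant_name_to_path('ArkanisDevelopment::SimpleLocalization::CachedLangSectionProxy')
```

The last two lines show why the namespace matters: the bare constant and the namespaced constant map to different paths, so a file can only satisfy an implicit load for one of them.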
Knowing that the LoadError is a failed implicit load, the first step is to look at the point of failure in the file cached_lang_section_proxy. Here it is, elided for clarity:
module ArkanisDevelopment
module SimpleLocalization
class CachedLangSectionProxy
Ah hah, you say. The error is right on. This file doesn't define CachedLangSectionProxy, it defines CachedLangSectionProxy in the ArkanisDevelopment::SimpleLocalization module. So implicit loading can't work with the code as written. But we have a workaround: we can move this file (and probably several others) into a directory structure that matches Rails conventions. I am not going to do that, because...
We can get implicit loading to work, but we still haven't tackled the real problem. Why did the code ever work on my local box to begin with? We know that implicit loading can't work, so somehow my local box must be explicitly loading the files, but in a machine-dependent way that fails on the staging box.
Rails plugins include an init.rb that runs during Rails startup, and is often used to explicitly load configuration and code. Here is that code from simple_localization:
Dir[File.dirname(__FILE__) + '/lib/*.rb'].each do |lib_file|
  require File.expand_path(lib_file)
end
This is broken, but if you develop on Mac OS X you may never notice. The plugin's internal dependencies are arranged in such a way that loading the files in alphabetical order works. In all of my experiments, Ruby's directory traversal APIs on the Mac return files in alphabetical order. However, this ordering is not required by the Ruby language. On Linux, the files can come back in any order.
Given that many Rails developers work on OS X, and deploy to Linux, this leads to an amusing variant of "It works on my box": It works on all developer boxes, and fails on all production boxes.
An easy fix is to sort the files explicitly:
Dir[File.dirname(__FILE__) + '/lib/*.rb'].sort.each do |lib_file|
  require File.expand_path(lib_file)
end
Better would be to organize init.rb so that the dependencies are clear (the fact that alphabetical order happens to work is a fragile coincidence).
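The effect of the explicit sort is easy to check in isolation. This is a minimal sketch with made-up file names in a temporary directory; it creates the files out of order and confirms that the sorted glob comes back alphabetically regardless of what the filesystem returns.

```ruby
require 'tmpdir'
require 'fileutils'

load_order = Dir.mktmpdir do |dir|
  # Hypothetical file names, created deliberately out of order.
  %w[c_third.rb a_first.rb b_second.rb].each do |name|
    FileUtils.touch(File.join(dir, name))
  end

  # Sorting makes the order independent of the platform's directory listing.
  Dir[File.join(dir, '*.rb')].sort.map { |path| File.basename(path) }
end

puts load_order
```

Sorting makes the order repeatable, not correct: it only helps when alphabetical order happens to satisfy the dependencies.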
init.rb. They will work deterministically on your box, but maybe not everywhere else.
Java and Ruby both have an explicit and implicit loading story. What is interesting is that in Java this story is implemented in the language, while in Ruby a significant part of the story is in the libraries. It is Rails, not Ruby, that implements implicit loading, and you can read much of that story in this source file (updated link: with syntax highlighting). Understand this file, and you will know much of what is best and worst in Ruby.
Jay Fields has envisioned a beautiful future for software development with his EDRY dialect of Ruby. But what is Enhanced DRY without better CoC (Convention over Configuration)?
I have modified Jay's code to rely more on convention. Why have a distinct vocabulary for fields vs. mixins, when the right thing to do can be inferred from the types involved? The result is some really tight code:
C Enumerable, :first_name, :last_name, :favorite_color do
d.complete_info? { nd(first_name,last_name) }
d.white?.red?.blue?.black? { |color| favorite_color.to_s == color.to_s.chop }
end
I am including the full source at the bottom of this entry. Can you make it even DRYer and more convention-driven?
class Object
def C(*args, &block)
attrs = args.find_all {|arg| Symbol === arg}
includes = args.find_all {|inc| inc.instance_of?(Module)}
name = File.basename(eval("__FILE__", block.binding),".rb")
klass = Struct.new(name.capitalize, *attrs)
Kernel.const_set(name.capitalize, klass)
klass.class_eval(&block)
klass.send :include, *includes
end
def s
self
end
end
class Class
def ctor(&block)
define_method :initialize, &block
end
def i(mod)
include mod
end
def d
DefineHelper.new(self)
end
def a(*args)
attr_accessor(*args)
end
end
class DefineHelper
def initialize(klass)
@klass = klass
end
def method_stack
@method_stack ||= []
end
def method_missing(sym, *args, &block)
method_stack << sym
if block_given?
method_stack.each do |meth|
@klass.class_eval do
define_method meth do
instance_exec meth, &block
end
end
end
end
self
end
end
# http://eigenclass.org/hiki.rb?instance_exec
module Kernel
def instance_exec(*args, &block)
mname = "__instance_exec_#{Thread.current.object_id.abs}_#{object_id.abs}"
Object.class_eval{ define_method(mname, &block) }
begin
ret = send(mname, *args)
ensure
Object.class_eval{ undef_method(mname) } rescue nil
end
ret
end
end
def nd(*args)
args.each {|x| return false unless x}
true
end
# convention: symbols are attributes, modules are to be included
C Enumerable, :first_name, :last_name, :favorite_color do
d.complete_info? { nd(first_name,last_name) }
d.white?.red?.blue?.black? { |color| favorite_color.to_s == color.to_s.chop }
end
I met Luke Francl at Code Freeze last week, but we only had time to speak for a minute. It was enough to know we are of like mind: security should be on by default. Luke has written a new plugin, xss_terminate. It is inspired by acts_as_sanitized, but it has stricter defaults and more options. Nice.