Contributing to Rubinius

Brian Ford 18 October 2011

Implementing Ruby is a lot of hard work. The Rubinius project has been lucky to have more than 240 contributors, many with hundreds of commits, including code, benchmarks, documentation, translations, and more. Lately, we have seen a surge of new folks working hard on 1.9 language features. While we concentrate on making Rubinius easy to contribute to, this post should clarify some things and pave an easier road to your first Rubinius commit.

Contributions

Before diving into the Rubinius code, I want to emphasize that there are many ways that you can contribute to Rubinius. One of the most valuable for us is trying your own library or application on Rubinius. We've worked super hard to make using Rubinius in place of MRI as simple as possible. For example, with a typical Rails application, here is all you should need to do to get set up:

rvm install rbx
rvm use rbx
gem install bundler

Once you've got Rubinius and Bundler installed, running your application should be this simple:

cd <my_application>
rm Gemfile.lock
bundle install

Note the step of removing the Gemfile.lock. This is necessary to force bundler to resolve the dependencies for Rubinius. This is very important. Not all gems for MRI will work with Rubinius, and in some cases Rubinius has built-in gems or stubs that need to be considered when resolving dependencies (ruby-debug and ffi are two examples).

Once bundle install is finished, you should be able to start your application just like you would under MRI. If you have any trouble, please let us know. Issues for Rubinius are tracked on Github.

Another way to contribute to Rubinius is talking about the project. If you tried your application and your 50 gems installed without problems, consider tweeting at us or writing up a quick blog post about your experiences. If you've done something fancy that you'd like to share with us, we're always happy to have guest blog posts, too. We even have documentation on how to write a blog post.

Clone & Build

Before we can do anything else, we need to get the Rubinius source code and build it. Run the commands below to do this:

git clone git://github.com/rubinius/rubinius.git
cd rubinius
./configure
rake

You can run Rubinius directly from the build directory, there is no need to install it. We provide symlinks for common commands like gem, rake, irb, ri, rdoc, and even ruby. Just add <rbx_clone>/bin to your PATH.

If you run into any trouble with these steps, see the Getting Started page for more information. You may need to install libraries required to build Rubinius. If you don't find answers there, visit the #rubinius channel on freenode.net and we'll help you out.

While the build is running, let's get a quick overview of how Rubinius is organized.

Code Tour

There are two main divisions in the Rubinius source code. The virtual machine, garbage collector, and just-in-time (JIT) compiler are written in C++. The Ruby core library, bytecode compiler, and various tools like the profiler and debugger are written mostly or all in Ruby.

Ruby Core Library

The Ruby core library is found in the kernel/ directory. The kernel is divided into subdirectories that are loaded in order when Rubinius boots. The divisions were made to help share the Ruby core library with other implementations. I'll cover those basic divisions here. For more details about how the loading process works, see the Bootstrapping documentation.

  1. alpha.rb - Sets up very basic Ruby features needed to start loading the rest of the core library.
  2. bootstrap/- Contains implementation-specific features needed to load the main parts of the core library.
  3. platform/ - Contains platform-specific features like the FFI (foreign-function interface) code that is used extensively in Rubinius to bind to standard libc functions.
  4. common/ - Contains the majority of the Ruby core library and the code should be implementation-agnostic.
  5. delta/ - Contains more implementation-specific code that may extend or override code from common/.

Most of the work on the Ruby core library will be done in kernel/common/.

Rubinius VM

The concept of a virtual machine is somewhat nebulous. It can be hard to draw the lines around the different components. In Rubinius, the code for the bytecode interpreter, garbage collector, and JIT compiler is under the vm/ directory. There are subdirectories for the garbage collector (vm/gc/) and the JIT compiler (vm/llvm/). The main() function is in vm/drivers/cli.cpp.

One of the important parts of Rubinius are the low-level operations that cannot be defined in Ruby. These are things like adding two Fixnums together. These operations are called primitives and the code for them is in vm/builtin. Since you will likely encounter these in the core library, we'll delve into them a bit.

Primitives

All methods that can be called in Ruby are exposed as, well, Ruby methods. If you open kernel/bootstrap/fixnum.rb, you should see the following code:

1 def to_f
2   Rubinius.primitive :fixnum_to_f
3   raise PrimitiveFailure, "Fixnum#to_f primitive failed"
4 end

The Rubinius.primitive :fixnum_to_f code looks like a normal Ruby method call but it is not. It's actually a compiler directive to tag this Ruby method as having an associated primitive operation. The name of the primitive is fixnum_to_f. This naming convention is standard, being composed of the class name and the method name. Methods in Ruby that are characters, like +, are given word names for the primitives.

When this method is run, the primitive operation is invoked. If the primitive operation fails, the Ruby code following the Rubinius.primitive line is run. This code can perform any Ruby operation. For example, it may coerce the arguments to a particular class and re-dispatch to itself. If no other operation is appropriate, the method should raise an exception.

To see how the Ruby method relates to the primitive code, open vm/builtin/fixnum.hpp:

1 // Rubinius.primitive :fixnum_to_f
2 Float* to_f(STATE);

The vm/builtin/*.hpp files are processed by the Rubinius build system to automatically generate C++ code to resolve and bind these primitive operations. The comment provides the link between the Ruby method and the C++ method.

Finally, the actual implementation of this primitive is found in vm/builtin/fixnum.cpp:

1 Float* Fixnum::to_f(STATE) {
2   return Float::create(state, (double)to_native());
3 }

Here you can see that a new Float object is being created from the value of the Fixnum. Rubinius names the C++ classes that implement the Ruby primitive operations the same as their Ruby counterparts. One of the goals of Rubinius is to build an elegant, easily comprehensible system, and we feel that this consistency has been a great benefit toward that goal.

Now that we have a basic idea of the structure of Rubinius, let's look at some aspects of its runtime behavior, in particular, supporting different Ruby language modes.

Language Modes

Rubinius 2.0 (the master branch) implements both 1.8 and 1.9 language features in one executable. You can select the language mode at runtime by passing the -X18 or -X19 flag, either as a command line option or by setting the RBXOPT environment variable. Both commands below should have the same effect:

RBXOPT=-X19 bin/rbx -v
bin/rbx -X19 -v

If I run that on my system, I will see the following:

rubinius 2.0.0dev (1.9.2 0f223599 yyyy-mm-dd JI) [x86_64-apple-darwin10.8.0]

The default language mode is 1.8, so if you invoke rbx with no other options, you'll be running in 1.8 mode. You can change the default mode with a configure time option as follows:

./configure --default-version=1.9

If you configure Rubinius to have a default language mode of 1.9, you can access 1.8 mode with the -X18 runtime option as discussed above.

Ok, we've got the code, we understand something about how it is organized, we've got the runtime behavior down, now let's look at actually implementing Ruby. To do that, we need to know how Ruby behaves, and that is what RubySpec is all about.

By the Spec

Rubinius created the RubySpec project to ensure that we would be faithfully implementing Ruby behavior, and we are constantly contributing more to it. Basically, Rubinius does it by the spec. So, any commit to the Ruby core library in Rubinius must either have new specs or make existing specs pass. To effectively contribute to Rubinius, you'll need to understand some basics about RubySpec. I recommend that you have a read through the documentation at rubyspec.org.

RubySpec includes a custom-built framework for the specs called MSpec. The syntax is intended to be consistent with RSpec, but there are various facilities that are purpose-built to support multiple implementations, multiple versions of Ruby, and multiple platforms.

Running Specs

MSpec provides several different scripts to run the specs under different conditions. The default behavior is to simply run all the specs. If you invoke the following command, it will run all the Ruby Array specs in the default language mode, which should be 1.8 unless you configured 1.9 to be the default:

bin/mspec core/array

To Run the specs in 1.9 mode, add the -tx19 option:

bin/mspec -tx19 core/array

The -t option specifies which target to run the specs under. The default in Rubinius is to run them with Rubinius, so -tx is implied. You can easily run with another target by giving the name of an executable on your PATH or the full path to an executable. Since the specs are intended to show the behavior of MRI, if you are writing new specs you need to run them under MRI 1.8.7 and 1.9.2. I have those on my PATH, so I can do the following:

bin/mspec -t ruby1.8.7 core/array
bin/mspec -t ruby1.9.2 core/array

Finally, if you are running bin/mspec in the Rubinius source directory, the location of the RubySpecs are known (spec/ruby/), so you can use the full path or the shortened version core/array above.

Continuous Integration

One goal of MSpec is to make it as easy as possible to run the specs for the parts of Ruby that have been implemented. It takes a long time to implement all of Ruby correctly, but we want to know that the parts we have implemented don't get broken while working on other parts. That is the role of continuous integration. To use CI effectively, we need to partition the specs into those that we expect to pass and those we know we don't pass yet. MSpec provides a facility for this, called tagging, that we'll look at shortly. For now, we'll just look at running the specs in CI mode.

To run all the Rubinius specs in CI mode under the default language version, use the following command:

bin/mspec ci

Likewise, to run these specs in the 1.9 language mode, add the -tx19 option:

bin/mspec ci -tx19

The bin/mspec ci command runs the mspec-ci script. You should be familiar with this mechanism from working with Git. It's the same idea. The mspec script itself is just a utility to invoke the various specific MSpec scripts. To see the options for mspec, run the following command

bin/mspec -h

There are three basic functions that MSpec performs and these correspond to mspec-run, mspec-ci, and mspec-tag. When not given an operation, mspec assumes run, so the following two commands are equivalent:

mspec core/array
mspec run core/array

If the operation is given, it must be the first parameter to mspec. In the case below, the first command runs mspec-ci with core/array while the second command runs mspec-run with core/array and ci as files.

mspec ci core/array
mspec core/array ci

Now that we've got the basics of MSpec down, let's look at how we find specs that fail on Rubinius. To do this, we'll use the mspec tag command.

Tagged Specs

Since Rubinius uses the tagging mechanism to create the set of CI specs to run, the best way to discover what parts of RubySpec that Rubinius isn't passing yet is to list the specs that are tagged. There's a command for that:

bin/mspec tag --list fails -tx19 :ci_files

This command lists all the specs that are tagged as failing. There's some new syntax here, namely :ci_files. MSpec has the concept of pseudo-directories. Basically, they are lists of files. The reason for this is that running all the core or standard library specs in RubySpec is not as simple as just running all the files under spec/ruby/core or spec/ruby/library. It's more complicated than that because there are 1.8- and 1.9-specific libraries. Rather than wrapping everything in ruby_version_is guards, MSpec adds version-specific lists and names them, for example, :core and :library.

In this case, we're using the list of files specified by :ci_files. This list excludes some files that are known to cause problems if they are run.

The list of specs that are currently marked as failing is pretty long. We can reduce the number of tags we are looking at by giving a smaller set of specs to run. For example, let's just run the File specs:

bin/mspec tag --list fails -tx19 core/file

Looking at the output from this command, we notice (at least at the time of writing this post) that there are several failures in the File.world_writable? specs. We can run just these specs:

bin/mspec tag --list fails -tx19 core/file/world_writable

If we look into the documentation for File.world_writable?, we'll find that it is a new method introduced in 1.9. Excellent, this gives us an opportunity to talk about language-specific changes in Rubinius.

Language-specific Changes

When Rubinius boots, it loads different files depending on what language mode it is running. In the kernel/**/ directories, there are load_order18.txt and load_order19.txt files. These files are used during the build process to create separate runtime kernels for Rubinius. You can see these in the runtime/18 and runtime/19 directories after building.

Here's how language-specific features are handled in the Rubinius kernel.

  1. If there are no language-specific methods, the name of the file in kernel/common is the name of the class. In the case here, the file is kernel/common/file.rb. This rule applies regardless of whether the class is 1.8- or 1.9-specific. For example, Rational is part of the 1.9 core library, but does not exist in the core library in 1.8. The Rational class is in kernel/core/rational.rb.
  2. If there are version-specific methods, they go in kernel/common/file18.rb and kernel/common/file19.rb. The correct file is then added to the appropriate load_orderNN.txt file.

In the case of File.world_writable?, there is no 1.8 version. So, we open kernel/common/file19.rb and add the method definition. After changing any of the kernel/**/*.rb files, we have to build Rubinius. Run the following command to do that:

rake build

After making the change, we verify that the specs pass by running the specs without the CI tags:

bin/mspec -tx19 core/file/world_writable

If all the specs pass, then you're ready to remove the CI tags. To do so, run the following command:

bin/mspec tag --del fails -tx19 core/file/world_writable

After removing the CI tags, the final step is to ensure that all specs still pass. To run all the CI specs in both 1.8 and 1.9 modes, simply do:

rake

If everything passes, you're ready to submit a pull request. All in all, that wasn't too bad, right?

One final note, if you are making changes to RubySpec, make separate commits in your pull request for changes to spec/ruby/**/*_specs.rb and another commit for any other Rubinius files. It is fine to commit the removed tags with the other Rubinius changes.

Wrapping Presents

The information here should give you everything you need to get your feet wet in Rubinius. By the way, today is Evan's birthday. If you're not taking him to dinner, why don't you show your appreciation for this fantastic project he created by grabbing Rubinius and hacking on some Ruby code. Be safe and have fun! We can't wait to hear from you.

Tweet at @rubinius on Twitter or email community@rubini.us. Please report Rubinius issues to our issue tracker.

We email about once a month and never share your email address for any reason

Recent Posts