This is the last post in the Rubinius 3.0 series. We started by talking about the Rubinius Team. In part 2, we looked at the development process and how we more effectively deliver features to you. Following that, we explored the new instruction set, which is the foundation for running Ruby code. And yesterday, we looked at the Rubinius system and integrated tools. Today, I'll talk about changes that I'm introducing to the language.
I mentioned that these posts were in order of importance. If we arrange the posts as in the figure below, we see that the Team and community form the foundation on which the development and delivery process is built. This gives us a basis for making good technical decisions, like the new Rubinius instruction set. In turn, that enables us to build more powerful tools for developers.
Finally, the language comes at the top. It's the least important piece, really representing the icing on the cake. The language is still important. After all, a cake without icing is a poor cake. However, the language needs to be seen in context and in proper relation to the rest of the system.
    /         Language         \
   /       System & Tools       \
  /       Instruction Set        \
 / Development & Delivery Process \
/         Team & Community         \
Now that we see where language fits in, we can investigate it further. Is this chocolate icing or vanilla icing?
Everything Is An Object
There's no gentle way to say this: you've been misled about Ruby.
Everything is not an object, and objects are not everything.
I admit that I suffered this delusion that everything is an object for a long time as well, and I earnestly tried to convince others that this was true. This falsehood is causing us a lot of problems. Even worse, it's preventing us from fully benefiting from objects.
There's an important reason we use objects, and that's the reason objects are useful. That may sound circular, but it's not. Objects are useful because of the problems they help us solve. They are not abstractly useful independent of any context. In fact, when we misuse objects, they aren't very helpful.
In The Power of Interoperability: Why Objects Are Inevitable, the author suggests the following reason why objects are useful when writing programs. He actually goes further than "useful" and suggests that either objects, or something that simulates objects, is inevitable.
Object-oriented programming is successful in part because its key technical characteristic, dynamic dispatch, is essential to fulfilling the requirement for interoperable extension that is at the core of many high-value modern software frameworks and ecosystems.
Objects are useful because they allow pieces of a system to inter-operate while they evolve at different rates of change, by encapsulating information so that coupling (i.e. dependencies, brittleness) is reduced to a minimum.
The idea of interoperability includes the ideas of interface, boundary, and integration. Objects inter-operate at their boundaries, which define the interface with other objects. To integrate well, those interfaces must match up well enough to do useful work. At the same time, where they do not match up must not interfere with doing useful work.
It's important to understand that "interoperability" is merely a fancy way of saying "sharing the work". Everything here is about sharing the work. If A sends a message to B, A is relying on B to do the work specified by the message. A could just as well do all the work itself, but that would be wasteful if B already does exactly what A needs.
With objects, we have two ways of sharing work. When we inherit from a class in Ruby, or include a module, we are sharing work by being a kind-of the thing that we inherit from. When we delegate work to another object that we reference, we are using composition, or has-a relationship to share work.
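A small sketch of these two kinds of sharing in plain Ruby (the class and module names here are illustrative, not from the post):

```ruby
module Greeter
  def greet
    "hello from #{name}"  # shared work, available to any kind-of Greeter
  end
end

# kind-of sharing: Person includes Greeter, inheriting its behavior
class Person
  include Greeter
  def name
    "a person"
  end
end

# has-a sharing: Report delegates formatting to an object it references
class Report
  def initialize(formatter)
    @formatter = formatter
  end

  def render(text)
    @formatter.call(text)  # composition: hand the work to a collaborator
  end
end

Person.new.greet                             # => "hello from a person"
Report.new(->(t) { t.upcase }).render("hi")  # => "HI"
```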
This leads us to a new definition of "Object":
an Object is something you can send a message, not a thing you reference (i.e. hold onto in a variable or data structure).
This definition gives a simple, unambiguous way to identify objects: "Can I send this a message?". If the answer is No, it's not an object. The focus of message is on communication and behavior, not thingness. This is even more important when we consider proxies. The actual thing I send the message to is unimportant. Insisting that it be a particular thing causes endless pain in programs. The proxy may handle the message or delegate it, and this decoupling and encapsulation of information is essential for interoperability.
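As a small illustration of this definition (plain Ruby, with a made-up Proxy class): a proxy is an object precisely because it can receive messages, even though it does none of the work itself.

```ruby
# A minimal proxy: it receives any message and forwards it to a target.
# The sender neither knows nor cares whether it holds "the real thing".
class Proxy
  def initialize(target)
    @target = target
  end

  def method_missing(name, *args, &block)
    @target.public_send(name, *args, &block)  # delegate the message
  end

  def respond_to_missing?(name, include_private = false)
    @target.respond_to?(name, include_private)
  end
end

list = Proxy.new([1, 2, 3])
list.size         # => 3
list.include?(2)  # => true
```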
Inheritance and composition give us what I call the family and friends model of sharing work. But there's an important dimension missing from this model.
Everything Is Not An Object
We have seen that objects provide two things: a way to share work, and a means of inter-operating by encapsulating information. There is more than one way to share work. We aren't all friends and family.
At this moment, writing this post, I'm sitting at my desk in my apartment in Chicago in Illinois in the USA on earth, and so on. This boxes in boxes containment relationship is essentially about context. In the context of my kitchen, I may cook or clean dishes. I do not typically clean dishes in my bedroom. I'm the same person in each of these contexts, but my behavior may be substantially different. Of course, some behaviors may be the same. Whether I'm cleaning dishes or sleeping, I certainly hope I'm still breathing.
We are familiar with this containment relationship in Ruby. In the following code, the method name returns the value of the constant X. Ruby finds the constant by looking up a chain of boxes that in Rubinius are represented by ConstantScope objects.

X = "Ruby"

class A
  def name
    X
  end
end

A.new.name  # => "Ruby"
There is a need in Ruby to better express this sort of relationship. We need objects to be able to share work without relying solely on friends and family. It turns out, there's a simple idea that provides this very ability: they're called functions. In Ruby, we've been so busy thinking that objects and functions are opposites that we didn't realize they are mostly complementary. I would say objects and functions are orthogonal, serving different and independent purposes.
As we see with the constant search example above, containing lexical scopes exist in Ruby. In Rubinius, they are objects you can reference and send messages to. The lexical scopes provide a mechanism to relate objects and functions.
It turns out that Ruby's syntax is just flexible enough to permit us to use a syntax for functions that is reasonably consistent with the syntax for methods (except for the ugly do on the end):

fun polynomial(a, b, c) do
  a + b * c
end

def compute
  polynomial 2, 4, 9
end
Just like the constant search for X above, the compute method can refer to the polynomial function because it exists in the method's containing lexical scope.
This Boundaries talk by Gary Bernhardt is the best illustration of these ideas that I know of right now. I highly recommend watching it. I'm not going into depth about functions today, other than introducing them. They are a very well-understood area of computation and they are extremely useful. In the coming weeks, I'll write more about how we are using them to rewrite the Ruby core library in Rubinius 3.0.
Gradual Types For Functions
Related to functions is the concept of types. Types are a mechanism to ensure that for any "well-typed" expression, the result of evaluating the expression will also be well-typed, and that evaluation will not get stuck. These two guarantees are referred to as preservation and progress. Types are an extremely powerful tool, when properly applied.
Ruby's syntax is also flexible enough to permit adding type annotations like the following:
fun polynomial(a: int64, b: int64, c: int64) do
  a + b * c
end
Again, I'm not going into detail about types in this post. However, Rubinius 3.0 will include gradual typing for functions. The field of gradual typing is experiencing growing interest, as illustrated by this recent talk by Philip Wadler at Galois. We will apply the best current research on gradual typing in Rubinius 3.0.
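To make the idea a little more concrete, here is a hypothetical sketch in plain Ruby of what a checked function boundary could do at runtime. This is not the Rubinius implementation; a real gradual type system checks statically where annotations exist and inserts casts only at typed/untyped boundaries. The checked helper is made up for illustration.

```ruby
# A checked boundary: annotated parameters are verified when the function
# is called from untyped code; everything else stays fully dynamic.
def checked(kinds, fn)
  ->(*args) do
    kinds.zip(args).each do |kind, arg|
      raise TypeError, "expected #{kind}, got #{arg.class}" if kind && !arg.is_a?(kind)
    end
    fn.call(*args)
  end
end

polynomial = checked([Integer, Integer, Integer], ->(a, b, c) { a + b * c })
polynomial.call(2, 4, 9)  # => 38
```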
There's one aspect of gradual typing that I do want to make clear: Objects are the absolute worst place to put types because types conflict with the reason objects are useful.
Objects need to provide the minimum interface to inter-operate. In other words, objects need to be as isolated as possible. Objects also need to have the ability to be incomplete. This incompleteness, or partial completeness, is not just defined by something missing. The partial-ness provides a space for behavior to evolve in a way that integrates with the already existing behavior.
In Rubinius, we have no intention to add typing to objects. Down that road awaits infinite pain and suffering.
There's one final idea I want to present today: multiple dispatch for functions and methods. For methods, dispatch (or sending a message) is currently done by considering only the kind of object that the message receiver represents. Unfortunately, this forces a single method body to include logic for any number and kinds of objects that can be passed as parameters.
Array#[], or element reference, can take different numbers and kinds of arguments. It might receive a single Fixnum, two Fixnums, a Range, or an Object that responds to #to_int. I'd have to go look at the RubySpecs to know if I've covered all the cases. This method is not unusual in the complexity of its interface. There are worse.
IO.popen is an egregious example. It has at least 43 possible combinations of arguments. Some of those arguments can partially overlap, and the semantics when they do are essentially undefined. The APIs in the Ruby core library are embarrassingly messy. It's obvious that we need additional support in the language to handle the complexity without a mound of the proverbial balls of mud.
In multiple dispatch, the receiver, number of arguments, and kinds of objects passed as parameters are all considered when finding the correct method to handle the message that was sent.
By using multiple dispatch, we can write each method to handle the specific work that it needs to perform based on the kinds of objects it receives and correctly factor the shared work into a separate method. This improves our ability to comprehend the code while also improving the performance of the system as well.
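The dispatch described above can be approximated in today's Ruby with a small sketch (this is not Rubinius syntax or its implementation, just an illustration with made-up names): method bodies are registered under the classes of their parameters, and a call picks the first body whose signature matches the arguments' number and kinds.

```ruby
class Multimethod
  def initialize
    @table = {}  # maps a signature [Class, ...] to a body
  end

  def define(*kinds, &body)
    @table[kinds] = body
  end

  def call(*args)
    # find the first registered body matching the arguments' number and kinds
    _, body = @table.find do |kinds, _|
      kinds.size == args.size && kinds.zip(args).all? { |k, a| a.is_a?(k) }
    end
    raise ArgumentError, "no method matches #{args.inspect}" unless body
    body.call(*args)
  end
end

element_ref = Multimethod.new
element_ref.define(Integer)          { |i|    "element at #{i}" }
element_ref.define(Integer, Integer) { |i, n| "#{n} elements from #{i}" }
element_ref.define(Range)            { |r|    "elements #{r.first} to #{r.last}" }

element_ref.call(2)     # => "element at 2"
element_ref.call(2, 3)  # => "3 elements from 2"
element_ref.call(1..4)  # => "elements 1 to 4"
```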
In Rubinius 3.0, we are implementing multiple dispatch and using it to rewrite the Ruby core classes. Following the example above, we might define Array#[] as follows:

def [](index=Fixnum())
  # return element at index
end

def [](index=Fixnum(), num=Fixnum())
  # return num elements starting at index
end

def [](range=Range())
  # return elements from range.start to range.end
end

def [](index)
  # coerce index and dispatch
end

def [](index, num)
  # coerce index, num and dispatch
end
The compiler that is used to compile the Rubinius 3.0 kernel will understand multiple dispatch, so successive method definitions add to, rather than overwrite, the set of methods that can handle a message.
A note about the syntax above: def [](index=Fixnum()) defines a method that takes a single parameter that is a kind-of Fixnum. The "default argument" syntax in Ruby is the only thing that permits expressing this simply. To distinguish this positional argument from a default argument, note that Fixnum() has no value in parentheses. In contrast, def [](index=Fixnum(123)) defines a single default argument with value 123. Passing a parameter that is a kind-of Fixnum will match, and if no parameter is passed, the value 123 will be used.
There's an additional aspect of the Fixnum() syntax that I want to highlight. It looks like a function or operation, and that's important. These are not "types". They are match-syntax for a kind of object and also reflect an operation that would coerce an arbitrary Object instance into an object of the specified kind. In the case of Fixnum() or Integer(), it would be the #to_int method.
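A hypothetical sketch of that coercion in plain Ruby (the Meters class and coerce_index helper are made up for illustration; the #to_int protocol itself is standard Ruby):

```ruby
# An arbitrary object can participate wherever an integer index is
# expected, provided it implements the #to_int coercion protocol.
class Meters
  def initialize(n)
    @n = n
  end

  def to_int
    @n  # coerce to the Integer kind
  end
end

def coerce_index(obj)
  return obj if obj.is_a?(Integer)
  obj.to_int  # the operation implied by the Fixnum()/Integer() match-syntax
end

coerce_index(5)              # => 5
coerce_index(Meters.new(3))  # => 3
```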
To summarize, we have these things in Rubinius 3.0: functions, gradual types for functions, and multiple dispatch for methods and functions.
This Is Not Rubinius X
I want to emphasize that this is not Rubinius X.
Rubinius X includes these ideas but has many additional features. My objective for introducing these features into Rubinius 3.0 is to massively reduce the complexity of the current implementation of Ruby, significantly improve the performance of Ruby, and build the foundation for Rubinius X (and other languages) to integrate with Ruby.
The ideas explained in the other posts about the new instruction set and the tools we are building are all focused on making it possible to transition existing applications to Rubinius X without paying the cost of disruptive rewrites. With this in mind, here's one more thing.
A New Machine
We are living at a time when active experimentation with languages is escaping academia and having a major commercial impact. There was a dreary day when it looked like Java, C#, and C++ would dominate programming. Thankfully, that's no longer the case. Very good new languages like Rust and Swift are commercially viable, and "experimental" languages like Haskell and Idris are making their way into industry. Very exciting!
While working on Rubinius, we have learned a lot about features that facilitate language development. However, underneath, we have been biased toward many features in Ruby. This has limited the utility of Rubinius in building languages with features that don't significantly overlap those in Ruby. However, as I've described in this post, Ruby's semantics are too limited to provide a language that is useful for many critical programming tasks today.
Accordingly, we are extracting a more useful language platform out of Rubinius as the Titanius system. With the function support I'm adding in Rubinius 3.0, we will use dynamic (Ruby-like), static (C-like), and complex (Idris-like) semantics to refine our design and implementation. We want to ensure that the languages are able to maximally reuse existing components while still having the ability to express their own semantics in a fundamental way.
I hope you have enjoyed this series on Rubinius 3.0 and that it has given you a view into a much more useful and refined Ruby language.
There are so many hard problems that we need to solve. To be happy writing code, the language must solve the problems we have. Then we can help people using our products to be happy, too. Then businesses can be profitable by building those products that we are happy making. We can't avoid understanding this deeply and we must take responsibility for it. I hope you'll join us on this journey.
I want to thank the following people: Chad Slaughter for entertaining endless conversations about the purpose of programming languages and challenging my ideas. Yehuda Katz for planting the seed about functions. Brian T. Rice for trying to convince me that multiple dispatch was useful even if it took six years to see it. Joe Mastey and Gerlando Piro for review and feedback, some of it on these topics going back more than a year. The Rubinius Team, Sophia, Jesse, Valerie, Stacy, and Yorick, for reviewing and putting up with my last-minute requests.
Yesterday, I presented a look into the new instruction set in Rubinius 3.0. Today, I want to talk about changes to some of the bigger systems, like the garbage collector, just-in-time compiler, and the Rubinius Console. This will be the shortest post, but we'll be writing a lot more about these parts in the coming weeks.
I hope you're enjoying these posts and finding them inspiring. I'm excited to be bringing these ideas to you. In case you've been thinking about contributing or joining the Rubinius Team but are unsure if you want a lot of public attention, I wanted to share a book I've been reading: Invisibles: The Power of Anonymous Work in an Age of Relentless Self-Promotion.
To summarize, we're not recruiting rock stars. If you are one, that's great. For the rest of us, the online world can be very hostile at times, especially with the harassment of women and minorities that we are seeing on a daily basis. We must work hard to end these harmful actions, and at the same time give people safe places to work from. If you'd like to contribute but stay anonymous, we completely support you.
The phrase "virtual machine" is most often used to refer to a system like Rubinius whose primary purpose is to execute a program written in some programming language. The phrase is quite vague. The main subsystems in Rubinius are the garbage collector, the just-in-time compiler, the instruction interpreter (which we discussed in Part 3), and code that coordinates these components and starts and stops native threads. All of these are quite common in a system like Rubinius.
In this post, we also look at a set of tools that are deeply integrated into the rest of the Rubinius components. Sometimes these sort of tools are considered an after-thought. In Rubinius 3.0, we are approaching these tools as fundamental parts of the system.
There are two changes coming to the Rubinius garbage collector. It will move toward a fully concurrent implementation and we'll be working on implementing near-realtime guarantees on any critical pauses. Also, we'll add a new type of memory structure that I'm calling a "functional object".
Right now in Rubinius, every object basically looks like the schematic below:
+---------------------+
|    Object header    |
|    Object class     |
| Instance variables  |
|         ...         |
+---------------------+
|   Optional Object   |
|  reference fields   |
+---------------------+
|   Object-specific   |
|        data         |
+---------------------+
In Rubinius 3.0, there is an additional type of object:
+---------------------+
|       Header        |
+---------------------+
|   Optional Object   |
|  reference fields   |
+---------------------+
|        Data         |
+---------------------+
We are already using the second kind of object in Rubinius now, primarily for Bignum memory management, but we are formalizing and expanding our use of it to many other contexts.
The Rubinius just-in-time compiler processes bytecode to generate native machine code "functions" that execute the methods in your Ruby program. The JIT leverages the LLVM compiler library to generate the machine code. Because most of the JIT is currently implemented in C++ to easily interface with LLVM libraries, it is distant from Ruby and not easy to work with.
The most important change for the Rubinius JIT is that we'll move it as deeply into Ruby as possible. The extremely difficult aspects of generating machine code, like register allocation, instruction selection, and instruction scheduling, will still be handled by LLVM.
The other changes to the JIT are architectural. Right now, when a method is called a lot, it will eventually be queued for the JIT to compile. During compilation, the JIT will use runtime type information to combine (or inline) not just the single method itself, but a chain of methods along the call path. There are several problems with this approach that are addressed by the changes below:
- Multi-phase: A running program does not have exactly the same behavior at every point during its execution. Recognizing that programs have distinct phases of execution, the Rubinius JIT will adjust decisions to be more appropriate for that phase of the program.
- Multi-paradigm: There are two very broad categories of JIT compiler: one essentially compiles a complete method at a time while the other compiles a specific execution trace. The Rubinius JIT is currently a method JIT. In some cases, especially hot loops, a tracing JIT may be more appropriate.
- Multi-faceted: There is more than one way to improve performance of a piece of code, but right now the Rubinius JIT only has one way to do this. A multi-faceted JIT, on the other hand, will use many different approaches. Some methods may be transformed at the bytecode level, with new methods written from optimizing the original bytecode. Or methods may be split into many different ones depending on the types of values they see. The multi-faceted approach is not a set of tiers where higher tiers are better. It's the idea of better tailoring the kinds of optimizations to the features in the code.
The feature of the new Rubinius JIT that I'm most excited about is the JIT planner. Similar to a query planner in an RDBMS, the JIT planner will provide a way to record and analyze JIT decisions.
There are many moving pieces in Rubinius. Making sense of how they are performing and interacting is important. To support this, Rubinius includes a number of low-cost counters that capture key metrics of the Rubinius subsystems. Metrics make it possible to observe the actual effects of system changes in the context of production code.
The Rubinius metrics subsystem currently has a built-in emitter for StatsD. Other emitters can be provided if they are useful. Already, Yorick is sending Rubinius metrics to New Relic's Insights API to monitor an application's behavior.
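As a sketch of what such an emitter involves (illustrative only, not the actual Rubinius emitter code), the StatsD wire protocol is simply name:value pairs tagged with a metric type, sent as UDP datagrams:

```ruby
require 'socket'

# A minimal StatsD-style emitter sketch. The class name and metric names
# are made up; the "name:value|g" gauge format is the StatsD wire protocol.
class StatsDEmitter
  def initialize(host = "localhost", port = 8125)
    @host, @port = host, port
    @socket = UDPSocket.new
  end

  # format one metric as a StatsD gauge line
  def format(name, value)
    "#{name}:#{value}|g"
  end

  def emit(metrics)
    metrics.each do |name, value|
      @socket.send(format(name, value), 0, @host, @port)
    end
  end
end

emitter = StatsDEmitter.new
emitter.format("rbx.gc.young.count", 42)  # => "rbx.gc.young.count:42|g"
```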
This has tremendous value to us as we develop Rubinius. Many times we are asked about an issue in Rubinius and when we inquire about the application source code, we're told it's proprietary. We understand the need for protecting intellectual property, but it severely limits our ability to investigate. The metrics output makes it possible to share non-critical application information in a way that will help us improve our ability to address issues that you encounter with Rubinius.
The user interface is not something we often discuss about programming languages. As language designers, we may think and talk about it. But I have seen far fewer such discussions between language designers and language users, and almost no serious, extensive studies of usability in language design. (If you have references, please send them!)
One common discussion of usability that we do hear about in programming is the Unix tools philosophy, or the idea of doing one thing well. A simple program that does one thing well can be composed with other programs to do more complex things. I don't object to these ideas, but there's another side of the story no one talks about: What is the system underneath that makes it possible to pipe output from one program to another?
In art we may talk about figure and ground, and in building tools we must consider both the pieces the user interacts with, as well as the system behind those pieces. Over-emphasize the pieces in front and the user gets a bag of disparate fancy things that are collectively junk. Over-emphasize the system underneath, and the user gets a rigid, unwieldy block of granite that is equally unusable.
The Rubinius::Console is a set of tools combined with a coherent, integrated, systematic set of features that enable the tools to perform and coordinate well. I'll briefly talk about the main components below.
All the tools are built on the foundation of the REPL, or little-c console. A REPL generally takes commands, executes them, and displays the results. All of the tools here are part of the Rubinius::Console component. I want to give a brief introduction to each one for now, but we'll be writing a lot more about them soon.
The inspector is a collection of features that enable tracing through the execution of a program and inspecting the program state. It can show the value of local variables, what methods are currently executing, and other aspects of the running program. These features are usually included in a separate tool called a debugger. We think these features should be available at any time, whether running in development mode on your local machine, or in production mode on a remote server.
When investigating program behavior, sometimes it is helpful to measure how long a particular piece of code takes to run. Typically, this requires setting up a separate run with separate tools to do a benchmark. We think the essence of a benchmark is simply measurement and it should be available at any time.
Another type of measurement is the relative measurement of multiple components, usually the chain of methods invoked to perform some computation. This is usually called a profile and the focus is on the relationship between the measurements so that the most costly ones can be improved.
A running program has both an object graph and an execution graph. The object graph is the relationship between all the objects in the system. The execution graph includes all the call paths that have been run during the program execution. The execution graph is not just the current stack of methods executing.
The analysis tools are available to investigate allocation issues or unwanted retention of references to objects, something often referred to as a memory leak. They can also investigate the execution graph to find relationships between code that are not visible in the source code due to Ruby's dynamic types.
While a Ruby program is running, an enormous amount of important and useful data is generated in Rubinius. When the program exits, almost all that data is dropped on the floor. The CodeDB will preserve that data, enabling it to be used at many points in the lifetime of a program, from the first line written to inspection of a problem in a running production system.
The CodeDB is more of a functional description than a specific component or piece of code. In Rubinius today, we store the bytecode from compiling Ruby code and read from this cache instead of recompiling the Ruby code. However, we still load all the code regardless of whether it is used. In Rubinius 3.0, we will only load code that is used, which will improve load time and reduce memory usage from storing unused code.
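A hypothetical sketch of the load-on-demand idea (the names here are made up; the real CodeDB is far more involved): compiled code is fetched from a store only when first requested, instead of loading everything up front.

```ruby
# Code is loaded lazily and memoized, so code that is never used is
# never brought into memory.
class CodeCache
  def initialize(store)
    @store  = store  # e.g. method name => compiled code
    @loaded = {}
  end

  def fetch(name)
    @loaded[name] ||= @store[name]  # load on first use, then memoize
  end

  def loaded_count
    @loaded.size
  end
end

cache = CodeCache.new("foo" => "<bytecode for foo>", "bar" => "<bytecode for bar>")
cache.fetch("foo")
cache.loaded_count  # => 1, because "bar" was never loaded
```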
As we covered in the last post, the bytecode is merely one representation of Ruby. The CodeDB will enable us to store many representations of Ruby code across many invocations of the program, and potentially across many computers. The representations of Ruby code combined with the rich data created by running programs gives us the foundation for even more exciting tools. One of these may be a refactoring editor, which seems to be the holy grail of every object-oriented programmer. We think there are even more interesting tools than automated refactoring and are excited to tell you more about them.
Tomorrow, we will finally tie some of these pieces together in Rubinius 3.0 - Part 5: The Language.
I want to thank the following reviewers: Chad Slaughter, Joe Mastey, and the Rubinius Team, Sophia, Jesse, Valerie, Stacy, and Yorick.
So far in this series, I've talked about the Rubinius Team and our approach to building Rubinius and delivering it to you. Today, I'll start talking about technical aspects of Rubinius 3.0, beginning with the new instruction set.
In this context, I want to reiterate what I wrote in the first post about over-emphasis of technology. In many projects, there appears to be an implicit assumption that those who code do the "technical" tasks, while beginners or those who can't code do the "non-technical" tasks, and the latter are inherently less valuable. This often goes unquestioned, but it's obvious that we spend time on what we consider valuable: if the documentation is lagging, that tells you it's considered less important.
One manifestation of this that has always bothered me is tagging issues for "beginners", or suggesting that beginners start out with tasks like documentation, or other "sweeping the floor" tasks. If the technology tasks were the most important, we'd try to get everyone to work on them, even beginners. In reality, they are neither the most important, nor the most difficult, which is why it's easy for us writing code to go do them (and over-emphasize their importance).
We have done poorly in this regard with Rubinius, which is why I'm highlighting it. These posts are in order of importance. By the last post, we'll see an interesting relationship between the parts covered in each post. Here, on post three of five, I'm starting to talk about technical details. If we had done the first two parts better in Rubinius, you'd already be using the features I'm starting to talk about today.
In Rubinius, there is a parser that turns Ruby source code into a tree structure and a compiler that turns this tree into bytecode instructions. The "virtual machine" then interprets the instructions to run the Ruby program. So, Rubinius uses a bytecode interpreter to run Ruby.
In contrast, MRI 1.8 parses Ruby source code into a tree and then walks the tree to execute the Ruby program. It does not convert the tree into bytecode instructions first. This illustrates that there is more than one way to execute code. The tree and the bytecode are called intermediate representations because they come in between the source code and the execution of the program specified by the source code.
If you have Rubinius installed, you can see these intermediate representations. Let's look at the tree that results from a simple program:
$ rbx compile -A -e 'puts "Hello" ", " "world"'
@string: "Hello, world"
If you look closely at the source code, you'll see that there are three Strings next to each other with no operator in between them. In the tree output, there is a single String. The Ruby parser concatenates two adjacent String literals. This knowledge may come in handy if you take a class from Sandi Metz.
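You can verify the parser's behavior directly in Ruby:

```ruby
# Adjacent String literals are concatenated by the parser, so these two
# expressions produce the very same value.
a = "Hello" ", " "world"
b = "Hello, world"
a == b  # => true
```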
The tree above is the abstract syntax tree (AST). It has all the details necessary to generate bytecode. Sometimes this representation is too verbose. There is another tree representation, called an s-expression, that is often used.
$ rbx compile -S -e 'puts "Hello" ", " "world"'
[:script, [:call, nil, :puts, [:arglist, [:str, "Hello, world"]]]]
If you squint at this and mentally replace square brackets with parentheses, colons and commas with nothing, you end up with something that may look familiar.
(script (call nil :puts (arglist (str "Hello, world"))))
This looks almost exactly like Lisp and proves that Ruby really was based on Lisp, as Matz has said. You might mention that to any Clojure programmers that are trying to convince you that their version of Lisp is better.
Joking aside, what it shows is that there is a deep structure shared by different programming languages. In Lisp, that structure is visible to begin with. In Ruby, we need tools to derive that representation for us. This is important to note because the next step we look at, bytecode, is merely another representation of this structure.
$ rbx compile -B -e 'puts "Hello" ", " "world"'
============= :__script__ ==============
Arguments: 0 required, 0 post, 0 total
Stack size: 2
Literals: 2: "Hello, world", :puts
Lines to IP: 1: 0..10
0001: push_literal "Hello, world"
0005: send_stack :puts, 1
These different representations are equivalent from the view of preserving the fundamental semantics, or meaning, of the program. So, why do we use these different forms?
The reason is, they provide different advantages. The AST is easy to understand and easy to process. Once the source code is parsed, the virtual machine can begin executing the program immediately. However, the AST takes up quite a bit of space and requires doing some things over and over. The effort to optimize the tree to remove unneeded steps can be complex. On the other hand, bytecode takes more time to generate but removes some redundant steps in the process. Further optimizing the bytecode can often be performed just by looking at a sequence of a few instructions, something called peephole optimization.
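As a sketch of what a peephole pass looks like (the instruction names are made up for illustration and this is not Rubinius's actual optimizer): scan a small window of instructions and remove patterns known to do no useful work, such as a literal push immediately followed by a pop.

```ruby
# A minimal peephole pass over a bytecode-like instruction list.
def peephole(instructions)
  out = []
  i = 0
  while i < instructions.length
    current, following = instructions[i], instructions[i + 1]
    if current.first == :push_literal && following && following.first == :pop
      i += 2  # drop the dead push/pop pair
    else
      out << current
      i += 1
    end
  end
  out
end

code = [[:push_literal, "unused"], [:pop], [:push_self], [:send_stack, :puts, 1]]
peephole(code)  # => [[:push_self], [:send_stack, :puts, 1]]
```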
The main takeaway is that a Ruby program can have many representations. In the next post, we'll even look at a Ruby program represented by LLVM IR. Any one of these general classes of representations, like trees, may itself have multiple forms. For example, the object graph and the s-expression above are both forms of trees. In the same way, "bytecode" instructions can have different forms. But before we get to that, let's talk about the general design of an instruction set.
The Instruction Set
There are two main problems that I've encountered in the design of an instruction set. I'll call them the primitiveness fallacy and the granularity fallacy.
The primitiveness fallacy is the idea that the instructions must be primitive operations. If an instruction can be decomposed into other instructions, then it is not primitive enough. That seems to be the conventional wisdom. For example, addition of two machine integers is seen as a primitive operation, so there is an instruction for that. But searching for a particular byte in a sequence of bytes can be represented by more "primitive" operations like incrementing an index to reference the next character, and comparing the byte with the target the same way two integers are compared.
The primitiveness fallacy is really the manifestation of an arbitrary decision on what to include in an instruction set. The operations of the instruction set are not really primitive, or the instruction set would consist of just a NAND gate. Forcing the operations in an instruction set to be too simple carries the risk of losing important information about the program. In a semantic sense, the information is not lost because the program executes correctly. However, there is another dimension to a program besides correctness, and that is performance. We'll return to that in a moment.
The second fallacy is the idea that all the instructions should have about the same granularity. If one instruction adds two integers and another instruction reads a file into a buffer, the "granularity" is significantly different. This may appear to be the primitiveness fallacy again because this example, reading from a file, can be implemented with simpler instructions. However, reading from a file involves an operation that cannot typically be performed by simpler instructions, namely, invoking a system call. System calls, in typical virtual machines, are handled separately from instructions, usually by a supporting library of functions. System calls are not the only illustration of the granularity fallacy; they represent any need to access functionality outside the program itself.
The problem with these fallacies is that they undermine the whole system. Primitive operations, all of about the same granularity, sound a lot like sand to me. However, the instruction set is the foundation for executing a program. Foundations are the most important part of a structure; they are built with the biggest stones, not sand. The same is true of intellectual systems like logic and math. If the foundation is insufficient or inconsistent, everything unravels.
A second problem is that they lead to something I call the semantic cone, an inverted cone that is larger at the top than at the bottom. At the top, there's the program source code. At the bottom, there are the virtual machine instructions. The shape of the cone (larger at top than at bottom) represents the loss of information as the source code is processed into instructions.
There is a saying, "First make it correct, then make it fast". This is good advice. It highlights that a program can have many forms, but to be useful, they need to be correct. So we start with a correct form, then transform it to one that is still correct but performs better. It's this second step that suffers in the context of a semantic cone. It's possible to make the program correct, but the amount of information lost makes it very difficult to make the program fast.
To address this limitation, we can use something like a just-in-time (JIT) compiler. However, to make the JIT work well, we have to invert the semantic cone. That is, we have to recreate the information that we lost going through the cone so that we can use that information to make the program fast. That seems wasteful. What if we just don't throw away the information to start with? Great question!
My answer to this is that we need to convert the semantic cone into a semantic column to the greatest extent possible. The way to do that is to have instructions that represent information that we normally throw away. This brings us to the new Rubinius instruction set.
Kinds of Instructions
In Rubinius, every executable context, every script body, class or module body, block, or method, is represented by a CompiledCode instance. Every compiled code object has a separate executor, or "function" that executes that compiled code's "instructions". These instructions may be bytecode, or they may be machine code generated by the JIT compiler.
Stack Instructions
Stack instructions are operations that remove one or more values from a special structure, the stack, then compute a new value and push it back onto the stack. Stack instructions are illustrated in the bytecode example above.
The advantage of stack instructions is that they are easy to generate and fairly easy to reason about. In fact, there are languages that are essentially just stack instructions, like Forth, Factor, and PostScript. If you have ever used an RPN calculator, you have written a stack program.
The downside to stack instructions is that they can be somewhat difficult to optimize. The stack itself obscures the relationship between operations.
The current Rubinius instruction set is stack-based. In Rubinius 3.0, we will retain and refine the stack instructions. This enables us to develop the new instructions incrementally while continuing to compile to the stack instructions.
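A toy interpreter makes both points concrete: stack instructions are trivial to evaluate, but the operands of each operation are implicit in the stack rather than named. The instruction names here are invented for illustration, not actual Rubinius opcodes:

```ruby
# Evaluate (1 + 2) * 4 with made-up stack instructions
def run_stack(insns)
  stack = []
  insns.each do |op, arg|
    case op
    when :push then stack.push(arg)
    when :add  then b = stack.pop; a = stack.pop; stack.push(a + b)
    when :mul  then b = stack.pop; a = stack.pop; stack.push(a * b)
    end
  end
  stack.pop
end

run_stack([[:push, 1], [:push, 2], [:add], [:push, 4], [:mul]])  # => 12
```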
Register Instructions
When computers were first invented, there was a law passed that required every real or virtual machine to have only one kind of instruction. Except there really was no such law passed. We just act like there was.
Every major virtual machine that I know of basically chooses either stack-based or register-based instruction sets. You may have heard that Lua or mruby uses a "register-based virtual machine". Of course, whenever two things are different, our natural tendency is to rank them. If they are different, one must be better than the other. So, we may hear that register machines are "faster".
As noted above, there are advantages and disadvantages of stack-based instructions. There are different advantages and disadvantages of register-based instructions. In some cases, the advantages outweigh the disadvantages. Since Rubinius can use a different function to execute every compiled code object, we can use stack-based instructions for one and register-based for others. This enables us to benefit from the advantages while limiting the disadvantages.
Rubinius 3.0 adds register-based instructions to the bytecode. In fact, there's nothing preventing us from mixing register and stack instructions in the same method.
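Continuing the toy example above, a register form of the same computation names its operands explicitly, which makes the data flow between operations visible to an optimizer. Again, these are invented instructions, not the actual Rubinius encoding:

```ruby
# Evaluate (1 + 2) * 4 with made-up register instructions: [op, dest, a, b]
def run_register(insns, nregs)
  regs = Array.new(nregs, 0)
  insns.each do |op, d, a, b|
    case op
    when :loadi then regs[d] = a                  # load an immediate value
    when :add   then regs[d] = regs[a] + regs[b]
    when :mul   then regs[d] = regs[a] * regs[b]
    end
  end
  regs[0]
end

run_register([[:loadi, 0, 1], [:loadi, 1, 2], [:add, 0, 0, 1],
              [:loadi, 1, 4], [:mul, 0, 0, 1]], 2)  # => 12
```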
Parsing Instructions
In my experience, 95% of the time that someone wants to write a C-extension and escape Ruby code, they are either parsing some text, generating some text, or both. Parsing is basically decomposing text into some structure, and templating is basically composing some structure into text.
Parsing is such a common and essential part of programs that we should have special support for it. Rubinius 3.0 adds support for parsing instructions modeled after LPEG, which is an implementation of a parsing machine in Lua for parsing-expression grammars (PEGs).
There are many nice properties to PEGs, perhaps the most interesting being composability. This enables building up more complex grammars from simpler pieces. For instance, what if parsing dates were part of the base language and you wanted to create a special-purpose language, perhaps for parsing a simple config file format? You could compose the base language's date parsing with other parts of your language.
Adding parsing instructions also enables Rubinius to read pre-compiled bytecode describing parsing operations and execute them. This is extremely useful when bootstrapping, where we need to parse Ruby code to execute it, but we need to execute Ruby code to be able to parse it.
Assertion Instructions
Assertion instructions describe constraints for computed values without changing the computation itself in any way. However, the assertion instructions may change the way the program runs, depending on configuration. The assertion could raise an exception, which would abort the program if not handled. Or it could just log the location and values that failed the constraint validation.
Tests for code are important. But tests are usually only representative. If you have a test for how many cats can play in the same room, you don't typically write tests for 0, 1, 2, 3, ..., N cats. This is especially true if the range of values is huge.
Usually, it's possible to describe, as very simple predicates, the constraints on a value. It should be greater than zero. Or it should be between 200 and 5 million. Assertion instructions can optionally be executed when running tests or in production to check that values conform to constraints. In this way, defects can more readily be pinpointed.
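That behavior can be sketched in plain Ruby. The names and the configuration mechanism are hypothetical stand-ins for the instruction-level feature: the computed value passes through unchanged, and configuration decides whether a failed constraint raises, logs, or is skipped:

```ruby
ASSERT_MODE = :log  # :off, :log, or :raise -- a stand-in for VM configuration

def assert_constraint(value, description)
  return value if ASSERT_MODE == :off || yield(value)
  message = "constraint failed: #{description} (got #{value.inspect})"
  raise message if ASSERT_MODE == :raise
  warn message
  value
end

# The computation itself is unchanged; only the check is configurable
cats = assert_constraint(3, "cat count must be non-negative") { |n| n >= 0 }
```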
Instrumentation Instructions
Instrumentation instructions enable analysis and monitoring of code. Like assertion instructions, they do not change the semantics of code in any way.
The design of the instrumentation instructions was influenced by a paper titled, "The JVM is Not Observable Enough (and What To Do About It)".[1] The authors had implemented a framework for instrumenting Java bytecode. This is a common approach for program analysis. They discovered serious problems with this approach, including causing the Java virtual machine to deadlock or even segfault.
An even more influential paper for me was, "Hidden in Plain Sight".[2] The authors of DTrace described the constraints they were working under. They wanted DTrace to be used in production, so it was essential that it had no performance impact when not in use, it had to be absolutely safe for production, and it needed to collect only the relevant data so that it would operate well.
The instrumentation instructions enable Rubinius to build powerful tools that can analyze and monitor production code with the guarantee that the semantics do not change. This is especially important in a regulatory environment where code changes must be strictly controlled.
That was a fast and high-level introduction to the new Rubinius 3.0 instruction set. I'll be writing in much greater detail about all parts of this in the coming weeks.
One important aspect of the new instruction set is the attempt to completely describe the language semantics in the instruction set itself. This means not relying on supporting functions kept in a separate place. This enables us to use all the existing tools for compiling Ruby, and then run the resulting program on anything that provides these instructions.
For example, some people are very interested in Rust and have asked if we're going to rewrite the virtual machine in Rust. We have no plans to do so, but if someone is interested, all they would need to do is implement the instruction set. The same goes for Haskell or Go or even asm.js.
The possibilities here are pretty exciting. The Ruby language itself is not the most important piece, as we'll see in the last post of this series.
I want to thank the following people: Chad Slaughter for entertaining endless conversations about how we build software and challenging my ideas. Yehuda Katz for bringing us many nice ideas. Joe Mastey and Gerlando Piro for review and feedback, some of it on these topics going back more than a year. The Rubinius Team, Sophia, Jesse, Valerie, Stacy, and Yorick, for putting up with my last-minute requests.
In this post, I'll talk about the release process for Rubinius 3.0. We want you to use the new Rubinius 3.0 features as soon as possible. To explain our approach, I'll first talk about releasing software. I'm spending 20% of this week's posts on the release process because it's one of the hardest things we've struggled with.
Over the past twelve months or so, we released Rubinius fifteen times. That was less than the one release per week I was aiming for. However, in the two years previous to that, we did not release Rubinius a single time. It's not that we didn't do tons of work; we did! In fact, the number of commits per year hasn't varied that much. But if we don't release, it's as if the features don't exist. Thus, how we release has a big impact.
Why did we wait so long to release Rubinius 2.0? It's simple, we put too much into it! The reason we did that was concern for quality, not to delay release. However, our decision about when to release was flawed. We were aiming for broad feature coverage at a point in time instead of optimizing for how quickly we could roll out features to you. To understand why, we need to look at what our priority should have been and why it wasn't that.
As a project, our priority is to have an impact, providing value to people using Rubinius. We did not have our priorities straight, and this is why: we did not prioritize for people using Rubinius. Instead, we were persuaded by the following conventional wisdom:
If it doesn't work when someone tries it, they may wait a long time before trying it again, if ever.
This advice was often offered, and it is completely wrong. There are several problems with this fallacy. It confuses local and global effects. For every one person for whom a feature did not work, there are an unknown number of people for whom some feature did work. The latter group may have a much bigger global effect than the one person. It also does not account for the cost of people waiting for features. Finally, it's rooted in fear. The best antidote to fear is facts. If engagement is important, we should measure engagement and take steps to improve it.
Now that we know what to not do, what should we do? To understand that, we need to question how we release software.
How we release software is heavily influenced by how we build software, including our definition of working software. It is also influenced by the relative cost to release it. And finally, by social factors related to adoption of changes in general.
I see three fallacies in how we release software: we model software as being mechanical, we act like releasing it costs a lot, and we think individual pieces of software should be reliable instead of building systems to be reliable.
Software Is Mechanical
It seems that since software is technology, then it must be mechanical. Of course, this is not a new realization. We spend a lot of time arguing whether we should apply manufacturing or construction analogies and methods to creating software. However, software is not primarily mechanical.
Really, software is much more like biological systems than mechanical systems. We needlessly impose mechanical limitations on software. In software, we can build something figuratively in mid-air, with neither support nor suspension. It's a choice we made to model software on mechanical processes. We can choose differently.
The fact that features are constantly being developed does not mean that a system must be chaotic. Children do not wait till they are full-grown to chew on things. And while they are chewing on things with baby teeth, their adult teeth are growing. Some transitions (like losing teeth) can be disruptive, but for the most part, they get along fine even as their abilities are constantly developing. Why don't we choose biology as a model for creating software?
As opposed to mechanical systems, biological systems are very well adapted to functioning as a whole while changing or growing. Consider the difference between a plant growing from a seed and building a car. It's only at the very end of the manufacturing process that a car is able to function as a whole. Meanwhile, the fundamental metabolic processes in a plant function from the very beginning. The plant changes as it grows, so not every part is functional at the start. The key is to decide which parts should function first.
Another important aspect of biological systems is where the boundaries lie. There are cellular boundaries, system boundaries, and the organism as a whole, which has a boundary between itself and its environment. These boundaries serve the dual purpose of keeping things separate but in contact. Along with these boundaries, different parts of a biological system have different degrees of resilience. For example, a skeleton versus soft tissue. These two concepts, boundaries and joining different types of resilience, can be useful in understanding software release. I'll return to them later.
Releasing Costs A Lot
There is a significant, often unmeasured, cost of features sitting in the deployment pipeline or queue. That cost used to be balanced by another cost, the cost of delivery. Driving down one meant driving up the other.
In the era of shrink-wrapped software, the cost of each release was huge. Physical diskettes or CDs had to be created, put in boxes, wrapped in plastic, put on trucks, driven to stores or mailed to businesses. All that costs time and money. If a release had N features, then the total cost of release C would give C/N as the release cost of each feature. So the total cost of a feature would be the cost to develop it, D, plus the cost of release, or D + C/N. In other words, the cost of releasing software was like a tax on every feature.
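The arithmetic can be stated directly (the numbers below are invented for illustration). With a large release cost C, the per-feature tax dominates; as C shrinks, the total cost approaches the development cost alone:

```ruby
# Total cost of one feature: development cost D plus its share of release cost C
def feature_cost(d, c, n)
  d + c.to_f / n
end

feature_cost(10, 100, 20)  # => 15.0 (costly releases: a 50% tax on development)
feature_cost(10, 0.2, 20)  # ~ 10.01 (cheap releases: the tax nearly vanishes)
```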
Releases are now practically free. The better we make the release process, and the better your continuous delivery process, the closer the cost of delivering features to you gets to zero. That's called eliminating waste: resources spent on activities that do not provide value. Furthermore, the value of a feature lies not in its existence, but in its use. The sooner you have it, the sooner you can leverage its value. As the cost of releasing software approaches zero, the relative cost of waiting for features gets higher.
Programs Should Be Perfect
We assume that programs (individual pieces of software) should be perfect, or bug-free. The problem is, simply, that it's impossible. Instead, we should expect individual pieces of software to have defects, and build systems to be robust. It's possible to create a system of components where the reliability of the system is greater than the reliability of any single component. In fact, this is how many parts of a computer are constructed.
Not only do we expect programs to be perfect, but we expend a lot of effort trying to maintain that illusion. We craft processes to check and double-check things, signing off on this or that check box. We engineer systems with the assumption that the parts will work and consequently make the whole system brittle.
I'm not advocating that people be reckless or not care about failure. I'm advocating for the exact opposite. We can actually make systems better by expecting failure. Once you slay the lie that programs can be perfect, the oppressive fear of a program failing evaporates, replaced with a realistic effort to build resilient systems.[1]
To summarize, building software is more like a biological than mechanical process, we are approaching zero cost to deliver software, and we should focus on resilient systems that tolerate individual programs failing.
Changing The Release Process
To reiterate, our priority is for Rubinius to have an impact on its users. With Rubinius 3.0, we continue the approach we started with the 2.0 release, pushing more releases with a smaller number of changes in each release.
Our goal is that soon, continuous delivery for you will include pushing a new Rubinius release straight to production. If that sounds crazy, consider how much has changed in the past five years. Today, pushing Rubinius to production can mean dropping in a Docker container, and if something goes awry, dropping the whole mess and starting over. It's truly amazing.
In software release, as everywhere else, we must focus on managing complexity. As the release cost goes down, other costs rise in relative importance. Someone mentioned recently that newer developers mistake the complexity of N features as being N. It's actually exponential, or 2^N. If you have 2 features, it's not 2, it's 4. If you have 3 features, the complexity is 8. That's because each feature may interact with one or more of the others. Ideally, each release would have one feature.
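One way to see the exponential growth: a release with N features asks you to reason about every subset of those features that might interact, and a set of N features has 2^N subsets. A quick check in Ruby (feature names invented):

```ruby
features = %w[logging caching tracing]

# Count every subset of the feature set, from the empty set up to all N
subsets = (0..features.size).sum { |k| features.combination(k).count }

subsets             # => 8
2 ** features.size  # => 8, i.e. 2^N
```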
This was underscored for me in a training with Sandi Metz recently. She was teaching us refactoring and we used the mantra "one undo from green" to keep us on the golden path and out of the weeds. If we have made one change and the tests fail, we can go back and then go forward, rather than piling confusion on confusion. The reduction in cognitive cost is significant.
The same can be applied to changing software. If we release Rubinius with small enough changes and you are many releases behind the current one, you can jump all the way forward at once, or apply one change at a time, or bisect. Whichever way you choose, you can significantly simplify upgrading and more easily pinpoint any problem.
We aim for each release to be so small, it's hardly worth mentioning. The focus changes from a big release announcement to the point when a customer notices the value of a new feature. Think of it like a menu. Increasing the granularity makes it possible for the customer to combine things in ways that work for them, rather than packaging it up for them in only one way, and taking a long time to do so.
Besides changing the focus of the release process, we are adding a feature to Rubinius to support the process: automatic update.
The first time you start Rubinius after a fresh installation, if the setting does not already exist, you'll get a message like the following:
Rubinius is able to automatically install new versions. Installing
a new version will not overwrite your existing version.
Enable this feature? ([Y]es, [n]o, [a]uto, ne[v]er, [h]elp):
When a new version is available, you'll get a message like the following:
Version 3.0.5 is available. You are running version 2.8.12.
Installing the new version will not overwrite this version, both versions
will be available.
Install the new version? ([Y]es, [n]o, [a]uto, ne[v]er, [h]elp):
We will install the new Rubinius from a binary build using the same packager that was used to install the existing one. Since chruby already works exceedingly well with many existing system package managers, the new version of Rubinius should seamlessly fit into your workflow if you are using it.
How You Can Help
There are a number of things you can do to help us be more effective in getting new Rubinius features into your hands.
- Help us strip the ornamentation out of releases. Help us find the equivalents of the cardboard boxes, plastic wrap, CD cases, and related waste in our release process and eliminate them so we can deliver features more quickly.
- Help us create binary packages. We're going to dedicate December as Binary Packages for Rubinius month. We hope that if you have some free time over the holidays, you will experiment with building binary packages for Rubinius or helping existing maintainers update their packages.
- Communicate with us about what is working and what can improve. Help us understand the problem you have. Open an issue or write a post about how you're using Rubinius and link us.
- Help other people get started with Rubinius. The chruby utility is our favorite Ruby switcher. It's so simple and just works. It also can switch Rubies installed by many OS system packages, and other installers/switchers like rbenv. Even if you don't end up changing your current Ruby switcher, give chruby a shot. It has made life so much better for us.
Now that we've looked at the problems with releasing software and at what Rubinius is doing differently with 3.0, as well as what you can do to help, let's tie everything together into one simple idea.
Recently, Tom Dale posted The Road to Ember 2.0. The Ember authors have many new features and better approaches they want to implement derived from ample contact with the struggles that developers using Ember face. However, they realize that Ember users need things now. To balance these needs, they have a mantra: stability without stagnation.
I've always been impressed with the Ember development effort and it's exciting to read about the work they're doing. It's also validating to hear them talk about tackling issues similar to the ones we've faced with Rubinius. However, I would phrase their idea the opposite way. In Rubinius, we are aiming for progress with purpose.
The guiding principle is to iterate on what we want, not toward what we want. We want to start with a functioning kernel of a feature and grow it into a full-fledged, mature component. Going back to the discussion of biological versus mechanical models above, we are focused on getting just enough of the skeleton and boundaries in place to enable consistent, functional growth. Much of the design for Rubinius 3.0 has been done. The next three posts will get into technical details.
Ed: Some minor grammatical and spelling changes suggested by Joe Mastey were made to the original version to improve readability and clarity.
Today, I'm introducing some big changes for Rubinius. As the title says, this
post is only part one. There are five parts and I'll publish one each day this
week. I'll be covering these topics: the Rubinius Team, the development and
release process, the Rubinius instruction set, the Rubinius system and tools,
and one more thing.
Also, as the title says, this is Rubinius 3.0. The past year has been
incredibly influential in helping me understand the many facets of Rubinius as
a project and Ruby as a language and community. The other posts will dive into
more detail, but I want to highlight that all of this is Rubinius 3.0.
Introducing the Rubinius Team
Sometimes we save the best for last, but this is not one of those times. I'm
tremendously honored and excited to introduce you to the Rubinius Team.
They've all volunteered to contribute their time, experience, and passion to
improving Rubinius and its impact to make the world better.
Here we are, in no particular order:
Sophia Shao: As a recent graduate of Carnegie Mellon University's
Electrical & Computer Engineering department, Sophia is currently tackling a
massive application migration from MRI to Rubinius. She's also been improving
Rubinius every day. Hit her up for tips about debugging machine code.
Jesse Cooke: As co-founder of Watsi, a venture to
fund healthcare for people around the world, Jesse was part of YCombinator's
first ever non-profit. Jesse has been contributing in any way he can to
Rubinius for a long time. If you visit Portland, OR, you may see him riding
this weird bike with a belt instead of a chain.
Valerie Concepcion: If you're interested in getting things like Raspberry
Pis, Legos, and Wii Remotes to play well together, Valerie can help. Drawn to
the Maker movement and inspired by her friends who work in non-profits, she is
interested in applying technology for social good.
Stacy Mullins: At one point, Stacy would have gladly chosen a typewriter
over a computer. But at school for graphic design, she became fascinated by
technologies like HTML and CSS and the ability to create something from
scratch. Now she's learning about crafting code and communicating well.
Yorick Peterse: When not breaking code, Yorick is fixing it and asking
questions. Either way, there is a lot of code happening. He's drawn to the
deep technical details of systems like just-in-time compilers and concurrency.
He may or may not be a Dr. Evil character hatching plans for world domination.
Brian Shirai: Having once passed over Ruby for being too much like Perl,
Brian rediscovered Ruby over ten years ago and has been working on Rubinius
for the past eight. Inadvertently, he's also learned Perl.
After all these years, why do I want to form a Rubinius Team? And what is it?
Is it like the "core team" we see in Rails or other projects?
I'm so glad you're wondering about that!
Early in the Rubinius project, Evan Phoenix started a policy we called "the
open commit bit": if we accept your patch, you get permission to commit
changes to the source code repository.
This contrasted with many open source projects that had a small number of
people who could make changes to the code. Usually, this group was called a
"core team". Limiting permission to change the code was seen as an essential
part of maintaining code quality. If anyone could commit, people would just
make a big mess.
This conventional wisdom turned out to be false. We let anyone who made one
good patch have access to commit any changes they wanted. In practice, almost
everyone was extremely careful. We rarely had to revert changes, and when we
did, it was not usually a question of quality. Hundreds of people committed
changes and Rubinius benefited a great deal.
For this reason, whenever the topic of a "core team" for Rubinius came up,
Evan opposed it. There was no real value in trying to be gate keepers. Giving
people the opportunity to contribute and welcoming them to do so had a
positive impact and showed appreciation for their efforts.
The Rubinius Team is not about creating a different class of contributor,
exclusiveness, gate-keepers, or overseers.
Another characteristic of the typical open source project "core team" is that
the members are usually the most technically skilled and have the greatest
number of commits. This automatically creates an imbalance of emphasis on only
technical issues and technical expertise, despite the fact that the vast
majority of people using, contributing to, or impacted by a project will not
be "top technical contributors".
The Rubinius Team is not focused exclusively, or even primarily, on the
technical aspects of the project.
A third characteristic of typical "core teams" is the implicit privilege of
the members and the resulting economic, gender and diversity imbalance.
Someone struggling with two jobs won't have time to be a top committer, no
matter how capable they are. Likewise for someone caring for kids at home, a
responsibility that disproportionately rests with women. All of these problems
stem from the dangerous fallacy that open source software is a "meritocracy".
So, what is the Rubinius Team?
The Rubinius Team is a group of people who work together, influenced by our
values, to accomplish things that fulfill the Rubinius vision and mission.
Our vision is a world where Ruby is the most useful programming language for
building things that improve people's well-being and quality of life.
"Most useful" means the most benefit for the least amount of effort for the
greatest number of people. There will always be incredibly smart people who do
very difficult things. For the rest of us, to steal a quote by Moshe
Feldenkrais, we want to "make the impossible possible, the hard easy, and the easy elegant."
Our mission is to build the best Ruby implementation and the best programming
tools that benefit the greatest number of people, prioritizing our efforts to
improve access for people who have been marginalized and excluded.
We value impact, quality, inclusiveness, diversity and balance, and we
actively promote them. We celebrate our differences and appreciate them as a
source of strength. We prioritize improving access and championing the needs
of people who have traditionally been excluded. We get things done, lead by
example and we constantly strive to improve. We realize that we enjoy a lot of
privilege, and we work hard to empower others rather than advancing our own interests.
We welcome anyone who shares our vision, mission and values to be a part of
the Rubinius Team. And one of our objectives will be growing the team. There are many roles to play, from outreach to industry, academia, and communities like Women Who Code and Black Girls Code, to marketing, budgeting, and planning, to documentation and organizing meetups. There are many ideas we don't even know about yet that are waiting for you to create.
It's about quality
I want to talk more about the over-emphasis of technology in open source
projects because I don't hear this discussed often.
The source code written is a small part of a much bigger picture. The purpose
of design is to create something that is useful for humans. Better
understanding leads to better design. Better design leads to a more effective
tool. A more effective tool leads to better engagement. Better engagement
leads to greater understanding. There is no hierarchy here; there is no
ranking. They form a circle of interaction. Each of these is important, and
any one of them is only as good as all the others.
We strive to ensure that we are reaching the people we want to help, and that
we are helping the people we want to reach. We do this by seeking global
understanding of the problems our community needs to solve. Too narrow a focus
on the local technology problems will mislead us.
Pondering these matters leads us to consider the Rubinius community.
The Rubinius Community
I have a very broad view of the Rubinius community. It includes developers and
people learning to write Ruby. It also includes people who are not primarily
programmers but may need to understand or even write some Ruby code, such as a
database developer working with a team of Ruby programmers on an application.
The community also includes the people who use software written in Ruby, and
the businesses who employ people to write Ruby.
The Rubinius Team is also a part of the Rubinius community. The relationship
between the Rubinius Team and the Rubinius community is important. The Team's
purpose is to help the community. And here, "help" means to serve.
In business, the people we serve are our customers, but the concept of a
customer is uncommon in open source projects. Since people do not usually pay
for open source software, the idea of a customer does not seem to make sense.
However, envisioning the user of Rubinius as a customer has many benefits. To
develop an effective product, we must deeply understand the needs of our
customers.
The customer relationship provides important benefits to both sides. On one
hand it clarifies who we, the Team, are trying to help and to whom we are
responsible. On the other, it makes clear the customer's responsibility to
engage and communicate clearly, and to provide feedback to help us improve.
Both sides must be vested in the relationship.
This is where the analogy of a typical business relationship begins to break
down when applied to open source. We provide a thing of value: Rubinius. What
thing of value does the customer provide in return? One thing is their time:
taking the time to try Rubinius, open an issue, or share their experience with
someone else is a thing of value they give to Rubinius. However, there is not
yet anything that has the same tangible value as money. When we are asked to
pay money for something, it raises the stakes for us.
We want the Rubinius community to be healthy, inclusive, safe, and helpful. We
want people to learn and grow and build awesome things. So we are adopting a
Code of Conduct for the community based on the Citizen Code of
Conduct by the excellent Stumptown
Syndicate. We know this will be an important aspect of creating an environment
of respect and support as we continue to explore how to improve the
relationship between Rubinius, as both a product and a project, and the people
who use it.
I'm excited to share more about the path of Rubinius 3.0 in the other posts
this week. We'd love to hear from you. Please send your comments to
I want to thank the following people: Ashe Dryden and James Coglan, who
have forced me to question many things about open source projects. Evan
Phoenix for starting and leading Rubinius. Chad Slaughter for taking a risk
and being a stellar mentor. The Rubinius Team, Sophia, Jesse, Valerie, Stacy,
and Yorick, for their generosity and feedback on the post. Joe Mastey and
Gerlando Piro for their review and many fruitful conversations.
Enova, for giving me hard problems to solve. And thanks
to you, the Rubinius community, for making it worthwhile.
New beginnings are exciting and I'm delighted to announce that I've joined a
terrific team of people at Enova who are working hard to
innovate and push Ruby well beyond its comfort zone.
I'm looking forward to sharing the journey with you as we build fantastic
developer tools, migrate giant monolithic Rails apps, and create
next-generation distributed applications that scale efficiently to all the
cores. These engineering tasks pose huge challenges, but also tremendous
opportunities to demonstrate how powerful Ruby is and can be.
At Enova, I'll be continuing my OSS work on Rubinius and RubySpec. This is a
generous contribution from Enova to the Ruby community. I'll also be devoting
time to Rubinius X as we explore and address the many deficiencies in Ruby
that continue to drive developers and businesses to other languages like Go,
Clojure, and Node.js.
For those still running on Ruby 1.8.7, we'd love for you to collaborate with
us as we build tools to migrate to newer Ruby versions. If you're already on
Ruby 1.9.3 or later, we'd love to help you explore migrating from MRI to a
Ruby implementation with a modern garbage collector, JIT compiler, and good
support for concurrency. If you're considering rewriting your Ruby app in some
other language, please let us know why.
If you've already rewritten your Ruby apps in another language, I'd especially
like to hear from you. Rewrites always involve new architecture decisions as
well. I'd like to understand whether the Ruby language prevented those new
architecture decisions, or whether better technology in Ruby would have saved
the cost of rewriting the application.
Kumiko, Miwa, and I will be moving to Chicago. If you're in the area, we look
forward to meeting you. If not, hopefully you'll come visit Chicago and say
hello, maybe at RailsConf 2014!
Happy New Year!