Sunday, October 25, 2009

Recipe 22.1. Writing a C Extension for Ruby














Credit: Garrett Rooney



Problem


You want to implement part of your Ruby program in C. This might be the part of your program that needs to run really fast, it might contain some very platform-specific code, or you might already have a C implementation and not want to write another one in Ruby.




Solution


Write a C extension that implements that portion of your program. Use an extconf.rb script to generate a Makefile, build the extension with make, and require it in your Ruby program as though it were a Ruby library. You'll need to have the Ruby header files installed on your system.


Here's a simple Ruby program that requires a library called example. It instantiates Example::Class from that library and calls a method on the new object:



require 'example'
e = Example::Class.new
e.print_string("Hello World\n")
# Hello World



What would the example library look like if it were written in Ruby? Something like this:



# example.rb
module Example
  class Class
    def print_string(s)
      print s
    end
  end
end



Let's implement that same functionality in C code. This small C library, example.c, defines a Ruby module, class, and method using the functions made available by ruby.h:



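#include <ruby.h>
#include <stdio.h>

static VALUE rb_mExample;
static VALUE rb_cClass;

/* Implements Example::Class#print_string. The first argument is the receiver. */
static VALUE
print_string(VALUE self, VALUE arg)
{
    /* StringValuePtr raises a TypeError if arg can't be converted to a String. */
    printf("%s", StringValuePtr(arg));
    return Qnil;
}

/* Called by the Ruby interpreter when it loads the extension. */
void
Init_example(void)
{
    rb_mExample = rb_define_module("Example");

    rb_cClass = rb_define_class_under(rb_mExample, "Class", rb_cObject);

    rb_define_method(rb_cClass, "print_string", print_string, 1);
}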



To build the extension, you also need to create an extconf.rb file:



# extconf.rb
require 'mkmf'

dir_config('example')
create_makefile('example')



Then you can build your library by running extconf.rb, then make:



$ ls
example.c extconf.rb

$ ruby extconf.rb
creating Makefile

$ make
gcc -fPIC -Wall -g -O2 -fPIC -I. -I/usr/lib/ruby/1.8/i486-linux
-I/usr/lib/ruby/1.8/i486-linux -I. -c example

gcc -shared -L"/usr/lib" -o example.so example.o -lruby1.8
-lpthread -ldl -lcrypt -lm -lc

$ ls
Makefile example.c example.o example.so extconf.rb



The example.so file contains your extension. As long as it's in your Ruby load path ($LOAD_PATH), and there's no example.rb on that path that might mask it, you can use it like any other Ruby library:
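If require can't find the extension, or finds the wrong file, it can help to look at what is actually on the load path. The following is a simplified sketch, not a reimplementation of require: the helper name require_candidates is hypothetical, and it only considers a .rb file and the platform's extension suffix as reported by RbConfig.

```ruby
require 'rbconfig'

# Hypothetical helper: list every file on the load path that
# `require name` could plausibly resolve to.
def require_candidates(name, load_path = $LOAD_PATH)
  # DLEXT is the extension suffix for this platform ("so" on Linux).
  extensions = ['rb', RbConfig::CONFIG['DLEXT']]
  load_path.flat_map do |dir|
    extensions.map { |ext| File.join(dir.to_s, "#{name}.#{ext}") }
  end.select { |path| File.file?(path) }
end

# Make the current directory (assumed to hold the built extension)
# visible to require.
$LOAD_PATH.unshift(Dir.pwd) unless $LOAD_PATH.include?(Dir.pwd)

puts require_candidates('example')
```

If the list includes both an example.rb and an example.so, the .rb file is the one to remove or rename.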



require 'example'
e = Example::Class.new
e.print_string("Hello World\n")
# Hello World





Discussion


Most programs can be implemented using plain old Ruby code, but occasionally it turns out that it's better to implement part of the program in C. The example library above simply provides an interface to C's printf function, and Ruby already has a perfectly good IO#printf method.


Perhaps you need to perform a calculation hundreds of thousands of times, and implementing it in Ruby would be too slow (the Example::Class#print_string method is faster than IO#printf). Or maybe you need to interact with some platform-specific API that's not exposed by the Ruby standard library. There are a number of reasons you might want to fall back to C code, so Ruby provides you with a reasonably simple way of doing it.


Unfortunately, the fact that it's easy doesn't always mean it's a good idea. You must remember that when writing C-level code, you're playing with fire. The Ruby interpreter does its best to limit the damage you can do if you write bad Ruby code. About the worst you can do is cause an exception: another part of your program can catch the exception, handle it, and carry on. But C code runs outside the Ruby interpreter, and an error in C code can crash the Ruby interpreter.
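As a small illustration of that safety net, here's a sketch in which a bug in pure Ruby code surfaces as an exception that the caller rescues before carrying on. The method names shout and safe_shout are made up for this example.

```ruby
# A deliberately fragile method: calling upcase on nil raises NoMethodError.
def shout(s)
  s.upcase
end

# The caller rescues the exception and keeps running.
def safe_shout(s)
  shout(s)
rescue NoMethodError => e
  "recovered from #{e.class}"
end

puts safe_shout("hello")
puts safe_shout(nil)
```

The equivalent mistake in C, such as dereferencing a bad pointer, gives the interpreter no exception to raise and no chance to recover.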


With that in mind, let's go over some of the details you need to know to write a C extension.


A Ruby extension is just a small, dynamically loadable library, which the Ruby interpreter loads via dlopen or something similar. The entry point to your extension is via its Init function. For our example module, we defined an Init_example function to set everything up. Init_example is the first function to be called by the Ruby interpreter when it loads our extension.


The Init_example function uses a number of functions provided by the Ruby interpreter to declare modules, classes, and methods, just as you might in Ruby code. The difference, of course, is that here the methods are implemented in C. In this example, we used rb_define_module to create the Example module, then rb_define_class_under to define the Example::Class class (which inherits from Object), and finally rb_define_method to give Example::Class a print_string method.


The first thing to notice in the C code is all the VALUE variables lying around. A VALUE is the C equivalent of a Ruby reference, and it can point to any Ruby object. Ruby provides you with a number of functions and macros for manipulating VALUEs.


The rb_cObject variable is a VALUE, a reference to Ruby's Object class. When we pass it into rb_define_class_under, we're telling the Ruby interpreter to define a new subclass of Object. The ruby.h header file defines similar variables for many other Ruby-level modules (named using the rb_mFoo convention) and classes (the convention is rb_cFoo).


To manipulate a VALUE, you need to know something about it. It makes no more sense in C code than in Ruby code to call a method of File on a value that refers to a string. The simplest way to check a Ruby object's type is to use the Check_Type macro, which raises a TypeError unless a VALUE has the built-in type you name. For this purpose, the ruby.h file defines constants T_STRING, T_ARRAY, and so on, to denote built-in Ruby types. To test a type without raising an exception, compare the result of the TYPE macro against one of those constants.


But that's not what we'd do in Ruby code. Ruby encourages duck typing, in which objects are judged by the methods they respond to rather than by the class they instantiate. C code can operate on Ruby objects the same way. To check whether an object responds to a particular message, use the function rb_respond_to. To send the message, use rb_funcall. It looks like this:



static VALUE
write_string(VALUE object, VALUE str)
{
    if (rb_respond_to(object, rb_intern("<<")))
    {
        rb_funcall(object, rb_intern("<<"), 1, str);
    }
    return Qnil;
}



That's the C-level equivalent of the following Ruby code:



def write_string(object, str)
  object << str if object.respond_to?('<<')
  return nil
end
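To see the duck typing at work, here's a short usage sketch of the Ruby version. The method is restated so the snippet runs on its own, and the variable names are chosen only for illustration. A String, an Array, and a StringIO have nothing in common except that they all respond to <<.

```ruby
require 'stringio'

# Restated from the text above so this snippet runs on its own.
def write_string(object, str)
  object << str if object.respond_to?('<<')
  nil
end

buffer = String.new   # a mutable, empty String
list   = []
io     = StringIO.new

# Three unrelated classes, one shared message.
[buffer, list, io].each { |target| write_string(target, "abc") }

# nil does not respond to <<, so the call is silently skipped.
result = write_string(nil, "abc")

p buffer, list, io.string, result
```

One caveat of checking only respond_to?: an Integer also responds to <<, but there it means a bit shift and rejects a String argument, so responding to a message doesn't guarantee that the call makes sense.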



A few more miscellaneous tips: the rb_intern function takes a symbol name as a C string and returns the corresponding Ruby symbol ID. You use this with functions like rb_respond_to and rb_funcall to refer to a Ruby method. Qnil is just the C-level name for Ruby's special nil object. There are a few similar constants, like Qfalse and Qtrue, which do just about what you'd think they'd do.


There are a number of other C-level functions that let you create and manipulate strings (look for functions that start with rb_str), arrays (rb_ary), and hashes (rb_hash). These APIs are pretty self-explanatory, so we won't go into them in depth here, but you can find them in the Ruby header files, specifically ruby.h and intern.h.


Ruby also defines some macros to do convenient things with common data types. For example, the StringValuePtr macro takes a VALUE that refers to a Ruby String and returns a C-style char pointer. This can be useful for interacting with C-level APIs. You can find this and other similar helpers in the ruby.h header.
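StringValuePtr is a little more lenient than "refers to a Ruby String" suggests: it applies Ruby's implicit string conversion, so an object that defines to_str is accepted, and anything else raises a TypeError. The same rule is visible from Ruby code, for instance in String#+. The sketch below shows that rule from the Ruby side. The Label class is hypothetical and exists only for this illustration.

```ruby
# Label is a hypothetical class used only for this illustration.
class Label
  def initialize(text)
    @text = text
  end

  # Defining to_str tells Ruby this object may stand in for a String.
  def to_str
    @text
  end
end

# String#+ implicitly converts its argument with to_str.
greeting = "Hello, " + Label.new("World")

# An Integer has no to_str, so the same conversion fails with a TypeError.
error = begin
  "Hello, " + 42
rescue TypeError => e
  e
end

puts greeting
puts error.class
```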




See Also


  • The README.EXT file in the Ruby source tree

  • Recipe 22.2, "Using a C Library from Ruby"












