Friday, October 30, 2009

Recipe 13.6. Using Berkeley DB Databases















Problem


You want a simple, fast database that doesn't need a server to run.




Solution


Ruby's standard dbm library lets you store a database in a set of standalone binary files. It's not a SQL database: it's more like a fast disk-based hash that only stores strings.



require 'dbm'

DBM.open('random_thoughts') do |db|
  db['tape measure'] =
    "What if there was a tape measure you could use as a yo-yo?"
  db[23] = "Fnord."    # the key is converted to the string '23' when stored
end

DBM.open('random_thoughts') do |db|
  puts db['tape measure']
  puts db['23']
end
# What if there was a tape measure you could use as a yo-yo?
# Fnord.

DBM.open('random_thoughts') { |db| db[23] }
# TypeError: can't convert Fixnum into String

Dir['random_thoughts.*']
# => ["random_thoughts.pag", "random_thoughts.dir"]





Discussion


The venerable Berkeley DB format lets you store enormous associative datasets on disk and quickly access them by key. It dates from before programming languages had built-in hash structures, so it's not as useful as it used to be. In fact, if your hash is small enough to fit in memory, it's faster to simply use a Ruby hash that you serialize to disk with Marshal.
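As a minimal sketch of the Marshal alternative mentioned above: the helper names save_hash and load_hash and the file name are made up for illustration, and the whole hash is read and written in one piece, so this only suits data that fits comfortably in memory.

```ruby
require 'tmpdir'

# Hypothetical helper: serialize an entire hash to a single binary file.
def save_hash(path, hash)
  File.binwrite(path, Marshal.dump(hash))
end

# Hypothetical helper: read the file back and rebuild the hash.
# Marshal.load should only be used on files you trust.
def load_hash(path)
  Marshal.load(File.binread(path))
end

Dir.mktmpdir do |dir|
  path = File.join(dir, 'random_thoughts.marshal')
  save_hash(path, 'tape measure' => 'A yo-yo?', 23 => 'Fnord.')

  thoughts = load_hash(path)
  puts thoughts[23]   # unlike DBM, non-string keys survive the round trip
end
# Fnord.
```

The trade-off is that every save rewrites the entire file, whereas a DBM file is updated one key at a time.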


If you do need to use a DBM object, you can treat it almost exactly like a Ruby hash: it supports most of the same methods.
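The following sketch exercises a few of the hash-style methods a DBM object responds to. The helper name dbm_hash_tour and the sample data are invented for illustration. It also assumes the dbm extension can be loaded: newer Ruby versions may not ship it with the interpreter, in which case it has to be installed as a gem, and the demonstration is skipped if the require fails.

```ruby
require 'tmpdir'

begin
  require 'dbm'
rescue LoadError
  warn 'dbm extension not available; skipping the demonstration'
end

# Hypothetical helper: fills a scratch DBM file and tries hash-style calls.
def dbm_hash_tour(path)
  DBM.open(path) do |db|
    db['earth'] = 'solid'
    db['water'] = 'wet'
    db['air']   = 'thin'
    db.delete('air')                 # removes the pair, as Hash#delete does

    {
      keys:    db.keys.sort,         # on-disk key order is unspecified, so sort
      has_key: db.has_key?('earth'),
      missing: db['fire'],           # an absent key returns nil, as with Hash#[]
      size:    db.length,
      as_hash: db.to_hash            # a plain in-memory Hash copy
    }
  end
end

if defined?(DBM)
  Dir.mktmpdir do |dir|
    p dbm_hash_tour(File.join(dir, 'elements'))
  end
end
```

The main difference from a real hash is the one shown in the Solution: keys and values are stored as strings, so anything else has to be converted on the way in and out.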


There are many, many implementations of the Berkeley DB, and the file formats differ widely between versions, so DBM files are not very portable. If you're creating your own databases, you should use the generic dbm library. It provides a uniform interface to all the DBM implementations, using the best library you have installed on your computer.[7]

[7] Actually, it uses the best DBM library you had installed when you installed the dbm Ruby extension.


Ruby also provides gdbm and sdbm libraries, interfaces to specific database formats, but you should only need these if you're trying to load a Berkeley DB file produced by some other program.


There's also the Sleepycat library, a more ambitious implementation of the Berkeley DB that implements features of traditional databases like transactions and locking. Its Ruby bindings are available as a third-party download. It's still much closer to a disk-based data structure than to a relational database, and the basic interface is similar to that of dbm, though less Ruby-idiomatic:



require 'bdb'

db = BDB::Hash.create('random_thoughts2.db', nil, BDB::CREATE)
db['Why do we park on a driveway but'] = 'it never rains but it pours.'
db.close

db = BDB::Hash.open('random_thoughts2.db', nil, 'r')
db['Why do we park on a driveway but']
# => "it never rains but it pours."
db.close



The Sleepycat library provides several different hashlike data structures. If you want a hash whose keys stay sorted alphabetically, you can create a BDB::Btree instead of a BDB::Hash:



db = BDB::Btree.create('element_reviews.db', nil, BDB::CREATE)
db['earth'] = 'My personal favorite element.'
db['water'] = 'An oldie but a goodie.'
db['air'] = "A good weekend element when you're bored with other elements."
db['fire'] = 'Perhaps the most overrated element.'

db.each { |k,v| puts k }
# air
# earth
# fire
# water

db['water'] # => "An oldie but a goodie."
db.close





See Also


  • On Debian GNU/Linux, the DBM extensions to Ruby come in separate packages from Ruby itself: libdbm-ruby, libgdbm-ruby, and libsdbm-ruby

  • You can get the Ruby binding to the Sleepycat library at http://moulon.inra.fr/ruby/bdb.html

  • Confused by all the different, mutually incompatible implementations of the Berkeley DB idea? Try reading "Unix Incompatibility Notes: DBM Hash Libraries" (http://www.unixpapa.com/incnote/dbm.html)

  • If you need a relational database that doesn't require a server to run, try SQLite: it keeps its databases in standalone files, and you can use it with ActiveRecord or DBI; its Ruby binding is packaged as the sqlite3-ruby gem, and its home page is at http://www.sqlite.org/












