Friday, October 30, 2009

Recipe 16.10. Sharing a Hash Between Any Number of Computers














Credit: James Edward Gray II



Problem


You want to easily share some application data with remote programs. Your needs are as trivial as, "What if all the computers could share this hash?"




Solution


Ruby's built-in DRb library can share Ruby objects across a network. Here's a simple hash server:



#!/usr/bin/ruby -w
# drb_hash_server.rb
require 'drb'

# Start up DRb with a URI and a hash to share
shared_hash = {:server => 'Some data set by the server' }
DRb.start_service('druby://127.0.0.1:61676', shared_hash)
puts 'Listening for connection…'
DRb.thread.join # Wait on DRb thread to exit…



Run this server in one Ruby session, and then you can run a client in another:



require 'drb'

# Prep DRb
DRb.start_service
# Fetch the shared object
shared_data = DRbObject.new_with_uri('druby://127.0.0.1:61676')

# Add to the Hash
shared_data[:client] = 'Some data set by the client'
shared_data.each do |key, value|
  puts "#{key} => #{value}"
end
# client => Some data set by the client
# server => Some data set by the server





Discussion


If this looks like magic, that's the point. DRb hides the complexity of distributed programming. There are some complications (covered in later recipes), but for the most part DRb simply makes remote objects look like local objects.


The solution given above may meet your needs if you're working with a single server and client on a trusted network, but applications aren't always that simple. Issues like thread-safety and security may force you to find a more robust solution. Luckily, that doesn't require too much more work.


Let's take thread-safety first. Behind the scenes, a DRb server handles each client connection in a separate Ruby thread. Ruby's Hash class is not automatically thread-safe, so we need to do a little extra work before we can reliably share a hash between multiple concurrent users.
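To make the problem concrete without involving DRb at all, here is a minimal sketch of several threads updating one plain Hash under a Mutex. The helper name count_hits and its parameters are made up for this illustration. The lock is what guarantees that no increment gets lost.

```ruby
require 'thread' # Mutex lives here on older Rubies; harmless on newer ones

# Hypothetical helper: bump one counter from several threads at once
# and return the final count.
def count_hits(thread_count, hits_per_thread)
  counts = Hash.new(0) # a plain Hash shared by every thread
  lock   = Mutex.new   # guards every access to counts

  workers = Array.new(thread_count) do
    Thread.new do
      hits_per_thread.times do
        # "+=" is a read followed by a write, so both happen under the lock.
        lock.synchronize { counts[:hits] += 1 }
      end
    end
  end

  workers.each(&:join) # wait for every thread to finish
  counts[:hits]
end

puts count_hits(10, 1_000) # prints 10000
```

A DRb server is in the same position as this sketch: each connected client plays the part of one of the worker threads.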


Here's a library that uses delegation to implement a thread-safe hash. A ThreadsafeHash object delegates all its method calls to an underlying Hash object, but it uses a Mutex to ensure that only one thread (or DRb client) can have access to the hash at a time.



# threadsafe_hash.rb
require 'rubygems'
require 'facet/basicobject' # For the BasicObject class
require 'thread' # For the Mutex class



We base our thread-safe hash on the BasicObject class in the Facets More library (available as the facets_more gem). A BasicObject is an ordinary Ruby object, except it defines no methods at all, not even the methods of Object. This gives us a blank slate to work from. We can make sure that every single method of ThreadsafeHash gets forwarded to the underlying hash, even methods like inspect, which are defined by Object and which wouldn't normally trigger method_missing.



# A thread-safe Hash that delegates all its methods to a real hash.
class ThreadsafeHash < BasicObject
  def initialize(*args, &block)
    @hash = Hash.new(*args, &block) # The shared hash
    @lock = Mutex.new # For thread safety
  end

  def method_missing(method, *args, &block)
    if @hash.respond_to? method # Forward Hash method calls…
      @lock.synchronize do # but wrap them in a thread safe lock.
        @hash.send(method, *args, &block)
      end
    else
      super
    end
  end
end



Building ThreadsafeHash on top of BasicObject keeps the implementation trivial. We just forward method calls on to the Hash, but wrap each of them in a synchronization block to ensure that only one thread can affect the object at a time.
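The same delegation idea can be tried without the Facets gem. The sketch below assumes Ruby 1.9 or later, where BasicObject is built into the language. The class name SketchThreadsafeHash is made up for this illustration and is not part of the recipe. Two differences from the Facets version are worth knowing: constants such as Hash and Mutex must be written with a leading `::` inside a BasicObject subclass, and the built-in BasicObject still keeps a handful of methods such as `==` and `__send__`.

```ruby
require 'thread' # harmless on newer Rubies, where Mutex is always loaded

# Hypothetical stand-in for ThreadsafeHash, built on Ruby's own BasicObject.
class SketchThreadsafeHash < BasicObject
  def initialize(*args, &block)
    @hash = ::Hash.new(*args, &block) # the wrapped hash
    @lock = ::Mutex.new               # one lock for every forwarded call
  end

  def method_missing(method, *args, &block)
    if @hash.respond_to?(method)
      # Forward the call to the real hash while holding the lock.
      @lock.synchronize { @hash.send(method, *args, &block) }
    else
      super # nothing to forward to: raises NoMethodError
    end
  end
end

shared = SketchThreadsafeHash.new

# Ten threads each store one key through the locked wrapper.
writers = Array.new(10) do |i|
  Thread.new { shared[i] = i * i }
end
writers.each(&:join)

puts shared.size  # prints 10
puts shared.class # prints Hash, because even "class" is forwarded
```

Note that the lock covers one method call at a time. A compound operation such as `shared[:n] += 1` is two forwarded calls (a read, then a write), so another thread can slip in between them.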


Now that we have a ThreadsafeHash, we can build a better server:



#!/usr/bin/ruby -w
# threadsafe_hash_server.rb

require 'threadsafe_hash' # both sides of DRb connection need all classes
require 'drb'



We begin by pulling in our ThreadsafeHash library and DRb. Then we set the safe level:



$SAFE = 1 # Minimum acceptable paranoia level when sharing code!



The $SAFE=1 line is critical! Don't put any code on a network without a minimum of $SAFE=1. It's just too dangerous. Malicious code, like obj.instance_eval("`rm -rf / *`"), must be controlled. Feel free to raise $SAFE even higher, in fact.



# Start up DRb with a URI and an object to share.
DRb.start_service('druby://127.0.0.1:61676', ThreadsafeHash.new)
puts 'Listening for connection…'
DRb.thread.join # wait on DRb thread to exit…



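We're now ready to start the DRb service, which we do with a URI and an object to share. The URI above binds the server to 127.0.0.1, the loopback address, so only programs on the same machine can connect. To accept connections from other computers, replace "127.0.0.1" in the URI with an address or hostname those computers can reach.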


Since DRb runs in its own threads, the final line of the server is needed to ensure that we don't exit before those threads have done their job.
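A small sketch can make the role of DRb.thread visible. It assumes the drb library is installed and that binding a local socket is permitted. The helper name drb_thread_demo is made up for this illustration, and port 0 (which asks the operating system for any free port) is used here only so the sketch does not collide with a running server.

```ruby
require 'drb'

# Hypothetical helper: start a throwaway DRb server, look at its thread,
# then shut the server down again.
def drb_thread_demo
  # Port 0 is an assumption of this sketch, not part of the recipe.
  DRb.start_service('druby://127.0.0.1:0', {})

  server_thread = DRb.thread # the thread that accepts client connections
  alive_while_serving = server_thread.alive?

  DRb.stop_service   # ends the server thread
  server_thread.join # returns once that thread has finished

  [alive_while_serving, server_thread.alive?]
end

p drb_thread_demo
```

The first value should be true and the second false: the server lives exactly as long as its thread does, which is why the real server script joins that thread instead of falling off the end of the file.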


Run that code, and then you can run this client code to share a hash:



#!/usr/bin/ruby
# threadsafe_hash_client.rb

require 'threadsafe_hash' # Both sides of DRb connection need all classes
require 'drb'

# Prep DRb
DRb.start_service

# Fetch the shared hash
$shared_data = DRbObject.new_with_uri('druby://127.0.0.1:61676')

puts 'Enter Ruby commands using the shared hash $shared_data…'
require 'irb'
IRB.start



Here again we pull in the needed libraries and point DRb at the served object. We store that object in a variable so that we can continue to access it as needed.


Then, just as an example of what can be done, we start an irb session so that you can manipulate the variable any way you like. Remember, any number of clients can connect and share this hash.


Let's illustrate some sample sessions. In the first one, we add some data to the hash:



$ ruby threadsafe_hash_client.rb
Enter Ruby commands using the shared hash $shared_data…
irb(main):001:0> $shared_data.keys
=> []
irb(main):002:0> $shared_data[:terminal_one] = 'Hello other terminals!'
=> "Hello other terminals!"



Let's attach a second client and see what the two of them find:



$ ruby threadsafe_hash_client.rb
Enter Ruby commands using the shared hash $shared_data…
irb(main):001:0> $shared_data.keys
=> [:terminal_one]
irb(main):002:0> $shared_data[:terminal_one]
=> "Hello other terminals!"
irb(main):003:0> $shared_data[:terminal_two] = 'Is this thing on?'
=> "Is this thing on?"



Going back to the first session, we can see the new data:



irb(main):003:0> $shared_data.each_pair do |key, value|
irb(main):004:1* puts "#{key} => #{value}"
irb(main):005:1> end
terminal_one => Hello other terminals!
terminal_two => Is this thing on?



Notice that, as you'd hope, the DRb magic can even cope with a method that takes a code block.




See Also


  • There is a good beginning tutorial for DRb at http://www.rubygarden.org/ruby?DrbTutorial

  • There is a helpful DRb presentation by Mark Volkmann in the "Why Ruby?" repository at http://rubyforge.org/docman/view.php/251/216/DistributedRuby.pdf

  • The standard library documentation for DRb can be found at http://www.ruby-doc.org/stdlib/libdoc/drb/rdoc/index.html

  • For more on the internal workings of the thread-safe hash, see Recipe 8.8, "Delegating Method Calls to Another Object," and Recipe 20.4, "Synchronizing Access to an Object"

  • Recipe 20.11, "Avoiding Deadlock," for another common problem with multi-threaded programming












