Zookeeper – Distributed cluster software
I read this article about Apache Zookeeper at Igvita and was intrigued.
I started looking around for Ruby libraries, but nothing was as mature as I would like. I forked one on GitHub and chugged along on getting the JRuby and C versions to work with the same API.
Zookeeper exposes a very small API, but those few primitives are enough to build a lot of complex cluster logic.
[ruby]
zk = ZooKeeper.new("localhost:2181", :watcher => :default) # this will handle events using my built-in event handler
zk2 = ZooKeeper.new("localhost:2181", :watcher => false) # this one won't receive watch events
zk.watcher.register("/mypath") do |event, zookeeper_client|
  $stderr.puts("got an event on: #{event.path}")
end
zk.exists?("/mypath", :watch => true) # returns nil, but sets up the app to watch for the existence of /mypath
zk2.create("/mypath", "my data up to 1mb", :mode => :ephemeral)
# create modes can be any of
# :persistent_sequential, :ephemeral_sequential, :persistent, :ephemeral
# now the registered watcher will fire (within a few hundred milliseconds)
# because we set that node to be :ephemeral, "/mypath" will go away when zk2 closes its connection
# but watches fire only once, so we need to set it up again
zk.exists?("/mypath", :watch => true) # returns true
zk2.close! # or delete or whatever
# the watcher fires again and
zk.exists?("/mypath") # returns false
[/ruby]
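The one-time nature of watches is the part that trips people up, so here is a rough, self-contained sketch of it. It does not talk to a real server: `TinyZk` is a hypothetical in-memory stand-in I made up for illustration, and its methods only loosely mirror the client calls above.

```ruby
# Hypothetical in-memory stand-in for a Zookeeper server (not the real client).
class TinyZk
  def initialize
    @nodes = {}                                # path => data
    @watches = Hash.new { |h, k| h[k] = [] }   # path => pending one-shot callbacks
  end

  # Returns whether the node exists; a block, if given, is stored as a one-shot watch.
  def exists?(path, &watch)
    @watches[path] << watch if watch
    @nodes.key?(path)
  end

  def create(path, data)
    raise ArgumentError, "node exists: #{path}" if @nodes.key?(path)
    @nodes[path] = data
    fire(path, :created)
    true
  end

  def delete(path)
    return false unless @nodes.key?(path)
    @nodes.delete(path)
    fire(path, :deleted)
    true
  end

  private

  # Hand each pending watch its event, then forget it: a watch fires only once.
  def fire(path, type)
    pending = @watches.delete(path) || []
    pending.each { |callback| callback.call(type, path) }
  end
end

zk = TinyZk.new
events = []
zk.exists?("/mypath") { |type, path| events << [type, path] }
zk.create("/mypath", "hello")   # the watch fires
zk.delete("/mypath")            # nothing fires: the watch was already used up
zk.exists?("/mypath") { |type, path| events << [type, path] }
zk.create("/mypath", "again")   # the re-registered watch fires
p events                        # two :created events, no :deleted event
```

The delete in the middle goes unnoticed because nobody re-registered in time, which is why the real example calls `exists?` with `:watch => true` again after each event.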
A limited API of create, delete, get, set, and watch lets you do some really advanced things around a cluster.
Examples
I added some abstractions based on the Zookeeper recipes.
Locks
[ruby]
#these 2 clients could be on totally separate boxes, different processes, whatever
zk = ZooKeeper.new("localhost:2181", :watcher => :default)
zk2 = ZooKeeper.new("localhost:2181", :watcher => :default)
lock1 = zk.locker("/mypath")
lock1.lock #true
lock2 = zk2.locker("/mypath")
lock2.lock #false
lock1.unlock #true
lock2.lock #true
# locks are also released on a client close/crash
lock1.lock #false
zk2.close!
lock1.lock #true
[/ruby]
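For the curious, the lock recipe boils down to this: every contender creates an ephemeral sequential node under the lock path, and whoever owns the lowest sequence number holds the lock. Below is a minimal in-memory sketch of that idea. `LockTable` and `SketchLock` are hypothetical names invented for this example, and this is not how the library itself is written.

```ruby
# Hypothetical in-memory model of the children under one lock path.
class LockTable
  def initialize
    @next_seq = 0
    @entries = {}   # sequence number => owning session name
  end

  # Mimics creating an ephemeral sequential child under the lock path.
  def enqueue(session)
    seq = @next_seq
    @next_seq += 1
    @entries[seq] = session
    seq
  end

  def remove(seq)
    @entries.delete(seq)
  end

  # The lowest surviving sequence number owns the lock.
  def holder_seq
    @entries.keys.min
  end

  # Mimics a session closing or crashing: all of its ephemeral nodes vanish.
  def close_session(session)
    @entries.delete_if { |_seq, owner| owner == session }
  end
end

class SketchLock
  def initialize(table, session)
    @table = table
    @session = session
    @seq = nil
  end

  # Non-blocking attempt: true if we hold the lock afterwards, false otherwise.
  def lock
    return true if @seq && @table.holder_seq == @seq
    seq = @table.enqueue(@session)
    if @table.holder_seq == seq
      @seq = seq
      true
    else
      @table.remove(seq)   # give up instead of waiting in line
      false
    end
  end

  def unlock
    return false if @seq.nil?
    @table.remove(@seq)
    @seq = nil
    true
  end
end

table = LockTable.new
lock1 = SketchLock.new(table, "zk")
lock2 = SketchLock.new(table, "zk2")
lock1.lock                  # true
lock2.lock                  # false
lock1.unlock                # true
lock2.lock                  # true
lock1.lock                  # false
table.close_session("zk2")  # zk2's ephemeral node disappears with its session
lock1.lock                  # true
```

The sketch gives up immediately when it loses. A blocking variant of the recipe would instead leave its node in place and watch the node just ahead of it, so that only one waiter wakes up per release.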
Message Queues
I also implemented a simple message queue on top of Zookeeper. However, because the Zookeeper “children” call returns every child of a node on each request, I wouldn’t recommend using this for queues where pending messages will reach into the thousands.
[ruby]
client1 = ZooKeeper.new("localhost:2181", :watcher => :default)
client2 = ZooKeeper.new("localhost:2181", :watcher => :default)
publisher = client1.queue("myqueue")
receiver = client2.queue("myqueue")
receiver.subscribe do |title, data|
  # data will be whatever was published, title will be the node name
  # for the message
  $stderr.puts "got a message with: #{data}"
  # having a true state from the block will mark the message as 'answered'
  # sending back a false will requeue
  true
end
[/ruby]
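To make the warning about large queues concrete, here is a small in-memory sketch of the queue recipe: messages are sequential nodes, and each delivery lists all pending children to find the oldest one. `SketchQueue`, the `message-` prefix, and the `listed` counter are all assumptions made for illustration, not the library's internals.

```ruby
# Hypothetical in-memory model of a queue built from sequential nodes.
class SketchQueue
  attr_reader :listed   # total child names returned across all "children" calls

  def initialize
    @next_seq = 0
    @children = {}      # node name => message data
    @listed = 0
  end

  # Mimics creating a sequential node; zero-padding keeps name order == arrival order.
  def publish(data)
    name = format("message-%010d", @next_seq)
    @next_seq += 1
    @children[name] = data
    name
  end

  # One delivery attempt: list every child, hand the oldest to the block.
  # A truthy block result removes the message; a falsy one leaves it queued.
  def poll
    names = @children.keys.sort
    @listed += names.size
    return nil if names.empty?
    title = names.first
    handled = yield(title, @children[title])
    @children.delete(title) if handled
    handled ? true : false
  end

  def size
    @children.size
  end
end

queue = SketchQueue.new
3.times { |i| queue.publish("job #{i}") }
received = []
queue.poll { |_title, data| received << data; true } until queue.size.zero?
p received       # the three jobs in publish order
p queue.listed   # 3 + 2 + 1 names listed to deliver 3 messages
```

Draining n pending messages this way lists n(n+1)/2 names in total, so a backlog of 1,000 messages means about 500,000 names shipped back from the server just to find the next one each time.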
More…
There’s a ton more you can do with this thing (priority queues, a metadata store, etc.). I think it’s a nice addition to the Ruby toolset.