# Bandalore
[Bandalore](http://github.com/cemerick/bandalore) is a Clojure client
library for Amazon's [Simple Queue Service](http://aws.amazon.com/sqs/). It depends upon
the standard [AWS SDK for Java](http://aws.amazon.com/sdkforjava/),
and provides a Clojure-idiomatic API for the SQS-related functionality
therein.
## "Installation"
Bandalore is available in Maven Central. Add it to your Maven project's `pom.xml`:
```xml
<dependency>
  <groupId>com.cemerick</groupId>
  <artifactId>bandalore</artifactId>
  <version>0.0.4</version>
</dependency>
```
or your Leiningen `project.clj`:
```clojure
[com.cemerick/bandalore "0.0.4"]
```
Bandalore is compatible with Clojure 1.2.0+.
## Logging
I strongly recommend squelching the AWS SDK's very verbose logging
before using Bandalore (the former spews a variety of stuff out on
INFO that I personally think should be in DEBUG or TRACE). You can
do this with this snippet:
```clojure
(.setLevel (java.util.logging.Logger/getLogger "com.amazonaws")
           java.util.logging.Level/WARNING)
```
Translate as necessary if you're using log4j, etc.
## Usage
You should be familiar with [SQS itself](http://aws.amazon.com/sqs/)
before sensibly using this library. That said, Bandalore's API
is well-documented.

You'll first need to load the library and create an SQS client object
to do anything:
```clojure
(require '[cemerick.bandalore :as sqs])
(def client (sqs/create-client "your aws id" "your aws secret-key"))
```
**Security Note** If your application using Bandalore is deployed to EC2, _you
should not put your AWS credentials on those EC2 nodes_. Rather,
[give your EC2 instances IAM roles](http://docs.aws.amazon.com/IAM/latest/UserGuide/role-usecase-ec2app.html),
and use the nullary arity of `create-client`:
```clojure
(require '[cemerick.bandalore :as sqs])
(def client (sqs/create-client))
```
This will use credentials assigned to your EC2 node based on its
role, which are rotated automatically.

You can create, delete, and list queues:

```clojure
#> (sqs/create-queue client "foo")
"https://queue.amazonaws.com/499312652346/foo"
#> (sqs/list-queues client)
("https://queue.amazonaws.com/499312652346/foo")
#> (sqs/delete-queue client (first *1))
nil
#> (sqs/list-queues client)
nil
```
*Note that SQS is _eventually consistent_. This means that a created
queue won't necessarily show up in an immediate listing of queues,
messages aren't necessarily immediately available to be received, etc.*
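
In practice, eventual consistency means you sometimes have to poll until SQS catches up. Here's a minimal sketch of one way to do that. `wait-until` is a hypothetical helper written for this example, not part of Bandalore, and the timeout and period values shown in the comment are arbitrary:

```clojure
(defn wait-until
  "Calls (pred) every period-ms milliseconds until it returns a truthy
   value or roughly timeout-ms milliseconds have elapsed. Returns that
   truthy value, or nil on timeout."
  [pred timeout-ms period-ms]
  (let [deadline (+ (System/currentTimeMillis) timeout-ms)]
    (loop []
      (or (pred)
          (when (< (System/currentTimeMillis) deadline)
            (Thread/sleep (long period-ms))
            (recur))))))

;; e.g., wait up to a minute for a newly created queue URL q to be listed:
;; (wait-until #(some #{q} (sqs/list-queues client)) 60000 1000)
```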

You can send, receive, and delete messages:

```clojure
#> (def q (sqs/create-queue client "foo"))
#'cemerick.bandalore-test/q
#> (sqs/send client q "my message body")
{:id "75d5d7a1-2274-4163-97b2-aa4c75f209ee", :body-md5 "05d358de00fc63dd2fa2026b77e112f6"}
#> (sqs/receive client q)
({:attrs #<HashMap {}>, :body "my message body", :body-md5 "05d358de00fc63dd2fa2026b77e112f6",
  :id "75d5d7a1-2274-4163-97b2-aa4c75f209ee",
  :receipt-handle "…very long string…"})
;;
;; …presumably do something with the received message(s)…
;;
#> (sqs/delete client q (first *1))
nil
#> (sqs/receive client q)
()
```
That's cleaner than having to interop directly with the Java SDK, but it's all
pretty pedestrian stuff. You can do more interesting things with some
simple higher-order functions and other nifty Clojure facilities.
### Sending and receiving Clojure values
SQS' message bodies are strings, so you can stuff anything in them that you can
serialize to a string. That said, `pr-str` and `read-string` are too handy
to not use, assuming your consumers are using Clojure as well:
```clojure
#> (sqs/send client q (pr-str {:a 5 :b "blah" :c 6.022e23}))
{:id "3756c302-866a-4fcc-a7a3-746e6f531f47", :body-md5 "60052fc2ffb835257c26b9957c6e9ffd"}
;; -?> comes from clojure.contrib.core; on Clojure 1.5+, some-> does the same job
#> (-?> (sqs/receive client q) first :body read-string)
{:a 5, :b "blah", :c 6.022E23}
```
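
One thing to keep in mind: `read-string` will evaluate `#=(...)` forms in its input while `*read-eval*` is true (the default), so it's only appropriate when you trust everyone who can write to the queue. Here's a sketch of a data-only round trip, assuming Clojure 1.5+ (where `clojure.edn` was introduced). `encode-body` and `decode-body` are illustrative names, not Bandalore functions:

```clojure
(require '[clojure.edn :as edn])

(defn encode-body
  "Serializes a Clojure value to a string usable as an SQS message body."
  [value]
  (pr-str value))

(defn decode-body
  "Reads a message body as data only; the EDN reader never evaluates code."
  [body]
  (edn/read-string body))

;; Round trip without touching SQS:
(decode-body (encode-body {:a 5 :b "blah" :c 6.022e23}))

;; With Bandalore, the two halves would sit on either side of the queue:
;; (sqs/send client q (encode-body {:a 5}))
;; (map (comp decode-body :body) (sqs/receive client q))
```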
### Sending seqs of messages
…with more gratuitous use of `pr-str` and `read-string` to send and receive
Clojure values:
```clojure
#> (->> [:foo 'bar ["some vector" 42] #{#"silly place for a regex"}]
     (map (comp (partial sqs/send client q) pr-str))
     dorun)
nil
#> (map (comp read-string :body)
     (sqs/receive client q :limit 10))
(bar ["some vector" 42])
#> (map (comp read-string :body)
     (sqs/receive client q :limit 10))
(#{#"silly place for a regex"})
#> (map (comp read-string :body)
     (sqs/receive client q :limit 10))
(:foo)
```
### (Mostly) automatic deletion of consumed messages
When you're done processing a received message, you need to delete it from its
originating queue:
```clojure
; ensure our queue is empty to start
#> (get (sqs/queue-attrs client q) "ApproximateNumberOfMessages")
"0"
#> (dorun (map (partial sqs/send client q) (map str (range 100))))
nil
#> (get (sqs/queue-attrs client q) "ApproximateNumberOfMessages")
"100"
; received messages must be removed from the queue or they will
; be delivered again after their visibility timeout expires
#> (sqs/receive client q)
(…message seq…)
#> (get (sqs/queue-attrs client q) "ApproximateNumberOfMessages")
"100"
#> (->> (sqs/receive client q) first (sqs/delete client))
nil
#> (get (sqs/queue-attrs client q) "ApproximateNumberOfMessages")
"99"
```
Rather than trying to remember to do this, just use the
`deleting-consumer` "middleware" to produce a function that calls
the message-processing function you provide to it, and then
automatically deletes the processed message from the originating queue:

```clojure
#> (doall (map
            (sqs/deleting-consumer client (comp println :body))
            (sqs/receive client q :limit 10)))
0
4
9
12
26
36
40
44
52
55
(nil nil nil nil nil nil nil nil nil nil)
#> (get (sqs/queue-attrs client q) "ApproximateNumberOfMessages")
"90"
```
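
If you're curious how middleware like this can work, here's a rough sketch of the pattern. It is not Bandalore's actual source: `deleting-consumer-sketch` is a made-up name, and it takes a one-argument `delete-fn` instead of an SQS client so that it can run without AWS:

```clojure
(defn deleting-consumer-sketch
  "Returns a function of one message that calls (f message), then
   (delete-fn message), and returns f's result. If f throws, delete-fn
   is never called, so the message would stay on the queue."
  [delete-fn f]
  (fn [message]
    (let [result (f message)]
      (delete-fn message)
      result)))

;; Record deletions in an atom instead of calling SQS:
(def deleted (atom []))
(def consume (deleting-consumer-sketch #(swap! deleted conj (:id %)) :body))

(consume {:id "m1" :body "hello"})   ;; => "hello"
@deleted                             ;; => ["m1"]

;; Against a real queue, delete-fn could be (partial sqs/delete client).
```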
### Consuming queues as seqs

Seqs being the _lingua franca_ of Clojure collections, it would be helpful if we
could treat an SQS queue as a seq of messages. While `receive` does return
a seq of messages, each `receive` call is limited to receiving a maximum of
10 messages, and there is no streaming or push counterpart in the SQS API.
The solution to this is `polling-receive`, which returns a lazy seq that
reaches out to SQS as necessary:
```clojure
#> (map (sqs/deleting-consumer client :body)
     (sqs/polling-receive client q :limit 10))
("3" "5" "7" "8" ... "81" "90" "91")
```
`polling-receive` accepts all of the same optional kwargs as `receive` does,
but adds two more to control its usage of `receive`:

    :period - time in ms to wait after an unsuccessful `receive` request (default: 500)
    :max-wait - maximum time in ms to wait to successfully receive messages before terminating
                the lazy seq (default: 5000)
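
To make those two options concrete, here's a simplified sketch of how a polling lazy seq along these lines could be built. It illustrates the idea and is not Bandalore's implementation: `polling-seq` and its `receive-fn` argument are assumptions made for this example, and only time spent sleeping counts towards `:max-wait`:

```clojure
(defn polling-seq
  "Returns a lazy seq of messages obtained by calling (receive-fn)
   repeatedly. receive-fn must return a (possibly empty) seq of messages.
   After an empty result, sleeps period ms before retrying; the seq ends
   once max-wait ms of sleeping have passed without any message arriving."
  [receive-fn & {:keys [period max-wait] :or {period 500 max-wait 5000}}]
  (letfn [(step [waited]
            (lazy-seq
              (if-let [msgs (seq (receive-fn))]
                (concat msgs (step 0))          ; got messages: reset the wait
                (when (< waited max-wait)
                  (Thread/sleep (long period))
                  (step (+ waited period))))))]
    (step 0)))

;; A stub receive-fn that hands out three canned batches, then nothing:
(def batches (atom [["a" "b"] [] ["c"]]))
(defn fake-receive []
  (let [b (first @batches)]
    (swap! batches rest)
    b))

(doall (polling-seq fake-receive :period 10 :max-wait 50))
;; => ("a" "b" "c")

;; Against a real queue, receive-fn could be #(sqs/receive client q :limit 10).
```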
Often queues are used to direct compute resources, so you'd like to be able to saturate
those boxen with as much work as your queue can offer up. The obvious solution
is to `pmap` across a seq of incoming messages, which you can do trivially with the seq
provided by `polling-receive`. Just make sure you tweak the `:max-wait` time so that,
assuming you want to continuously process incoming messages, the seq of messages doesn't
terminate because none have been available for a while.

Here's an example where one thread sends a message every 100 ms (1000 messages in all),
and another consumes those messages using a lazy seq provided by `polling-receive`:

```clojure
#> (defn send-dummy-messages
     [client q count]
     (future (doseq [n (range count)]
               (Thread/sleep 100)
               (sqs/send client q (str n)))))
#'cemerick.bandalore-test/send-dummy-messages
#> (defn consume-dummy-messages
     [client q]
     (future (dorun (map (sqs/deleting-consumer client (comp println :body))
                      (sqs/polling-receive client q :max-wait Long/MAX_VALUE :limit 10)))))
#'cemerick.bandalore-test/consume-dummy-messages
#> (consume-dummy-messages client q)               ;; start the consumer
#<core$future_call$reify__5500@a6f00bc: :pending>
#> (send-dummy-messages client q 1000)             ;; start the sender
#<core$future_call$reify__5500@18986032: :pending>
3
4
1
0
2
8
5
7
...
```
You'd presumably want to set up some ways to control your consumer, but hopefully
you see that it would be trivial to parallelize the processing function being
wrapped by `deleting-consumer` using `pmap`, distribute processing among agents
if that's more appropriate, etc.
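
For instance, a bare-bones `pmap`-based driver might look like the following. `process-in-parallel` is a hypothetical helper, and the stub messages stand in for what `receive` or `polling-receive` would return:

```clojure
(defn process-in-parallel
  "Applies handler to each message via pmap and forces the whole result.
   Note that doall will not return until the message seq terminates."
  [handler messages]
  (doall (pmap handler messages)))

;; Stub messages; pmap returns results in input order.
(process-in-parallel (comp count :body) [{:body "a"} {:body "bcd"}])
;; => (1 3)

;; With Bandalore, this could be driven as:
;; (process-in-parallel (sqs/deleting-consumer client handler)
;;                      (sqs/polling-receive client q :limit 10))
```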
## Building Bandalore

You'll need Maven. From the command line:
```
$ mvn clean verify
```
*The tests are all live*, so:

1. They create and delete queues (though with unique queue names).
2. They aren't written to be particularly efficient w.r.t. SQS usage. If you do decide to run the tests, the associated fees should be trivial (or nonexistent if your account is under the SQS free usage cap).

In any case, you are so warned. Make a new AWS account dedicated to testing if you're concerned on either count.

Since the tests are live, you either need to add your AWS credentials to your
`~/.m2/settings.xml` file as properties, or specify them on the command line
using `-D` switches:
```
$ mvn -Daws.id=XXXXXXX -Daws.secret-key=YYYYYYY clean install
```
Or, you can skip the tests entirely:
```
$ mvn -Dmaven.test.skip=true clean install
```
In any case, you'll find a built `.jar` file in the `target` directory, and in
its designated spot in `~/.m2/repository` (assuming you ran `install` rather than
e.g. `package`).
## Need Help?

Ping `cemerick` on Freenode IRC or Twitter if you have questions
or would like to contribute patches.
## License

Copyright © 2011-2013 Chas Emerick and contributors.

Licensed under the EPL. (See the file epl-v10.html.)