A Little Riak Book for LFE

Lookup

Aside: Supported Languages

Riak 2.0 has official drivers for the following languages:
  • Erlang
  • Java
  • Python
  • Ruby
Including community-supplied drivers, supported languages are even more numerous: C/C++, Clojure, Common Lisp, Dart, Go, Groovy, Haskell, JavaScript (jQuery and NodeJS), Lisp Flavored Erlang, .NET, Perl, PHP, Play, Racket, Scala, Smalltalk. Dozens of other project-specific add-ons can be found in the Basho docs.

Since Riak is a KV database, the most basic commands are setting and getting values. We'll use the HTTP interface, via curl, but we could just as easily use Erlang, Ruby, Java, or any other supported language.

The basic life cycle of a Riak object is setting a value, reading it, and maybe eventually deleting it. These actions map to HTTP methods (PUT, GET, POST, DELETE).

PUT    /types/<type>/buckets/<bucket>/keys/<key>
GET    /types/<type>/buckets/<bucket>/keys/<key>
DELETE /types/<type>/buckets/<bucket>/keys/<key>
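To make the URL layout concrete, here is a minimal Python sketch that assembles the same paths. The helper name key_path is hypothetical (it is not part of any Riak client), and percent-encoding each segment is an assumption about how you would want to handle unusual bucket or key names.

```python
from urllib.parse import quote


def key_path(bucket_type, bucket, key=None):
    """Build /types/<type>/buckets/<bucket>/keys[/<key>] (hypothetical helper)."""
    # Percent-encode each segment so characters like '/' or ' ' stay inside it.
    path = "/types/{}/buckets/{}/keys".format(
        quote(bucket_type, safe=""), quote(bucket, safe="")
    )
    if key is not None:
        path += "/" + quote(key, safe="")
    return path


print(key_path("items", "food", "favorite"))
```

Leaving the key off yields the bucket-level keys path, which is the form used by POST and by key listing later in this chapter.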

For the examples in this chapter, let's set an environment variable, $RIAK, that points to our access node's URL.

export RIAK=http://localhost:8098

PUT

The simplest write command in Riak is putting a value. It requires a key, value, and a bucket. In curl, all HTTP methods are prefixed with -X. Putting the value pizza into the key favorite under the food bucket and items bucket type is done like this:

curl -XPUT "$RIAK/types/items/buckets/food/keys/favorite" \
  -H "Content-Type:text/plain" \
  -d "pizza"

I threw a few curveballs in there. The -d flag denotes that the next string will be the value. We've kept things simple with the string pizza, declaring it as text with the preceding line, -H "Content-Type:text/plain". This defines the HTTP MIME type of this value as plain text. We could have set any value at all, be it XML or JSON, even an image or a video. Riak does not care at all what data is uploaded, so long as the object size doesn't get much larger than 4MB (a soft limit, but one that it is unwise to exceed).
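For readers who would rather not use curl, the same PUT can be sketched with Python's standard library. This is only an illustration under a few assumptions: the RIAK constant mirrors the $RIAK variable above, build_put is a hypothetical helper name, and the request is built but not sent, so the snippet runs without a Riak node.

```python
import urllib.request

RIAK = "http://localhost:8098"  # assumption: same value as $RIAK above


def build_put(bucket_type, bucket, key, value, content_type):
    """Build (but do not send) a PUT request for one key (hypothetical helper)."""
    url = "{}/types/{}/buckets/{}/keys/{}".format(RIAK, bucket_type, bucket, key)
    return urllib.request.Request(
        url,
        data=value,  # the request body, i.e. the Riak value, as bytes
        headers={"Content-Type": content_type},
        method="PUT",
    )


req = build_put("items", "food", "favorite", b"pizza", "text/plain")
print(req.get_method(), req.full_url)
# With a node running, urllib.request.urlopen(req) would send it.
```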

GET

The next command reads the value pizza under the type/bucket/key items/food/favorite.

curl -XGET "$RIAK/types/items/buckets/food/keys/favorite"
pizza

This is the simplest form of read, responding with only the value. Riak contains much more information, which you can access if you read the entire response, including the HTTP header.

In curl you can access a full response by way of the -i flag. Let's perform the above query again, adding that flag (GET is curl's default method, so we can leave off -XGET).

curl -i "$RIAK/types/items/buckets/food/keys/favorite"
HTTP/1.1 200 OK
X-Riak-Vclock: a85hYGBgzGDKBVIcypz/fgaUHjmdwZTImMfKcN3h1Um+LAA=
Vary: Accept-Encoding
Server: MochiWeb/1.1 WebMachine/1.9.0 (someone had painted...
Last-Modified: Wed, 10 Oct 2012 18:56:23 GMT
ETag: "1yHn7L0XMEoMVXRGp4gOom"
Date: Thu, 11 Oct 2012 23:57:29 GMT
Content-Type: text/plain
Content-Length: 5

pizza

The anatomy of HTTP is a bit beyond this little book, but let's look at a few parts worth noting.

Status Codes

The first line gives the HTTP version (1.1) and the response code (200 OK). You may be familiar with the common website code 404 Not Found. There are many kinds of HTTP status codes, and the Riak HTTP interface stays true to their intent: 1xx Informational, 2xx Success, 3xx Further Action, 4xx Client Error, 5xx Server Error.

Different actions can return different response/error codes. Complete lists can be found in the official API docs.
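Since the class of a status code is determined by its first digit, a client can bucket responses with a one-line lookup. The sketch below uses the class names from this section; the function name status_class is made up for illustration.

```python
def status_class(code):
    """Map an HTTP status code to its class by its leading digit."""
    classes = {
        1: "Informational",
        2: "Success",
        3: "Further Action",
        4: "Client Error",
        5: "Server Error",
    }
    return classes.get(code // 100, "Unknown")


for code in (200, 201, 304, 404, 412):
    print(code, status_class(code))
```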

Timings

A block of headers represents different timings for the object or the request.

  • Last-Modified - The last time this object was modified (created or updated).
  • ETag - An entity tag which can be used for cache validation by a client.
  • Date - The time at which the response was generated.
  • X-Riak-Vclock - A logical clock which we'll cover in more detail later.

Content

These describe the HTTP body of the message (in Riak's terms, the value).

  • Content-Type - The type of value, such as text/xml.
  • Content-Length - The length, in bytes, of the message body.

Some other headers like Link will be covered later in this chapter.
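As a rough illustration of putting these headers to work, the following Python sketch parses a subset of the header block from the GET response shown earlier and computes how long ago the object was modified relative to the response. It uses only the standard library; the variable names are arbitrary.

```python
from email import message_from_string
from email.utils import parsedate_to_datetime

# Headers copied from the GET response earlier in this chapter.
raw_headers = (
    "X-Riak-Vclock: a85hYGBgzGDKBVIcypz/fgaUHjmdwZTImMfKcN3h1Um+LAA=\n"
    "Last-Modified: Wed, 10 Oct 2012 18:56:23 GMT\n"
    'ETag: "1yHn7L0XMEoMVXRGp4gOom"\n'
    "Date: Thu, 11 Oct 2012 23:57:29 GMT\n"
    "Content-Type: text/plain\n"
    "Content-Length: 5\n"
    "\n"
)

headers = message_from_string(raw_headers)  # header lookups are case-insensitive

# Time between the last modification and this response.
age = parsedate_to_datetime(headers["Date"]) - parsedate_to_datetime(
    headers["Last-Modified"]
)
print("object age at response time:", age)
print("body length in bytes:", int(headers["Content-Length"]))
```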

POST

Similar to PUT, POST will save a value. But with POST a key is optional. All it requires is a bucket name (and, ideally, a bucket type), and Riak will generate a key for you.

Let's add a JSON value to represent a person under the json/people type/bucket. The response header is where a POST will return the key it generated for you.

curl -i -XPOST "$RIAK/types/json/buckets/people/keys" \
  -H "Content-Type:application/json" \
  -d '{"name":"aaron"}'
HTTP/1.1 201 Created
Vary: Accept-Encoding
Server: MochiWeb/1.1 WebMachine/1.9.2 (someone had painted...
Location: /riak/people/DNQGJY0KtcHMirkidasA066yj5V
Date: Wed, 10 Oct 2012 17:55:22 GMT
Content-Type: application/json
Content-Length: 0

You can extract this key from the Location value. Other than not being pretty, this key is treated the same as if you defined your own key via PUT.
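Extracting the key amounts to taking the last path segment of the Location value. A minimal sketch in Python, with a hypothetical helper name:

```python
def key_from_location(location):
    """Return the final path segment of a Location header value."""
    return location.rstrip("/").rsplit("/", 1)[-1]


print(key_from_location("/riak/people/DNQGJY0KtcHMirkidasA066yj5V"))
```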

Body

You may note that no body was returned with the response. For any kind of write, you can add the returnbody=true parameter to force a value to return, along with value-related headers like X-Riak-Vclock and ETag.

curl -i -XPOST "$RIAK/types/json/buckets/people/keys?returnbody=true" \
  -H "Content-Type:application/json" \
  -d '{"name":"billy"}'
HTTP/1.1 201 Created
X-Riak-Vclock: a85hYGBgzGDKBVIcypz/fgaUHjmdwZTImMfKkD3z10m+LAA=
Vary: Accept-Encoding
Server: MochiWeb/1.1 WebMachine/1.9.0 (someone had painted...
Location: /riak/people/DnetI8GHiBK2yBFOEcj1EhHprss
Last-Modified: Tue, 23 Oct 2012 04:30:35 GMT
ETag: "7DsE7SEqAtY12d8T1HMkWZ"
Date: Tue, 23 Oct 2012 04:30:35 GMT
Content-Type: application/json
Content-Length: 16

{"name":"billy"}

This is true for PUTs and POSTs.
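If you build request URLs in code rather than by hand, the returnbody parameter is just a query string entry. The sketch below assumes a hypothetical helper, keys_url, and the localhost address used for $RIAK in this chapter.

```python
from urllib.parse import urlencode


def keys_url(base, bucket_type, bucket, **params):
    """Build the bucket-level keys URL with optional query parameters."""
    url = "{}/types/{}/buckets/{}/keys".format(base, bucket_type, bucket)
    if params:
        url += "?" + urlencode(params)
    return url


print(keys_url("http://localhost:8098", "json", "people", returnbody="true"))
```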

DELETE

The final basic operation is deleting keys, which is similar to getting a value, but sending the DELETE method to the type/bucket/key.

curl -XDELETE "$RIAK/types/json/buckets/people/keys/DNQGJY0KtcHMirkidasA066yj5V"

A deleted object in Riak is internally marked as deleted, by writing a marker known as a tombstone. Unless configured otherwise, another process called a reaper will later finish deleting the marked objects.

This detail isn't normally important, except to understand two things:

  1. In Riak, a delete is actually a read and a write, and should be considered as such when calculating read/write ratios.
  2. Checking for the existence of a key is not enough to know if an object exists. You might be reading a key after it has been deleted, so you should check for tombstone metadata.
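The first point is easy to overlook when sizing a cluster. As a back-of-the-envelope sketch (the function name and the workload numbers are invented for illustration), each delete can be counted once as a read and once as a write:

```python
def effective_read_write(gets, puts, deletes):
    """Count each delete as one read plus one write."""
    reads = gets + deletes
    writes = puts + deletes
    return reads, writes


# Hypothetical workload: 900 GETs, 80 PUTs, 20 DELETEs.
reads, writes = effective_read_write(900, 80, 20)
print("reads:", reads, "writes:", writes)
```

In this made-up workload the naive count is 900 reads to 80 writes, while the adjusted count is 920 reads to 100 writes.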

Lists

Riak provides two kinds of lists. The first lists all buckets in your cluster, while the second lists all keys under a specific bucket. Both of these actions are called in the same way, and come in two varieties.

The following will give us all of our buckets as a JSON object.

curl "$RIAK/types/default/buckets?buckets=true"

{"buckets":["food"]}

And this will give us all of our keys under the food bucket.

curl "$RIAK/types/default/buckets/food/keys?keys=true"
{
  ...
  "keys": [
    "favorite"
  ]
}

If we had very many keys, clearly this might take a while. So Riak also provides the ability to stream your list of keys. keys=stream will keep the connection open, returning results in chunks of arrays. When it has exhausted its list, it will close the connection. You can see the details through curl in verbose (-v) mode (much of that response has been stripped out below).

curl -v "$RIAK/types/default/buckets/food/keys?keys=stream"
...

* Connection #0 to host localhost left intact
...
{"keys":["favorite"]}
{"keys":[]}
* Closing connection #0
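Because a streamed response is a sequence of JSON objects rather than a single document, a client has to decode them one after another. Here is one way that might look in Python, assuming the chunks have already been read into a string; keys_from_stream is a hypothetical name, and a real client would decode incrementally as chunks arrive.

```python
import json


def keys_from_stream(text):
    """Collect keys from back-to-back {"keys": [...]} JSON objects."""
    decoder = json.JSONDecoder()
    keys = []
    pos = 0
    while pos < len(text):
        # raw_decode does not skip leading whitespace, so do it here.
        if text[pos].isspace():
            pos += 1
            continue
        obj, pos = decoder.raw_decode(text, pos)
        keys.extend(obj.get("keys", []))
    return keys


body = '{"keys":["favorite"]}\n{"keys":[]}\n'
print(keys_from_stream(body))
```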

Note that list actions should not be used in production, because they are really expensive operations. But they are useful for development, investigations, or for running occasional analytics at off-peak hours.

Conditional requests

It is possible to use conditional requests with Riak, but these are fragile due to the nature of its availability/eventual consistency model.

GET

When retrieving values from Riak via HTTP, a last-modified timestamp and an ETag are included. These may be used for future GET requests; if the value has not changed, a 304 Not Modified status will be returned.

For example, let's assume you receive the following headers.

Last-Modified: Thu, 17 Jul 2014 21:01:16 GMT
ETag: "3VhRP0vnXbk5NjZllr0dDE"

Note that the quotes are part of the ETag.

The ETag can then be sent via the If-None-Match header in the next request:

curl -i "$RIAK/types/default/buckets/food/keys/dinner" \
  -H 'If-None-Match: "3VhRP0vnXbk5NjZllr0dDE"'
HTTP/1.1 304 Not Modified
Vary: Accept-Encoding
Server: MochiWeb/1.1 WebMachine/1.10.5 (jokes are better explained)
ETag: "3VhRP0vnXbk5NjZllr0dDE"
Date: Mon, 28 Jul 2014 19:48:13 GMT

Similarly, the last-modified timestamp may be used with If-Modified-Since:

curl -i "$RIAK/types/default/buckets/food/keys/dinner" \
  -H 'If-Modified-Since: Thu, 17 Jul 2014 21:01:16 GMT'
HTTP/1.1 304 Not Modified
Vary: Accept-Encoding
Server: MochiWeb/1.1 WebMachine/1.10.5 (jokes are better explained)
ETag: "3VhRP0vnXbk5NjZllr0dDE"
Date: Mon, 28 Jul 2014 19:51:39 GMT
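The same conditional GETs can be sketched with Python's standard library. The helper name conditional_get and the RIAK constant are assumptions for illustration; the request is only constructed here, not sent, so no Riak node is needed to run the snippet.

```python
import urllib.request

RIAK = "http://localhost:8098"  # assumption: same value as $RIAK


def conditional_get(url, etag=None, last_modified=None):
    """Build a GET request carrying cache validators from an earlier response."""
    headers = {}
    if etag is not None:
        headers["If-None-Match"] = etag  # the quotes are part of the ETag
    if last_modified is not None:
        headers["If-Modified-Since"] = last_modified
    return urllib.request.Request(url, headers=headers, method="GET")


req = conditional_get(
    RIAK + "/types/default/buckets/food/keys/dinner",
    etag='"3VhRP0vnXbk5NjZllr0dDE"',
)
print(req.header_items())
# Note: urllib.request.urlopen raises urllib.error.HTTPError for a 304,
# so a caller would check the error's .code attribute for 304.
```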

PUT & DELETE

When adding, updating, or removing content, the HTTP headers If-None-Match, If-Match, If-Modified-Since, and If-Unmodified-Since can be used to specify ETags and timestamps.

If the specified condition cannot be met, a 412 Precondition Failed status will be the result.
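To see how the ETag preconditions play out, here is a deliberately simplified Python model of the standard HTTP rules. It is not Riak's implementation: it handles a single ETag per header, ignores the timestamp headers and the "*" wildcard, and the function name is invented.

```python
def precondition_status(method, current_etag, if_match=None, if_none_match=None):
    """Return 304 or 412 if a precondition stops the request, else None.

    current_etag is the stored object's ETag, or None if it does not exist.
    """
    # If-Match: proceed only when the stored ETag is the one the client expects.
    if if_match is not None and if_match != current_etag:
        return 412
    # If-None-Match: stop when the stored ETag is the one the client named.
    if if_none_match is not None and if_none_match == current_etag:
        return 304 if method == "GET" else 412
    return None  # no precondition failed; handle the request normally


etag = '"3VhRP0vnXbk5NjZllr0dDE"'
print(precondition_status("GET", etag, if_none_match=etag))
print(precondition_status("PUT", etag, if_match='"some-older-etag"'))
print(precondition_status("DELETE", etag, if_match=etag))
```

Keep in mind the caveat from the start of this section: under eventual consistency, the ETag a node compares against may not reflect the latest write.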