A Little Riak Book for LFE

New in Riak 2.0

Riak has been a project since 2009, and in that time it has undergone a few evolutions. Most were technical improvements, such as greater reliability and data safety mechanisms like active anti-entropy.

Riak 2.0 was not a rewrite, but rather a huge shift in how developers who use Riak interact with it. While Basho continued to make backend improvements (such as better cluster metadata) and simplified existing options (repair-2i is now a riak-admin command, rather than code you must execute), the biggest changes are immediately obvious to developers. Many of those improvements are also easier for operators to administer. Here are a few highlights of the new 2.0 interface options.

Bucket Types

A centerpiece of the new Riak 2.0 features is the addition of a higher-level bucket configuration namespace called bucket types. We discussed the general idea of bucket types in the previous chapters, but one major departure from standard buckets is that they are created via the command-line. This means that operators with server access can manage the default properties that all buckets of a given bucket type inherit.

Bucket types have a set of tools for creating, managing and activating them.

$ riak-admin bucket-type
Usage: riak-admin bucket-type <command>

The follow commands can be used to manage bucket types for the cluster:

   list                           List all bucket types and their activation status
   status <type>                  Display the status and properties of a type
   activate <type>                Activate a type
   create <type> <json>           Create or modify a type before activation
   update <type> <json>           Update a type after activation

It's rather straightforward to create a bucket type. The JSON string that follows the bucket type name can contain any valid bucket properties. Any bucket that uses this type will inherit those properties. For example, say that you wanted to create a bucket type named unsafe whose n_val was always 1 (rather than the default 3).

$ riak-admin bucket-type create unsafe '{"props":{"n_val":1}}'

Once you create the bucket type, it's a good idea to check the status, and ensure the properties are what you meant to set.

$ riak-admin bucket-type status unsafe

A bucket type is not active until you propagate it through the system by calling the activate command.

$ riak-admin bucket-type activate unsafe

If something is wrong with the type's properties, you can always update it.

$ riak-admin bucket-type update unsafe '{"props":{"n_val":1}}'

You can update a bucket type after it's activated. All of the changes that you make to the type will be inherited by every bucket under that type.
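The create, status, and activate steps above can be strung together in a small script. What follows is only a minimal sketch, not an official tool: the helper names (run, props_json, setup_bucket_type) and the DRY_RUN variable are made up for illustration. With DRY_RUN left at 1 the commands are printed rather than executed, so you can try it on a machine without Riak.

```shell
#!/bin/sh
# Hypothetical helpers; only the riak-admin subcommands come from the text.
DRY_RUN="${DRY_RUN:-1}"

# run CMD ARGS... : print the command in dry-run mode, otherwise execute it.
run() {
  if [ "$DRY_RUN" = "1" ]; then
    printf '%s\n' "$*"
  else
    "$@"
  fi
}

# props_json KEY VALUE : prints {"props":{"KEY":VALUE}}
# VALUE is inserted verbatim, so quote JSON strings yourself.
props_json() {
  printf '{"props":{"%s":%s}}' "$1" "$2"
}

# setup_bucket_type TYPE KEY VALUE : create, inspect, then activate a type.
setup_bucket_type() {
  type_name="$1"
  run riak-admin bucket-type create "$type_name" "$(props_json "$2" "$3")"
  run riak-admin bucket-type status "$type_name"
  run riak-admin bucket-type activate "$type_name"
}

setup_bucket_type unsafe n_val 1
```

In dry-run mode this prints the three commands in order (without the single quotes around the JSON, since they are only echoed). Set DRY_RUN=0 on a Riak node to actually run them.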

Of course, you can always get a list of the current bucket types in the system with riak-admin bucket-type list. The list also says whether each bucket type is activated or not.

Other than that, there's nothing interesting about bucket types from an operations point of view, per se. Sure, there are some cool internal mechanisms at work, such as metadata propagated via a path laid out by a plum-tree and causally tracked by dotted version vectors. But that's only code plumbing. What's most interesting about bucket types are the new features you can take advantage of: datatypes, strong consistency, and search.

Datatypes

Datatypes are useful for engineers, since they no longer have to consider the complexity of manual conflict merges that can occur in fault situations. They can also put less stress on the system, since larger objects need only communicate their changes, rather than reinsert the full object.

Riak 2.0 provides several datatypes. Three of them can be assigned to a bucket type: map, set, and counter. Flags, like registers, exist only as fields inside a map, so they do not get a bucket type of their own. You create a bucket type with a single datatype. It's not required, but often good form to name the bucket type after the datatype you're setting.

$ riak-admin bucket-type create maps '{"props":{"datatype":"map"}}'
$ riak-admin bucket-type create sets '{"props":{"datatype":"set"}}'
$ riak-admin bucket-type create counters '{"props":{"datatype":"counter"}}'

Once a bucket type is created with the given datatype, you need only activate it. Developers can then use this datatype like we saw in the previous chapter, but hopefully this example makes clear the suggestion of naming bucket types after their datatype.

curl -XPUT "$RIAK/types/counters/buckets/visitors/keys/index_page" \
  -H "Content-Type:application/json" \
  -d 1
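To see why the naming convention pays off, it helps to spell out how the URL in the curl example is assembled. Here is a small sketch; typed_key_url is a hypothetical helper, and the default base URL (a local node on port 8098) is an assumption you should replace with your own $RIAK value.

```shell
#!/bin/sh
# typed_key_url BASE TYPE BUCKET KEY
# Prints BASE/types/<type>/buckets/<bucket>/keys/<key>, the path layout
# used by the curl example in the text.
typed_key_url() {
  printf '%s/types/%s/buckets/%s/keys/%s\n' "$1" "$2" "$3" "$4"
}

# Assumed base URL for a local node; override RIAK to point elsewhere.
RIAK="${RIAK:-http://localhost:8098}"

typed_key_url "$RIAK" counters visitors index_page
```

Reading the path left to right, the bucket type named counters tells you at a glance what kind of value lives under visitors/index_page.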

Strong Consistency

Strong consistency (SC) is the opposite of everything that Riak stands for. Where Riak is all about high availability in the face of network or server errors, strong consistency is about safety over liveness. Either the network and servers are working perfectly, or the reads and writes fail. So why on earth would we ever want to provide SC and give up HA? Because you asked for it. Really.

There are some very good use-cases for strong consistency. For example, when a user is completing a purchase, you might want to ensure that the system is always in a consistent state, or fail the purchase. Telling a user that a purchase was made when in fact it was not is a poor user experience. The opposite is even worse.

While Riak will continue to be primarily an HA system, there are cases where SC is useful, and developers should be allowed to choose without having to install an entirely new database. So all you need to do is activate it in riak.conf.

strong_consistency = on

One thing to note: although we generally recommend you have five nodes in a Riak cluster, it's not a hard requirement. Strong consistency, however, requires at least three nodes. It will not operate with fewer.

Once our SC system is active, you'll lean on bucket types again. Only buckets that live under a bucket type set up for strong consistency will be strongly consistent. This means that you can have some buckets HA and others SC in the same database. Let's call our SC bucket type strong.

$ riak-admin bucket-type create strong '{"props":{"consistent":true}}'
$ riak-admin bucket-type activate strong

That's all the operator should need to do. The developers can use buckets under the strong type much like any other bucket.

curl -XPUT "$RIAK/types/strong/buckets/purchases/keys/jane" \
  -d '{"action":"buy"}'

Jane's purchase will either succeed or fail. It will not be eventually consistent. If it fails, of course, she can try again.
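The "try again" part is the client's job. A minimal sketch of a retry wrapper follows; retry is a hypothetical helper, not something Riak ships. It assumes the wrapped command signals failure through its exit status, which curl only does for HTTP errors when given --fail.

```shell
#!/bin/sh
# retry ATTEMPTS CMD ARGS...
# Runs CMD up to ATTEMPTS times and stops at the first success.
# Returns 0 on success, 1 if every attempt failed.
retry() {
  attempts="$1"
  shift
  i=1
  while [ "$i" -le "$attempts" ]; do
    "$@" && return 0
    i=$((i + 1))
  done
  return 1
}

# Intended use against a live node (not executed here):
#   retry 3 curl --fail -XPUT "$RIAK/types/strong/buckets/purchases/keys/jane" \
#     -d '{"action":"buy"}'

# Harmless demonstration with a command that always succeeds.
retry 3 true && echo "succeeded"
```

A real client would probably also pause between attempts and give up with a clear error message, but that depends on your application.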

What if your system is having problems with strong consistency? The subsystem responsible for SC is named ensemble, and Basho has provided a command to interrogate its current status. You can check it out by running ensemble-status.

$ riak-admin ensemble-status

It will give you the best information it has as to the state of the system. For example, if you didn't enable strong_consistency in every node's riak.conf, you might see this.

============================== Consensus System ===============================
Enabled:     false
Active:      false
Ring Ready:  true
Validation:  strong (trusted majority required)
Metadata:    best-effort replication (asynchronous)

Note: The consensus subsystem is not enabled.

================================== Ensembles ==================================
There are no active ensembles.

In the common case when all is working, you should see an output similar to the following:

============================== Consensus System ===============================
Enabled:     true
Active:      true
Ring Ready:  true
Validation:  strong (trusted majority required)
Metadata:    best-effort replication (asynchronous)

================================== Ensembles ==================================
 Ensemble     Quorum        Nodes      Leader
-------------------------------------------------------------------------------
   root       4 / 4         4 / 4      riak@riak1
    2         3 / 3         3 / 3      riak@riak2
    3         3 / 3         3 / 3      riak@riak4
    4         3 / 3         3 / 3      riak@riak1
    5         3 / 3         3 / 3      riak@riak2
    6         3 / 3         3 / 3      riak@riak2
    7         3 / 3         3 / 3      riak@riak4
    8         3 / 3         3 / 3      riak@riak4

This output tells you that the consensus system is both enabled and active, and lists details about all known consensus groups (ensembles).
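If you'd rather not eyeball that table, a little awk can pick out ensembles whose Quorum column shows fewer peers than expected. This is a sketch under assumptions: degraded_ensembles is a hypothetical helper, it relies on the column layout shown above, and treating "first number lower than second" as a problem is my reading of the table, not an official health check. The sample rows are copied from the output above, with ensemble 3 edited to 2 / 3 purely for illustration.

```shell
#!/bin/sh
# degraded_ensembles : reads ensemble-status output on stdin and prints the
# name of every ensemble whose Quorum column reads "X / Y" with X < Y.
degraded_ensembles() {
  awk '$3 == "/" && $6 == "/" && ($2 + 0) < ($4 + 0) { print $1 }'
}

# Illustrative sample; the 2 / 3 on ensemble 3 is invented for the demo.
sample='   root       4 / 4         4 / 4      riak@riak1
    2         3 / 3         3 / 3      riak@riak2
    3         2 / 3         3 / 3      riak@riak4'

printf '%s\n' "$sample" | degraded_ensembles

# Against a live node you would run:
#   riak-admin ensemble-status | degraded_ensembles
```

Header and separator lines are skipped automatically, because they don't have a "/" in the third and sixth fields.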

There is plenty more information about the details of strong consistency in the online docs.

Search 2.0

From an operations standpoint, search is deceptively simple. Functionally, there isn't much you should need to do with search, other than activate it in riak.conf.

search = on

However, looks are deceiving. Under the covers, Riak Search 2.0 actually runs the search index software called Solr. Solr runs as a Java service. All of the code required to convert an object that you insert into a document that Solr can recognize (by a module called an Extractor) is Erlang, and so is the code which keeps the Riak objects and Solr indexes in sync through faults (via AAE), as well as all of the interfaces, security, stats, and query distribution. But since Solr is Java, we have to manage the JVM.

If you don't have much experience running Java code, let me distill most problems for you: you need more memory. Solr is a memory hog, easily requiring a minimum of 2 GiB of RAM dedicated only to the Solr service itself. This is in addition to the 4 GiB of RAM minimum that Basho recommends per node. So, according to math, you need a minimum of 6 GiB of RAM to run Riak Search. But we're not quite through yet.

The most important settings in Riak Search are the JVM options. These options are passed into the JVM command-line when the Solr service is started, and most of the options chosen are excellent defaults. I recommend not getting too hung up on tweaking those, with one notable exception.

## The options to pass to the Solr JVM.  Non-standard options,
## i.e. -XX, may not be portable across JVM implementations.
## E.g. -XX:+UseCompressedStrings
## 
## Default: -d64 -Xms1g -Xmx1g -XX:+UseStringCache -XX:+UseCompressedOops
## 
## Acceptable values:
##   - text
search.solr.jvm_options = -d64 -Xms1g -Xmx1g -XX:+UseStringCache -XX:+UseCompressedOops

In the default setting, Riak gives 1 GiB of RAM to the Solr JVM heap. This is fine for small clusters with small, lightly used indexes. You may want to bump those heap values up---the two args of note are: -Xms1g (minimum size 1 gigabyte) and -Xmx1g (maximum size 1 gigabyte). Push those to 2 or 4 (or even higher) and you should be fine.
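Since the minimum and maximum heap are normally raised together, it can be handy to generate the riak.conf line from a single number. A minimal sketch follows; solr_jvm_options is a hypothetical helper that simply substitutes the heap size (in gigabytes) into the default option string shown above.

```shell
#!/bin/sh
# solr_jvm_options HEAP_GB
# Prints a search.solr.jvm_options line with -Xms and -Xmx both set to
# HEAP_GB gigabytes; the remaining flags are the defaults from riak.conf.
solr_jvm_options() {
  heap="$1"
  printf 'search.solr.jvm_options = -d64 -Xms%sg -Xmx%sg -XX:+UseStringCache -XX:+UseCompressedOops\n' "$heap" "$heap"
}

solr_jvm_options 2
```

Paste the resulting line into riak.conf in place of the default. Remember that whatever you give the heap comes on top of the memory Riak itself needs.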

In the interest of completeness, Riak also communicates with Solr internally through a port, which you can configure (along with an optional JMX port). You should never need to connect to this port yourself.

## The port number which Solr binds to.
## NOTE: Binds on every interface.
## 
## Default: 8093
## 
## Acceptable values:
##   - an integer
search.solr.port = 8093

## The port number which Solr JMX binds to.
## NOTE: Binds on every interface.
## 
## Default: 8985
## 
## Acceptable values:
##   - an integer
search.solr.jmx_port = 8985

There's generally no great reason to alter these defaults, but they're there if you need them.

I should also note that, thanks to fancy bucket types, you can associate a bucket type with a search index. You associate buckets (or types) with indexes by adding a search_index property, with the name of a Solr index. Like so, assuming that you've created a Solr index named my_index:

$ riak-admin bucket-type create indexed '{"props":{"search_index":"my_index"}}'
$ riak-admin bucket-type activate indexed

Now, any object that a developer puts into Riak under that bucket type will be indexed by Riak Search (code-named Yokozuna).

There's a lot more to search than we can possibly cover here without making it a book in its own right. You may want to check out the documentation at docs.basho.com for more details.

Security

Riak has lived quite well in the first five years of its life without security. So why did Basho add it now? A firewall gives you only coarse-grained security. Someone can either access the system or not, with a few restrictions, depending on how cleverly you write your firewall rules.

With the addition of Security, Riak now supports authentication (identifying a user) and authorization (restricting user access to a subset of commands) of users and groups. Access can also be restricted to a known set of sources. The security design was inspired by the full-featured rules in PostgreSQL.

Before you decide to enable security, you should consider this checklist in advance.

  1. If you use security, you must upgrade to Riak Search 2.0. The old Search will not work (neither will the deprecated Link Walking). Check any Erlang MapReduce code for invocations of Riak modules other than riak_kv_mapreduce. Enabling security will prevent those from succeeding unless those modules are available via add_path.
  2. Make sure that your application code is using the most recent drivers
  3. Define users and (optionally) groups, and their sources
  4. Grant the necessary permissions to each user/group

With that out of the way, you can enable security with a command-line option (you can disable security as well). You can optionally check the status of security at any time.

$ riak-admin security enable
$ riak-admin security status
Enabled

Adding users is as easy as the add-user command. A username is required, and can be followed with any key/value pairs. password and groups are special cases, but everything else is free form. You can alter existing users as well. Users can belong to any number of groups, and inherit a union of all group settings.

$ riak-admin security add-group mascots type=mascot
$ riak-admin security add-user bashoman password=Test1234
$ riak-admin security alter-user bashoman groups=mascots

You can see the list of all users via print-users, or all groups via print-groups.

$ riak-admin security print-users
+----------+----------+----------------------+---------------------+
| username |  groups  |       password       |       options       |
+----------+----------+----------------------+---------------------+
| bashoman | mascots  |983e8ae1421574b8733824| [{"type","mascot"}] |
+----------+----------+----------------------+---------------------+

Creating users and groups is nice and all, but the real reason for doing this is so we can distinguish authorization between different users and groups. You grant or revoke permissions to users and groups by way of the command line, of course. A permission can apply to anything, to a certain bucket type, or to a specific bucket.

$ riak-admin security grant riak_kv.get on any to all
$ riak-admin security grant riak_kv.delete on any to admin
$ riak-admin security grant search.query on index people to bashoman
$ riak-admin security revoke riak_kv.delete on any from bad_admin

There are many kinds of permissions, one for every major operation or set of operations in Riak. It's worth noting that you can't add search permissions without search enabled.

  • riak_kv.get --- Retrieve objects
  • riak_kv.put --- Create or update objects
  • riak_kv.delete --- Delete objects
  • riak_kv.index --- Index objects using secondary indexes (2i)
  • riak_kv.list_keys --- List all of the keys in a bucket
  • riak_kv.list_buckets --- List all buckets
  • riak_kv.mapreduce --- Can run MapReduce jobs
  • riak_core.get_bucket --- Retrieve the props associated with a bucket
  • riak_core.set_bucket --- Modify the props associated with a bucket
  • riak_core.get_bucket_type --- Retrieve the set of props associated with a bucket type
  • riak_core.set_bucket_type --- Modify the set of props associated with a bucket type
  • search.admin --- The ability to perform search admin-related tasks, like creating and deleting indexes
  • search.query --- The ability to query an index

Finally, with our group and user created, and given access to a subset of permissions, we have one more major item to deal with. We want to be able to filter connections from specific sources.

$ riak-admin security add-source all|<users> <CIDR> <source> [<option>=<value>[...]]

This is powerful security, since Riak will only accept connections that pass specific criteria, such as a certain certificate or password, or from a specific IP address. Here we trust any connection that's initiated locally.

$ riak-admin security add-source all 127.0.0.1/32 trust
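If CIDR notation is unfamiliar, the /32 above means all 32 bits of the address must match, so only 127.0.0.1 itself qualifies. The following sketch shows the arithmetic behind that for IPv4. The function names are hypothetical, and this is ordinary CIDR matching written out in shell, not Riak's own implementation.

```shell
#!/bin/sh
# ip_to_int A.B.C.D : prints the IPv4 address as a single integer.
ip_to_int() {
  old_ifs=$IFS
  IFS=.
  set -- $1          # split the dotted quad into four positional parameters
  IFS=$old_ifs
  echo $(( ($1 << 24) + ($2 << 16) + ($3 << 8) + $4 ))
}

# in_cidr IP NETWORK/BITS : succeeds when IP falls inside the CIDR block.
in_cidr() {
  ip=$(ip_to_int "$1")
  net=$(ip_to_int "${2%/*}")
  bits=${2#*/}
  if [ "$bits" -eq 0 ]; then
    mask=0
  else
    mask=$(( (0xFFFFFFFF << (32 - bits)) & 0xFFFFFFFF ))
  fi
  [ $(( ip & mask )) -eq $(( net & mask )) ]
}

for addr in 127.0.0.1 127.0.0.2; do
  if in_cidr "$addr" 127.0.0.1/32; then
    echo "$addr matches 127.0.0.1/32"
  else
    echo "$addr does not match 127.0.0.1/32"
  fi
done
```

A wider block such as 10.0.0.0/8 compares only the first 8 bits, so it would match every address that starts with 10.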

There's plenty more you can learn about in the Authentication and Authorization online documentation.

Dynamic Ring Resizing

As of Riak 2.0, you can resize the number of vnodes in the ring. The number of vnodes must be a power of 2 (e.g. 64, 256, 1024). It's a very heavyweight operation, and should not be a replacement for proper growth planning (aiming for 8 to 16 vnodes per node). However, if you experience greater than expected growth, this is quite a bit easier than transferring your entire dataset manually to a larger cluster. It just continues the Riak philosophy of easy operations, and no downtime!

$ riak-admin cluster resize-ring 128
Success: staged resize ring request with new size: 128

Then review and commit the change using the required two-phase plan/commit steps.

$ riak-admin cluster plan
$ riak-admin cluster commit
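Before staging a resize, it's worth sanity-checking the number you're about to type. The sketch below checks the power-of-2 rule and estimates vnodes per node against the 8 to 16 guideline. Both helper names are hypothetical, and the 8-node cluster is just an example value.

```shell
#!/bin/sh
# is_power_of_two N : succeeds when N is a positive power of 2.
is_power_of_two() {
  n="$1"
  [ "$n" -gt 0 ] && [ $(( n & (n - 1) )) -eq 0 ]
}

# vnodes_per_node RING_SIZE NODE_COUNT : integer vnodes per physical node.
vnodes_per_node() {
  echo $(( $1 / $2 ))
}

ring=128   # the ring size you plan to pass to resize-ring
nodes=8    # example cluster size; substitute your own

if is_power_of_two "$ring"; then
  echo "ring size $ring is valid, about $(vnodes_per_node "$ring" "$nodes") vnodes per node"
else
  echo "ring size $ring is not a power of 2"
fi
```

The bit trick works because a power of 2 has exactly one bit set, so subtracting 1 flips that bit and every bit below it, leaving nothing in common.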

It can take quite a while for ring resizing to complete. You're effectively moving half (or more) of the cluster's values to new partitions. You can track the status of the resize with a couple of commands. We've seen the ring-status command before; it shows you all of the changes that are queued up or in progress.

$ riak-admin ring-status

If you want to see a different view of specifically handoff transfers, there's the transfers command.

$ riak-admin transfers
'[email protected]' waiting to handoff 3 partitions
'[email protected]' waiting to handoff 1 partitions
'[email protected]' waiting to handoff 1 partitions
'[email protected]' waiting to handoff 2 partitions

Active Transfers:

transfer type: resize_transfer
vnode type: riak_kv_vnode
partition: 1438665674247607560106752257205091097473808596992
started: 2014-01-20 21:03:53 [1.14 min ago]
last update: 2014-01-20 21:05:01 [1.21 s ago]
total size: 111676327 bytes
objects transferred: 122598

                         1818 Objs/s                          
     [email protected]        =======>       [email protected]      
        |=========================                  |  58%    
                         950.38 KB/s

If the resize activity is taking too much time, or consuming too many resources, you can alter the handoff_concurrency limit on the fly. This limit is the number of vnodes per physical node that are allowed to perform handoff at once, and defaults to 2. You can change the setting for the entire cluster, or per node. Say you want to allow up to 4 vnodes to transfer at a time.

$ riak-admin transfer-limit 4

Or for a single node.

$ riak-admin transfer-limit [email protected] 4

Ring resizing will be complete once you get this message from riak-admin transfers:

No transfers active
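Rather than rerunning the command by hand, you could poll for that message. A minimal sketch follows; transfers_done and wait_for_transfers are hypothetical helpers, and the 30 second default interval is an arbitrary choice. The command to poll is a parameter so the loop can be tried without a cluster.

```shell
#!/bin/sh
# transfers_done : reads transfers output on stdin and succeeds once it
# contains the completion message quoted in the text.
transfers_done() {
  grep -q 'No transfers active'
}

# wait_for_transfers [COMMAND] [INTERVAL_SECONDS]
# Re-runs COMMAND (default: riak-admin transfers) until it reports that
# no transfers are active, sleeping INTERVAL_SECONDS between checks.
wait_for_transfers() {
  cmd="${1:-riak-admin transfers}"
  interval="${2:-30}"
  until $cmd | transfers_done; do
    sleep "$interval"
  done
}

# On a Riak node you would call:
#   wait_for_transfers && echo "ring resize finished"
```

Note that the loop never gives up on its own, so in a real script you may want to add a maximum number of checks.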

What if something goes wrong? What if you made a mistake? No problem, you can always abort the ring resize command.

$ riak-admin cluster resize-ring abort
$ riak-admin cluster plan
$ riak-admin cluster commit

Any queued handoffs will be stopped, but any completed handoffs may have to be transferred back. Easy-peasy!