Riak Setup Instructions
------

This document explains how to set up a Riak cluster. It assumes that
you have already downloaded and successfully built Riak. For help with
those steps, please refer to riak/README.


Overview
---

Riak has many knobs to tweak, affecting everything from distribution
to disk storage. This document describes the common configuration
parameters, then describes two typical setups, one small and one
large.


Configuration
---

Configurations are stored in the simple text files vm.args and
app.config. Initial versions of these files are stored in the
rel/overlay/etc/ subdirectory of the riak source tree. When a release
is generated, these "overlays" are copied to rel/riak/etc/.

vm.args
---

The vm.args configuration file sets the parameters passed on the
command line to the Erlang VM. Lines starting with a '#' are comments
and are ignored. The other lines are concatenated and passed on the
command line verbatim.

Two important parameters to configure here are "-name", the name to
give the Erlang node running Riak, and "-setcookie", the cookie that
all Riak nodes need to share in order to communicate.

app.config
---

The app.config configuration file is formatted as an Erlang VM config
file. The syntax is simply:

    [
     {AppName, [
                {Option1, Value1},
                {Option2, Value2},
                ...
               ]},
     ...
    ].

Normally, this will look something like:

    [
     {riak, [
             {storage_backend, riak_dets_backend},
             {riak_dets_backend_root, "data/dets"}
            ]},
     {sasl, [
             {sasl_error_logger, {file, "log/sasl-error.log"}}
            ]}
    ].

This would set the 'storage_backend' and 'riak_dets_backend_root'
options for the 'riak' application, and the 'sasl_error_logger' option
for the 'sasl' application.

The following parameters can be used in app.config to configure Riak
behavior. Some of the terminology used below is better explained in
riak/doc/architecture.txt.

cluster_name: string
  The name of the cluster. Can be anything. Used mainly in saving
  ring configuration.
  All nodes should have the same cluster name.

gossip_interval: integer
  The period, in milliseconds, at which ring state gossiping will
  happen. A good default is 60000 (sixty seconds). Best not to change
  it unless you know what you're doing.

ring_creation_size: integer
  The number of partitions to divide the keyspace into. This can be
  any number, but you probably don't want to go lower than 16, and
  production deployments will probably want something like 1024 or
  greater. This parameter is very difficult to change after your ring
  has been created, so if you expect to add nodes to this cluster in
  the future, choose a number that allows for growth.

ring_state_dir: string
  Directory in which the ring state should be stored. Ring state is
  stored to allow an entire cluster to be restarted.

storage_backend: atom
  Name of the module that implements the storage for a vnode. The
  backends that ship with Riak include riak_fs_backend,
  riak_ets_backend, and riak_dets_backend. Some backends have their
  own set of configuration parameters.

  riak_fs_backend:
    A backend that uses the filesystem directly to store data. Data
    are stored in Erlang binary format in files in a directory
    structure on disk.

    riak_fs_backend_root: string
      The directory under which this backend will store its files.

  riak_ets_backend:
    A backend that uses ETS to store its data.

  riak_dets_backend:
    A backend that uses DETS to store its data.

    riak_dets_backend_root: string
      The directory under which this backend will store its files.


Single-node Configuration
---

If you're running a single Riak node, you likely don't need to change
any configuration at all. After compiling and generating the release
("./rebar compile generate"), simply start Riak from the rel/
directory. (Details about the "riak" control script can be found in
the README.)
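
For reference, the parameters described in the Configuration section
could be set explicitly in app.config along the following lines. This
is only a sketch: every value is an illustrative assumption rather than
a required setting. In particular, the cluster name "default" and the
"data/ring" directory are guesses, and the ring size is simply one
value above the suggested minimum of 16.

```erlang
%% Sketch of an app.config that sets the documented options explicitly.
%% All values below are illustrative assumptions, not required settings.
[
 {riak, [
         %% Must be identical on every node in the cluster.
         {cluster_name, "default"},
         %% Ring gossip period in milliseconds (sixty seconds).
         {gossip_interval, 60000},
         %% Number of partitions; hard to change once the ring exists.
         {ring_creation_size, 64},
         %% Where ring state is saved so the cluster can be restarted.
         {ring_state_dir, "data/ring"},
         %% Storage module for vnodes, plus its backend-specific option.
         {storage_backend, riak_dets_backend},
         {riak_dets_backend_root, "data/dets"}
        ]},
 {sasl, [
         {sasl_error_logger, {file, "log/sasl-error.log"}}
        ]}
].
```

Note that the whole file is a single Erlang term, so the trailing
period after the closing bracket is required.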


Large (Production) Configuration
---

If you're running any sort of cluster that could be labeled
"production", "deployment", "scalable", "enterprise", or any other
word implying that the cluster will be running indefinitely with
ongoing maintenance, then you will want to change configurations a
bit. Some recommended changes:

* Uncomment the "-heart" line in vm.args. This will cause the "heart"
  utility to monitor the Riak node, and restart it if it stops.

* Change the name of the Riak node in vm.args from riak@127.0.0.1 to
  riak@VISIBLE.HOSTNAME. This will allow Riak nodes on separate
  machines to communicate.

* Change 'riak_web_ip' in app.config if you'll be accessing that
  interface from a non-host-local client.

* Consider adding a 'ring_creation_size' entry to app.config, and
  setting it to a number higher than the default of 64. More
  partitions will allow you to add more Riak nodes later, if you need
  to.

* Consider changing the 'storage_backend' entry in app.config.
  Depending on your use case, riak_dets_backend may not be your best
  choice.

To get the cluster up and running, first start Riak on each node with
the usual "riak start" command. Next, tell each node to join the
cluster with the riak-admin script:

    box2$ bin/riak-admin join riak@box1.example.com
    Sent join request to riak@box1.example.com

To check that all nodes have joined the cluster, attach a console to
any node, request the ring from the ring manager, and check that all
nodes are represented:

    $ bin/riak attach
    Attaching to /tmp/erlang.pipe.1 (^D to exit)
    (riak@box1.example.com)1> {ok, R} = riak_ring_manager:get_my_ring().
    {ok,{chstate,'riak@box1.example.com',
    ...snip...
    (riak@box1.example.com)2> riak_ring:all_members(R).
    ['riak@box1.example.com','riak@box2.example.com']

Your cluster should now be ready to accept requests.
See riak/doc/basic-client.txt for simple instructions on connecting
and storing and fetching data, though you'll need to use an Erlang
node name for your client that isn't hosted on "127.0.0.1".

Starting more nodes in production is just as easy:

1. Install Riak on another host, modifying hostnames in configuration
   files, if necessary.
2. Start the node with "riak start".
3. Add the node to the cluster with
   "riak-admin join ExistingClusterNode".


Developer Configuration
---

If you're hacking on Riak, and you need to run multiple nodes on a
single physical machine, use the "devrel" make command:

    $ make devrel
    mkdir dev
    cp -R rel/overlay rel/reltool.config dev
    ./rebar compile && cd dev && ../rebar generate
    ==> mochiweb (compile)
    ==> webmachine (compile)
    ==> riak (compile)
    ==> dev (generate)
    Generating target specification...
    Constructing release...
    cp -Rn dev/riak dev/dev1
    ...snip...
    cp -Rn dev/riak dev/dev2
    ...snip...
    cp -Rn dev/riak dev/dev3
    ...snip...

This make target creates a release, and then modifies configuration
files such that each Riak node uses a different Erlang node name
(riak1-3), web port (8091-3), data directory (dev/dev1-3/data/), etc.

Start each developer node just as you would a regular node, but use
the 'riak' script in dev/devN/bin/.


Logging
---

To view the activity log of a given Riak node, use the riak-admin
script. Pass it the command "logger", the Erlang node name of the
Riak node, and the cookie for that node:

    $ riak-admin logger dev1@127.0.0.1 riak
    [13/Jan/2010:14:45:37 -0500]: {riak_connect,send_ring_to,'dev1@127.0.0.1','dev2@127.0.0.1'}
    [13/Jan/2010:14:45:37 -0500]: {riak_ring_manager,write_ringfile,'dev1@127.0.0.1',
                    <<"data/ring/riak_ring.default.20100113194537">>}
    [13/Jan/2010:14:45:37 -0500]: {riak_connect,changed_ring,'dev2@127.0.0.1',gossip_changed}
    ...snip...