Friday, December 20, 2013

GlusterFS and the nature of configuration high availability

Nature

Providing high availability for volume configuration in a clustered filesystem is of utmost importance. Over the last couple of years the GlusterFS client never really exposed this essential capability, so we relied on a rudimentary shell-script approach and used RRDNS with virtual IPs as a workaround.

Neither the shell script nor RRDNS ever let us genuinely claim that such configuration high availability was provided. Many consumers of the GlusterFS client ended up with broken setups because of this, which led to plenty of sysadmin anguish.

Recently we decided to get rid of RRDNS as a necessity and instead let the client accept, on the command line, a list of all the servers to poll while mounting. We ended up with two patches that provide the much needed functionality, listed below.

commit b610f1be7cd71b8f3e51c224c8b6fe0e7366c8cf
Author: Harshavardhana <harsha@harshavardhana.net>
Date:   Wed Jul 24 13:16:08 2013 -0700

    glusterfsd: Round robin DNS should not be relied upon with
    config service availability for clients.
 
    Backupvolfile server as it stands is slow and prone to errors
    with mount script and its combination with RRDNS. Instead in
    theory it should use all the available nodes in 'trusted pool'
    by default (Right now we don't have a mechanism in place for
    this)
 
    Nevertheless this patch provides a scenario where a list of
    volfile-server can be provided on command as shown below
 
    -----------------------------------------------------------------
    $ glusterfs -s server1 .. -s serverN --volfile-id=<volname> \
          <mount_point>
    -----------------------------------------------------------------
                       OR
    -----------------------------------------------------------------
    $ mount -t glusterfs -obackup-volfile-servers=<server2>: \
          <server3>:...:<serverN> <server1>:/<volname> <mount_point>
    -----------------------------------------------------------------
 
    Here ':' is used as a separator for mount script parsing
 
    Now these will be remembered and recursively attempted for
    fetching vol-file until exhausted. This would ensure that the
    clients get 'volume' configs in a consistent manner avoiding the
    need to poll through RRDNS.
 
    Change-Id: If808bb8a52e6034c61574cdae3ac4e7e83513a40
    BUG: 986429
    Signed-off-by: Harshavardhana <harsha@harshavardhana.net>
    Reviewed-on: http://review.gluster.org/5400
    Tested-by: Gluster Build System <jenkins@build.gluster.com>
    Reviewed-by: Anand Avati <avati@redhat.com>


commit 0404be9ca1d9fa15c83bc4132561091c1c839d84
Author: Harshavardhana <harsha@harshavardhana.net>
Date:   Sat Sep 14 19:51:13 2013 -0700

    mount.glusterfs: getopts support and cleanup
    
    This patch is an attempt to provide some much needed
    cleanup for future maintenance of `mount.glusterfs`
    
    - Add checks for command failures
    - Spliting large code into subsequent simpler functions
    - Standardized variables
    - use 'bash' instead of 'sh' - since string manipulation
      and variable handling is far superior
    - Overall code cleanup and Copyright change to Red Hat, Inc.
    - Add new style of mounting with a comma separated list
      ~~~
      $ mount -t glusterfs <IP1/HOSTNAME1>,<IP2/HOSTNAME2>,..<IPN/HOSTNAMEN>:/<VOLUME> <MNT>
      ~~~
    - Update age old `manpage` with new options :-)
    
    Change-Id: I294e4d078a067d67d9a67eb8dde5eb2634cc0e45
    BUG: 1040348
    Signed-off-by: Harshavardhana <harsha@harshavardhana.net>
    Reviewed-on: http://review.gluster.org/5931
    Tested-by: Gluster Build System <jenkins@build.gluster.com>
    Reviewed-by: Amar Tumballi <amarts@gmail.com>
    Reviewed-by: Vijay Bellur <vbellur@redhat.com>

One patch provides the `glusterfsd` changes that allow specifying multiple volfile servers on the command line

-----------------------------------------------------------------
    $ glusterfs -s server1 .. -s serverN --volfile-id=<volname> \
          <mount_point>
-----------------------------------------------------------------

and the other provides the equivalent for `mount.glusterfs`

-----------------------------------------------------------------
    $ mount -t glusterfs -obackup-volfile-servers=<server2>: \
          <server3>:...:<serverN> <server1>:/<volname> <mount_point>
-----------------------------------------------------------------

-----------------------------------------------------------------
    $ mount -t glusterfs server1,server2,..,serverN:/<volname> \
          <mount_point>
-----------------------------------------------------------------
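
For mounts that should persist across reboots, the same style can be used from `/etc/fstab`. The entry below is only an illustrative sketch (hostnames, volume and mount point are placeholders), assuming the comma separated device syntax is accepted from fstab just as it is on the command line:

-----------------------------------------------------------------
    # illustrative /etc/fstab entry - replace hosts, volume, mount point
    server1,server2,serverN:/<volname>  <mount_point>  glusterfs  defaults,_netdev  0 0
-----------------------------------------------------------------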

Deprecation

Some options from previous releases have been deprecated:

glusterfsd


  • `volfile-max-fetch-attempts` - This option was deprecated; its meaning morphed from 'so many attempts against a single server' to 'so many attempts across multiple servers'.

mount.glusterfs


  • `fetch-attempts` - This no longer makes sense in the new style: there is no point retrying a single server over and over when the same configuration can be fetched from a different server that is available, and logic dictates that getting the configuration quickly should take priority. Internally `fetch-attempts` morphed from 'so many attempts against a single server' to 'so many attempts across multiple servers'. Backward compatibility is still provided, but the option is just a dummy placeholder.
  • `backupvolfile-server` - This option did little more than provide shell-script based failover, which was highly racy and simply failed on many occasions. It had to go to make room for better options, although it is still accepted in the code for backward compatibility (see the comparison below).
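
As a rough before-and-after sketch (server names are placeholders), the deprecated single fallback server and the new multi-server list look like this; the old form is still accepted, but only the new one hands the client the full list of volfile servers:

-----------------------------------------------------------------
    # old style - single shell script based fallback (deprecated)
    $ mount -t glusterfs -obackupvolfile-server=server2 \
          server1:/<volname> <mount_point>

    # new style - client retries the whole list while fetching volfiles
    $ mount -t glusterfs -obackup-volfile-servers=server2:server3 \
          server1:/<volname> <mount_point>
-----------------------------------------------------------------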


Monday, December 16, 2013

CTDB (Samba) support for AWS?

Background

For a long time we have wondered why the Amazon AWS infrastructure didn't provide HA-style IP failover. The real issue is that 'CTDB' by itself doesn't ship any tools for this purpose. Since the overall management within any given AWS `instance` happens through 'ec2-tools', CTDB in turn needs access to these tools to provide IP failover.

So I worked on adding some functionality; here is the link to that project: CTDB VIP support

This project collects scripts to support VIP (Virtual IP) and CTDB based HA on AWS infrastructure. The currently tested and supported configuration uses instances running inside a VPC (Virtual Private Cloud).
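
To give a feel for the approach, here is a rough sketch (not the project's actual script) of how a CTDB event hook can drive ec2-tools when CTDB asks a node to take over a VIP; the network interface ID is a placeholder:

#!/bin/bash
# Illustrative sketch of an /etc/ctdb/events.d/ hook - not the real ctdb-ec2 script
. /etc/ctdb/ec2-config            # AWS credentials, EC2_HOME, PATH for ec2-tools

event="$1"                        # CTDB passes the event name as the first argument
case "$event" in
    takeip)
        iface="$2" ip="$3"
        # Move the VIP to this instance as a secondary private IP on its ENI
        ec2-assign-private-ip-addresses \
            --network-interface eni-xxxxxxxx \
            --secondary-private-ip-address "$ip" \
            --allow-reassignment
        ;;
esac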

Usage

To start with, configure ec2-tools on any given instance - the Amazon docs have extensive documentation about Setting Up EC2 Command Line.

After setting up the EC2 tools, you should be able to run any of the ec2-tools commands without errors; if they still report errors, go back to the Amazon documentation for further help.
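
A quick sanity check, assuming your credentials and region are already exported in the shell, is to run one of the describe commands and confirm it returns cleanly:

$ ec2-describe-instances    # should list the instances in your region without errors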

On your Linux AWS instance

$ wget http://goo.gl/zGLpqa -O ctdb-ec2-1.1.tar.gz
$ rpmbuild -ta ctdb-ec2-1.1.tar.gz
$ yum localinstall ~/rpmbuild/RPMS/noarch/ctdb-ec2*noarch.rpm

Configuration

Once you have installed the 'ctdb-ec2' package, you now have to add your AWS credentials for everything to work properly.

$ cat /etc/ctdb/ec2-config
## EDIT THIS FILE
export AWS_ACCESS_KEY=<EDIT>
export AWS_SECRET_KEY=<EDIT>
export JAVA_HOME=<java_installation_directory>
export EC2_HOME=<ec2_tools_installation_directory>
export EC2_URL=<ec2_region_url> ## example --> "https://ec2.us-west-1.amazonaws.com"
export PATH=$PATH:$EC2_HOME/bin

The `ec2-config` file needs the same environment variables you supplied or added to your `.bashrc` while following `Setting Up EC2 Command Line`. Once you are done with the EC2 configuration, we are all set.

Repeat the steps listed in 'Usage' and 'Configuration' on all the nodes that are part of the CTDB cluster; a typical per-node CTDB configuration is sketched below.
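
For reference, a minimal per-node CTDB configuration usually boils down to the two files sketched below (addresses and interface name are placeholders; use the private IPs and VIPs of your own cluster):

$ cat /etc/ctdb/nodes                 # private IPs of all CTDB cluster nodes
10.0.0.11
10.0.0.12

$ cat /etc/ctdb/public_addresses      # VIPs that CTDB fails over between nodes
10.0.0.100/24 eth0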

Lastly, restart the 'ctdb' service after completing the 'CTDB' configuration.

$ chkconfig ctdb on
$ service ctdb restart