Friday, December 20, 2013

GlusterFS and its nature of configuration high availability

Nature

For a clustered filesystem, keeping the volume configuration highly available is of utmost importance. Over the last couple of years the GlusterFS client never really exposed this essential capability, so we relied on a rudimentary shell-script mechanism and used round-robin DNS (RRDNS) with virtual IPs as a workaround.

Neither the shell script nor RRDNS ever let us fully claim that we provide high availability for the volume configuration. Many users of the GlusterFS client ended up with broken setups because of this, which led to sysadmin anguish.

Recently we came up with the idea of removing RRDNS as a necessity and allowing all the servers to be given as a list to poll when mounting from the command line. We ended up with two patches that provide this much-needed functionality; they are listed below.

commit b610f1be7cd71b8f3e51c224c8b6fe0e7366c8cf
Author: Harshavardhana <harsha@harshavardhana.net>
Date:   Wed Jul 24 13:16:08 2013 -0700

    glusterfsd: Round robin DNS should not be relied upon with
    config service availability for clients.
 
    Backupvolfile server as it stands is slow and prone to errors
    with mount script and its combination with RRDNS. Instead in
    theory it should use all the available nodes in 'trusted pool'
    by default (Right now we don't have a mechanism in place for
    this)
 
    Nevertheless this patch provides a scenario where a list of
    volfile-server can be provided on command as shown below
 
    -----------------------------------------------------------------
    $ glusterfs -s server1 .. -s serverN --volfile-id=<volname> \
          <mount_point>
    -----------------------------------------------------------------
                       OR
    -----------------------------------------------------------------
    $ mount -t glusterfs -obackup-volfile-servers=<server2>: \
          <server3>:...:<serverN> <server1>:/<volname> <mount_point>
    -----------------------------------------------------------------
 
    Here ':' is used as a separator for mount script parsing
 
    Now these will be remembered and recursively attempted for
    fetching vol-file until exhausted. This would ensure that the
    clients get 'volume' configs in a consistent manner avoiding the
    need to poll through RRDNS.
 
    Change-Id: If808bb8a52e6034c61574cdae3ac4e7e83513a40
    BUG: 986429
    Signed-off-by: Harshavardhana <harsha@harshavardhana.net>
    Reviewed-on: http://review.gluster.org/5400
    Tested-by: Gluster Build System <jenkins@build.gluster.com>
    Reviewed-by: Anand Avati <avati@redhat.com>


commit 0404be9ca1d9fa15c83bc4132561091c1c839d84
Author: Harshavardhana <harsha@harshavardhana.net>
Date:   Sat Sep 14 19:51:13 2013 -0700

    mount.glusterfs: getopts support and cleanup
    
    This patch is an attempt to provide some much needed
    cleanup for future maintenance of `mount.glusterfs`
    
    - Add checks for command failures
    - Spliting large code into subsequent simpler functions
    - Standardized variables
    - use 'bash' instead of 'sh' - since string manipulation
      and variable handling is far superior
    - Overall code cleanup and Copyright change to Red, Hat Inc.
    - Add new style of mounting with a comma separated list
      ~~~
      $ mount -t glusterfs <IP1/HOSTNAME1>,<IP2/HOSTNAME2>,..<IPN/HOSTNAMEN>:/<VOLUME> <MNT>
      ~~~
    - Update age old `manpage` with new options :-)
    
    Change-Id: I294e4d078a067d67d9a67eb8dde5eb2634cc0e45
    BUG: 1040348
    Signed-off-by: Harshavardhana <harsha@harshavardhana.net>
    Reviewed-on: http://review.gluster.org/5931
    Tested-by: Gluster Build System <jenkins@build.gluster.com>
    Reviewed-by: Amar Tumballi <amarts@gmail.com>

    Reviewed-by: Vijay Bellur <vbellur@redhat.com>

One patch changes `glusterfsd` so that multiple servers can be specified on the command line

-----------------------------------------------------------------
    $ glusterfs -s server1 .. -s serverN --volfile-id=<volname> \
          <mount_point>
-----------------------------------------------------------------

and the other for `mount.glusterfs`

 -----------------------------------------------------------------
    $ mount -t glusterfs -obackup-volfile-servers=<server2>: \
          <server3>:...:<serverN> <server1>:/<volname> <mount_point>
 -----------------------------------------------------------------

 -----------------------------------------------------------------
    $ mount -t glusterfs server1,server2,..,serverN:/<volname> \
          <mount_point>
 -----------------------------------------------------------------
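
To make the relationship between these forms concrete, here is a minimal sketch (not the actual `mount.glusterfs` code) of how a comma separated mount spec could be expanded into the equivalent `glusterfs -s ...` invocation. The function name `build_glusterfs_cmd`, the volume `myvol` and the mount point `/mnt/gluster` are made up for illustration, and the sketch only prints the command instead of running it.

```sh
# Hypothetical helper: expand "server1,server2:/volname" into the
# matching "glusterfs -s ... --volfile-id=..." command line.
# The body runs in a subshell so the IFS change does not leak out.
build_glusterfs_cmd() (
    spec=$1
    mount_point=$2
    server_part=${spec%%:/*}   # everything before the first ":/"
    volname=${spec#*:/}        # everything after the first ":/"
    cmd="glusterfs"

    # Split the server list on commas and add one -s per server.
    IFS=','
    for server in $server_part; do
        cmd="$cmd -s $server"
    done

    printf '%s --volfile-id=%s %s\n' "$cmd" "$volname" "$mount_point"
)

build_glusterfs_cmd "server1,server2,server3:/myvol" /mnt/gluster
# prints: glusterfs -s server1 -s server2 -s server3 --volfile-id=myvol /mnt/gluster
```

A single server spec such as `server1:/myvol` goes through the same path and simply yields one `-s` argument.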

Deprecation

Some options from previous releases have been deprecated:

glusterfsd


  • `volfile-max-fetch-attempts` - This was deprecated; its meaning morphed from `so many attempts for a single server` to `so many attempts across multiple servers`.

mount.glusterfs


  • `fetch-attempts` - This no longer makes sense in the new style: we do not want to spend unnecessary attempts on a single server when the same configuration could be fetched from a different one that is available. Logic dictated that getting the configuration quickly should take priority. Internally `fetch-attempts` morphed from `so many attempts for a single server` to `so many attempts across multiple servers`. The option is still accepted for backward compatibility, but only as a dummy placeholder.
  • `backupvolfile-server` - This option did little more than provide a shell-script-based failover, which was highly racy and failed on many occasions. It had to be removed to make room for better options (although it is still accepted in the code for backward compatibility).
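
The idea behind dropping per-server retries can be pictured with a small sketch: walk the server list in order and take the volume configuration from the first server that answers, instead of retrying a single one. This is only an illustration of the behaviour described above, not the `glusterfsd` implementation. `try_fetch`, `fetch_volfile` and `GOOD_SERVER` are hypothetical names, and `try_fetch` is a stub that pretends only one server is reachable so the example runs without a cluster.

```sh
# Hypothetical stand-in for "ask this server for the volfile".
# It succeeds only for the host named in GOOD_SERVER.
try_fetch() {
    [ "$1" = "$GOOD_SERVER" ]
}

# Try each server once, in the order given, and stop at the first
# one that answers. Returns non-zero when the list is exhausted.
fetch_volfile() {
    for server in "$@"; do
        if try_fetch "$server"; then
            echo "$server"
            return 0
        fi
    done
    return 1
}

GOOD_SERVER=server3
fetch_volfile server1 server2 server3
# prints: server3
```

With this shape a dead first server costs one failed attempt rather than a whole retry budget, which matches the priority on getting the configuration quickly.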

