Dynamically add replicated server

From Codawiki

This document details the steps required to add a new server to your cluster and have it replicate an existing volume group. It applies to version 6.x of Coda and will not work on older versions.

Setup Replica

First, set up the new Coda server by running the configure scripts. See Optimizing Coda 6.x for help making configuration decisions. Don't create any volumes, or at least none with names identical to those you will be replicating. Use ps to confirm that codasrv has started, and check the /vice/srv/SrvLog and /vice/srv/SrvErr files to be sure there are no problems.

Note: DNS must be set up properly; otherwise, make sure your /etc/hosts, /etc/coda/realms, and /etc/coda/server.conf files are all configured correctly. Coda is very sensitive to mistakes in this area.

Replicate Volumes

Go to the scm machine (the master Coda server), which should already have one or more volumes set up and working with clients. If this isn't the case, or you need to create a new volume that you want to replicate, use the createvol_rep script and the standard documentation. Then follow these steps:

  • Add a new line to the /vice/db/servers file with the name of the new server and a unique number. The file should look like this (where scm is the resolvable host name of the scm and replica is the host name of the replica):
scm         1
replica     2
  • Restart the server on the scm so that the changes to the servers file take effect.
  • Look at the /vice/db/VRList file and pull the volume name, replica id, number of entries, and existing volume ids from the file. Here's a sample:
volumename replica_id numentries volid1  volid2  volidn
myvol      7f000000   2          1000001 2000001 0 0 0 0 0 0 0
  • Using the replica_id found in the VRList file, run this command to make sure the volume is available for conflict resolution:
# volutil setlogparms replica_id reson 4  
  • Using the information taken from the VRList file, run this command (/vicepa is the data partition location on the replica server as specified in vicetab on that server):
# volutil -h newservername create_rep /vicepa volname.numentries replica_id
  • The output from that command will include a new volume id. Put it in place of the first zero on the appropriate line of VRList, and increase the number of entries by one. Using the VRList example line from above, and assuming volutil gave us a new volume id of 3000001, we would change the line to look like this:
myvol 7f000000 3 1000001 2000001 3000001 0 0 0 0 0 0
  • Now run the commands:
# bldvldb.sh newservername
# volutil -h localhost makevrdb /vice/db/VRList
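
The VRList edit in the steps above is easy to get wrong by hand. As a minimal sketch, assuming the whitespace-separated layout shown in the sample line (name, replica id, number of entries, then the volume id slots), the following shell function prints the updated line for a volume given the new volume id reported by create_rep. The name vrlist_add_replica is hypothetical and not part of Coda; the function only prints the new line and never modifies VRList itself.

```shell
#!/bin/sh
# Hypothetical helper, not part of Coda.
# $1 = volume name, $2 = new volume id from create_rep,
# $3 = path to VRList (defaults to /vice/db/VRList).
# Prints the updated VRList line; exits non-zero if the volume is
# missing or has no free slot.
vrlist_add_replica() {
    awk -v vol="$1" -v newid="$2" '
        $1 == vol {
            found = 1
            n = $3                 # current number of entries
            slot = 4 + n           # field holding the first unused (zero) slot
            if (n >= 8 || $slot != "0") { err = 1; exit 1 }
            $slot = newid          # plug in the new volume id
            $3 = n + 1             # bump the number of entries
            print                  # fields are re-joined with single spaces
        }
        END { if (!found || err) exit 1 }
    ' "${3:-/vice/db/VRList}"
}
```

Run against the sample line with `vrlist_add_replica myvol 3000001`, this prints the edited line shown in the steps. Review the result before pasting it into VRList and running bldvldb.sh and makevrdb.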

Copy Data To New Volumes

The data is copied from the existing volumes to the new volumes by working from a client, but first the client has to recognize that the volume is now replicated.

From a client, run the following commands:

$ cfs checkservers
$ cfs checkvolumes
$ cfs strong
$ cfs flushobject volumename
$ ls -lR /coda/yourrealm

The final ls command is what does the work: listing the tree recursively touches every object in the volume, which triggers the copy to the new replica. The commands before it just make sure the client is connected and has up-to-date information. If you hit any problems, try restarting the Coda client. It is also wise to run codacon in a separate window so you can see whether you get disconnected during the update. If you do, reconnect (cfs reconnect and cfs wr should do the trick) and run the ls again.

Making Sure It Worked

There are a number of checks to make. If the replica server is brand new and has no other volumes on it, you can look in its /vicepa directory and see whether the FTREEDB file is larger than 0 bytes. The client and server checks below are also worth doing.
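
The size check can be done with the shell's -s test, which succeeds only when the file exists and is larger than 0 bytes. A minimal sketch follows; the function name ftreedb_has_data is hypothetical, and /vicepa is assumed to be the data partition listed in vicetab on the replica.

```shell
#!/bin/sh
# Hypothetical helper, not part of Coda.
# $1 = data partition directory (defaults to /vicepa).
# Succeeds only if FTREEDB exists there and is larger than 0 bytes.
ftreedb_has_data() {
    [ -s "${1:-/vicepa}/FTREEDB" ]
}
```

On the replica, `ftreedb_has_data /vicepa && echo "FTREEDB has data"` reports success once data has arrived. As noted above, this is only meaningful on a server that held no other volumes beforehand.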

From a Client

Note: if any of this doesn't work, you may want to try restarting your client and doing it again before you investigate any further.

Ask Coda which servers it thinks are hosting a particular file or directory:

$ cfs whereis /coda/realm/path/to/volume

From a Server

In the commands below, it will be assumed that scm is the resolvable host name of your master coda server and replica is the resolvable host name of your replica.

# getvolinfo scm volume_name

which should return something like this:

RPC2 connection to scm:2432 successful.
Returned volume information for volume_name
       VolumeId 7f000000
       Replicated volume (type 3)

       Type0 id 0
       Type1 id 0
       Type2 id 0
       Type3 id 7f000000
       Type4 id 0

       ServerCount 2
       Replica0 id 1000001, Server0 ip_of_scm
       Replica1 id 2000001, Server1 ip_of_replica
       Replica2 id 0, Server2 0.0.0.0
       Replica3 id 0, Server3 0.0.0.0
       Replica4 id 0, Server4 0.0.0.0
       Replica5 id 0, Server5 0.0.0.0
       Replica6 id 0, Server6 0.0.0.0
       Replica7 id 0, Server7 0.0.0.0

       VSGAddr 0
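
Reading this output by eye gets tedious with several volumes. As a rough sketch, assuming getvolinfo prints exactly the layout shown above, the following shell function reads that output on standard input, prints the volume id and server address of each replica slot in use, and fails if the number of used slots disagrees with ServerCount. The name list_replicas is hypothetical and not part of Coda.

```shell
#!/bin/sh
# Hypothetical helper, not part of Coda.
# Reads getvolinfo output on stdin. Prints "volume_id server" for every
# ReplicaN line whose id is not 0; exits non-zero if the number of such
# lines differs from the reported ServerCount.
list_replicas() {
    awk '
        $1 == "ServerCount" { expected = $2 }
        $1 ~ /^Replica[0-7]$/ {
            id = $3
            sub(/,$/, "", id)          # "1000001," -> "1000001"
            if (id != "0") { print id, $NF; used++ }
        }
        END { if (used + 0 != expected + 0) exit 1 }
    '
}
```

A typical use would be `getvolinfo scm volume_name | list_replicas`. The new replica's volume id and address should appear in the list, and a non-zero exit status suggests the volume databases are out of step.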

References

There are three threads on the mailing list that I'm aware of that have decent information on this topic: