NAME

       lamshrink - Shrink a LAM universe.

SYNTAX

       lamshrink [-dhv] [-w <delay>] <nodeid>

OPTIONS

       -d            Print detailed debugging information.

       -h            Print useful information on this command.

       -v            Be verbose.

       <nodeid>      Remove the LAM node with this ID.

       -w <delay>    Notify processes on the doomed node and pause for <delay>
                     seconds before proceeding.

DESCRIPTION

       An existing LAM session, initiated by lamboot(1), can be shrunk to
       include fewer nodes with lamshrink.  One node is removed for each
       invocation.  At a minimum, the node ID must be given on the command line.
       Once  lamshrink  completes, the node ID is invalid across the remaining
       nodes (as can be seen by running lamnodes(1)).
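
       Because each invocation removes exactly one node, removing several
       nodes takes one call per node.  The following is a minimal sketch of
       such a loop, not part of LAM itself.  The shrink_nodes function and the
       LAMSHRINK override variable are hypothetical names introduced for
       illustration, and the sketch assumes that lamshrink exits with a
       non-zero status when it fails.

```shell
#!/bin/sh
# Sketch: remove several LAM nodes, one lamshrink invocation per node.
# LAMSHRINK is a hypothetical override (not a LAM feature) so the loop
# can be dry-run, for example with LAMSHRINK="echo lamshrink".
LAMSHRINK=${LAMSHRINK:-lamshrink}

shrink_nodes() {
    # $1 is the warning delay in seconds; the remaining arguments are node IDs.
    delay=$1
    shift
    for node in "$@"; do
        # Assumption: lamshrink returns non-zero on failure.  Stop at the
        # first failure so the remaining nodes are left untouched.
        $LAMSHRINK -v -w "$delay" "$node" || return 1
    done
}
```

       For example, "shrink_nodes 10 n3 n2" would warn and remove n3, then
       warn and remove n2.  Running lamnodes(1) afterwards shows which node
       IDs remain valid.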

       Existing application processes on the target  node  can  be  warned  of
       impending  shutdown with the -w option.  A LAM signal (SIGFUSE) will be
       sent to these processes and lamshrink will then  pause  for  the  given
       number  of  seconds  before  proceeding  with  removing  the  node.  By
       default, SIGFUSE is ignored.  A different handler can be installed with
       ksignal(2).

       All application processes on all remaining nodes are always informed of
       the death of a node.  This is also  done  with  a  signal  (SIGSHRINK),
       which  by  default causes a process's runtime route cache to be flushed
       (to remove any cached information on the dead node).  If this signal is
       re-vectored  for the purpose of fault tolerance, the old handler should
       be called at the beginning of the new handler.  The signal does not, by
       itself,  give  the  process information on which node has been removed.
       One technique for getting this information is to query the router for
       information on all relevant nodes using getroute(2).  The call returns
       an error for the dead node.

   FAULT TOLERANCE
       If enabled with lamboot(1), LAM will watch for nodes  that  fail.   The
       procedure  for removing a node that has failed is the same as lamshrink
       after the  warning  step.   In  particular,  the  SIGSHRINK  signal  is
       delivered.

EXAMPLES

       lamshrink -v n1
           Remove LAM on n1.  Report about important steps as they are done.

       lamshrink n30 -w 10
           Inform all processes on LAM node 30 that the node will be dead in
           10 seconds.  Wait 10 seconds, then remove the node.  Operate
           silently.

SEE ALSO

       lamboot(1), lamnodes(1), ksignal(2), getroute(2)