NAME

       lamtrace - Unload LAM trace data.

SYNTAX

       lamtrace   [-hkvR]  [-mpi]  [-l  <listno>]  [-f  <#secs>]  [<filename>]
              [<nodes>] [<processes>]

OPTIONS

       -h            Print useful information on this command.

       -k            Copy and do not remove trace data.

       -v            Be verbose.

       -R            Delete all trace data from the specified nodes.

       -l <listno>   Unload only from the given list number.

       -mpi          Unload trace data for an MPI application.

       -f <#secs>    Signal target  processes  to  flush  trace  data  to  the
                     daemon.  Then wait <#secs> before unloading.

       <filename>    Place trace data into this file (default: def.lamtr).

DESCRIPTION

       The  -t  option  of  mpirun(1)  and loadgo(1) allows the application to
       generate execution traces.  These traces are first stored in  a  buffer
       within each application process.  When the buffer is full or when the
       application terminates, the runtime buffer is flushed to the trace
       daemon (a structural component within the LAM daemon).  The trace
       daemon in turn collects data only up to a pre-compiled limit.  Beyond
       this limit, the oldest traces are discarded in favor of newer ones.

       After an application has finished,  the  record  of  its  execution  is
       stored  in  the  trace  daemons  of  each  node  that  was  running the
       application.  The lamtrace command can be used to retrieve these traces
       and  store  them in one file for display by a performance visualization
       tool, such as xmpi(1).  If the  application  was  started  by  xmpi(1),
       lamtrace  is  not  normally  needed  as the equivalent functionality is
       invoked with a button.

       Incomplete trace data can be unloaded while the application is running.
       The  output  file must not exist prior to invoking lamtrace.  This is a
       good situation to use the -k option, which preserves the trace daemon's
       contents after unloading.  Each subsequent unload will then get the
       entire run's trace data up to the present time.
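
       A minimal sketch of taking repeated snapshots this way follows.  The
       helper name snapshot_traces and the <prefix>-<n>.lamtr naming scheme
       are illustrative assumptions and not part of LAM; only the -k option
       and the rule that the output file must not already exist come from
       the text above.

```shell
# Sketch only: snapshot_traces and the file naming are hypothetical.
# Usage: snapshot_traces [<prefix>]
snapshot_traces() {
    prefix=${1:-snap}
    n=0
    # lamtrace needs an output file that does not exist yet,
    # so pick the first unused name.
    while [ -e "$prefix-$n.lamtr" ]; do
        n=$((n + 1))
    done
    # -k keeps the trace daemon's copy, so the next snapshot is complete too.
    lamtrace -k "$prefix-$n.lamtr" && echo "$prefix-$n.lamtr"
}
```

       Each call prints the name of the file it wrote, so the most recent
       snapshot can be handed to a visualization tool.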

       A running process is likely to be holding the most recent trace data in
       an internal buffer.  A standard LAM signal, LAM_SIGTRACE (see doom(1)),
       causes trace enabled processes to flush the internal  trace  buffer  to
       the  daemon.   The  -f option tells lamtrace to send this signal to all
       target processes before unloading trace data.  This creates a race
       between the target processes storing trace data to the daemon and the
       unloading procedure.  lamtrace does not resolve the race itself: the
       user must give a delay after -f, in seconds, that is long enough for
       the flush to complete.
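
       The following sketch wraps a flush-then-unload call.  The helper name
       flush_and_unload is hypothetical, and the check that the delay is a
       whole number of seconds is an assumption about what <#secs> accepts.
       A suitable delay depends on the application and has to be found by
       experiment.

```shell
# Sketch only: flush_and_unload is a hypothetical helper name.
# Usage: flush_and_unload <delay-secs> <output-file> [<nodes>] [<processes>]
flush_and_unload() {
    [ "$#" -ge 2 ] || { echo "usage: flush_and_unload secs file [targets]" >&2; return 2; }
    delay=$1
    out=$2
    shift 2
    # Assumption: the delay is a non-negative whole number of seconds.
    case $delay in
        ''|*[!0-9]*) echo "delay must be a whole number of seconds" >&2; return 2 ;;
    esac
    # The output file must not exist prior to invoking lamtrace.
    if [ -e "$out" ]; then
        echo "$out already exists" >&2
        return 1
    fi
    # -f signals the target processes to flush, then waits $delay seconds.
    lamtrace -f "$delay" "$out" "$@"
}
```

       For example, "flush_and_unload 5 run.lamtr n30" would run
       "lamtrace -f 5 run.lamtr n30".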

       Trace  data  are organized by node, process identifier and list number.
       A process can store traces on any node, although the local node is  the
       obvious,  least  intrusive  choice.  The process can identify itself in
       any meaningful way (getpid(2) is a good idea).  The list number is also
       chosen  by  the  process.   These  values may be set by an instrumented
       library, such  as  libmpi(3),  or  directly  by  the  application  with
       lam_rtrstore(2).  Unloading is as flexible as storing: the -l option
       selects the list number, and standard LAM command line mnemonics
       select nodes and processes.

       Dropping  old  traces  when a pre-compiled volume limit is reached only
       happens for positive list numbers.  Traces in negatively numbered lists
       will  be  collected until the underlying system runs out of memory.  Do
       not use negative list numbers for high volume trace data.

       If no process selection is given on the command line, trace  data  will
       be unloaded for all processes on each specified node.

       LAM,  its  trace  daemon and lamtrace are all unaware of the format and
       meaning of traces.

       The -R option does not unload trace data.  It causes the  target  trace
       daemons  to  free  the memory occupied by trace data in the given list.
       If all lists  are  specified  (no  -l  option),  the  trace  daemon  is
       effectively reset to its state after initiating LAM.
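
       The two forms of -R described above can be sketched as a pair of
       shell helpers.  Both function names are hypothetical; the node
       arguments are whatever node mnemonics would otherwise be given to
       lamtrace, such as n30.

```shell
# Sketch only: both helper names are hypothetical.

# Free a single list on the given nodes, e.g.: clear_trace_list 5 n30
clear_trace_list() {
    list=$1
    shift
    lamtrace -R -l "$list" "$@"
}

# Free every list on the given nodes (no -l), e.g.: reset_trace_daemons n30
reset_trace_daemons() {
    lamtrace -R "$@"
}
```

       Neither helper writes an output file, since -R does not unload
       anything.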

   Unloading MPI Trace Data
       A special capability, selected by the -mpi option, exists to search for
       and unload only the trace data generated by an  MPI  application.   For
       this purpose, lamtrace is aware of the particular reserved list numbers
       that libmpi(3) uses to  store  traces.   It  begins  by  searching  all
       specified  nodes and processes (the whole LAM multicomputer, if nothing
       is specified) for a special  trace  generated  by  process  rank  0  in
       MPI_COMM_WORLD  of an MPI application.  This special trace contains the
       node and process identifiers of all processes  in  that  MPI_COMM_WORLD
       communicator.   lamtrace  then  uses  the node / process information to
       collect all trace data generated by libmpi(3).

       If multiple world communicators exist within LAM's trace  daemons,  the
       first  one  found  is  used.   Multiple  worlds  may  be present due to
       multiple concurrent applications, trace data from a  previous  run  not
       removed  (either  with lamtrace or lamclean(1)), or an application that
       spawns processes.  A particular world communicator can  be  located  by
       providing precise node and process location to lamtrace.

       The -mpi option is not compatible with the -l option.
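
       A wrapper script can catch this conflict before lamtrace is invoked.
       The check below is a sketch under simple assumptions: the function
       name check_lamtrace_args is hypothetical, and it only recognizes -mpi
       and -l when each appears as a separate argument.

```shell
# Sketch only: check_lamtrace_args is a hypothetical helper name.
# Returns non-zero if both -mpi and -l appear among the arguments.
check_lamtrace_args() {
    has_mpi=0
    has_l=0
    for arg in "$@"; do
        case $arg in
            -mpi) has_mpi=1 ;;
            -l)   has_l=1 ;;
        esac
    done
    if [ "$has_mpi" -eq 1 ] && [ "$has_l" -eq 1 ]; then
        echo "-mpi cannot be combined with -l" >&2
        return 1
    fi
    return 0
}
```

       A wrapper would call it as: check_lamtrace_args "$@" && lamtrace "$@"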

EXAMPLES

       lamtrace -v -mpi mytraces
           Unload  trace  data  into  the  file  "mytraces" from the first MPI
           application found in a search  of  the  entire  LAM  multicomputer.
           Report on important steps as they are done.

       lamtrace n30 -l 5 p21367
           Unload  trace  data  from  list  5  of process ID 21367 on node 30.
           Operate silently.

       lamtrace -mpi n30 p21367
           Unload trace data  from  the  MPI  application  world  group  whose
           process rank 0 has PID 21367 and is/was running on node 30.

BUGS

       Since  trace  data  can  be unloaded during an application's execution,
       there should be a way to incrementally append to an output file.   This
       is a bit tricky with -mpi, but it can be done.

FILES

       def.lamtr     default output file

SEE ALSO

       mpirun(1), loadgo(1), lam_rtrstore(2), lamclean(1), libmpi(3), xmpi(1)