NAME

       compalign - compare two multiple alignments

SYNOPSIS

       compalign [-options] <trusted-alignment> <test-alignment>

DESCRIPTION

       compalign  calculates  the  fractional  "identity"  between the trusted
       alignment and the test alignment. The two files  must  contain  exactly
       the same sequences, in exactly the same order.

       The  identity  of  the  multiple  sequence alignments is defined as the
       averaged identity over all N(N-1)/2 pairwise alignments.

       The fractional identity between a trusted pairwise alignment and a test
       pairwise alignment is in turn defined as follows (for aligned trusted,
       or "known", sequences k1 and k2, and aligned test sequences t1 and t2):

            matched columns / total columns

            where total columns = the total number of columns in which there is
            a valid (nongap) symbol in k1 or k2;

            matched columns = the number of columns in which one of the
            following is true:

                 k1 and k2 both have valid symbols at a given column; the same
                 two symbols are also aligned to each other in a column of the
                 t1/t2 alignment;

                 k1 has a symbol aligned to a gap in k2; that symbol in t1 is
                 also aligned to a gap;

                 k2 has a symbol aligned to a gap in k1; that symbol in t2 is
                 also aligned to a gap.
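
       The definition above can be sketched in Python as follows. This is  an
       illustration of the rules as stated on this page, not compalign's  own
       source code. The names partner_map and pairwise_identity are
       hypothetical, and the gap characters ("-" and ".") and the 0.0  result
       for empty input are assumptions.

```python
GAP_CHARS = set("-.")  # assumption: the characters treated as gaps


def partner_map(a, b):
    """For each residue of aligned row a, give the index of the residue of
    aligned row b that shares its column, or None if b has a gap there."""
    if len(a) != len(b):
        raise ValueError("aligned rows must have the same length")
    partners = []
    b_index = 0
    for char_a, char_b in zip(a, b):
        b_here = None
        if char_b not in GAP_CHARS:
            b_here = b_index
            b_index += 1
        if char_a not in GAP_CHARS:
            partners.append(b_here)
    return partners


def pairwise_identity(k1, k2, t1, t2):
    """Fraction of trusted (k1/k2) columns reproduced by the test (t1/t2)."""
    known_1, known_2 = partner_map(k1, k2), partner_map(k2, k1)
    test_1, test_2 = partner_map(t1, t2), partner_map(t2, t1)
    if len(known_1) != len(test_1) or len(known_2) != len(test_2):
        raise ValueError("trusted and test rows must hold the same sequences")

    # Columns with a symbol in k1: either paired with the same k2 symbol in
    # the test alignment, or aligned to a gap in both alignments.
    matched = sum(1 for k, t in zip(known_1, test_1) if k == t)
    # Columns with a k2 symbol opposite a gap in k1, also gapped in the test.
    matched += sum(
        1 for k, t in zip(known_2, test_2) if k is None and t is None
    )

    # Total columns: every k1 symbol, plus k2 symbols opposite a gap in k1.
    total = len(known_1) + sum(1 for k in known_2 if k is None)
    return matched / total if total else 0.0


# Trusted columns: A/A, C/C, -/C, G/G, T/-. The test alignment moves the G.
print(pairwise_identity("AC-GT", "ACCG-", "ACG-T", "ACCG-"))
```

       In this example three of the five trusted columns (A/A, C/C and the  T
       aligned to a gap) are reproduced by the test alignment.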

       Because scores for all N(N-1)/2 pairs are calculated, the algorithm is
       of order (N^2)L for N sequences of length L, so run time grows
       quadratically with the number of sequences.
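
       The averaging over all pairs, and the resulting  quadratic  growth  in
       work, can be sketched as below. This is a minimal illustration, not the
       program's implementation: average_identity, pair_count and  the
       pair_score callable (any function scoring one trusted pair against  one
       test pair) are hypothetical names, and the toy scorer in the usage
       lines is only a stand-in.

```python
from itertools import combinations


def pair_count(n):
    """Number of pairwise alignments among n sequences: N(N-1)/2."""
    return n * (n - 1) // 2


def average_identity(trusted, test, pair_score):
    """Average pair_score over every pair of rows.

    trusted and test are lists of aligned rows holding the same sequences
    in the same order; pair_score(k1, k2, t1, t2) returns a fraction.
    """
    if len(trusted) != len(test):
        raise ValueError("both alignments must contain the same sequences")
    pairs = list(combinations(range(len(trusted)), 2))
    if not pairs:
        raise ValueError("need at least two sequences")
    total = sum(
        pair_score(trusted[i], trusted[j], test[i], test[j])
        for i, j in pairs
    )
    return total / len(pairs)


def exact_match(k1, k2, t1, t2):
    """Toy stand-in scorer: 1.0 if the pair is aligned identically."""
    return 1.0 if (k1, k2) == (t1, t2) else 0.0


rows = ["AC-GT", "ACCG-", "ACCGT"]
print(average_identity(rows, rows, exact_match))
print(pair_count(10), pair_count(1000))
```

       Going from 10 to 1000 sequences raises the number of pairs from 45  to
       499500, which is why the cost is dominated by the N^2 term.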

OPTIONS

       Available options:

       -h     Print short help and usage info.

       -c     Only compare columns under the marked #=CS consensus structure
              line.

       --informat <s>
              Specify that  both  alignments  are  in  format  <s>  (MSF,  for
              instance).

       --quiet
              Suppress verbose header (used in regression testing).

SEE ALSO

       afetch(1),    alistat(1),   compstruct(1),   revcomp(1),   seqsplit(1),
       seqstat(1),    sfetch(1),    shuffle(1),    sindex(1),    sreformat(1),
       stranslate(1), weight(1).

AUTHOR

       Sean Eddy
       HHMI/Department of Genetics
       Washington University School of Medicine
       4444 Forest Park Blvd., Box 8510
       St Louis, MO 63108 USA
       Phone: 1-314-362-7666
       FAX  : 1-314-362-2157
       Email: eddy@genetics.wustl.edu

       This manual page was written by Nelson A. de Oliveira <naoliv@gmail.com>,
       for the Debian project (but may be used by others).

                        Mon, 01 Aug 2005 15:28:08 -0300