The "background probabilities" (by default, dinucleotide
probabilities) that Sigma uses are, if no other options are given,
extracted from the input sequence itself.  This is generally not
advisable: it is better to estimate them from larger quantities of
similar sequence (e.g., all intergenic sequence in the organism of
interest).  Sigma offers two options for this:  with the -b option,
one can supply an auxiliary file (in fasta format) containing such
sequences, or with the -B option, one can supply a file containing
just the single-nucleotide and di-nucleotide frequencies.  In the
latter file, each line either begins with # (and is ignored), or
contains a single nucleotide or di-nucleotide, whitespace, and the
frequency.  All possible single- and di-nucleotides should be
present; tri-nucleotides and higher will be ignored.

An example file, yeast_bg, is in this directory; it corresponds to
intergenic sequence in S. cerevisiae.
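
If you need to build a -B style file yourself, the following Python
sketch may help.  It is not part of Sigma: the function names are
hypothetical, and it assumes that "frequency" means a normalised
fraction (singles summing to 1, di-nucleotides summing to 1), that
only one strand is counted, and that positions containing anything
other than A, C, G or T are skipped.  Compare the output against
yeast_bg before relying on it, since the exact conventions Sigma
expects are not spelled out here.

```python
from collections import Counter
from itertools import product

BASES = "ACGT"


def read_fasta(lines):
    """Yield one uppercase sequence string per fasta record."""
    chunks = []
    for line in lines:
        line = line.strip()
        if line.startswith(">"):
            if chunks:
                yield "".join(chunks).upper()
            chunks = []
        elif line:
            chunks.append(line)
    if chunks:
        yield "".join(chunks).upper()


def background_frequencies(seqs):
    """Return a dict of single- and di-nucleotide fractions.

    Assumption: fractions are normalised separately for singles and
    for di-nucleotides; ambiguous bases (e.g. N) are skipped.
    """
    single, double = Counter(), Counter()
    for seq in seqs:
        for i, base in enumerate(seq):
            if base not in BASES:
                continue
            single[base] += 1
            if i + 1 < len(seq) and seq[i + 1] in BASES:
                double[seq[i:i + 2]] += 1
    n1 = sum(single.values())
    n2 = sum(double.values())
    freqs = {}
    # Every single and di-nucleotide is emitted, even if unobserved.
    for b in BASES:
        freqs[b] = single[b] / n1 if n1 else 0.0
    for a, b in product(BASES, repeat=2):
        freqs[a + b] = double[a + b] / n2 if n2 else 0.0
    return freqs


def format_background(freqs, comment="background frequencies"):
    """Format as: '#' comment line, then 'nucleotide<TAB>frequency'."""
    lines = ["# " + comment]
    lines += ["%s\t%.6f" % (key, value) for key, value in freqs.items()]
    return "\n".join(lines) + "\n"
```

To use it, open a fasta file of intergenic sequence, pass the file
handle to read_fasta, pass the result to background_frequencies, and
write the string returned by format_background to a file that is then
given to Sigma with -B.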
