=head1 NAME

iPE::SequenceReader::NoLoad - Base class of sequence readers which do not store the actual data in memory.

=head1 DESCRIPTION

NoLoad sequences save memory by never loading the sequence itself into memory.  This base class reads only the size of the file, the header, and the length of the header.  Typically all NoLoad sequences will have a FASTA style header, regardless of whether they conform to FASTA standards.

A subclass is needed to implement the routines that actually retrieve sequence data (getSeq, getRevSeq and getContext).

=head1 FUNCTIONS

=over 8

=cut

package iPE::SequenceReader::NoLoad;

use base("iPE::SequenceReader");
use Carp;
use strict;

=item new (members)

This initializes the filehandle (by inheritance), then retrieves the header and the file size.  Subclasses will need to do more with the information provided by the file in order to calculate subsequence offsets properly.
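
As a rough illustration of this bookkeeping, the standalone sketch below gathers the same three values from a FASTA style file without using this class.  The helper name read_header_info and the temporary file are hypothetical and exist only for the example.

    use strict;
    use warnings;
    use File::Temp qw(tempfile);

    # Hypothetical helper: collect the file size, the cleaned header
    # and the raw header length of a FASTA style file.
    sub read_header_info {
        my ($filename) = @_;
        open my $fh, '<', $filename or die "Cannot open $filename: $!";
        binmode $fh;
        my $raw = <$fh>;                  # first line, terminator included
        close $fh;
        die "$filename is empty\n" unless defined $raw;

        my $header_length = length $raw;  # counts ">" and the newline
        (my $header = $raw) =~ s/^>//;    # drop the leading ">"
        chomp $header;

        my %info = (
            fileSize     => (-s $filename),
            header       => $header,
            headerLength => $header_length,
        );
        return \%info;
    }

    # Small temporary file to run the helper against.
    my ($tmp_fh, $tmp_name) = tempfile(UNLINK => 1);
    binmode $tmp_fh;
    print {$tmp_fh} ">chr1 test\nACGTACGT\n";
    close $tmp_fh;

    my $info = read_header_info($tmp_name);
    printf "%s: %d header bytes, %d bytes in total\n",
        $info->{header}, $info->{headerLength}, $info->{fileSize};

The header length is taken before the line is cleaned up, because it is the raw byte count that a subclass needs when it seeks past the header.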

=cut
sub new {
    my $class = shift;
    my $this = $class->SUPER::new(@_);

    my $fh = $this->fh;
    $this->{fileSize_} = (-s $this->filename);
    my $header = <$fh>;
    $this->{headerLength_} = length($header);
    chomp $header;
    $header =~ s/^>//;  # strip only the leading ">" of the FASTA header
    $this->{header_} = $header;
    
    seek $fh, 0, 0;
    
    return $this;
}

sub DESTROY {
    my ($this) = @_;

    close $this->fh;
}

=item fileSize (), header (), headerLength ()

fileSize () is the total size of the file in bytes.  header () is the FASTA header with the line ending and the ">" removed.  headerLength () is the length of the original header line, including the ">" and the line ending.

=cut
sub fileSize        { shift->{fileSize_}        }
sub header          { shift->{header_}          }
sub headerLength    { shift->{headerLength_}    }

sub _undefed_subroutine_for_noload {
    my ($this, $name) = @_;
    croak __PACKAGE__." does not define subroutine $name.\n".
       "This is a NoLoad type sequence, so you must get pieces of the sequence ".
       "at a time.\n"
}

sub _undefed_subroutine {
    my ($this, $name) = @_;
    die __PACKAGE__." does not define subroutine $name.\n".
       "Override in ".ref($this)."\n";
}

sub def     { shift->_undefed_subroutine_for_noload("def")  }
sub seqRef  { shift->_undefed_subroutine_for_noload("seqRef") }
sub arr     { shift->_undefed_subroutine_for_noload("arr")  }
sub length  { shift->_undefed_subroutine("length")          }

sub next    { 
    croak __PACKAGE__." does not define next.\n".
        "NoLoad type sequences only use a single header-sequence(s) pair.\n"
}

=item getSeq (start, end[, seqNum]), getRevSeq (start, length)

These routines are not implemented in the base class, where calling them dies.  They serve as prototypes for all sequences which fall under the NoLoad category.  A NoLoad sequence should implement getSeq so that it retrieves the subsequence from start to end of the seqNum-th sequence (0 by default if seqNum is not given).  Note that getRevSeq also reverse-complements the sequence.
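
A subclass could satisfy this contract by seeking to the requested offset and reading only the bytes it needs.  The sketch below is not taken from any existing subclass.  The names get_seq and get_rev_seq are hypothetical, and it assumes 0-based inclusive coordinates, a sequence stored on a single line directly after the header, and a plain ACGT alphabet.

    use strict;
    use warnings;
    use File::Temp qw(tempfile);

    # Hypothetical stand-in for a subclass getSeq: seek past the header
    # and read only the requested bytes.
    # Assumes 0-based inclusive coordinates and a one-line sequence.
    sub get_seq {
        my ($fh, $header_length, $start, $end) = @_;
        my $want = $end - $start + 1;
        seek($fh, $header_length + $start, 0) or die "seek failed: $!";
        my $got = read($fh, my $seq, $want);
        die "short read\n" unless defined $got && $got == $want;
        return $seq;
    }

    # Same region, reversed and complemented (ACGT only in this sketch).
    sub get_rev_seq {
        my $seq = scalar reverse get_seq(@_);
        $seq =~ tr/ACGTacgt/TGCAtgca/;
        return $seq;
    }

    # Small temporary file to run the sketch against.
    my ($tmp_fh, $tmp_name) = tempfile(UNLINK => 1);
    binmode $tmp_fh;
    print {$tmp_fh} ">chr1 test\nACGTAACC\n";
    close $tmp_fh;

    open my $fh, '<', $tmp_name or die "Cannot open $tmp_name: $!";
    binmode $fh;
    my $first_line    = <$fh>;
    my $header_length = length $first_line;

    my $fwd = get_seq($fh, $header_length, 2, 5);
    my $rev = get_rev_seq($fh, $header_length, 2, 5);
    print "forward: $fwd, reverse complement: $rev\n";
    close $fh;

Formats that wrap the sequence over several lines or hold more than one sequence need extra offset arithmetic, which is why this is left to the subclass.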

=cut
sub getSeq      { shift->_undefed_subroutine("getSeq")     }
sub getRevSeq   { shift->_undefed_subroutine("getRevSeq")  }

=item getContext (pos, order[, targetOrder])

This routine (again, unimplemented in the base class) is a prototype for all NoLoad sequences.  It should return the context of the base at position pos for the given order, in whatever format is expected for this sequence type.  For example, if you pass 15, 1, it will return the base at position 15 together with the previous base.  targetOrder applies to multiple sequence NoLoads, and is relevant when a different context size is expected for the target sequence than for the informant sequences.
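
For a single sequence, the context lookup could be sketched as follows.  This is only an illustration.  The name get_context is hypothetical, targetOrder is left out, and the sketch assumes 0-based positions, a sequence stored on a single line directly after the header, and a context that is simply cut short near the start of the sequence.

    use strict;
    use warnings;
    use File::Temp qw(tempfile);

    # Hypothetical stand-in for a subclass getContext: return the base
    # at $pos preceded by up to $order earlier bases.
    sub get_context {
        my ($fh, $header_length, $pos, $order) = @_;
        my $start = $pos - $order;
        $start = 0 if $start < 0;    # assumption: truncate at the start
        my $want = $pos - $start + 1;
        seek($fh, $header_length + $start, 0) or die "seek failed: $!";
        my $got = read($fh, my $context, $want);
        die "short read\n" unless defined $got && $got == $want;
        return $context;
    }

    # Small temporary file to run the sketch against.
    my ($tmp_fh, $tmp_name) = tempfile(UNLINK => 1);
    binmode $tmp_fh;
    print {$tmp_fh} ">chr1 test\nACGTACGTACGTACGTACGT\n";
    close $tmp_fh;

    open my $fh, '<', $tmp_name or die "Cannot open $tmp_name: $!";
    binmode $fh;
    my $first_line    = <$fh>;
    my $header_length = length $first_line;

    my $order1 = get_context($fh, $header_length, 15, 1);
    my $order2 = get_context($fh, $header_length, 15, 2);
    my $edge   = get_context($fh, $header_length, 0, 2);
    print "order 1: $order1, order 2: $order2, at start: $edge\n";
    close $fh;

A real subclass may prefer to pad or reject positions where the full context is not available.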

=cut
sub getContext  { shift->_undefed_subroutine("getContext")     }

sub type { "noload" }

=back

=head1 SEE ALSO

L<iPE::SequenceReader>

=head1 AUTHOR

Bob Zimmermann (rpz@cse.wustl.edu) 
(With much acknowledgement to Sam Gross's code).

=cut
1;
