public abstract class DefaultAbstractBag extends Object implements DataBag
| Modifier and Type | Class and Description |
|---|---|
| static class | DefaultAbstractBag.BagDelimiterTuple |
| static class | DefaultAbstractBag.EndBag |
| static class | DefaultAbstractBag.StartBag |
| Modifier and Type | Field and Description |
|---|---|
| static Tuple | endBag |
| protected static int | MAX_SPILL_FILES |
| protected Collection<Tuple> | mContents |
| protected long | mSize |
| protected FileList | mSpillFiles |
| static Tuple | startBag |
| Constructor and Description |
|---|
| DefaultAbstractBag() |
| Modifier and Type | Method and Description |
|---|---|
| void | add(Tuple t): Add a tuple to the bag. |
| void | addAll(Collection<Tuple> c) |
| void | addAll(DataBag b): Add contents of a bag to the bag. |
| void | addAll(Iterable<Tuple> iterable): Add contents of an iterable (a collection or a DataBag). |
| void | clear(): Clear out the contents of the bag, both on disk and in memory. |
| int | compareTo(Object other): This method is potentially very expensive since it may require a sort of the bag; don't call it unless you have to. |
| boolean | equals(Object other) |
| long | getMemorySize(): Return the size of memory usage. |
| protected DataOutputStream | getSpillFile(): Get a file to spill contents to. |
| int | hashCode() |
| protected void | incSpillCount(Enum counter) |
| protected void | incSpillCount(Enum counter, long numRecsSpilled) |
| protected void | markSpillableIfNecessary(): All bag implementations that can get big enough to be spilled should call this method after every time they add an element. |
| void | markStale(boolean stale): This is used by FuncEvalSpec.FakeDataBag. |
| void | readFields(DataInput in): Read a bag from disk. |
| protected void | reportProgress(): Report progress to HDFS. |
| protected void | sampleContents(): Sample every SPILL_SAMPLE_FREQUENCYth tuple until we reach a max of SPILL_SAMPLE_SIZE to get an estimate of the tuple sizes. |
| long | size(): Get the number of elements in the bag, both in memory and on disk. |
| String | toString(): Write the bag into a string. |
| protected void | warn(String msg, Enum warningEnum, Throwable e) |
| void | write(DataOutput out): Write a bag's contents to disk. |
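The sampling idea described for sampleContents() can be illustrated with a small, self-contained sketch. This is not Pig code: the class name, the `sizeOf` cost model, the String stand-in for Tuple, and the two constant values are all assumptions made for illustration, and the page does not say how the real class combines the sample into an estimate.

```java
import java.util.ArrayList;
import java.util.List;

/** Illustrative stand-in for a bag that estimates its memory use by sampling. Not Pig code. */
final class SampledSizeBagSketch {
    // Hypothetical values; the real constants are not shown on this page.
    static final int SPILL_SAMPLE_FREQUENCY = 10;
    static final int SPILL_SAMPLE_SIZE = 100;

    // Strings stand in for tuples.
    private final List<String> contents = new ArrayList<>();

    void add(String tuple) {
        contents.add(tuple);
    }

    /** Hypothetical per-element cost model: two bytes per character. */
    static long sizeOf(String tuple) {
        return 2L * tuple.length();
    }

    /** Measures every SPILL_SAMPLE_FREQUENCYth element, up to SPILL_SAMPLE_SIZE samples, then scales up. */
    long estimateMemorySize() {
        long sampledBytes = 0;
        int sampled = 0;
        int index = 0;
        for (String tuple : contents) {
            if (index % SPILL_SAMPLE_FREQUENCY == 0) {
                sampledBytes += sizeOf(tuple);
                sampled++;
                if (sampled >= SPILL_SAMPLE_SIZE) {
                    break; // enough samples collected
                }
            }
            index++;
        }
        if (sampled == 0) {
            return 0; // empty bag
        }
        // Average sampled size multiplied by the total element count.
        return (sampledBytes * contents.size()) / sampled;
    }

    public static void main(String[] args) {
        SampledSizeBagSketch bag = new SampledSizeBagSketch();
        for (int i = 0; i < 1000; i++) {
            bag.add("abcd");
        }
        System.out.println("estimated bytes: " + bag.estimateMemorySize());
    }
}
```

Sampling keeps the cost of the estimate bounded regardless of bag size, at the price of accuracy when element sizes vary widely.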
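Methods inherited from class java.lang.Object: clone, finalize, getClass, notify, notifyAll, wait, wait, wait
Methods inherited from interface DataBag: isDistinct, isSorted, iterator
Methods inherited from interface java.lang.Iterable: forEach, spliterator
protected Collection<Tuple> mContents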
protected FileList mSpillFiles
protected long mSize
public static final Tuple startBag
public static final Tuple endBag
protected static final int MAX_SPILL_FILES
public long size()
protected void sampleContents()
public void add(Tuple t)
protected void markSpillableIfNecessary()
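The contract stated for markSpillableIfNecessary() (call it after every add) might look like the following in a subclass. This is a sketch under stated assumptions, not the Pig implementation: the registry list, the byte threshold, the String stand-in for Tuple, and the size model are all hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

/** Illustrative sketch of the "call after every add" contract. Not Pig code. */
final class SpillRegistrationSketch {
    /** Hypothetical threshold in bytes above which a bag registers itself as spillable. */
    static final long REGISTER_THRESHOLD_BYTES = 64;

    static class SketchBag {
        private final List<String> contents = new ArrayList<>();
        // Hypothetical registry standing in for whatever tracks spillable bags.
        private final List<Object> registry;
        private boolean registered = false;

        SketchBag(List<Object> registry) {
            this.registry = registry;
        }

        /** Hypothetical size model: two bytes per character. */
        long getMemorySize() {
            long bytes = 0;
            for (String tuple : contents) {
                bytes += 2L * tuple.length();
            }
            return bytes;
        }

        /** Registers the bag at most once, when it first crosses the threshold. */
        protected void markSpillableIfNecessary() {
            if (!registered && getMemorySize() >= REGISTER_THRESHOLD_BYTES) {
                registry.add(this);
                registered = true;
            }
        }

        public void add(String tuple) {
            contents.add(tuple);
            markSpillableIfNecessary(); // called after every add, as the contract asks
        }
    }

    public static void main(String[] args) {
        List<Object> registry = new ArrayList<>();
        SketchBag bag = new SketchBag(registry);
        for (int i = 0; i < 5; i++) {
            bag.add("abcdefgh");
        }
        System.out.println("registered bags: " + registry.size());
    }
}
```

Checking on every add means small bags never pay a registration cost, while a growing bag is picked up as soon as it becomes large enough to matter.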
public void addAll(DataBag b)
addAll in interface DataBag
public void addAll(Collection<Tuple> c)
public void addAll(Iterable<Tuple> iterable)
iterable - a Collection or DataBag to add contents of
public long getMemorySize()
getMemorySize in interface Spillable
public void clear()
public int compareTo(Object other)
compareTo in interface Comparable
public void write(DataOutput out) throws IOException
write in interface org.apache.hadoop.io.Writable
out - DataOutput to write data to.
IOException - (passes it on from underlying calls).
public void readFields(DataInput in) throws IOException
readFields in interface org.apache.hadoop.io.Writable
in - DataInput to read data from.
IOException - (passes it on from underlying calls).
public void markStale(boolean stale)
protected DataOutputStream getSpillFile() throws IOException
IOException
protected void reportProgress()
protected void incSpillCount(Enum counter)
protected void incSpillCount(Enum counter, long numRecsSpilled)
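The write(DataOutput) and readFields(DataInput) pair follows the usual Writable round-trip pattern. The sketch below shows one plausible shape for it using only java.io: a length prefix followed by each element. The wire format, the class name, and the String stand-in for Tuple are assumptions for illustration; the page does not document the format the real class uses.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInput;
import java.io.DataInputStream;
import java.io.DataOutput;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;

/** Illustrative length-prefixed serialization of a bag. Not Pig code. */
final class WritableBagSketch {
    // Strings stand in for tuples.
    private final List<String> contents = new ArrayList<>();

    void add(String tuple) {
        contents.add(tuple);
    }

    long size() {
        return contents.size();
    }

    List<String> contents() {
        return contents;
    }

    /** Assumed format: element count first, then each element in order. */
    void write(DataOutput out) throws IOException {
        out.writeLong(size());
        for (String tuple : contents) {
            out.writeUTF(tuple);
        }
    }

    /** Replaces the current contents with what is read from the input. */
    void readFields(DataInput in) throws IOException {
        contents.clear();
        long count = in.readLong();
        for (long i = 0; i < count; i++) {
            contents.add(in.readUTF());
        }
    }

    /** Serializes a bag to memory and reads it back into a fresh instance. */
    static WritableBagSketch roundTrip(WritableBagSketch bag) {
        try {
            ByteArrayOutputStream buffer = new ByteArrayOutputStream();
            try (DataOutputStream out = new DataOutputStream(buffer)) {
                bag.write(out);
            }
            WritableBagSketch copy = new WritableBagSketch();
            try (DataInputStream in =
                     new DataInputStream(new ByteArrayInputStream(buffer.toByteArray()))) {
                copy.readFields(in);
            }
            return copy;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        WritableBagSketch bag = new WritableBagSketch();
        bag.add("first");
        bag.add("second");
        System.out.println(roundTrip(bag).contents());
    }
}
```

Writing the count first lets the reader know how many elements to expect without needing an end marker in the stream.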
Copyright © 2007-2017 The Apache Software Foundation