public class JsonMetadata extends Object implements LoadMetadata, StoreMetadata
| Constructor and Description |
|---|
| `JsonMetadata()` |
| `JsonMetadata(String schemaFileName, String headerFileName, String statFileName)` |
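The three constructor arguments name the metadata side files stored alongside the data. A minimal stand-in illustrating that naming convention (`SideFileNames` is a hypothetical helper, not part of Pig, and the default names `.pig_schema`, `.pig_header`, `.pig_stats` are assumed here, matching Pig's conventional side-file names):

```java
import java.nio.file.Path;

// Hypothetical helper mirroring JsonMetadata's constructor arguments:
// each name designates a metadata side file stored alongside the data.
public class SideFileNames {
    final String schemaFileName;
    final String headerFileName;
    final String statFileName;

    // Assumed defaults, matching Pig's conventional side-file names.
    SideFileNames() {
        this(".pig_schema", ".pig_header", ".pig_stats");
    }

    SideFileNames(String schemaFileName, String headerFileName, String statFileName) {
        this.schemaFileName = schemaFileName;
        this.headerFileName = headerFileName;
        this.statFileName = statFileName;
    }

    /** Resolve the schema side file relative to a data directory. */
    Path schemaFile(Path dataDir) {
        return dataDir.resolve(schemaFileName);
    }

    public static void main(String[] args) {
        SideFileNames names = new SideFileNames();
        // On POSIX paths this resolves to /data/input/.pig_schema
        System.out.println(names.schemaFile(Path.of("/data/input")));
    }
}
```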
| Modifier and Type | Method and Description |
|---|---|
| `protected Set<ElementDescriptor>` | `findMetaFile(String path, String metaname, org.apache.hadoop.conf.Configuration conf)` |
| `String[]` | `getPartitionKeys(String location, org.apache.hadoop.mapreduce.Job job)` Find what columns are partition keys for this input. |
| `ResourceSchema` | `getSchema(String location, org.apache.hadoop.mapreduce.Job job)` For JsonMetadata, the schema is considered optional. This method suppresses (and logs) any errors encountered. |
| `ResourceSchema` | `getSchema(String location, org.apache.hadoop.mapreduce.Job job, boolean isSchemaOn)` Read the schema from the JSON metadata file. If the isSchemaOn parameter is false, errors are suppressed and logged. |
| `ResourceStatistics` | `getStatistics(String location, org.apache.hadoop.mapreduce.Job job)` For JsonMetadata, statistics are considered optional. This method suppresses (and logs) any errors encountered. |
| `void` | `setFieldDel(byte fieldDel)` |
| `void` | `setPartitionFilter(Expression partitionFilter)` Set the filter for partitioning. |
| `void` | `setRecordDel(byte recordDel)` |
| `void` | `storeSchema(ResourceSchema schema, String location, org.apache.hadoop.mapreduce.Job job)` Store the schema of the data being written. |
| `void` | `storeStatistics(ResourceStatistics stats, String location, org.apache.hadoop.mapreduce.Job job)` Store statistics about the data being written. |
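The `findMetaFile` lookup rule checks the path itself, then falls back to its parent, when searching for a metadata side file such as `.pig_schema`. A simplified stand-in using `java.nio.file` (Pig's real implementation works on Hadoop `ElementDescriptor`s and also expands globs, both of which this sketch omits):

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.LinkedHashSet;
import java.util.Set;

public class MetaFileLookup {
    /**
     * Simplified analogue of JsonMetadata.findMetaFile: if {@code path} is a
     * directory and {@code path/metaname} exists, use that as the metadata
     * file; otherwise, if {@code parentPath/metaname} exists, use that.
     * Glob expansion and conflict resolution are out of scope here.
     */
    static Set<Path> findMetaFile(Path path, String metaname) throws IOException {
        Set<Path> found = new LinkedHashSet<>();
        if (Files.isDirectory(path) && Files.exists(path.resolve(metaname))) {
            found.add(path.resolve(metaname));
        } else {
            Path parent = path.toAbsolutePath().getParent();
            if (parent != null && Files.exists(parent.resolve(metaname))) {
                found.add(parent.resolve(metaname));
            }
        }
        return found;
    }

    public static void main(String[] args) throws IOException {
        Path dir = Files.createTempDirectory("data");
        Files.createFile(dir.resolve(".pig_schema"));
        // Directory case: the side file is found directly under the directory.
        System.out.println(findMetaFile(dir, ".pig_schema").size()); // 1
    }
}
```

For a data file (rather than a directory), the fallback branch finds the side file that sits next to it in the parent directory, which is how per-part files in an output directory share one schema file.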
protected Set<ElementDescriptor> findMetaFile(String path, String metaname, org.apache.hadoop.conf.Configuration conf) throws IOException

For each object represented by the path (either directly, or via a glob): if the object is a directory and path/metaname exists, use that as the metadata file; otherwise, if parentPath/metaname exists, use that as the metadata file. Resolving conflicts, merging the metadata, etc., is not handled by this method and should be taken care of by downstream code.

Parameters:
path - Path, as passed in to a LoadFunc (may be a Hadoop glob)
metaname - Metadata file designation, such as .pig_schema or .pig_stats
conf - configuration object
Throws:
IOException

public String[] getPartitionKeys(String location, org.apache.hadoop.mapreduce.Job job)

Description copied from interface: LoadMetadata
Find what columns are partition keys for this input.

Specified by:
getPartitionKeys in interface LoadMetadata
Parameters:
location - Location as returned by LoadFunc.relativeToAbsolutePath(String, org.apache.hadoop.fs.Path)
job - The Job object - this should be used only to obtain cluster properties through JobContextImpl.getConfiguration() and not to set/query any runtime job information.

public void setPartitionFilter(Expression partitionFilter) throws IOException

Description copied from interface: LoadMetadata
Set the filter for partitioning. If the implementation returns null from LoadMetadata.getPartitionKeys(String, Job), then this method is not called by the Pig runtime. This method is also not called by the Pig runtime if there are no partition filter conditions.

Specified by:
setPartitionFilter in interface LoadMetadata
Parameters:
partitionFilter - that describes the filter for partitioning
Throws:
IOException - if the filter is not compatible with the storage mechanism or contains non-partition fields.

public ResourceSchema getSchema(String location, org.apache.hadoop.mapreduce.Job job) throws IOException

For JsonMetadata, the schema is considered optional. This method suppresses (and logs) any errors encountered.

Specified by:
getSchema in interface LoadMetadata
Parameters:
location - Location as returned by LoadFunc.relativeToAbsolutePath(String, org.apache.hadoop.fs.Path)
job - The Job object - this should be used only to obtain cluster properties through JobContextImpl.getConfiguration() and not to set/query any runtime job information.
Throws:
IOException - if an exception occurs while determining the schema

public ResourceSchema getSchema(String location, org.apache.hadoop.mapreduce.Job job, boolean isSchemaOn) throws IOException

Read the schema from the JSON metadata file. If the isSchemaOn parameter is false, errors are suppressed and logged.

Parameters:
location -
job -
isSchemaOn -
Throws:
IOException

public ResourceStatistics getStatistics(String location, org.apache.hadoop.mapreduce.Job job) throws IOException

For JsonMetadata, statistics are considered optional. This method suppresses (and logs) any errors encountered.

Specified by:
getStatistics in interface LoadMetadata
Parameters:
location - Location as returned by LoadFunc.relativeToAbsolutePath(String, org.apache.hadoop.fs.Path)
job - The Job object - this should be used only to obtain cluster properties through JobContextImpl.getConfiguration() and not to set/query any runtime job information.
Throws:
IOException - if an exception occurs while retrieving statistics
See Also:
LoadMetadata.getStatistics(String, Job)

public void storeStatistics(ResourceStatistics stats, String location, org.apache.hadoop.mapreduce.Job job) throws IOException

Description copied from interface: StoreMetadata
Store statistics about the data being written.

Specified by:
storeStatistics in interface StoreMetadata
Parameters:
stats - statistics to be recorded
location - Location as returned by LoadFunc.relativeToAbsolutePath(String, org.apache.hadoop.fs.Path)
job - The Job object - this should be used only to obtain cluster properties through JobContextImpl.getConfiguration() and not to set/query any runtime job information.
Throws:
IOException

public void storeSchema(ResourceSchema schema, String location, org.apache.hadoop.mapreduce.Job job) throws IOException

Description copied from interface: StoreMetadata
Store the schema of the data being written.

Specified by:
storeSchema in interface StoreMetadata
Parameters:
schema - Schema to be recorded
location - Location as returned by LoadFunc.relativeToAbsolutePath(String, org.apache.hadoop.fs.Path)
job - The Job object - this should be used only to obtain cluster properties through JobContextImpl.getConfiguration() and not to set/query any runtime job information.
Throws:
IOException

public void setFieldDel(byte fieldDel)

public void setRecordDel(byte recordDel)
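The two delimiter setters take single bytes (for example, a tab field delimiter and a newline record delimiter) that JsonMetadata presumably applies when writing its header side file. A small sketch of how byte delimiters of this kind are used to render a header row (`HeaderWriter` is illustrative, not Pig code):

```java
// Illustrative helper, not part of Pig: shows how byte-valued field and
// record delimiters can be applied when rendering a header row.
public class HeaderWriter {
    private byte fieldDel = (byte) '\t';   // assumed default: tab
    private byte recordDel = (byte) '\n';  // assumed default: newline

    void setFieldDel(byte fieldDel) { this.fieldDel = fieldDel; }
    void setRecordDel(byte recordDel) { this.recordDel = recordDel; }

    /** Join field names with the field delimiter and terminate the record. */
    String headerRow(String... fieldNames) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < fieldNames.length; i++) {
            if (i > 0) sb.append((char) fieldDel);
            sb.append(fieldNames[i]);
        }
        sb.append((char) recordDel);
        return sb.toString();
    }

    public static void main(String[] args) {
        HeaderWriter w = new HeaderWriter();
        w.setFieldDel((byte) ',');
        // Prints "name,age,city" followed by the record delimiter (newline).
        System.out.print(w.headerRow("name", "age", "city"));
    }
}
```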
Copyright © 2007-2017 The Apache Software Foundation