Medial Code Documentation
dmlc::Parser< IndexType, DType > Class Template Reference (abstract)

Parser interface that parses input data; it is used to load the dmlc data format into your own data format. Difference between RowBlockIter and Parser: RowBlockIter caches the data internally, so the dataset can be iterated multiple times, whereas Parser holds very limited internal state and is usually used to read the data only once.

#include <data.h>

Inheritance diagram for dmlc::Parser< IndexType, DType >:
dmlc::DataIter< DType >
dmlc::data::ParserImpl< IndexType, real_t >
dmlc::data::DensifyParser< IndexType >
dmlc::data::ParserImpl< IndexType, DType >
dmlc::data::TextParserBase< IndexType, real_t >
dmlc::data::ParquetParser< IndexType, DType >
dmlc::data::TextParserBase< IndexType, DType >
dmlc::data::CSVParser< IndexType, real_t >
dmlc::data::LibFMParser< IndexType, real_t >
dmlc::data::LibSVMParser< IndexType, real_t >
dmlc::data::CSVParser< IndexType, DType >
dmlc::data::LibFMParser< IndexType, DType >
dmlc::data::LibSVMParser< IndexType, DType >
parser_test::CSVParserTest< IndexType, DType >
parser_test::LibFMParserTest< IndexType, DType >
parser_test::LibSVMParserTest< IndexType, DType >

Public Types

typedef Parser< IndexType, DType > *(* Factory) (const std::string &path, const std::map< std::string, std::string > &args, unsigned part_index, unsigned num_parts)
 Factory type of the parser.
 

Public Member Functions

virtual size_t BytesRead (void) const =0
 
Parser< uint32_t, real_t > * Create (const char *uri_, unsigned part_index, unsigned num_parts, const char *type)
 
Parser< uint64_t, real_t > * Create (const char *uri_, unsigned part_index, unsigned num_parts, const char *type)
 
Parser< uint32_t, int32_t > * Create (const char *uri_, unsigned part_index, unsigned num_parts, const char *type)
 
Parser< uint64_t, int32_t > * Create (const char *uri_, unsigned part_index, unsigned num_parts, const char *type)
 
Parser< uint32_t, int64_t > * Create (const char *uri_, unsigned part_index, unsigned num_parts, const char *type)
 
Parser< uint64_t, int64_t > * Create (const char *uri_, unsigned part_index, unsigned num_parts, const char *type)
 
Public Member Functions inherited from dmlc::DataIter< DType >
virtual ~DataIter (void) DMLC_THROW_EXCEPTION
 destructor
 
virtual void BeforeFirst (void)=0
 reset the iterator to the position before the first item
 
virtual bool Next (void)=0
 move to next item
 
virtual const DType & Value (void) const =0
 get current data
 

Static Public Member Functions

static Parser< IndexType, DType > * Create (const char *uri_, unsigned part_index, unsigned num_parts, const char *type)
 create a new instance of parser based on the "type"
 

Detailed Description

template<typename IndexType, typename DType = real_t>
class dmlc::Parser< IndexType, DType >

Parser interface that parses input data; it is used to load the dmlc data format into your own data format. Difference between RowBlockIter and Parser: RowBlockIter caches the data internally, so the dataset can be iterated multiple times, whereas Parser holds very limited internal state and is usually used to read the data only once.
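
The read-once pattern described above can be sketched with stand-in types. ToyIter, ToyLineParser, and TotalLength below are hypothetical names invented for this illustration and are not dmlc classes; they only mirror the BeforeFirst/Next/Value/BytesRead members listed on this page, with an in-memory list of lines standing in for a real input source.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Stand-in for the iterator interface listed on this page.
// Hypothetical type for illustration only, not a dmlc class.
template <typename T>
class ToyIter {
 public:
  virtual ~ToyIter() {}
  virtual void BeforeFirst() = 0;          // rewind to before the first item
  virtual bool Next() = 0;                 // advance; false when exhausted
  virtual const T& Value() const = 0;      // current item, valid after Next()
};

// Toy parser over in-memory lines. It keeps only a cursor and a byte
// counter, echoing the "very limited internal state" of a Parser.
class ToyLineParser : public ToyIter<std::string> {
 public:
  explicit ToyLineParser(const std::vector<std::string>& lines)
      : lines_(lines), pos_(0), bytes_(0) {}
  void BeforeFirst() override { pos_ = 0; bytes_ = 0; }
  bool Next() override {
    if (pos_ >= lines_.size()) return false;
    bytes_ += lines_[pos_].size();
    ++pos_;
    return true;
  }
  const std::string& Value() const override {
    assert(pos_ > 0);  // Value() is only meaningful after a successful Next()
    return lines_[pos_ - 1];
  }
  std::size_t BytesRead() const { return bytes_; }

 private:
  std::vector<std::string> lines_;
  std::size_t pos_;
  std::size_t bytes_;
};

// Single pass: consume each item as it is produced and keep only a
// running total, rather than storing the items for later passes.
std::size_t TotalLength(ToyIter<std::string>* it) {
  std::size_t total = 0;
  while (it->Next()) total += it->Value().size();
  return total;
}
```

A caller that needs several passes over the same data would either rewind with BeforeFirst and parse again, or use a caching iterator such as RowBlockIter.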

See also
RowBlockIter
Template Parameters
IndexType   type of index in RowBlock
DType       type of label and value in RowBlock

The Create function is only implemented for IndexType uint64_t and uint32_t, and for DType real_t and int.

Member Function Documentation

◆ BytesRead()

template<typename IndexType , typename DType = real_t>
virtual size_t dmlc::Parser< IndexType, DType >::BytesRead ( void  ) const
pure virtual

◆ Create()

template<typename IndexType , typename DType = real_t>
static Parser< IndexType, DType > * dmlc::Parser< IndexType, DType >::Create ( const char *  uri_,
unsigned  part_index,
unsigned  num_parts,
const char *  type 
)
static

create a new instance of parser based on the "type"

Parameters
uri_         the URI of the input; may contain an hdfs prefix
part_index   the part id of the current input
num_parts    total number of splits
type         type of dataset; can be "libsvm", "auto", ...
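
To make the relationship between part_index and num_parts concrete, here is a minimal sketch of dividing an input of a given size into num_parts contiguous byte ranges. PartRange is a hypothetical helper written for this page; it is not dmlc's splitting logic, which may align ranges to record boundaries or divide the input differently.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <utility>

// Hypothetical helper, not part of dmlc: returns the half-open byte range
// [begin, end) that part_index would cover if total_bytes were divided
// into num_parts near-equal contiguous pieces.
std::pair<std::size_t, std::size_t> PartRange(std::size_t total_bytes,
                                              unsigned part_index,
                                              unsigned num_parts) {
  assert(num_parts > 0 && part_index < num_parts);
  // Ceiling division so that the parts together cover every byte.
  std::size_t step = (total_bytes + num_parts - 1) / num_parts;
  std::size_t begin = std::min(total_bytes, step * part_index);
  std::size_t end = std::min(total_bytes, step * (part_index + 1));
  return std::make_pair(begin, end);
}
```

Under this assumed scheme, each worker passes its own part_index together with the shared num_parts, so the workers read disjoint pieces of the same input.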

When "auto" is passed, the parser type is taken from the format argument in the URI string.
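
The "auto" rule can be illustrated with a small sketch. ResolveParserType is a hypothetical helper, not a dmlc function, and it assumes the format argument appears as format=<name> in a '?'-introduced, '&'-separated argument list; the exact URI syntax dmlc accepts, and what it does when no format argument is present, are not stated on this page.

```cpp
#include <cassert>
#include <string>

// Hypothetical helper, not part of dmlc: returns the explicit type unless
// it is "auto", in which case it looks for a format=<name> argument after
// '?' in the URI. Returns an empty string when no such argument is found.
std::string ResolveParserType(const std::string& uri, const std::string& type) {
  if (type != "auto") return type;  // an explicit type always wins
  const std::string::size_type npos = std::string::npos;
  std::string::size_type q = uri.find('?');
  if (q == npos) return "";
  std::string query = uri.substr(q + 1);
  std::string::size_type pos = 0;
  while (pos <= query.size()) {
    std::string::size_type amp = query.find('&', pos);
    if (amp == npos) amp = query.size();
    std::string pair = query.substr(pos, amp - pos);  // one key=value item
    std::string::size_type eq = pair.find('=');
    if (eq != npos && pair.substr(0, eq) == "format") {
      return pair.substr(eq + 1);
    }
    pos = amp + 1;
  }
  return "";
}
```

With the documented signature, the corresponding call would look along the lines of Parser< uint32_t, real_t >::Create(uri, 0, 1, "auto"), where uri carries the format argument.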

Returns
the created parser

The documentation for this class was generated from the following file: