Medial Code Documentation
data.h File Reference

Defines the common input data structures and the interface for handling input data.

#include <string>
#include <vector>
#include <map>
#include "./base.h"
#include "./io.h"
#include "./logging.h"
#include "./registry.h"


Data Structures

class  dmlc::DataIter< DType >
 Data iterator interface. This is not a C++-style iterator, but it is convenient for pulling data. Behind this interface the system can apply optimizations such as pre-fetching from disk and pre-computation.
 
class  dmlc::Row< IndexType, DType >
 One row of data: a single training instance.
 
struct  dmlc::RowBlock< IndexType, DType >
 A block of data containing several rows in sparse matrix form. This is useful for streaming-style algorithms that scan through rows of data; examples include SGD, GD, L-BFGS, and k-means.
 
class  dmlc::RowBlockIter< IndexType, DType >
 Row block iterator interface: a data structure that holds the data and yields RowBlocks. Difference between RowBlockIter and Parser: RowBlockIter caches the data internally, so it can be used to iterate over the dataset multiple times; Parser holds very limited internal state and is usually used to read the data only once.
 
class  dmlc::Parser< IndexType, DType >
 Parser interface that parses input data; used to load the dmlc data format into your own data format. Difference between RowBlockIter and Parser: RowBlockIter caches the data internally, so it can be used to iterate over the dataset multiple times; Parser holds very limited internal state and is usually used to read the data only once.
 
struct  dmlc::ParserFactoryReg< IndexType, DType >
 Registry entry of the parser factory.
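
The "several rows in sparse matrix form" layout described for RowBlock can be pictured as a compressed sparse row (CSR) structure. The sketch below is only an illustration under that assumption: `SparseBlock`, `RowDot`, and `MakeExampleBlock` are hypothetical names and are not the dmlc::RowBlock definition. It shows how a streaming algorithm might scan one row at a time.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical stand-in for a sparse row block (NOT dmlc::RowBlock itself).
// Entries of row i live in the half-open range [offset[i], offset[i + 1]).
template <typename IndexType, typename DType = float>
struct SparseBlock {
  std::vector<std::size_t> offset;  // row boundaries, size = rows + 1
  std::vector<IndexType> index;     // feature index of each stored entry
  std::vector<DType> value;         // feature value of each stored entry
  std::vector<DType> label;         // one label per row

  std::size_t NumRows() const { return label.size(); }
};

// Dot product of one sparse row with a dense weight vector,
// the basic step of a streaming scan such as SGD.
template <typename IndexType, typename DType>
DType RowDot(const SparseBlock<IndexType, DType>& block, std::size_t row,
             const std::vector<DType>& weight) {
  DType sum = 0;
  for (std::size_t j = block.offset[row]; j < block.offset[row + 1]; ++j) {
    sum += block.value[j] * weight[block.index[j]];
  }
  return sum;
}

// Builds a two-row block:
//   row 0: label 1, features {0: 1.0, 2: 2.0}
//   row 1: label 0, features {1: 3.0}
inline SparseBlock<std::uint32_t, float> MakeExampleBlock() {
  SparseBlock<std::uint32_t, float> b;
  b.offset = {0, 2, 3};
  b.index = {0, 2, 1};
  b.value = {1.0f, 2.0f, 3.0f};
  b.label = {1.0f, 0.0f};
  return b;
}
```

Only the non-zero entries are stored, so a pass over the block costs time proportional to the number of stored entries rather than to rows times features.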
 

Namespaces

namespace  dmlc
 namespace for dmlc
 

Macros

#define __DMLC_COMMA   ,
 
#define DMLC_REGISTER_DATA_PARSER(IndexType, DataType, TypeName, FactoryFunction)
 Register a new distributed parser to dmlc-core.
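
A comma macro such as __DMLC_COMMA is typically needed because the preprocessor splits macro arguments on commas, so a type like `ParserFactoryReg<IndexType, DataType>` would otherwise be read as two arguments. The following sketch illustrates the trick with hypothetical names (`SKETCH_COMMA`, `SKETCH_DECLARE`, `SKETCH_DECLARE_ENTRY`, `Entry`); none of them are part of dmlc-core.

```cpp
// Hypothetical stand-in for __DMLC_COMMA: expands to a single comma.
#define SKETCH_COMMA ,

// A two-parameter template whose name contains a comma when spelled out.
template <typename K, typename V>
struct Entry {
  K key;
  V value;
};

// Inner macro that expects a complete type as ONE argument.
#define SKETCH_DECLARE(Type, name) static Type name

// Outer macro: writing Entry<K, V> directly here would hand three
// arguments to SKETCH_DECLARE. The comma macro is still an identifier
// when the arguments are collected and becomes a comma only afterwards.
#define SKETCH_DECLARE_ENTRY(K, V, name) \
  SKETCH_DECLARE(Entry<K SKETCH_COMMA V>, name)

// Expands to: static Entry<int , double> g_entry;
SKETCH_DECLARE_ENTRY(int, double, g_entry);
```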
 

Typedefs

typedef float dmlc::real_t
 Defines the floating-point type used to store feature values.
 
typedef unsigned dmlc::index_t
 Defines the unsigned integer type normally used to store feature indices.
 

Detailed Description

Defines the common input data structures and the interface for handling input data.

Copyright (c) 2015 by Contributors
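
As a rough illustration of the pull-style interface this file describes, the sketch below assumes an iterator with three operations: rewind, advance, and read the current value. `PullIter`, `VectorIter`, and `SumAll` are hypothetical names used for this example only; consult the DataIter class reference for the actual member functions.

```cpp
#include <cstddef>
#include <utility>
#include <vector>

// Hypothetical pull-style interface (an assumption, not dmlc::DataIter).
template <typename DType>
class PullIter {
 public:
  virtual ~PullIter() {}
  virtual void BeforeFirst() = 0;          // rewind to before the first item
  virtual bool Next() = 0;                 // advance; false when exhausted
  virtual const DType& Value() const = 0;  // item reached by the last Next()
};

// In-memory implementation backed by a vector. Because the data is
// cached, the sequence can be replayed any number of times.
template <typename DType>
class VectorIter : public PullIter<DType> {
 public:
  explicit VectorIter(std::vector<DType> data)
      : data_(std::move(data)), next_(0), cur_(0) {}
  void BeforeFirst() override { next_ = 0; }
  bool Next() override {
    if (next_ >= data_.size()) return false;
    cur_ = next_++;
    return true;
  }
  const DType& Value() const override { return data_[cur_]; }

 private:
  std::vector<DType> data_;
  std::size_t next_;  // position the next call to Next() will move to
  std::size_t cur_;   // position of the current item
};

// One full pass over the data: rewind, then pull until exhausted.
template <typename DType>
DType SumAll(PullIter<DType>* it) {
  DType total = DType();
  it->BeforeFirst();
  while (it->Next()) total += it->Value();
  return total;
}
```

The consumer drives the loop and asks for the next item, which leaves the producer free to prefetch or precompute between calls.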

Macro Definition Documentation

◆ DMLC_REGISTER_DATA_PARSER

#define DMLC_REGISTER_DATA_PARSER (   IndexType,
  DataType,
  TypeName,
  FactoryFunction 
)
Value:
DMLC_REGISTRY_REGISTER(ParserFactoryReg<IndexType __DMLC_COMMA DataType>, \
ParserFactoryReg ## _ ## IndexType ## _ ## DataType, TypeName) \
.set_body(FactoryFunction)

DMLC_REGISTRY_REGISTER(EntryType, EntryTypeName, Name) is the generic macro for registering an EntryType. There is a complete example in FactoryRegistryEntryBase. Definition: registry.h, line 250.

Register a new distributed parser to dmlc-core.

Parameters
    IndexType        The type of the batch index; can be uint32_t or uint64_t.
    DataType         The type of the batch label and value; can be real_t or int.
    TypeName         The type name of the data.
    FactoryFunction  The factory function that creates the parser.

// Define the factory function.
template<typename IndexType, typename DType = real_t>
Parser<IndexType, DType>*
CreateLibSVMParser(const char* uri, unsigned part_index, unsigned num_parts) {
  return new LibSVMParser(uri, part_index, num_parts);
}
// Register it with DMLC.
// Then Parser<uint32_t>::Create(uri, part_index, num_parts, "libsvm")
// can be used to create the parser.
DMLC_REGISTER_DATA_PARSER(uint32_t, real_t, libsvm, CreateLibSVMParser<uint32_t>);
DMLC_REGISTER_DATA_PARSER(uint64_t, real_t, libsvm, CreateLibSVMParser<uint64_t>);

Definition: data.h, line 358.
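
To make the registration mechanism more concrete, here is a heavily simplified sketch of a name-to-factory registry with static registration through a macro. It is not the dmlc-core implementation: `ToyParser`, `ToyLibSVMParser`, `ToyRegistry`, `CreateToyLibSVM`, and `TOY_REGISTER_PARSER` are hypothetical, the factory takes no arguments, and the real macro goes through DMLC_REGISTRY_REGISTER and `.set_body(...)` instead.

```cpp
#include <stdint.h>  // global uint32_t / uint64_t
#include <map>
#include <string>

// Hypothetical parser interface (an assumption, not dmlc::Parser).
template <typename IndexType>
struct ToyParser {
  virtual ~ToyParser() {}
  virtual std::string Name() const = 0;
};

template <typename IndexType>
struct ToyLibSVMParser : ToyParser<IndexType> {
  std::string Name() const override { return "libsvm"; }
};

// One registry per IndexType, mapping a type name to a factory function.
template <typename IndexType>
class ToyRegistry {
 public:
  typedef ToyParser<IndexType>* (*Factory)();

  // Function-local static: constructed on first use, so registrations
  // that run during static initialization always find a live registry.
  static ToyRegistry* Get() {
    static ToyRegistry inst;
    return &inst;
  }
  bool Register(const std::string& name, Factory f) {
    factories_[name] = f;
    return true;
  }
  // Returns a new parser owned by the caller, or nullptr if unknown.
  ToyParser<IndexType>* Create(const std::string& name) const {
    typename std::map<std::string, Factory>::const_iterator it =
        factories_.find(name);
    return it == factories_.end() ? nullptr : it->second();
  }

 private:
  std::map<std::string, Factory> factories_;
};

template <typename IndexType>
ToyParser<IndexType>* CreateToyLibSVM() {
  return new ToyLibSVMParser<IndexType>();
}

// Token pasting (##) builds a distinct variable name per registration,
// e.g. toy_reg_uint32_t_libsvm; #TypeName turns the name into a string.
#define TOY_REGISTER_PARSER(IndexType, TypeName, FactoryFunction) \
  static const bool toy_reg_##IndexType##_##TypeName =            \
      ToyRegistry<IndexType>::Get()->Register(#TypeName, FactoryFunction)

TOY_REGISTER_PARSER(uint32_t, libsvm, CreateToyLibSVM<uint32_t>);
TOY_REGISTER_PARSER(uint64_t, libsvm, CreateToyLibSVM<uint64_t>);
```

With this sketch, `ToyRegistry<uint32_t>::Get()->Create("libsvm")` returns a new parser, and an unregistered name returns nullptr. Registering at static-initialization time lets new formats be added without touching the lookup code.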