This class stores metadata (non-feature data) for the training data, e.g. labels, weights, initial scores, and query-level information.
|
| Metadata () |
| Null constructor.
|
|
void | Init (const char *data_filename, const char *initscore_file) |
| Initialization loads query-level information, since it is needed for sampling data.
|
|
void | Init (const Metadata &metadata, const data_size_t *used_indices, data_size_t num_used_indices) |
| Initialize as a subset of another Metadata object.
|
|
void | LoadFromMemory (const void *memory) |
| Initialize from binary memory.
|
|
| ~Metadata () |
| Destructor.
|
|
void | Init (data_size_t num_data, int weight_idx, int query_idx) |
| Initial setup; allocates space for labels, weights (if present) and queries (if present).
|
|
void | PartitionLabel (const std::vector< data_size_t > &used_indices) |
| Partition label by used indices.
|
|
void | CheckOrPartition (data_size_t num_all_data, const std::vector< data_size_t > &used_data_indices) |
| Partition metadata according to the locally used indices, if needed.
|
|
void | SetLabel (const label_t *label, data_size_t len) |
|
void | SetWeights (const label_t *weights, data_size_t len) |
|
void | SetQuery (const data_size_t *query, data_size_t len) |
|
void | SetInitScore (const double *init_score, data_size_t len) |
| Set initial scores.
|
|
void | SaveBinaryToFile (const VirtualFileWriter *writer) const |
| Save binary data to file.
|
|
size_t | SizesInByte () const |
| Get the size of this object in bytes.
|
|
const label_t * | label () const |
| Get a pointer to the labels.
|
|
void | SetLabelAt (data_size_t idx, label_t value) |
| Set label for one record.
|
|
void | SetWeightAt (data_size_t idx, label_t value) |
| Set weight for one record.
|
|
void | SetQueryAt (data_size_t idx, data_size_t value) |
| Set query ID for one record.
|
|
const label_t * | weights () const |
| Get weights; returns nullptr if they do not exist.
|
|
const data_size_t * | query_boundaries () const |
| Get data boundaries on queries; returns nullptr if they do not exist. Data is assumed to be ordered by query: the interval [query_boundaries[i], query_boundaries[i+1]) holds the data indices for query i.
|
|
data_size_t | num_queries () const |
| Get number of queries.
|
|
const label_t * | query_weights () const |
| Get weights for queries; returns nullptr if they do not exist.
|
|
const double * | init_score () const |
| Get initial scores; returns nullptr if they do not exist.
|
|
int64_t | num_init_score () const |
| Get size of initial scores.
|
|
Metadata & | operator= (const Metadata &)=delete |
| Disable copy.
|
|
| Metadata (const Metadata &)=delete |
| Disable copy.
|
|
This class stores metadata (non-feature data) for the training data, e.g. labels, weights, initial scores, and query-level information.
Some details:
- Label, used for training.
- Weights, the weights of records, optional.
- Query boundaries, necessary for lambdarank. The documents of the i-th query are in [ query_boundaries[i], query_boundaries[i+1] ).
- Query weights, calculated automatically from weights and query_boundaries (if both exist). The weight of the i-th query is sum(weights[query_boundaries[i]], ..., weights[query_boundaries[i+1] - 1]) / (query_boundaries[i+1] - query_boundaries[i]).
- Initial score, optional. If present, the model will boost from this score; otherwise it will start from 0.
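
The relationship between query boundaries and query weights described above can be illustrated with a small standalone sketch. It does not use the Metadata class itself: the helper names `BuildQueryBoundaries` and `ComputeQueryWeights` are hypothetical, and the type aliases for `data_size_t` and `label_t` are assumptions chosen for illustration. The sketch also assumes every query contains at least one record.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Assumed typedefs for illustration only; the library's actual definitions may differ.
using data_size_t = int32_t;
using label_t = float;

// Hypothetical helper: turn per-query record counts into half-open boundaries,
// so that query i covers indices [boundaries[i], boundaries[i+1]).
std::vector<data_size_t> BuildQueryBoundaries(const std::vector<data_size_t>& query_sizes) {
  std::vector<data_size_t> boundaries(query_sizes.size() + 1, 0);
  for (std::size_t i = 0; i < query_sizes.size(); ++i) {
    boundaries[i + 1] = boundaries[i] + query_sizes[i];
  }
  return boundaries;
}

// Hypothetical helper: the weight of query i is the mean record weight
// over [boundaries[i], boundaries[i+1]).
std::vector<label_t> ComputeQueryWeights(const std::vector<label_t>& weights,
                                         const std::vector<data_size_t>& boundaries) {
  std::vector<label_t> query_weights;
  if (boundaries.size() < 2) return query_weights;  // no queries
  assert(static_cast<std::size_t>(boundaries.back()) == weights.size());
  query_weights.reserve(boundaries.size() - 1);
  for (std::size_t i = 0; i + 1 < boundaries.size(); ++i) {
    const data_size_t begin = boundaries[i];
    const data_size_t end = boundaries[i + 1];
    assert(end > begin);  // assumption: no empty queries
    label_t sum = 0.0f;
    for (data_size_t j = begin; j < end; ++j) {
      sum += weights[static_cast<std::size_t>(j)];
    }
    query_weights.push_back(sum / static_cast<label_t>(end - begin));
  }
  return query_weights;
}
```

For example, two queries of sizes 2 and 3 give boundaries {0, 2, 5}; with record weights {1, 3, 2, 2, 2}, both queries get a weight of 2.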