Medial Code Documentation
A helper class for prediction when the DMatrix is split by column.
Public Member Functions
ColumnSplitHelper (std::int32_t n_threads, gbm::GBTreeModel const &model, uint32_t tree_begin, uint32_t tree_end)
ColumnSplitHelper (ColumnSplitHelper const &)=delete
ColumnSplitHelper & operator= (ColumnSplitHelper const &)=delete
ColumnSplitHelper (ColumnSplitHelper &&) noexcept=delete
ColumnSplitHelper & operator= (ColumnSplitHelper &&) noexcept=delete
void PredictDMatrix (DMatrix *p_fmat, std::vector< bst_float > *out_preds)
void PredictInstance (SparsePage::Inst const &inst, std::vector< bst_float > *out_preds)
void PredictLeaf (DMatrix *p_fmat, std::vector< bst_float > *out_preds)
A helper class for prediction when the DMatrix is split by column.
When data is split by column, a local DMatrix only contains a subset of features. All the workers in a distributed/federated environment need to cooperate to produce a prediction. This is done in two passes with the help of bit vectors.
First pass:
  for each tree:
    for each row:
      for each node:
        if the feature is available and passes the filter, mark the corresponding decision bit
        if the feature is missing, mark the missing bit
Once the two bit vectors are populated, run allreduce on both, using bitwise OR for the decision bits, and bitwise AND for the missing bits.
Second pass:
  for each tree:
    for each row:
      find the leaf node using the decision and missing bits, return the leaf value
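The two passes can be illustrated with a minimal, single-tree sketch. It is not the XGBoost implementation: the Node struct, FirstPass, ReduceOr, ReduceAnd and SecondPass are hypothetical names, std::vector<bool> stands in for a packed bit vector, the allreduce is simulated by combining two in-process workers, a feature the worker does not hold is represented as NaN, and "passes the filter" is assumed to mean "value < threshold, go left".

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <limits>
#include <vector>

// Hypothetical, simplified tree node; not XGBoost's RegTree.
struct Node {
  int feature;        // split feature index (unused for leaves)
  float threshold;    // assumed rule: go left when value < threshold
  int left, right;    // child indices, -1 marks a leaf
  bool default_left;  // direction taken when the feature is missing
  float leaf_value;   // only meaningful for leaves
};
using Tree = std::vector<Node>;
using Rows = std::vector<std::vector<float>>;
using Bits = std::vector<bool>;  // stand-in for a packed bit vector
constexpr float kMissing = std::numeric_limits<float>::quiet_NaN();

// First pass on one worker: one decision bit and one missing bit per (row, node).
void FirstPass(Tree const& tree, Rows const& rows, Bits* decision, Bits* missing) {
  std::size_t const n_nodes = tree.size();
  decision->assign(rows.size() * n_nodes, false);
  missing->assign(rows.size() * n_nodes, false);
  for (std::size_t r = 0; r < rows.size(); ++r) {
    for (std::size_t nid = 0; nid < n_nodes; ++nid) {
      Node const& node = tree[nid];
      if (node.left < 0) continue;  // leaves carry no decision
      float const v = rows[r][node.feature];
      if (std::isnan(v)) {
        (*missing)[r * n_nodes + nid] = true;   // feature not seen by this worker
      } else if (v < node.threshold) {
        (*decision)[r * n_nodes + nid] = true;  // feature available and passes the filter
      }
    }
  }
}

// Simulated allreduce with bitwise OR (used for decision bits).
Bits ReduceOr(Bits a, Bits const& b) {
  assert(a.size() == b.size());
  for (std::size_t i = 0; i < a.size(); ++i) a[i] = a[i] || b[i];
  return a;
}

// Simulated allreduce with bitwise AND (used for missing bits).
Bits ReduceAnd(Bits a, Bits const& b) {
  assert(a.size() == b.size());
  for (std::size_t i = 0; i < a.size(); ++i) a[i] = a[i] && b[i];
  return a;
}

// Second pass: walk from the root using only the reduced bits.
float SecondPass(Tree const& tree, std::size_t row, Bits const& decision, Bits const& missing) {
  std::size_t const n_nodes = tree.size();
  int nid = 0;
  while (tree[nid].left >= 0) {
    std::size_t const bit = row * n_nodes + static_cast<std::size_t>(nid);
    bool const go_left = missing[bit] ? tree[nid].default_left : decision[bit];
    nid = go_left ? tree[nid].left : tree[nid].right;
  }
  return tree[nid].leaf_value;
}
```

In this sketch a missing bit survives the AND only when every worker reports the feature as missing, so a feature held by any one worker is treated as present, and the OR picks up the decision from whichever worker holds it. Each worker calls FirstPass on its local columns, the results are combined with ReduceOr and ReduceAnd, and SecondPass then yields the same leaf value on every worker.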
The size of the decision/missing bit vector is: number of rows in a batch * sum(number of nodes in each tree)
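As a worked instance of the size formula, the sketch below computes the number of bits in one vector. BitVectorSize is a hypothetical helper, not a function from the class, and the tree sizes are made-up values chosen only for the arithmetic.

```cpp
#include <cstddef>
#include <numeric>
#include <vector>

// Bits in one vector (decision or missing) for a batch:
// number of rows in the batch * sum of node counts over all trees.
std::size_t BitVectorSize(std::size_t n_rows, std::vector<std::size_t> const& tree_sizes) {
  std::size_t const total_nodes =
      std::accumulate(tree_sizes.begin(), tree_sizes.end(), std::size_t{0});
  return n_rows * total_nodes;
}
```

For a batch of 1000 rows and three trees with 7, 15 and 31 nodes, each row needs 7 + 15 + 31 = 53 bits, so each of the two vectors holds 1000 * 53 = 53000 bits (about 6.5 KiB when packed), and both vectors go through allreduce.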