Medial Code Documentation
DMatrix type for QuantileDMatrix. The name IterativeDMatrix comes from its construction process.
#include <iterative_dmatrix.h>
Public Member Functions | |
| IterativeDMatrix (DataIterHandle iter_handle, DMatrixHandle proxy, std::shared_ptr< DMatrix > ref, DataIterResetCallback *reset, XGDMatrixCallbackNext *next, float missing, int nthread, bst_bin_t max_bin) | |
| bool | EllpackExists () const override |
| bool | GHistIndexExists () const override |
| bool | SparsePageExists () const override |
| DMatrix * | Slice (common::Span< int32_t const >) override |
| DMatrix * | SliceCol (int, int) override |
| BatchSet< SparsePage > | GetRowBatches () override |
| BatchSet< CSCPage > | GetColumnBatches (Context const *) override |
| BatchSet< SortedCSCPage > | GetSortedColumnBatches (Context const *) override |
| BatchSet< GHistIndexMatrix > | GetGradientIndex (Context const *ctx, BatchParam const &param) override |
| BatchSet< EllpackPage > | GetEllpackBatches (Context const *ctx, const BatchParam &param) override |
| BatchSet< ExtSparsePage > | GetExtBatches (Context const *ctx, BatchParam const &param) override |
| bool | SingleColBlock () const override |
| MetaInfo & | Info () override |
| MetaInfo const & | Info () const override |
| Context const * | Ctx () const override |
Public Member Functions inherited from xgboost.core.DMatrix | |
| None | __init__ (self, DataType data, Optional[ArrayLike] label=None, *, Optional[ArrayLike] weight=None, Optional[ArrayLike] base_margin=None, Optional[float] missing=None, bool silent=False, Optional[FeatureNames] feature_names=None, Optional[FeatureTypes] feature_types=None, Optional[int] nthread=None, Optional[ArrayLike] group=None, Optional[ArrayLike] qid=None, Optional[ArrayLike] label_lower_bound=None, Optional[ArrayLike] label_upper_bound=None, Optional[ArrayLike] feature_weights=None, bool enable_categorical=False, DataSplitMode data_split_mode=DataSplitMode.ROW) |
| None | __del__ (self) |
| None | set_info (self, *, Optional[ArrayLike] label=None, Optional[ArrayLike] weight=None, Optional[ArrayLike] base_margin=None, Optional[ArrayLike] group=None, Optional[ArrayLike] qid=None, Optional[ArrayLike] label_lower_bound=None, Optional[ArrayLike] label_upper_bound=None, Optional[FeatureNames] feature_names=None, Optional[FeatureTypes] feature_types=None, Optional[ArrayLike] feature_weights=None) |
| np.ndarray | get_float_info (self, str field) |
| np.ndarray | get_uint_info (self, str field) |
| None | set_float_info (self, str field, ArrayLike data) |
| None | set_float_info_npy2d (self, str field, ArrayLike data) |
| None | set_uint_info (self, str field, ArrayLike data) |
| None | save_binary (self, Union[str, os.PathLike] fname, bool silent=True) |
| None | set_label (self, ArrayLike label) |
| None | set_weight (self, ArrayLike weight) |
| None | set_base_margin (self, ArrayLike margin) |
| None | set_group (self, ArrayLike group) |
| np.ndarray | get_label (self) |
| np.ndarray | get_weight (self) |
| np.ndarray | get_base_margin (self) |
| np.ndarray | get_group (self) |
| scipy.sparse.csr_matrix | get_data (self) |
| Tuple[np.ndarray, np.ndarray] | get_quantile_cut (self) |
| int | num_row (self) |
| int | num_col (self) |
| int | num_nonmissing (self) |
| "DMatrix" | slice (self, Union[List[int], np.ndarray] rindex, bool allow_groups=False) |
| Optional[FeatureNames] | feature_names (self) |
| None | feature_names (self, Optional[FeatureNames] feature_names) |
| Optional[FeatureTypes] | feature_types (self) |
| None | feature_types (self, Optional[FeatureTypes] feature_types) |
Additional Inherited Members | |
Data Fields inherited from xgboost.core.DMatrix | |
| missing | |
| nthread | |
| silent | |
| handle | |
| feature_names | |
| feature_types | |
Protected Member Functions inherited from xgboost.core.DMatrix | |
| None | _init_from_iter (self, DataIter iterator, bool enable_categorical) |
DMatrix type for QuantileDMatrix. The name IterativeDMatrix comes from its construction process.
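As a rough illustration of that construction process, the following C++ sketch makes two passes over a list of batches: the first pass derives cut points, the second replaces every value with its bin index. Everything in it is an assumption made for illustration and is not XGBoost's implementation: the Batch alias, the MakeCuts and BuildIndex functions, and the restriction to a single dense feature are hypothetical, and the exact sort in the first pass merely stands in for a bounded-memory streaming quantile sketch.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical stand-in for what a data iterator yields: the values of one
// dense feature for one batch of rows.
using Batch = std::vector<float>;

// Pass 1: visit every batch and derive at most max_bin cut points.
// Assumption: an exact sort of the distinct values replaces the streaming
// quantile sketch a real implementation would use to bound memory.
std::vector<float> MakeCuts(std::vector<Batch> const& batches, std::size_t max_bin) {
  assert(max_bin > 0);
  std::vector<float> values;
  for (auto const& batch : batches) {
    values.insert(values.end(), batch.begin(), batch.end());
  }
  std::sort(values.begin(), values.end());
  values.erase(std::unique(values.begin(), values.end()), values.end());

  std::vector<float> cuts;
  if (values.empty()) {
    return cuts;
  }
  std::size_t n_bins = std::min(max_bin, values.size());
  for (std::size_t i = 1; i <= n_bins; ++i) {
    // Upper bound of bin i - 1; the last cut is always the maximum value.
    std::size_t idx = i * values.size() / n_bins - 1;
    cuts.push_back(values[idx]);
  }
  return cuts;
}

// Pass 2: visit the batches again and store only the bin index of each value.
// The raw batches are never merged into one large matrix in this pass.
std::vector<std::uint32_t> BuildIndex(std::vector<Batch> const& batches,
                                      std::vector<float> const& cuts) {
  std::vector<std::uint32_t> index;
  for (auto const& batch : batches) {
    for (float v : batch) {
      assert(!cuts.empty());
      // First cut that is >= v; values above the last cut go to the last bin.
      auto it = std::lower_bound(cuts.begin(), cuts.end(), v);
      if (it == cuts.end()) {
        --it;
      }
      index.push_back(static_cast<std::uint32_t>(it - cuts.begin()));
    }
  }
  return index;
}
```

A caller would invoke MakeCuts once over all batches and then pass the resulting cuts to BuildIndex together with the same batches. After the second pass only the cuts and the bin indices need to be kept.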
QuantileDMatrix is an intermediate storage for quantilization results, including quantile cuts and the histogram index. Quantilization is designed to be performed on a stream of data (or batches of it), so QuantileDMatrix is also designed to work with batches of data. During initialization, it iterates over the data multiple times in order to perform quantilization. This design reduces memory usage significantly by avoiding data concatenation and by removing the CSR matrix SparsePage. However, it has limitations (which can be fixed if needed):