Medial Code Documentation
DMatrix type for QuantileDMatrix; the name IterativeDMatrix comes from its construction process.
#include <iterative_dmatrix.h>
Public Member Functions | |
IterativeDMatrix (DataIterHandle iter_handle, DMatrixHandle proxy, std::shared_ptr< DMatrix > ref, DataIterResetCallback *reset, XGDMatrixCallbackNext *next, float missing, int nthread, bst_bin_t max_bin) | |
bool | EllpackExists () const override |
bool | GHistIndexExists () const override |
bool | SparsePageExists () const override |
DMatrix * | Slice (common::Span< int32_t const >) override |
DMatrix * | SliceCol (int, int) override |
BatchSet< SparsePage > | GetRowBatches () override |
BatchSet< CSCPage > | GetColumnBatches (Context const *) override |
BatchSet< SortedCSCPage > | GetSortedColumnBatches (Context const *) override |
BatchSet< GHistIndexMatrix > | GetGradientIndex (Context const *ctx, BatchParam const &param) override |
BatchSet< EllpackPage > | GetEllpackBatches (Context const *ctx, const BatchParam &param) override |
BatchSet< ExtSparsePage > | GetExtBatches (Context const *ctx, BatchParam const &param) override |
bool | SingleColBlock () const override |
MetaInfo & | Info () override |
MetaInfo const & | Info () const override |
Context const * | Ctx () const override |
None | __init__ (self, DataType data, Optional[ArrayLike] label=None, *, Optional[ArrayLike] weight=None, Optional[ArrayLike] base_margin=None, Optional[float] missing=None, bool silent=False, Optional[FeatureNames] feature_names=None, Optional[FeatureTypes] feature_types=None, Optional[int] nthread=None, Optional[ArrayLike] group=None, Optional[ArrayLike] qid=None, Optional[ArrayLike] label_lower_bound=None, Optional[ArrayLike] label_upper_bound=None, Optional[ArrayLike] feature_weights=None, bool enable_categorical=False, DataSplitMode data_split_mode=DataSplitMode.ROW) |
None | __del__ (self) |
None | set_info (self, *, Optional[ArrayLike] label=None, Optional[ArrayLike] weight=None, Optional[ArrayLike] base_margin=None, Optional[ArrayLike] group=None, Optional[ArrayLike] qid=None, Optional[ArrayLike] label_lower_bound=None, Optional[ArrayLike] label_upper_bound=None, Optional[FeatureNames] feature_names=None, Optional[FeatureTypes] feature_types=None, Optional[ArrayLike] feature_weights=None) |
np.ndarray | get_float_info (self, str field) |
np.ndarray | get_uint_info (self, str field) |
None | set_float_info (self, str field, ArrayLike data) |
None | set_float_info_npy2d (self, str field, ArrayLike data) |
None | set_uint_info (self, str field, ArrayLike data) |
None | save_binary (self, Union[str, os.PathLike] fname, bool silent=True) |
None | set_label (self, ArrayLike label) |
None | set_weight (self, ArrayLike weight) |
None | set_base_margin (self, ArrayLike margin) |
None | set_group (self, ArrayLike group) |
np.ndarray | get_label (self) |
np.ndarray | get_weight (self) |
np.ndarray | get_base_margin (self) |
np.ndarray | get_group (self) |
scipy.sparse.csr_matrix | get_data (self) |
Tuple[np.ndarray, np.ndarray] | get_quantile_cut (self) |
int | num_row (self) |
int | num_col (self) |
int | num_nonmissing (self) |
"DMatrix" | slice (self, Union[List[int], np.ndarray] rindex, bool allow_groups=False) |
Optional[FeatureNames] | feature_names (self) |
None | feature_names (self, Optional[FeatureNames] feature_names) |
Optional[FeatureTypes] | feature_types (self) |
None | feature_types (self, Optional[FeatureTypes] feature_types) |
Additional Inherited Members | |
missing | |
nthread | |
silent | |
handle | |
feature_names | |
feature_types | |
None | _init_from_iter (self, DataIter iterator, bool enable_categorical) |
DMatrix type for QuantileDMatrix; the name IterativeDMatrix comes from its construction process.
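The construction process referred to here makes more than one pass over the batches: one pass to derive per-feature cut points and another to turn each value into a bin index. As a rough illustration only, the Python sketch below uses plain lists and hypothetical helpers (prune, build_cuts, build_index, make_iter); none of these are XGBoost APIs. The bounded per-feature summary is a crude stand-in for a real streaming quantile sketch, and the code assumes dense, non-empty batches with no missing values.

```python
from bisect import bisect_left
from typing import Callable, Iterable, List

Batch = List[List[float]]  # dense rows; missing values are not handled here


def prune(sorted_vals: List[float], k: int) -> List[float]:
    """Keep at most k entries of a sorted list, always including the maximum."""
    n = len(sorted_vals)
    if n <= k:
        return list(sorted_vals)
    return [sorted_vals[((i + 1) * n) // k - 1] for i in range(k)]


def build_cuts(make_iter: Callable[[], Iterable[Batch]], max_bin: int) -> List[List[float]]:
    """Pass 1: fold each batch into a bounded per-feature summary of cut points."""
    cuts: List[List[float]] = []
    for batch in make_iter():
        n_features = len(batch[0])
        if not cuts:
            cuts = [[] for _ in range(n_features)]
        for f in range(n_features):
            column = sorted({row[f] for row in batch})
            merged = sorted(set(cuts[f]) | set(prune(column, max_bin)))
            cuts[f] = prune(merged, max_bin)  # memory stays bounded by max_bin
    return cuts


def build_index(make_iter: Callable[[], Iterable[Batch]], cuts: List[List[float]]) -> List[List[int]]:
    """Pass 2: map each value to the first bin whose upper bound is >= the value."""
    index: List[List[int]] = []
    for batch in make_iter():
        for row in batch:
            index.append(
                [min(bisect_left(cuts[f], v), len(cuts[f]) - 1) for f, v in enumerate(row)]
            )
    return index


# Two batches that are never concatenated into one raw matrix.
batches: List[Batch] = [
    [[1.0, 10.0], [2.0, 20.0]],
    [[3.0, 30.0], [4.0, 40.0]],
]


def make_iter() -> Iterable[Batch]:
    """Plays the role of a reset + next callback pair: each call restarts the stream."""
    return iter(batches)


cuts = build_cuts(make_iter, max_bin=2)
index = build_index(make_iter, cuts)
```

Only the cut points and the integer bin indices are retained after the two passes; the raw floating-point batches can be discarded as soon as each one has been visited.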
QuantileDMatrix is an intermediate storage for quantilization results, including the quantile cuts and the histogram index. Quantilization is designed to be performed on a stream of data (or on batches of it), so QuantileDMatrix is also designed to work with batches of data. During initialization, it walks through the data multiple times in order to perform quantilization. This design significantly reduces memory usage by avoiding data concatenation and by removing the CSR matrix SparsePage. However, it has limitations (which can be fixed if needed):