Medial Code Documentation
Public Member Functions | |
None | __init__ (self, DataType data, Optional[ArrayLike] label=None, *, Optional[ArrayLike] weight=None, Optional[ArrayLike] base_margin=None, Optional[float] missing=None, bool silent=False, Optional[FeatureNames] feature_names=None, Optional[FeatureTypes] feature_types=None, Optional[int] nthread=None, Optional[ArrayLike] group=None, Optional[ArrayLike] qid=None, Optional[ArrayLike] label_lower_bound=None, Optional[ArrayLike] label_upper_bound=None, Optional[ArrayLike] feature_weights=None, bool enable_categorical=False, DataSplitMode data_split_mode=DataSplitMode.ROW) |
None | __del__ (self) |
None | set_info (self, *, Optional[ArrayLike] label=None, Optional[ArrayLike] weight=None, Optional[ArrayLike] base_margin=None, Optional[ArrayLike] group=None, Optional[ArrayLike] qid=None, Optional[ArrayLike] label_lower_bound=None, Optional[ArrayLike] label_upper_bound=None, Optional[FeatureNames] feature_names=None, Optional[FeatureTypes] feature_types=None, Optional[ArrayLike] feature_weights=None) |
np.ndarray | get_float_info (self, str field) |
np.ndarray | get_uint_info (self, str field) |
None | set_float_info (self, str field, ArrayLike data) |
None | set_float_info_npy2d (self, str field, ArrayLike data) |
None | set_uint_info (self, str field, ArrayLike data) |
None | save_binary (self, Union[str, os.PathLike] fname, bool silent=True) |
None | set_label (self, ArrayLike label) |
None | set_weight (self, ArrayLike weight) |
None | set_base_margin (self, ArrayLike margin) |
None | set_group (self, ArrayLike group) |
np.ndarray | get_label (self) |
np.ndarray | get_weight (self) |
np.ndarray | get_base_margin (self) |
np.ndarray | get_group (self) |
scipy.sparse.csr_matrix | get_data (self) |
Tuple[np.ndarray, np.ndarray] | get_quantile_cut (self) |
int | num_row (self) |
int | num_col (self) |
int | num_nonmissing (self) |
"DMatrix" | slice (self, Union[List[int], np.ndarray] rindex, bool allow_groups=False) |
Optional[FeatureNames] | feature_names (self) |
None | feature_names (self, Optional[FeatureNames] feature_names) |
Optional[FeatureTypes] | feature_types (self) |
None | feature_types (self, Optional[FeatureTypes] feature_types) |
Data Fields | |
missing | |
nthread | |
silent | |
handle | |
feature_names | |
feature_types | |
Protected Member Functions | |
None | _init_from_iter (self, DataIter iterator, bool enable_categorical) |
Data Matrix used in XGBoost. DMatrix is an internal data structure that is used by XGBoost, which is optimized for both memory efficiency and training speed. You can construct DMatrix from multiple different sources of data.
None xgboost.core.DMatrix.__init__ (
    self,
    DataType data,
    Optional[ArrayLike] label = None,
    *,
    Optional[ArrayLike] weight = None,
    Optional[ArrayLike] base_margin = None,
    Optional[float] missing = None,
    bool silent = False,
    Optional[FeatureNames] feature_names = None,
    Optional[FeatureTypes] feature_types = None,
    Optional[int] nthread = None,
    Optional[ArrayLike] group = None,
    Optional[ArrayLike] qid = None,
    Optional[ArrayLike] label_lower_bound = None,
    Optional[ArrayLike] label_upper_bound = None,
    Optional[ArrayLike] feature_weights = None,
    bool enable_categorical = False,
    DataSplitMode data_split_mode = DataSplitMode.ROW
)
Parameters
----------
data :
    Data source of DMatrix. See :ref:`py-data` for a list of supported input types.
label :
    Label of the training data.
weight :
    Weight for each instance.

    .. note::

        For ranking tasks, weights are per-group. In a ranking task, one weight is assigned to each group (not each data point). Only the relative ordering of data points within each group matters, so it doesn't make sense to assign weights to individual data points.

base_margin :
    Base margin used for boosting from an existing model.
missing :
    Value in the input data that should be treated as a missing value. If None, defaults to np.nan.
silent :
    Whether to print messages during construction.
feature_names :
    Set names for features.
feature_types :
    Set types for features. When `enable_categorical` is set to `True`, string "c" represents the categorical data type while "q" represents the numerical feature type. For categorical features, the input is assumed to be preprocessed and encoded by the users. The encoding can be done via :py:class:`sklearn.preprocessing.OrdinalEncoder` or the pandas dataframe `.cat.codes` method. This is useful when users want to specify categorical features without having to construct a dataframe as input.
nthread :
    Number of threads to use for loading data when parallelization is applicable. If -1, uses the maximum number of threads available on the system.
group :
    Group sizes for all ranking groups.
qid :
    Query ID for data samples, used for ranking.
label_lower_bound :
    Lower bound for survival training.
label_upper_bound :
    Upper bound for survival training.
feature_weights :
    Set feature weights for column sampling.
enable_categorical :
    .. versionadded:: 1.3.0

    .. note:: This parameter is experimental.

    Experimental support for specializing for categorical features. Do not set to True unless you are interested in development. Also, the JSON/UBJSON serialization format is required.
Reimplemented in xgboost.core._ProxyDMatrix, xgboost.core.DeviceQuantileDMatrix, and xgboost.core.QuantileDMatrix.
Optional[FeatureNames] xgboost.core.DMatrix.feature_names (self)
Labels for features (column labels). Setting it to ``None`` resets existing feature names.
Optional[FeatureTypes] xgboost.core.DMatrix.feature_types (self)
Type of features (column types). This is for displaying the results and categorical data support. See :py:class:`DMatrix` for details. Setting it to ``None`` resets existing feature types.
np.ndarray xgboost.core.DMatrix.get_base_margin (self)
Get the base margin of the DMatrix.

Returns
-------
base_margin

scipy.sparse.csr_matrix xgboost.core.DMatrix.get_data (self)
Get the predictors from DMatrix as a CSR matrix. This getter is mostly for testing purposes. If this is a quantized DMatrix then quantized values are returned instead of input values.

.. versionadded:: 1.7.0
np.ndarray xgboost.core.DMatrix.get_float_info (self, str field)
Get float property from the DMatrix.

Parameters
----------
field : str
    The field name of the information.

Returns
-------
info : array
    A numpy array of float information of the data.
np.ndarray xgboost.core.DMatrix.get_group (self)
Get the group of the DMatrix.

Returns
-------
group

np.ndarray xgboost.core.DMatrix.get_label (self)
Get the label of the DMatrix.

Returns
-------
label : array

Tuple[np.ndarray, np.ndarray] xgboost.core.DMatrix.get_quantile_cut (self)
Get quantile cuts for quantization.

.. versionadded:: 2.0.0
np.ndarray xgboost.core.DMatrix.get_uint_info (self, str field)
Get unsigned integer property from the DMatrix.

Parameters
----------
field : str
    The field name of the information.

Returns
-------
info : array
    A numpy array of unsigned integer information of the data.

np.ndarray xgboost.core.DMatrix.get_weight (self)
Get the weight of the DMatrix.

Returns
-------
weight : array
int xgboost.core.DMatrix.num_col (self)
Get the number of columns (features) in the DMatrix.

int xgboost.core.DMatrix.num_nonmissing (self)
Get the number of non-missing values in the DMatrix.

.. versionadded:: 1.7.0

int xgboost.core.DMatrix.num_row (self)
Get the number of rows in the DMatrix.
None xgboost.core.DMatrix.save_binary (self, Union[str, os.PathLike] fname, bool silent=True)
Save DMatrix to an XGBoost buffer. The saved binary can later be loaded by providing the path to :py:func:`xgboost.DMatrix` as input.

Parameters
----------
fname : string or os.PathLike
    Name of the output buffer file.
silent : bool (optional; default: True)
    If set, the output is suppressed.
None xgboost.core.DMatrix.set_base_margin (self, ArrayLike margin)
Set the base margin of the booster to start from.

This can be used to specify the prediction value of an existing model as the base_margin. Note that the margin is needed, not the transformed prediction. For example, for logistic regression, supply the value before the logistic transformation. See also example/demo.py.

Parameters
----------
margin : array like
    Prediction margin of each datapoint.
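To make the distinction between a margin and a transformed prediction concrete, the sketch below converts probabilities back to margins with the logit function, the inverse of the logistic transformation. The helper names are hypothetical and not part of XGBoost; the resulting `margins` list is the kind of value that would be passed to `set_base_margin` for a logistic objective.

```python
import math


def probability_to_margin(p: float) -> float:
    """Logit: the value before the logistic transformation (hypothetical helper)."""
    return math.log(p / (1.0 - p))


def margin_to_probability(m: float) -> float:
    """Logistic transformation, shown only to verify the round trip."""
    return 1.0 / (1.0 + math.exp(-m))


# Probabilities predicted by some existing model (made-up values).
probs = [0.1, 0.5, 0.9]

# Margins are what base_margin expects, not the probabilities themselves.
margins = [probability_to_margin(p) for p in probs]
round_trip = [margin_to_probability(m) for m in margins]
print(margins, round_trip)
```

A probability of 0.5 maps to a margin of 0, so passing probabilities directly would shift every starting prediction.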
None xgboost.core.DMatrix.set_float_info (self, str field, ArrayLike data)
Set float type property into the DMatrix.

Parameters
----------
field : str
    The field name of the information.
data : numpy array
    The array of data to be set.
None xgboost.core.DMatrix.set_float_info_npy2d (self, str field, ArrayLike data)
Set float type property into the DMatrix for numpy 2d array input.

Parameters
----------
field : str
    The field name of the information.
data : numpy array
    The array of data to be set.
None xgboost.core.DMatrix.set_group (self, ArrayLike group)
Set group size of DMatrix (used for ranking).

Parameters
----------
group : array like
    Group size of each group.
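Group sizes can be derived from per-row query IDs. The sketch below is one way to do that with the standard library, assuming rows that belong to the same query are stored contiguously. The helper `group_sizes_from_qid` is hypothetical and not part of XGBoost.

```python
from itertools import groupby


def group_sizes_from_qid(qid):
    """Count consecutive rows that share a query ID (hypothetical helper)."""
    return [sum(1 for _ in rows) for _, rows in groupby(qid)]


# Six rows belonging to three queries (made-up IDs).
qid = [7, 7, 7, 9, 9, 12]
group = group_sizes_from_qid(qid)
print(group)  # [3, 2, 1]

# The sizes must add up to the number of rows in the DMatrix.
assert sum(group) == len(qid)
```

The resulting list has one entry per group, which is the shape `set_group` expects.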
None xgboost.core.DMatrix.set_info (
    self,
    *,
    Optional[ArrayLike] label = None,
    Optional[ArrayLike] weight = None,
    Optional[ArrayLike] base_margin = None,
    Optional[ArrayLike] group = None,
    Optional[ArrayLike] qid = None,
    Optional[ArrayLike] label_lower_bound = None,
    Optional[ArrayLike] label_upper_bound = None,
    Optional[FeatureNames] feature_names = None,
    Optional[FeatureTypes] feature_types = None,
    Optional[ArrayLike] feature_weights = None
)
Set meta info for DMatrix. See doc string for :py:obj:`xgboost.DMatrix`.
None xgboost.core.DMatrix.set_label (self, ArrayLike label)
Set the label of the DMatrix.

Parameters
----------
label : array like
    The label information to be set into the DMatrix.
None xgboost.core.DMatrix.set_uint_info (self, str field, ArrayLike data)
Set uint type property into the DMatrix.

Parameters
----------
field : str
    The field name of the information.
data : numpy array
    The array of data to be set.
None xgboost.core.DMatrix.set_weight (self, ArrayLike weight)
Set the weight of each instance.

Parameters
----------
weight : array like
    Weight for each data point.

    .. note::

        For ranking tasks, weights are per-group. In a ranking task, one weight is assigned to each group (not each data point). Only the relative ordering of data points within each group matters, so it doesn't make sense to assign weights to individual data points.

"DMatrix" xgboost.core.DMatrix.slice (self, Union[List[int], np.ndarray] rindex, bool allow_groups=False)
Slice the DMatrix and return a new DMatrix that only contains `rindex`.

Parameters
----------
rindex
    List of indices to be selected.
allow_groups
    Allow slicing of a matrix with a groups attribute.

Returns
-------
res
    A new DMatrix containing only the selected indices.