|
Medial Code Documentation
|
Public Member Functions | |
| None | __init__ (self, DataType data, Optional[ArrayLike] label=None, *, Optional[ArrayLike] weight=None, Optional[ArrayLike] base_margin=None, Optional[float] missing=None, bool silent=False, Optional[FeatureNames] feature_names=None, Optional[FeatureTypes] feature_types=None, Optional[int] nthread=None, Optional[ArrayLike] group=None, Optional[ArrayLike] qid=None, Optional[ArrayLike] label_lower_bound=None, Optional[ArrayLike] label_upper_bound=None, Optional[ArrayLike] feature_weights=None, bool enable_categorical=False, DataSplitMode data_split_mode=DataSplitMode.ROW) |
| None | __del__ (self) |
| None | set_info (self, *, Optional[ArrayLike] label=None, Optional[ArrayLike] weight=None, Optional[ArrayLike] base_margin=None, Optional[ArrayLike] group=None, Optional[ArrayLike] qid=None, Optional[ArrayLike] label_lower_bound=None, Optional[ArrayLike] label_upper_bound=None, Optional[FeatureNames] feature_names=None, Optional[FeatureTypes] feature_types=None, Optional[ArrayLike] feature_weights=None) |
| np.ndarray | get_float_info (self, str field) |
| np.ndarray | get_uint_info (self, str field) |
| None | set_float_info (self, str field, ArrayLike data) |
| None | set_float_info_npy2d (self, str field, ArrayLike data) |
| None | set_uint_info (self, str field, ArrayLike data) |
| None | save_binary (self, Union[str, os.PathLike] fname, bool silent=True) |
| None | set_label (self, ArrayLike label) |
| None | set_weight (self, ArrayLike weight) |
| None | set_base_margin (self, ArrayLike margin) |
| None | set_group (self, ArrayLike group) |
| np.ndarray | get_label (self) |
| np.ndarray | get_weight (self) |
| np.ndarray | get_base_margin (self) |
| np.ndarray | get_group (self) |
| scipy.sparse.csr_matrix | get_data (self) |
| Tuple[np.ndarray, np.ndarray] | get_quantile_cut (self) |
| int | num_row (self) |
| int | num_col (self) |
| int | num_nonmissing (self) |
| "DMatrix" | slice (self, Union[List[int], np.ndarray] rindex, bool allow_groups=False) |
| Optional[FeatureNames] | feature_names (self) |
| None | feature_names (self, Optional[FeatureNames] feature_names) |
| Optional[FeatureTypes] | feature_types (self) |
| None | feature_types (self, Optional[FeatureTypes] feature_types) |
Data Fields | |
| missing | |
| nthread | |
| silent | |
| handle | |
| feature_names | |
| feature_types | |
Protected Member Functions | |
| None | _init_from_iter (self, DataIter iterator, bool enable_categorical) |
Data Matrix used in XGBoost. DMatrix is the internal data structure used by XGBoost, optimized for both memory efficiency and training speed. A DMatrix can be constructed from several kinds of data sources.
| None xgboost.core.DMatrix.__init__ | ( | self, | |
| DataType | data, | ||
| Optional[ArrayLike] | label = None, |
| *, |
| Optional[ArrayLike] | weight = None, |
| Optional[ArrayLike] | base_margin = None, |
| Optional[float] | missing = None, |
| bool | silent = False, |
| Optional[FeatureNames] | feature_names = None, |
| Optional[FeatureTypes] | feature_types = None, |
| Optional[int] | nthread = None, |
| Optional[ArrayLike] | group = None, |
| Optional[ArrayLike] | qid = None, |
| Optional[ArrayLike] | label_lower_bound = None, |
| Optional[ArrayLike] | label_upper_bound = None, |
| Optional[ArrayLike] | feature_weights = None, |
| bool | enable_categorical = False, |
| DataSplitMode | data_split_mode = DataSplitMode.ROW |
| ) |
Parameters
----------
data :
Data source of DMatrix. See :ref:`py-data` for a list of supported input
types.
label :
Label of the training data.
weight :
Weight for each instance.
.. note::
For ranking tasks, weights are per-group: one weight is assigned to each
group rather than to each data point. Only the relative ordering of data
points within a group matters, so assigning weights to individual data
points is not meaningful.
base_margin :
Base margin used for boosting from existing model.
missing :
Value in the input data that should be treated as a missing value. If
None, defaults to np.nan.
silent :
Whether to print messages during construction.
feature_names :
Set names for features.
feature_types :
Set types for features. When `enable_categorical` is set to `True`, string
"c" represents categorical data type while "q" represents numerical feature
type. For categorical features, the input is assumed to be preprocessed and
encoded by the users. The encoding can be done via
:py:class:`sklearn.preprocessing.OrdinalEncoder` or pandas dataframe
`.cat.codes` method. This is useful when users want to specify categorical
features without having to construct a dataframe as input.
nthread :
Number of threads to use for loading data when parallelization is
applicable. If -1, uses maximum threads available on the system.
group :
Group sizes for all ranking groups.
qid :
Query ID for data samples, used for ranking.
label_lower_bound :
Lower bound for survival training.
label_upper_bound :
Upper bound for survival training.
feature_weights :
Set feature weights for column sampling.
enable_categorical :
.. versionadded:: 1.3.0
.. note:: This parameter is experimental
Experimental support for specialized handling of categorical features. Do not
set to True unless you are interested in development. The JSON/UBJSON
serialization format is also required.
Reimplemented in xgboost.core._ProxyDMatrix, xgboost.core.DeviceQuantileDMatrix, and xgboost.core.QuantileDMatrix.
| Optional[FeatureNames] xgboost.core.DMatrix.feature_names | ( | self | ) |
Labels for features (column labels). Setting it to ``None`` resets existing feature names.
| Optional[FeatureTypes] xgboost.core.DMatrix.feature_types | ( | self | ) |
Type of features (column types). This is for displaying the results and categorical data support. See :py:class:`DMatrix` for details. Setting it to ``None`` resets existing feature types.
| np.ndarray xgboost.core.DMatrix.get_base_margin | ( | self | ) |
Get the base margin of the DMatrix.
Returns
-------
base_margin
| scipy.sparse.csr_matrix xgboost.core.DMatrix.get_data | ( | self | ) |
Get the predictors from DMatrix as a CSR matrix. This getter is mostly for testing purposes. If this is a quantized DMatrix, quantized values are returned instead of the input values.
.. versionadded:: 1.7.0
| np.ndarray xgboost.core.DMatrix.get_float_info | ( | self, | |
| str | field | ||
| ) |
Get float property from the DMatrix.
Parameters
----------
field: str
The field name of the information
Returns
-------
info : array
a numpy array of float information of the data
| np.ndarray xgboost.core.DMatrix.get_group | ( | self | ) |
Get the group of the DMatrix.
Returns
-------
group
| np.ndarray xgboost.core.DMatrix.get_label | ( | self | ) |
Get the label of the DMatrix.
Returns
-------
label : array
| Tuple[np.ndarray, np.ndarray] xgboost.core.DMatrix.get_quantile_cut | ( | self | ) |
Get quantile cuts for quantization.
.. versionadded:: 2.0.0
| np.ndarray xgboost.core.DMatrix.get_uint_info | ( | self, | |
| str | field | ||
| ) |
Get unsigned integer property from the DMatrix.
Parameters
----------
field: str
The field name of the information
Returns
-------
info : array
a numpy array of unsigned integer information of the data
| np.ndarray xgboost.core.DMatrix.get_weight | ( | self | ) |
Get the weight of the DMatrix.
Returns
-------
weight : array
| int xgboost.core.DMatrix.num_col | ( | self | ) |
Get the number of columns (features) in the DMatrix.
| int xgboost.core.DMatrix.num_nonmissing | ( | self | ) |
Get the number of non-missing values in the DMatrix.
.. versionadded:: 1.7.0
| int xgboost.core.DMatrix.num_row | ( | self | ) |
Get the number of rows in the DMatrix.
| None xgboost.core.DMatrix.save_binary | ( | self, | |
| Union[str, os.PathLike] | fname, | ||
| bool | silent = True |
| ) |
Save DMatrix to an XGBoost buffer. The saved binary can later be loaded
by passing its path to :py:func:`xgboost.DMatrix` as input.
Parameters
----------
fname : string or os.PathLike
Name of the output buffer file.
silent : bool (optional; default: True)
If set, the output is suppressed.
| None xgboost.core.DMatrix.set_base_margin | ( | self, | |
| ArrayLike | margin | ||
| ) |
Set base margin of booster to start from.
This can be used to supply the predictions of an existing model as the
base margin. Note that the margin is required, not the transformed
prediction: for logistic regression, for example, pass the value before
the logistic transformation. See also example/demo.py.
Parameters
----------
margin: array like
Prediction margin of each datapoint
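The distinction between a margin and a transformed prediction can be sketched in plain Python. The helper names `logit` and `sigmoid` are illustrative and not part of XGBoost. The sketch assumes a logistic objective, where the margin is the log-odds of the predicted probability.

```python
import math


def logit(p: float) -> float:
    """Map a probability to its margin (log-odds)."""
    return math.log(p / (1.0 - p))


def sigmoid(m: float) -> float:
    """Logistic transformation: map a margin back to a probability."""
    return 1.0 / (1.0 + math.exp(-m))


# Probabilities predicted by some existing model (illustrative values).
probabilities = [0.1, 0.5, 0.9]

# The margins, not the probabilities, are what set_base_margin expects.
margins = [logit(p) for p in probabilities]

# Applying the logistic transformation recovers the probabilities.
recovered = [sigmoid(m) for m in margins]
print(margins, recovered)
```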
| None xgboost.core.DMatrix.set_float_info | ( | self, | |
| str | field, | ||
| ArrayLike | data | ||
| ) |
Set float type property into the DMatrix.
Parameters
----------
field: str
The field name of the information
data: numpy array
The array of data to be set
| None xgboost.core.DMatrix.set_float_info_npy2d | ( | self, | |
| str | field, | ||
| ArrayLike | data | ||
| ) |
Set float type property into the DMatrix for numpy 2D array input.
Parameters
----------
field: str
The field name of the information
data: numpy array
The array of data to be set
| None xgboost.core.DMatrix.set_group | ( | self, | |
| ArrayLike | group | ||
| ) |
Set group size of DMatrix (used for ranking).
Parameters
----------
group : array like
Group size of each group
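To illustrate how group sizes relate to per-row query IDs, the sketch below run-length encodes a list of query IDs into group sizes. The helper `group_sizes_from_qid` is hypothetical and not part of XGBoost. It assumes that rows belonging to the same query are stored contiguously.

```python
from itertools import groupby
from typing import List, Sequence


def group_sizes_from_qid(qid: Sequence[int]) -> List[int]:
    """Count consecutive rows sharing a query ID (hypothetical helper)."""
    return [sum(1 for _ in run) for _, run in groupby(qid)]


# One query ID per row; rows of the same query are adjacent.
qid = [7, 7, 7, 9, 9, 12]

group = group_sizes_from_qid(qid)
print(group)  # [3, 2, 1]
```

The sizes sum to the number of rows, so every row belongs to exactly one group.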
| None xgboost.core.DMatrix.set_info | ( | self, | |
| *, |
| Optional[ArrayLike] | label = None, |
| Optional[ArrayLike] | weight = None, |
| Optional[ArrayLike] | base_margin = None, |
| Optional[ArrayLike] | group = None, |
| Optional[ArrayLike] | qid = None, |
| Optional[ArrayLike] | label_lower_bound = None, |
| Optional[ArrayLike] | label_upper_bound = None, |
| Optional[FeatureNames] | feature_names = None, |
| Optional[FeatureTypes] | feature_types = None, |
| Optional[ArrayLike] | feature_weights = None |
| ) |
Set meta info for DMatrix. See the docstring of :py:obj:`xgboost.DMatrix`.
| None xgboost.core.DMatrix.set_label | ( | self, | |
| ArrayLike | label | ||
| ) |
Set the label of the DMatrix.
Parameters
----------
label: array like
The label information to be set into DMatrix
| None xgboost.core.DMatrix.set_uint_info | ( | self, | |
| str | field, | ||
| ArrayLike | data | ||
| ) |
Set uint type property into the DMatrix.
Parameters
----------
field: str
The field name of the information
data: numpy array
The array of data to be set
| None xgboost.core.DMatrix.set_weight | ( | self, | |
| ArrayLike | weight | ||
| ) |
Set weight of each instance.
Parameters
----------
weight : array like
Weight for each data point
.. note:: For ranking tasks, weights are per-group.
One weight is assigned to each group rather than to each data point.
Only the relative ordering of data points within a group matters, so
assigning weights to individual data points is not meaningful.
| "DMatrix" xgboost.core.DMatrix.slice | ( | self, | |
| Union[List[int], np.ndarray] | rindex, | ||
| bool | allow_groups = False |
| ) |
Slice the DMatrix and return a new DMatrix that contains only the rows selected by `rindex`.
Parameters
----------
rindex
List of indices to be selected.
allow_groups
Allow slicing of a matrix with a groups attribute.
Returns
-------
res
A new DMatrix containing only selected indices.