Inheritance diagram for xgboost.core.DMatrix:

Public Member Functions
None	__init__ (self, DataType data, Optional[ArrayLike] label=None, *, Optional[ArrayLike] weight=None, Optional[ArrayLike] base_margin=None, Optional[float] missing=None, bool silent=False, Optional[FeatureNames] feature_names=None, Optional[FeatureTypes] feature_types=None, Optional[int] nthread=None, Optional[ArrayLike] group=None, Optional[ArrayLike] qid=None, Optional[ArrayLike] label_lower_bound=None, Optional[ArrayLike] label_upper_bound=None, Optional[ArrayLike] feature_weights=None, bool enable_categorical=False, DataSplitMode data_split_mode=DataSplitMode.ROW)

None	__del__ (self)

None	set_info (self, *, Optional[ArrayLike] label=None, Optional[ArrayLike] weight=None, Optional[ArrayLike] base_margin=None, Optional[ArrayLike] group=None, Optional[ArrayLike] qid=None, Optional[ArrayLike] label_lower_bound=None, Optional[ArrayLike] label_upper_bound=None, Optional[FeatureNames] feature_names=None, Optional[FeatureTypes] feature_types=None, Optional[ArrayLike] feature_weights=None)

np.ndarray	get_float_info (self, str field)

np.ndarray	get_uint_info (self, str field)

None	set_float_info (self, str field, ArrayLike data)

None	set_float_info_npy2d (self, str field, ArrayLike data)

None	set_uint_info (self, str field, ArrayLike data)

None	save_binary (self, Union[str, os.PathLike] fname, bool silent=True)

None	set_label (self, ArrayLike label)

None	set_weight (self, ArrayLike weight)

None	set_base_margin (self, ArrayLike margin)

None	set_group (self, ArrayLike group)

np.ndarray	get_label (self)

np.ndarray	get_weight (self)

np.ndarray	get_base_margin (self)

np.ndarray	get_group (self)

scipy.sparse.csr_matrix	get_data (self)

Tuple[np.ndarray, np.ndarray]	get_quantile_cut (self)

int	num_row (self)

int	num_col (self)

int	num_nonmissing (self)

"DMatrix"	slice (self, Union[List[int], np.ndarray] rindex, bool allow_groups=False)

Optional[FeatureNames]	feature_names (self)

None	feature_names (self, Optional[FeatureNames] feature_names)

Optional[FeatureTypes]	feature_types (self)

None	feature_types (self, Optional[FeatureTypes] feature_types)

Data Fields
	missing

	nthread

	silent

	handle

	feature_names

	feature_types

Protected Member Functions
None	_init_from_iter (self, DataIter iterator, bool enable_categorical)

Detailed Description

Data Matrix used in XGBoost.

DMatrix is an internal data structure that is used by XGBoost, which is optimized
for both memory efficiency and training speed.  You can construct DMatrix from
multiple different sources of data.

Constructor & Destructor Documentation

◆ __init__()

None xgboost.core.DMatrix.__init__	(		self,
		DataType	data,
		Optional[ArrayLike]	label = `None`,
		*,
		Optional[ArrayLike]	weight = `None`,
		Optional[ArrayLike]	base_margin = `None`,
		Optional[float]	missing = `None`,
		bool	silent = `False`,
		Optional[FeatureNames]	feature_names = `None`,
		Optional[FeatureTypes]	feature_types = `None`,
		Optional[int]	nthread = `None`,
		Optional[ArrayLike]	group = `None`,
		Optional[ArrayLike]	qid = `None`,
		Optional[ArrayLike]	label_lower_bound = `None`,
		Optional[ArrayLike]	label_upper_bound = `None`,
		Optional[ArrayLike]	feature_weights = `None`,
		bool	enable_categorical = `False`,
		DataSplitMode	data_split_mode = `DataSplitMode.ROW`
	)

Parameters
----------
data :
    Data source of DMatrix. See :ref:`py-data` for a list of supported input
    types.
label :
    Label of the training data.
weight :
    Weight for each instance.

    .. note::

        For ranking tasks, weights are per-group: one weight is assigned to
        each group (not each data point). This is because we only care about
        the relative ordering of data points within each group, so it doesn't
        make sense to assign weights to individual data points.

base_margin :
    Base margin used for boosting from existing model.
missing :
    Value in the input data that should be treated as a missing value. If
    None, defaults to np.nan.
silent :
    Whether to print messages during construction.
feature_names :
    Set names for features.
feature_types :

    Set types for features.  When `enable_categorical` is set to `True`, string
    "c" represents categorical data type while "q" represents numerical feature
    type. For categorical features, the input is assumed to be preprocessed and
    encoded by the users. The encoding can be done via
    :py:class:`sklearn.preprocessing.OrdinalEncoder` or pandas dataframe
    `.cat.codes` method. This is useful when users want to specify categorical
    features without having to construct a dataframe as input.

nthread :
    Number of threads to use for loading data when parallelization is
    applicable. If -1, uses maximum threads available on the system.
group :
    Group sizes for all ranking groups.
qid :
    Query ID for data samples, used for ranking.
label_lower_bound :
    Lower bound for survival training.
label_upper_bound :
    Upper bound for survival training.
feature_weights :
    Set feature weights for column sampling.
enable_categorical :

    .. versionadded:: 1.3.0

    .. note:: This parameter is experimental

    Experimental support of specializing for categorical features.  Do not set
    to True unless you are interested in development. Also, JSON/UBJSON
    serialization format is required.

Reimplemented in xgboost.core._ProxyDMatrix, xgboost.core.DeviceQuantileDMatrix, and xgboost.core.QuantileDMatrix.

Member Function Documentation

◆ feature_names()

Optional[FeatureNames] xgboost.core.DMatrix.feature_names ( self )

Labels for features (column labels).

Setting it to ``None`` resets existing feature names.

◆ feature_types()

Optional[FeatureTypes] xgboost.core.DMatrix.feature_types ( self )

Type of features (column types).

This is for displaying the results and categorical data support. See
:py:class:`DMatrix` for details.

Setting it to ``None`` resets existing feature types.

◆ get_base_margin()

np.ndarray xgboost.core.DMatrix.get_base_margin ( self )

Get the base margin of the DMatrix.

Returns
-------
base_margin

◆ get_data()

scipy.sparse.csr_matrix xgboost.core.DMatrix.get_data ( self )

Get the predictors from DMatrix as a CSR matrix. This getter is mostly for
testing purposes. If this is a quantized DMatrix then quantized values are
returned instead of input values.

.. versionadded:: 1.7.0

◆ get_float_info()

np.ndarray xgboost.core.DMatrix.get_float_info	(		self,
		str	field
	)

Get float property from the DMatrix.

Parameters
----------
field: str
    The field name of the information

Returns
-------
info : array
    a numpy array of float information of the data

◆ get_group()

np.ndarray xgboost.core.DMatrix.get_group ( self )

Get the group of the DMatrix.

Returns
-------
group

◆ get_label()

np.ndarray xgboost.core.DMatrix.get_label ( self )

Get the label of the DMatrix.

Returns
-------
label : array

◆ get_quantile_cut()

Tuple[np.ndarray, np.ndarray] xgboost.core.DMatrix.get_quantile_cut ( self )

Get quantile cuts for quantization.

.. versionadded:: 2.0.0

◆ get_uint_info()

np.ndarray xgboost.core.DMatrix.get_uint_info	(		self,
		str	field
	)

Get unsigned integer property from the DMatrix.

Parameters
----------
field: str
    The field name of the information

Returns
-------
info : array
    a numpy array of unsigned integer information of the data

◆ get_weight()

np.ndarray xgboost.core.DMatrix.get_weight ( self )

Get the weight of the DMatrix.

Returns
-------
weight : array

◆ num_col()

int xgboost.core.DMatrix.num_col ( self )

Get the number of columns (features) in the DMatrix.

◆ num_nonmissing()

int xgboost.core.DMatrix.num_nonmissing ( self )

Get the number of non-missing values in the DMatrix.

.. versionadded:: 1.7.0

◆ num_row()

int xgboost.core.DMatrix.num_row ( self )

Get the number of rows in the DMatrix.

◆ save_binary()

None xgboost.core.DMatrix.save_binary	(		self,
		Union[str, os.PathLike]	fname,
		bool	silent = `True`
	)

Save DMatrix to an XGBoost buffer.  Saved binary can be later loaded
by providing the path to :py:func:`xgboost.DMatrix` as input.

Parameters
----------
fname : string or os.PathLike
    Name of the output buffer file.
silent : bool (optional; default: True)
    If set, the output is suppressed.

◆ set_base_margin()

None xgboost.core.DMatrix.set_base_margin	(		self,
		ArrayLike	margin
	)

Set base margin of booster to start from.

This can be used to supply the predictions of an existing model as the
base margin. Note that the margin is required, not the transformed
prediction. For example, for logistic regression, supply the value before
the logistic transformation. See also example/demo.py.

Parameters
----------
margin: array like
    Prediction margin of each datapoint
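
To make the distinction between a margin and a transformed prediction concrete, the sketch below converts probabilities from a hypothetical binary classifier back to raw scores (log-odds). The helper names are invented for this example and are not part of XGBoost; the resulting list is the kind of value one would pass as `margin`.

```python
import math


def probability_to_margin(p: float) -> float:
    """Inverse of the logistic function: the raw score (log-odds) behind p."""
    return math.log(p / (1.0 - p))


def margin_to_probability(m: float) -> float:
    """Logistic transformation applied to a raw score."""
    return 1.0 / (1.0 + math.exp(-m))


# Hypothetical predicted probabilities from an existing binary classifier.
probabilities = [0.2, 0.5, 0.9]

# The untransformed scores, not the probabilities, are what a base margin holds.
margins = [probability_to_margin(p) for p in probabilities]
```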

◆ set_float_info()

None xgboost.core.DMatrix.set_float_info	(		self,
		str	field,
		ArrayLike	data
	)

Set float type property into the DMatrix.

Parameters
----------
field: str
    The field name of the information

data: numpy array
    The array of data to be set

◆ set_float_info_npy2d()

None xgboost.core.DMatrix.set_float_info_npy2d	(		self,
		str	field,
		ArrayLike	data
	)

Set float type property into the DMatrix for numpy 2d array input.

Parameters
----------
field: str
    The field name of the information

data: numpy array
    The array of data to be set

◆ set_group()

None xgboost.core.DMatrix.set_group	(		self,
		ArrayLike	group
	)

Set group size of DMatrix (used for ranking).

Parameters
----------
group : array like
    Group size of each group
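
Group sizes can be thought of as the run lengths of per-row query IDs. The sketch below derives them with the standard library only, assuming rows are already ordered so that samples of the same query are contiguous. The helper `group_sizes_from_qid` is hypothetical and not part of XGBoost.

```python
from itertools import groupby


def group_sizes_from_qid(qid):
    """Return the length of each run of equal, consecutive query IDs."""
    return [sum(1 for _ in run) for _, run in groupby(qid)]


# Hypothetical query IDs for six rows, already sorted by query.
qid = [7, 7, 7, 9, 9, 12]

# One entry per group; the sizes add up to the number of rows.
group = group_sizes_from_qid(qid)
```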

◆ set_info()

None xgboost.core.DMatrix.set_info	(		self,
		*,
		Optional[ArrayLike]	label = `None`,
		Optional[ArrayLike]	weight = `None`,
		Optional[ArrayLike]	base_margin = `None`,
		Optional[ArrayLike]	group = `None`,
		Optional[ArrayLike]	qid = `None`,
		Optional[ArrayLike]	label_lower_bound = `None`,
		Optional[ArrayLike]	label_upper_bound = `None`,
		Optional[FeatureNames]	feature_names = `None`,
		Optional[FeatureTypes]	feature_types = `None`,
		Optional[ArrayLike]	feature_weights = `None`
	)

Set meta info for DMatrix.  See doc string for :py:obj:`xgboost.DMatrix`.

◆ set_label()

None xgboost.core.DMatrix.set_label	(		self,
		ArrayLike	label
	)

Set the label of the DMatrix.

Parameters
----------
label: array like
    The label information to be set into DMatrix

◆ set_uint_info()

None xgboost.core.DMatrix.set_uint_info	(		self,
		str	field,
		ArrayLike	data
	)

Set uint type property into the DMatrix.

Parameters
----------
field: str
    The field name of the information

data: numpy array
    The array of data to be set

◆ set_weight()

None xgboost.core.DMatrix.set_weight	(		self,
		ArrayLike	weight
	)

Set weight of each instance.

Parameters
----------
weight : array like
    Weight for each data point

    .. note:: For ranking tasks, weights are per-group.

        In ranking tasks, one weight is assigned to each group (not each
        data point). This is because we only care about the relative
        ordering of data points within each group, so it doesn't make
        sense to assign weights to individual data points.

◆ slice()

"DMatrix" xgboost.core.DMatrix.slice	(		self,
		Union[List[int], np.ndarray]	rindex,
		bool	allow_groups = `False`
	)

Slice the DMatrix and return a new DMatrix that only contains `rindex`.

Parameters
----------
rindex
    List of indices to be selected.
allow_groups
    Allow slicing of a matrix with a groups attribute.

Returns
-------
res
    A new DMatrix containing only selected indices.

The documentation for this class was generated from the following file:

External/xgboost/python-package/xgboost/core.py
