Medial Code Documentation
Public Member Functions
| None | __init__ (self, Optional[BoosterParam] params=None, Optional[Sequence[DMatrix]] cache=None, Optional[Union["Booster", bytearray, os.PathLike, str]] model_file=None) |
| None | __del__ (self) |
| Dict | __getstate__ (self) |
| None | __setstate__ (self, Dict state) |
| "Booster" | __getitem__ (self, Union[int, tuple, slice] val) |
| Generator["Booster", None, None] | __iter__ (self) |
| str | save_config (self) |
| None | load_config (self, str config) |
| "Booster" | __copy__ (self) |
| "Booster" | __deepcopy__ (self, Any _) |
| "Booster" | copy (self) |
| Optional[str] | attr (self, str key) |
| Dict[str, Optional[str]] | attributes (self) |
| None | set_attr (self, **Optional[Any] kwargs) |
| Optional[FeatureTypes] | feature_types (self) |
| None | feature_types (self, Optional[FeatureTypes] features) |
| Optional[FeatureNames] | feature_names (self) |
| None | feature_names (self, Optional[FeatureNames] features) |
| None | set_param (self, Union[Dict, Iterable[Tuple[str, Any]], str] params, Optional[str] value=None) |
| None | update (self, DMatrix dtrain, int iteration, Optional[Objective] fobj=None) |
| None | boost (self, DMatrix dtrain, np.ndarray grad, np.ndarray hess) |
| str | eval_set (self, Sequence[Tuple[DMatrix, str]] evals, int iteration=0, Optional[Metric] feval=None, bool output_margin=True) |
| str | eval (self, DMatrix data, str name="eval", int iteration=0) |
| np.ndarray | predict (self, DMatrix data, bool output_margin=False, bool pred_leaf=False, bool pred_contribs=False, bool approx_contribs=False, bool pred_interactions=False, bool validate_features=True, bool training=False, Tuple[int, int] iteration_range=(0, 0), bool strict_shape=False) |
| NumpyOrCupy | inplace_predict (self, DataType data, Tuple[int, int] iteration_range=(0, 0), str predict_type="value", float missing=np.nan, bool validate_features=True, Any base_margin=None, bool strict_shape=False) |
| None | save_model (self, Union[str, os.PathLike] fname) |
| bytearray | save_raw (self, str raw_format="deprecated") |
| None | load_model (self, ModelIn fname) |
| int | best_iteration (self) |
| None | best_iteration (self, int iteration) |
| float | best_score (self) |
| None | best_score (self, int score) |
| int | num_boosted_rounds (self) |
| int | num_features (self) |
| None | dump_model (self, Union[str, os.PathLike] fout, Union[str, os.PathLike] fmap="", bool with_stats=False, str dump_format="text") |
| List[str] | get_dump (self, Union[str, os.PathLike] fmap="", bool with_stats=False, str dump_format="text") |
| Dict[str, Union[float, List[float]]] | get_fscore (self, Union[str, os.PathLike] fmap="") |
| Dict[str, Union[float, List[float]]] | get_score (self, Union[str, os.PathLike] fmap="", str importance_type="weight") |
| DataFrame | trees_to_dataframe (self, Union[str, os.PathLike] fmap="") |
| Union[np.ndarray, DataFrame] | get_split_value_histogram (self, str feature, Union[os.PathLike, str] fmap="", Optional[int] bins=None, bool as_pandas=True) |
Data Fields
| handle | |
| feature_names | |
| feature_types | |
Protected Member Functions
| Union[Tuple[int,...], str] | _transform_monotone_constrains (self, Union[Dict[str, int], str, Tuple[int,...]] value) |
| Union[str, List[List[int]]] | _transform_interaction_constraints (self, Union[Sequence[Sequence[str]], str] value) |
| BoosterParam | _configure_constraints (self, BoosterParam params) |
| Optional[FeatureInfo] | _get_feature_info (self, str field) |
| None | _set_feature_info (self, Optional[FeatureInfo] features, str field) |
| None | _assign_dmatrix_features (self, DMatrix data) |
| None | _validate_features (self, Optional[FeatureNames] feature_names) |
A Booster of XGBoost. Booster is the model of XGBoost, which contains low-level routines for training, prediction and evaluation.
None xgboost.core.Booster.__init__(self, Optional[BoosterParam] params=None, Optional[Sequence[DMatrix]] cache=None, Optional[Union["Booster", bytearray, os.PathLike, str]] model_file=None)
Parameters
----------
params :
Parameters for boosters.
cache :
List of cache items.
model_file :
Path to the model file if it's a string or PathLike.
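A minimal sketch of obtaining a Booster, using randomly generated data purely for illustration; most users get a trained Booster from :py:func:`xgboost.train`, while constructing one directly is mainly useful for reloading a saved model. The file name below is hypothetical.
.. code-block:: python

    import numpy as np
    import xgboost as xgb

    # Illustrative random data; any numeric feature matrix and label vector works.
    X = np.random.rand(100, 4)
    y = np.random.randint(2, size=100)
    dtrain = xgb.DMatrix(X, label=y)

    # The usual way to obtain a trained Booster.
    booster = xgb.train({"objective": "binary:logistic", "max_depth": 3},
                        dtrain, num_boost_round=10)

    # Constructing a Booster directly, e.g. to reload a saved model.
    booster.save_model("model.json")
    reloaded = xgb.Booster(model_file="model.json")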
| "Booster" xgboost.core.Booster.__deepcopy__ | ( | self, | |
| Any | _ | ||
| ) |
Return a copy of booster.
| "Booster" xgboost.core.Booster.__getitem__ | ( | self, | |
| Union[int, tuple, slice] | val | ||
| ) |
Get a slice of the tree-based model.
.. versionadded:: 1.3.0
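A minimal sketch of model slicing, assuming `booster` is the 10-round tree model from the constructor example above:
.. code-block:: python

    # Keep only the first five boosting rounds.
    first_five = booster[:5]
    assert first_five.num_boosted_rounds() == 5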
| Generator["Booster", None, None] xgboost.core.Booster.__iter__ | ( | self | ) |
Iterator method for getting individual trees. .. versionadded:: 2.0.0
Optional[str] xgboost.core.Booster.attr(self, str key)
Get attribute string from the Booster.
Parameters
----------
key :
The key to get attribute from.
Returns
-------
value :
The attribute value of the key; returns None if the attribute does not exist.
Dict[str, Optional[str]] xgboost.core.Booster.attributes(self)
Get attributes stored in the Booster as a dictionary.
Returns
-------
result : dictionary of attribute_name: attribute_value pairs of strings.
Returns an empty dict if there are no attributes.
int xgboost.core.Booster.best_iteration(self)
The best iteration during training.
float xgboost.core.Booster.best_score(self)
The best evaluation score during training.
None xgboost.core.Booster.boost(self, DMatrix dtrain, np.ndarray grad, np.ndarray hess)
Boost the booster for one iteration, with customized gradient
statistics. Like :py:func:`xgboost.Booster.update`, this
function should not be called directly by users.
Parameters
----------
dtrain :
The training DMatrix.
grad :
The first order gradients.
hess :
The second order gradients.
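For illustration only (normal training passes a custom objective to :py:func:`xgboost.train`), a sketch of supplying gradient statistics by hand; the gradient and Hessian formulas assume a plain squared-error objective and the random data is hypothetical.
.. code-block:: python

    import numpy as np
    import xgboost as xgb

    X = np.random.rand(100, 4)
    y = np.random.rand(100)
    dtrain = xgb.DMatrix(X, label=y)
    booster = xgb.Booster(params={"max_depth": 3}, cache=[dtrain])

    # Squared error: grad = prediction - label, hess = 1.
    pred = booster.predict(dtrain, output_margin=True)
    grad = pred - y
    hess = np.ones_like(grad)
    booster.boost(dtrain, grad, hess)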
| "Booster" xgboost.core.Booster.copy | ( | self | ) |
Copy the booster object.
Returns
-------
booster :
A copied booster model
None xgboost.core.Booster.dump_model(self, Union[str, os.PathLike] fout, Union[str, os.PathLike] fmap="", bool with_stats=False, str dump_format="text")
Dump model into a text or JSON file. Unlike :py:meth:`save_model`, the
output format is primarily used for visualization or interpretation,
hence it's more human readable but cannot be loaded back to XGBoost.
Parameters
----------
fout :
Output file name.
fmap :
Name of the file containing feature map names.
with_stats :
Controls whether the split statistics are output.
dump_format :
Format of model dump file. Can be 'text' or 'json'.
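A minimal sketch, assuming `booster` is a trained tree model; the output file names are arbitrary.
.. code-block:: python

    # Human-readable text dump including split statistics.
    booster.dump_model("dump.txt", with_stats=True)

    # JSON dump for programmatic inspection.
    booster.dump_model("dump.json", dump_format="json")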
str xgboost.core.Booster.eval(self, DMatrix data, str name="eval", int iteration=0)
Evaluate the model on the given data.
Parameters
----------
data :
The dmatrix storing the input.
name :
The name of the dataset.
iteration :
The current iteration number.
Returns
-------
result: str
Evaluation result string.
str xgboost.core.Booster.eval_set(self, Sequence[Tuple[DMatrix, str]] evals, int iteration=0, Optional[Metric] feval=None, bool output_margin=True)
Evaluate a set of data.
Parameters
----------
evals :
List of items to be evaluated.
iteration :
Current iteration.
feval :
Custom evaluation function.
Returns
-------
result: str
Evaluation result string.
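A minimal sketch, assuming `dtrain` and a validation DMatrix `dvalid` built from held-out data; the metric names appearing in the returned string depend on the configured objective.
.. code-block:: python

    last = booster.num_boosted_rounds() - 1
    result = booster.eval_set([(dtrain, "train"), (dvalid, "validation")],
                              iteration=last)
    print(result)  # e.g. "[9]\ttrain-logloss:0.41\tvalidation-logloss:0.47"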
Optional[FeatureNames] xgboost.core.Booster.feature_names(self)
Feature names for this booster. Can be directly set by input data or by assignment.
Optional[FeatureTypes] xgboost.core.Booster.feature_types(self)
Feature types for this booster. Can be directly set by input data or by assignment. See :py:class:`DMatrix` for details.
List[str] xgboost.core.Booster.get_dump(self, Union[str, os.PathLike] fmap="", bool with_stats=False, str dump_format="text")
Returns the model dump as a list of strings. Unlike :py:meth:`save_model`, the output
format is primarily used for visualization or interpretation, hence it's more
human readable but cannot be loaded back to XGBoost.
Parameters
----------
fmap :
Name of the file containing feature map names.
with_stats :
Controls whether the split statistics are output.
dump_format :
Format of model dump. Can be 'text', 'json' or 'dot'.
Dict[str, Union[float, List[float]]] xgboost.core.Booster.get_fscore(self, Union[str, os.PathLike] fmap="")
Get feature importance of each feature.
.. note:: Zero-importance features will not be included
Keep in mind that this function does not include zero-importance features, i.e.
those features that have not been used in any split conditions.
Parameters
----------
fmap :
The name of feature map file.
Dict[str, Union[float, List[float]]] xgboost.core.Booster.get_score(self, Union[str, os.PathLike] fmap="", str importance_type="weight")
Get feature importance of each feature.
For a tree model, the importance type can be defined as:
* 'weight': the number of times a feature is used to split the data across all trees.
* 'gain': the average gain across all splits the feature is used in.
* 'cover': the average coverage across all splits the feature is used in.
* 'total_gain': the total gain across all splits the feature is used in.
* 'total_cover': the total coverage across all splits the feature is used in.
.. note::
For a linear model, only "weight" is defined, and it is the normalized coefficients
without bias.
.. note:: Zero-importance features will not be included
Keep in mind that this function does not include zero-importance features, i.e.
those features that have not been used in any split conditions.
Parameters
----------
fmap :
The name of feature map file.
importance_type :
One of the importance types defined above.
Returns
-------
A map between feature names and their scores. When `gblinear` is used for
multi-class classification, the score for each feature is a list of length
`n_classes`; otherwise the scores are scalars.
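A minimal sketch, assuming `booster` is a trained tree model with named features:
.. code-block:: python

    importance = booster.get_score(importance_type="total_gain")
    for name, score in sorted(importance.items(), key=lambda kv: kv[1], reverse=True):
        print(name, score)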
Union[np.ndarray, DataFrame] xgboost.core.Booster.get_split_value_histogram(self, str feature, Union[os.PathLike, str] fmap="", Optional[int] bins=None, bool as_pandas=True)
Get the split value histogram of a feature.
Parameters
----------
feature :
The name of the feature.
fmap :
The name of feature map file.
bins :
The maximum number of bins.
Number of bins equals number of unique split values n_unique,
if bins == None or bins > n_unique.
as_pandas :
Return pd.DataFrame when pandas is installed.
If False or pandas is not installed, return numpy ndarray.
Returns
-------
A histogram of used splitting values for the specified feature, either as a
numpy array or a pandas DataFrame.
NumpyOrCupy xgboost.core.Booster.inplace_predict(self, DataType data, Tuple[int, int] iteration_range=(0, 0), str predict_type="value", float missing=np.nan, bool validate_features=True, Any base_margin=None, bool strict_shape=False)
Run prediction in-place when possible. Unlike the :py:meth:`predict` method,
inplace prediction does not cache the prediction result.
Calling only ``inplace_predict`` in multiple threads is safe and lock
free. But the safety does not hold when used in conjunction with other
methods. E.g. you can't train the booster in one thread and perform
prediction in the other.
.. note::
If the device ordinal of the input data doesn't match the one configured for
the booster, data will be copied to the booster device.
.. code-block:: python

    booster.set_param({"device": "cuda:0"})
    booster.inplace_predict(cupy_array)
    booster.set_param({"device": "cpu"})
    booster.inplace_predict(numpy_array)
.. versionadded:: 1.1.0
Parameters
----------
data :
The input data.
iteration_range :
See :py:meth:`predict` for details.
predict_type :
* `value` Output model prediction values.
* `margin` Output the raw untransformed margin value.
missing :
See :py:obj:`xgboost.DMatrix` for details.
validate_features:
See :py:meth:`xgboost.Booster.predict` for details.
base_margin:
See :py:obj:`xgboost.DMatrix` for details.
.. versionadded:: 1.4.0
strict_shape:
See :py:meth:`xgboost.Booster.predict` for details.
.. versionadded:: 1.4.0
Returns
-------
prediction : numpy.ndarray/cupy.ndarray
The prediction result. When input data is on GPU, prediction result is
stored in a cupy array.
None xgboost.core.Booster.load_config(self, str config)
Load configuration returned by `save_config`.
.. versionadded:: 1.0.0
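A minimal sketch of round-tripping the configuration, assuming `booster` is any trained Booster:
.. code-block:: python

    import json

    config = booster.save_config()   # JSON string with the full parameter configuration
    parsed = json.loads(config)      # can be inspected or edited like a normal dict
    booster.load_config(config)      # restore the configuration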
None xgboost.core.Booster.load_model(self, ModelIn fname)
Load the model from a file or bytearray. The path to the file can be local or a
URI.
The model is loaded from XGBoost format which is universal among the various
XGBoost interfaces. Auxiliary attributes of the Python Booster object (such as
feature_names) will not be loaded when using binary format. To save those
attributes, use JSON/UBJ instead. See :doc:`Model IO </tutorials/saving_model>`
for more info.
.. code-block:: python

    model.load_model("model.json")
    # or
    model.load_model("model.ubj")
Parameters
----------
fname :
Input file name or memory buffer (see also :py:meth:`save_raw`).
int xgboost.core.Booster.num_boosted_rounds(self)
Get number of boosted rounds. For gblinear this is reset to 0 after serializing the model.
int xgboost.core.Booster.num_features(self)
Number of features in booster.
np.ndarray xgboost.core.Booster.predict(self, DMatrix data, bool output_margin=False, bool pred_leaf=False, bool pred_contribs=False, bool approx_contribs=False, bool pred_interactions=False, bool validate_features=True, bool training=False, Tuple[int, int] iteration_range=(0, 0), bool strict_shape=False)
Predict with data. The full model will be used unless `iteration_range` is specified,
meaning users have to either slice the model or use the ``best_iteration``
attribute to get predictions from the best model returned by early stopping.
.. note::
See :doc:`Prediction </prediction>` for issues like thread safety and a
summary of outputs from this function.
Parameters
----------
data :
The dmatrix storing the input.
output_margin :
Whether to output the raw untransformed margin value.
pred_leaf :
When this option is on, the output will be a matrix of (nsample,
ntrees) with each record indicating the predicted leaf index of
each sample in each tree. Note that the leaf index of a tree is
unique per tree, so you may find leaf 1 in both tree 1 and tree 0.
pred_contribs :
When this is True the output will be a matrix of size (nsample,
nfeats + 1) with each record indicating the feature contributions
(SHAP values) for that prediction. The sum of all feature
contributions is equal to the raw untransformed margin value of the
prediction. Note the final column is the bias term.
approx_contribs :
Approximate the contributions of each feature. Used when ``pred_contribs`` or
``pred_interactions`` is set to True. Changing the default of this parameter
(False) is not recommended.
pred_interactions :
When this is True the output will be a matrix of size (nsample,
nfeats + 1, nfeats + 1) indicating the SHAP interaction values for
each pair of features. The sum of each row (or column) of the
interaction values equals the corresponding SHAP value (from
pred_contribs), and the sum of the entire matrix equals the raw
untransformed margin value of the prediction. Note the last row and
column correspond to the bias term.
validate_features :
When this is True, validate that the Booster's and data's
feature_names are identical. Otherwise, it is assumed that the
feature_names are the same.
training :
Whether the prediction value is used for training. This can affect the `dart`
booster, which performs dropouts during training iterations but uses all trees
for inference. If you want to obtain results with dropouts, set this parameter
to `True`. Also, the parameter is set to true when obtaining predictions for a
custom objective function.
.. versionadded:: 1.0.0
iteration_range :
Specifies which layer of trees is used in prediction. For example, if a
random forest is trained with 100 rounds, specifying `iteration_range=(10,
20)` means only the forests built during rounds [10, 20) (half open set) are
used in this prediction.
.. versionadded:: 1.4.0
strict_shape :
When set to True, output shape is invariant to whether classification is used.
For both value and margin prediction, the output shape is (n_samples,
n_groups), where n_groups == 1 when multi-class is not used. Defaults to False, in
which case the output shape can be (n_samples, ) if multi-class is not used.
.. versionadded:: 1.4.0
Returns
-------
prediction : numpy array
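A minimal sketch, assuming `booster` was trained with early stopping and `dtest` is a DMatrix carrying the same features:
.. code-block:: python

    # Predict using only the trees up to and including the best iteration.
    best = booster.best_iteration
    preds = booster.predict(dtest, iteration_range=(0, best + 1))

    # SHAP-style feature contributions; the last column is the bias term.
    contribs = booster.predict(dtest, pred_contribs=True)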
str xgboost.core.Booster.save_config(self)
Output internal parameter configuration of Booster as a JSON string.
.. versionadded:: 1.0.0
None xgboost.core.Booster.save_model(self, Union[str, os.PathLike] fname)
Save the model to a file.
The model is saved in an XGBoost internal format which is universal among the
various XGBoost interfaces. Auxiliary attributes of the Python Booster object
(such as feature_names) will not be saved when using binary format. To save
those attributes, use JSON/UBJ instead. See :doc:`Model IO
</tutorials/saving_model>` for more info.
.. code-block:: python

    model.save_model("model.json")
    # or
    model.save_model("model.ubj")
Parameters
----------
fname :
Output file name
bytearray xgboost.core.Booster.save_raw(self, str raw_format="deprecated")
Save the model to an in-memory buffer representation instead of a file.
Parameters
----------
raw_format :
Format of output buffer. Can be `json`, `ubj` or `deprecated`. Right now
the default is `deprecated` but it will be changed to `ubj` (universal binary
JSON) in the future.
Returns
-------
An in-memory buffer representation of the model.
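A minimal sketch of an in-memory round trip, assuming `xgboost` is imported as `xgb` and `booster` is a trained model:
.. code-block:: python

    raw = booster.save_raw(raw_format="ubj")  # bytearray in UBJSON format
    restored = xgb.Booster(model_file=raw)    # reload from the in-memory buffer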
None xgboost.core.Booster.set_attr(self, **Optional[Any] kwargs)
Set the attribute of the Booster.
Parameters
----------
**kwargs
The attributes to set. Setting a value to None deletes an attribute.
None xgboost.core.Booster.set_param(self, Union[Dict, Iterable[Tuple[str, Any]], str] params, Optional[str] value=None)
Set parameters into the Booster.
Parameters
----------
params :
List of key/value pairs, a dict mapping keys to values, or simply a string key.
value :
The value of the specified parameter; used when params is a single string key.
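A minimal sketch showing both calling conventions; the parameter values are illustrative only:
.. code-block:: python

    # Dict of key/value pairs.
    booster.set_param({"device": "cpu", "nthread": 4})

    # Single string key plus value.
    booster.set_param("max_depth", "4")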
DataFrame xgboost.core.Booster.trees_to_dataframe(self, Union[str, os.PathLike] fmap="")
Parse a boosted tree model text dump into a pandas DataFrame structure.
This feature is only defined when the decision tree model is chosen as base
learner (`booster in {gbtree, dart}`). It is not defined for other base learner
types, such as linear learners (`booster=gblinear`).
Parameters
----------
fmap :
The name of feature map file.
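A minimal sketch, assuming pandas is installed and `booster` uses a tree base learner:
.. code-block:: python

    df = booster.trees_to_dataframe()
    print(df.head())   # one row per node across all trees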
None xgboost.core.Booster.update(self, DMatrix dtrain, int iteration, Optional[Objective] fobj=None)
Update for one iteration, with objective function calculated
internally. This function should not be called directly by users.
Parameters
----------
dtrain :
Training data.
iteration :
Current iteration number.
fobj :
Customized objective function.