Medial Code Documentation
xgboost.core.DataIter Class Reference
Inherited by external_memory.Iterator, quantile_data_iterator.IterForDMatrixDemo, xgboost.dask.DaskPartitionIter, xgboost.data.SingleBatchInternalIter, and xgboost.spark.data.PartIter.

Public Member Functions

None __init__ (self, Optional[str] cache_prefix=None, bool release_data=True)
 
Tuple[Callable, Callable] get_callbacks (self, bool allow_host, bool enable_categorical)
 
"_ProxyDMatrix" proxy (self)
 
None reraise (self)
 
None __del__ (self)
 
None reset (self)
 
int next (self, Callable input_data)
 

Data Fields

 cache_prefix
 
 reset
 
 proxy
 

Protected Member Functions

_T _handle_exception (self, Callable fn, _T dft_ret)
 
None _reset_wrapper (self, None this)
 
int _next_wrapper (self, None this)
 

Protected Attributes

 _handle
 
 _enable_categorical
 
 _allow_host
 
 _release
 
 _reset_callback
 
 _next_callback
 
 _next_wrapper
 
 _exception
 
 _temporary_data
 
 _data_ref
 

Detailed Description

The interface for user-defined data iterators. The iterator facilitates
distributed training, :py:class:`QuantileDMatrix`, and external memory support using
:py:class:`DMatrix`. Most of the time, users don't need to interact with this class
directly.

.. note::

    The class caches some intermediate results using the `data` input (predictor
    `X`) as the key. Don't reuse the same `X` across batches that have different
    metadata (like `label`); make a copy if necessary.

Parameters
----------
cache_prefix :
    Prefix for the cache files, only used in external memory.  It can be either a
    URI or a file path.
release_data :
    Whether the iterator should release the data during reset. Set it to True if the
    data transformation (converting data to np.float32 type) is expensive in memory.

Constructor & Destructor Documentation

◆ __init__()

None xgboost.core.DataIter.__init__ (self, Optional[str] cache_prefix=None, bool release_data=True)

Member Function Documentation

◆ _next_wrapper()

int xgboost.core.DataIter._next_wrapper (self, None this)
protected
A wrapper for user defined `next` function.

`this` is not used in Python.  ctypes handles the `self` of a Python
member function automatically when converting it to a C function
pointer.
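
The remark about ctypes and `self` can be illustrated with a small standalone sketch. It does not use xgboost: the `Counter` class and the `(int)(void *)` callback signature are assumptions made for illustration, not the library's actual callback types. It only shows that a bound method can be wrapped as a C function pointer and that the `this` argument can be ignored on the Python side.

```python
import ctypes

# Hypothetical C signature for illustration: int callback(void *this)
CALLBACK = ctypes.CFUNCTYPE(ctypes.c_int, ctypes.c_void_p)


class Counter:
    """Hypothetical class, not part of xgboost."""

    def __init__(self) -> None:
        self.calls = 0
        # Wrapping the bound method: ctypes keeps `self` attached, so the
        # C side only needs to supply `this`. The wrapper is stored on the
        # instance so the function pointer is not garbage collected.
        self._callback = CALLBACK(self._next_wrapper)

    def _next_wrapper(self, this) -> int:
        # `this` is ignored; state lives on `self`.
        self.calls += 1
        return 1 if self.calls < 3 else 0


counter = Counter()
# Calling the wrapper from Python goes through the C function pointer.
results = [counter._callback(None) for _ in range(3)]
```

Keeping a reference to the wrapped callback on the instance matters in practice, because a collected callback leaves the C side holding a dangling pointer.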

◆ _reset_wrapper()

None xgboost.core.DataIter._reset_wrapper (self, None this)
protected
A wrapper for user defined `reset` function.

◆ get_callbacks()

Tuple[Callable, Callable] xgboost.core.DataIter.get_callbacks (self, bool allow_host, bool enable_categorical)
Get callback functions for iterating in C. This is an internal function.

◆ next()

int xgboost.core.DataIter.next (self, Callable input_data)
Set the next batch of data.

Parameters
----------

input_data:
    A function that accepts the same data fields as `xgboost.DMatrix`,
    such as `data` and `label`.

Returns
-------
0 if there are no more batches, otherwise 1.

Reimplemented in external_memory.Iterator, xgboost.dask.DaskPartitionIter, xgboost.data.SingleBatchInternalIter, xgboost.spark.data.PartIter, and quantile_data_iterator.IterForDMatrixDemo.
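
The `next`/`reset` contract described above can be sketched without any dependency on xgboost. In the sketch below, `ListBatchIter` is a stand-in with the same method shapes; a real iterator would presumably derive from `xgboost.core.DataIter` and call `super().__init__()` first. The `drive` function is a hypothetical consumer written only to show the calling order, and is not how the library drives iteration internally.

```python
from typing import Any, Callable, Dict, List


class ListBatchIter:
    """Stand-in with the documented `next`/`reset` shape (not an xgboost class)."""

    def __init__(self, batches: List[Dict[str, Any]]) -> None:
        self._batches = batches
        self._pos = 0

    def next(self, input_data: Callable[..., None]) -> int:
        if self._pos == len(self._batches):
            return 0  # no more batches
        # Hand over the batch using DMatrix-style keyword fields.
        input_data(**self._batches[self._pos])
        self._pos += 1
        return 1

    def reset(self) -> None:
        # Rewind so the consumer can iterate over the batches again.
        self._pos = 0


def drive(it: ListBatchIter, passes: int = 2) -> List[List[Any]]:
    """Hypothetical consumer: reset, then call `next` until it returns 0."""
    seen = []
    for _ in range(passes):
        it.reset()
        labels: List[Any] = []

        def input_data(*, data: Any, label: Any = None, **kwargs: Any) -> None:
            labels.append(label)

        while it.next(input_data) == 1:
            pass
        seen.append(labels)
    return seen


# Each batch uses its own `data` object, in line with the caching note.
batches = [
    {"data": [[1.0, 2.0]], "label": [0]},
    {"data": [[3.0, 4.0]], "label": [1]},
]
label_passes = drive(ListBatchIter(batches))
```

The consumer may iterate over the data more than once, which is why `reset` has to restore the iterator to its first batch.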

◆ proxy()

"_ProxyDMatrix" xgboost.core.DataIter.proxy (   self)
Handle of DMatrix proxy.

◆ reraise()

None xgboost.core.DataIter.reraise (   self)
Reraise the exception thrown during iteration.
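
A minimal sketch of the capture-then-reraise pattern that the signatures of `_handle_exception(fn, dft_ret)` and `reraise()` suggest. The class name `ExceptionHolder` is hypothetical and the body is an assumption about how such a pattern typically works, not a copy of the library's implementation: an exception raised inside a callback is stored, a default value is returned so the error does not cross the C boundary, and the stored exception is raised later from Python.

```python
from typing import Callable, Optional, TypeVar

_T = TypeVar("_T")


class ExceptionHolder:
    """Hypothetical stand-in, not the xgboost class."""

    def __init__(self) -> None:
        self._exception: Optional[Exception] = None

    def _handle_exception(self, fn: Callable[[], _T], dft_ret: _T) -> _T:
        if self._exception is not None:
            # An earlier failure is still pending; do no further work.
            return dft_ret
        try:
            return fn()
        except Exception as e:
            # Store the error and fall back to the default return value.
            self._exception = e
            return dft_ret

    def reraise(self) -> None:
        if self._exception is not None:
            exc, self._exception = self._exception, None
            raise exc


def failing_next() -> int:
    raise ValueError("bad batch")


holder = ExceptionHolder()
status = holder._handle_exception(failing_next, 0)  # error is swallowed here

try:
    holder.reraise()  # and surfaces here
    reraised = None
except ValueError as e:
    reraised = str(e)
```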

◆ reset()

None xgboost.core.DataIter.reset (   self)
Reset the data iterator.  Prototype for user defined function.

Reimplemented in external_memory.Iterator, quantile_data_iterator.IterForDMatrixDemo, and xgboost.dask.DaskPartitionIter.

Field Documentation

◆ reset

xgboost.core.DataIter.reset

The documentation for this class was generated from the following file: