Medial Code Documentation
logistic_regression Namespace Reference

Functions

 log_loss (preds, labels)
 Set up a couple of utilities for our experiments.
 
 experiment (objective, label_type, data)
 

Variables

int N = 1000
 Simulate some binary data with a single categorical and single continuous predictor.
 
 X
 
list CATEGORICAL_EFFECTS = [-1, -1, -2, -2, 2]
 
 LINEAR_TERM
 
 TRUE_PROB = expit(LINEAR_TERM)
 
 Y = np.random.binomial(1, TRUE_PROB, size=N)
 
dict DATA
 
int K = 10
 
list A
 
list B
 

Detailed Description

Comparison of `binary` and `xentropy` objectives.

BLUF: The `xentropy` objective does logistic regression and generalizes
to the case where labels are probabilistic (i.e. numbers between 0 and 1).

Details: Both `binary` and `xentropy` minimize the log loss and use
`boost_from_average = TRUE` by default. With default settings, possibly
the only difference between them is that `binary` may be slightly faster,
because it assumes that the labels are binary rather than probabilistic.
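
The claim that `xentropy` generalizes logistic regression can be made concrete with a short worked equation. This assumes the standard definition of log loss, which this page does not spell out; the symbols y (label), p (predicted probability), and q (true probability) are introduced here for illustration only.

```latex
% Per-observation log loss for a label y in [0, 1] and a prediction p in (0, 1):
\ell(y, p) = -\bigl[\, y \log p + (1 - y) \log(1 - p) \,\bigr]

% With a binary label this reduces to the usual logistic-regression loss:
\ell(1, p) = -\log p, \qquad \ell(0, p) = -\log(1 - p)

% If Y ~ Bernoulli(q), linearity of expectation gives
\mathbb{E}\bigl[\ell(Y, p)\bigr]
  = -\bigl[\, q \log p + (1 - q) \log(1 - p) \,\bigr]
  = \ell(q, p)
```

Under that assumption, training on a probabilistic label q targets the same loss, in expectation, as training on binary draws with success probability q.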

Function Documentation

◆ experiment()

logistic_regression.experiment(objective, label_type, data)
Measure performance of an objective.

Parameters
----------
objective : string 'binary' or 'xentropy'
    Objective function.
label_type : string 'binary' or 'probability'
    Type of the label.
data : dict
    Data for training.

Returns
-------
result : dict
    Experiment summary stats.

◆ log_loss()

logistic_regression.log_loss(preds, labels)

Set up a couple of utilities for our experiments.

Logarithmic loss with labels that are not necessarily binary (i.e. labels that may be probabilities).
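
The body of `log_loss` is not shown on this page, so the following is only a minimal sketch of what such a function could look like, assuming the standard two-term cross-entropy averaged over observations. It uses plain Python lists rather than NumPy arrays to stay self-contained, and the `eps` clipping parameter is an assumption added here, not part of the documented signature.

```python
import math


def log_loss(preds, labels, eps=1e-15):
    """Mean cross-entropy between predicted probabilities and labels in [0, 1].

    Sketch only: `eps` is a hypothetical clipping tolerance, not taken from
    the documented function.
    """
    if len(preds) != len(labels):
        raise ValueError("preds and labels must have the same length")
    if len(preds) == 0:
        raise ValueError("preds must not be empty")
    total = 0.0
    for p, y in zip(preds, labels):
        # Clip the prediction away from 0 and 1 so that log() stays finite.
        p = min(max(p, eps), 1.0 - eps)
        total -= y * math.log(p) + (1.0 - y) * math.log(1.0 - p)
    return total / len(preds)


# Binary labels and probabilistic labels go through the same code path.
binary_case = log_loss([0.9, 0.2], [1, 0])
probabilistic_case = log_loss([0.9, 0.2], [0.8, 0.3])
```

Because a binary label is just the special case y = 0 or y = 1, no separate branch is needed for the two label types.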

Variable Documentation

◆ A

list logistic_regression.A
Initial value:
= [experiment('binary', label_type='binary', data=DATA)['time']
   for k in range(K)]

◆ B

list logistic_regression.B
Initial value:
= [experiment('xentropy', label_type='binary', data=DATA)['time']
   for k in range(K)]
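
`A` and `B` each hold `K` timings for the same task under the two objectives. How they are compared is not shown on this page; one plausible approach, sketched below, is to repeat a call several times and compare the best (minimum) time of each list, which is less sensitive to one-off slowdowns than the mean. The helper names `time_repeated` and `summarize_timings` are hypothetical and do not belong to the documented module.

```python
import time


def time_repeated(fn, k):
    """Call fn() k times and return the list of wall-clock durations in seconds."""
    timings = []
    for _ in range(k):
        start = time.perf_counter()
        fn()
        timings.append(time.perf_counter() - start)
    return timings


def summarize_timings(a, b):
    """Compare two timing lists by their best (minimum) observed duration."""
    best_a, best_b = min(a), min(b)
    return {"best_a": best_a, "best_b": best_b, "ratio": best_b / best_a}


# Hypothetical timings, in seconds, standing in for A and B.
summary = summarize_timings([0.5, 0.25], [1.0, 0.5])
```

A `ratio` above 1 would mean the second objective's best run was slower than the first's.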

◆ DATA

dict logistic_regression.DATA
Initial value:
= {
    'X': X,
    'probability_labels': TRUE_PROB,
    'binary_labels': Y,
    'lgb_with_binary_labels': lgb.Dataset(X, Y),
    'lgb_with_probability_labels': lgb.Dataset(X, TRUE_PROB),
}

◆ LINEAR_TERM

logistic_regression.LINEAR_TERM
Initial value:
= np.array([
    -0.5 + 0.01 * X['continuous'][k]
    + CATEGORICAL_EFFECTS[X['categorical'][k]] for k in range(X.shape[0])
]) + np.random.normal(0, 1, X.shape[0])
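
`LINEAR_TERM` lives on the log-odds scale, and `TRUE_PROB = expit(LINEAR_TERM)` maps it to probabilities. The `expit` used by the module is presumably SciPy's `scipy.special.expit`; the pure-Python stand-in below is only a sketch of the same logistic function, with hypothetical input values.

```python
import math


def expit(x):
    """Logistic sigmoid 1 / (1 + exp(-x)), arranged to avoid overflow for large |x|."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    # For negative x, exp(x) is small, so this form cannot overflow.
    z = math.exp(x)
    return z / (1.0 + z)


# Hypothetical linear-term values on the log-odds scale.
linear_term = [-2.0, 0.0, 2.0]
true_prob = [expit(v) for v in linear_term]
```

A linear term of 0 corresponds to a probability of 0.5, and values that are symmetric about 0 map to probabilities that sum to 1.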

◆ X

logistic_regression.X
Initial value:
= pd.DataFrame({
    'continuous': range(N),
    'categorical': np.repeat([0, 1, 2, 3, 4], N // 5)
})