Limiting Memory Usage in predict
If you encounter a bad_alloc (out-of-memory) error during prediction, you may need to restrict memory usage.
Run --get_model_preds with an additional flag:
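A minimal sketch of the invocation, assuming the memory-limit flag is called --limit_memory (the actual flag name may differ; check the tool's help output):

```sh
# Keep your usual --get_model_preds arguments and add the memory-limit flag.
# The flag name --limit_memory is a placeholder, not confirmed by the tool.
<your_tool> --get_model_preds <usual arguments> --limit_memory "$LIMIT_MEMORY_JSON"
```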
Where LIMIT_MEMORY_JSON is defined as:
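A minimal example, assuming the key is named max_flat_array_size (the real key name may differ in your version); the value matches the 100M-element limit described below:

```json
{
  "max_flat_array_size": 100000000
}
```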
This setting limits the maximum flat matrix size to 100M float elements (~400 MB at 4 bytes per float). If our model has 1000 features, predictions are split into batches of 100K rows each (100M / 1000); see the sketch after this list. Keep in mind:
- Actual memory usage can reach up to 2× this value due to temporary duplication.
- Additional constant and per-object overhead also contributes to memory consumption and is not taken into account in the batch size calculation.
- The limit applies specifically to matrix creation and batch processing during prediction.
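A rough sketch of how the flat-element limit translates into a batch size; the variable and function names below are illustrative, not part of the tool:

```python
FLOAT_BYTES = 4  # assuming 32-bit floats

def batch_rows(max_flat_elements: int, n_features: int) -> int:
    """Rows that fit in one prediction batch under the flat-element limit."""
    return max(1, max_flat_elements // n_features)

limit = 100_000_000   # 100M float elements, as in the example above
features = 1_000
rows = batch_rows(limit, features)
print(f"rows per batch: {rows:,}")                                            # 100,000
print(f"approx. batch memory: {rows * features * FLOAT_BYTES / 1e6:.0f} MB")  # ~400 MB
# Actual peak usage can reach ~2x this figure due to temporary duplication.
```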