# evaluate { #axolotl.evaluate }

`evaluate`

Module for evaluating models.

## Functions

| Name | Description |
| --- | --- |
| [evaluate](#axolotl.evaluate.evaluate) | Evaluate a model on training and validation datasets. |
| [evaluate_dataset](#axolotl.evaluate.evaluate_dataset) | Helper function to evaluate a single dataset safely. |

### evaluate { #axolotl.evaluate.evaluate }

```python
evaluate.evaluate(cfg, dataset_meta)
```

Evaluate a model on training and validation datasets.

Args:

- `cfg`: Dictionary mapping `axolotl` config keys to values.
- `dataset_meta`: Dataset metadata containing training and evaluation datasets.

Returns:

Tuple containing:

- The model (either `PeftModel` or `PreTrainedModel`)
- The tokenizer
- Dictionary of evaluation metrics
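As a rough illustration of the return shape described above, the sketch below unpacks the three-element tuple and keeps only the validation metrics. `stub_evaluate` is a hypothetical stand-in, not the real `evaluate.evaluate`; the `base_model` value, the metric keys (`train_loss`, `eval_loss`), and their values are assumptions made for the example.

```python
from typing import Any, Dict, Tuple


def stub_evaluate(cfg: Dict[str, Any], dataset_meta: Any) -> Tuple[Any, Any, Dict[str, float]]:
    """Hypothetical stand-in that mirrors the documented return shape only."""
    model = object()      # placeholder for a PeftModel or PreTrainedModel
    tokenizer = object()  # placeholder for the tokenizer
    metrics = {"train_loss": 0.5, "eval_loss": 0.75}  # illustrative values, not real results
    return model, tokenizer, metrics


# Unpack the tuple in the documented order: model, tokenizer, metrics.
model, tokenizer, metrics = stub_evaluate({"base_model": "placeholder"}, dataset_meta=None)

# Keep only the metrics whose (assumed) key prefix marks them as validation metrics.
eval_only = {key: value for key, value in metrics.items() if key.startswith("eval_")}
print(eval_only)
```

With the real function, the same unpacking pattern would apply, but the metric names depend on the trainer and configuration in use.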
### evaluate_dataset { #axolotl.evaluate.evaluate_dataset }

```python
evaluate.evaluate_dataset(trainer, dataset, dataset_type, flash_optimum=False)
```

Helper function to evaluate a single dataset safely.

Args:

- `trainer`: The trainer instance.
- `dataset`: Dataset to evaluate.
- `dataset_type`: Type of dataset (`'train'` or `'eval'`).
- `flash_optimum`: Whether to use flash optimum.

Returns:

Dictionary of metrics, or `None` if `dataset` is `None`.
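The `None` handling described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the library's implementation: `StubTrainer` and `evaluate_dataset_sketch` are hypothetical names, the stub's "loss" is just the mean of a list of numbers, and the `flash_optimum` flag is left out because its behavior is not specified here. The `metric_key_prefix` argument follows the convention of the Hugging Face `Trainer.evaluate` method.

```python
from typing import Any, Dict, Optional


class StubTrainer:
    """Hypothetical stand-in exposing an `evaluate` method, as a trainer would."""

    def evaluate(self, eval_dataset, metric_key_prefix: str = "eval") -> Dict[str, float]:
        # Pretend the loss is the mean of the numbers in the dataset.
        loss = sum(eval_dataset) / len(eval_dataset)
        return {f"{metric_key_prefix}_loss": loss}


def evaluate_dataset_sketch(
    trainer: Any, dataset: Optional[Any], dataset_type: str
) -> Optional[Dict[str, float]]:
    """Return metrics for one dataset, or None when no dataset is given."""
    if dataset is None:
        return None
    # Prefix metric names with the dataset type so train and eval results stay distinct.
    return trainer.evaluate(dataset, metric_key_prefix=dataset_type)


trainer = StubTrainer()
train_metrics = evaluate_dataset_sketch(trainer, [1.0, 2.0, 3.0], "train")
eval_metrics = evaluate_dataset_sketch(trainer, None, "eval")
print(train_metrics)
print(eval_metrics)
```

Returning `None` for a missing dataset lets a caller evaluate the training and validation splits with the same helper and merge only the results that exist.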