quartodoc integration
committed by Dan Saunders
parent c907ac173e
commit e4fd7aad0b
59 docs/api/README.md Normal file
@@ -0,0 +1,59 @@
# Axolotl API Documentation with quartodoc

This directory contains the API documentation for Axolotl, automatically generated using quartodoc.

## Setup

1. Make sure quartodoc is installed:

   ```bash
   pip install quartodoc
   ```

2. Install Quarto, which is required to render the documentation. See
   https://quarto.org/docs/get-started/ for instructions on downloading and
   installing the latest Quarto release.

## Generating Documentation

Run the documentation generation script:

```bash
python scripts/generate_docs.py
```

This will:

- Read the configuration from `_quarto.yml`
- Extract documentation from the Python source code
- Generate Quarto markdown files (`.qmd`) in the `docs/api` directory
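If you prefer to see the moving parts, the script's job can be approximated with quartodoc's own CLI; the following is a sketch of what `scripts/generate_docs.py` presumably reduces to, not a copy of it:

```python
# Minimal sketch (assumed): run quartodoc's builder against the
# project's _quarto.yml. "quartodoc build" reads the quartodoc section
# of the config and writes .qmd files into the configured directory
# (docs/api in this project).
import subprocess

subprocess.run(["quartodoc", "build", "--config", "_quarto.yml"], check=True)
```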
## Preview the Documentation

After generating the documentation, preview it with:

```bash
quarto preview
```

## Building the Site

Build the complete site with:

```bash
quarto render
```

This will create a `_site` directory with the static HTML site.

## Configuration

The documentation generation is configured in two places:

1. `_quarto.yml` - Contains the `quartodoc` section that defines which modules to document
2. The API section of the Quarto website sidebar configuration (also in `_quarto.yml`)

## Customization

To customize the documentation, you can:

1. Add more modules to document in the `quartodoc` section of `_quarto.yml`
2. Create template files in the `quartodoc_templates` directory
3. Adjust the layout in the Quarto configuration
38 docs/api/cli.evaluate.qmd Normal file
@@ -0,0 +1,38 @@
# cli.evaluate { #axolotl.cli.evaluate }

`cli.evaluate`

CLI to run evaluation on a model.

## Functions

| Name | Description |
| --- | --- |
| [do_cli](#axolotl.cli.evaluate.do_cli) | Parses `axolotl` config, CLI args, and calls `do_evaluate`. |
| [do_evaluate](#axolotl.cli.evaluate.do_evaluate) | Evaluates a `transformers` model by first loading the dataset(s) specified in the `axolotl` config. |

### do_cli { #axolotl.cli.evaluate.do_cli }

```python
cli.evaluate.do_cli(config=Path('examples/'), **kwargs)
```

Parses `axolotl` config, CLI args, and calls `do_evaluate`.

Args:
    config: Path to `axolotl` config YAML file.
    kwargs: Additional keyword arguments to override config file values.
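Since `kwargs` override config values, a programmatic run needs nothing beyond a config path; a hypothetical example (the path and the override are placeholders, not files or values from this repo):

```python
# Hypothetical usage: evaluate with a one-off batch size override.
# The config path is a placeholder, not a file shipped with the repo.
from pathlib import Path

from axolotl.cli.evaluate import do_cli

do_cli(config=Path("configs/my-eval.yml"), micro_batch_size=1)
```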
### do_evaluate { #axolotl.cli.evaluate.do_evaluate }

```python
cli.evaluate.do_evaluate(cfg, cli_args)
```

Evaluates a `transformers` model by first loading the dataset(s) specified in the
`axolotl` config, and then calling `axolotl.evaluate.evaluate`, which computes
evaluation metrics on the given dataset(s) and writes them to disk.

Args:
    cfg: Dictionary mapping `axolotl` config keys to values.
    cli_args: CLI arguments.
128 docs/api/cli.main.qmd Normal file
@@ -0,0 +1,128 @@
# cli.main { #axolotl.cli.main }

`cli.main`

Click CLI definitions for various axolotl commands.

## Functions

| Name | Description |
| --- | --- |
| [cli](#axolotl.cli.main.cli) | Axolotl CLI - Train and fine-tune large language models |
| [evaluate](#axolotl.cli.main.evaluate) | Evaluate a model. |
| [fetch](#axolotl.cli.main.fetch) | Fetch example configs or other resources. |
| [inference](#axolotl.cli.main.inference) | Run inference with a trained model. |
| [merge_lora](#axolotl.cli.main.merge_lora) | Merge trained LoRA adapters into a base model. |
| [merge_sharded_fsdp_weights](#axolotl.cli.main.merge_sharded_fsdp_weights) | Merge sharded FSDP model weights. |
| [preprocess](#axolotl.cli.main.preprocess) | Preprocess datasets before training. |
| [train](#axolotl.cli.main.train) | Train or fine-tune a model. |

### cli { #axolotl.cli.main.cli }

```python
cli.main.cli()
```

Axolotl CLI - Train and fine-tune large language models

### evaluate { #axolotl.cli.main.evaluate }

```python
cli.main.evaluate(config, accelerate, **kwargs)
```

Evaluate a model.

Args:
    config: Path to `axolotl` config YAML file.
    accelerate: Whether to use `accelerate` launcher.
    kwargs: Additional keyword arguments which correspond to CLI args or `axolotl`
        config options.

### fetch { #axolotl.cli.main.fetch }

```python
cli.main.fetch(directory, dest)
```

Fetch example configs or other resources.

Available directories:
- examples: Example configuration files
- deepspeed_configs: DeepSpeed configuration files

Args:
    directory: One of `examples`, `deepspeed_configs`.
    dest: Optional destination directory.

### inference { #axolotl.cli.main.inference }

```python
cli.main.inference(config, accelerate, gradio, **kwargs)
```

Run inference with a trained model.

Args:
    config: Path to `axolotl` config YAML file.
    accelerate: Whether to use `accelerate` launcher.
    gradio: Whether to use Gradio browser interface or command line for inference.
    kwargs: Additional keyword arguments which correspond to CLI args or `axolotl`
        config options.

### merge_lora { #axolotl.cli.main.merge_lora }

```python
cli.main.merge_lora(config, **kwargs)
```

Merge trained LoRA adapters into a base model.

Args:
    config: Path to `axolotl` config YAML file.
    kwargs: Additional keyword arguments which correspond to CLI args or `axolotl`
        config options.

### merge_sharded_fsdp_weights { #axolotl.cli.main.merge_sharded_fsdp_weights }

```python
cli.main.merge_sharded_fsdp_weights(config, accelerate, **kwargs)
```

Merge sharded FSDP model weights.

Args:
    config: Path to `axolotl` config YAML file.
    accelerate: Whether to use `accelerate` launcher.
    kwargs: Additional keyword arguments which correspond to CLI args or `axolotl`
        config options.

### preprocess { #axolotl.cli.main.preprocess }

```python
cli.main.preprocess(config, cloud=None, **kwargs)
```

Preprocess datasets before training.

Args:
    config: Path to `axolotl` config YAML file.
    cloud: Path to a cloud accelerator configuration file.
    kwargs: Additional keyword arguments which correspond to CLI args or `axolotl`
        config options.

### train { #axolotl.cli.main.train }

```python
cli.main.train(config, accelerate, cloud=None, sweep=None, **kwargs)
```

Train or fine-tune a model.

Args:
    config: Path to `axolotl` config YAML file.
    accelerate: Whether to use `accelerate` launcher.
    cloud: Path to a cloud accelerator configuration file.
    sweep: Path to YAML config for sweeping hyperparameters.
    kwargs: Additional keyword arguments which correspond to CLI args or `axolotl`
        config options.
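These are Click commands, so they can also be exercised in-process with Click's test runner; a small, runnable sketch:

```python
# Sketch: drive the Click CLI programmatically. Asking for --help is
# side-effect free; a real run would pass a config path instead.
from click.testing import CliRunner

from axolotl.cli.main import cli

runner = CliRunner()
result = runner.invoke(cli, ["train", "--help"])
print(result.output)
```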
38 docs/api/cli.train.qmd Normal file
@@ -0,0 +1,38 @@
# cli.train { #axolotl.cli.train }

`cli.train`

CLI to run training on a model.

## Functions

| Name | Description |
| --- | --- |
| [do_cli](#axolotl.cli.train.do_cli) | Parses `axolotl` config, CLI args, and calls `do_train`. |
| [do_train](#axolotl.cli.train.do_train) | Trains a `transformers` model by first loading the dataset(s) specified in the `axolotl` config. |

### do_cli { #axolotl.cli.train.do_cli }

```python
cli.train.do_cli(config=Path('examples/'), **kwargs)
```

Parses `axolotl` config, CLI args, and calls `do_train`.

Args:
    config: Path to `axolotl` config YAML file.
    kwargs: Additional keyword arguments to override config file values.
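As with evaluation, one-off overrides can ride along as keyword arguments instead of editing the YAML; a hypothetical example (path and value are placeholders):

```python
# Hypothetical usage: the config path is a placeholder, and
# learning_rate overrides the value from the YAML config.
from axolotl.cli.train import do_cli

do_cli(config="configs/my-train.yml", learning_rate=1e-5)
```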
### do_train { #axolotl.cli.train.do_train }

```python
cli.train.do_train(cfg, cli_args)
```

Trains a `transformers` model by first loading the dataset(s) specified in the
`axolotl` config, and then calling `axolotl.train.train`. Also runs the plugin
manager's `post_train_unload` once training completes.

Args:
    cfg: Dictionary mapping `axolotl` config keys to values.
    cli_args: Training-specific CLI arguments.
44 docs/api/datasets.qmd Normal file
@@ -0,0 +1,44 @@
# datasets { #axolotl.datasets }

`datasets`

Module containing Dataset functionality

## Classes

| Name | Description |
| --- | --- |
| [ConstantLengthDataset](#axolotl.datasets.ConstantLengthDataset) | Iterable dataset that returns constant length chunks of tokens from a stream of text files. |
| [TokenizedPromptDataset](#axolotl.datasets.TokenizedPromptDataset) | Dataset that returns tokenized prompts from a stream of text files. |

### ConstantLengthDataset { #axolotl.datasets.ConstantLengthDataset }

```python
datasets.ConstantLengthDataset(self, tokenizer, datasets, seq_length=2048)
```

Iterable dataset that returns constant length chunks of tokens from a stream of text files.

Args:
    tokenizer (Tokenizer): The processor used for processing the data.
    datasets (dataset.Dataset): Datasets with text files.
    seq_length (int): Length of token sequences to return.

### TokenizedPromptDataset { #axolotl.datasets.TokenizedPromptDataset }

```python
datasets.TokenizedPromptDataset(
    self,
    prompt_tokenizer,
    dataset,
    process_count=None,
    keep_in_memory=False,
    **kwargs,
)
```

Dataset that returns tokenized prompts from a stream of text files.

Args:
    prompt_tokenizer (PromptTokenizingStrategy): The prompt tokenizing method for processing the data.
    dataset (dataset.Dataset): Dataset with text files.
    process_count (int): Number of processes to use for tokenizing.
    keep_in_memory (bool): Whether to keep the tokenized dataset in memory.
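A rough packing sketch for `ConstantLengthDataset`; that the wrapped dataset already carries tokenized `input_ids`/`labels`/`attention_mask` columns is an assumption made for illustration:

```python
# Sketch under assumptions: the inner dataset holds pre-tokenized
# examples, and ConstantLengthDataset packs them into fixed-length
# chunks of seq_length tokens.
from datasets import Dataset
from transformers import AutoTokenizer

from axolotl.datasets import ConstantLengthDataset

tokenizer = AutoTokenizer.from_pretrained("gpt2")
ids = tokenizer("a tiny example for packing")["input_ids"]
inner = Dataset.from_dict(
    {"input_ids": [ids], "labels": [ids], "attention_mask": [[1] * len(ids)]}
)

packed = ConstantLengthDataset(tokenizer, [inner], seq_length=32)
first_chunk = next(iter(packed))  # one fixed-length chunk (assumed shape)
```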
47 docs/api/evaluate.qmd Normal file
@@ -0,0 +1,47 @@
# evaluate { #axolotl.evaluate }

`evaluate`

Module for evaluating models.

## Functions

| Name | Description |
| --- | --- |
| [evaluate](#axolotl.evaluate.evaluate) | Evaluate a model on training and validation datasets |
| [evaluate_dataset](#axolotl.evaluate.evaluate_dataset) | Helper function to evaluate a single dataset safely. |

### evaluate { #axolotl.evaluate.evaluate }

```python
evaluate.evaluate(cfg, dataset_meta)
```

Evaluate a model on training and validation datasets

Args:
    cfg: Dictionary mapping `axolotl` config keys to values.
    dataset_meta: Dataset metadata containing training and evaluation datasets.

Returns:
    Tuple containing:
    - The model (either PeftModel or PreTrainedModel)
    - The tokenizer
    - Dictionary of evaluation metrics
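Unpacking that return value looks like the following; producing `cfg` and `dataset_meta` is axolotl's config parsing and dataset loading, assumed already done here:

```python
# Sketch: cfg and dataset_meta are assumed to come from axolotl's own
# config parsing and dataset loading steps (not shown).
from axolotl.evaluate import evaluate

model, tokenizer, metrics = evaluate(cfg, dataset_meta)
print(metrics)  # the metric names/keys are an assumption
```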
### evaluate_dataset { #axolotl.evaluate.evaluate_dataset }

```python
evaluate.evaluate_dataset(trainer, dataset, dataset_type, flash_optimum=False)
```

Helper function to evaluate a single dataset safely.

Args:
    trainer: The trainer instance
    dataset: Dataset to evaluate
    dataset_type: Type of dataset ('train' or 'eval')
    flash_optimum: Whether to use flash optimum

Returns:
    Dictionary of metrics or None if dataset is None
39 docs/api/index.qmd Normal file
@@ -0,0 +1,39 @@
# API Reference {.doc .doc-index}

## Core

Core functionality for training

| | |
| --- | --- |
| [train](train.qmd#axolotl.train) | Prepare and train a model on a dataset. Can also infer from a model or merge lora |
| [evaluate](evaluate.qmd#axolotl.evaluate) | Module for evaluating models. |
| [datasets](datasets.qmd#axolotl.datasets) | Module containing Dataset functionality |

## CLI

Command-line interface

| | |
| --- | --- |
| [cli.main](cli.main.qmd#axolotl.cli.main) | Click CLI definitions for various axolotl commands. |
| [cli.train](cli.train.qmd#axolotl.cli.train) | CLI to run training on a model. |
| [cli.evaluate](cli.evaluate.qmd#axolotl.cli.evaluate) | CLI to run evaluation on a model. |

## Prompt Strategies

Prompt formatting strategies

| | |
| --- | --- |
| [prompt_strategies.base](prompt_strategies.base.qmd#axolotl.prompt_strategies.base) | Module for base dataset transform strategies |
| [prompt_strategies.chat_template](prompt_strategies.chat_template.qmd#axolotl.prompt_strategies.chat_template) | HF Chat Templates prompt strategy |

## Utils

Utility functions

| | |
| --- | --- |
| [utils.models](utils.models.qmd#axolotl.utils.models) | Module for models and model loading |
| [utils.tokenization](utils.tokenization.qmd#axolotl.utils.tokenization) | Module for tokenization utilities |
5 docs/api/prompt_strategies.base.qmd Normal file
@@ -0,0 +1,5 @@
# prompt_strategies.base { #axolotl.prompt_strategies.base }

`prompt_strategies.base`

Module for base dataset transform strategies
80 docs/api/prompt_strategies.chat_template.qmd Normal file
@@ -0,0 +1,80 @@
# prompt_strategies.chat_template { #axolotl.prompt_strategies.chat_template }

`prompt_strategies.chat_template`

HF Chat Templates prompt strategy

## Classes

| Name | Description |
| --- | --- |
| [ChatTemplatePrompter](#axolotl.prompt_strategies.chat_template.ChatTemplatePrompter) | Prompter for HF chat templates |
| [ChatTemplateStrategy](#axolotl.prompt_strategies.chat_template.ChatTemplateStrategy) | Tokenizing strategy for instruction-based prompts. |
| [StrategyLoader](#axolotl.prompt_strategies.chat_template.StrategyLoader) | Load chat template strategy based on configuration. |

### ChatTemplatePrompter { #axolotl.prompt_strategies.chat_template.ChatTemplatePrompter }

```python
prompt_strategies.chat_template.ChatTemplatePrompter(
    self,
    tokenizer,
    chat_template,
    processor=None,
    max_length=2048,
    message_property_mappings=None,
    message_field_training=None,
    message_field_training_detail=None,
    field_messages='messages',
    roles=None,
    drop_system_message=False,
)
```

Prompter for HF chat templates

### ChatTemplateStrategy { #axolotl.prompt_strategies.chat_template.ChatTemplateStrategy }

```python
prompt_strategies.chat_template.ChatTemplateStrategy(
    self,
    prompter,
    tokenizer,
    train_on_inputs,
    sequence_len,
    roles_to_train=None,
    train_on_eos=None,
)
```

Tokenizing strategy for instruction-based prompts.

#### Methods

| Name | Description |
| --- | --- |
| [find_turn](#axolotl.prompt_strategies.chat_template.ChatTemplateStrategy.find_turn) | Locate the starting and ending indices of the specified turn in a conversation. |
| [tokenize_prompt](#axolotl.prompt_strategies.chat_template.ChatTemplateStrategy.tokenize_prompt) | Public method that can handle either a single prompt or a batch of prompts. |

##### find_turn { #axolotl.prompt_strategies.chat_template.ChatTemplateStrategy.find_turn }

```python
prompt_strategies.chat_template.ChatTemplateStrategy.find_turn(turns, turn_idx)
```

Locate the starting and ending indices of the specified turn in a conversation.

##### tokenize_prompt { #axolotl.prompt_strategies.chat_template.ChatTemplateStrategy.tokenize_prompt }

```python
prompt_strategies.chat_template.ChatTemplateStrategy.tokenize_prompt(prompt)
```

Public method that can handle either a single prompt or a batch of prompts.
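Wiring the two classes together, based only on the signatures above; the model checkpoint and the shape of the `messages` example are illustrative assumptions:

```python
# Sketch from the documented signatures; the checkpoint and the
# messages example are assumptions, not values from this repo.
from transformers import AutoTokenizer

from axolotl.prompt_strategies.chat_template import (
    ChatTemplatePrompter,
    ChatTemplateStrategy,
)

tokenizer = AutoTokenizer.from_pretrained("HuggingFaceH4/zephyr-7b-beta")
prompter = ChatTemplatePrompter(tokenizer, tokenizer.chat_template, max_length=2048)
strategy = ChatTemplateStrategy(
    prompter, tokenizer, train_on_inputs=False, sequence_len=2048
)

sample = {
    "messages": [
        {"role": "user", "content": "What is Axolotl?"},
        {"role": "assistant", "content": "A library for fine-tuning LLMs."},
    ]
}
tokenized = strategy.tokenize_prompt(sample)  # input_ids/labels dict (assumed)
```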
### StrategyLoader { #axolotl.prompt_strategies.chat_template.StrategyLoader }

```python
prompt_strategies.chat_template.StrategyLoader()
```

Load chat template strategy based on configuration.
199 docs/api/train.qmd Normal file
@@ -0,0 +1,199 @@
# train { #axolotl.train }

`train`

Prepare and train a model on a dataset. Can also infer from a model or merge lora

## Functions

| Name | Description |
| --- | --- |
| [create_model_card](#axolotl.train.create_model_card) | Create a model card for the trained model if needed. |
| [determine_resume_checkpoint](#axolotl.train.determine_resume_checkpoint) | Determine the checkpoint to resume from based on configuration. |
| [execute_training](#axolotl.train.execute_training) | Execute the training process with appropriate backend configurations. |
| [handle_untrained_tokens_fix](#axolotl.train.handle_untrained_tokens_fix) | Apply fixes for untrained tokens if configured. |
| [save_initial_configs](#axolotl.train.save_initial_configs) | Save initial configurations before training. |
| [save_trained_model](#axolotl.train.save_trained_model) | Save the trained model according to configuration and training setup. |
| [setup_model_and_tokenizer](#axolotl.train.setup_model_and_tokenizer) | Load the tokenizer, processor (for multimodal models), and model based on configuration. |
| [setup_model_and_trainer](#axolotl.train.setup_model_and_trainer) | Load model, tokenizer, trainer, etc. Helper function to encapsulate the full trainer setup. |
| [setup_model_card](#axolotl.train.setup_model_card) | Set up the Axolotl badge and add the Axolotl config to the model card if available. |
| [setup_reference_model](#axolotl.train.setup_reference_model) | Set up the reference model for RL training if needed. |
| [setup_signal_handler](#axolotl.train.setup_signal_handler) | Set up signal handler for graceful termination. |
| [train](#axolotl.train.train) | Train a model on the given dataset. |

### create_model_card { #axolotl.train.create_model_card }

```python
train.create_model_card(cfg, trainer)
```

Create a model card for the trained model if needed.

Args:
    cfg: Dictionary mapping `axolotl` config keys to values.
    trainer: The trainer object with model card creation capabilities.

### determine_resume_checkpoint { #axolotl.train.determine_resume_checkpoint }

```python
train.determine_resume_checkpoint(cfg)
```

Determine the checkpoint to resume from based on configuration.

Args:
    cfg: Dictionary mapping `axolotl` config keys to values.

Returns:
    Path to the checkpoint to resume from, or `None` if not resuming.

### execute_training { #axolotl.train.execute_training }

```python
train.execute_training(cfg, trainer, resume_from_checkpoint)
```

Execute the training process with appropriate backend configurations.

Args:
    cfg: Dictionary mapping `axolotl` config keys to values.
    trainer: The configured trainer object.
    resume_from_checkpoint: Path to checkpoint to resume from, if applicable.

### handle_untrained_tokens_fix { #axolotl.train.handle_untrained_tokens_fix }

```python
train.handle_untrained_tokens_fix(
    cfg,
    model,
    tokenizer,
    train_dataset,
    safe_serialization,
)
```

Apply fixes for untrained tokens if configured.

Args:
    cfg: Dictionary mapping `axolotl` config keys to values.
    model: The model to apply fixes to.
    tokenizer: The tokenizer for token identification.
    train_dataset: The training dataset to use.
    safe_serialization: Whether to use safe serialization when saving.

### save_initial_configs { #axolotl.train.save_initial_configs }

```python
train.save_initial_configs(cfg, tokenizer, model, peft_config)
```

Save initial configurations before training.

Args:
    cfg: Dictionary mapping `axolotl` config keys to values.
    tokenizer: The tokenizer to save.
    model: The model to save configuration for.
    peft_config: The PEFT configuration to save if applicable.

### save_trained_model { #axolotl.train.save_trained_model }

```python
train.save_trained_model(cfg, trainer, model, safe_serialization)
```

Save the trained model according to configuration and training setup.

Args:
    cfg: Dictionary mapping `axolotl` config keys to values.
    trainer: The trainer object.
    model: The trained model to save.
    safe_serialization: Whether to use safe serialization.

### setup_model_and_tokenizer { #axolotl.train.setup_model_and_tokenizer }

```python
train.setup_model_and_tokenizer(cfg)
```

Load the tokenizer, processor (for multimodal models), and model based on configuration.

Args:
    cfg: Dictionary mapping `axolotl` config keys to values.

Returns:
    Tuple containing model, tokenizer, `peft_config` (if LoRA / QLoRA, else
    `None`), and processor (if multimodal, else `None`).
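The return order above, spelled out as a one-liner (a parsed `cfg` is assumed to already be in hand):

```python
# Sketch: cfg is assumed to be an already-parsed axolotl config.
from axolotl.train import setup_model_and_tokenizer

model, tokenizer, peft_config, processor = setup_model_and_tokenizer(cfg)
```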
### setup_model_and_trainer { #axolotl.train.setup_model_and_trainer }

```python
train.setup_model_and_trainer(cfg, dataset_meta)
```

Load model, tokenizer, trainer, etc. Helper function to encapsulate the full
trainer setup.

Args:
    cfg: The configuration dictionary with training parameters.
    dataset_meta: Object with training, validation datasets and metadata.

Returns:
    Tuple of:
    - Trainer (Causal or RLHF)
    - Model
    - Tokenizer
    - PEFT config

### setup_model_card { #axolotl.train.setup_model_card }

```python
train.setup_model_card(cfg)
```

Set up the Axolotl badge and add the Axolotl config to the model card if available.

Args:
    cfg: Dictionary mapping `axolotl` config keys to values.

### setup_reference_model { #axolotl.train.setup_reference_model }

```python
train.setup_reference_model(cfg, tokenizer)
```

Set up the reference model for RL training if needed.

Args:
    cfg: Dictionary mapping `axolotl` config keys to values.
    tokenizer: The tokenizer to use for the reference model.

Returns:
    Reference model if needed for RL training, `None` otherwise.

### setup_signal_handler { #axolotl.train.setup_signal_handler }

```python
train.setup_signal_handler(cfg, model, safe_serialization)
```

Set up signal handler for graceful termination.

Args:
    cfg: Dictionary mapping `axolotl` config keys to values.
    model: The model to save on termination
    safe_serialization: Whether to use safe serialization when saving

### train { #axolotl.train.train }

```python
train.train(cfg, dataset_meta)
```

Train a model on the given dataset.

Args:
    cfg: The configuration dictionary with training parameters
    dataset_meta: Object with training, validation datasets and metadata

Returns:
    Tuple of (model, tokenizer) after training
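End to end, the module's entry point reads as follows; producing `cfg` and `dataset_meta` is axolotl's config parsing and dataset loading, assumed already done:

```python
# Sketch: cfg and dataset_meta are assumed to exist already; train()
# then runs setup, the training loop, and model saving.
from axolotl.train import train

model, tokenizer = train(cfg, dataset_meta)
```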
161 docs/api/utils.models.qmd Normal file
@@ -0,0 +1,161 @@
# utils.models { #axolotl.utils.models }

`utils.models`

Module for models and model loading

## Classes

| Name | Description |
| --- | --- |
| [ModelLoader](#axolotl.utils.models.ModelLoader) | Manages all the config and monkey patches needed while loading a model. |

### ModelLoader { #axolotl.utils.models.ModelLoader }

```python
utils.models.ModelLoader(
    self,
    cfg,
    tokenizer,
    *,
    processor=None,
    inference=False,
    reference_model=False,
    **kwargs,
)
```

Manages all the config and monkey patches needed while loading a model.

#### Attributes

| Name | Description |
| --- | --- |
| [has_flash_attn](#axolotl.utils.models.ModelLoader.has_flash_attn) | Check if flash attention is installed |

#### Methods

| Name | Description |
| --- | --- |
| [patch_llama_derived_model](#axolotl.utils.models.ModelLoader.patch_llama_derived_model) | Modify all llama-derived models in one block |
| [patch_loss_llama](#axolotl.utils.models.ModelLoader.patch_loss_llama) | Patch loss functions and other optimizations |
| [set_attention_config](#axolotl.utils.models.ModelLoader.set_attention_config) | Sample packing uses a custom FA2 patch |
| [set_auto_model_loader](#axolotl.utils.models.ModelLoader.set_auto_model_loader) | Set `self.AutoModelLoader` |

##### patch_llama_derived_model { #axolotl.utils.models.ModelLoader.patch_llama_derived_model }

```python
utils.models.ModelLoader.patch_llama_derived_model()
```

Modify all llama-derived models in one block

##### patch_loss_llama { #axolotl.utils.models.ModelLoader.patch_loss_llama }

```python
utils.models.ModelLoader.patch_loss_llama()
```

Patch loss functions and other optimizations

##### set_attention_config { #axolotl.utils.models.ModelLoader.set_attention_config }

```python
utils.models.ModelLoader.set_attention_config()
```

Sample packing uses a custom FA2 patch

##### set_auto_model_loader { #axolotl.utils.models.ModelLoader.set_auto_model_loader }

```python
utils.models.ModelLoader.set_auto_model_loader()
```

Set `self.AutoModelLoader`:

- default value: `AutoModelForCausalLM` (set at `__init__`)
- when using a multi-modality model, `self.AutoModelLoader` should
  be set according to the model type of the model

## Functions

| Name | Description |
| --- | --- |
| [get_module_class_from_name](#axolotl.utils.models.get_module_class_from_name) | Gets a class from a module by its name. |
| [load_model](#axolotl.utils.models.load_model) | Load a model for a given configuration and tokenizer. |
| [load_tokenizer](#axolotl.utils.models.load_tokenizer) | Load and configure the tokenizer based on the provided config. |
| [modify_tokenizer_files](#axolotl.utils.models.modify_tokenizer_files) | Modify tokenizer files to replace added_tokens strings, save to output directory, and return the path to the modified tokenizer. |
| [setup_quantized_meta_for_peft](#axolotl.utils.models.setup_quantized_meta_for_peft) | Replaces `quant_state.to` with a dummy function to prevent PEFT from moving `quant_state` to meta device |
| [setup_quantized_peft_meta_for_training](#axolotl.utils.models.setup_quantized_peft_meta_for_training) | Replaces dummy `quant_state.to` method with the original function to allow training to continue |

### get_module_class_from_name { #axolotl.utils.models.get_module_class_from_name }

```python
utils.models.get_module_class_from_name(module, name)
```

Gets a class from a module by its name.

Args:
    module (`torch.nn.Module`): The module to get the class from.
    name (`str`): The name of the class.
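For instance, with a toy module (the printed class path reflects the lookup-by-name behavior described above):

```python
# Sketch: fetch the class of a submodule by its class name.
import torch.nn as nn

from axolotl.utils.models import get_module_class_from_name

model = nn.Sequential(nn.Linear(4, 4), nn.ReLU())
linear_cls = get_module_class_from_name(model, "Linear")
print(linear_cls)  # expected: torch.nn.modules.linear.Linear
```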
### load_model { #axolotl.utils.models.load_model }

```python
utils.models.load_model(
    cfg,
    tokenizer,
    *,
    processor=None,
    inference=False,
    reference_model=False,
    **kwargs,
)
```

Load a model for a given configuration and tokenizer.

### load_tokenizer { #axolotl.utils.models.load_tokenizer }

```python
utils.models.load_tokenizer(cfg)
```

Load and configure the tokenizer based on the provided config.
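A typical loading order implied by these two signatures; this page does not document what `load_model` returns, so the sketch keeps the result opaque, and `cfg` is assumed parsed:

```python
# Sketch: tokenizer first, then the model in inference mode. cfg is
# assumed to be a parsed axolotl config; load_model's return shape is
# not specified on this page, so it is left opaque here.
from axolotl.utils.models import load_model, load_tokenizer

tokenizer = load_tokenizer(cfg)
loaded = load_model(cfg, tokenizer, inference=True)
```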
### modify_tokenizer_files { #axolotl.utils.models.modify_tokenizer_files }

```python
utils.models.modify_tokenizer_files(tokenizer_path, token_mappings, output_dir)
```

Modify tokenizer files to replace added_tokens strings, save to output directory, and return the path to the modified tokenizer.

This only works with reserved tokens that were added to the tokenizer, not tokens already part of the vocab.

Args:
    tokenizer_path: Path or name of the original tokenizer
    token_mappings: Dict mapping {token_id (int): new_token_string}
    output_dir: Directory to save the modified tokenizer

Returns:
    Path to the modified tokenizer directory

Ref: https://github.com/huggingface/transformers/issues/27974#issuecomment-1854188941
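Shaped by the Args above; the tokenizer name, token id, and replacement string are all illustrative assumptions:

```python
# Sketch: remap one reserved added token to a custom string. All three
# concrete values here are assumptions for illustration.
from axolotl.utils.models import modify_tokenizer_files

new_path = modify_tokenizer_files(
    "meta-llama/Llama-3.1-8B-Instruct",  # original tokenizer (assumed)
    {128011: "<|custom_tool_token|>"},   # token_id -> new string (assumed)
    output_dir="./modified-tokenizer",
)
```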
### setup_quantized_meta_for_peft { #axolotl.utils.models.setup_quantized_meta_for_peft }

```python
utils.models.setup_quantized_meta_for_peft(model)
```

Replaces `quant_state.to` with a dummy function to prevent PEFT from moving `quant_state` to meta device

### setup_quantized_peft_meta_for_training { #axolotl.utils.models.setup_quantized_peft_meta_for_training }

```python
utils.models.setup_quantized_peft_meta_for_training(model)
```

Replaces dummy `quant_state.to` method with the original function to allow training to continue
38 docs/api/utils.tokenization.qmd Normal file
@@ -0,0 +1,38 @@
# utils.tokenization { #axolotl.utils.tokenization }

`utils.tokenization`

Module for tokenization utilities

## Functions

| Name | Description |
| --- | --- |
| [color_token_for_rl_debug](#axolotl.utils.tokenization.color_token_for_rl_debug) | Helper function to color tokens based on their type. |
| [process_tokens_for_rl_debug](#axolotl.utils.tokenization.process_tokens_for_rl_debug) | Helper function to process and color tokens. |

### color_token_for_rl_debug { #axolotl.utils.tokenization.color_token_for_rl_debug }

```python
utils.tokenization.color_token_for_rl_debug(
    decoded_token,
    encoded_token,
    color,
    text_only,
)
```

Helper function to color tokens based on their type.
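A guessed call, from the signature alone; whether "green" is among the accepted color names is an assumption:

```python
# Sketch from the signature: print one token, colorized. The color
# value and the decoded/encoded token pair are assumptions.
from axolotl.utils.tokenization import color_token_for_rl_debug

print(color_token_for_rl_debug("hello", 15339, "green", text_only=False))
```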
### process_tokens_for_rl_debug { #axolotl.utils.tokenization.process_tokens_for_rl_debug }

```python
utils.tokenization.process_tokens_for_rl_debug(
    tokens,
    color,
    tokenizer,
    text_only,
)
```

Helper function to process and color tokens.