diff --git a/.github/workflows/docs.yml b/.github/workflows/docs.yml index 1138d99f1..7a45b7b07 100644 --- a/.github/workflows/docs.yml +++ b/.github/workflows/docs.yml @@ -22,7 +22,9 @@ jobs: python-version: '3.11' - name: install dependencies run: | - python3 -m pip install jupyter + python3 -m pip install jupyter quartodoc + python3 -m pip install -e . + python3 scripts/generate_docs.py - name: Publish to GitHub Pages (and render) uses: quarto-dev/quarto-actions/publish@v2 with: diff --git a/README.md b/README.md index 343816aff..55f5f7cc1 100644 --- a/README.md +++ b/README.md @@ -97,6 +97,7 @@ That's it! Check out our [Getting Started Guide](https://axolotl-ai-cloud.github - [Multi-GPU Training](https://axolotl-ai-cloud.github.io/axolotl/docs/multi-gpu.html) - [Multi-Node Training](https://axolotl-ai-cloud.github.io/axolotl/docs/multi-node.html) - [Multipacking](https://axolotl-ai-cloud.github.io/axolotl/docs/multipack.html) +- [API Reference](https://axolotl-ai-cloud.github.io/axolotl/api/) - Auto-generated code documentation - [FAQ](https://axolotl-ai-cloud.github.io/axolotl/docs/faq.html) - Frequently asked questions ## 🤝 Getting Help diff --git a/_quarto.yml b/_quarto.yml index 943ed5293..3a35f8f4b 100644 --- a/_quarto.yml +++ b/_quarto.yml @@ -1,6 +1,34 @@ project: type: website +quartodoc: + dir: docs/api + package: axolotl + title: API Reference + sections: + - title: Core + desc: Core functionality for training + contents: + - train + - evaluate + - datasets + - title: CLI + desc: Command-line interface + contents: + - cli.main + - cli.train + - cli.evaluate + - title: Prompt Strategies + desc: Prompt formatting strategies + contents: + - prompt_strategies.base + - prompt_strategies.chat_template + - title: Utils + desc: Utility functions + contents: + - utils.models + - utils.tokenization + website: title: "Axolotl" description: "We make fine-tuning accessible, scalable, and fun" @@ -75,6 +103,15 @@ website: - docs/debugging.qmd - docs/nccl.qmd 
+ - section: "Reference" + contents: + - docs/config.qmd + - section: "API Reference" + contents: docs/api/**/*.qmd + format: html: theme: darkly diff --git a/docs/api/README.md b/docs/api/README.md new file mode 100644 index 000000000..f8a931e22 --- /dev/null +++ b/docs/api/README.md @@ -0,0 +1,59 @@ +# Axolotl API Documentation with quartodoc + +This directory contains the API documentation for Axolotl, automatically generated using quartodoc. + +## Setup + +1. Make sure quartodoc is installed: + ``` + pip install quartodoc + ``` + +2. Install Quarto (required to render the documentation): + ``` + # Download and install the latest Quarto release + # Visit https://quarto.org/docs/get-started/ for installation instructions + ``` + +## Generating Documentation + +Run the documentation generation script: +``` +python scripts/generate_docs.py +``` + +This will: +- Read the configuration from `_quarto.yml` +- Extract documentation from the Python source code +- Generate Quarto markdown files (.qmd) in the `docs/api` directory + +## Preview the Documentation + +After generating the documentation, preview it with: +``` +quarto preview +``` + +## Building the Site + +Build the complete site with: +``` +quarto render +``` + +This will create a `_site` directory with the static HTML site. + +## Configuration + +The documentation generation is configured in two places: + +1. `_quarto.yml` - Contains the `quartodoc` section that defines which modules to document +2. The API section in the Quarto website sidebar configuration (also in `_quarto.yml`) + +## Customization + +To customize the documentation, you can: + +1. Add more modules to document in the `quartodoc` section of `_quarto.yml` +2. Create template files in the `quartodoc_templates` directory +3. 
Adjust the layout in the Quarto configuration diff --git a/docs/api/cli.evaluate.qmd b/docs/api/cli.evaluate.qmd new file mode 100644 index 000000000..7e57a9098 --- /dev/null +++ b/docs/api/cli.evaluate.qmd @@ -0,0 +1,38 @@ +# cli.evaluate { #axolotl.cli.evaluate } + +`cli.evaluate` + +CLI to run evaluation on a model. + +## Functions + +| Name | Description | +| --- | --- | +| [do_cli](#axolotl.cli.evaluate.do_cli) | Parses `axolotl` config, CLI args, and calls `do_evaluate`. | +| [do_evaluate](#axolotl.cli.evaluate.do_evaluate) | Evaluates a `transformers` model by first loading the dataset(s) specified in the `axolotl` config. | + +### do_cli { #axolotl.cli.evaluate.do_cli } + +```python +cli.evaluate.do_cli(config=Path('examples/'), **kwargs) +``` + +Parses `axolotl` config, CLI args, and calls `do_evaluate`. + +Args: + config: Path to `axolotl` config YAML file. + kwargs: Additional keyword arguments to override config file values. + +### do_evaluate { #axolotl.cli.evaluate.do_evaluate } + +```python +cli.evaluate.do_evaluate(cfg, cli_args) +``` + +Evaluates a `transformers` model by first loading the dataset(s) specified in the +`axolotl` config, and then calling `axolotl.evaluate.evaluate`, which computes +evaluation metrics on the given dataset(s) and writes them to disk. + +Args: + cfg: Dictionary mapping `axolotl` config keys to values. + cli_args: CLI arguments. diff --git a/docs/api/cli.main.qmd b/docs/api/cli.main.qmd new file mode 100644 index 000000000..7f487f17f --- /dev/null +++ b/docs/api/cli.main.qmd @@ -0,0 +1,128 @@ +# cli.main { #axolotl.cli.main } + +`cli.main` + +Click CLI definitions for various axolotl commands. + +## Functions + +| Name | Description | +| --- | --- | +| [cli](#axolotl.cli.main.cli) | Axolotl CLI - Train and fine-tune large language models | +| [evaluate](#axolotl.cli.main.evaluate) | Evaluate a model. | +| [fetch](#axolotl.cli.main.fetch) | Fetch example configs or other resources. 
| +| [inference](#axolotl.cli.main.inference) | Run inference with a trained model. | +| [merge_lora](#axolotl.cli.main.merge_lora) | Merge trained LoRA adapters into a base model. | +| [merge_sharded_fsdp_weights](#axolotl.cli.main.merge_sharded_fsdp_weights) | Merge sharded FSDP model weights. | +| [preprocess](#axolotl.cli.main.preprocess) | Preprocess datasets before training. | +| [train](#axolotl.cli.main.train) | Train or fine-tune a model. | + +### cli { #axolotl.cli.main.cli } + +```python +cli.main.cli() +``` + +Axolotl CLI - Train and fine-tune large language models + +### evaluate { #axolotl.cli.main.evaluate } + +```python +cli.main.evaluate(config, accelerate, **kwargs) +``` + +Evaluate a model. + +Args: + config: Path to `axolotl` config YAML file. + accelerate: Whether to use `accelerate` launcher. + kwargs: Additional keyword arguments which correspond to CLI args or `axolotl` + config options. + +### fetch { #axolotl.cli.main.fetch } + +```python +cli.main.fetch(directory, dest) +``` + +Fetch example configs or other resources. + +Available directories: +- examples: Example configuration files +- deepspeed_configs: DeepSpeed configuration files + +Args: + directory: One of `examples`, `deepspeed_configs`. + dest: Optional destination directory. + +### inference { #axolotl.cli.main.inference } + +```python +cli.main.inference(config, accelerate, gradio, **kwargs) +``` + +Run inference with a trained model. + +Args: + config: Path to `axolotl` config YAML file. + accelerate: Whether to use `accelerate` launcher. + gradio: Whether to use Gradio browser interface or command line for inference. + kwargs: Additional keyword arguments which correspond to CLI args or `axolotl` + config options. + +### merge_lora { #axolotl.cli.main.merge_lora } + +```python +cli.main.merge_lora(config, **kwargs) +``` + +Merge trained LoRA adapters into a base model. + +Args: + config: Path to `axolotl` config YAML file. 
+ kwargs: Additional keyword arguments which correspond to CLI args or `axolotl` + config options. + +### merge_sharded_fsdp_weights { #axolotl.cli.main.merge_sharded_fsdp_weights } + +```python +cli.main.merge_sharded_fsdp_weights(config, accelerate, **kwargs) +``` + +Merge sharded FSDP model weights. + +Args: + config: Path to `axolotl` config YAML file. + accelerate: Whether to use `accelerate` launcher. + kwargs: Additional keyword arguments which correspond to CLI args or `axolotl` + config options. + +### preprocess { #axolotl.cli.main.preprocess } + +```python +cli.main.preprocess(config, cloud=None, **kwargs) +``` + +Preprocess datasets before training. + +Args: + config: Path to `axolotl` config YAML file. + cloud: Path to a cloud accelerator configuration file. + kwargs: Additional keyword arguments which correspond to CLI args or `axolotl` + config options. + +### train { #axolotl.cli.main.train } + +```python +cli.main.train(config, accelerate, cloud=None, sweep=None, **kwargs) +``` + +Train or fine-tune a model. + +Args: + config: Path to `axolotl` config YAML file. + accelerate: Whether to use `accelerate` launcher. + cloud: Path to a cloud accelerator configuration file + sweep: Path to YAML config for sweeping hyperparameters. + kwargs: Additional keyword arguments which correspond to CLI args or `axolotl` + config options. diff --git a/docs/api/cli.train.qmd b/docs/api/cli.train.qmd new file mode 100644 index 000000000..6254889b3 --- /dev/null +++ b/docs/api/cli.train.qmd @@ -0,0 +1,38 @@ +# cli.train { #axolotl.cli.train } + +`cli.train` + +CLI to run training on a model. + +## Functions + +| Name | Description | +| --- | --- | +| [do_cli](#axolotl.cli.train.do_cli) | Parses `axolotl` config, CLI args, and calls `do_train`. 
| [do_train](#axolotl.cli.train.do_train) | Trains a `transformers` model by first loading the dataset(s) specified in the `axolotl` config. | + +### do_cli { #axolotl.cli.train.do_cli } + +```python +cli.train.do_cli(config=Path('examples/'), **kwargs) +``` + +Parses `axolotl` config, CLI args, and calls `do_train`. + +Args: + config: Path to `axolotl` config YAML file. + kwargs: Additional keyword arguments to override config file values. + +### do_train { #axolotl.cli.train.do_train } + +```python +cli.train.do_train(cfg, cli_args) +``` + +Trains a `transformers` model by first loading the dataset(s) specified in the +`axolotl` config, and then calling `axolotl.train.train`. Also runs the plugin +manager's `post_train_unload` once training completes. + +Args: + cfg: Dictionary mapping `axolotl` config keys to values. + cli_args: Training-specific CLI arguments. diff --git a/docs/api/datasets.qmd b/docs/api/datasets.qmd new file mode 100644 index 000000000..2fea393a1 --- /dev/null +++ b/docs/api/datasets.qmd @@ -0,0 +1,44 @@ +# datasets { #axolotl.datasets } + +`datasets` + +Module containing Dataset functionality + +## Classes + +| Name | Description | +| --- | --- | +| [ConstantLengthDataset](#axolotl.datasets.ConstantLengthDataset) | Iterable dataset that returns constant length chunks of tokens from a stream of text files. | +| [TokenizedPromptDataset](#axolotl.datasets.TokenizedPromptDataset) | Dataset that returns tokenized prompts from a stream of text files. | + +### ConstantLengthDataset { #axolotl.datasets.ConstantLengthDataset } + +```python +datasets.ConstantLengthDataset(self, tokenizer, datasets, seq_length=2048) +``` + +Iterable dataset that returns constant length chunks of tokens from a stream of text files. + Args: + tokenizer (Tokenizer): The processor used for processing the data. + datasets (dataset.Dataset): Dataset with text files. + seq_length (int): Length of token sequences to return. 
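
To make the packing idea concrete, here is a minimal standalone sketch of what a constant-length dataset does with a stream of tokenized examples. This is illustrative only, not axolotl's `ConstantLengthDataset` (which also handles tokenization, EOS insertion, and streaming); `pack_constant_length` is a hypothetical helper invented for this example.

```python
from typing import Iterable, Iterator, List

def pack_constant_length(
    token_streams: Iterable[List[int]], seq_length: int
) -> Iterator[List[int]]:
    """Concatenate tokenized examples and yield fixed-length token chunks."""
    buffer: List[int] = []
    for tokens in token_streams:
        buffer.extend(tokens)
        # Emit a chunk whenever the buffer holds at least seq_length tokens.
        while len(buffer) >= seq_length:
            yield buffer[:seq_length]
            buffer = buffer[seq_length:]
    # Trailing tokens shorter than seq_length are dropped in this sketch.

chunks = list(pack_constant_length([[1, 2, 3], [4, 5], [6, 7, 8, 9]], seq_length=4))
print(chunks)  # [[1, 2, 3, 4], [5, 6, 7, 8]]
```

Every yielded chunk has exactly `seq_length` tokens regardless of the input example boundaries, which is what lets training batches avoid padding.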
+ +### TokenizedPromptDataset { #axolotl.datasets.TokenizedPromptDataset } + +```python +datasets.TokenizedPromptDataset( + self, + prompt_tokenizer, + dataset, + process_count=None, + keep_in_memory=False, + **kwargs, +) +``` + +Dataset that returns tokenized prompts from a stream of text files. + Args: + prompt_tokenizer (PromptTokenizingStrategy): The prompt tokenizing method for processing the data. + dataset (dataset.Dataset): Dataset with text files. + process_count (int): Number of processes to use for tokenizing. + keep_in_memory (bool): Whether to keep the tokenized dataset in memory. diff --git a/docs/api/evaluate.qmd b/docs/api/evaluate.qmd new file mode 100644 index 000000000..ae9fff77b --- /dev/null +++ b/docs/api/evaluate.qmd @@ -0,0 +1,47 @@ +# evaluate { #axolotl.evaluate } + +`evaluate` + +Module for evaluating models. + +## Functions + +| Name | Description | +| --- | --- | +| [evaluate](#axolotl.evaluate.evaluate) | Evaluate a model on training and validation datasets | +| [evaluate_dataset](#axolotl.evaluate.evaluate_dataset) | Helper function to evaluate a single dataset safely. | + +### evaluate { #axolotl.evaluate.evaluate } + +```python +evaluate.evaluate(cfg, dataset_meta) +``` + +Evaluate a model on training and validation datasets + +Args: + cfg: Dictionary mapping `axolotl` config keys to values. + dataset_meta: Dataset metadata containing training and evaluation datasets. + +Returns: + Tuple containing: + - The model (either PeftModel or PreTrainedModel) + - The tokenizer + - Dictionary of evaluation metrics + +### evaluate_dataset { #axolotl.evaluate.evaluate_dataset } + +```python +evaluate.evaluate_dataset(trainer, dataset, dataset_type, flash_optimum=False) +``` + +Helper function to evaluate a single dataset safely. 
+ +Args: + trainer: The trainer instance + dataset: Dataset to evaluate + dataset_type: Type of dataset ('train' or 'eval') + flash_optimum: Whether to use flash optimum + +Returns: + Dictionary of metrics or None if dataset is None diff --git a/docs/api/index.qmd b/docs/api/index.qmd new file mode 100644 index 000000000..6331cfc6e --- /dev/null +++ b/docs/api/index.qmd @@ -0,0 +1,39 @@ +# API Reference {.doc .doc-index} + +## Core + +Core functionality for training + +| | | +| --- | --- | +| [train](train.qmd#axolotl.train) | Prepare and train a model on a dataset. Can also infer from a model or merge lora | +| [evaluate](evaluate.qmd#axolotl.evaluate) | Module for evaluating models. | +| [datasets](datasets.qmd#axolotl.datasets) | Module containing Dataset functionality | + +## CLI + +Command-line interface + +| | | +| --- | --- | +| [cli.main](cli.main.qmd#axolotl.cli.main) | Click CLI definitions for various axolotl commands. | +| [cli.train](cli.train.qmd#axolotl.cli.train) | CLI to run training on a model. | +| [cli.evaluate](cli.evaluate.qmd#axolotl.cli.evaluate) | CLI to run evaluation on a model. 
| + +## Prompt Strategies + +Prompt formatting strategies + +| | | +| --- | --- | +| [prompt_strategies.base](prompt_strategies.base.qmd#axolotl.prompt_strategies.base) | module for base dataset transform strategies | +| [prompt_strategies.chat_template](prompt_strategies.chat_template.qmd#axolotl.prompt_strategies.chat_template) | HF Chat Templates prompt strategy | + +## Utils + +Utility functions + +| | | +| --- | --- | +| [utils.models](utils.models.qmd#axolotl.utils.models) | Module for models and model loading | +| [utils.tokenization](utils.tokenization.qmd#axolotl.utils.tokenization) | Module for tokenization utilities | diff --git a/docs/api/prompt_strategies.base.qmd b/docs/api/prompt_strategies.base.qmd new file mode 100644 index 000000000..dff9733b4 --- /dev/null +++ b/docs/api/prompt_strategies.base.qmd @@ -0,0 +1,5 @@ +# prompt_strategies.base { #axolotl.prompt_strategies.base } + +`prompt_strategies.base` + +module for base dataset transform strategies diff --git a/docs/api/prompt_strategies.chat_template.qmd b/docs/api/prompt_strategies.chat_template.qmd new file mode 100644 index 000000000..212e0a1c9 --- /dev/null +++ b/docs/api/prompt_strategies.chat_template.qmd @@ -0,0 +1,80 @@ +# prompt_strategies.chat_template { #axolotl.prompt_strategies.chat_template } + +`prompt_strategies.chat_template` + +HF Chat Templates prompt strategy + +## Classes + +| Name | Description | +| --- | --- | +| [ChatTemplatePrompter](#axolotl.prompt_strategies.chat_template.ChatTemplatePrompter) | Prompter for HF chat templates | +| [ChatTemplateStrategy](#axolotl.prompt_strategies.chat_template.ChatTemplateStrategy) | Tokenizing strategy for instruction-based prompts. | +| [StrategyLoader](#axolotl.prompt_strategies.chat_template.StrategyLoader) | Load chat template strategy based on configuration. 
| + +### ChatTemplatePrompter { #axolotl.prompt_strategies.chat_template.ChatTemplatePrompter } + +```python +prompt_strategies.chat_template.ChatTemplatePrompter( + self, + tokenizer, + chat_template, + processor=None, + max_length=2048, + message_property_mappings=None, + message_field_training=None, + message_field_training_detail=None, + field_messages='messages', + roles=None, + drop_system_message=False, +) +``` + +Prompter for HF chat templates + +### ChatTemplateStrategy { #axolotl.prompt_strategies.chat_template.ChatTemplateStrategy } + +```python +prompt_strategies.chat_template.ChatTemplateStrategy( + self, + prompter, + tokenizer, + train_on_inputs, + sequence_len, + roles_to_train=None, + train_on_eos=None, +) +``` + +Tokenizing strategy for instruction-based prompts. + +#### Methods + +| Name | Description | +| --- | --- | +| [find_turn](#axolotl.prompt_strategies.chat_template.ChatTemplateStrategy.find_turn) | Locate the starting and ending indices of the specified turn in a conversation. | +| [tokenize_prompt](#axolotl.prompt_strategies.chat_template.ChatTemplateStrategy.tokenize_prompt) | Public method that can handle either a single prompt or a batch of prompts. | + +##### find_turn { #axolotl.prompt_strategies.chat_template.ChatTemplateStrategy.find_turn } + +```python +prompt_strategies.chat_template.ChatTemplateStrategy.find_turn(turns, turn_idx) +``` + +Locate the starting and ending indices of the specified turn in a conversation. + +##### tokenize_prompt { #axolotl.prompt_strategies.chat_template.ChatTemplateStrategy.tokenize_prompt } + +```python +prompt_strategies.chat_template.ChatTemplateStrategy.tokenize_prompt(prompt) +``` + +Public method that can handle either a single prompt or a batch of prompts. + +### StrategyLoader { #axolotl.prompt_strategies.chat_template.StrategyLoader } + +```python +prompt_strategies.chat_template.StrategyLoader() +``` + +Load chat template strategy based on configuration. 
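
The interplay of `roles_to_train` and loss masking can be sketched in miniature. The snippet below is a toy rendering with invented role markers, not the HF chat-template path the strategy actually uses, and `render_chat` is a hypothetical function for illustration; axolotl masks at the token level rather than per turn.

```python
from typing import Dict, List, Sequence, Tuple

def render_chat(
    messages: Sequence[Dict[str, str]],
    roles_to_train: Tuple[str, ...] = ("assistant",),
) -> Tuple[str, List[bool]]:
    """Render messages into one prompt string plus a per-turn train mask."""
    segments: List[str] = []
    mask: List[bool] = []
    for msg in messages:
        # Toy role markers; real formatting comes from the tokenizer's chat template.
        segments.append(f"<|{msg['role']}|>{msg['content']}<|end|>")
        # Only turns from trainable roles contribute to the training loss.
        mask.append(msg["role"] in roles_to_train)
    return "".join(segments), mask

prompt, mask = render_chat(
    [{"role": "user", "content": "Hi"}, {"role": "assistant", "content": "Hello!"}]
)
print(mask)  # [False, True]
```

With the default `roles_to_train`, only the assistant turn is unmasked, mirroring the common "train on outputs only" setup.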
diff --git a/docs/api/train.qmd b/docs/api/train.qmd new file mode 100644 index 000000000..5f558a385 --- /dev/null +++ b/docs/api/train.qmd @@ -0,0 +1,199 @@ +# train { #axolotl.train } + +`train` + +Prepare and train a model on a dataset. Can also infer from a model or merge lora + +## Functions + +| Name | Description | +| --- | --- | +| [create_model_card](#axolotl.train.create_model_card) | Create a model card for the trained model if needed. | +| [determine_resume_checkpoint](#axolotl.train.determine_resume_checkpoint) | Determine the checkpoint to resume from based on configuration. | +| [execute_training](#axolotl.train.execute_training) | Execute the training process with appropriate backend configurations. | +| [handle_untrained_tokens_fix](#axolotl.train.handle_untrained_tokens_fix) | Apply fixes for untrained tokens if configured. | +| [save_initial_configs](#axolotl.train.save_initial_configs) | Save initial configurations before training. | +| [save_trained_model](#axolotl.train.save_trained_model) | Save the trained model according to configuration and training setup. | +| [setup_model_and_tokenizer](#axolotl.train.setup_model_and_tokenizer) | Load the tokenizer, processor (for multimodal models), and model based on configuration. | +| [setup_model_and_trainer](#axolotl.train.setup_model_and_trainer) | Load model, tokenizer, trainer, etc. Helper function to encapsulate the full trainer setup. | +| [setup_model_card](#axolotl.train.setup_model_card) | Set up the Axolotl badge and add the Axolotl config to the model card if available. | +| [setup_reference_model](#axolotl.train.setup_reference_model) | Set up the reference model for RL training if needed. | +| [setup_signal_handler](#axolotl.train.setup_signal_handler) | Set up signal handler for graceful termination. | +| [train](#axolotl.train.train) | Train a model on the given dataset. 
| + +### create_model_card { #axolotl.train.create_model_card } + +```python +train.create_model_card(cfg, trainer) +``` + +Create a model card for the trained model if needed. + +Args: + cfg: Dictionary mapping `axolotl` config keys to values. + trainer: The trainer object with model card creation capabilities. + +### determine_resume_checkpoint { #axolotl.train.determine_resume_checkpoint } + +```python +train.determine_resume_checkpoint(cfg) +``` + +Determine the checkpoint to resume from based on configuration. + +Args: + cfg: Dictionary mapping `axolotl` config keys to values. + +Returns: + Path to the checkpoint to resume from, or `None` if not resuming. + +### execute_training { #axolotl.train.execute_training } + +```python +train.execute_training(cfg, trainer, resume_from_checkpoint) +``` + +Execute the training process with appropriate backend configurations. + +Args: + cfg: Dictionary mapping `axolotl` config keys to values. + trainer: The configured trainer object. + resume_from_checkpoint: Path to checkpoint to resume from, if applicable. + +### handle_untrained_tokens_fix { #axolotl.train.handle_untrained_tokens_fix } + +```python +train.handle_untrained_tokens_fix( + cfg, + model, + tokenizer, + train_dataset, + safe_serialization, +) +``` + +Apply fixes for untrained tokens if configured. + +Args: + cfg: Dictionary mapping `axolotl` config keys to values. + model: The model to apply fixes to. + tokenizer: The tokenizer for token identification. + train_dataset: The training dataset to use. + safe_serialization: Whether to use safe serialization when saving. + +### save_initial_configs { #axolotl.train.save_initial_configs } + +```python +train.save_initial_configs(cfg, tokenizer, model, peft_config) +``` + +Save initial configurations before training. + +Args: + cfg: Dictionary mapping `axolotl` config keys to values. + tokenizer: The tokenizer to save. + model: The model to save configuration for. 
+ peft_config: The PEFT configuration to save if applicable. + +### save_trained_model { #axolotl.train.save_trained_model } + +```python +train.save_trained_model(cfg, trainer, model, safe_serialization) +``` + +Save the trained model according to configuration and training setup. + +Args: + cfg: Dictionary mapping `axolotl` config keys to values. + trainer: The trainer object. + model: The trained model to save. + safe_serialization: Whether to use safe serialization. + +### setup_model_and_tokenizer { #axolotl.train.setup_model_and_tokenizer } + +```python +train.setup_model_and_tokenizer(cfg) +``` + +Load the tokenizer, processor (for multimodal models), and model based on configuration. + +Args: + cfg: Dictionary mapping `axolotl` config keys to values. + +Returns: + Tuple containing model, tokenizer, `peft_config` (if LoRA / QLoRA, else + `None`), and processor (if multimodal, else `None`). + +### setup_model_and_trainer { #axolotl.train.setup_model_and_trainer } + +```python +train.setup_model_and_trainer(cfg, dataset_meta) +``` + +Load model, tokenizer, trainer, etc. Helper function to encapsulate the full +trainer setup. + +Args: + cfg: The configuration dictionary with training parameters. + dataset_meta: Object with training, validation datasets and metadata. + +Returns: + Tuple of: + - Trainer (Causal or RLHF) + - Model + - Tokenizer + - PEFT config + +### setup_model_card { #axolotl.train.setup_model_card } + +```python +train.setup_model_card(cfg) +``` + +Set up the Axolotl badge and add the Axolotl config to the model card if available. + +Args: + cfg: Dictionary mapping `axolotl` config keys to values. + +### setup_reference_model { #axolotl.train.setup_reference_model } + +```python +train.setup_reference_model(cfg, tokenizer) +``` + +Set up the reference model for RL training if needed. + +Args: + cfg: Dictionary mapping `axolotl` config keys to values. + tokenizer: The tokenizer to use for the reference model. 
+ +Returns: + Reference model if needed for RL training, `None` otherwise. + +### setup_signal_handler { #axolotl.train.setup_signal_handler } + +```python +train.setup_signal_handler(cfg, model, safe_serialization) +``` + +Set up signal handler for graceful termination. + +Args: + cfg: Dictionary mapping `axolotl` config keys to values. + model: The model to save on termination + safe_serialization: Whether to use safe serialization when saving + +### train { #axolotl.train.train } + +```python +train.train(cfg, dataset_meta) +``` + +Train a model on the given dataset. + +Args: + cfg: The configuration dictionary with training parameters + dataset_meta: Object with training, validation datasets and metadata + +Returns: + Tuple of (model, tokenizer) after training diff --git a/docs/api/utils.models.qmd b/docs/api/utils.models.qmd new file mode 100644 index 000000000..2b590edf3 --- /dev/null +++ b/docs/api/utils.models.qmd @@ -0,0 +1,161 @@ +# utils.models { #axolotl.utils.models } + +`utils.models` + +Module for models and model loading + +## Classes + +| Name | Description | +| --- | --- | +| [ModelLoader](#axolotl.utils.models.ModelLoader) | ModelLoader: managing all the config and monkey patches while loading model | + +### ModelLoader { #axolotl.utils.models.ModelLoader } + +```python +utils.models.ModelLoader( + self, + cfg, + tokenizer, + *, + processor=None, + inference=False, + reference_model=False, + **kwargs, +) +``` + +ModelLoader: managing all the config and monkey patches while loading model + +#### Attributes + +| Name | Description | +| --- | --- | +| [has_flash_attn](#axolotl.utils.models.ModelLoader.has_flash_attn) | Check if flash attention is installed | + +#### Methods + +| Name | Description | +| --- | --- | +| [patch_llama_derived_model](#axolotl.utils.models.ModelLoader.patch_llama_derived_model) | Modify all llama derived models in one block | +| [patch_loss_llama](#axolotl.utils.models.ModelLoader.patch_loss_llama) | Patch loss functions 
and other optimizations | +| [set_attention_config](#axolotl.utils.models.ModelLoader.set_attention_config) | Sample packing uses a custom FA2 patch | +| [set_auto_model_loader](#axolotl.utils.models.ModelLoader.set_auto_model_loader) | Set `self.AutoModelLoader` | + +##### patch_llama_derived_model { #axolotl.utils.models.ModelLoader.patch_llama_derived_model } + +```python +utils.models.ModelLoader.patch_llama_derived_model() +``` + +Modify all llama derived models in one block + +##### patch_loss_llama { #axolotl.utils.models.ModelLoader.patch_loss_llama } + +```python +utils.models.ModelLoader.patch_loss_llama() +``` + +Patch loss functions and other optimizations + +##### set_attention_config { #axolotl.utils.models.ModelLoader.set_attention_config } + +```python +utils.models.ModelLoader.set_attention_config() +``` + +Sample packing uses a custom FA2 patch + +##### set_auto_model_loader { #axolotl.utils.models.ModelLoader.set_auto_model_loader } + +```python +utils.models.ModelLoader.set_auto_model_loader() +``` + +Set `self.AutoModelLoader` +- default value: `AutoModelForCausalLM` (set at `__init__`) +- when using a multimodal model, `self.AutoModelLoader` should + be set according to the model's type + +## Functions + +| Name | Description | +| --- | --- | +| [get_module_class_from_name](#axolotl.utils.models.get_module_class_from_name) | Gets a class from a module by its name. | +| [load_model](#axolotl.utils.models.load_model) | Load a model for a given configuration and tokenizer. | +| [load_tokenizer](#axolotl.utils.models.load_tokenizer) | Load and configure the tokenizer based on the provided config. | +| [modify_tokenizer_files](#axolotl.utils.models.modify_tokenizer_files) | Modify tokenizer files to replace added_tokens strings, save to output directory, and return the path to the modified tokenizer. 
| +| [setup_quantized_meta_for_peft](#axolotl.utils.models.setup_quantized_meta_for_peft) | Replaces `quant_state.to` with a dummy function to prevent PEFT from moving `quant_state` to meta device | +| [setup_quantized_peft_meta_for_training](#axolotl.utils.models.setup_quantized_peft_meta_for_training) | Replaces dummy `quant_state.to` method with the original function to allow training to continue | + +### get_module_class_from_name { #axolotl.utils.models.get_module_class_from_name } + +```python +utils.models.get_module_class_from_name(module, name) +``` + +Gets a class from a module by its name. + +Args: + module (`torch.nn.Module`): The module to get the class from. + name (`str`): The name of the class. + +### load_model { #axolotl.utils.models.load_model } + +```python +utils.models.load_model( + cfg, + tokenizer, + *, + processor=None, + inference=False, + reference_model=False, + **kwargs, +) +``` + +Load a model for a given configuration and tokenizer. + +### load_tokenizer { #axolotl.utils.models.load_tokenizer } + +```python +utils.models.load_tokenizer(cfg) +``` + +Load and configure the tokenizer based on the provided config. + +### modify_tokenizer_files { #axolotl.utils.models.modify_tokenizer_files } + +```python +utils.models.modify_tokenizer_files(tokenizer_path, token_mappings, output_dir) +``` + +Modify tokenizer files to replace added_tokens strings, save to output directory, and return the path to the modified tokenizer. + +This only works with reserved tokens that were added to the tokenizer, not tokens already part of the vocab. 
+ +Args: + tokenizer_path: Path or name of the original tokenizer + token_mappings: Dict mapping {token_id (int): new_token_string} + output_dir: Directory to save the modified tokenizer + +Returns: + Path to the modified tokenizer directory + +Ref: https://github.com/huggingface/transformers/issues/27974#issuecomment-1854188941 + +### setup_quantized_meta_for_peft { #axolotl.utils.models.setup_quantized_meta_for_peft } + +```python +utils.models.setup_quantized_meta_for_peft(model) +``` + +Replaces `quant_state.to` with a dummy function to prevent PEFT from moving `quant_state` to meta device + +### setup_quantized_peft_meta_for_training { #axolotl.utils.models.setup_quantized_peft_meta_for_training } + +```python +utils.models.setup_quantized_peft_meta_for_training(model) +``` + +Replaces dummy `quant_state.to` method with the original function to allow training to continue diff --git a/docs/api/utils.tokenization.qmd b/docs/api/utils.tokenization.qmd new file mode 100644 index 000000000..d3ac1f0d2 --- /dev/null +++ b/docs/api/utils.tokenization.qmd @@ -0,0 +1,38 @@ +# utils.tokenization { #axolotl.utils.tokenization } + +`utils.tokenization` + +Module for tokenization utilities + +## Functions + +| Name | Description | +| --- | --- | +| [color_token_for_rl_debug](#axolotl.utils.tokenization.color_token_for_rl_debug) | Helper function to color tokens based on their type. | +| [process_tokens_for_rl_debug](#axolotl.utils.tokenization.process_tokens_for_rl_debug) | Helper function to process and color tokens. | + +### color_token_for_rl_debug { #axolotl.utils.tokenization.color_token_for_rl_debug } + +```python +utils.tokenization.color_token_for_rl_debug( + decoded_token, + encoded_token, + color, + text_only, +) +``` + +Helper function to color tokens based on their type. 
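
The coloring itself comes down to ANSI escape codes. A minimal sketch of the technique follows; the real function's signature differs (it also receives the encoded token), and `color_token` with its `ANSI_CODES` table is invented for this example.

```python
# ANSI SGR color codes for terminal output.
ANSI_CODES = {"red": "31", "green": "32", "yellow": "33", "blue": "34"}

def color_token(decoded_token: str, color: str, text_only: bool = False) -> str:
    """Wrap a token in an ANSI color escape, or pass it through for text-only logs."""
    if text_only:
        return decoded_token
    return f"\x1b[{ANSI_CODES[color]}m{decoded_token}\x1b[0m"

print(color_token("Hello", "green"))                  # colored when viewed in a terminal
print(color_token("Hello", "green", text_only=True))  # plain text
```

The `text_only` escape hatch matters when debug output is written to log files, where raw escape sequences would just add noise.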
+ +### process_tokens_for_rl_debug { #axolotl.utils.tokenization.process_tokens_for_rl_debug } + +```python +utils.tokenization.process_tokens_for_rl_debug( + tokens, + color, + tokenizer, + text_only, +) +``` + +Helper function to process and color tokens. diff --git a/objects.json b/objects.json new file mode 100644 index 000000000..49c176f42 --- /dev/null +++ b/objects.json @@ -0,0 +1 @@ +{"project": "axolotl", "version": "0.0.9999", "count": 57, "items": [{"name": "axolotl.train.create_model_card", "domain": "py", "role": "function", "priority": "1", "uri": "docs/api/train.html#axolotl.train.create_model_card", "dispname": "-"}, {"name": "axolotl.train.determine_resume_checkpoint", "domain": "py", "role": "function", "priority": "1", "uri": "docs/api/train.html#axolotl.train.determine_resume_checkpoint", "dispname": "-"}, {"name": "axolotl.train.execute_training", "domain": "py", "role": "function", "priority": "1", "uri": "docs/api/train.html#axolotl.train.execute_training", "dispname": "-"}, {"name": "axolotl.train.handle_untrained_tokens_fix", "domain": "py", "role": "function", "priority": "1", "uri": "docs/api/train.html#axolotl.train.handle_untrained_tokens_fix", "dispname": "-"}, {"name": "axolotl.train.save_initial_configs", "domain": "py", "role": "function", "priority": "1", "uri": "docs/api/train.html#axolotl.train.save_initial_configs", "dispname": "-"}, {"name": "axolotl.train.save_trained_model", "domain": "py", "role": "function", "priority": "1", "uri": "docs/api/train.html#axolotl.train.save_trained_model", "dispname": "-"}, {"name": "axolotl.train.setup_model_and_tokenizer", "domain": "py", "role": "function", "priority": "1", "uri": "docs/api/train.html#axolotl.train.setup_model_and_tokenizer", "dispname": "-"}, {"name": "axolotl.train.setup_model_and_trainer", "domain": "py", "role": "function", "priority": "1", "uri": "docs/api/train.html#axolotl.train.setup_model_and_trainer", "dispname": "-"}, {"name": "axolotl.train.setup_model_card", 
"domain": "py", "role": "function", "priority": "1", "uri": "docs/api/train.html#axolotl.train.setup_model_card", "dispname": "-"}, {"name": "axolotl.train.setup_reference_model", "domain": "py", "role": "function", "priority": "1", "uri": "docs/api/train.html#axolotl.train.setup_reference_model", "dispname": "-"}, {"name": "axolotl.train.setup_signal_handler", "domain": "py", "role": "function", "priority": "1", "uri": "docs/api/train.html#axolotl.train.setup_signal_handler", "dispname": "-"}, {"name": "axolotl.train.train", "domain": "py", "role": "function", "priority": "1", "uri": "docs/api/train.html#axolotl.train.train", "dispname": "-"}, {"name": "axolotl.train", "domain": "py", "role": "module", "priority": "1", "uri": "docs/api/train.html#axolotl.train", "dispname": "-"}, {"name": "axolotl.evaluate.evaluate", "domain": "py", "role": "function", "priority": "1", "uri": "docs/api/evaluate.html#axolotl.evaluate.evaluate", "dispname": "-"}, {"name": "axolotl.evaluate.evaluate_dataset", "domain": "py", "role": "function", "priority": "1", "uri": "docs/api/evaluate.html#axolotl.evaluate.evaluate_dataset", "dispname": "-"}, {"name": "axolotl.evaluate", "domain": "py", "role": "module", "priority": "1", "uri": "docs/api/evaluate.html#axolotl.evaluate", "dispname": "-"}, {"name": "axolotl.datasets.ConstantLengthDataset", "domain": "py", "role": "class", "priority": "1", "uri": "docs/api/datasets.html#axolotl.datasets.ConstantLengthDataset", "dispname": "-"}, {"name": "axolotl.datasets.TokenizedPromptDataset", "domain": "py", "role": "class", "priority": "1", "uri": "docs/api/datasets.html#axolotl.datasets.TokenizedPromptDataset", "dispname": "-"}, {"name": "axolotl.datasets", "domain": "py", "role": "module", "priority": "1", "uri": "docs/api/datasets.html#axolotl.datasets", "dispname": "-"}, {"name": "axolotl.cli.main.cli", "domain": "py", "role": "function", "priority": "1", "uri": "docs/api/cli.main.html#axolotl.cli.main.cli", "dispname": "-"}, {"name": 
"axolotl.cli.main.evaluate", "domain": "py", "role": "function", "priority": "1", "uri": "docs/api/cli.main.html#axolotl.cli.main.evaluate", "dispname": "-"}, {"name": "axolotl.cli.main.fetch", "domain": "py", "role": "function", "priority": "1", "uri": "docs/api/cli.main.html#axolotl.cli.main.fetch", "dispname": "-"}, {"name": "axolotl.cli.main.inference", "domain": "py", "role": "function", "priority": "1", "uri": "docs/api/cli.main.html#axolotl.cli.main.inference", "dispname": "-"}, {"name": "axolotl.cli.main.merge_lora", "domain": "py", "role": "function", "priority": "1", "uri": "docs/api/cli.main.html#axolotl.cli.main.merge_lora", "dispname": "-"}, {"name": "axolotl.cli.main.merge_sharded_fsdp_weights", "domain": "py", "role": "function", "priority": "1", "uri": "docs/api/cli.main.html#axolotl.cli.main.merge_sharded_fsdp_weights", "dispname": "-"}, {"name": "axolotl.cli.main.preprocess", "domain": "py", "role": "function", "priority": "1", "uri": "docs/api/cli.main.html#axolotl.cli.main.preprocess", "dispname": "-"}, {"name": "axolotl.cli.main.train", "domain": "py", "role": "function", "priority": "1", "uri": "docs/api/cli.main.html#axolotl.cli.main.train", "dispname": "-"}, {"name": "axolotl.cli.main", "domain": "py", "role": "module", "priority": "1", "uri": "docs/api/cli.main.html#axolotl.cli.main", "dispname": "-"}, {"name": "axolotl.cli.train.do_cli", "domain": "py", "role": "function", "priority": "1", "uri": "docs/api/cli.train.html#axolotl.cli.train.do_cli", "dispname": "-"}, {"name": "axolotl.cli.train.do_train", "domain": "py", "role": "function", "priority": "1", "uri": "docs/api/cli.train.html#axolotl.cli.train.do_train", "dispname": "-"}, {"name": "axolotl.cli.train", "domain": "py", "role": "module", "priority": "1", "uri": "docs/api/cli.train.html#axolotl.cli.train", "dispname": "-"}, {"name": "axolotl.cli.evaluate.do_cli", "domain": "py", "role": "function", "priority": "1", "uri": "docs/api/cli.evaluate.html#axolotl.cli.evaluate.do_cli", 
"dispname": "-"}, {"name": "axolotl.cli.evaluate.do_evaluate", "domain": "py", "role": "function", "priority": "1", "uri": "docs/api/cli.evaluate.html#axolotl.cli.evaluate.do_evaluate", "dispname": "-"}, {"name": "axolotl.cli.evaluate", "domain": "py", "role": "module", "priority": "1", "uri": "docs/api/cli.evaluate.html#axolotl.cli.evaluate", "dispname": "-"}, {"name": "axolotl.prompt_strategies.base", "domain": "py", "role": "module", "priority": "1", "uri": "docs/api/prompt_strategies.base.html#axolotl.prompt_strategies.base", "dispname": "-"}, {"name": "axolotl.prompt_strategies.chat_template.ChatTemplatePrompter", "domain": "py", "role": "class", "priority": "1", "uri": "docs/api/prompt_strategies.chat_template.html#axolotl.prompt_strategies.chat_template.ChatTemplatePrompter", "dispname": "-"}, {"name": "axolotl.prompt_strategies.chat_template.ChatTemplateStrategy.find_turn", "domain": "py", "role": "function", "priority": "1", "uri": "docs/api/prompt_strategies.chat_template.html#axolotl.prompt_strategies.chat_template.ChatTemplateStrategy.find_turn", "dispname": "-"}, {"name": "axolotl.prompt_strategies.chat_template.ChatTemplateStrategy.tokenize_prompt", "domain": "py", "role": "function", "priority": "1", "uri": "docs/api/prompt_strategies.chat_template.html#axolotl.prompt_strategies.chat_template.ChatTemplateStrategy.tokenize_prompt", "dispname": "-"}, {"name": "axolotl.prompt_strategies.chat_template.ChatTemplateStrategy", "domain": "py", "role": "class", "priority": "1", "uri": "docs/api/prompt_strategies.chat_template.html#axolotl.prompt_strategies.chat_template.ChatTemplateStrategy", "dispname": "-"}, {"name": "axolotl.prompt_strategies.chat_template.StrategyLoader", "domain": "py", "role": "class", "priority": "1", "uri": "docs/api/prompt_strategies.chat_template.html#axolotl.prompt_strategies.chat_template.StrategyLoader", "dispname": "-"}, {"name": "axolotl.prompt_strategies.chat_template", "domain": "py", "role": "module", "priority": "1", "uri": 
"docs/api/prompt_strategies.chat_template.html#axolotl.prompt_strategies.chat_template", "dispname": "-"}, {"name": "axolotl.utils.models.ModelLoader.has_flash_attn", "domain": "py", "role": "attribute", "priority": "1", "uri": "docs/api/utils.models.html#axolotl.utils.models.ModelLoader.has_flash_attn", "dispname": "-"}, {"name": "axolotl.utils.models.ModelLoader.patch_llama_derived_model", "domain": "py", "role": "function", "priority": "1", "uri": "docs/api/utils.models.html#axolotl.utils.models.ModelLoader.patch_llama_derived_model", "dispname": "-"}, {"name": "axolotl.utils.models.ModelLoader.patch_loss_llama", "domain": "py", "role": "function", "priority": "1", "uri": "docs/api/utils.models.html#axolotl.utils.models.ModelLoader.patch_loss_llama", "dispname": "-"}, {"name": "axolotl.utils.models.ModelLoader.set_attention_config", "domain": "py", "role": "function", "priority": "1", "uri": "docs/api/utils.models.html#axolotl.utils.models.ModelLoader.set_attention_config", "dispname": "-"}, {"name": "axolotl.utils.models.ModelLoader.set_auto_model_loader", "domain": "py", "role": "function", "priority": "1", "uri": "docs/api/utils.models.html#axolotl.utils.models.ModelLoader.set_auto_model_loader", "dispname": "-"}, {"name": "axolotl.utils.models.ModelLoader", "domain": "py", "role": "class", "priority": "1", "uri": "docs/api/utils.models.html#axolotl.utils.models.ModelLoader", "dispname": "-"}, {"name": "axolotl.utils.models.get_module_class_from_name", "domain": "py", "role": "function", "priority": "1", "uri": "docs/api/utils.models.html#axolotl.utils.models.get_module_class_from_name", "dispname": "-"}, {"name": "axolotl.utils.models.load_model", "domain": "py", "role": "function", "priority": "1", "uri": "docs/api/utils.models.html#axolotl.utils.models.load_model", "dispname": "-"}, {"name": "axolotl.utils.models.load_tokenizer", "domain": "py", "role": "function", "priority": "1", "uri": "docs/api/utils.models.html#axolotl.utils.models.load_tokenizer", 
"dispname": "-"}, {"name": "axolotl.utils.models.modify_tokenizer_files", "domain": "py", "role": "function", "priority": "1", "uri": "docs/api/utils.models.html#axolotl.utils.models.modify_tokenizer_files", "dispname": "-"}, {"name": "axolotl.utils.models.setup_quantized_meta_for_peft", "domain": "py", "role": "function", "priority": "1", "uri": "docs/api/utils.models.html#axolotl.utils.models.setup_quantized_meta_for_peft", "dispname": "-"}, {"name": "axolotl.utils.models.setup_quantized_peft_meta_for_training", "domain": "py", "role": "function", "priority": "1", "uri": "docs/api/utils.models.html#axolotl.utils.models.setup_quantized_peft_meta_for_training", "dispname": "-"}, {"name": "axolotl.utils.models", "domain": "py", "role": "module", "priority": "1", "uri": "docs/api/utils.models.html#axolotl.utils.models", "dispname": "-"}, {"name": "axolotl.utils.tokenization.color_token_for_rl_debug", "domain": "py", "role": "function", "priority": "1", "uri": "docs/api/utils.tokenization.html#axolotl.utils.tokenization.color_token_for_rl_debug", "dispname": "-"}, {"name": "axolotl.utils.tokenization.process_tokens_for_rl_debug", "domain": "py", "role": "function", "priority": "1", "uri": "docs/api/utils.tokenization.html#axolotl.utils.tokenization.process_tokens_for_rl_debug", "dispname": "-"}, {"name": "axolotl.utils.tokenization", "domain": "py", "role": "module", "priority": "1", "uri": "docs/api/utils.tokenization.html#axolotl.utils.tokenization", "dispname": "-"}]} diff --git a/quartodoc.yml b/quartodoc.yml new file mode 100644 index 000000000..fbbb93852 --- /dev/null +++ b/quartodoc.yml @@ -0,0 +1,47 @@ +project: + title: Axolotl API Reference + description: API documentation for Axolotl + +repo: + url: https://github.com/axolotl-ai-cloud/axolotl + +packages: + - package: axolotl + source_path: src/axolotl + url_path: /api + + sections: + - title: Core + source_path: core + url_path: core + + - title: CLI + source_path: cli + url_path: cli + + - title: Prompt 
Strategies + source_path: prompt_strategies + url_path: prompt_strategies + + - title: Utils + source_path: utils + url_path: utils + + - title: Integrations + source_path: integrations + url_path: integrations + + contents: + - "__init__.py" + - "train.py" + - "evaluate.py" + - "datasets.py" + - "prompters.py" + +renderer: + output_dir: docs/api + schema_dir: docs/schema + number_section_groups: true + template_paths: + - quartodoc_templates + quiet: false diff --git a/scripts/generate_docs.py b/scripts/generate_docs.py new file mode 100755 index 000000000..6e8ec8155 --- /dev/null +++ b/scripts/generate_docs.py @@ -0,0 +1,40 @@ +#!/usr/bin/env python3 +""" +Script to generate API documentation for Axolotl using quartodoc. +""" + +import os +import subprocess # nosec B404 +import sys + + +def run_command(cmd, check=True): + """Run a shell command and return the result.""" + print(f"Running: {cmd}") + result = subprocess.run(cmd, shell=True, check=False) # nosec B602 + if check and result.returncode != 0: + print(f"Error running command: {cmd}") + sys.exit(result.returncode) + return result + + +def main(): + """Generate API documentation for Axolotl.""" + # Ensure we're in the project root + if not os.path.exists("_quarto.yml"): + print("Error: _quarto.yml not found. Run this script from the project root.") + sys.exit(1) + + # Create the output directories if they don't exist + os.makedirs("docs/api", exist_ok=True) + + # Generate the documentation + print("Generating API documentation...") + run_command("quartodoc build") + + print("Documentation generated successfully!") + print("Run 'quarto preview' to view the documentation.") + + +if __name__ == "__main__": + main()
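Review note: the `run_command` helper in `scripts/generate_docs.py` passes the command string to a shell (`shell=True`), which is why the `# nosec` suppressions are needed. A minimal sketch of a shell-free alternative — the `shlex` splitting shown here is an assumption for illustration, not part of this PR:

```python
"""Sketch: shell-free variant of run_command (illustrative, not in the PR)."""
import shlex
import subprocess
import sys


def run_command(cmd: str, check: bool = True) -> subprocess.CompletedProcess:
    """Split the command with shlex and run it without invoking a shell."""
    argv = shlex.split(cmd)
    print(f"Running: {cmd}")
    # No shell is spawned, so shell metacharacters in arguments are inert.
    result = subprocess.run(argv, check=False)
    if check and result.returncode != 0:
        print(f"Error running command: {cmd}")
        sys.exit(result.returncode)
    return result
```

Because no shell is involved, the `# nosec B602` marker would no longer be necessary; the trade-off is that shell features like pipes and globbing are not available.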
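The `utils.tokenization` helpers documented above color tokens for RL debugging output. As a rough stand-alone illustration of the idea — the ANSI palette and function shapes below are assumptions, not axolotl's actual implementation:

```python
# Toy per-token ANSI coloring, loosely modeled on color_token_for_rl_debug.
# The palette and signatures are assumptions for illustration only.
ANSI = {"red": "\033[31m", "green": "\033[32m", "yellow": "\033[33m"}
RESET = "\033[0m"


def color_token(decoded_token: str, color: str, text_only: bool = False) -> str:
    """Wrap a decoded token in an ANSI color code unless plain text is requested."""
    if text_only or color not in ANSI:
        return decoded_token
    return f"{ANSI[color]}{decoded_token}{RESET}"


def color_tokens(tokens: list[str], colors: list[str], text_only: bool = False) -> str:
    """Join a token stream into one debug string, coloring token by token."""
    return "".join(color_token(t, c, text_only) for t, c in zip(tokens, colors))
```

A `text_only` flag like this lets the same code path feed both a color terminal and a plain log file.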