This commit is contained in:
Dan Saunders
2025-01-27 15:43:51 -05:00
parent f866157b74
commit 4d1553e53f
11 changed files with 159 additions and 39 deletions


@@ -0,0 +1,11 @@
# ConstantLengthDataset { #axolotl.ConstantLengthDataset }
```python
ConstantLengthDataset(self, tokenizer, datasets, seq_length=2048)
```
Iterable dataset that returns constant-length chunks of tokens from a stream of text files.
Args:
tokenizer (Tokenizer): The tokenizer used to process the data.
datasets (dataset.Dataset): Dataset with text files.
seq_length (int): Length of the token sequences to return.
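
To illustrate what "constant-length chunks" means, here is a minimal sketch of the chunking idea only. It is not the `ConstantLengthDataset` implementation: the helper name `constant_length_chunks` is hypothetical, the inputs are already-tokenized lists of integer ids, and the choice to drop a trailing partial chunk is an assumption made for this example.

```python
from typing import Iterable, Iterator, List


def constant_length_chunks(
    token_streams: Iterable[List[int]], seq_length: int
) -> Iterator[List[int]]:
    """Yield fixed-size chunks from a stream of tokenized documents.

    Hypothetical helper for illustration; not part of axolotl.
    """
    buffer: List[int] = []
    for tokens in token_streams:
        buffer.extend(tokens)
        # Emit as many full chunks as the buffer currently holds.
        while len(buffer) >= seq_length:
            yield buffer[:seq_length]
            buffer = buffer[seq_length:]
    # Assumption: a leftover shorter than seq_length is dropped.


# Three toy "documents" of token ids, chunked into sequences of length 4.
docs = [[1, 2, 3], [4, 5], [6, 7, 8, 9]]
chunks = list(constant_length_chunks(docs, seq_length=4))
print(chunks)
```

Every chunk has exactly `seq_length` tokens, and a chunk may span document boundaries, which is the property the description above refers to.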


@@ -0,0 +1,19 @@
# TokenizedPromptDataset { #axolotl.TokenizedPromptDataset }
```python
TokenizedPromptDataset(
self,
prompt_tokenizer,
dataset,
process_count=None,
keep_in_memory=False,
**kwargs,
)
```
Dataset that returns tokenized prompts from a stream of text files.
Args:
prompt_tokenizer (PromptTokenizingStrategy): The prompt tokenizing method for processing the data.
dataset (dataset.Dataset): Dataset with text files.
process_count (int): Number of processes to use for tokenizing.
keep_in_memory (bool): Whether to keep the tokenized dataset in memory.

28
api/choose_config.qmd Normal file

@@ -0,0 +1,28 @@
# choose_config { #axolotl.choose_config }
```python
choose_config(path)
```
Helper method for choosing an `axolotl` config YAML file (considering only files
ending with `.yml` or `.yaml`). If more than one config file exists in the passed
`path`, the user is prompted to choose one.
## Parameters {.doc-section .doc-section-parameters}
| Name | Type | Description | Default |
|--------|--------|-----------------------------------------------|------------|
| path | Path | Directory in which config file(s) are stored. | _required_ |
## Returns {.doc-section .doc-section-returns}
| Name | Type | Description |
|--------|--------|----------------------------------------------------------------------------------|
| | str | Path to either (1) the sole YAML file or (2), if more than one YAML file exists, the user-selected YAML file. |
## Raises {.doc-section .doc-section-raises}
| Name | Type | Description |
|--------|------------|-------------------------------------------------|
| | ValueError | If no YAML files are found in the given `path`. |
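
The selection behavior documented above can be sketched as follows. This is an illustrative reimplementation under stated assumptions, not the function's actual source: `choose_config_sketch` and its `ask` parameter are hypothetical names (the injectable `ask` exists only so the prompt can be exercised without a terminal), and the numbered-menu prompt format is assumed.

```python
import tempfile
from pathlib import Path
from typing import Callable


def choose_config_sketch(path: Path, ask: Callable[[str], str] = input) -> str:
    """Pick a YAML config from `path`, prompting only when several exist."""
    # Consider only files ending in .yml or .yaml, as documented.
    yaml_files = sorted(p for p in path.iterdir() if p.suffix in (".yml", ".yaml"))
    if not yaml_files:
        raise ValueError(f"No YAML config files found in {path}")
    if len(yaml_files) == 1:
        return str(yaml_files[0])
    # Assumed prompt format: a numbered menu, 1-indexed.
    for idx, file in enumerate(yaml_files, start=1):
        print(f"{idx}. {file.name}")
    choice = int(ask("Enter the number of your choice: "))
    return str(yaml_files[choice - 1])


with tempfile.TemporaryDirectory() as tmp:
    tmp_path = Path(tmp)
    (tmp_path / "a.yml").write_text("key: value\n")  # placeholder contents
    (tmp_path / "notes.txt").write_text("ignored\n")  # not a YAML file
    sole = choose_config_sketch(tmp_path)  # one YAML file: no prompt
    (tmp_path / "b.yaml").write_text("key: other\n")
    picked = choose_config_sketch(tmp_path, ask=lambda _prompt: "2")
```

With a single YAML file the path is returned immediately. Once a second file exists, the caller is asked to choose, and an empty directory raises `ValueError`.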

5
api/index.qmd Normal file

@@ -0,0 +1,5 @@
# Function reference {.doc .doc-index}
## Core API
Core functionality of Axolotl

21
api/load_cfg.qmd Normal file

@@ -0,0 +1,21 @@
# load_cfg { #axolotl.load_cfg }
```python
load_cfg(config=Path('examples/'), **kwargs)
```
Loads the `axolotl` configuration stored at `config`, validates it, and performs
various setup steps.
## Parameters {.doc-section .doc-section-parameters}
| Name | Type | Description | Default |
|--------|--------------------|--------------------------------------------------------------|---------------------|
| config | Union\[str, Path\] | Path (local or remote) to `axolotl` config YAML file. | `Path('examples/')` |
| kwargs | | Additional keyword arguments to override config file values. | `{}` |
## Returns {.doc-section .doc-section-returns}
| Name | Type | Description |
|--------|-------------|-----------------------------------------------------|
| | DictDefault | `DictDefault` mapping configuration keys to values. |
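
As a rough illustration of how keyword arguments can override values read from a config file, consider the sketch below. It is not `load_cfg` itself: the helper `apply_overrides` is hypothetical, the config is represented as an already-parsed plain `dict` rather than a `DictDefault`, and validation and setup are omitted. The real function may also restrict which keys can be overridden.

```python
from typing import Any, Dict


def apply_overrides(file_values: Dict[str, Any], **kwargs: Any) -> Dict[str, Any]:
    """Return a copy of `file_values` with keyword overrides applied.

    Hypothetical helper for illustration; not part of axolotl.
    """
    merged = dict(file_values)  # copy so the parsed file values stay untouched
    merged.update(kwargs)  # keyword arguments take precedence
    return merged


# Example values standing in for a parsed config YAML file.
file_values = {"learning_rate": 2e-4, "micro_batch_size": 2}
cfg = apply_overrides(file_values, micro_batch_size=4)
print(cfg)
```

Copying before updating keeps the values read from the file intact, so the same parsed config can be reused with different overrides.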

5
api/validate_config.qmd Normal file

@@ -0,0 +1,5 @@
# validate_config { #axolotl.validate_config }
```python
validate_config(cfg, capabilities=None, env_capabilities=None)
```