Update docs/.gitignore to exclude auto-generated API documentation files
This commit is contained in:
3
docs/.gitignore
vendored
3
docs/.gitignore
vendored
@@ -1,2 +1,5 @@
|
|||||||
/.quarto/
|
/.quarto/
|
||||||
_site/
|
_site/
|
||||||
|
/api/*.qmd
|
||||||
|
/api/*.html
|
||||||
|
site_libs/
|
||||||
|
|||||||
@@ -1,44 +0,0 @@
|
|||||||
# datasets { #axolotl.datasets }
|
|
||||||
|
|
||||||
`datasets`
|
|
||||||
|
|
||||||
Module containing Dataset functionality
|
|
||||||
|
|
||||||
## Classes
|
|
||||||
|
|
||||||
| Name | Description |
|
|
||||||
| --- | --- |
|
|
||||||
| [ConstantLengthDataset](#axolotl.datasets.ConstantLengthDataset) | Iterable dataset that returns constant length chunks of tokens from stream of text files. |
|
|
||||||
| [TokenizedPromptDataset](#axolotl.datasets.TokenizedPromptDataset) | Dataset that returns tokenized prompts from a stream of text files. |
|
|
||||||
|
|
||||||
### ConstantLengthDataset { #axolotl.datasets.ConstantLengthDataset }
|
|
||||||
|
|
||||||
```python
|
|
||||||
datasets.ConstantLengthDataset(self, tokenizer, datasets, seq_length=2048)
|
|
||||||
```
|
|
||||||
|
|
||||||
Iterable dataset that returns constant length chunks of tokens from stream of text files.
|
|
||||||
Args:
|
|
||||||
tokenizer (Tokenizer): The processor used for processing the data.
|
|
||||||
dataset (dataset.Dataset): Dataset with text files.
|
|
||||||
seq_length (int): Length of token sequences to return.
|
|
||||||
|
|
||||||
### TokenizedPromptDataset { #axolotl.datasets.TokenizedPromptDataset }
|
|
||||||
|
|
||||||
```python
|
|
||||||
datasets.TokenizedPromptDataset(
|
|
||||||
self,
|
|
||||||
prompt_tokenizer,
|
|
||||||
dataset,
|
|
||||||
process_count=None,
|
|
||||||
keep_in_memory=False,
|
|
||||||
**kwargs,
|
|
||||||
)
|
|
||||||
```
|
|
||||||
|
|
||||||
Dataset that returns tokenized prompts from a stream of text files.
|
|
||||||
Args:
|
|
||||||
prompt_tokenizer (PromptTokenizingStrategy): The prompt tokenizing method for processing the data.
|
|
||||||
dataset (dataset.Dataset): Dataset with text files.
|
|
||||||
process_count (int): Number of processes to use for tokenizing.
|
|
||||||
keep_in_memory (bool): Whether to keep the tokenized dataset in memory.
|
|
||||||
@@ -1,5 +0,0 @@
|
|||||||
# prompt_strategies.base { #axolotl.prompt_strategies.base }
|
|
||||||
|
|
||||||
`prompt_strategies.base`
|
|
||||||
|
|
||||||
module for base dataset transform strategies
|
|
||||||
@@ -1,80 +0,0 @@
|
|||||||
# prompt_strategies.chat_template { #axolotl.prompt_strategies.chat_template }
|
|
||||||
|
|
||||||
`prompt_strategies.chat_template`
|
|
||||||
|
|
||||||
HF Chat Templates prompt strategy
|
|
||||||
|
|
||||||
## Classes
|
|
||||||
|
|
||||||
| Name | Description |
|
|
||||||
| --- | --- |
|
|
||||||
| [ChatTemplatePrompter](#axolotl.prompt_strategies.chat_template.ChatTemplatePrompter) | Prompter for HF chat templates |
|
|
||||||
| [ChatTemplateStrategy](#axolotl.prompt_strategies.chat_template.ChatTemplateStrategy) | Tokenizing strategy for instruction-based prompts. |
|
|
||||||
| [StrategyLoader](#axolotl.prompt_strategies.chat_template.StrategyLoader) | Load chat template strategy based on configuration. |
|
|
||||||
|
|
||||||
### ChatTemplatePrompter { #axolotl.prompt_strategies.chat_template.ChatTemplatePrompter }
|
|
||||||
|
|
||||||
```python
|
|
||||||
prompt_strategies.chat_template.ChatTemplatePrompter(
|
|
||||||
self,
|
|
||||||
tokenizer,
|
|
||||||
chat_template,
|
|
||||||
processor=None,
|
|
||||||
max_length=2048,
|
|
||||||
message_property_mappings=None,
|
|
||||||
message_field_training=None,
|
|
||||||
message_field_training_detail=None,
|
|
||||||
field_messages='messages',
|
|
||||||
roles=None,
|
|
||||||
drop_system_message=False,
|
|
||||||
)
|
|
||||||
```
|
|
||||||
|
|
||||||
Prompter for HF chat templates
|
|
||||||
|
|
||||||
### ChatTemplateStrategy { #axolotl.prompt_strategies.chat_template.ChatTemplateStrategy }
|
|
||||||
|
|
||||||
```python
|
|
||||||
prompt_strategies.chat_template.ChatTemplateStrategy(
|
|
||||||
self,
|
|
||||||
prompter,
|
|
||||||
tokenizer,
|
|
||||||
train_on_inputs,
|
|
||||||
sequence_len,
|
|
||||||
roles_to_train=None,
|
|
||||||
train_on_eos=None,
|
|
||||||
)
|
|
||||||
```
|
|
||||||
|
|
||||||
Tokenizing strategy for instruction-based prompts.
|
|
||||||
|
|
||||||
#### Methods
|
|
||||||
|
|
||||||
| Name | Description |
|
|
||||||
| --- | --- |
|
|
||||||
| [find_turn](#axolotl.prompt_strategies.chat_template.ChatTemplateStrategy.find_turn) | Locate the starting and ending indices of the specified turn in a conversation. |
|
|
||||||
| [tokenize_prompt](#axolotl.prompt_strategies.chat_template.ChatTemplateStrategy.tokenize_prompt) | Public method that can handle either a single prompt or a batch of prompts. |
|
|
||||||
|
|
||||||
##### find_turn { #axolotl.prompt_strategies.chat_template.ChatTemplateStrategy.find_turn }
|
|
||||||
|
|
||||||
```python
|
|
||||||
prompt_strategies.chat_template.ChatTemplateStrategy.find_turn(turns, turn_idx)
|
|
||||||
```
|
|
||||||
|
|
||||||
Locate the starting and ending indices of the specified turn in a conversation.
|
|
||||||
|
|
||||||
##### tokenize_prompt { #axolotl.prompt_strategies.chat_template.ChatTemplateStrategy.tokenize_prompt }
|
|
||||||
|
|
||||||
```python
|
|
||||||
prompt_strategies.chat_template.ChatTemplateStrategy.tokenize_prompt(prompt)
|
|
||||||
```
|
|
||||||
|
|
||||||
Public method that can handle either a single prompt or a batch of prompts.
|
|
||||||
|
|
||||||
### StrategyLoader { #axolotl.prompt_strategies.chat_template.StrategyLoader }
|
|
||||||
|
|
||||||
```python
|
|
||||||
prompt_strategies.chat_template.StrategyLoader()
|
|
||||||
```
|
|
||||||
|
|
||||||
Load chat template strategy based on configuration.
|
|
||||||
@@ -1,38 +0,0 @@
|
|||||||
# utils.tokenization { #axolotl.utils.tokenization }
|
|
||||||
|
|
||||||
`utils.tokenization`
|
|
||||||
|
|
||||||
Module for tokenization utilities
|
|
||||||
|
|
||||||
## Functions
|
|
||||||
|
|
||||||
| Name | Description |
|
|
||||||
| --- | --- |
|
|
||||||
| [color_token_for_rl_debug](#axolotl.utils.tokenization.color_token_for_rl_debug) | Helper function to color tokens based on their type. |
|
|
||||||
| [process_tokens_for_rl_debug](#axolotl.utils.tokenization.process_tokens_for_rl_debug) | Helper function to process and color tokens. |
|
|
||||||
|
|
||||||
### color_token_for_rl_debug { #axolotl.utils.tokenization.color_token_for_rl_debug }
|
|
||||||
|
|
||||||
```python
|
|
||||||
utils.tokenization.color_token_for_rl_debug(
|
|
||||||
decoded_token,
|
|
||||||
encoded_token,
|
|
||||||
color,
|
|
||||||
text_only,
|
|
||||||
)
|
|
||||||
```
|
|
||||||
|
|
||||||
Helper function to color tokens based on their type.
|
|
||||||
|
|
||||||
### process_tokens_for_rl_debug { #axolotl.utils.tokenization.process_tokens_for_rl_debug }
|
|
||||||
|
|
||||||
```python
|
|
||||||
utils.tokenization.process_tokens_for_rl_debug(
|
|
||||||
tokens,
|
|
||||||
color,
|
|
||||||
tokenizer,
|
|
||||||
text_only,
|
|
||||||
)
|
|
||||||
```
|
|
||||||
|
|
||||||
Helper function to process and color tokens.
|
|
||||||
Reference in New Issue
Block a user