axolotl/docs at 9f68918f130417c8f721da22fa49e1769d686fd0 - axolotl - Gitea

tocmo0nlord/axolotl

mhenrhcsen 9f68918f13 Implement configurable handling of excess tokens in datasets

- Added `excess_token_handling` option to the configuration, allowing users to choose between "drop" and "truncate" for handling tokens exceeding the maximum sequence length.
- Introduced `truncate_or_drop_long_seq` function to manage both single and batched samples based on the selected handling method.
- Updated relevant dataset processing functions to utilize the new handling option, ensuring backward compatibility with existing "drop" behavior.
- Enhanced logging to reflect truncation actions in dataset processing.

This change improves flexibility in managing sequence lengths during training and evaluation.
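
As a rough illustration of the two modes described above, the following sketch applies "drop" or "truncate" to a list of tokenized samples. It is not the repository's `truncate_or_drop_long_seq`. The function name `handle_excess_tokens`, its signature, and the sample layout (dicts of per-token lists such as `input_ids` and `attention_mask`) are assumptions made for this example.

```python
from typing import Dict, List

# Assumed sample layout: every field is a per-token list of equal length.
Sample = Dict[str, List[int]]


def handle_excess_tokens(
    samples: List[Sample], sequence_len: int, handling: str = "drop"
) -> List[Sample]:
    """Hypothetical helper: drop or truncate samples longer than sequence_len."""
    if handling not in ("drop", "truncate"):
        raise ValueError(f"unknown handling mode: {handling!r}")

    result: List[Sample] = []
    for sample in samples:
        if len(sample["input_ids"]) <= sequence_len:
            # Short enough: keep unchanged in both modes.
            result.append(sample)
        elif handling == "truncate":
            # Cut every per-token field to the same length so they stay aligned.
            result.append({key: value[:sequence_len] for key, value in sample.items()})
        # handling == "drop": the over-length sample is skipped.
    return result


samples = [
    {"input_ids": [1, 2, 3], "attention_mask": [1, 1, 1]},
    {"input_ids": [1, 2, 3, 4, 5, 6], "attention_mask": [1, 1, 1, 1, 1, 1]},
]
dropped = handle_excess_tokens(samples, sequence_len=4, handling="drop")
truncated = handle_excess_tokens(samples, sequence_len=4, handling="truncate")
```

With "drop" as the default, callers that never set the option would see the pre-existing behavior, which matches the backward-compatibility note in the commit message.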

2025-05-12 14:08:43 +02:00

dataset-formats

feat(doc): add split_thinking docs (#2613) [skip ci]

2025-05-06 20:05:32 -04:00

Ray Train Axolotl Integration (#2251)

2025-01-29 00:10:19 -05:00

.gitignore

Autodoc generation with quartodoc (#2419)

2025-03-21 12:26:47 -04:00

amd_hpc.qmd

Feat(doc): Reorganize documentation, fix broken syntax, update notes (#2348)

2025-02-25 16:09:37 +07:00

batch_vs_grad.qmd

Feat: update doc (#1475) [skip ci]

2024-04-04 13:43:40 +09:00

cli.qmd

Fix(doc): add delinearize instruction (#2545)

2025-04-24 01:03:43 -04:00

config.qmd

Implement configurable handling of excess tokens in datasets

2025-05-12 14:08:43 +02:00

custom_integrations.qmd

Add: Sparse Finetuning Integration with llmcompressor (#2479)

2025-05-01 12:25:16 -04:00

dataset_loading.qmd

Feat: Add doc on loading datasets and support for Azure/OCI (#2482)

2025-04-07 12:41:13 -04:00

dataset_preprocessing.qmd

Autodoc generation with quartodoc (#2419)

2025-03-21 12:26:47 -04:00

debugging.qmd

Feat(doc): Reorganize documentation, fix broken syntax, update notes (#2348)

2025-02-25 16:09:37 +07:00

docker.qmd

chore(doc): update docker tags on doc (#2559) [skip ci]

2025-04-25 17:14:48 -04:00

faq.qmd

feat: add eos_tokens and train_on_eot for chat_template EOT parsing (#2364)

2025-04-28 10:11:20 -04:00

fsdp_qlora.qmd

github urls (#1734)

2024-07-11 09:19:29 -04:00

getting-started.qmd

Feat: minor docs improvements for RLHF and faq on embeddings (#2401) [skip ci]

2025-03-17 08:39:04 -04:00

inference.qmd

Feat: minor docs improvements for RLHF and faq on embeddings (#2401) [skip ci]

2025-03-17 08:39:04 -04:00

input_output.qmd

Feat(doc): Reorganize documentation, fix broken syntax, update notes (#2348)

2025-02-25 16:09:37 +07:00

installation.qmd

Fix(doc): add delinearize instruction (#2545)

2025-04-24 01:03:43 -04:00

lora_optims.qmd

feat: add support for multimodal in lora kernels (#2472) [skip ci]

2025-04-02 09:33:46 -04:00

lr_groups.qmd

support for custom lr groups for non-embedding modules (#2213)

2025-01-24 12:56:28 -05:00

mac.qmd

Feat(doc): Reorganize documentation, fix broken syntax, update notes (#2348)

2025-02-25 16:09:37 +07:00

multi-gpu.qmd

feat(doc): explain deepspeed configs (#2514) [skip ci]

2025-04-11 09:52:43 -04:00

multi-node.qmd

Feat(doc): Reorganize documentation, fix broken syntax, update notes (#2348)

2025-02-25 16:09:37 +07:00

multimodal.qmd

fix(doc): key used to point to url in multimodal doc (#2575) [skip ci]

2025-04-29 15:10:59 -04:00

multipack.qmd

Bootstrap Hosted Axolotl Docs w/Quarto (#1429)

2024-03-21 22:28:36 -07:00

nccl.qmd

Feat(doc): Reorganize documentation, fix broken syntax, update notes (#2348)

2025-02-25 16:09:37 +07:00

ray-integration.qmd

Feat(doc): Reorganize documentation, fix broken syntax, update notes (#2348)

2025-02-25 16:09:37 +07:00

reward_modelling.qmd

chore(docs): add cookbook/blog link to docs (#2410) [skip ci]

2025-03-17 08:38:19 -04:00

rlhf.qmd

fix(doc): clarify vllm usage with grpo (#2573) [skip ci]

2025-04-28 10:07:45 -04:00

sequence_parallelism.qmd

batch api HF adapter for ring-flash-attn; cleanup and improvements (#2520)

2025-04-16 13:50:48 -04:00

torchao.qmd

Feat(doc): Reorganize documentation, fix broken syntax, update notes (#2348)

2025-02-25 16:09:37 +07:00

unsloth.qmd

Feat(doc): Reorganize documentation, fix broken syntax, update notes (#2348)

2025-02-25 16:09:37 +07:00