axolotl

Author	SHA1	Message	Date
Wing Lian	4d92a68a96	use default torch fused adamw optimizer as default as adamw_hf is deprecated (#2425 ) * use default torch fused adamw optimizer as default as adamw_hf is deprecated * make sure to have latest packaging installed * bump packaging in requirements.txt too	2025-03-19 23:58:33 -04:00
NanoCode012	51cd409488	Feat: minor docs improvements for RLHF and faq on embeddings (#2401 ) [skip ci] * feat: add doc on shrink_embeddings and custom calling * chore: rename inference doc * fix: clarify same config is used for all cli * chore: rearrange order inference qmd * feat: add simpo to doc * fix: update defaults * feat: add rl configs to doc * fix: ensure beta consistent with trl.beta * fix: clarify about lora/fft * chore: rename title * chore: fix language * feat: move config reference higher * Update docs/getting-started.qmd Co-authored-by: salman <salman.mohammadi@outlook.com> * Update docs/rlhf.qmd Co-authored-by: salman <salman.mohammadi@outlook.com> --------- Co-authored-by: salman <salman.mohammadi@outlook.com>	2025-03-17 08:39:04 -04:00
Wing Lian	4f5eb42a73	remove reference to deprecated import (#2407 )	2025-03-15 08:49:41 -04:00
Wing Lian	fbe54be6b8	only validate hf user token on rank 0 (#2408 )	2025-03-13 23:29:06 -04:00
Wing Lian	f0072f3b9d	use max of 32 dataset processes if not explicit (#2403 ) * use max of 32 dataset processes if not explicit * change alternate min val for consistency	2025-03-11 12:02:58 -04:00
Wing Lian	59899b9817	pass additional info for fix untrained tokens when using distributed + offloading (#2388 ) * pass additional info for fix untrained tokens when using distributed + offloading * use latest version of vendored lib * use v0.0.5 of contribs lgpl * fix for no bad tokens and add tests * use release * add multigpu test too * make sure the multigpu zero3 test actually uses zero3	2025-03-11 12:02:43 -04:00
NanoCode012	4a736986fa	fix(modal): add git pull when getting branch files (#2399 )	2025-03-10 15:14:41 -04:00
NanoCode012	83f8698b8a	fix: create mount folder on modal if not exist (#2390 )	2025-03-10 16:27:42 +07:00
xzuyn	60a11a6410	Use Latest Cut Cross Entropy (#2392 ) * Update __init__.py * Update README.md * Update cutcrossentropy_install.py * add test	2025-03-10 16:26:40 +07:00
NanoCode012	16dc6ee68d	refactor: trl grpo configs to have descriptions (#2386 ) * refactor: trl grpo configs to have descriptions * chore: caps	2025-03-07 08:58:53 -05:00
Wing Lian	fa7c79b3b9	remove lion-pytorch as it's already handled upstream (#2389 )	2025-03-07 08:58:15 -05:00
Wing Lian	ae66374156	Optimizer refactor and add Muon support (#2367 ) * add muon optimizer optimizer_cls_and_kwargs is on trainer_kwargs only add adamw_kwargs if they're non-null fix mocks better handling of override and check the optimizer unwrap optimizer * fix import	2025-03-06 11:49:19 -05:00
Wing Lian	5e21b1a9da	various fixes 20250305 (#2384 ) * various validation fixes * fix check for non-truthy value	2025-03-06 11:48:44 -05:00
mhenrichsen	575e5f28ec	Update Tokenizer Overrides Handling in models.py (#1549 ) * override special tokens mock code * fix(doc): remove duplicate config * feat: replace added_tokens in tokenizer and add test * make sure to run tokenizer modification on rank 0 only * use is local main process instead * feat: rename config --------- Co-authored-by: NanoCode012 <nano@axolotl.ai> Co-authored-by: Wing Lian <wing@axolotl.ai>	2025-03-05 11:15:12 -05:00
xzuyn	0134093acc	Add REX LR Scheduler (#2380 ) * Update trainer_builder.py * Update base.py * Update __init__.py * Update base.py * Update base.py * Update config.qmd * Update base.py * Update base.py * Update base.py * Update base.py * Update base.py * Update base.py * Update base.py * lint * lint * lint * lint * lint * lint * Update base.py * Update base.py * lint * Update base.py * Update base.py * Move RexLR to `schedulers.py` * Remove RexLR from `base.py` * Fix tooltip formatting * lint * Create test_schedulers.py * Use a default optimizer in test * lint * lint * Add `warmup_steps` and `cosine_min_lr_ratio` to test * lint	2025-03-05 10:26:11 -05:00
NanoCode012	d4de93a7bb	feat(grpo): add reward_weights config and refactor (#2365 )	2025-03-05 10:02:08 -05:00
NanoCode012	d883b11b6f	fix(doc): add installation for cce to docs (#2375 ) [skip ci] * fix(doc): add installation for cce to docs * fix: format	2025-03-05 10:00:39 -05:00
Dan Saunders	f4910dd2ea	`train.py` refactor (#2371 ) * refactor train.py * updates * update * combine like functions * review comments	2025-03-05 08:58:33 -05:00
NanoCode012	2efe1b4c09	Feat(doc): Reorganize documentation, fix broken syntax, update notes (#2348 ) * feat(doc): organize docs, add to menu bar, fix broken formatting * feat: add link to custom integrations * feat: update readme for integrations to include citations and repo link * chore: update lm_eval info * chore: use fullname * Update docs/cli.qmd per suggestion Co-authored-by: Dan Saunders <danjsaund@gmail.com> * feat: add sweep doc * feat: add kd doc * fix: remove toc * fix: update deprecation * feat: add more info about chat_template issues * fix: heading level * fix: shell->bash code block * fix: ray link * fix(doc): heading level, header links, formatting * feat: add grpo docs * feat: add style changes * fix: wrong cli arg for lm-eval * fix: remove old run method * feat: load custom integration doc dynamically * fix: remove old cli way * fix: toc * fix: minor formatting --------- Co-authored-by: Dan Saunders <danjsaund@gmail.com>	2025-02-25 16:09:37 +07:00
NanoCode012	1110a37e21	feat: add deepseek_v3 sample packing (#2230 )	2025-02-24 15:03:15 -05:00
Matt Baker	00fc8109e4	Correctly reference mount paths (#2347 ) * Correctly reference mount paths * Also fix mount paths in lm_eval * chore: lint --------- Co-authored-by: Wing Lian <wing@axolotl.ai>	2025-02-24 11:12:57 -05:00
Wing Lian	2d5826f544	Relicense the logprob KD loss functions as Apache 2.0 (#2358 )	2025-02-23 12:31:35 -05:00
Wing Lian	a4170030ab	don't install extraneous old version of pydantic in ci and make sure to run multigpu ci (#2355 )	2025-02-21 22:06:29 -05:00
Wing Lian	1db6ad60a7	support for passing init_lora_weights to lora_config (#2352 )	2025-02-20 22:56:34 -05:00
salman	29b366b2e1	Bumping 0.15.1 TRL version for GRPO+PEFT fix (#2344 ) * bumping TRL version * apply upstream fixes to our custom fix --------- Co-authored-by: Wing Lian <wing@axolotl.ai>	2025-02-20 22:56:04 -05:00
NanoCode012	b53a41372f	feat: update transformers version to 4.49.0 (#2340 )	2025-02-20 21:12:06 -05:00
Wing Lian	02f45e94be	calculate sample length fixes and SFT splitting fixes (#2351 ) * fix chat template splitting long samples across multiple rows * make the preprocessing faster	2025-02-20 14:29:58 -05:00
Tobias	8dfadc2b3c	Fix sample packing producing longer sequences than specified by `sequence_len` (#2332 ) * Extend MultiPackBatchSampler test to include shorter sequence length and drop long sequences filter * Fix get_dataset_lengths for datasets that were previously filtered (e.g., with drop_long_seq_in_dataset) * Update src/axolotl/utils/samplers/utils.py Fix get_dataset_lengths for datasets that do not have position_ids or length attributes Co-authored-by: NanoCode012 <kevinvong@rocketmail.com> --------- Co-authored-by: NanoCode012 <kevinvong@rocketmail.com>	2025-02-19 12:02:35 +07:00
Wing Lian	7fa690fac8	bump dev version (#2342 )	2025-02-18 04:30:59 -05:00
Wing Lian	3c743c4bfb	v0.7.0 for release (#2341 )	2025-02-18 04:26:21 -05:00
NJordan72	91bb95685a	chore: cleanup deprecated config elements (#2309 ) * feat: update metadata fields and refactor config class in axolotlinputconfig - Replace `metadata` fields with `json_schema_extra` in RayConfig class. - Replace `Config` class with `ConfigDict` in AxolotlInputConfig. - Set `populate_by_name` to `True` directly in `ConfigDict` instance. * feat: update axolotlinputconfig in utils * Replace `conlist` with `Annotated` for `datasets`, `test_datasets`, and `pretraining_dataset` fields * Change default values for `lr_scheduler` and `optimizer` fields in `HyperparametersConfig` class * Remove unnecessary Union from `evals_per_epoch` field in `AxolotlInputConfig` class * Import `MinLen` from `annotated_types` module * Remove import of `conlist` from `pydantic` module * feat: update modelinputconfig and axolotlinputconfig in v0_4_1 - Removed ConfigDict import from pydantic in `src/axolotl/utils/config/models/input/v0_4_1/__init__.py` - Added `model_config` with `protected_namespaces` to ModelInputConfig - Replaced `config: ConfigDict` with `model_config` in AxolotlInputConfig - Set `populate_by_name` to True in `model_config` for AxolotlInputConfig * chore: get rid of unused import	2025-02-18 15:39:24 +07:00
NJordan72	b194e17c28	feat: add config for optional parameters in a chat message (#2260 ) * feat: add config for optional parameters in a chat message * chore: cleanup * chore: fix nits and add light docs * docs: update docs/dataset-formats/conversation.qmd Co-authored-by: NanoCode012 <kevinvong@rocketmail.com> * feat: configurable message mappings, jinja template analyzer * chore: handle bradley terry * docs: update docs * refactor: change order of mappings, improve message transform * refactor: make chat awware of property mappings * chore: remove .python-version * chore: revert change * chore: add dataset validation to tests where appropriate * chore: add dataset validation to tests where appropriate * chore: clean up handling of ds_cfg * chore: recursively serialize config * make sure to use the return value from validate_config * DefaultDict pickle/unpickle fix * fix super call for override * refactor: message fields * chore: empty commit * tests: validate config before using * chore: add config validation to all e2e tests * chore: add unneeded logging * chore: add missed config validation * chore: pass field_messages to prompter * test: fix borked test * chore: remove uninteded file * chore: add deprecation warning and update chat_datasets script * chore: lint * refactor: message fields * feat: update axolotlinputconfig and test_models - add configdict import in axolotl/utils/config/models/input/v0_4_1/__init__.py - remove unnecessary line breaks in sftdataset, dpodataset, ktodataset, stepwisesuperviseddataset classes - update model_dump method in axolotlinputconfig to exclude none values - correct typo in test_models.py comment * feat: simplify dpodataset and ktodataset classes in config models removed several optional fields from dpodataset and ktodataset classes in axolotl/utils/config/models/input/v0_4_1. this simplifies the configuration subsets for these datasets. 
* feat: improve readability and structure in dataset configuration models this commit enhances the readability and structure of the dataset configuration models in the `axolotl/utils/config/models/input/v0_4_1` module. it removes unused `configdict` import and adds line breaks to separate class definitions for better clarity. additionally, a minor documentation fix is included to ensure a newline at the end of the `stepwise_supervised.qmd` file. * feat: change log level from info to debug in chattemplatestrategy * feat(prompt_strategies): refactor chattemplateprompter and chattemplatestrategy - Make `chat_template` a required parameter in `ChatTemplatePrompter` constructor - Add default value for `message_property_mappings` in `ChatTemplatePrompter` constructor - Add `messages_array_name` property to `ChatTemplatePrompter` - Change `processor` type to Optional in `ChatTemplatePrompter` - Add TypeError check for `processor` in `ChatTemplatePrompter.build_prompt` - Remove `_messages` property from `ChatTemplateStrategy` - Make `prompter` a required parameter and add type hint in `ChatTemplateStrategy` constructor - Remove `messages` getter and setter from `ChatTemplateStrategy` - Use `prompter.messages_array_name` in `ChatTemplateStrategy.get_conversation_thread` - Remove condition to set `messages` field in `load` function * feat(tests/utils): ignore type check in load_model call in test_models.py * feat: improve type handling and test structure in chat templates - Add return type hint for `get_chat_template` function in `chat_templates.py` - Remove unnecessary assignment of `strategy.messages` in several test cases - Add `messages_array_name` parameter to various test configurations in `test_chat_templates.py` and `test_chat_templates_advanced.py` - Remove redundant `strategy.messages` assignment in `test_chat_templates_advanced.py` * feat(axolotl): enhance chat strategy with datasetconfig support This commit introduces support for DatasetConfig in the 
ChatTemplateStrategy. It also refines the strategy loader to handle different types of ds_cfg inputs and improves the clarity of the code by formatting and reordering. The key changes include: - Importing Union from typing and BaseModel from pydantic. - Adding DatasetConfig as an optional type for ds_cfg in StrategyLoader. - Adjusting the handling of ds_cfg in StrategyLoader to account for BaseModel instances. - Refactoring the prompter_params and strategy_params for better readability. - Changing the reference from prompt[self.messages] to prompt[self.prompter.messages_array_name] in the is_prompt_batched method. * feat: update message handling in btchattemplatestrategy * Replace `self.messages` with direct string references to "chosen_messages" and "rejected_messages" * Append system, user, and assistant content directly to "chosen_messages" and "rejected_messages" * Add a new attribute "messages_array_name" to the `load` function parameters * Remove the conditional attribute assignment for "field_messages" in the `load` function * feat: add config validation in test_kd.py - Import `validate_config` from `axolotl.utils.config` - Validate the configuration in `test_llama_kd` and another function in `TestKnowledgeDistillation` class * feat: enhance config validation and capabilities handling * Import `EnvCapabilities` and `GPUCapabilities` from `axolotl.utils.config.models.internals` * Update `validate_config` function to create `KTODataset` and `SFTDataset` instances using `dict(ds_cfg)` * Replace `capabilities` and `env_capabilities` with instances of `GPUCapabilities` and `EnvCapabilities` respectively in `AxolotlConfigWCapabilities` model dump * feat: update config validation in axolotl utils - Remove import of `EnvCapabilities` and `GPUCapabilities` from `axolotl.utils.config.models.internals` - Update `validate_config` function to use `capabilities` and `env_capabilities` directly instead of creating new instances of `GPUCapabilities` and `EnvCapabilities` * 
feat: refactor strategyloader in chat_template.py - Extracted the creation of strategy parameters into a separate function, `_get_strategy_params(cfg, dataset_config)` - Created a new function, `_get_strategy_cls()`, to obtain the strategy class - Replaced `ChatTemplateStrategy` with `strategy_cls` for strategy instantiation * trigger CI * chore: revert dataset config changes for kto/dpo * subject: refactor: rename 'messages_array_name' to 'field_messages' Body: - Renamed 'messages_array_name' to 'field_messages' in 'ChatTemplatePrompter' class and its usages in 'chat_template.py' - Updated 'load' function in 'bradley_terry/chat_template.py' to reflect the change - Adjusted 'get_chat_template_msg_variables' and 'get_message_vars' methods in 'jinja_template_analyzer.py' to use the new variable name - Modified 'StrategyLoader' in 'chat_template.py' to use 'field_messages' - Updated tests in 'test_chat_templates.py' and 'test_chat_templates_advanced.py' to use 'field_messages' instead of 'messages_array_name' * feat: refactor prompt strategies and update config models * Remove redundant 'return None' in `axolotl/prompt_strategies/__init__.py` * Simplify message handling in `axolotl/prompt_strategies/bradley_terry/chat_template.py` by using a single 'messages' list instead of separate 'chosen_messages' and 'rejected_messages' lists * Update default 'message_property_mappings' in `axolotl/prompt_strategies/bradley_terry/chat_template.py` * Add 'field_messages' field to `axolotl/utils/config/models/input/v0_4_1/__init__.py` configuration model * chore: remove unused input * chore: remove redundant type ignore * fix: remove old configs and update examples * fix: type check * fix: remove loading old config in ChatMessage * fix: update faq with potential new undefinederror * fix: add debug if property mapped is not found * chore: improve explanation for unmapped properties * fix: update docs with new config * chore: add note for deprecation config and del old config from 
dict --------- Co-authored-by: NanoCode012 <kevinvong@rocketmail.com> Co-authored-by: Wing Lian <wing@axolotl.ai> Co-authored-by: NanoCode012 <nano@axolotl.ai>	2025-02-18 09:59:27 +07:00
Dan Saunders	3aac3b1da9	Move sweeps code to another module (#2338 )	2025-02-17 15:46:04 -05:00
Dan Saunders	3d8425fa91	Activation function Triton kernels, LoRA custom autograd functions (#2324 ) * LoRA + activation fn Triton kernels: initial commit * implementing optims * finalizing MLP LoRA kernels and progress on QKV / W kernels * updates * O projection optim * adding monkey patching logic * doc strings, typing, pre-commit fixes * updates * adding lora 8b kernels example * working on fsdp support * tests and fixes * small fixes, getting tests to pass, adding doc strings * integration tests for LoRA patching * config.qmd * remove unneeded pytest fixture * fix * review comments first pass * improving tests, attention class agnostic patching * adding support for more archs * wip SiLU / GELU impls * improved testing, small updates, etc. * slightly updating docs * rebase * fixing test_attention_patching_integration * additional review comments, fixing test in CI (hopefully) * isolating problematic patching test * relaxing allclose threshold to reduce flakiness * fixing accidental change * adding model arch agnostic attention class fetching * removing unused activations	2025-02-17 14:23:15 -05:00
Seungduk Kim	97a2fa2781	Select input_ids explicitly after pandas conversion (#2335 ) Without selecting the column, applying `len` counts the whole row as 1, which yields the total number of samples instead of the token count.	2025-02-17 00:07:27 -05:00
Wing Lian	a98526ef78	add support for include_tokens_per_second in training args (#2269 ) * add support for include_tokens_per_second in training args * Update docs/config.qmd Co-authored-by: NanoCode012 <nano@axolotl.ai> * Update src/axolotl/core/trainer_builder.py Co-authored-by: NanoCode012 <nano@axolotl.ai> --------- Co-authored-by: NanoCode012 <nano@axolotl.ai>	2025-02-13 17:39:19 -05:00
NanoCode012	2e57391bf8	fix: add missing shards_idx, preprocess_shards to docs and validator (#2331 )	2025-02-13 17:28:21 -05:00
minpeter	aa45fed451	Add `bos_token` and `add_generation_prompt` to the alpaca chat template (#2322 ) * fix alpaca add_generation_prompt * Alpaca template considering multi-turn Co-authored-by: xzuyn <xzuyn@users.noreply.github.com> --------- Co-authored-by: xzuyn <xzuyn@users.noreply.github.com>	2025-02-13 17:27:55 -05:00
Wing Lian	ffae8d6a95	GRPO (#2307 )	2025-02-13 16:01:01 -05:00
Lee Park	fdbb1a207c	[Fixing #2149 ] load_from_disk for RL-type training (#2193 ) * Update rl.py * Update rl.py * Update rl.py * refactor pref dataset loading to reuse load_dataset_w_config * refactor again after rebase from main * chore: add docstring and types --------- Co-authored-by: Wing Lian <wing@axolotl.ai> Co-authored-by: NanoCode012 <nano@axolotl.ai>	2025-02-13 08:31:07 -05:00
NanoCode012	526e5ee8b8	fix(config): missing config not being documented and fix model_ override (#2317 ) * fix(config): missing config not being documented and fix model_ space override * fix: delete redundant field	2025-02-08 06:01:48 -05:00
Wing Lian	1faf1a5c5a	batch add of spectrum snr results (#2320 )	2025-02-07 21:33:14 -05:00
NanoCode012	a620d481e2	fix: drop long seq even if not sample packing (#2211 ) * fix: drop long seq even if not sample packing * fix: logging import * fix: cfg passed being none * fix: try to fix logging * fix: refactor call to not use accelerate log * fix: try to fix circular import issue * fix: don't drop when skip prepare * chore: remove duplicate line * fix: update warning to mention that sequences will be trimmed * fix: do not drop seq if input_ids don't exist * fix: increase RM unittest sequence length to reduce trim warnings * fix: solve conflicts * fix: default min_seq_len in case of None	2025-02-04 09:43:35 -05:00
Wing Lian	158330ab60	[feature] sweeps (#2171 )	2025-02-01 21:11:18 -05:00
Wing Lian	80e1468b8d	better handling of multipack dataset length (#2296 )	2025-02-01 21:10:34 -05:00
Wing Lian	78ce268848	KD Trainer w logprobs (#2303 ) * refactor trainer to prevent circular dependencies later fix loader default KD dataset loading and KD with logprobs filter bad rows make batch smaller handle padding/collation for KD datasets make it work flipped the slice cross entropy loss coefficient during KD make sure to multiply against the correct loss chore: lint triton wip no where support v2 trial no torch.exp inside triton kernel no log etc no torch.tensor v3 fix kwarg don't use triton for now better rescaling for temperatures hash for temperature too use kd_alpha in the correct loss method fix kd loss so it's causal (fixes repeating tokens) var naming and add todo chore: lint refactor so we can easily add new loss functions add license block remove references to triton kd for now handle token/logprob shifting support for custom trainer classes from plugins refactor kd chat template loader move more things to kd plugin remove moved class from import make plugin setup concise increase logging around loading plugins add copyrights remove duplicate code more info on preprocess for kd and fix import be a bit pickier about loading dynamic prompt strategies kd sample packing make loss torch script compat support streaming for processing sft datasts? 
improve iterable support ensure that batch vs single is done properly tweak check for batched prompt data reward can use same batch check fix reward trainer calls for tokenization improve check for batched reward model doesn't work well with batched add kd trainer e2e test linting rename test files so it gets picked up make the kd e2e fit in vram for ci and add lora version set lora_dropout explicitly lower lr make sure to set tokenizer from l3 70b and save safetensors make sure to use the correct tokenizer fix adapter model check make sure to use tensorboard to capture loss for checks chore: lint chore: lint improve logprob masking and shift in trainer more fixes try tests for kd on l40s don't shift student logits for kd no batching for kd chat templates make sure to truncate logprobs if there are more than top_k change up logic so we always truncate to top_k use iter instead of tuple fix finding the top-k rather than assuming first position has the correct val apply z-score scaling to kd kd loss needs to be calculated in full precision Always re-normalize teacher distribution various fixes * support for configurable top-k/softmax ordering * add attribute check for filter rows and lint * fix logic * handle none case for conversion to int * fix student logit off by one * set kd_temp to 1.0 for test loss * address PR feedback	2025-01-31 20:18:52 -05:00
NanoCode012	d425d5d3c3	fix: add warning for invalid eval_steps or save_steps (#2298 )	2025-01-31 08:58:25 -05:00
Wing Lian	cf17649ef3	Misc fixes 20250130 (#2301 ) * misc fixes for garbage collection and L40S w NCCL P2P * patch bnb fix for triton check * chore: lint * change up import * try patching differently * remove patch for bnb fix for now * more verbose checks and tweak train loss threshold	2025-01-31 08:58:04 -05:00
Wing Lian	6f713226dd	make save_safetensors: true the default (#2292 ) * make save_safetensors: true the default * revert change to model output check	2025-01-30 11:48:48 -05:00
Wing Lian	8779997ba5	native support for modal cloud from CLI (#2237 ) * native support for modal cloud from CLI * do lm_eval in cloud too * Fix the sub call to lm-eval * lm_eval option to not post eval, and append not extend * cache bust when using branch, grab sha of latest image tag, update lm-eval dep * allow minimal yaml for lm eval * include modal in requirements * update link in README to include utm * pr feedback * use chat template * revision support * apply chat template as arg * add wandb name support, allow explicit a100-40gb * cloud is optional * handle accidental setting of tasks with a single task str * document the modal cloud yaml for clarity [skip ci] * cli docs * support spawn vs remote for lm-eval * Add support for additional docker commands in modal image build * cloud config shouldn't be a dir * Update README.md Co-authored-by: Charles Frye <cfrye59@gmail.com> * fix annotation args --------- Co-authored-by: Charles Frye <cfrye59@gmail.com>	2025-01-30 11:34:02 -05:00

994 Commits