Commit Graph

  • c2a48c3a1e add logging Wing Lian 2024-10-15 12:10:35 -04:00
  • 415399b565 Update README.md Wing Lian 2024-10-16 10:56:11 -04:00
  • 67c04133f2 Update src/axolotl/integrations/liger/args.py Wing Lian 2024-10-16 10:54:27 -04:00
  • 4911d0952f skip duplicate code check Wing Lian 2024-10-15 12:04:43 -04:00
  • 1d7ab52161 update docs and example Wing Lian 2024-10-15 11:45:41 -04:00
  • fcdc6fee8b upgrade liger to 0.3.1 Wing Lian 2024-10-15 11:43:54 -04:00
  • d1bf20f990 test upgrade_liger-tr4.46.1 sunny 2024-11-01 17:15:32 -04:00
  • bb648cbc63 test sunny 2024-11-01 17:02:35 -04:00
  • 8b0bca4842 test sunny 2024-11-01 17:01:03 -04:00
  • d36baf44b1 test sunny 2024-11-01 17:00:35 -04:00
  • 16c8140d20 test sunny 2024-11-01 16:38:09 -04:00
  • 21c25cf7bc test sunny 2024-11-01 16:34:00 -04:00
  • 32288a5d3c test sunny 2024-11-01 16:23:01 -04:00
  • 052a9a79b4 only run the remainder of the gpu test suite if one case passes first (#2009) [skip ci] Wing Lian 2024-10-31 13:45:01 -04:00
  • 8302069d6b Built site for gh-pages Quarto GHA Workflow Runner 2024-10-31 17:28:47 +00:00
  • 3591bcfaf9 add torch 2.5.1 for base image (#2010) Wing Lian 2024-10-31 13:27:49 -04:00
  • d0863075fc Built site for gh-pages Quarto GHA Workflow Runner 2024-10-31 17:27:13 +00:00
  • dc1de7d81b add retries for load datasets requests failures (#2007) Wing Lian 2024-10-31 13:26:14 -04:00
  • bfe207ea3e Built site for gh-pages Quarto GHA Workflow Runner 2024-10-31 16:14:39 +00:00
  • d4dbfa02fe Add plugin manager's callback hooks to training flow (#2006) Chirag Jain 2024-10-31 21:43:46 +05:30
  • efa1209a92 add smoke test training soap-optim Wing Lian 2024-10-30 15:40:27 -04:00
  • 67b9e31bbc make sure to set alternate optimizer and set lr and eps from adam Wing Lian 2024-10-17 17:21:19 -04:00
  • ad60916323 add soap optimizer support Wing Lian 2024-10-17 17:02:46 -04:00
  • 2888b7ac47 Built site for gh-pages Quarto GHA Workflow Runner 2024-10-30 18:42:30 +00:00
  • 5c7e89105d Fix: modelloader handling of model_kwargs load_in*bit (#1999) NanoCode012 2024-10-31 01:41:34 +07:00
  • c726cba3c5 Built site for gh-pages Quarto GHA Workflow Runner 2024-10-30 18:27:59 +00:00
  • 74db2a1bae Fix get_chat_template call for trainer builder (#2003) Chirag Jain 2024-10-30 23:57:00 +05:30
  • bfb80a3ef9 stuff 1991test sunny 2024-10-30 13:44:06 -04:00
  • 5efcfd62d8 Built site for gh-pages Quarto GHA Workflow Runner 2024-10-30 16:31:05 +00:00
  • e62554c419 feat: add Exaone3 chat_template (#1995) Geun, Lim 2024-10-31 01:30:12 +09:00
  • 3cf45b8ea1 Built site for gh-pages Quarto GHA Workflow Runner 2024-10-30 16:28:00 +00:00
  • 32c60765ef remove skipped test (#2002) Wing Lian 2024-10-30 12:27:04 -04:00
  • 38773d661f fixing sunny 2024-10-30 11:04:50 -04:00
  • 271c2c2b82 fixed formatting sunny 2024-10-29 15:50:56 -04:00
  • 32b6f30947 fix attempt at issue 1991 sunny 2024-10-29 15:44:32 -04:00
  • fc1f275e6c yml change sunny 2024-10-29 15:27:42 -04:00
  • 46d2b4ce89 yml change sunny 2024-10-29 15:25:25 -04:00
  • 88c9a7aecc LOG for debug sunny 2024-10-29 13:35:55 -04:00
  • d9a93990d1 yml sunny 2024-10-29 10:40:32 -04:00
  • 8c3a727f9d feat: update yml chat_template to specify dataset field (#2001) [skip ci] NanoCode012 2024-10-29 21:26:03 +07:00
  • 107b67b852 Hardware requirements (#1997) [skip ci] Oliver Kunc 2024-10-29 15:13:50 +01:00
  • b376adbaa5 Built site for gh-pages Quarto GHA Workflow Runner 2024-10-29 03:15:47 +00:00
  • bfc77b0f36 Feat: Add support for tokenizer’s or custom jinja chat_template (#1970) NanoCode012 2024-10-29 10:14:51 +07:00
  • a740f2ad8c Built site for gh-pages Quarto GHA Workflow Runner 2024-10-28 21:03:00 +00:00
  • e1e0556c99 add option for resizing embeddings when adding new tokens (#2000) Wing Lian 2024-10-28 17:02:04 -04:00
  • 41a68b5ddf Built site for gh-pages Quarto GHA Workflow Runner 2024-10-28 11:33:43 +00:00
  • d3c45d27b5 fix zero3 (#1994) Wing Lian 2024-10-28 07:32:49 -04:00
  • 2dfe749d9b Built site for gh-pages Quarto GHA Workflow Runner 2024-10-25 15:29:18 +00:00
  • 2501c1a6a3 Fix: Gradient Accumulation issue (#1980) NanoCode012 2024-10-25 22:28:23 +07:00
  • 1208543a19 Built site for gh-pages Quarto GHA Workflow Runner 2024-10-25 13:07:47 +00:00
  • 1d6a5e2bd6 Refactor func load_model to class ModelLoader (#1909) Mengqing Cao 2024-10-25 21:06:56 +08:00
  • 7581eafc67 Built site for gh-pages Quarto GHA Workflow Runner 2024-10-22 17:55:21 +00:00
  • 718cfb2dd1 revert image tagged as main-latest (#1990) Wing Lian 2024-10-22 13:54:24 -04:00
  • 403da4ec18 Built site for gh-pages Quarto GHA Workflow Runner 2024-10-22 12:53:39 +00:00
  • 9bd5f7d015 Log checkpoints as mlflow artifacts (#1976) Adam Hazell 2024-10-22 13:52:21 +01:00
  • cc657dd0e5 Built site for gh-pages Quarto GHA Workflow Runner 2024-10-21 23:52:10 +00:00
  • 9aec22b10a Built site for gh-pages Quarto GHA Workflow Runner 2024-10-21 23:51:42 +00:00
  • 5c629ee444 use torch 2.4.1 images as latest now that torch 2.5.0 is out (#1987) Wing Lian 2024-10-21 19:51:06 -04:00
  • 955cca41fc don't explicitly set cpu pytorch version (#1986) Wing Lian 2024-10-21 19:50:50 -04:00
  • 7df3648071 Built site for gh-pages Quarto GHA Workflow Runner 2024-10-21 15:01:42 +00:00
  • e12a2130e9 first pass at pytorch 2.5.0 support (#1982) Wing Lian 2024-10-21 11:00:45 -04:00
  • 66a1e209e3 changed yml 1947fix sunny 2024-10-18 10:46:36 -04:00
  • 8f5d7d63af Built site for gh-pages Quarto GHA Workflow Runner 2024-10-18 07:38:11 +00:00
  • 67f744dc8c add pytorch 2.5.0 base images (#1979) Wing Lian 2024-10-18 03:36:51 -04:00
  • 1359a4db24 Built site for gh-pages Quarto GHA Workflow Runner 2024-10-17 19:16:25 +00:00
  • f62e23737b memoize dataset length for eval sample packing (#1974) Sunny Liu 2024-10-17 15:15:29 -04:00
  • da655994a4 Built site for gh-pages Quarto GHA Workflow Runner 2024-10-17 18:13:24 +00:00
  • 54673fd6ca also debug if other debug args are set (#1977) Wing Lian 2024-10-17 14:12:31 -04:00
  • dba033cb5b fixed yml for issue1947 testing sunny 2024-10-17 12:31:11 -04:00
  • 6a32e9a0df added yml for testing issue 1947 sunny 2024-10-17 11:26:32 -04:00
  • 4afb2656b3 added yml for testing issue 1947 sunny 2024-10-17 11:23:47 -04:00
  • 28e7e444ee fix: update bradleyterry to use new chat_template cj_tokenizer_default_prompt_template NanoCode012 2024-10-16 20:42:14 +07:00
  • 6d9a3c4d81 examples: Fix config llama3 (#1833) [skip ci] JohanWork 2024-10-14 22:00:48 +02:00
  • 207e7627f9 fix(doc): formatting NanoCode012 2024-10-15 00:41:50 +07:00
  • 7eb62ae5a9 fix: update dummy message to prevent potential overlap with real content NanoCode012 2024-10-14 23:50:35 +07:00
  • 95805cf850 chore: lint NanoCode012 2024-10-14 23:43:30 +07:00
  • 4aafb7e600 fix: imported name incorrectly updated on merge NanoCode012 2024-10-14 23:41:17 +07:00
  • 17bc4c8b36 fix: update test based on new defaults NanoCode012 2024-10-14 18:03:35 +07:00
  • d101cfc125 feat: handles chat_template requiring specific user/assistant order NanoCode012 2024-10-14 14:00:55 +07:00
  • e5cd55cff9 feat: add example using fallback NanoCode012 2024-10-14 12:22:22 +07:00
  • 24aa6b15a0 feat: handle sharegpt deprecation better in docs NanoCode012 2024-10-14 12:21:58 +07:00
  • 9dfc5fa8b8 fix: remove default setting on edge case where chat template overriden in dataset section NanoCode012 2024-10-14 11:48:40 +07:00
  • 0c3255288f Merge branch 'main' into cj_tokenizer_default_prompt_template NanoCode012 2024-10-14 10:36:08 +07:00
  • 8bb80c01c2 Built site for gh-pages Quarto GHA Workflow Runner 2024-10-14 00:05:38 +00:00
  • 335027f155 upgrade accelerate to 1.0.1 (#1969) Wing Lian 2024-10-13 20:04:30 -04:00
  • f4c9a14451 Built site for gh-pages Quarto GHA Workflow Runner 2024-10-13 21:35:34 +00:00
  • ec4272c3a0 add ds zero3 to multigpu biweekly tests (#1900) Wing Lian 2024-10-13 17:34:37 -04:00
  • fdbfad75b2 Built site for gh-pages Quarto GHA Workflow Runner 2024-10-13 19:12:02 +00:00
  • 68b1369de9 Reward model (#1879) Wing Lian 2024-10-13 15:11:13 -04:00
  • 8d12050b29 Built site for gh-pages Quarto GHA Workflow Runner 2024-10-13 16:16:25 +00:00
  • cd2d89f467 wip add new proposed message structure (#1904) Wing Lian 2024-10-13 12:15:18 -04:00
  • 82b5dc9328 Merge branch 'main' into cj_tokenizer_default_prompt_template Chirag Jain 2024-10-13 16:27:10 +05:30
  • 8467326c33 Built site for gh-pages Quarto GHA Workflow Runner 2024-10-13 01:42:36 +00:00
  • 1834cdc364 Add support for qwen 2.5 chat template (#1934) Vincent Haines 2024-10-12 21:41:43 -04:00
  • fa2ed629a6 Built site for gh-pages Quarto GHA Workflow Runner 2024-10-13 01:41:35 +00:00
  • ac128b7b1d fix: update eval causal lm metrics to add perplexity (#1951) [skip ci] NanoCode012 2024-10-13 08:41:13 +07:00
  • 31591bd94c Fixing Validation - Mistral Templates (#1962) pandora 2024-10-13 03:40:39 +02:00
  • d45eeab8b3 Built site for gh-pages Quarto GHA Workflow Runner 2024-10-13 00:54:55 +00:00
  • d20b48a61e only install torchao for torch versions >= 2.4.0 (#1963) Wing Lian 2024-10-12 20:53:48 -04:00
  • 34783869eb Built site for gh-pages Quarto GHA Workflow Runner 2024-10-12 22:20:41 +00:00