Wing Lian
0c6f928601
address PR feedback
2023-06-10 14:23:56 -04:00
Wing Lian
eea2731a5e
add streaming dataset support for pretraining datasets
2023-06-10 14:23:56 -04:00
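For context on the streaming-dataset entry above: the Hugging Face `datasets` library can iterate a pretraining corpus lazily instead of materializing it. A minimal sketch, assuming an illustrative corpus name rather than whatever this commit wires up:

```python
from datasets import load_dataset

# streaming=True returns an IterableDataset that yields examples lazily,
# so a large pretraining corpus never has to fit on disk or in RAM.
dataset = load_dataset("c4", "en", split="train", streaming=True)

for example in dataset.take(3):  # take() is supported on streaming datasets
    print(example["text"][:80])
```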
Wing Lian
1db46a9c72
linting fix
2023-06-10 14:23:56 -04:00
Wing Lian
ab5cd28acf
more gpt-neox long ctx fixes
2023-06-10 14:23:55 -04:00
Wing Lian
1a82082e91
fix bettertransformers save, force it to skip after saving correctly in callback
2023-06-10 14:23:55 -04:00
Wing Lian
1210dc8fd5
more tweaks to do pre-training with bettertransformers
2023-06-10 14:23:55 -04:00
Wing Lian
488a67d75a
experimental expansion of ctx len
2023-06-10 14:23:53 -04:00
Wing Lian
71a43f8479
add validation/warning for bettertransformers and torch version
2023-06-10 14:22:31 -04:00
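A hedged sketch of what the torch-version check could look like; the exact minimum version and message are assumptions, not taken from the commit:

```python
import warnings

import torch
from packaging import version

# optimum's BetterTransformer fast path for decoder models builds on
# torch.nn.functional.scaled_dot_product_attention, added in PyTorch 2.0.
if version.parse(torch.__version__) < version.parse("2.0"):
    warnings.warn("BetterTransformers requires PyTorch >= 2.0")
```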
Wing Lian
39619028a3
use pythia-12b, neox-20b is flaky
2023-06-10 14:22:30 -04:00
Wing Lian
8792199799
add flash attn context for efficient training and attempt setting model to train mode
2023-06-10 14:22:30 -04:00
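One plausible reading of the "flash attn context" above is PyTorch 2.0's SDP-kernel context manager; a sketch under that assumption, with `model` and `batch` assumed to already exist:

```python
import torch

model.train()  # ensure dropout and friends behave as in training

# Restrict scaled-dot-product attention to the flash kernel inside the
# block (PyTorch 2.0 API; later releases moved this to
# torch.nn.attention.sdpa_kernel).
with torch.backends.cuda.sdp_kernel(
    enable_flash=True, enable_math=False, enable_mem_efficient=False
):
    loss = model(**batch).loss
    loss.backward()
```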
Wing Lian
1edc30c786
add support for optimum bettertransformers
2023-06-10 14:22:30 -04:00
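For reference, the optimum API being wrapped here converts an already-loaded transformers model; a minimal sketch (the pythia-12b checkpoint echoes the earlier entry, but any supported model works):

```python
from optimum.bettertransformer import BetterTransformer
from transformers import AutoModelForCausalLM

model = AutoModelForCausalLM.from_pretrained("EleutherAI/pythia-12b")
# Swap supported modules for their BetterTransformer fast-path versions.
model = BetterTransformer.transform(model)
```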
Wing Lian
41e4f6ca31
Merge pull request #181 from OpenAccess-AI-Collective/xpos-rope
add support to extend context with xpos rope
2023-06-10 14:04:03 -04:00
Wing Lian
215d775147
Merge pull request #180 from Glavin001/feat/stream-inference
Add streaming inference & fix stopping at EOS
2023-06-10 12:04:34 -04:00
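Streaming inference in transformers is typically built on `TextIteratorStreamer`; a self-contained sketch of that common pattern, not necessarily how PR #180 implements it:

```python
from threading import Thread

from transformers import AutoModelForCausalLM, AutoTokenizer, TextIteratorStreamer

tokenizer = AutoTokenizer.from_pretrained("gpt2")
model = AutoModelForCausalLM.from_pretrained("gpt2")

inputs = tokenizer("Hello", return_tensors="pt")
streamer = TextIteratorStreamer(tokenizer, skip_prompt=True)

# generate() runs in a thread; the streamer yields text as it is produced,
# and eos_token_id makes generation stop cleanly at the EOS token.
thread = Thread(
    target=model.generate,
    kwargs=dict(
        **inputs,
        streamer=streamer,
        max_new_tokens=64,
        eos_token_id=tokenizer.eos_token_id,
    ),
)
thread.start()
for text in streamer:
    print(text, end="", flush=True)
thread.join()
```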
Wing Lian
f36e227eaf
formatting for linter
2023-06-10 12:00:52 -04:00
Wing Lian
5878bb1f3a
add option to readme
2023-06-10 11:57:41 -04:00
Wing Lian
a03a7d7d8b
add support to extend context with xpos rope
2023-06-10 10:29:46 -04:00
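Background on xpos: it is rotary position embedding plus a per-dimension exponential scale, so attention decays smoothly with distance and extrapolates to longer contexts. A minimal sketch of the scale term, following the xPos paper's constants (gamma=0.4, scale_base=512); head_dim here is illustrative:

```python
import torch

def xpos_scale(positions, head_dim, scale_base=512, gamma=0.4):
    i = torch.arange(0, head_dim, 2) / head_dim   # fraction per rotary pair
    zeta = (i + gamma) / (1 + gamma)              # per-dimension base in (0, 1]
    power = positions[:, None] / scale_base       # grows with absolute position
    return zeta[None, :] ** power                 # shape: (seq_len, head_dim // 2)

scales = xpos_scale(torch.arange(2048), head_dim=64)
# After the usual RoPE rotation, queries are multiplied by `scales` and
# keys by `1 / scales`, so their product depends only on relative distance.
```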
Glavin Wiechert
fec6bcc3e6
Add streaming inference & fix stopping at EOS
2023-06-10 08:14:47 +00:00
Wing Lian
931e606459
Merge pull request #179 from OpenAccess-AI-Collective/fix-max_seq_len
fix for max sequence len across different model types
2023-06-09 20:52:03 -04:00
Wing Lian
7f09106437
fix for max sequence len across different model types
2023-06-09 20:42:33 -04:00
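The fix above deals with architectures exposing their context length under different config attribute names; a hedged sketch of the normalization (attribute list illustrative, not exhaustive):

```python
def get_max_seq_len(config, default=2048):
    # e.g. LLaMA uses max_position_embeddings, GPT-2 uses n_positions,
    # MPT uses max_seq_len.
    for attr in ("max_position_embeddings", "n_positions", "max_seq_len"):
        value = getattr(config, attr, None)
        if value is not None:
            return value
    return default
```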
NanoCode012
6b50200234
Merge pull request #178 from PocketDocLabs/main
Update README.md to reflect current gradient checkpointing support
2023-06-10 08:26:48 +09:00
PocketDocLabs
16f9e28048
Update README.md to reflect current gradient checkpointing support
Previously the README stated that gradient checkpointing was incompatible with 4-bit LoRA in the current implementation; this is no longer the case. I have replaced the warning with a link to the Hugging Face documentation on gradient checkpointing.
2023-06-09 16:10:58 -07:00
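For readers following the README change: in transformers, gradient checkpointing is a one-line toggle on the model:

```python
from transformers import AutoModelForCausalLM

model = AutoModelForCausalLM.from_pretrained("gpt2")
# Trade compute for memory: recompute activations during the backward
# pass instead of keeping them all resident.
model.gradient_checkpointing_enable()
```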
NanoCode012
b9083a7fc1
Merge pull request #176 from NanoCode012/fix/peft-import
Fix backward compat for peft
2023-06-10 07:56:35 +09:00
NanoCode012
aefb2fc681
Fix backward compat for peft
2023-06-10 07:46:36 +09:00
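The compat fix here (and the prepare_model_for_int8_training deprecation addressed in #143 below) is usually a guarded import; a sketch of that common idiom, not necessarily this PR's exact code:

```python
# peft renamed prepare_model_for_int8_training to
# prepare_model_for_kbit_training; alias the old name on older peft
# versions so both keep working.
try:
    from peft import prepare_model_for_kbit_training
except ImportError:
    from peft import (
        prepare_model_for_int8_training as prepare_model_for_kbit_training,
    )
```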
NanoCode012
b5aa8d854c
Merge pull request #169 from NanoCode012/feat/landmark
Feat: Add landmark attention
2023-06-10 07:26:06 +09:00
NanoCode012
4d6490bce2
Merge pull request #171 from OpenAccess-AI-Collective/NanoCode012-falcon-lora-matrix
Fix falcon support lora
2023-06-09 17:58:22 +09:00
NanoCode012
b242b69e10
Fix falcon support lora
2023-06-09 17:50:16 +09:00
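Falcon packs Q, K and V into one fused projection, so LoRA must target that module name; a sketch of the kind of config this fix enables (values illustrative):

```python
from peft import LoraConfig

lora_config = LoraConfig(
    r=8,
    lora_alpha=16,
    # falcon's attention uses a single fused projection named
    # query_key_value rather than separate q/k/v modules.
    target_modules=["query_key_value"],
    lora_dropout=0.05,
    task_type="CAUSAL_LM",
)
```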
NanoCode012
320beb20f4
Merge pull request #170 from OpenAccess-AI-Collective/NanoCode012-lambdalabs-fix
Feat: Improve lambda labs instruction
2023-06-09 16:52:27 +09:00
NanoCode012
2e13ceff37
Improve lambda labs instruction
2023-06-09 15:03:08 +09:00
NanoCode012
2a801b001a
Fix grad checkpoint and outputs param
2023-06-09 14:28:44 +09:00
NanoCode012
e44c9e0b3e
Fix patching via import instead of hijacking
2023-06-09 14:27:24 +09:00
NanoCode012
55b8542de8
Feat: Add landmark attention
2023-06-09 12:54:08 +09:00
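Landmark attention (Mohtashami & Jaggi, 2023) appends a special landmark token after every block of tokens and routes block-level attention through it. The data-prep half is easy to sketch; the attention gating, where the real complexity lives, is omitted. Token id and block size are illustrative:

```python
def insert_landmarks(input_ids, landmark_id, block_size=50):
    """Append a landmark token after every block_size tokens."""
    out = []
    for i, token in enumerate(input_ids, start=1):
        out.append(token)
        if i % block_size == 0:
            out.append(landmark_id)  # one landmark summarizes each block
    return out
```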
Wing Lian
febe902517
Merge pull request #168 from bratao/main
Disable Wandb if no wandb project is specified
2023-06-08 22:05:56 -04:00
Bruno Cabral
f4df266842
Disable Wandb
2023-06-08 21:02:02 -03:00
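A sketch of the guard described by the merge above, with a hypothetical `cfg` object; the real change might equally set `report_to` instead:

```python
import os

# Without a configured W&B project, disable wandb logging entirely so
# runs don't prompt for a login or create stray projects.
if not cfg.wandb_project:
    os.environ["WANDB_DISABLED"] = "true"
```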
NanoCode012
281dc3df59
Merge pull request #167 from NanoCode012/fix/redundant-save-eval-steps
Fix: Refactor out unmodified save_steps and eval_steps
2023-06-09 01:39:33 +09:00
NanoCode012
2ef4634d45
Refactor out unmodified save_steps and eval_steps
2023-06-09 01:23:13 +09:00
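The refactor above likely amounts to only forwarding these options when the user set them, so TrainingArguments keeps its own defaults otherwise; a sketch with a hypothetical `cfg` object:

```python
from transformers import TrainingArguments

kwargs = {}
if cfg.save_steps is not None:
    kwargs["save_steps"] = cfg.save_steps  # otherwise keep HF's default
if cfg.eval_steps is not None:
    kwargs["eval_steps"] = cfg.eval_steps
args = TrainingArguments(output_dir="./out", **kwargs)
```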
NanoCode012
7eae90333e
Merge pull request #166 from NanoCode012/fix/seed
Fix: Set to use cfg.seed or 42 for seed
2023-06-09 01:15:08 +09:00
NanoCode012
c8242de725
Merge pull request #132 from utensil/falcon-7b-qlora
Axolotl supports falcon + qlora
2023-06-09 01:14:03 +09:00
NanoCode012
2cfe9e9b16
Set to use cfg.seed or 42 for backward compat
2023-06-09 01:02:36 +09:00
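The seed change reads as a one-liner; a sketch with a hypothetical `cfg` object, mirroring the commit title literally:

```python
from transformers import set_seed

# cfg.seed or 42: falls back to 42 when seed is unset (note this also
# maps an explicit seed of 0 to 42, a quirk of the `or` idiom).
set_seed(cfg.seed or 42)
```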
Utensil
79a8f52181
Trim trailing whitespace
2023-06-08 23:48:57 +08:00
NanoCode012
afaa0d2c01
Merge pull request #164 from NanoCode012/fix/falcon-fsdp-validate
Fix: Validate falcon with fsdp
2023-06-09 00:44:12 +09:00
NanoCode012
bfd27ba55e
Fix failing test
2023-06-09 00:35:03 +09:00
NanoCode012
babf0fdb71
Validate falcon with fsdp
2023-06-09 00:29:04 +09:00
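A guess at the shape of the validation: fail fast when falcon and FSDP are combined rather than crashing mid-run. Config field names are assumptions:

```python
def validate_config(cfg):
    if cfg.fsdp and "falcon" in (cfg.base_model or "").lower():
        raise ValueError("FSDP is not supported with falcon models")
```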
Utensil
a52f4816b0
Default wandb_project to empty as suggested
Co-authored-by: NanoCode012 <kevinvong@rocketmail.com>
2023-06-08 23:04:19 +08:00
NanoCode012
81911d112c
Merge pull request #163 from NanoCode012/feat/matmul-tf32
Feat: Set matmul tf32=True when tf32 passed
2023-06-09 00:01:31 +09:00
NanoCode012
52765ac588
Set matmul tf32
2023-06-08 23:41:12 +09:00
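The tf32 feature maps onto two PyTorch globals; a sketch with a hypothetical `cfg` flag named after the merge title:

```python
import torch

if cfg.tf32:
    # Allow TensorFloat-32 on Ampere+ GPUs: big matmul speedups at
    # slightly reduced precision.
    torch.backends.cuda.matmul.allow_tf32 = True
    torch.backends.cudnn.allow_tf32 = True
```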
NanoCode012
73e9ea4069
Merge pull request #143 from NanoCode012/fix/deprecate-prepare-8bit-training
Fix future deprecate prepare_model_for_int8_training
2023-06-08 23:07:53 +09:00
NanoCode012
f8d379883d
Merge pull request #162 from NanoCode012/fix/custom-prompt-readme
Fix: Move custom prompts out of hidden
2023-06-08 23:05:17 +09:00
NanoCode012
04a1b77307
Merge pull request #161 from NanoCode012/fix/peft-setup
Fix: Update peft and gptq instruction
2023-06-08 23:01:53 +09:00
NanoCode012
2097a09d2d
Move custom prompts out of hidden
2023-06-08 22:53:56 +09:00
NanoCode012
cfff94b123
Add peft install for quickstart
2023-06-08 22:50:20 +09:00