* replace references to the random 68m model with the 135m smollm2
* use AutoTokenizer for smollm2
* [ci] make e2e tests a bit faster by reducing test split size
* use a 10% split of the alpaca dataset to speed up dataset loading/tokenization
* reduce gas (gradient accumulation steps) from 4 to 2 for most e2e tests
* increase val set size for packing