* bump transformers and update attention class map name
* also run the tests in docker
* add mixtral e2e smoke test
* fix base name for docker image in test
* mixtral lora doesn't seem to work, so at least test qlora
* add test case for mixtral with sample packing
* check monkeypatch for flash attn multipack
* also run the e2e tests in docker
* use all gpus to run tests in docker ci
* also use privileged mode for docker with gpus
* rename the docker e2e actions for github ci
* set privileged mode for docker and update mixtral model self attn check
* use fp16/bf16 for mixtral with flash attention 2
* skip e2e tests on docker with gpus for now
* tests to validate mistral and mixtral patches
* fix relative import