* fix(dataset): normalize tokenizer config and change hash from tokenizer class to tokenizer path
* fix: normalize config