Question
The README mentions:
"we will also open-source the training recipe soon, so you can train your own DFlash draft model to accelerate any LLM."
I'm very interested in training custom DFlash draft models. Could you share more details?
- Is there a rough timeline for releasing the training recipe?
- What will the training recipe include? (e.g., data format, training scripts, hyperparameter configs, hardware requirements)
- Will it support training draft models for arbitrary base LLMs, or only specific architectures?
Being able to train custom draft models would significantly expand DFlash's applicability. Thanks for the great work!
Context
- Use case: Training a DFlash draft model for a custom fine-tuned LLM
- Backend: vLLM / vLLM-ascend