XiaomiMiMo/MiMo-Audio-Training
Python
Language: Python
Stars: 109
Forks: 13
Open issues: 5
Created: 2025-10-16T13:52:54Z
Pushed: 2025-10-16T13:55:07Z
Default branch: main
Fork: no
Archived: no
README:
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
MiMo-Audio-Training Toolkit
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
Introduction
Welcome to the MiMo-Audio-Training toolkit! It is designed to fine-tune the XiaomiMiMo/MiMo-Audio-7B-Instruct model and serves as a reference implementation for researchers and developers who want to adapt MiMo-Audio to their own custom tasks.
Supported Tasks
The MiMo-Audio-Training toolkit supports a range of supervised fine-tuning (SFT) tasks. Some of the key features include:
- Tasks:
  - SFT:
    - ASR
    - TTS / InstructTTS
    - Audio Understanding and Reasoning
    - Spoken Dialogue
Getting Started
To get started with the MiMo-Audio-Training toolkit, follow the instructions below to set up the environment and install the required dependencies.
Prerequisites (Linux)
- Python 3.12
- CUDA >= 12.0
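The two prerequisites above can be checked before installing anything. The following is a minimal sketch and not part of the toolkit: the helper names `parse_version`, `check_prerequisites`, and `detect_cuda_version` are hypothetical, and the check treats the listed Python 3.12 as an exact minor-version requirement, which is an assumption. The CUDA version is read from `torch.version.cuda`, which reports the CUDA version PyTorch was built against rather than the installed driver.

```python
import re
import sys

REQUIRED_PYTHON = (3, 12)  # from the prerequisites list (assumed to be an exact match)
MIN_CUDA = (12, 0)         # from the prerequisites list


def parse_version(text):
    """Parse a string such as '12.4' into (12, 4), using the first two fields."""
    fields = []
    for piece in text.split(".")[:2]:
        match = re.match(r"\d+", piece)
        fields.append(int(match.group()) if match else 0)
    while len(fields) < 2:
        fields.append(0)
    return tuple(fields)


def check_prerequisites(python_version, cuda_version):
    """Return a list of human-readable problems; an empty list means all checks passed."""
    problems = []
    if tuple(python_version[:2]) != REQUIRED_PYTHON:
        problems.append(
            f"Python {python_version[0]}.{python_version[1]} found, 3.12 expected"
        )
    if cuda_version is None:
        problems.append("no CUDA version detected")
    elif parse_version(cuda_version) < MIN_CUDA:
        problems.append(f"CUDA {cuda_version} found, >= 12.0 expected")
    return problems


def detect_cuda_version():
    """Return the CUDA version PyTorch was built with, or None if unavailable."""
    try:
        import torch  # optional: only present once PyTorch is installed
    except ImportError:
        return None
    return torch.version.cuda
```

Calling `check_prerequisites(sys.version_info, detect_cuda_version())` returns an empty list when both requirements are met.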
Installation:
git clone --recurse-submodules https://github.com/XiaomiMiMo/MiMo-Audio-Training
cd MiMo-Audio-Training
pip install -r requirements.txt
pip install flash-attn==2.7.4.post1
pip install -e .
> [!Note]
> If the compilation of flash-attn takes too long, you can download the precompiled wheel and install it manually:
>
> * Download Precompiled Wheel
>
> ```sh
> pip install /path/to/flash_attn-2.7.4.post1+cu12torch2.6cxx11abiFALSE-cp312-cp312-linux_x86_64.whl
> ```
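A precompiled wheel only installs cleanly when its filename tags match the local environment. The sketch below splits the wheel filename from the note into its fields, following the standard wheel naming convention (`name-version-python_tag-abi_tag-platform_tag.whl`, with an optional build tag). The helper names `parse_wheel_name` and `matches_interpreter` are hypothetical. Reading the local version segment `cu12torch2.6cxx11abiFALSE` as "built for CUDA 12 and PyTorch 2.6" is an assumption based on how the filename looks, not something this README states.

```python
import sys


def parse_wheel_name(filename):
    """Split a wheel filename into its standard fields."""
    if not filename.endswith(".whl"):
        raise ValueError("not a wheel filename")
    parts = filename[: -len(".whl")].split("-")
    if len(parts) == 5:
        name, version, python_tag, abi_tag, platform_tag = parts
    elif len(parts) == 6:  # an optional build tag sits after the version
        name, version, _build, python_tag, abi_tag, platform_tag = parts
    else:
        raise ValueError("unexpected number of fields in wheel filename")
    # A version may carry a local segment after '+', e.g. '2.7.4.post1+cu12...'
    public_version, _, local_version = version.partition("+")
    return {
        "name": name,
        "version": public_version,
        "local": local_version,
        "python_tag": python_tag,
        "abi_tag": abi_tag,
        "platform_tag": platform_tag,
    }


def matches_interpreter(tags, version_info=sys.version_info):
    """Check whether the wheel's CPython tag matches the given interpreter version."""
    return tags["python_tag"] == f"cp{version_info[0]}{version_info[1]}"


wheel = "flash_attn-2.7.4.post1+cu12torch2.6cxx11abiFALSE-cp312-cp312-linux_x86_64.whl"
tags = parse_wheel_name(wheel)
```

Here `tags["python_tag"]` is `cp312` and `tags["platform_tag"]` is `linux_x86_64`, which lines up with the Python 3.12 and Linux prerequisites listed earlier.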
Training Process:
Download the fine-tuning dataset and pre-process the data into the format described in instruct_template.md.
Training
We provide multiple training scripts under the scripts directory, supporting both single-GPU and multi-GPU training setups.
cd MiMo-Audio-Training
bash scripts/train_multiGPU_torchrun.sh
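For readers unfamiliar with `torchrun`, the difference between the single-GPU and multi-GPU setups is largely visible through environment variables: `torchrun` starts one process per GPU and sets `RANK`, `LOCAL_RANK`, and `WORLD_SIZE` for each. The sketch below shows how a training entry point could read them with single-process fallbacks. It is an illustration only and does not claim to mirror the toolkit's own scripts; `DistEnv` and `read_dist_env` are hypothetical names.

```python
import os
from dataclasses import dataclass


@dataclass(frozen=True)
class DistEnv:
    rank: int        # global index of this process across all nodes
    local_rank: int  # index of this process on its node, usually the GPU index
    world_size: int  # total number of processes

    @property
    def is_distributed(self):
        return self.world_size > 1

    @property
    def is_main_process(self):
        # Rank 0 conventionally handles logging and checkpoint writing.
        return self.rank == 0


def read_dist_env(environ=None):
    """Read torchrun's variables, defaulting to a single-process setup."""
    env = os.environ if environ is None else environ
    return DistEnv(
        rank=int(env.get("RANK", 0)),
        local_rank=int(env.get("LOCAL_RANK", 0)),
        world_size=int(env.get("WORLD_SIZE", 1)),
    )
```

When launched on one node with four processes, each process would see a `WORLD_SIZE` of 4 and its own `LOCAL_RANK` between 0 and 3. A plain `python` launch sets none of these variables, so the defaults describe a single process.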
Generation and Evaluation
Run inference with generate.py.
Evaluate the SFT model with 🌐MiMo-Audio-Eval.
Citation
@misc{coreteam2025mimoaudio,
title={MiMo-Audio: Audio Language Models are Few-Shot Learners},
author={LLM-Core-Team Xiaomi},
year={2025},
url={https://github.com/XiaomiMiMo/MiMo-Audio},
}
Contact
Please contact us at [mimo@xiaomi.com](mailto:mimo@xiaomi.com) or open an issue if you have any questions.
Notability
6.0/10. Xiaomi audio training repo with moderate stars.