Qwen2-Audio: Chat with Your Voice!
Captured source
source ↗Qwen2-Audio: Chat with Your Voice! | Qwen
We have a new blog! View this page at qwen.ai . This page will automatically redirect in 5 seconds. If you are not redirected automatically, please click the button below. Go Now
Qwen2-Audio: Chat with Your Voice! August 9, 2024 · 10 min · 1999 words · Qwen Team | Translations: 简体中文
DEMO PAPER GITHUB HUGGING FACE MODELSCOPE DISCORD To achieve the objective of building an AGI system, the model should be capable of understanding information from different modalities. Thanks to the rapid development of large language models, LLMs are now capable of understanding language and reasoning. Previously we have taken a step forward to extend our LLM, i.e., Qwen, to more modalities, including vision and audio, and built Qwen-VL and Qwen-Audio. Today, we release Qwen2-Audio, the next version of Qwen-Audio, which is capable of accepting audio and text inputs and generating text outputs. Qwen2-Audio has the following features: Voice Chat: for the first time, users can use the voice to give instructions to the audio-language model without ASR modules.
Audio Analysis: the model is capable of analyzing audio information, including speech, sound, music, etc., with text instructions.
Multilingual: the model supports more than 8 languages and dialects, e.g., Chinese, English, Cantonese, French, Italian, Spanish, German, and Japanese.
We open-weight Qwen2-Audio-7B and Qwen2-Audio-7B-Instruct in Hugging Face and ModelScope, and we have built a demo for users to interact with. Below are some examples to show the model performance: Voice Chat # Voice Chat: Identify speaker information and respond Next User Your browser does not support the audio element.
Qwen2-Audio Yes, the speaker is female and in her twenties.
User Your browser does not support the audio element.
Qwen2-Audio I'm sorry to hear that you're feeling stressed about your exam. It's completely normal to feel anxious when you're preparing for an important test. Here are some tips that might help you:   1. Start studying early: Don't leave everything until the last minute. Break up your study sessions into smaller chunks of time, and try to stick to a regular schedule. 2. Find a quiet place to study: A quiet environment can help you concentrate better. If possible, find a library or a study room where you can work without distractions. 3.Take breaks: It's important to take breaks during your study sessions. This will give your brain a chance to rest and recharge. Try taking short breaks every hour or so. 4. Get enough sleep: Lack of sleep can make it harder to focus and retain information. Make sure you're getting enough sleep each night, ideally around 7-8 hours. 5. Practice relaxation techniques: Deep breathing exercises, meditation, or yoga can all help you relax and reduce stress.   Remember, it's okay if you're not feeling well-prepared. Just do your best, and trust yourself. Good luck with your exam!
Voice Chat: Speech translation Next User Your browser does not support the audio element.
Qwen2-Audio 每个人都希望被欣赏,所以如果你欣赏某人,不要把它保密。
User Your browser does not support the audio element.
Qwen2-Audio Jeder möchte geschätzt werden, also wenn du jemanden schätzt, lass es nicht als Geheimnis verbergen.
User Your browser does not support the audio element.
Qwen2-Audio Chacun veut être apprecié, alors si tu apprécies quelqu'un, ne le gardes pas secret.
User Your browser does not support the audio element.
Qwen2-Audio 1. 每个人都喜欢被赞美,所以如果你喜欢一个人,不要隐藏你的赞赏。 2. 所有人都渴望得到认可,因此如果你欣赏某人,请不要独自享用这种感觉。 3. 人们都希望被欣赏,所以如果你喜欢某人,不要独自保密欣赏他人是一种美德,所以不要让别人知道你对他们的赞赏是秘密的。 4. 人们都希望受到赞扬,所以如果你欣赏某人,不妨公开表达出来。 5. 如果你喜欢某人,不要犹豫告诉他们,因为每个人都有权利享受他人的赞赏。
Voice Chat: Detecting background noise and responding accordingly Next User Your browser does not support the audio element.
Qwen2-Audio Use headphones to block out external noise and find a quiet place to study. Prioritize your tasks and set realistic goals for yourself. Use active learning techniques such as summarizing, note-taking, and quizzing to help you retain information. Take breaks and practice self-care to avoid burnout.
User Your browser does not support the audio element.
Qwen2-Audio Yes, you can try negotiating with them to reach a mutually beneficial agreement. You could also consider seeking help from a mediator or professional negotiator if needed.
Audio Analysis # Audio Analysis : Speech transcription and story telling Next User Your browser does not support the audio element.
Qwen2-Audio The original content of this audio is: 'Stop all the clocks, cut off the telephone, prevent the dog from barking, silence the pianos and with muffled drum bring out the coffin, let the mourners come.'
User No less than 50 words and no more than 200 words, add scenes such as rain and thunder to the poem. Finally, give the verses you created in lines.
Qwen2-Audio In the midst of an eerie stormy night, With raindrops falling heavy and bright, The world seemed to hush its voice, As winds howled through the darkened space.   Stop all the clocks, the time stood still, As the lightning flashed through the sky, The telephone's ring was drowned out, By the roaring thunder that shook the ground.   Prevent the dog from barking, its cries were silenced, By the fear that gripped it in the night, Silence the pianos, their keys were muted, As the mournful sound of raindrops beat on wood.   With muffled drums, a slow march began, A solemn procession through the storm, The coffin was carried, heavy and solemn, Let the mourners come, to pay their final respects.   As the storm slowly passed, the world emerged, A new day dawned, but memories lingered, Of the night when time stood still, And the mourners came, to say goodbye.
Audio Analysis: Sound analysis Next User What do you hear? Your browser does not support the audio element.
Qwen2-Audio I heard the sound of an alarm and a truck's air brake mixed with the noise of a heavy engine running and shifting gears.
User Under what circumstances can you hear such sounds?
Qwen2-Audio These types of sounds might be commonly heard in industrial settings like construction sites or factories where machinery is operated, or during emergency situations like traffic accidents where sirens are triggered.
User Your browser does not support the audio element.
Qwen2-Audio It's understandable to feel scared under any situation that makes you feel uncomfortable. It's important to remember that it's okay to be scared. Here are...
Excerpt shown — open the source for the full document.
Notability
notability 7.0/10Notable multimodal audio model release