MusicGen-Stem: Multi-stem music generation and edition through autoregressive modeling


An example of music generation and stem editing


We first generate some music according to a textual prompt:


Prompt | Generated Stems | Mix
Slow pop song with an acoustic guitar. | Drums, Bass, Other | [audio]


Then, we keep the bass and drums and replace the other stem with a piano melody:


Text Prompt | Input Stems | Generated Other Stem | Mix
Slow pop song with a piano. | Drums, Bass | [audio] | [audio]


Finally, we keep the bass and the other stem and replace the drums with oriental percussion:


Text Prompt | Input Stems | Generated Drums Stem | Mix
Oriental percussions. | Bass, Other | [audio] | [audio]

Text to Music Samples


Here we use our model in the standard text-to-music setup. Since our model generates three stems, we present below both the isolated stems and the mix.


Prompt | Stems | Mix | MusicGen
Jazz piece featuring a piano and accompanying bass and drums. | Drums, Bass, Other | [audio] | [audio]
Rock song with a prominent guitar riff and driving drum beat. | Drums, Bass, Other | [audio] | [audio]
Hip-hop track with a catchy hook and a deep bass. | Drums, Bass, Other | [audio] | [audio]
Reggae song with an off-beat rhythm. | Drums, Bass, Other | [audio] | [audio]
Country ballad with piano and guitar. | Drums, Bass, Other | [audio] | [audio]
Chill lo-fi beat with piano | Drums, Bass, Other | [audio] | [audio]
Folk song with a simple acoustic guitar melody. | Drums, Bass, Other | [audio] | [audio]
R&B ballad with a smooth piano melody. | Drums, Bass, Other | [audio] | [audio]
Funk track with a groovy bassline. | Drums, Bass, Other | [audio] | [audio]
80s new wave song with a synthesizer | Drums, Bass, Other | [audio] | [audio]


Music Editing Samples


Here we showcase the editing capabilities of our model. For this task, we feed the model an audio prompt and ask it to generate the missing stems. We obtain the audio prompt by removing the stem that we want to replace from the original song using Hybrid Transformer Demucs.
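
As a rough illustration of how such an audio prompt could be assembled, the sketch below assumes the song has already been separated into stems (for example by a source separation model) and that removing a stem amounts to summing the remaining ones sample by sample. The function names `build_audio_prompt` and `remix`, the toy waveforms, and the plain Python lists standing in for audio arrays are all hypothetical and are not taken from the MusicGen-Stem code.

```python
from typing import Dict, List

Wave = List[float]  # mono waveform as a list of samples (toy stand-in for an audio array)


def build_audio_prompt(stems: Dict[str, Wave], target: str) -> Wave:
    """Sum every separated stem except `target`, sample by sample."""
    if target not in stems:
        raise KeyError(f"unknown stem: {target}")
    kept = [wave for name, wave in stems.items() if name != target]
    if not kept:
        raise ValueError("need at least one stem besides the target")
    if any(len(wave) != len(kept[0]) for wave in kept):
        raise ValueError("all stems must have the same length")
    return [sum(samples) for samples in zip(*kept)]


def remix(audio_prompt: Wave, generated_stem: Wave) -> Wave:
    """Add the newly generated stem back onto the audio prompt."""
    if len(audio_prompt) != len(generated_stem):
        raise ValueError("prompt and generated stem must have the same length")
    return [a + g for a, g in zip(audio_prompt, generated_stem)]


# Toy separated stems (assumed to come from a separation step).
separated = {
    "drums": [0.5, 0.0, -0.5, 0.0],
    "bass": [0.25, 0.25, 0.25, 0.25],
    "other": [0.0, 0.125, 0.0, -0.125],
}

# Bass editing: the prompt keeps drums and other, the bass is dropped.
prompt = build_audio_prompt(separated, "bass")

# Placeholder for the stem a model would generate from the prompt.
new_bass = [0.0, 0.5, 0.0, 0.5]
mix = remix(prompt, new_bass)
```

The same two helpers cover drums and other editing by changing the `target` argument.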

Bass editing:
Original song Audio Prompt Method Generated Bass Mix
MusicGen-Stem
MSDM PT
MSDM RT
MusicGen Instruct N/A
MusicGen-Stem
MSDM PT
MSDM RT
MusicGen Instruct N/A
MusicGen-Stem
MSDM PT
MSDM RT
MusicGen Instruct N/A
MusicGen-Stem
MSDM PT
MSDM RT
MusicGen Instruct N/A
MusicGen-Stem
MSDM PT
MSDM RT
MusicGen Instruct N/A
MusicGen-Stem
MSDM PT
MSDM RT
MusicGen Instruct N/A
Drums editing:
Original song Audio Prompt Method Generated Drums Mix
MusicGen-Stem
MSDM PT
MSDM RT
MusicGen Instruct N/A
MusicGen-Stem
MSDM PT
MSDM RT
MusicGen Instruct N/A
MusicGen-Stem
MSDM PT
MSDM RT
MusicGen Instruct N/A
MusicGen-Stem
MSDM PT
MSDM RT
MusicGen Instruct N/A
Other editing:
Original song Audio Prompt Method Generated Other Mix
MusicGen-Stem
MSDM PT
MSDM RT
MusicGen Instruct N/A
MusicGen-Stem
MSDM PT
MSDM RT
MusicGen Instruct N/A
MusicGen-Stem
MSDM PT
MSDM RT
MusicGen Instruct N/A
MusicGen-Stem
MSDM PT
MSDM RT
MusicGen Instruct N/A




Dual song


Here we experiment a bit: we regenerate each stem conditioned on the other ones and then sum the regenerated stems.
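
The procedure can be sketched as a loop over stems. This is a minimal sketch under two assumptions that the page does not confirm: each stem is regenerated from the sum of the other original stems (not from stems that were already regenerated), and the model is represented by a hypothetical callable `generate_stem(target, context)`. The `toy_generator` below is only a stand-in so that the example runs; it is not the model.

```python
from typing import Callable, Dict, List, Tuple

Wave = List[float]  # mono waveform as a list of samples (toy stand-in for an audio array)


def dual_song(
    stems: Dict[str, Wave],
    generate_stem: Callable[[str, Wave], Wave],
) -> Tuple[Dict[str, Wave], Wave]:
    """Regenerate each stem from the other original stems, then sum the results."""
    regenerated: Dict[str, Wave] = {}
    for target in stems:
        others = [wave for name, wave in stems.items() if name != target]
        # Context = sample-wise sum of the original stems other than the target.
        context = [sum(samples) for samples in zip(*others)]
        regenerated[target] = generate_stem(target, context)
    # The dual song is the sum of all regenerated stems.
    mix = [sum(samples) for samples in zip(*regenerated.values())]
    return regenerated, mix


def toy_generator(target: str, context: Wave) -> Wave:
    """Hypothetical stand-in for the model: returns the context at half amplitude."""
    return [0.5 * x for x in context]


original = {
    "drums": [1.0, 0.0],
    "bass": [0.0, 1.0],
    "other": [0.5, 0.5],
}
new_stems, dual_mix = dual_song(original, toy_generator)
```

Because every context is built from the original stems, the order in which stems are regenerated does not affect the result in this sketch.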
Original Song Audio Prompts Stems Mix
Drums
Bass
Other
Drums
Bass
Other
Drums
Bass
Other
Drums
Bass
Other
Drums
Bass
Other
Drums
Bass
Other