TL;DR
ACE-Step 1.5 is an open-source music generation model that matches commercial services like Suno v4.5 in quality while running locally on consumer hardware. Generate a full 4-minute song in about 10 seconds on an RTX 3090, with VRAM requirements as low as 4GB.
Why it matters: AI music generation has been locked behind subscriptions ($10-30/month for Suno, Udio). ACE-Step 1.5 changes that—MIT-licensed, fully local, with no usage limits. Train custom styles from just 8 songs in under an hour.
Who it’s for: Musicians seeking AI-assisted ideation, content creators needing custom background music, developers building music-related applications, and anyone tired of paying for AI music subscriptions.
The Open-Source Music Generation Revolution
Until now, if you wanted high-quality AI-generated music, you had two choices: pay Suno $10-30/month or use Udio’s limited free tier. Both are cloud-based, both have usage restrictions, and both keep your creations on their servers.
ACE-Step 1.5 (released February 3, 2026) breaks this model wide open. It’s the first open-source music generation model that genuinely competes with commercial offerings—and in some benchmarks, it sits between Suno v4.5 and Suno v5 in quality.
The repository hit 2,200+ stars within 3 days of release. The original ACE-Step has 3,900+ stars. The community reception has been… enthusiastic.
“This is absolutely nuts, and I love the separation of concerns in the architecture. It opens up a lot of possibilities. Fantastic work!!” — r/LocalLLaMA
What Makes ACE-Step 1.5 Special
Blazing Fast Generation
Speed comparisons tell the story:
| Model | 4-Minute Song | Notes |
|---|---|---|
| ACE-Step 1.5 (A100) | 2 seconds | 10-120× faster than alternatives |
| ACE-Step 1.5 (RTX 3090) | ~10 seconds | Consumer GPU |
| Most commercial services | 2-4 minutes | Cloud-based |
| Some competitors | 20+ seconds | Open-source |
On an RTX 4070 Super, a 2-minute song takes about two minutes to generate. That's slower than the headline numbers above, but still faster than the upload-wait-download cycle of a cloud service.
Runs on Consumer Hardware
The VRAM requirements are surprisingly low:
- Minimum: Less than 4GB VRAM (basic generation)
- Recommended: 8GB for full-length songs
- LoRA training: 12GB (one hour for custom style)
Tested GPUs: RTX 4060, RTX 3090, RTX 4070 Super, A100. Works on AMD ROCm and even CPU/Apple Silicon (slower).
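Not sure which tier your card falls into? A quick check with PyTorch (assuming torch is already installed) reports the available VRAM:
# Report total VRAM for the first CUDA device (requires PyTorch)
import torch

if torch.cuda.is_available():
    props = torch.cuda.get_device_properties(0)
    print(f"{props.name}: {props.total_memory / 1024**3:.1f} GB VRAM")
else:
    print("No CUDA device found; expect slow CPU or Apple Silicon generation.")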
Commercial-Grade Quality
Benchmarks from the technical paper show ACE-Step 1.5 scoring between Suno v4.5 and Suno v5 on standard evaluation metrics. Style alignment and lyric adherence are strong. The model supports:
- 1000+ instruments and styles with fine-grained timbre control
- 50+ languages for lyrics
- 10 seconds to 10 minutes of audio
- Batch generation of up to 8 songs simultaneously
Beyond Basic Generation
ACE-Step 1.5 isn’t just text-to-music. It supports:
- Reference audio input — Guide generation with existing tracks
- Cover generation — Create covers from audio
- Repaint & edit — Selective local audio editing
- Track separation — Split audio into stems
- Multi-track layering — Add layers like Suno Studio
- Vocal-to-BGM — Generate accompaniment for vocals
- LoRA training — Personal style in 8 songs, 1 hour
The Architecture: Why It’s Fast AND Good
ACE-Step 1.5 uses a novel hybrid approach:
- A Language Model (LM) acts as an “omni-capable planner” that transforms simple prompts into comprehensive song blueprints. It handles structure, metadata, and lyrics via Chain-of-Thought reasoning.
- A Diffusion Transformer (DiT) generates the actual audio from the blueprint. This is where the speed comes from—efficient diffusion in latent space.
- Intrinsic Reinforcement Learning aligns the components without external reward models, avoiding biases from human preference datasets.
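To make the division of labor concrete, here's a minimal Python sketch of the plan-then-render flow. Everything below is illustrative pseudocode; the function names and blueprint fields are invented, not the real ACE-Step interfaces.
# Illustrative two-stage pipeline: the LM plans, the DiT renders.
# All names and structures here are hypothetical, not the actual API.

def plan_song(prompt: str) -> dict:
    """Stage 1 (LM): expand a short prompt into a full song blueprint."""
    return {
        "structure": ["intro", "verse", "chorus", "verse", "chorus", "outro"],
        "metadata": {"bpm": 92, "key": "A minor", "style": prompt},
        "lyrics": "(generated via Chain-of-Thought reasoning)",
    }

def render_audio(blueprint: dict) -> bytes:
    """Stage 2 (DiT): synthesize audio from the blueprint via latent diffusion."""
    return b""  # stand-in for the diffusion transformer

audio = render_audio(plan_song("smooth neo-soul instrumental"))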
Model variants:
- acestep-v15-turbo — Fast generation (default)
- acestep-5Hz-lm-0.6B — Smaller LM, faster
- acestep-5Hz-lm-1.7B — Larger LM, better prompt adherence
Self-Hosting ACE-Step 1.5
Quick Start (5 Minutes)
The fastest way to get running:
# Install uv package manager (if not installed)
curl -LsSf https://astral.sh/uv/install.sh | sh
# Clone the repo
git clone https://github.com/ACE-Step/ACE-Step-1.5.git
cd ACE-Step-1.5
# Install dependencies
uv sync
# Launch the Gradio Web UI
uv run acestep
Open http://localhost:7860 in your browser. Models download automatically on first run (~6GB total).
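Optionally, you can pre-fetch the weights ahead of time (handy before going offline). A sketch using huggingface_hub, assuming the repo id listed in the Links section; verify the id before running:
# Pre-download model weights so the first launch doesn't block on downloads.
# The repo id below is taken from this post's Links section; confirm it first.
from huggingface_hub import snapshot_download

snapshot_download(repo_id="ACE-Step/Ace-Step1.5")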
Windows One-Click Install
For Windows users, there’s a pre-built portable package:
- Download ACE-Step-1.5.7z
- Extract anywhere
- Run start_gradio_ui.bat
Includes Python and all dependencies. Requires CUDA 12.8.
REST API Server
For programmatic access:
uv run acestep-api
# API available at http://localhost:8001
Or enable API alongside the web UI:
uv run acestep --enable-api --api-key sk-your-key --port 8001
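The documented routes and request schema live in the repo; purely as a sketch, a generation request with the key in an Authorization header might look like the following. The endpoint path and JSON fields here are assumptions, not the actual API.
# Hypothetical client for the local API server started above.
# The route and payload fields are illustrative guesses; check the repo docs.
import requests

resp = requests.post(
    "http://localhost:8001/generate",  # assumed route, not documented here
    headers={"Authorization": "Bearer sk-your-key"},
    json={"prompt": "driving synthwave with an arpeggiated lead", "duration": 120},
    timeout=300,
)
resp.raise_for_status()
with open("song.wav", "wb") as f:
    f.write(resp.content)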
Docker (via ComfyUI)
The community has created a ComfyUI integration for workflow-based music generation:
# Inside ComfyUI custom_nodes folder
git clone https://github.com/billwuhao/ComfyUI_ACE-Step
Supports LoRA, repainting, remixing, and audio-to-audio workflows.
AMD GPU Support
ACE-Step runs on AMD GPUs via ROCm, but requires a workaround:
# Activate your venv first (important!)
source .venv/bin/activate
# Run directly instead of via uv
python -m acestep.acestep_v15_pipeline --server-name 127.0.0.1 --port 7860
Configuration Deep Dive
Command Line Options
# Public access (network-accessible)
uv run acestep --server-name 0.0.0.0 --share
# Change language (en, zh, ja)
uv run acestep --language zh
# Pre-load models on startup
uv run acestep --init_service true
# Use the larger LM for better prompts
uv run acestep --lm_model_path acestep-5Hz-lm-1.7B
# Force CPU offload (auto-enabled when VRAM < 16GB)
uv run acestep --offload_to_cpu true
# Add authentication
uv run acestep --auth-username admin --auth-password secret
Environment Variables
Create a .env file for persistent config:
ACESTEP_INIT_LLM=true # Force-enable LLM
ACESTEP_CONFIG_PATH=acestep-v15-turbo
ACESTEP_LM_MODEL_PATH=acestep-5Hz-lm-1.7B
ACESTEP_DOWNLOAD_SOURCE=huggingface # or modelscope
ACESTEP_API_KEY=sk-your-secret-key
VRAM Optimization
For GPUs with limited VRAM:
# Auto-enables if VRAM < 16GB
uv run acestep --offload_to_cpu true
# Use smaller LM (0.6B vs 1.7B)
uv run acestep --lm_model_path acestep-5Hz-lm-0.6B
# Disable LLM entirely (DiT-only mode)
uv run acestep --init_llm false
Prompt Engineering for Music
ACE-Step 1.5 responds well to detailed, descriptive prompts. Here are examples from the acemusic.ai playground:
Neo-Soul Jazz
A smooth neo-soul instrumental built on a relaxed groove from an upright
bass with light brushwork on the drums and warm electric piano chords.
A soulful alto saxophone enters to play a memorable, lyrical melody that
serves as the main theme. The arrangement is spacious and clean, allowing
each instrument room to breathe. Following the main section, there's an
extended improvisational passage featuring expressive saxophone runs.
Synthwave
An energetic, driving synthwave track propelled by a punchy four-on-the-floor
drum machine beat and a pulsing synth bassline. A bright, arpeggiated synth
lead carries the main melodic hook, weaving through atmospheric synth pads
that provide harmonic depth. The arrangement builds dynamically, introducing
new synth layers and filter sweeps to maintain momentum.
Anime J-Rock Theme
An explosive j-rock anthem driven by crunchy, overdriven electric guitars
playing powerful riffs and chords. A punchy acoustic drum kit lays down
an energetic 4/4 beat with crashing cymbals, locked in with a solid bassline.
The track is fronted by a powerful female lead vocal performance, delivered
with clarity, strength, and conviction typical of anime theme songs.
Lo-Fi Hip-Hop
A melancholic and atmospheric hip-hop track built on a foundation of a clean,
arpeggiated piano melody and a deep, resonant sub-bass. The beat is a sparse,
lo-fi hip-hop groove with a soft kick and a snappy snare. A male vocalist
delivers lyrics with delay to create a distant, introspective feel.
Tips for Better Results
- Describe instruments specifically — “warm electric piano” beats “piano”
- Include arrangement details — verse/chorus structure, builds, breakdowns
- Specify the vibe — “melancholic,” “energetic,” “dreamy”
- Mention production style — “lo-fi,” “polished,” “raw”
- Use the LLM’s query rewriting — Let it expand simple prompts
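If you're building prompts programmatically (over the API, say), a small template helper can bake these tips in. This is just an illustrative pattern, not part of ACE-Step:
# Illustrative prompt builder applying the tips above.
def build_prompt(instruments: list[str], vibe: str, arrangement: str, production: str) -> str:
    return (
        f"A {vibe} track featuring {', '.join(instruments)}. "
        f"{arrangement} Production style: {production}."
    )

print(build_prompt(
    instruments=["warm electric piano", "upright bass", "brushed drums"],
    vibe="melancholic, spacious",
    arrangement="Opens sparse, builds into a full chorus, then strips back down.",
    production="lo-fi and tape-saturated",
))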
Training Custom Styles (LoRA)
One of ACE-Step 1.5’s killer features: train a personalized style from just 8 songs in about an hour on a 12GB GPU.
The Gradio UI includes one-click annotation and training. The workflow:
- Upload 8+ reference songs in your target style
- The model auto-annotates BPM, key, and captions
- Click train — wait ~1 hour on RTX 3090
- Load your LoRA and generate in your custom style
This opens possibilities for:
- Personal signature sounds
- Brand-specific audio
- Genre-specific fine-tuning
- Artist style approximation (use responsibly)
🎵 Listen: Generated Examples
Head to acemusic.ai/playground/trending to hear what ACE-Step 1.5 can produce. Trending tracks include:
- “Ember Swing” — Neo-soul jazz, 3:41
- “Dreamwave Drive” — Synthwave, 3:03
- “I’m happy” — Korean indie rock, 2:00
- “Echoes of the past” — Anthemic hip-hop, 2:10
All generated by ACE-Step 1.5 with the prompts visible on each track.
Community Reactions & Honest Limitations
What People Love
- “An open-source model with quality approaching Suno v4.5/v5… running locally on a potato GPU. No subscriptions. No API.”
- The architecture’s separation of concerns opens customization possibilities
- LoRA training democratizes personalized music AI
- Speed is genuinely impressive for the quality level
Known Limitations
Not everyone is fully satisfied:
- Prompt adherence — Some users report electronic genres don’t match expectations
- Mastering quality — “Sounds like loudness war era music” — could use better mastering
- Genre specificity — May not understand niche electronic subgenres well
- Coherence — Some outputs lack long-term musical coherence
Consensus: Excellent for rapid prototyping and ideation. May not replace professional production pipelines, but it’s getting close.
Comparison: ACE-Step vs Alternatives
| Feature | ACE-Step 1.5 | Suno v4.5 | Udio | DiffRhythm |
|---|---|---|---|---|
| Open Source | ✅ MIT | ❌ | ❌ | ✅ |
| Local/Self-Hosted | ✅ | ❌ | ❌ | ✅ |
| Quality | Good-Great | Great | Good | Medium |
| Speed (4-min song) | 2s-10s | 2-4 min | 1-2 min | ~10s |
| Min VRAM | 4GB | N/A | N/A | 8GB |
| LoRA Training | ✅ | ❌ | ❌ | ❌ |
| Price | Free | $10-30/mo | Free tier | Free |
| Languages | 50+ | 50+ | Limited | Limited |
Use Cases
For Musicians
- Ideation: Generate 10 variations of a concept in minutes
- Demo creation: Quick backing tracks for songwriting
- Style exploration: Try genres outside your comfort zone
- LoRA training: Capture your signature sound
For Content Creators
- Background music: Custom tracks for videos, no licensing issues
- Podcast intros: Unique, on-brand audio
- Game prototyping: Placeholder soundtracks that might become final
For Developers
- API integration: Build music features into apps
- Workflow automation: Generate audio programmatically
- Custom interfaces: Build on top of the REST API
Quick Reference
# Install
git clone https://github.com/ACE-Step/ACE-Step-1.5.git
cd ACE-Step-1.5 && uv sync
# Run Web UI
uv run acestep
# Run API server
uv run acestep-api
# With authentication
uv run acestep --enable-api --api-key sk-xxx --auth-username admin --auth-password secret
# Low VRAM mode
uv run acestep --offload_to_cpu true --lm_model_path acestep-5Hz-lm-0.6B
Links
- GitHub: ace-step/ACE-Step-1.5
- Hugging Face Model: ACE-Step/Ace-Step1.5
- Online Demo: Hugging Face Spaces
- Technical Paper: arXiv:2602.00744
- Playground: acemusic.ai
- Discord: ACE-Step Community
- ComfyUI Integration: billwuhao/ComfyUI_ACE-Step
Final Thoughts
ACE-Step 1.5 represents a genuine inflection point for open-source AI music. It’s not perfect—prompt adherence and mastering quality have room to improve—but it’s good enough to be genuinely useful, and it’s fast enough to integrate into creative workflows.
For the first time, you can run Suno-quality music generation on your own hardware, with no subscriptions, no usage limits, and full control over your outputs. Train custom styles in an hour. Build it into your applications via API.
The future of AI music just went local, and it’s MIT-licensed.