Stability AI Releases Stable Audio 3: A Family of Fast Latent Diffusion Models for Audio Generation and Editing

MarkTechPost, May 26, 2026 at 10:31 PM

Stability AI has released Stable Audio 3, a family of latent diffusion models for instrumental music and sound effects generation. The release includes open weights for the small and medium variants. Small runs on a MacBook Pro M4 CPU. Medium fits on consumer GPUs with 8 GB of VRAM.

Both generate stereo audio at 44.1 kHz and are trained with a three-stage pipeline: flow matching, distillation warmup, and adversarial post-training. On the BBC Sound Effects benchmark at 5 seconds, SA3 medium scores a Fréchet Audio Distance (FAD) of 0.369, lower (better) than every open-weight baseline evaluated in the paper.
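
For readers unfamiliar with the metric, FAD compares the statistics of embeddings from reference audio with those from generated audio, and a smaller value means the two sets are closer. The sketch below is only an illustration under simplifying assumptions: it treats each embedding dimension independently (diagonal covariance), whereas the full metric uses complete covariance matrices, and the toy vectors stand in for the output of an audio embedding model that this summary does not name. The function name `frechet_distance_diag` is hypothetical and is not taken from the Stable Audio 3 release.

```python
from math import sqrt
from statistics import fmean, pvariance


def frechet_distance_diag(ref, gen):
    """Fréchet distance between Gaussians fitted to two embedding sets.

    Assumption: covariances are diagonal, so each dimension contributes
    (mean difference)^2 + (std difference)^2. Full FAD uses full covariances.
    """
    dims = len(ref[0])
    total = 0.0
    for d in range(dims):
        r = [vec[d] for vec in ref]
        g = [vec[d] for vec in gen]
        mu_r, mu_g = fmean(r), fmean(g)
        sd_r, sd_g = sqrt(pvariance(r)), sqrt(pvariance(g))
        total += (mu_r - mu_g) ** 2 + (sd_r - sd_g) ** 2
    return total


# Toy 2-D "embeddings" (made-up numbers, not real model outputs).
reference = [[0.0, 1.0], [2.0, 3.0]]
close_gen = [[0.1, 1.1], [2.1, 3.1]]  # slightly shifted copy of the reference
far_gen = [[5.0, 5.0], [9.0, 9.0]]    # very different distribution

close_score = frechet_distance_diag(reference, close_gen)
far_score = frechet_distance_diag(reference, far_gen)
```

In this toy setup the slightly shifted set scores much lower than the distant one, which is the sense in which a lower FAD indicates generated audio that is statistically closer to the reference set.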

This is a summary. For the full story, read the original article at MarkTechPost.
