PC-Mix Dataset

Partial Spoof Dataset with Controlled Speech-Background Mixing

PC-Mix pairs speech from PartialSpoof v1.2 with a self-curated partial-spoof background pool to create controlled speech–background authenticity combinations. The original samples are collected from VGGSound and AudioCaps.

Resources

Dataset Statistics

| Split | Mixed Samples | Original Samples | Background Source | Event Source | Fusion Method |
|---|---|---|---|---|---|
| Train | 25,380 | 6,345 | SONYC | 60% AudioLDM2, 40% UrbanSound8K (0–1 s) | Ducking Overlay |
| Dev | 24,844 | 6,211 | IHTApark-UBS | 60% AudioLDM2, 40% ESC-50 (0–1 s) | Ducking Overlay |
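
The table names "Ducking Overlay" as the fusion method but does not define it. As a rough illustration only, the sketch below assumes that ducking means attenuating the background in frames where speech is active and then summing the two signals. The function names, frame length, threshold, and attenuation factor are all hypothetical and are not taken from the PC-Mix pipeline.

```python
import math

def frame_rms(signal, frame_len):
    """Return one RMS value per non-overlapping frame of `signal`."""
    values = []
    for start in range(0, len(signal), frame_len):
        frame = signal[start:start + frame_len]
        values.append(math.sqrt(sum(x * x for x in frame) / len(frame)))
    return values

def ducking_overlay(speech, background, frame_len=160, threshold=0.01, duck_gain=0.3):
    """Mix speech over background, lowering the background where speech is active.

    All default parameter values are illustrative assumptions.
    """
    # Truncate both signals to the shorter length so they align sample by sample.
    n = min(len(speech), len(background))
    speech, background = speech[:n], background[:n]
    mixed = []
    for i, level in enumerate(frame_rms(speech, frame_len)):
        # Duck the background only in frames whose speech RMS exceeds the threshold.
        gain = duck_gain if level > threshold else 1.0
        start = i * frame_len
        for s, b in zip(speech[start:start + frame_len],
                        background[start:start + frame_len]):
            mixed.append(s + gain * b)
    return mixed

# Toy signals: one silent frame followed by one active frame.
speech = [0.0] * 160 + [0.5] * 160
background = [0.1] * 320
mixed = ducking_overlay(speech, background)
```

This sketch switches the gain abruptly at frame boundaries, which can cause audible clicks. A practical implementation would likely smooth the gain over time.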

Evaluation Sets

| Subset | Mixed | Original | Background | Event | Fusion |
|---|---|---|---|---|---|
| E0 Baseline | 17,809 | 17,809 | SONYC | 60% AudioLDM2, 40% FSD50K (0–1 s) | Ducking Overlay |
| E1 Generator OOD | 14,247 | - | SONYC | AudioGen | Ducking Overlay |
| E2 Fusion OOD | 14,247 | - | SONYC | AudioLDM2 | Energy Matching + Crossfade (20–80 ms) |
| E3 Background OOD | 17,809 | - | DEMAND | AudioLDM2 | Ducking Overlay |
| E4 Noise OOD | 7,125 | - | WHAM Noise | AudioLDM2 | Ducking Overlay |
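
The E2 subset uses "Energy Matching + Crossfade (20–80 ms)" instead of ducking, but the page does not spell out the procedure. One plausible reading, sketched below, is that an event segment is scaled to the RMS of the background region it replaces and then spliced in with linear crossfades at both edges. The function name `splice_with_crossfade`, the linear fade shape, and the choice to replace rather than add are assumptions made for illustration.

```python
import math

def rms(x):
    """Root-mean-square level of a list of samples (0.0 for an empty list)."""
    return math.sqrt(sum(v * v for v in x) / len(x)) if x else 0.0

def splice_with_crossfade(background, event, start, fade_len):
    """Replace background[start:start+len(event)] with an energy-matched event.

    The event is scaled so its RMS equals that of the replaced region, and
    `fade_len` samples at each edge are linearly crossfaded with the background.
    """
    end = start + len(event)
    if start < 0 or end > len(background) or 2 * fade_len > len(event):
        raise ValueError("event and fades must fit inside the background")
    region = background[start:end]
    event_rms = rms(event)
    scale = rms(region) / event_rms if event_rms > 0 else 1.0
    out = list(background)
    for i, e in enumerate(event):
        if fade_len and i < fade_len:
            w = (i + 1) / (fade_len + 1)           # fade in
        elif fade_len and i >= len(event) - fade_len:
            w = (len(event) - i) / (fade_len + 1)  # fade out
        else:
            w = 1.0                                # fully event
        out[start + i] = w * scale * e + (1.0 - w) * region[i]
    return out

# Toy example: a constant background and a louder, inverted event.
background = [0.2] * 100
event = [-0.4] * 40
out = splice_with_crossfade(background, event, start=30, fade_len=10)
```

At an assumed 16 kHz sampling rate, the 20–80 ms range in the table would correspond to `fade_len` values between 320 and 1,280 samples.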

Audio Demo

Example 1

Type Audio
Original
Speech Bona + Background Bona
Speech Bona + Background Spoof
Speech Spoof + Background Bona
Speech Spoof + Background Spoof
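
The four mixed types listed above are the Cartesian product of speech authenticity and background authenticity. The short sketch below enumerates them and checks that the Train and Dev mixed counts in the statistics table equal four times the original counts. The label strings `bona` and `spoof` and the dictionary layout are hypothetical, chosen only for this example.

```python
from itertools import product

LABELS = ("bona", "spoof")  # hypothetical label strings

def authenticity_combinations():
    """Return every speech/background authenticity pairing."""
    return [{"speech": s, "background": b} for s, b in product(LABELS, LABELS)]

combos = authenticity_combinations()

# (mixed, original) counts copied from the Dataset Statistics table.
counts = {"Train": (25380, 6345), "Dev": (24844, 6211)}
for split, (mixed, original) in counts.items():
    # Each original sample should yield one mixed sample per combination.
    print(split, mixed == len(combos) * original)
```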