Examples
Static CMU ARCTIC (fixed sources, fixed microphones)
This example mixes multiple CMU ARCTIC utterances using a static ISM RIR and produces a multi-microphone output (default: binaural).
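As a rough illustration of what a static mix involves, the sketch below convolves each source with one fixed RIR per microphone and sums the results. It uses plain NumPy rather than the torchrir API. The function name mix_static, the array shapes, and the toy RIRs are assumptions made for illustration only.

```python
import numpy as np


def mix_static(sources, rirs):
    """Convolve each source with its per-microphone RIR and sum over sources.

    sources: array of shape (num_sources, num_samples)
    rirs:    array of shape (num_sources, num_mics, rir_len)
    Returns (mixture, refs). refs has shape
    (num_sources, num_mics, num_samples + rir_len - 1) and mixture is the
    sum of refs over the source axis.
    """
    num_sources, num_samples = sources.shape
    _, num_mics, rir_len = rirs.shape
    out_len = num_samples + rir_len - 1
    refs = np.zeros((num_sources, num_mics, out_len))
    for s in range(num_sources):
        for m in range(num_mics):
            refs[s, m] = np.convolve(sources[s], rirs[s, m])
    return refs.sum(axis=0), refs


# Toy example: 2 sources, 2 microphones (binaural), 3-tap RIRs.
rng = np.random.default_rng(0)
sources = rng.standard_normal((2, 8))
rirs = np.zeros((2, 2, 3))
rirs[:, :, 0] = 1.0  # direct path only, so each mic hears the plain sum
mixture, refs = mix_static(sources, rirs)
print(mixture.shape)  # (2, 10)
```

The per-source arrays in refs play the role of the static_refXX.wav files, assuming those references are saved before any post-mix processing.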
Key arguments
- --num-sources: number of source speakers to mix.
- --num-mics: number of microphones in the fixed array.
- --duration: length (seconds) of each source signal.
- --order: ISM reflection order.
- --tmax: RIR length in seconds.
- --room: room size (Lx Ly Lz).
- --plot: save layout plots and GIFs.
- --out-dir: output directory for WAV/metadata/plots/GIFs.
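To get a feel for how --order and --tmax affect cost, the following sketch counts the image sources of a 3D shoebox room up to a given reflection order and converts an RIR length in seconds to samples. It assumes the order bounds the total number of wall reflections (|i| + |j| + |k| <= order) and a 16 kHz sample rate. Both are assumptions, and torchrir's exact conventions may differ.

```python
def count_image_sources(order):
    """Count image rooms whose index (i, j, k) satisfies |i| + |j| + |k| <= order."""
    idx = range(-order, order + 1)
    return sum(
        1
        for i in idx
        for j in idx
        for k in idx
        if abs(i) + abs(j) + abs(k) <= order
    )


def rir_num_samples(tmax, sample_rate=16000):
    """Convert an RIR length in seconds to a sample count (rounded to nearest)."""
    return int(round(tmax * sample_rate))


print(count_image_sources(1))   # 7: the source itself plus one image per wall
print(count_image_sources(12))  # 2625
print(rir_num_samples(0.6))     # 9600
```

Under these assumptions the image count grows roughly with the cube of the order, which is why raising --order is usually much more expensive than raising --tmax.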
Example runs
uv run python examples/static.py --num-sources 1 --duration 5 --plot
Expected outputs:
- static.wav
- static_ref01.wav, static_ref02.wav, ...
- static_metadata.json
- ATTRIBUTION.txt
- static_static_2d.png (and static_static_3d.png if 3D)
- static.gif (and static_3d.gif if 3D)
uv run python examples/static.py --order 12 --tmax 0.6 --device auto
Expected outputs:
- static.wav
- static_ref01.wav, static_ref02.wav, ...
- static_metadata.json
- ATTRIBUTION.txt
Dynamic CMU ARCTIC (moving sources, fixed microphones)
This example generates moving source trajectories and convolves source signals with dynamic RIRs (trajectory mode).
The script uses save_scene_plots and save_scene_gifs for visualization output.
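Convolving with dynamic RIRs can be pictured as a piecewise time-varying convolution: each trajectory step owns a block of the input signal, and the block outputs are overlap-added. The sketch below shows that idea in plain NumPy. It is not torchrir's implementation, which may segment or interpolate between steps differently, and the function name convolve_trajectory is hypothetical.

```python
import numpy as np


def convolve_trajectory(signal, rirs):
    """Piecewise time-varying convolution with overlap-add.

    signal: array of shape (num_samples,)
    rirs:   array of shape (steps, rir_len), one RIR per trajectory step
    Each step's RIR is applied to a contiguous block of the signal, and the
    block outputs (including their reverberant tails) are summed in place.
    """
    steps, rir_len = rirs.shape
    num_samples = signal.shape[0]
    out = np.zeros(num_samples + rir_len - 1)
    # Block boundaries that split the signal into `steps` nearly equal parts.
    edges = np.linspace(0, num_samples, steps + 1).astype(int)
    for t in range(steps):
        start, stop = edges[t], edges[t + 1]
        if stop > start:
            block = np.convolve(signal[start:stop], rirs[t])
            out[start:start + block.shape[0]] += block
    return out


# Sanity check: identical RIRs at every step reduce to ordinary convolution.
rng = np.random.default_rng(0)
signal = rng.standard_normal(1000)
rir = rng.standard_normal(64)
out = convolve_trajectory(signal, np.tile(rir, (24, 1)))
print(np.allclose(out, np.convolve(signal, rir)))  # True
```

With hard block boundaries like these, a larger --steps value gives a finer approximation of continuous motion at the price of computing more RIRs.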
Key arguments
- --steps: number of RIR time steps for the trajectory.
- --order: ISM reflection order.
- --tmax: RIR length in seconds.
- --out-dir: output directory for WAV/metadata/plots/GIFs.
Example runs
uv run python examples/dynamic_src.py --steps 24 --plot
Expected outputs:
- dynamic_src.wav
- dynamic_src_ref01.wav, dynamic_src_ref02.wav, ...
- dynamic_src_metadata.json
- ATTRIBUTION.txt
- dynamic_src_static_2d.png / dynamic_src_dynamic_2d.png
- dynamic_src.gif (and dynamic_src_3d.gif if 3D)
uv run python examples/dynamic_src.py --num-sources 3 --duration 8 --order 10
Expected outputs:
- dynamic_src.wav
- dynamic_src_ref01.wav, dynamic_src_ref02.wav, ...
- dynamic_src_metadata.json
- ATTRIBUTION.txt
Dynamic CMU ARCTIC (fixed sources, moving microphones)
This example keeps sources fixed and moves the microphone array along a linear path.
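A linear microphone path amounts to interpolating the array center between two points and carrying the microphones along rigidly. The following sketch illustrates this with NumPy. The helper names, the start and end points, and the 16 cm binaural spacing are hypothetical values chosen for the example, not defaults of the script.

```python
import numpy as np


def linear_path(start, end, steps):
    """Return `steps` positions evenly spaced from start to end (inclusive)."""
    start = np.asarray(start, dtype=float)
    end = np.asarray(end, dtype=float)
    return np.linspace(start, end, steps)


def array_trajectory(mic_offsets, center_path):
    """Translate a rigid array along a path.

    mic_offsets: (num_mics, dims) positions relative to the array center
    center_path: (steps, dims) array-center positions
    Returns an array of shape (steps, num_mics, dims).
    """
    offsets = np.asarray(mic_offsets, dtype=float)
    return center_path[:, None, :] + offsets[None, :, :]


# Hypothetical binaural pair, 16 cm apart, moving 2 m along x in 20 steps.
path = linear_path([1.0, 2.0, 1.5], [3.0, 2.0, 1.5], steps=20)
mics = array_trajectory([[-0.08, 0.0, 0.0], [0.08, 0.0, 0.0]], path)
print(mics.shape)  # (20, 2, 3)
```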
Key arguments
- --steps: number of RIR time steps for the trajectory.
- --plot: save layout plots and GIFs.
- --out-dir: output directory for WAV/metadata/plots/GIFs.
Example runs
uv run python examples/dynamic_mic.py --steps 20 --plot
Expected outputs:
- dynamic_mic.wav
- dynamic_mic_ref01.wav, dynamic_mic_ref02.wav, ...
- dynamic_mic_metadata.json
- ATTRIBUTION.txt
- dynamic_mic_static_2d.png / dynamic_mic_dynamic_2d.png
- dynamic_mic.gif (and dynamic_mic_3d.gif if 3D)
uv run python examples/dynamic_mic.py --order 12 --tmax 0.6 --device auto
Expected outputs:
- dynamic_mic.wav
- dynamic_mic_ref01.wav, dynamic_mic_ref02.wav, ...
- dynamic_mic_metadata.json
- ATTRIBUTION.txt
Unified CLI (static/dynamic)
The unified CLI wraps the three scenarios above and supports JSON/YAML configuration files.
Key arguments
- --mode: static, dynamic_src, or dynamic_mic.
- --config-in: load settings from JSON/YAML.
- --config-out: write current settings to JSON/YAML.
- --deterministic: enable deterministic kernels (best-effort).
- --out-dir: output directory for WAV/metadata/plots/GIFs.
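One common way to combine --config-in and --config-out with ordinary flags is to treat file values as defaults that explicit flags can still override. The sketch below shows that pattern with argparse and JSON only. It is not the code in examples/cli.py: the precedence rule, the reduced flag set, and the default of 16 steps are assumptions, and YAML support is left out to keep the example standard-library only.

```python
import argparse
import json
import tempfile
from pathlib import Path


def build_parser():
    parser = argparse.ArgumentParser()
    parser.add_argument("--mode", default="static",
                        choices=["static", "dynamic_src", "dynamic_mic"])
    parser.add_argument("--steps", type=int, default=16)  # hypothetical default
    parser.add_argument("--config-in", type=Path, default=None)
    parser.add_argument("--config-out", type=Path, default=None)
    return parser


def resolve_settings(argv):
    """Merge settings with precedence: defaults < config file < explicit flags."""
    parser = build_parser()
    args = parser.parse_args(argv)
    if args.config_in is not None:
        # Values from the file become the new defaults, then argv is re-parsed
        # so that flags given on the command line still win.
        parser.set_defaults(**json.loads(args.config_in.read_text()))
        args = parser.parse_args(argv)
    settings = {k: v for k, v in vars(args).items()
                if k not in ("config_in", "config_out")}
    if args.config_out is not None:
        args.config_out.write_text(json.dumps(settings, indent=2))
    return settings


# Round trip: save a configuration, reload it, then override one value.
with tempfile.TemporaryDirectory() as tmp:
    cfg = Path(tmp) / "run.json"
    saved = resolve_settings(["--mode", "dynamic_src", "--steps", "24",
                              "--config-out", str(cfg)])
    loaded = resolve_settings(["--config-in", str(cfg)])
    overridden = resolve_settings(["--config-in", str(cfg), "--steps", "8"])

print(loaded)      # {'mode': 'dynamic_src', 'steps': 24}
print(overridden)  # {'mode': 'dynamic_src', 'steps': 8}
```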
Example runs
uv run python examples/cli.py --mode static --plot
Expected outputs:
- static_binaural.wav
- static_binaural_metadata.json
- ATTRIBUTION.txt
- static_static_2d.png (and 3D variant if room is 3D)
uv run python examples/cli.py --mode dynamic_src --gif --steps 24
Expected outputs:
- dynamic_src_binaural.wav
- dynamic_src_binaural_metadata.json
- ATTRIBUTION.txt
- dynamic_src.gif (and 3D variant if room is 3D)
Benchmark (CPU vs GPU)
This script benchmarks static ISM and, optionally, dynamic trajectory simulation.
Key arguments
- --repeats: number of iterations to average.
- --gpu: cuda, mps, or auto.
- --dynamic: benchmark dynamic trajectory path as well.
Note
CUDA paths are validated in CI on CUDA runners. Runtime and numerical behavior still depend on your local CUDA/PyTorch environment.
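The averaging behind --repeats can be sketched with the standard library alone. The helpers below are hypothetical and only imitate the general shape of the log lines in this section. They are not taken from examples/benchmark_device.py, and the warm-up run is an assumption.

```python
import time


def average_ms(fn, repeats, warmup=1):
    """Return the mean wall-clock time of fn() in milliseconds over `repeats` runs."""
    for _ in range(warmup):
        fn()  # warm-up runs are excluded from the average
    start = time.perf_counter()
    for _ in range(repeats):
        fn()
    return (time.perf_counter() - start) * 1000.0 / repeats


def report(cpu_ms, device_ms, device="gpu"):
    """Format CPU time, device time, and their ratio as three log lines."""
    return (f"cpu avg: {cpu_ms:.2f} ms\n"
            f"{device} avg: {device_ms:.2f} ms\n"
            f"speedup: {cpu_ms / device_ms:.2f}x")


print(report(10.0, 2.5, device="cuda"))
# cpu avg: 10.00 ms
# cuda avg: 2.50 ms
# speedup: 4.00x
```

When timing GPU work this way, fn should block until the device has finished (for example by synchronizing inside it), otherwise asynchronous kernel launches make the device look faster than it is.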
Example runs
uv run python examples/benchmark_device.py --repeats 10 --gpu auto
Expected output (logs):
cpu avg: ... ms
<device> avg: ... ms
speedup: ...x
uv run python examples/benchmark_device.py --dynamic --repeats 5 --gpu mps
Expected output (logs):
cpu dynamic avg: ... ms
mps dynamic avg: ... ms
speedup: ...x
Dynamic dataset builder (fixed room, fixed mic, moving sources)
This example generates a small dynamic dataset inspired by Cross3D: the room and mic array are fixed, while source positions and trajectories are randomized per scene. Each scene produces a convolved mixture and metadata. You can choose CMU ARCTIC or LibriSpeech from the command line.
What it does
- Uses CMU ARCTIC or LibriSpeech utterances as source signals.
- Samples random source trajectories (linear or zigzag) within a fixed room.
- Keeps the microphone array fixed across all scenes.
- Simulates dynamic RIRs and convolves the sources.
- Saves mixture + per-source reference audio, plus JSON metadata (and plots/GIFs if enabled).
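The trajectory sampling step above can be sketched as follows: pick random waypoints inside the room and interpolate through them, using two waypoints for a linear path and a few more for a zigzag. This is a minimal illustration, not the sampler in examples/build_dynamic_dataset.py. The wall margin, the number of turns, and the function name are assumptions.

```python
import numpy as np


def sample_trajectory(rng, room, steps, kind="linear", margin=0.5, num_turns=3):
    """Sample a source trajectory of shape (steps, dims) inside a shoebox room.

    room:   room size, e.g. (Lx, Ly, Lz)
    margin: minimum distance kept from every wall (hypothetical parameter)
    kind:   "linear" interpolates between two random points; "zigzag"
            interpolates piecewise through num_turns extra random waypoints.
    """
    room = np.asarray(room, dtype=float)
    num_points = 2 if kind == "linear" else 2 + num_turns
    waypoints = rng.uniform(margin, room - margin, size=(num_points, room.size))
    # Parameterize waypoints on [0, 1] and interpolate each coordinate.
    knots = np.linspace(0.0, 1.0, num_points)
    t = np.linspace(0.0, 1.0, steps)
    return np.stack(
        [np.interp(t, knots, waypoints[:, d]) for d in range(room.size)],
        axis=1,
    )


rng = np.random.default_rng(0)  # a fixed seed makes the scene reproducible
traj = sample_trajectory(rng, room=(6.0, 4.0, 3.0), steps=24, kind="zigzag")
print(traj.shape)  # (24, 3)
```

Because the room is convex, every interpolated point stays inside the margin-shrunk room as long as the waypoints do.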
Output files
For each scene index k:
- scene_k.wav: multi-microphone mixture
- scene_k_refXX.wav: per-source reference audio after RIR convolution (premix)
- scene_k_metadata.json: room size, trajectories, DOA, array attributes, etc.
- ATTRIBUTION.txt: dataset attribution and redistribution note for the run
- scene_k_static_2d.png / scene_k_dynamic_2d.png: layout plots (3D variants are saved when the room is 3D)
- scene_k.gif: animation (and scene_k_3d.gif when the room is 3D)
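When checking a finished run, it can help to enumerate the files a scene should have produced. The helper below is a hypothetical sketch. It assumes three-digit zero padding for the scene index (as in the scene_000.wav names shown in the last example of this section), reference files numbered from 01, and a 2D room for the plot names.

```python
from pathlib import Path


def expected_scene_files(out_dir, k, num_sources, plot=False):
    """List the per-scene paths described above for scene index k (2D room)."""
    out_dir = Path(out_dir)
    stem = f"scene_{k:03d}"
    files = [out_dir / f"{stem}.wav"]
    files += [out_dir / f"{stem}_ref{i:02d}.wav"
              for i in range(1, num_sources + 1)]
    files.append(out_dir / f"{stem}_metadata.json")
    if plot:
        files += [out_dir / f"{stem}_static_2d.png",
                  out_dir / f"{stem}_dynamic_2d.png",
                  out_dir / f"{stem}.gif"]
    return files


# Report anything missing for the first scene of a two-source run.
expected = expected_scene_files("outputs/ds_small", k=0, num_sources=2)
missing = [p for p in expected if not p.exists()]
print([p.name for p in expected])
```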
Run (CMU ARCTIC)
uv run python examples/build_dynamic_dataset.py \
--dataset cmu_arctic \
--num-scenes 4 \
--num-sources 2 \
--duration 6
Run (LibriSpeech)
uv run python examples/build_dynamic_dataset.py \
--dataset librispeech \
--subset train-clean-100 \
--num-scenes 4 \
--num-sources 2 \
--duration 6
Additional examples
# CMU ARCTIC: only 1 moving source, plotting enabled
uv run python examples/build_dynamic_dataset.py \
--dataset cmu_arctic \
--num-scenes 2 \
--num-sources 3 \
--num-moving-sources 1 \
--plot
# LibriSpeech: more steps, fewer scenes
uv run python examples/build_dynamic_dataset.py \
--dataset librispeech \
--subset dev-clean \
--num-scenes 2 \
--num-sources 2 \
--steps 96
Key arguments
- --dataset: dataset backend (cmu_arctic/librispeech).
- --subset: LibriSpeech subset (e.g., train-clean-100).
- --num-scenes: number of scenes to generate.
- --num-sources: number of sources per scene.
- --num-moving-sources: number of sources that move (others stay fixed).
- --num-mics: number of microphones in the fixed array.
- --duration: target duration (seconds) of each source signal.
- --steps: number of RIR steps (trajectory resolution).
- --order: ISM reflection order.
- --tmax: RIR length in seconds.
- --seed: RNG seed for reproducibility.
- --dataset-dir: dataset root path.
- --out-dir: output directory for per-scene WAV/JSON/plots/GIFs.
- --plot: enable plotting + GIFs (default: off).
- --download: explicitly request dataset download when files are missing (the script also retries with download enabled after a missing-data error).
- --device: cpu/cuda/mps/auto.
Dataset option validity and error handling
- --dataset accepts only cmu_arctic or librispeech (argparse choices).
- --subset is only used for librispeech; unsupported values raise ValueError in LibriSpeechDataset.
- --dataset-dir is the dataset root passed to the loader.
  - CMU ARCTIC expects ARCTIC/cmu_us_<speaker>_arctic/... under that root.
  - LibriSpeech expects LibriSpeech/<subset>/<speaker>/<chapter>/... under that root.
- --download is optional for this script:
  - If files are missing and --download is not set, the script retries once with download enabled.
  - For strict offline runs, pre-populate --dataset-dir and ensure all files exist before execution.
- In LibriSpeech mode, malformed utterance IDs passed to load_audio (not in speaker-chapter-utterance format) raise ValueError.
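The utterance ID check and the directory layout described above can be illustrated with a small path helper. This is a sketch, not the code in LibriSpeechDataset: the function name, the regular expression, and the .flac file name (taken from the standard LibriSpeech layout) are assumptions about details the text does not spell out.

```python
import re
from pathlib import Path

# LibriSpeech utterance IDs look like "1272-128104-0000"
# (speaker-chapter-utterance).
_UTT_RE = re.compile(r"^(\d+)-(\d+)-(\d+)$")


def utterance_path(root, subset, utterance_id):
    """Map an utterance ID to its expected FLAC path under the dataset root.

    Raises ValueError for IDs not in speaker-chapter-utterance format.
    """
    match = _UTT_RE.match(utterance_id)
    if match is None:
        raise ValueError(f"malformed utterance id: {utterance_id!r}")
    speaker, chapter, _ = match.groups()
    return (Path(root) / "LibriSpeech" / subset / speaker / chapter
            / f"{utterance_id}.flac")


path = utterance_path("data", "dev-clean", "1272-128104-0000")
print(path.as_posix())
# data/LibriSpeech/dev-clean/1272/128104/1272-128104-0000.flac
```

Failing early on a malformed ID keeps the error close to its cause instead of surfacing later as a missing-file error.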
Note
cuda is available and validated in CI. Actual runtime behavior still depends
on your local CUDA/PyTorch environment.
Implementation notes
The example is implemented in examples/build_dynamic_dataset.py and uses:
- torchrir.datasets.load_dataset_sources to build fixed-length signals from multiple utterances.
- torchrir.sim.simulate_dynamic_rir to generate the dynamic RIR sequence.
- torchrir.signal.DynamicConvolver(mode="trajectory") to produce the final mixture.
- save_scene_audio + save_scene_metadata to store scene audio and metadata (kept as separate calls). Metadata includes a reference_audio list describing the saved scene_k_refXX.wav files (each entry corresponds to a single source convolved with its dynamic RIR), plus dataset_license and modifications fields for attribution tracking.
Additional example
uv run python examples/build_dynamic_dataset.py --dataset cmu_arctic --num-scenes 2 --out-dir outputs/ds_small
Expected outputs:
- outputs/ds_small/scene_000.wav, scene_001.wav
- outputs/ds_small/scene_000_metadata.json, scene_001_metadata.json
- outputs/ds_small/ATTRIBUTION.txt