OpenL3 is an open-source Python library for computing deep audio and image embeddings.
CREPE Pitch Tracker
CREPE is a monophonic pitch tracker written in Python, based on a deep convolutional neural network that operates directly on the time-domain waveform.
A Python library for soundscape synthesis and augmentation.
A Python library for musical data augmentation. The goal of this package is to make it easy for practitioners to consistently apply perturbations to annotated music data for the purpose of fitting statistical models.
A JSON Annotated Music Specification for Reproducible MIR Research.
BirdVoxDetect is a pre-trained deep learning system that detects songbird flight calls in audio recordings and identifies the corresponding species.
librosa is a python package for music and audio analysis. It provides the building blocks necessary to create music information retrieval systems.
ScanIR is a MATLAB application for flexible multichannel acoustic impulse response measurement, intended for public distribution.
Kymatio is an implementation of the wavelet scattering transform in the Python programming language, suitable for large-scale numerical experiments in signal processing and machine learning.
This library provides tools for working with common Music Information Retrieval (MIR) datasets.
Companion code for "Deep Salience Representations for F0 Estimation in Polyphonic Music."
MedleyDB is a dataset of annotated, royalty-free multitrack recordings for noncommercial and academic research.
Head-Related Impulse Responses Repository
This HRIR repository consists of 113 datasets from four publicly available HRTF databases: LISTEN, CIPIC, FIU, and MIT-KEMAR.
HMDiR HRTF dataset
The Head-Mounted-Display acoustic Impulse Responses dataset comprises HRIR measurements for 1,200 locations, collected on a Neumann KU-100 mannequin fitted with a variety of HMDs used for virtual, augmented, and mixed reality.
CityTones Soundfield Repository
The CityTones project is a collaborative open-source repository that consists of 360 audio/visual environmental recordings for both creative and academic purposes.
This repository contains companion source code for working with the OpenMIC-2018 dataset, a collection of audio and crowd-sourced instrument labels.
A dataset for guitar transcription
The BirdVox-DCASE-20k dataset contains 20,000 ten-second audio recordings, collected by ROBIN autonomous recording units placed near Ithaca, NY, USA, during the fall of 2015.
This dataset contains 8732 labeled sound excerpts (<=4s) of urban sounds.