Notebooks to investigate data set bias in audio embeddings

changhongw/audio-embedding-bias

Bias Correction with Pre-trained Audio Embeddings

Implementation of the bias correction methods for pre-trained audio embeddings proposed in the following paper:

Changhong Wang, Gaël Richard, and Brian McFee. "Transfer Learning and Bias Correction with Pre-trained Audio Embeddings". Proceedings of the International Society for Music Information Retrieval (ISMIR) Conference, 2023.

(Note: deem in deem.py is short for "debiasing embeddings".)

Installation

We recommend using a Conda environment:

git clone https://github.com/changhongw/audio-embedding-bias.git
cd audio-embedding-bias
conda env create -f environment.yml
conda activate embedding-bias

Datasets

Download the IRMAS and OpenMIC datasets and save them in the directories data/irmas and data/openmic-2018, respectively.
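Before running the notebooks, it can help to confirm that both dataset directories are in place. The following is a minimal sketch, not part of the repository; the helper name `missing_dataset_dirs` is hypothetical, and it only checks that the two directories named above exist, not that their contents are complete.

```python
from pathlib import Path

# Directory names taken from the instructions above.
EXPECTED_DIRS = ["data/irmas", "data/openmic-2018"]


def missing_dataset_dirs(root="."):
    """Return the expected dataset directories that do not exist under root."""
    root = Path(root)
    return [name for name in EXPECTED_DIRS if not (root / name).is_dir()]


if __name__ == "__main__":
    missing = missing_dataset_dirs(".")
    if missing:
        print("Missing dataset directories:", ", ".join(missing))
    else:
        print("Both dataset directories found.")
```

Run it from the repository root; an empty list means both directories were found.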

Pre-trained embeddings

Extract VGGish, OpenL3, and YAMNet embeddings for both datasets, or use our pre-extracted embeddings directly.

Bias correction

Run the notebooks in the notebooks directory:

  • 0_data_distribution.ipynb: examine each dataset's genre distribution and number of samples per class
  • 1_debias_linear.ipynb: linear bias correction (original, LDA, mLDA)
  • 2_debias_nonlinear.ipynb: nonlinear bias correction (K, KLDA, mKLDA)
  • 3_cosine_similarity.ipynb: calculate cosine similarity between dataset separation and instrument classification; check matrix rank for the case of multiple bias correction
  • 4_result_summary.ipynb: summarize results from all bias correction methods
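To make the idea behind the linear notebooks concrete, here is a minimal NumPy sketch of LDA-style bias correction on synthetic data. It is not the code in deem.py. It assumes the correction amounts to estimating a dataset-separation direction with two-class Fisher LDA and projecting that direction out of the embeddings. The function names, the ridge term, the synthetic embeddings, and the classifier weight vector `v` are all illustrative assumptions.

```python
import numpy as np


def lda_direction(X, d):
    """Unit-norm Fisher LDA direction separating two datasets.

    X: (N, D) embeddings; d: (N,) dataset labels in {0, 1}.
    """
    X0, X1 = X[d == 0], X[d == 1]
    mu0, mu1 = X0.mean(axis=0), X1.mean(axis=0)
    # Pooled within-class scatter, plus a small ridge term (an assumption
    # made here for numerical stability).
    Sw = (X0 - mu0).T @ (X0 - mu0) + (X1 - mu1).T @ (X1 - mu1)
    Sw += 1e-6 * np.eye(X.shape[1])
    w = np.linalg.solve(Sw, mu1 - mu0)
    return w / np.linalg.norm(w)


def project_out(X, w):
    """Remove the component of each row of X along the unit vector w."""
    return X - np.outer(X @ w, w)


def cosine_similarity(a, b):
    """Cosine of the angle between vectors a and b."""
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))


# Synthetic stand-in for embeddings from two datasets (not real data).
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 8))
d = np.repeat([0, 1], 100)
X[d == 1, 0] += 3.0  # artificial dataset offset along the first axis

w = lda_direction(X, d)
X_debiased = project_out(X, w)

# Hypothetical instrument-classifier weight vector, for illustration only.
v = np.zeros(8)
v[1] = 1.0
sim = cosine_similarity(w, v)

# When several directions are stacked, their rank shows how many are
# linearly independent.
W = np.stack([w, v])
rank = np.linalg.matrix_rank(W)
```

After projection, the debiased embeddings have no component along `w`, so a linear classifier can no longer use that direction to tell the two datasets apart. The cosine similarity and rank computations mirror, in spirit, the checks described for 3_cosine_similarity.ipynb.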

Note

Thanks to Jayeon Yi, we noticed two typos in the paper concerning the dimensionality of $W$ and $U$ in Equation (3). The corrections are as follows:

  • $W\in\mathbb{R}^{D\times G}$ -> $W\in\mathbb{R}^{G\times D}$
  • $U\in\mathbb{R}^{D\times G}$ -> $U\in\mathbb{R}^{G\times G}$

Contact

For any questions, support, or inquiries, please feel free to contact changhong.wang@telecom-paris.fr.

Cite

Please cite the following paper if you use the code provided in this repository.

Changhong Wang, Gaël Richard, and Brian McFee. "Transfer Learning and Bias Correction with Pre-trained Audio Embeddings". Proceedings of the International Society for Music Information Retrieval (ISMIR) Conference, 2023.

@inproceedings{wang2023transfer,
    author = {Changhong Wang and Gaël Richard and Brian McFee},
    title = {Transfer Learning and Bias Correction with Pre-trained Audio Embeddings},
    booktitle = {Proceedings of the International Society for Music Information Retrieval (ISMIR) Conference},
    year = 2023,
}
