IS426_Audio2Spectrogram

Processes audio into usable spectrograms for classification

The pipeline runs in several processing steps, described below.

The program should ask which classes of data you are going to process. At the moment it processes only one large audio clip, but later versions will process all the files within a directory.

The program creates an empty folder for each class and asks you to select the files on your computer that correspond to that class. (Currently only one class at a time; multiclass processing is planned for the next version.)

These files will be copied into the "./data" folder in the root directory of the program.
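The class-folder and copy step could be sketched roughly as follows. This is only an illustration: `ingest_class_files` and its parameters are hypothetical names, not functions from this repository, and the file-selection dialog is replaced by a plain list of paths.

```python
import shutil
from pathlib import Path


def ingest_class_files(class_name, source_files, data_root="./data"):
    """Create the class folder under data_root and copy the chosen files into it.

    Hypothetical helper: the name and signature are assumptions for illustration.
    """
    class_dir = Path(data_root) / class_name
    class_dir.mkdir(parents=True, exist_ok=True)  # the "empty folder" for this class

    copied = []
    for src in source_files:
        dest = class_dir / Path(src).name
        shutil.copy2(src, dest)  # copy, leaving the original file untouched
        copied.append(dest)
    return copied
```

Calling `ingest_class_files("music", ["steve lacy mono.wav"])` would, under these assumptions, leave a copy at `./data/music/steve lacy mono.wav`.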

Now that you have the audio files, a function (waveformIngestion()) takes in:

- the sample rate of the audio data (for example 44100 or 48000 Hz), which determines the array size for the audio
- the URL of each audio file
- the "chunk" size that you want to break the audio down into

This will be applied to all of the audio clips.
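To make the relationship between sample rate, chunk size, and array size concrete, here is a minimal sketch. `split_into_chunks` is a hypothetical stand-in, not the repository's waveformIngestion(), and it assumes the chunk size is given in seconds and that a trailing partial chunk is dropped so every chunk has the same length.

```python
def split_into_chunks(samples, sample_rate, chunk_seconds):
    """Split a sequence of samples into equal-length chunks.

    Assumptions (not confirmed by the repository): chunk size is in seconds,
    and a final chunk shorter than the others is discarded.
    """
    chunk_len = int(sample_rate * chunk_seconds)  # samples per chunk
    if chunk_len < 1:
        raise ValueError("chunk_seconds is too small for this sample rate")

    n_full = len(samples) // chunk_len  # number of complete chunks
    return [samples[i * chunk_len:(i + 1) * chunk_len] for i in range(n_full)]
```

At 44100 Hz, a one-second chunk holds 44100 samples, so a 3.5-second clip yields three chunks under these assumptions.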

Next, the function autogenerates a folder structure. It takes each audio clip and splits it into chunks, so the number of chunks is proportional to the total length of the clip. Each audio clip spawns one folder that contains all of its chunks.

This means that each class folder will contain a large number of folders, one for each example that exists for that particular class.
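The one-folder-per-clip layout might look like the sketch below, which uses only the standard-library `wave` module. The function name, the chunk file naming scheme, and the choice to keep a shorter final chunk are all assumptions made for illustration.

```python
import wave
from pathlib import Path


def write_chunk_folder(clip_path, class_dir, chunk_seconds):
    """Split one WAV clip into chunk files stored in a folder named after the clip.

    Hypothetical helper; the naming scheme "<clip>_<index>.wav" is an assumption.
    """
    clip_path = Path(clip_path)
    out_dir = Path(class_dir) / clip_path.stem  # one folder per audio clip
    out_dir.mkdir(parents=True, exist_ok=True)

    written = []
    with wave.open(str(clip_path), "rb") as src:
        params = src.getparams()
        frames_per_chunk = int(src.getframerate() * chunk_seconds)
        if frames_per_chunk < 1:
            raise ValueError("chunk_seconds is too small for this sample rate")

        index = 0
        while True:
            frames = src.readframes(frames_per_chunk)
            if not frames:  # end of the clip
                break
            chunk_path = out_dir / f"{clip_path.stem}_{index:04d}.wav"
            with wave.open(str(chunk_path), "wb") as dst:
                dst.setparams(params)  # frame count in the header is corrected on close
                dst.writeframes(frames)
            written.append(chunk_path)
            index += 1
    return written
```

In this version the last chunk is kept even if it is shorter than the rest; dropping it instead would give every example the same duration.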

This repository processes a test audio file: Steve Lacy's song "Bad Habits".

A copy of this folder structure is then used to hold the spectrogram generated from each chunk. Each chunk becomes a mel power-law spectrogram.
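As a rough illustration of what such a spectrogram contains, the sketch below builds one in pure Python. It assumes "mel power-law" means a power spectrogram passed through a triangular mel filterbank and then compressed with a power-law exponent; the exponent of 0.3, the frame sizes, and all function names are assumptions, and the repository's own script may well rely on an audio library instead. The naive DFT is slow and is only meant for short test signals.

```python
import cmath
import math


def hz_to_mel(hz):
    return 2595.0 * math.log10(1.0 + hz / 700.0)


def mel_to_hz(mel):
    return 700.0 * (10.0 ** (mel / 2595.0) - 1.0)


def power_spectrum(frame):
    """Power spectrum (bins 0..N/2) of one Hann-windowed frame via a naive DFT."""
    n = len(frame)
    windowed = [x * (0.5 - 0.5 * math.cos(2 * math.pi * i / n))
                for i, x in enumerate(frame)]
    spectrum = []
    for k in range(n // 2 + 1):
        acc = sum(x * cmath.exp(-2j * math.pi * k * i / n)
                  for i, x in enumerate(windowed))
        spectrum.append(abs(acc) ** 2)
    return spectrum


def mel_filterbank(n_mels, n_fft, sample_rate):
    """Triangular filters whose edges are evenly spaced on the mel scale."""
    top = hz_to_mel(sample_rate / 2)
    hz_points = [mel_to_hz(top * i / (n_mels + 1)) for i in range(n_mels + 2)]
    bin_hz = [k * sample_rate / n_fft for k in range(n_fft // 2 + 1)]
    bank = []
    for m in range(1, n_mels + 1):
        lo, centre, hi = hz_points[m - 1], hz_points[m], hz_points[m + 1]
        bank.append([max(0.0, min((f - lo) / (centre - lo),
                                  (hi - f) / (hi - centre)))
                     for f in bin_hz])
    return bank


def mel_power_law_spectrogram(samples, sample_rate, n_fft=256, hop=128,
                              n_mels=16, exponent=0.3):
    """Return one list of n_mels compressed band energies per time frame."""
    bank = mel_filterbank(n_mels, n_fft, sample_rate)
    columns = []
    for start in range(0, len(samples) - n_fft + 1, hop):
        power = power_spectrum(samples[start:start + n_fft])
        columns.append([sum(w * p for w, p in zip(row, power)) ** exponent
                        for row in bank])
    return columns
```

Each returned column corresponds to one time frame, so stacking the columns side by side gives the image that would be saved for a chunk.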

These spectrograms, labeled by the folder they sit in, can be used to train a machine learning classification model built on CNNs (convolutional neural networks).
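Labeling by folder can be sketched as a walk over the data directory that pairs each image with an integer label for its class folder. The function name, the ".png" suffix, and the layout "data/<class>/<clip>/<chunk image>" are assumptions for illustration; the resulting list could then be fed to whichever CNN framework is used for training.

```python
from pathlib import Path


def collect_labeled_images(data_root, suffix=".png"):
    """Return (image_path, label) pairs plus the ordered list of class names.

    Assumes each top-level folder under data_root is one class and that the
    spectrogram images sit somewhere below it (for example one folder per clip).
    """
    root = Path(data_root)
    class_names = sorted(p.name for p in root.iterdir() if p.is_dir())
    label_of = {name: index for index, name in enumerate(class_names)}

    examples = []
    for name in class_names:
        for image in sorted((root / name).rglob(f"*{suffix}")):
            examples.append((image, label_of[name]))
    return examples, class_names
```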

Presto, we have done dimensionality reduction on audio files, turning them into an image format from which features can be extracted and learned using the advancements in computer vision, without any specialized domain knowledge!

