Core IO and DSP
Audio processing
load(path[, sr, mono, offset, duration, dtype]) |
Load an audio file as a floating point time series. |
to_mono(y) |
Force an audio signal down to mono. |
resample(y, orig_sr, target_sr[, res_type, ...]) |
Resample a time series from orig_sr to target_sr |
get_duration([y, sr, S, n_fft, hop_length, ...]) |
Compute the duration (in seconds) of an audio time series or STFT matrix. |
autocorrelate(y[, max_size, axis]) |
Bounded auto-correlation |
zero_crossings(y[, threshold, ...]) |
Find the zero-crossings of a signal y: indices i such that sign(y[i]) != sign(y[i-1]). |
clicks([times, frames, sr, hop_length, ...]) |
Return a signal with the click signal placed at each specified time. |
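To make the zero-crossing definition above concrete, here is a minimal pure-Python sketch. The helper names `sign` and `zero_crossing_indices` are hypothetical and are not part of the library; the library's own zero_crossings takes further options (such as threshold) that this sketch ignores, and its return type may differ.

```python
def sign(x):
    """Return -1, 0, or 1 according to the sign of x."""
    return (x > 0) - (x < 0)


def zero_crossing_indices(y):
    """Hypothetical helper: indices i with sign(y[i]) != sign(y[i-1]).

    Assumption: an exact zero has its own sign (0), so moving from a
    positive sample to 0.0 counts as a crossing in this sketch.
    """
    return [i for i in range(1, len(y)) if sign(y[i]) != sign(y[i - 1])]


if __name__ == "__main__":
    y = [1.0, 0.5, -0.5, -1.0, 2.0]
    print(zero_crossing_indices(y))  # prints [2, 4]
```

Index 2 is reported because the signal goes from 0.5 to -0.5, and index 4 because it goes from -1.0 to 2.0.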
Spectral representations
stft(y[, n_fft, hop_length, win_length, ...]) |
Short-time Fourier transform (STFT) |
istft(stft_matrix[, hop_length, win_length, ...]) |
Inverse short-time Fourier transform (ISTFT). |
ifgram(y[, sr, n_fft, hop_length, ...]) |
Compute the instantaneous frequency (as a proportion of the sampling rate) obtained as the time-derivative of the phase of the complex spectrum as described by [R3]. |
cqt(y[, sr, hop_length, fmin, n_bins, ...]) |
Compute the constant-Q transform of an audio signal. |
hybrid_cqt(y[, sr, hop_length, fmin, ...]) |
Compute the hybrid constant-Q transform of an audio signal. |
pseudo_cqt(y[, sr, hop_length, fmin, ...]) |
Compute the pseudo constant-Q transform of an audio signal. |
fmt(y[, t_min, n_fmt, kind, beta, ...]) |
The fast Mellin transform (FMT) [R5] of a uniformly sampled signal y. |
phase_vocoder(D, rate[, hop_length]) |
Phase vocoder. |
magphase(D) |
Separate a complex-valued spectrogram D into its magnitude (S) and phase (P) components, so that D = S * P. |
logamplitude(S[, ref_power, amin, top_db]) |
Log-scale the amplitude of a spectrogram. |
perceptual_weighting(S, frequencies, **kwargs) |
Perceptual weighting of a power spectrogram. |
A_weighting(frequencies[, min_db]) |
Compute the A-weighting of a set of frequencies. |
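The magphase entry above states the identity D = S * P. A minimal sketch of that decomposition, using only the standard library and nested lists in place of arrays, might look as follows. The name `magphase_sketch` is hypothetical; this is an illustration of the identity, not the library's implementation.

```python
import cmath


def magphase_sketch(D):
    """Split a complex matrix D (list of rows) into magnitude S and phase P.

    S holds non-negative reals and P holds unit-modulus complex numbers,
    so that D[i][j] == S[i][j] * P[i][j] up to rounding error.
    """
    S = [[abs(d) for d in row] for row in D]
    # cmath.rect(1.0, phi) builds the unit phasor exp(1j * phi).
    P = [[cmath.rect(1.0, cmath.phase(d)) for d in row] for row in D]
    return S, P


if __name__ == "__main__":
    D = [[3 + 4j, -2j], [0j, 1 + 0j]]
    S, P = magphase_sketch(D)
    for row_d, row_s, row_p in zip(D, S, P):
        for d, s, p in zip(row_d, row_s, row_p):
            # Reconstruction error should be at rounding level.
            print(d, abs(s * p - d))
```

A zero entry has an undefined phase; in this sketch it gets phase 0 (so P = 1), which still satisfies D = S * P because S is 0 there.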
Time and frequency conversion
frames_to_samples(frames[, hop_length, n_fft]) |
Converts frame indices to audio sample indices |
frames_to_time(frames[, sr, hop_length, n_fft]) |
Converts frame counts to time (seconds) |
samples_to_frames(samples[, hop_length, n_fft]) |
Converts sample indices into STFT frames. |
samples_to_time(samples[, sr]) |
Convert sample indices to time (in seconds). |
time_to_frames(times[, sr, hop_length, n_fft]) |
Converts time stamps into STFT frames. |
time_to_samples(times[, sr]) |
Convert timestamps (in seconds) to sample indices. |
hz_to_note(frequencies, **kwargs) |
Convert one or more frequencies (in Hz) to the nearest note names. |
hz_to_midi(frequencies) |
Get the closest MIDI note number(s) for given frequencies |
midi_to_hz(notes) |
Get the frequency (Hz) of MIDI note(s) |
midi_to_note(midi[, octave, cents]) |
Convert one or more MIDI numbers to note strings. |
note_to_hz(note, **kwargs) |
Convert one or more note names to frequency (Hz) |
note_to_midi(note[, round_midi]) |
Convert one or more spelled notes to MIDI number(s). |
hz_to_mel(frequencies[, htk]) |
Convert Hz to Mels |
hz_to_octs(frequencies[, A440]) |
Convert frequencies (Hz) to (fractional) octave numbers. |
mel_to_hz(mels[, htk]) |
Convert mel bin numbers to frequencies |
octs_to_hz(octs[, A440]) |
Convert octave numbers to frequencies. |
fft_frequencies([sr, n_fft]) |
Alternative implementation of np.fft.fftfreq |
cqt_frequencies(n_bins, fmin[, ...]) |
Compute the center frequencies of Constant-Q bins. |
mel_frequencies([n_mels, fmin, fmax, htk]) |
Compute the center frequencies of mel bands. |
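The arithmetic behind several of these conversions is short enough to sketch in plain Python. The functions below are hypothetical stand-ins (note the `_sketch` suffix), not the library's code: they take scalars only, ignore the optional n_fft offset, and require sr and hop_length to be passed explicitly. The Hz/MIDI formulas assume the usual convention that MIDI note 69 is A4 at 440 Hz.

```python
import math


def frames_to_samples_sketch(frame, hop_length):
    """Frame index -> sample index (no n_fft offset applied)."""
    return int(frame) * hop_length


def samples_to_time_sketch(sample, sr):
    """Sample index -> time in seconds."""
    return sample / sr


def frames_to_time_sketch(frame, sr, hop_length):
    """Frame index -> time in seconds, via the sample index."""
    return samples_to_time_sketch(frames_to_samples_sketch(frame, hop_length), sr)


def time_to_samples_sketch(time, sr):
    """Time in seconds -> sample index (truncated to an integer)."""
    return int(time * sr)


def hz_to_midi_sketch(frequency):
    """Frequency in Hz -> fractional MIDI note number (A4 = 69 = 440 Hz)."""
    return 69.0 + 12.0 * math.log2(frequency / 440.0)


def midi_to_hz_sketch(note):
    """MIDI note number -> frequency in Hz."""
    return 440.0 * 2.0 ** ((note - 69.0) / 12.0)


if __name__ == "__main__":
    print(frames_to_time_sketch(43, sr=22050, hop_length=512))
    print(hz_to_midi_sketch(880.0))   # prints 81.0
    print(midi_to_hz_sketch(69))      # prints 440.0
```

Rounding the result of `hz_to_midi_sketch` to the nearest integer gives the closest MIDI note number.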
Pitch and tuning
estimate_tuning([y, sr, S, n_fft, ...]) |
Estimate the tuning of an audio time series or spectrogram input. |
pitch_tuning(frequencies[, resolution, ...]) |
Given a collection of pitches, estimate its tuning offset (in fractions of a bin) relative to A440 = 440.0 Hz. |
piptrack([y, sr, S, n_fft, hop_length, ...]) |
Pitch tracking on thresholded parabolically-interpolated STFT |
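One way to read "tuning offset in fractions of a bin" is as the typical deviation of each pitch from its nearest semitone. The sketch below is a simplified illustration under that assumption: it histograms the deviations and returns the left edge of the most populated histogram cell. The function name, its defaults (`resolution=0.01`, `bins_per_octave=12`), and the histogram-mode strategy are assumptions made for this example and should not be taken as the library's exact procedure.

```python
import math
from collections import Counter


def tuning_offset_sketch(frequencies, resolution=0.01, bins_per_octave=12):
    """Estimate a tuning offset in [-0.5, 0.5] bins relative to A440.

    Each positive frequency is mapped to a fractional bin number; the
    distance to the nearest integer bin is its residual. The estimate is
    the left edge of the histogram cell containing the most residuals.
    """
    residuals = []
    for f in frequencies:
        if f <= 0:
            continue  # skip unvoiced or invalid entries
        bins = bins_per_octave * math.log2(f / 440.0)
        residuals.append(bins - round(bins))
    if not residuals:
        return 0.0  # assumption: no evidence means no offset
    counts = Counter(math.floor((r + 0.5) / resolution) for r in residuals)
    best_cell, _ = counts.most_common(1)[0]
    return -0.5 + best_cell * resolution


if __name__ == "__main__":
    # Three pitches about a quarter of a semitone sharp, plus one in tune.
    sharp_a4 = 440.0 * 2.0 ** (0.255 / 12.0)
    print(tuning_offset_sketch([sharp_a4 / 2, sharp_a4, sharp_a4 * 2, 440.0]))
```

Taking the histogram mode rather than the mean keeps a few outlying pitches from pulling the estimate away from the dominant tuning.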