This page will provide a growing list of questions.

## Chapter 2

**Periodic Signals** Do you know the basics? Take a quick test!

**Probability Density Function** What is the advantage of describing signals with their PDF? What information is lost?

**Fourier Series** Have you understood the fundamental properties of the Fourier Series? Take the test!

**Quantization** Draw the shape of the PDF of the quantization error (assuming the signal variance is much higher than the quantizer step size). Label the axes and annotate the x and y values.

**Sampling** What is the maximum bandwidth of a signal sampled at a rate of 24 kHz?

**Sampling** A sinusoid is sampled at a sampling frequency of 6 kHz. Which of the following frequencies will produce a 1 kHz sinusoid in the reconstructed baseband (starting at 0 Hz)?

- 4 kHz
- 7 kHz
- 15 kHz
- 5 kHz

**Sampling** Why is the sampling rate for speech lower, in general, than the sampling rate for music?

- On average, the bandwidth for speech signals is less than the bandwidth for musical signals.
- Speech is monophonic (single-voiced).
- Speech has a lower dynamic range than music.
- The sampling rate for music and speech is always the same. The statement is incorrect.

**Sampling** Mark the following statements as true or false.

- An anti-aliasing filter is used to limit the bandwidth of a signal to within half the sampling frequency.
- Perfect reconstruction of a continuous signal is possible as long as its bandwidth is lower than the sampling rate.

**Quantization & Sampling** Mark the statements that are correct.

- In theory, a quantized signal can be perfectly reconstructed if the word length is at least twice the maximum amplitude.
- A sampled signal has a periodic spectrum.
- The quantization error is always in the range of [-Delta/2;Delta/2] (Delta is the quantization step size).
- Increasing the quantizer word length from w = 4 bits to w = 8 bits will increase the SNR by 6 dB.

**Moving Average Filter** A moving average filter of length N is applied twice (in series) to an audio signal. Describe the length and shape of the impulse response of an FIR filter that produces the same output when applied only once.

**Correlation coefficient** Two signals have a (normalized) correlation coefficient of -1. What does that imply?

**Correlation coefficient** What is the (normalized) correlation coefficient between a sine wave with frequency f and another sine wave with the same frequency but a phase shift of pi radians?

- 0
- -1
- 0.5
- 1

**Correlation function** Draw the shape of the correlation function between a sawtooth signal and white noise.

**Correlation function** You compute the correlation function of two signal blocks with the lengths 1024 and 2048, respectively. What is the length of the result?

**Autocorrelation** Mark the statements that are correct.

- the ACF is always positive
- the ACF is symmetric around lag 0
- the Fourier transform of the ACF is real-valued
- the ACF of a periodic signal is periodic
- the ACF is always periodic
- the ACF of a periodic signal is maximum at lag 0

**Autocorrelation** Take a recording (ca. 15-30 s) of a single-voiced instrument or vocals. Then,

- compute the block-wise autocorrelation function with Matlab’s `xcorr` function (block length: 4096, hop length: 256),
- compare the results with your own implementation,
- create a function which finds the ACF maximum after the first zero crossing and returns all maxima indices for all blocks in a vector, and
- plot and discuss/interpret the results.

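A possible starting point for the block-wise `xcorr` step is sketched below. It is only a skeleton under assumptions that are not part of the exercise: a synthetic 220 Hz sine at an assumed sample rate of 44.1 kHz stands in for the recording, and all variable names are made up for illustration. Note that `xcorr` requires the Signal Processing Toolbox.

```matlab
% Placeholder input (assumption): 1 s of a 220 Hz sine instead of a recording
fs = 44100;
t = (0:fs-1)' / fs;
x = sin(2*pi*220*t);

blockLength = 4096;
hopLength = 256;
numBlocks = floor((length(x) - blockLength) / hopLength) + 1;

% one column per block; rows cover the lags -(blockLength-1)...(blockLength-1)
acf = zeros(2*blockLength - 1, numBlocks);
for n = 1:numBlocks
    iStart = (n-1) * hopLength + 1;
    block = x(iStart:iStart + blockLength - 1);
    r = xcorr(block);   % autocorrelation of the current block
    acf(:, n) = r(:);   % store as a column regardless of output orientation
end
```

With this layout, lag 0 of every block sits in row `blockLength`, which is worth keeping in mind when searching for maxima.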
**Convolution in the frequency domain** You have a signal block `x` and an impulse response `h`, both of the same length (1024), and you want to compute the convolution result. Knowing that convolution via the FFT might be more efficient than in the time domain, you implement the following script:

`X = fft(x); H = fft(h); Y = X.*H; y = ifft(Y);`

Discuss why the result is not the expected convolution result and modify the code so that it produces the correct result as computed by the Matlab function `conv`.
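To get started, it can help to run the script next to `conv` and look at both outputs. The snippet below is a minimal sketch of such a comparison. The random test signals and the name `yRef` are assumptions made for illustration, and the script from the question is left unmodified.

```matlab
% Hypothetical test data; any two vectors of length 1024 would do
N = 1024;
x = randn(N, 1);
h = randn(N, 1);

% the script from the question, unchanged
X = fft(x); H = fft(h); Y = X.*H; y = ifft(Y);

% reference result from the time-domain convolution
yRef = conv(x, h);

fprintf('length of y:    %d\n', length(y));
fprintf('length of yRef: %d\n', length(yRef));
```

The two lengths already differ (1024 vs. 2047 samples), which is a useful first observation for the discussion.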

**Convolution** Sketch the output of the convolution of the two pairs of signals below. Make sure to properly label axes and mark important axis points.

**Convolution** Describe the following properties of the convolution operation and give an example of the practical meaning in an audio processing context.

- (g(t)*h(t))*x(t) = g(t)*(h(t)*x(t))
- g(t)*(h(t)+x(t)) = (g(t)*h(t))+(g(t)*x(t))

**Correlation and Convolution 1** Both the convolution and the correlation operation can be calculated in the frequency domain by multiplication. Discuss the differences in the frequency domain and how they relate to the time domain equations.

**Correlation and Convolution 2** Name two pairs of signals: (1) a pair for which the correlation function equals the output of their convolution, and (2) a pair for which the correlation function and the output of their convolution differ.

**Fourier transform** Name 3 important properties of the Fourier Transform and briefly discuss them.

**Fourier transform** Given that the Fourier transform of x(t) is X(jomega), what is the Fourier transform of x(t-T0)?

**Fourier transform** Each of the following magnitude spectra has a corresponding time domain signal. Give the correct grouping (e.g., 1A, 2B, etc.).

**Fourier transform** What is the FT of two delta functions located symmetrically around 0?

**Fourier transform** The Fourier transform of a rectangular window is WR(jomega). Given basic operations like duplication, multiplication, addition, convolution, time-scaling, etc., how do you derive the shape of the Fourier transform of a triangular window? What is the result?

**Fourier transform** Which of the following statements are correct:

- FT and DFT are both discrete in the time domain
- a complete description of the spectrum contains both magnitude and phase
- The distance between two neighboring frequency bins (in Hz) increases as the DFT block length increases
- The FTs of a cosine and a sine with the same frequency are identical.
- The FT of a real signal is symmetric.
- The FT will be compressed as the time domain signal is stretched.
- An impulse response is completely defined and can be perfectly reconstructed given its FT.

**Fourier transform** What happens to the Fourier Transform of a signal if it is multiplied with a cosine wave with non-zero frequency?

- The FT gets shifted along the frequency axis.
- The FT remains unchanged.
- The frequency axis of the FT gets scaled.
- The Fourier Transform is inverted around the frequency axis.

**Fourier transform** Mark the following statements as true or false:

- The magnitude spectrum of a real-valued signal is symmetric about the y-axis.
- The real part of the FT is always positive.
- The FT will be compressed along the frequency axis as the time domain signal is stretched.
- The FT of an impulse response describes a system as completely as the impulse response itself.
- The energy of the time-domain signal is preserved in the frequency domain.

**Time-Frequency transformations** Name one advantage and one disadvantage of the Constant Q Transform as compared to the DFT.

**Cepstrum** Mark the statements that are correct.

- The basic signal model for the input signal of a cepstral analysis is x(t) = e(t) + h(t) (e(t): excitation signal, h(t): transfer function)
- The cepstrum has a non-linear frequency scale
- The cepstrum can be used for, amongst others, pitch detection and spectral envelope extraction
- The human voice fits the basic signal model of the cepstrum well.

## Chapter 3

**Feature Normalization** You extracted 5 features (spectral centroid, spectral crest, spectral flatness, zero crossing rate as introduced in the text book) and want to use a NN (Nearest Neighbor) classifier with a Euclidean distance metric. Mark the correct statements below.

- all of the mentioned features are in a similar numerical range
- PCA (Principal Component Analysis) should be applied for feature normalization
- z-score normalization will result in a numerical range from -1 to 1 for each feature
- The classification performance will be affected by the feature normalization process

**Feature Selection** Consider a classification task using a set of 3 features X1, X2, X3. The classification accuracies for different feature combinations are shown in the following table.

| feature combination | classification accuracy |
| --- | --- |
| (X1) | 50% |
| (X2) | 55% |
| (X3) | 60% |
| (X1;X2) | 74% |
| (X1;X3) | 65% |
| (X2;X3) | 66% |

- Describe the process/the processing steps of Sequential Forward Selection.
- Based on this selection, which two features will be selected?

**Feature Design** Discuss advantages and disadvantages of using unsupervised Feature Learning approaches as compared to ‘traditional’ hand-crafted features.

**Dimensionality** Given a finite number of observations, is it true that increasing the number of features always leads to an improved classification accuracy? Why or why not?

## Chapter 5

**Pitch perception** Describe the relation between the pitch perception dimensions tone height and chroma.

**MIDI Pitch** Given that A4 has the MIDI pitch index 69, what is the index of C5?

**Cents** How do you compute the distance between two frequencies f1, f2 as a pitch difference? Derive it, given m(f) = 69 + 12 ld(f/fA4) (ld: logarithm to base 2).
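To experiment with the mapping m(f) before attempting the derivation, the formula can be written as a one-line anonymous function. This is a minimal sketch: the name `freq2midi` is made up for illustration, and the reference tuning of 440 Hz is an assumption, since the formula itself leaves fA4 open.

```matlab
% assumed reference tuning frequency of A4 in Hz (not fixed by the formula)
fA4 = 440;

% m(f) = 69 + 12 ld(f/fA4), with ld being the logarithm to base 2
freq2midi = @(f) 69 + 12 * log2(f / fA4);

mA4 = freq2midi(440);   % 69: A4 maps to its MIDI pitch index
mA5 = freq2midi(880);   % 81: one octave above A4 is 12 semitones higher
```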

**Cents** Mark the correct statements below.

- Cent is relative to the ratio between two pitches
- The distance in Cents between the pitches E4 and F4 is 100 cents
- The distance in Cents between the pitches E2 and G2 equals the distance between E4 and G4.
- To compute the distance of two frequencies in cent, the reference tuning frequency is required (e.g., A4=440Hz).

**Frequency Reassignment** Mark the correct statements below.

- The derivative of the phase with respect to time is referred to as the instantaneous frequency.
- The sampling rate has to be known to compute the instantaneous frequency in Hz.
- If the hop size is small enough, phase unwrapping is no longer needed.
- The instantaneous frequency estimation might be wrong if there are multiple sources overlapping in the frequency domain.

## Chapter 6

**Definition** Describe the difference between an ‘onset’ and a ‘beat’.

**Inter-Onset-Interval** You extracted an IOI-histogram with two distinct peaks (at 166.67 ms and 333.33 ms, respectively).

- Explain why there might be two peaks in the histogram.
- What is the most likely tempo estimate given this histogram?
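When reasoning about the histogram, it can be handy to convert an interval in milliseconds into events per minute. The sketch below does only this unit conversion for the two peak positions. The variable names are made up for illustration, and whether either value is a plausible tempo is left to the question.

```matlab
% peak positions of the IOI histogram in ms (values from the question)
ioiMs = [166.67, 333.33];

% one minute has 60000 ms, so an interval of T ms repeats 60000/T times per minute
eventsPerMinute = 60000 ./ ioiMs;   % approx. [360, 180]
```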