Signal Generators: STFT Resynthesis (Vocoding)


 ar      pvadd      ktimpnt, kfmod, ifilcod, ifn, ibins[, ibinoffset, ibinincr, iextractmode, ifreqlim, igatefn]


pvadd reads from a pvoc file and uses the data to perform additive synthesis using an internal array of interpolating oscillators. The user supplies the wave table (usually one period of a sine wave), and can choose which analysis bins will be used in the re-synthesis.


ifilcod – integer or character-string denoting a control-file derived from pvanal analysis of an audio signal. An integer denotes the suffix of a file pvoc.m; a character-string (in double quotes) gives a filename, optionally a full pathname. If it is not a full pathname, the file is sought first in the current directory, then in the one given by the environment variable SADIR (if defined). pvoc control files contain data organized for FFT resynthesis. Memory usage depends on the size of the files involved, which are read and held entirely in memory during computation but are shared by multiple calls (see also lpread).

ifn – table number of a stored function containing a sine wave

ibins – number of bins that will be used in the resynthesis (each bin counts as one oscillator in the re-synthesis)

ibinoffset (optional) – the first bin used. The default is 0.

ibinincr (optional) – sets an increment by which pvadd counts up from ibinoffset for ibins components in the re-synthesis (see below for a further explanation).

iextractmode (optional) – determines if spectral extraction will be carried out and if so whether components that have changes in frequency below ifreqlim or above ifreqlim will be discarded. A value for iextractmode of 1 will cause pvadd to synthesize only those components where the frequency difference between analysis frames is greater than ifreqlim. A value of 2 for iextractmode will cause pvadd to synthesize only those components where the frequency difference between frames is less than ifreqlim. The default values for iextractmode and ifreqlim are 0, in which case a simple resynthesis will be done. See examples below.

igatefn (optional) – is the number of a stored function which will be applied to the amplitudes of the analysis bins before resynthesis takes place. If igatefn is greater than 0 the amplitudes of each bin will be scaled by igatefn through a simple mapping process. First, the amplitudes of all of the bins in all of the frames in the entire analysis file are compared to determine the maximum amplitude value. This value is then used to create normalized amplitudes as indices into the stored function igatefn. The maximum amplitude will map to the last point in the function. An amplitude of 0 will map to the first point in the function. Values between 0 and 1 will map accordingly to points along the function table. This will be made clearer in the examples below.


ktimpnt and kfmod are used in the same way as in pvoc.


  ktime line  0, p3, p3
  asig  pvadd ktime, 1, "oboe.pvoc", 1, 100, 2

In the above, ibins is 100 and ibinoffset is 2. Using these settings the resynthesis will contain 100 components beginning with bin #2 (bins are counted starting with 0). That is, resynthesis will be done using bins 2-101 inclusive. It is usually a good idea to begin with bin 1 or 2 since the 0th and often 1st bin have data that is neither necessary nor even helpful for creating good clean resynthesis.

  ktime line  0, p3, p3
  asig  pvadd ktime, 1, "oboe.pvoc", 1, 100, 2, 2

The above is the same as the previous example with the addition of the value 2 used for the optional ibinincr argument. This will still result in 100 components in the resynthesis, but pvadd will count through the bins by 2 instead of by 1. It will use bins 2, 4, 6, 8, 10, and so on. For ibins=10, ibinoffset=10, and ibinincr=10, pvadd would use bins 10, 20, 30, 40, up to and including 100.
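The bin counting described above can be sketched in a few lines of Python. This is only an illustration of the arithmetic, not pvadd's source code; the helper name selected_bins is hypothetical, and the step of 1 used when ibinincr is omitted is inferred from the first example (bins 2-101).

```python
def selected_bins(ibins, ibinoffset=0, ibinincr=1):
    """List the analysis bins chosen by the three bin arguments.

    Hypothetical helper: it only mirrors the counting rule in the text,
    starting at ibinoffset and stepping by ibinincr for ibins components.
    """
    return [ibinoffset + k * ibinincr for k in range(ibins)]


# ibins=10, ibinoffset=10, ibinincr=10, as in the text
print(selected_bins(10, 10, 10))
```

Calling selected_bins(100, 2) gives bins 2 through 101, and selected_bins(100, 2, 2) starts 2, 4, 6, 8, 10, matching the two orchestra examples.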

Below is an example using spectral extraction. In this example iextractmode is one and ifreqlim is 9. This will cause pvadd to synthesize only those bins where the frequency deviation, averaged over 6 frames, is greater than 9.

  ktime line  0, p3, p3
  asig  pvadd ktime, 1,  "oboe.pvoc", 1, 100, 2, 2, 1, 9

If iextractmode were 2 in the above, then only those bins with an average frequency deviation of less than 9 would be synthesized. If tuned correctly, this technique can be used to separate the pitched parts of the spectrum from the noisy parts. In practice this depends greatly on the type of sound, the quality of the recording and digitization, and also on the analysis window size and frame increment.
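A rough Python sketch of the extraction decision may help when choosing ifreqlim. It assumes that the "frequency deviation" is the mean absolute frame-to-frame change in a bin's frequency over a short run of frames; pvadd's exact internal measure is not given here, and the names mean_deviation and keep_bin are hypothetical.

```python
def mean_deviation(freqs):
    """Mean absolute change in frequency between consecutive frames."""
    diffs = [abs(b - a) for a, b in zip(freqs, freqs[1:])]
    return sum(diffs) / len(diffs) if diffs else 0.0


def keep_bin(freqs, iextractmode, ifreqlim):
    """Decide whether a bin is synthesized (assumed model, see lead-in)."""
    if iextractmode == 0:
        return True                   # plain resynthesis, nothing discarded
    dev = mean_deviation(freqs)
    if iextractmode == 1:
        return dev > ifreqlim         # keep only the unstable components
    return dev < ifreqlim             # mode 2: keep only the stable components


# Made-up frequencies (Hz) for one bin over six frames
steady = [440.0, 441.0, 440.5, 439.5, 440.0, 440.5]
noisy = [440.0, 470.0, 415.0, 455.0, 430.0, 462.0]

print(keep_bin(steady, 1, 9), keep_bin(noisy, 1, 9))
print(keep_bin(steady, 2, 9), keep_bin(noisy, 2, 9))
```

With these invented values the steady bin is dropped in mode 1 and kept in mode 2, and the noisy bin behaves the other way round.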

Next is an example using amplitude gating. The last 2 in the argument list stands for f2 in the score.

  asig  pvadd ktime, 1,  "oboe.pvoc", 1, 100, 2, 2, 0, 0, 2

Suppose the score for the above were to contain:

  f2 0 512 7 0 256 1 256 1

Then those bins with amplitudes of 50% of the maximum or greater would be left unchanged, while those with amplitudes less than 50% of the maximum would be scaled down. In this case the lower the amplitude the more severe the scaling down would be. But suppose the score contains:

  f2 0 512 5 1 512 .001

In this case lower amplitudes will be left unchanged and greater ones will be scaled down, turning the sound "upside-down" in terms of the amplitude spectrum! Functions can be arbitrarily complex. Just remember that the normalized amplitude values of the analysis are themselves the indices into the function.
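The gating map can be modelled in Python as follows. This is a sketch under stated assumptions: the two table builders only approximate the GEN07 and GEN05 score lines above, the lookup truncates rather than interpolates (pvadd's actual lookup method is not specified here), and the peak is taken over the list passed in rather than over a whole analysis file. All function names are hypothetical.

```python
def gen07_ramp_then_hold(size=512):
    """Approximates 'f2 0 512 7 0 256 1 256 1': rise 0 to 1, then hold 1."""
    half = size // 2
    return [min(i / half, 1.0) for i in range(size)]


def gen05_decay(size=512, start=1.0, end=0.001):
    """Approximates 'f2 0 512 5 1 512 .001': exponential fall from 1 to .001."""
    return [start * (end / start) ** (i / size) for i in range(size)]


def gate(amps, table):
    """Scale each amplitude by the table value found at its normalized level."""
    peak = max(amps)
    if peak == 0:
        return list(amps)
    last = len(table) - 1
    out = []
    for a in amps:
        index = int(a / peak * last)   # 0 maps to first point, peak to last
        out.append(a * table[index])
    return out


amps = [1.0, 0.75, 0.25, 0.0]
print(gate(amps, gen07_ramp_then_hold()))
print(gate(amps, gen05_decay()))
```

With the ramp table the two loudest values pass through unchanged and the quieter ones are reduced; with the decay table the loudest value is reduced the most.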

Finally, both spectral extraction and amplitude gating can be used together. The example below will synthesize only those components with a frequency deviation of less than 5 Hz per frame, and it will scale the amplitudes according to f2.

  asig  pvadd ktime, 1,  "oboe.pvoc", 1, 100, 1, 1, 2, 5, 2


By using several pvadd units together, one can gradually fade in different parts of the resynthesis, creating various "filtering" effects. The author uses pvadd to synthesize one bin at a time to have control over each separate component of the re-synthesis.

If any combination of ibins, ibinoffset, and ibinincr creates a situation where pvadd is asked to use a bin number greater than the number of bins in the analysis, it will just use all of the available bins, and give no complaint. So to use every bin just make ibins a big number (e.g. 2000).
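To estimate how many oscillators actually run when the request exceeds the analysis, the limit can be written as a small Python function. The name used_bin_count is hypothetical, the figure of 513 bins is only an example size, and treating the highest valid bin number as nbins_in_file - 1 is an assumption.

```python
def used_bin_count(ibins, ibinoffset, ibinincr, nbins_in_file):
    """Number of requested bins that fall inside the analysis file."""
    if ibinoffset >= nbins_in_file:
        return 0
    # Bins ibinoffset, ibinoffset + ibinincr, ... up to nbins_in_file - 1
    available = (nbins_in_file - 1 - ibinoffset) // ibinincr + 1
    return min(ibins, available)


# Asking for 2000 bins from a hypothetical 513-bin analysis
print(used_bin_count(2000, 0, 1, 513))
```

Under these assumptions a request for 2000 bins from a 513-bin file uses all 513, while the same request with ibinoffset=2 and ibinincr=2 uses 256.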

Expect to have to scale up the amplitudes by factors of 10-100, by the way.


Richard Karpen
Seattle, Wash
1998 (New in Csound version 3.48, additional arguments version 3.56)
