Vocoders/Speech Synthesis

Introduction: Speech synthesis

Vocoders were originally developed in the 1930s as a potentially efficient means of transmitting voice signals over telephone lines. The name, short for "voice coder", follows from that purpose.

A vocoder is a complete analyzer/synthesizer system (in the Doepfer system described below, the A-129 /1 and A-129 /2 modules) that breaks down (analyzes) a vocal or other audio signal into a series of adjacent frequency bands, and then uses the amplitudes of those bands to build up (synthesize) a signal that is similar in certain respects.

Like locomotives, vocoders come with a variety of bells and whistles, but their basic mechanisms are the same. Fig. 1 shows the basic vocoder function in block diagram form. The left side of the diagram is the analyzer portion of the device. An audio signal, usually called the speech signal, is fed through a series of bandpass filters.

The center frequencies F1, F2, F3...Fn of the filters are spaced one-quarter to one-half octave apart; together, the filter bands cover most of the audio spectrum. Thus, the filter bank slices up the spectrum of the speech signal. Each slice then goes to an envelope follower, the output of which is a control voltage that is proportional to the strength of that slice. These control voltages tell us how strong each slice of the frequency spectrum of the speech signal is at any time. In other words, the analyzer output is a set of slowly varying control voltages that constitute a code or analysis of the spectrum of the speech signal.
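As a rough illustration of the spacing described above, the following Python sketch lists center frequencies that lie a fixed fraction of an octave apart. The function name band_centers and the 100 Hz to 6.4 kHz range are assumptions chosen for the example; they are not taken from any particular vocoder.

```python
def band_centers(f_low, f_high, octave_step):
    """Return center frequencies from f_low up to f_high,
    each one octave_step octaves above the previous one."""
    if f_low <= 0 or f_high < f_low or octave_step <= 0:
        raise ValueError("need 0 < f_low <= f_high and octave_step > 0")
    centers = []
    k = 0
    while True:
        # One octave doubles the frequency, so k steps multiply by 2**(k * step).
        f = f_low * 2.0 ** (k * octave_step)
        if f > f_high * (1.0 + 1e-9):  # small tolerance for rounding error
            break
        centers.append(f)
        k += 1
    return centers


# Hypothetical example range: half-octave spacing from 100 Hz to 6.4 kHz.
for f in band_centers(100.0, 6400.0, 0.5):
    print(round(f, 1))
```

With half-octave spacing this range needs 13 filters; quarter-octave spacing over the same range needs 25.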

The synthesizer portion of the vocoder is shown on the right side of Fig. 1. A set of bandpass filters, identical to those of the analyzer section, is fed by a second audio signal, called the replacement, carrier, or excitation signal. These filters slice up the carrier spectrum into bands in the same way the analyzer filters slice up the speech signal spectrum. Each slice is then fed through a voltage-controlled amplifier (VCA). The outputs of the VCAs are mixed. This mix is the audio output of the basic vocoder. If all of the VCA control signals were at the same voltage, the vocoder output would, in principle, be the same as the carrier input. If, on the other hand, the VCA control inputs are connected to the analyzer envelope follower outputs, as shown with dotted lines in Fig. 1, the spectral variations of the speech signal are impressed on the carrier signal.

Suppose, for instance, that the speech input signal is a person speaking or singing, and the carrier input is a steady tone that is rich in harmonics, such as a sawtooth wave. The vocoder output then has the pitch of the carrier and the timbral variations of the speech. This is the basic principle of the vocoder. This much functionality alone is not quite enough to reconstruct convincing speech. It is, however, adequate to impart musically useful vocal inflections to steady tones.
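The analyzer and synthesizer chain described above can be sketched digitally in a few lines of Python. This is only an illustration under stated assumptions, not a model of any particular hardware: the bandpass filters are simple two-pole digital filters, the envelope follower is a rectifier followed by a one-pole lowpass, and the sample rate, Q, envelope cutoff and band centers are all made-up example values. The function names bandpass, envelope and vocode are likewise hypothetical.

```python
import math


def bandpass(signal, f0, fs, q=4.0):
    """Two-pole bandpass (audio-EQ-cookbook style, unity gain at f0)."""
    w0 = 2.0 * math.pi * f0 / fs
    alpha = math.sin(w0) / (2.0 * q)
    a0 = 1.0 + alpha
    b0, b2 = alpha / a0, -alpha / a0
    a1, a2 = -2.0 * math.cos(w0) / a0, (1.0 - alpha) / a0
    x1 = x2 = y1 = y2 = 0.0
    out = []
    for x in signal:
        y = b0 * x + b2 * x2 - a1 * y1 - a2 * y2
        x2, x1 = x1, x
        y2, y1 = y1, y
        out.append(y)
    return out


def envelope(signal, fs, cutoff_hz=30.0):
    """Envelope follower: full-wave rectifier plus one-pole lowpass."""
    coeff = 1.0 - math.exp(-2.0 * math.pi * cutoff_hz / fs)
    env, out = 0.0, []
    for x in signal:
        env += coeff * (abs(x) - env)
        out.append(env)
    return out


def vocode(speech, carrier, centers, fs):
    """For each band: the speech envelope controls the gain of the carrier band."""
    n = min(len(speech), len(carrier))
    mix = [0.0] * n
    for f0 in centers:
        control = envelope(bandpass(speech[:n], f0, fs), fs)  # analyzer
        band = bandpass(carrier[:n], f0, fs)                  # synthesizer filter
        for i in range(n):
            mix[i] += band[i] * control[i]                    # VCA and mixer
    return mix


fs = 8000  # assumed sample rate in Hz
n = 2000
# Sawtooth carrier at 110 Hz, rich in harmonics.
carrier = [2.0 * ((110.0 * i / fs) % 1.0) - 1.0 for i in range(n)]
# A swelling 500 Hz tone stands in for a speech signal here.
speech = [math.sin(2.0 * math.pi * 500.0 * i / fs) * (i / n) for i in range(n)]
centers = [250.0, 500.0, 1000.0, 2000.0]
out = vocode(speech, carrier, centers, fs)
print(max(abs(v) for v in out))
```

With a silent speech input every control signal stays at zero, so the output is silent too, whatever the carrier is doing.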

Wendy Carlos used standard modular envelope followers and voltage-controlled amplifiers, plus two standard half-octave fixed filter banks that were modified to provide separate inputs and outputs for each of the filter sections, when she produced her version of the Beethoven choral for the score of A Clockwork Orange. She patched the component modules together exactly as shown in Fig. 1. She achieved the effect of a "synthesized chorus" by feeding many oscillators of a keyboard-controlled synthesizer into the carrier input of her setup, while a singer's voice signal was fed into the speech input. The vocoder output signal needed no further filtering or articulation.


Doepfer A-129 Modular Vocoder system

The A-129 /x series of modules forms a modular vocoder. The basic components are the analysis section A-129 /1 and the synthesis section A-129 /2.

As described in the introduction, a vocoder needs two input signals: a speech signal, which serves as the raw material for the tonal shaping and is patched into the analysis section; and a carrier signal, which is patched via the instrument input into the synthesis section.

Figure: Analysis-synthesis section.



The A-129 /1 analysis section slices up the spectrum of the speech signal. The A-129 /2 synthesis section slices up the carrier signal's spectrum.

Since the A-129 is a modular vocoder, and the connections between the analysis and synthesis sections are made externally with patch leads, you can connect the frequency bands of the two sections in any way you like. A low frequency band in the speech signal can, for example, control a high frequency band in the carrier signal. This opens up a wealth of musical and timbral possibilities.
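In software terms, such a patch is just a mapping from analysis bands to synthesis bands. The Python sketch below illustrates the idea; the function name route_control_voltages, the band numbering and the choice of 0.0 for unpatched bands are assumptions made for the example.

```python
def route_control_voltages(analysis_cvs, patch, n_synth_bands):
    """Send analysis control voltages to synthesis bands.

    patch maps an analysis band index to a synthesis band index.
    Synthesis bands with nothing patched in are left at 0.0 here.
    """
    synth_cvs = [0.0] * n_synth_bands
    for src, dst in patch.items():
        synth_cvs[dst] = analysis_cvs[src]
    return synth_cvs


# Three bands, lowest first. Swap the lowest and highest band and leave
# the middle synthesis band unpatched.
crossed = route_control_voltages([1.0, 2.0, 3.0], {0: 2, 2: 0}, 3)
print(crossed)
```

A straight-through patch, {0: 0, 1: 1, 2: 2}, gives the conventional vocoder behaviour.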

You can also patch in your choice of modules: attenuator, slew limiter, CV-to-MIDI / MIDI-to-CV interfaces, inverter, LFOs, etc.

The A-129 /2 synthesis section can also be used as a stand-alone voltage-controlled filter bank, or as a MIDI-controlled filter bank in combination with a MIDI-to-CV interface.

Slew Limiter and Slew Controllers

The five-way VC slew limiter / offset generator / attenuator A-129 /3 and the slew limiter controller A-129 /4 are designed specifically for this purpose. Module A-129 /3 includes five attenuators, five offset generators, and a slew limiter that works on all the voltages at the five CV inputs simultaneously.


Using the A-129 /3 on its own, two functions are available:

- Attenuator: whatever signal is patched into the CV input can be attenuated by your chosen amount before being sent to the CV output. The attenuation is set with a control knob.
- Offset Generator: whatever signal is patched into the CV input will have an offset voltage added to it before being sent to the output. The offset is variable with a control knob.
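The arithmetic behind these two functions is simple, as the following Python sketch shows. It assumes that the attenuation is a factor between 0 and 1 and that the offset is added after the attenuation; the text above does not specify the order, and the function name is made up for the example.

```python
def attenuate_and_offset(cv_in, attenuation=1.0, offset=0.0):
    """Scale a control voltage, then add an offset (order is an assumption)."""
    if not 0.0 <= attenuation <= 1.0:
        raise ValueError("attenuation must be between 0.0 and 1.0")
    return cv_in * attenuation + offset


# Five channels processed with the same knob settings.
cv_inputs = [0.0, 1.0, 2.0, 3.0, 4.0]
cv_outputs = [attenuate_and_offset(cv, attenuation=0.5, offset=1.0) for cv in cv_inputs]
print(cv_outputs)
```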

To use the slew limiter section of the A-129 /3, you also need the A-129 /4 Slew Limiter Controller. It gives you control over the following slew limiter functions:

- Manual control of the slew rate
- CV control of the slew rate, with an input attenuator
- Choice of three functions: "Follow", "Slew" and "Freeze"
- Freezing the output voltages for the duration of a gate

This set of functions is operated by the Slew Limiter Controller, A-129 /4. Usually, the slew limiter is patched between the CV outputs of the analysis section and the CV inputs of the synthesis section.
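A minimal Python sketch of the three functions named above, applied to several control voltages at once, might look like this. It assumes a linear slew (a fixed maximum change per step); the real module's response curve is not described here and may differ. The function name slew_step and the step size are hypothetical.

```python
def slew_step(previous, target, max_step, mode="slew"):
    """Compute the next output voltages for all channels.

    follow: the output jumps straight to the input.
    slew:   the output moves toward the input by at most max_step.
    freeze: the output holds its previous value.
    """
    if mode == "follow":
        return list(target)
    if mode == "freeze":
        return list(previous)
    if mode != "slew":
        raise ValueError("mode must be 'follow', 'slew' or 'freeze'")
    out = []
    for prev, tgt in zip(previous, target):
        delta = max(-max_step, min(max_step, tgt - prev))
        out.append(prev + delta)
    return out


state = [0.0, 5.0]
state = slew_step(state, [1.0, 0.0], max_step=0.25)
print(state)
```

Calling slew_step repeatedly with the latest analyzer voltages smooths fast changes in the speech spectrum; switching to "freeze" while a gate is high holds the current filter setting.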

The A-129 /3 can also be used for other purposes, particularly in combination with the A-129 /4.

Voiced/unvoiced detector

The voiced/unvoiced detector A-129 /5 can recognise voiced and unvoiced sections in the speech signal, and switch the carrier signal accordingly.

The incoming speech signal is processed through a preamplifier with adjustable gain and a treble boost unit.

The treble boost improves the vocoder effect.

The voiced/unvoiced recognition system controls a voltage-controlled switch (like the A-150), which is used to switch between the voiced and unvoiced carrier signals (e.g. VCO and noise).

Additionally, the voiced/unvoiced information is available as a gate signal. An LED displays the unvoiced state.
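How the A-129 /5 makes its decision is not described here. One common textbook approach is to count zero crossings: hissy, unvoiced sounds such as "s" change sign far more often than voiced sounds such as vowels. The Python sketch below uses that approach purely as an illustration; the threshold of 0.25 and all function names are assumptions.

```python
import math


def zero_crossing_rate(frame):
    """Fraction of adjacent sample pairs whose signs differ."""
    if len(frame) < 2:
        return 0.0
    crossings = sum(1 for a, b in zip(frame, frame[1:]) if (a < 0.0) != (b < 0.0))
    return crossings / (len(frame) - 1)


def is_voiced(frame, zcr_threshold=0.25):
    """Treat a frame with few zero crossings as voiced (assumed threshold)."""
    return zero_crossing_rate(frame) < zcr_threshold


def select_carrier(voiced, vco_sample, noise_sample):
    """Stand-in for the voltage-controlled switch: VCO when voiced, else noise."""
    return vco_sample if voiced else noise_sample


fs = 8000  # assumed sample rate in Hz
vowel_like = [math.sin(2.0 * math.pi * 150.0 * i / fs) for i in range(400)]
hiss_like = [1.0 if i % 2 == 0 else -1.0 for i in range(400)]  # extreme case
print(is_voiced(vowel_like), is_voiced(hiss_like))
```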





A special MIDI interface for the vocoder system, the A-195, is planned for fall '97. The basic functions are a 16-way CV-to-MIDI interface and a 16-way MIDI-to-CV interface (the sixteenth way will be used for other functions, such as controlling the slew rate or voiced/unvoiced).

The CV-to-MIDI section converts the CV outputs of the vocoder analysis section into MIDI controllers, which may be recorded by a computer sequencer. The MIDI-to-CV section converts incoming MIDI controller information into CVs for the vocoder synthesis section.
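The conversion in both directions can be sketched as follows in Python. A MIDI control change message consists of a status byte (0xB0 plus the channel number), a controller number and a 7-bit value. The 0 to 5 V range, the controller numbers and the function names are assumptions made for the example; the planned interface's actual scaling is not given here.

```python
def cv_to_control_change(cv, channel, controller, cv_max=5.0):
    """Convert a control voltage into a 3-byte MIDI control change message."""
    if not 0 <= channel <= 15 or not 0 <= controller <= 127:
        raise ValueError("channel must be 0-15 and controller 0-127")
    clamped = max(0.0, min(cv_max, cv))   # keep the voltage in range
    value = round(clamped / cv_max * 127)  # scale to 7 bits
    return bytes([0xB0 | channel, controller, value])


def control_change_to_cv(message, cv_max=5.0):
    """Convert a control change message back into (controller, voltage)."""
    status, controller, value = message
    if status & 0xF0 != 0xB0:
        raise ValueError("not a control change message")
    return controller, value / 127 * cv_max


# Hypothetical assignment: analysis band 0 is sent as controller 20.
msg = cv_to_control_change(5.0, channel=0, controller=20)
print(msg.hex(), control_change_to_cv(msg))
```

Because a controller value has only 7 bits, the voltage recovered on the MIDI-to-CV side is quantised to 128 steps.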

Additionally, Doepfer plans to store some factory and user-definable 'vocals' in the MIDI interface. Complete vocals such as "a", "e", "o", "s", "sh", etc. can then be called up by MIDI program change events (or possibly by another MIDI event).

Thus the vocoder system will become a universal MIDI-controlled filter system, not limited to the standard vocoder features.

MAM vocoder filterbank 11

The VF-11 is an analogue 11-band vocoder with combined filterbank functions. This is the machine for great new sounds and modulations, or "just" for classic robot voices. It has loads of knobs and switches for creating your own personal sound.

Try modulating drum loops or voices, or give a boring sound new, pulsating life through the VF-11. You can also create all those famous voices that made bands like ELO, Kraftwerk and all those hip-hoppers of the '70s and '80s famous.

The filterbank section allows you to create vibrato and phaser effects. Selecting individual frequency ranges lets you shape the sound the way you want it, and all this at a price that makes old second-hand vocoders seem really expensive!