Our Machinery of Data project 2014
Purpose: to present open data in a unique and interesting format.
Machinery of Data Competition (part of Unleashed/GovHack)
The Original Idea
- "to have a row of polystyrene heads (attired, decorated) with speakers inside them and each one has a different voice and stream of data."
Extension of Idea
- heads can move side to side; motion sensor, e.g. they don't start talking until something moves in front of them. Because the name of the competition is "Machinery" of Data, I like the idea of the heads being machinery-themed (Kylie).
Changes to Idea
- no movement of heads
- no motion sensor
Contents
- Jobs
- Hardware
- Text to speech
- Data
- Machinery of Data exhibition
Hardware
- black polystyrene heads
- collected electronic, mechanical, plastic and metal components
- craft gear
- crafty skills
- Raspberry Pi or PC
Sound in Linux
ALSA is the interface to the sound hardware. Unless the sound device does hardware mixing - not so common these days - ALSA can't do any mixing itself. PulseAudio can; it interfaces with ALSA to play sound.
We need mixing because, with ALSA alone, I don't think more than one process can access a given sound card. We need it to play two different voices out of different speakers.
With any luck ALSA doesn't need configuring; /proc/asound contains some symlinks naming each card. A card name can be passed to PulseAudio's module-alsa-sink using the device= argument.
I don't think we need this stuff - see the next section - but here's how I started.
Stop PulseAudio resurrecting itself:
mkdir -p $HOME/.config/pulse && echo 'autospawn = no' > $HOME/.config/pulse/client.conf
If you lose all sound, try deleting this file.
We'll probably need to stop PulseAudio trying to configure itself by not loading its default configuration (default.pa).
Now configure a FIFO to send its output to a single speaker. Add this to a file (heads.pa maybe). This one accepts data at 8kHz in μ-law format:
load-module module-pipe-source source_name=head1 file=/tmp/head1 rate=8000 format=ulaw
load-module module-loopback source=head1 channel_map=left
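Presumably a second head is just the mirror image, with its own FIFO routed to the right channel. A sketch, assuming the same module parameters work per-head (head2 and /tmp/head2 are made-up names):

```
load-module module-pipe-source source_name=head2 file=/tmp/head2 rate=8000 format=ulaw
load-module module-loopback source=head2 channel_map=right
```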
Then run:
pulseaudio --log-level=warn --file=heads.pa
Now you can dump data to /tmp/head1 and it comes out of the left speaker.
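To test the pipe you need bytes in exactly that format: 8 kHz, μ-law. A minimal sketch in Python that generates a one-second 440 Hz tone and encodes it with a hand-rolled G.711 μ-law encoder (the standard textbook algorithm, nothing project-specific):

```python
import math

def linear_to_ulaw(sample):
    """Encode one signed 16-bit sample as a G.711 mu-law byte."""
    BIAS, CLIP = 0x84, 32635
    sign = 0x80 if sample < 0 else 0
    sample = min(abs(sample), CLIP) + BIAS
    # exponent = position of the highest set bit among bits 7..14
    exponent = 7
    while exponent > 0 and not sample & (0x80 << exponent):
        exponent -= 1
    mantissa = (sample >> (exponent + 3)) & 0x0F
    return ~(sign | (exponent << 4) | mantissa) & 0xFF

def tone(freq=440, rate=8000, seconds=1.0, amplitude=0.5):
    """One second of mu-law samples for a sine tone at the FIFO's rate."""
    n = int(rate * seconds)
    return bytes(
        linear_to_ulaw(int(amplitude * 32767 * math.sin(2 * math.pi * freq * t / rate)))
        for t in range(n)
    )
```

Writing the result into the FIFO (e.g. `open("/tmp/head1", "wb").write(tone())`) should then come out of the left speaker as a beep.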
When you're done, pulseaudio --start is supposed to get your normal sound back, but it's not working for me - the volume control breaks! Maybe try
Playing a sound with paplay: first find your sound card's sink:
$ pactl list short sinks
0	alsa_output.pci-0000_00_1b.0.analog-stereo	module-alsa-card.c
Then
paplay --device=alsa_output.pci-0000_00_1b.0.analog-stereo --channel-map=left myfile.wav
will play the sound on the left channel.
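If we end up scripting this, picking the sink name out of that pactl output is trivial. A sketch (assuming the columns are index, name, driver, whitespace-separated, as in the listing above):

```python
def first_sink_name(pactl_output):
    """Pick the sink name (second column) out of `pactl list short sinks` output."""
    for line in pactl_output.splitlines():
        fields = line.split()
        if len(fields) >= 2:
            return fields[1]
    return None
```

In practice the output would come from subprocess.check_output(["pactl", "list", "short", "sinks"], text=True).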
For extra marks: It might be possible to use DMA to produce several channels of sound on a Raspberry Pi. PiBlaster might be a good start.
Text to speech
Festival by default seems to write its output to a .wav file, then invoke some command to play it. This surprises me a bit because it can't start playing the sound before it's finished generating it. We can use paplay -d to play it to a specific sink.
Combining Festival with paplay:
festival> (Parameter.set 'Audio_Method 'Audio_Command)
festival> (Parameter.set 'Audio_Command "paplay --device=alsa_output.pci-0000_00_1b.0.analog-stereo --raw --rate=$SR --format=s16le --channels=1 --channel-map=left $FILE")
festival> (SayText "Foo")
#<Utterance 0x7fd39d3ada10>
$ festival --server
server    Sat Jul 5 14:06:19 2014 : Festival server started on port 1314
client(1) Sat Jul 5 14:06:34 2014 : accepted from localhost

$ nc localhost 1314
(SayText "Foo")
LP
#<Utterance 0x7fea512413d0>
ft_StUfF_keyOK
The bit starting "LP" is sent after the sound has played. I don't know what should be used to detect errors, completion, etc.; I haven't found any docs about it.
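A rough sketch of talking to the server from Python, assuming the only thing we can rely on is the ft_StUfF_key end-of-reply marker seen in the transcript (the framing of the status after it is my guess from that one exchange):

```python
import socket

END_KEY = b"ft_StUfF_key"  # end-of-reply marker, per the nc transcript

def split_reply(data):
    """Split raw server output into (payload, trailer) at the end marker."""
    payload, sep, trailer = data.partition(END_KEY)
    if not sep:
        return data, b""  # marker not seen yet
    return payload, trailer

def say(host, text, port=1314):
    """Send (SayText "...") and read until the end marker appears."""
    with socket.create_connection((host, port)) as s:
        s.sendall(('(SayText "%s")\n' % text).encode())
        buf = b""
        while END_KEY not in buf:
            chunk = s.recv(4096)
            if not chunk:
                break
            buf += chunk
        return split_reply(buf)
```

This blocks until the server has finished (since the transcript shows the reply arriving after the sound plays), which might actually be handy for sequencing the heads.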
- Python, in client/server mode; possibly not working
The basic Festival voice is very artificial, and the ones that are "freely" available aren't much better. Festival's demo page has some really nice ones, but these aren't available. At this point we might have to give speech synthesis a miss.
We might be able to pre-render the text using a non-free option like Ivona.
- Debian user forums
- HOWTO: Make festival TTS use better voices (MBROLA / CMU / HTS)
- Howto: Setup more realistic voices in Festival
gespeaker and mbrola
$ sudo aptitude install gespeaker mbrola-en1 mbrola-us1 mbrola-us2
$ sudo ln -s /usr/lib/x86_64-linux-gnu/espeak-data/ /usr/share
(The symlink is due to a bug).
Now you can use gespeaker with the mbrola voices.
I've created some samples.
Ahead-of-time text to speech
VoiceRSS's "English (Great Britain)" voice isn't too bad; there appear to be no usage restrictions on generated audio.
Our main source of data is Trove newspapers, and from that we're looking at pulling out:
- letters to the editor
- news stories
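For pulling articles we'd presumably go through the Trove API (it needs a free API key; the endpoint and parameter names below are from memory and should be checked against the current docs). A sketch of building a newspaper-zone search URL:

```python
from urllib.parse import urlencode

# Assumed v2 endpoint - verify against the Trove API documentation.
TROVE_SEARCH = "https://api.trove.nla.gov.au/v2/result"

def newspaper_query_url(query, api_key, results=20):
    """Build a Trove newspaper-zone search URL for the given query string."""
    params = {
        "q": query,
        "zone": "newspaper",
        "encoding": "json",
        "n": results,
        "key": api_key,
    }
    return TROVE_SEARCH + "?" + urlencode(params)
```

For letters to the editor, something like newspaper_query_url('"letter to the editor"', key) would be a starting point, with the actual filtering done on the returned article text.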