Everybody loves a good robot or two. Here at WFMU, we have three of them. Two of them call up our DJs from time to time and say things like "telemetry channel - zero zero." From time to time, you might hear one of our DJs announce on the air that they have to go back to music right away because the robot is calling. It's true. When the robot calls, a bright strobe light goes off in the studio, everybody has a seizure and the DJ is supposed to drop everything and find out what the robot has to say.
Our third robot, I can't tell you what he does. But he lives on Mike Lupica's desk and he has an arm that moves back and forth and does stuff. Dave the Spazz opens his show with a robot. You know the one - HAL, from 2001: A Space Odyssey. The one with the heavy breathing who goes "Just what do you think you're doing, Dave?"
Now that I think about it, we actually have four robots. Last August, we had a five-day-long internet stream of left-wing claptrap called the RNC Remix. We had a bank of automated programming to fall back on in case we didn't have enough DJs. We called it the "Flaming Robot of Love" (streaming mp3) and it was one of the remix's most popular features. And if we count the Flaming Robot of Love, then we have to count Nachum's new audio robot that programs his web-only stream nineteen hours a day. So that's five. Five robots.
Last Wednesday I had such a bad case of laryngitis that I couldn't make a sound. The night before my show, I decided to do my show anyway, using a text-to-speech generator I found on the web. I figured I would type out my announcements, download the audio file and play it on the air. The text-to-speech generator I found had a fairly robotic-sounding male voice, not unlike the crappy text-to-speech programs that come built into Macs.
But as my show began, listener Doug sent me a much better text-to-speech site, this one here. I started using the Spanish male voice, also known as "M022" and I was amazed at how expressive he was. He reminded me of Julio I., my former nemesis at Upsala College. Before I knew it, Julio was bitching at me, just like old times. Soon he was recording public service announcements like this one (mp3).
After the show, I started wondering how that site worked - were the voices made of tiny bits of recorded human speech, or were they actually synthesized - and what was the real purpose of a company like this? It turns out that the speech was synthesized, and ScanSoft markets its speech technology to all sorts of companies - anyone needing robotic voices that sound human. Which is a lot of companies, when you think about it.
As I played with the site some more, I found a British female voice that I actually recognized. This was the exact same voice that my friend Adam had built in to his car's global positioning system. Adam would type in his destination, and the British lady would tell him exactly where to go.
One day, Adam and I drove to Parsippany to visit one of the station's computers. As we headed back, Adam got off at the wrong exit, and we ended up on First Street in Newark. I tried telling Adam how to get back to Jersey City, but no. He wanted to listen to the British robot lady. She calmly told us that we had taken a wrong turn, and where we should go now to get back on track. She proceeded to direct us into the most crack-infested neighborhood in North America. I can still remember her voice as Adam was pulled from the car and beaten within an inch of his life. This is what she said (mp3).