|
#1
| ||||
| ||||
| Speech to Text software
I've got a lot of MP3 and WAV files of talks and sermons that I am going to transcribe into text. These talks go back many years, and the transcripts have been lost. There are hours and hours of talks and I'm looking for software that will help make my job easier. OS X comes with some very good text to speech capabilities. Is there any capability to do speech to text on OS X? It would be great if there were free/cheap shareware products that do this task. Any recommendations? |
|
#2
| |||
| |||
| ViaVoice
I was looking for something similar for some old "reel to reel" tapes of my father from about 50 years ago. I was looking for a wholly different type of product but came across this link. Looks like it might be expensive though. http://www-306.ibm.com/software/voice/viavoice/
__________________ ------------------------------------------------------------ 17" PowerBook G4, 2xG4, 3xG3, 2xSun Sparc 1000e (8 cpu), 2xLINUX (Fedora II (Dell), Mandrake 9(Compaq)) Last edited by thewelshman; August 23rd, 2005 at 04:43 PM. |
|
#3
| |||
| |||
| iListen from MacSpeech is the only speech to text Mac app, as I understand it. At the very least, it's clearly the most developed. I forget the exact price; $100-$150, something like that. The real investment is the time investment though: setting up and learning a speech to text program is not a small matter, and it requires a level of patience not typical for most apps. There is a mailing list where you can ask questions of users. Google for the URL, I can't recall it off the top of my head. |
|
#4
| ||||
| ||||
| These programs look good. Shame there aren't any demos of either program... |
|
#5
| |||
| |||
| Be careful of ViaVoice, I think it may no longer be under development. Speech recog is a big time investment, and you don't want to get sucked into a product that has been abandoned. If you only have one project, like a pile of tapes you want converted to text, hire a transcriptionist. There is no cheap, easy way to quickly and reliably convert sound to text, not yet. This is my understanding, feel free to add your own, I'm not an expert. |
|
#6
| ||||
| ||||
| I also wanted to add that you might find some people who would be glad to spend an hour or two transcribing for you at a reasonable price... On the other hand, if the software actually works... I guess it could get difficult, though. If the software misunderstands something and you have to go over _all_ of them again, you might as well transcribe them yourself, as that won't take much longer...
__________________ macnews.net.tc is active again. MacBook Air 13" 1.6 GHz, 2 GB RAM, 80 GB HD. Mac OS X 10.5.5 Hackintosh Core2Duo 2.4 GHz, 2 GB RAM, 160 GB HD. Mac OS X 10.5.5 iPhone 3G 16 GB white, AppleTV 1G 40 GB Mac user since 1987, Apple Product Professional 2007, 2008. Apple Certified Support Professional 10.5 |
|
#7
| |||
| |||
| If you have an ongoing need for speech recog that merits spending many hours getting set up, then iListen is certainly worth a look. The software works, imperfectly, after a lot of input and training on your part. I was just trying to warn folks that speech recog is not at the same place as normal software. It's still a developing field and things never work perfectly, no matter how much you work at it. It takes a different mindset than normal software, and probably isn't practical for quick one time tasks. |
|
#8
| |||
| |||
| I am totally and completely frustrated that after all these years, and all the technology out there, the brilliant Mac software developers have still not come up with a speech-to-text system that will automatically transcribe third party recorded interviews. My job involves a huge amount of transcription of digitally recorded interviews with people, and this task of transcribing interviews is enormously boring, time consuming and mind-numbing. Because of timelines and the nature of the work, I can't email the work off to some person in India doing it for three dollars an hour or whatever ... I need a machine that can do it and it is a perfect job for a machine ... I want software to take it so I can concentrate on the more creative stuff.
The transcription systems that are out there, like iListen and ViaVoice, are totally useless to me because they only work on the principle of first "training" the software on your voice and your voice inflections. And obviously, I don't need software to recognize my voice ... I need it to recognize other people's voices, and I need software that can do this without needing to be "trained." And I recognize that such software will make mistakes ... but I don't need a *perfect* system because I have to edit the words after they are transcribed anyway. What I need is a system that will reduce the laborious task of transcribing because it can more or less recognize other people's voices, make sense of it and give me a transcription that is pretty close to what was said.
I know that what I am asking for is a very complicated and very sophisticated task for software because every voice is different and we all tend to run our sentences together ... and it's complicated because some people say Tomaaaato and some people say tomito, and some people talk a mile a minute and others speak veeeeeeeery sloooooowly. Not to mention dealing with all the different foreign accents in just the English language.
But I have been thinking about the various ways around these problems and I have a few ideas (I don't write software so I can't do it, but if there is a developer out there who can do it, I'd be very appreciative!) What I am thinking is... why not break this problem down into several steps?
1) Develop the software that will take a digital recording of someone's natural speech and basically "flatten" it out so that it sounds just like one of the computer generated voices ... like "Bruce" or whatever ... There are various existing sound applications that can change the tone, pitch, speed, etc. of sound... so I am thinking that this first step shouldn't be too far beyond today's technological capability.
2) Once you have done that, then all you have to do is pre-train the speech-to-text software to recognize the above "voice," ... so you provide a pre-trained system with an in-built decent vocabulary/dictionary plus the ability for the user to add new words to that vocabulary/dictionary.
3) I run my interviews through this software ... the software automatically does step one, and then step two, and gives me the result.
Yes, it might be slow and take a bit of time, but it should certainly save me hours of transcription time. Any thoughts on this?? Can it be done? Has anybody tried anything remotely approaching this? |
|
|