

#1
August 23rd, 2005, 10:24 AM
Viro
Speech to Text software

I've got a lot of MP3 and WAV files of talks and sermons that I am going to transcribe into text. These talks go many years back and hence the transcripts have been lost. There are hours and hours of talks and I'm looking for software that will help make my job easier.

OS X comes with some very good text to speech capabilities. Is there any capability to do speech to text on OS X? It would be great if there were free/cheap shareware products that do this task. Any recommendations?
#2
August 23rd, 2005, 01:27 PM
thewelshman
ViaVoice

I was looking for something similar for some old "reel to reel" tapes of my father from about 50 years ago. I was looking for a wholly different type of product but came across this link. Looks like it might be expensive, though.

http://www-306.ibm.com/software/voice/viavoice/

Last edited by thewelshman; August 23rd, 2005 at 04:43 PM.
#3
August 24th, 2005, 12:12 AM
philbert
iListen from MacSpeech is the only speech-to-text Mac app, as I understand it. Anyway, it's clearly the most developed at least.

I forget the exact price; $100-$150, something like that.

The real investment is the time investment, though. Setting up and learning a speech-to-text program is not a small matter, and it requires a level of patience not typical for most apps.

There is a mailing list where you can ask questions of users.

Google for the URL; I can't recall it off the top of my head.
#4
August 24th, 2005, 03:42 AM
Viro
These programs look good. Shame there aren't any demos of either program...
#5
August 24th, 2005, 07:39 AM
philbert
Be careful of ViaVoice; I think it may no longer be under development. Speech recog is a big time investment, and you don't want to get sucked into a product that has been abandoned.

If you only have one project, like a pile of tapes you want converted to text, hire a transcriptionist. There is no cheap, easy way to quickly convert sound to text in a reliable manner, not yet.

This is my understanding; feel free to add your own. I'm not an expert.
#6
August 24th, 2005, 08:04 AM
fryke (Super Moderator)
I also wanted to add that you might find some people who would be glad to spend an hour or two transcribing for you at a reasonable price... On the other hand, if the software actually works... I guess it could get difficult, though. If the software misunderstands something and you have to go over _all_ of them again, you might as well transcribe them yourself, as that won't take much longer...
#7
August 24th, 2005, 08:22 AM
philbert
If you have an ongoing need for speech recog that merits spending many hours getting set up, then iListen is certainly worth a look.

The software works, imperfectly, after a lot of input and training on your part.

I was just trying to warn folks that speech recog is not at the same place as normal software. It's still a developing field and things never work perfectly, no matter how much you work at it. It takes a different mindset than normal software, and probably isn't practical for quick one-time tasks.
#8
September 20th, 2006, 10:29 PM
Roses
Here's my free idea

I am totally and completely frustrated that after all these years, and all the technology out there, the brilliant Mac software developers have still not come up with a speech-to-text system that will automatically transcribe third-party recorded interviews.

My job involves a huge amount of transcription of digitally recorded interviews with people, and this task of transcribing interviews is enormously boring, time-consuming and mind-numbing.

Because of timelines and the nature of the work, I can't email the work off to some person in India doing it for three dollars an hour or whatever ... I need a machine that can do it and it is a perfect job for a machine ... I want software to take it so I can concentrate on the more creative stuff.

The transcription systems that are out there, like iListen and ViaVoice, are totally useless to me because they only work on the principle of first "training" the software on your voice and your voice inflections. And obviously, I don't need software to recognize my voice ... I need it to recognize other people's voices and I need software that can do this without needing to be "trained."

And I recognize that such software will make mistakes ... but I don't need a *perfect* system because I have to edit the words after they are transcribed anyway. What I need is a system that will reduce the laborious task of transcribing because it can more or less recognize other people's voices, make sense of them and give me a transcription that is pretty close to what was said.


I know that what I am asking for is a very complicated and very sophisticated task for software because every voice is different and we all tend to run our sentences together ... and it's complicated because some people say Tomaaaato and some people say tomito and some people talk a mile a minute and others speak veeeeeeeery sloooooowly. Not to mention dealing with all the different foreign accents in just the English language.

But I have been thinking about the various ways around these problems and I have a few ideas.

(I don't write software so I can't do it, but if there is a developer out there who can do it, I'd be very appreciative!)

What I am thinking is... why not break this problem down into several steps?

1) Develop the software that will take a digital recording of someone's natural speech and basically "flatten" it out so that it sounds just like one of the computer generated voices ... like "Bruce" or whatever ...

There are various existing sound applications that can change the tone, pitch, speed, etc. of sound... so I am thinking that this first step shouldn't be too, too far beyond today's technological capability.

2) Once you have done that, then all you have to do is pre-train the speech-to-text software to recognize the above "voice," ... so you provide a pre-trained system with an in-built decent vocabulary/dictionary plus the ability for the user to add new words to that vocabulary/dictionary.

3) I run my interviews through this software ... the software automatically does step one, and then step two and gives me the result.

Yes, it might be slow and take a bit of time, but it should certainly save me hours of transcription time.

Any thoughts on this?? Can it be done? Has anybody tried anything remotely approaching this?
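
For anyone who wants to see the shape of the three-step idea, here is a rough Python sketch. It is only an illustration under loud assumptions: `flatten_voice` just evens out loudness, which is a much simpler stand-in for making a voice sound like "Bruce", and `placeholder_recognizer` is a hypothetical stub, not a real speech-to-text engine or any product's API.

```python
from typing import Callable, Iterable, List, Set

Samples = List[float]  # mono audio samples, assumed to lie in [-1.0, 1.0]
Recognizer = Callable[[Samples, Set[str]], str]


def flatten_voice(samples: Samples, target_peak: float = 0.9) -> Samples:
    """Step 1 stand-in: scale the audio so its loudest sample hits target_peak.

    ASSUMPTION: real "flattening" would also have to adjust pitch, speed and
    accent; this only normalizes loudness to show where that step would sit.
    """
    peak = max((abs(s) for s in samples), default=0.0)
    if peak == 0.0:
        return list(samples)  # silence (or empty input) is returned unchanged
    gain = target_peak / peak
    return [s * gain for s in samples]


def placeholder_recognizer(samples: Samples, vocabulary: Set[str]) -> str:
    """Step 2 stand-in (hypothetical): a real engine, pre-trained on the
    flattened voice, would return the recognized words here."""
    return f"<{len(samples)} samples, {len(vocabulary)} custom words>"


def transcribe(samples: Samples, recognizer: Recognizer,
               user_words: Iterable[str] = ()) -> str:
    """Step 3: run step 1, then step 2, and hand back the result."""
    normalized = flatten_voice(samples)
    return recognizer(normalized, set(user_words))


if __name__ == "__main__":
    # Prints: <3 samples, 1 custom words>
    print(transcribe([0.1, -0.45, 0.2], placeholder_recognizer, ["sermon"]))
```

The recognizer is passed in as a function so that the hard part (step 2) can be swapped for a real engine without touching the rest of the pipeline; the user's added words travel with each call as a plain set.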
All times are GMT -5.

