Cited primarily from an article at http://computer.getmash.net/.
Speech recognition has long languished in the no-man's land between sci-fi fantasy (“Computer, engage warp drive!”) and the reality of everyday use.
But that's changing fast, as advances in computing power, artificial intelligence, powerful APIs, and the newly available WSR Macros make speech recognition the next powerful step for everyday "non-geek" users, for user-interface design, and now for electronic voice-based security.
As to voice-based security: a whole host of highly advanced speech technologies, including emotion and lie detection, is moving from the lab to the marketplace.
“This is not a new technology,” says Daniel Hong, an analyst at Datamonitor who specializes in speech technology. “But it took a long time for Moore’s Law to make it viable.”
Mr. Hong estimates that the speech technology market is worth more than $2 billion, with plenty of growth ahead in embedded and network applications.
And it’s about time. Speech recognition technology has been around since the 1950s, but only recently have computer processors and the accompanying artificial intelligence become powerful enough to handle the complex algorithms required to recognize our speech, both locally and remotely, improve our lives and productivity, and open our eyes to the long tail of speech recognition applications.
A few examples:
There are already several capable voice-controlled technologies on the market. You can issue spoken commands to devices like
Motorola’s Mobile TV DH01n, a mobile TV with navigation capabilities, and a host of telematics GPS devices. Microsoft recently announced a deal to slip voice-activation software into cars manufactured by Hyundai and Kia, and its TellMe division is investigating voice-recognition applications for the iPhone. And
Indesit, Europe’s second-largest home appliances manufacturer, just introduced the world’s first voice-controlled oven.
Yet as promising as this year’s crop of speech-controlled devices is, it’s just the beginning.
Speech technology comes in several flavors, including the speech recognition that drives voice-activated mobile devices; the network systems that power IVRs using automated speech recognition; the unequalled desktop Vista Speech Recognition, now with available macros (which we use to post and write articles); and the long-standing standard in the healthcare industry, the highly impressive network-based Philips SpeechMagic systems.
Voice biometrics (the accurate technical term for the often-misused phrase "voice recognition") is a particularly hot area. Every individual has a unique voiceprint determined by the physical characteristics of his or her vocal tract. By analyzing speech samples for telltale acoustic features, voice biometrics can verify a speaker’s identity either in person or over the phone, without the specialized hardware required for fingerprint or retinal scanning.
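To make the idea concrete, here's a bare-bones sketch of that verification loop in Python: boil each utterance down to a compact acoustic fingerprint, then compare it with the enrolled one. Real deployments use far richer speaker models; the file names, the 20-coefficient MFCC summary, and the 0.85 similarity threshold below are purely illustrative assumptions on our part.

```python
# Toy illustration of the voice-biometrics idea: reduce a speech sample
# to a compact acoustic "voiceprint" and compare it with an enrolled one.
# Real systems use much richer speaker models; the file names and the
# 0.85 similarity threshold here are illustrative assumptions.
import numpy as np
import librosa

def voiceprint(wav_path: str) -> np.ndarray:
    """Summarize an utterance as its average MFCC vector."""
    signal, sr = librosa.load(wav_path, sr=16000)
    mfcc = librosa.feature.mfcc(y=signal, sr=sr, n_mfcc=20)  # (20, n_frames)
    return mfcc.mean(axis=1)

def same_speaker(enrolled: np.ndarray, claimed: np.ndarray,
                 threshold: float = 0.85) -> bool:
    """Accept the identity claim if the voiceprints are similar enough."""
    cosine = np.dot(enrolled, claimed) / (
        np.linalg.norm(enrolled) * np.linalg.norm(claimed))
    return cosine >= threshold

enrolled = voiceprint("alice_enrollment.wav")  # hypothetical recordings
attempt = voiceprint("caller_attempt.wav")
print("verified" if same_speaker(enrolled, attempt) else "rejected")
```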
The technology can also have unanticipated consequences. When the Australian social services agency Centrelink began using voice biometrics to authenticate users of its automated phone system, the software started to identify welfare fraudsters who were claiming multiple benefits — something a simple password system could never do.
The
Federal Financial Institutions Examination Council has issued guidance requiring stronger security than simple ID and password combinations, which is expected to drive widespread adoption of voice verification by U.S. financial institutions in coming years. Ameritrade, Volkswagen and European banking giant ABN AMRO all employ voice-authentication systems already.
Advanced voice-recognition systems that can tell if a speaker is agitated, anxious or lying are also in the pipeline.
Computer scientists (e.g. at Carnegie Mellon) have already developed software that can identify emotional states and even truthfulness by analyzing acoustic features like pitch and intensity, and lexical ones like the use of contractions and particular parts of speech. And they are honing their algorithms using the massive amounts of real-world speech data collected by call centers and free 411 speech-driven services such as the extremely popular Goog411.
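Here's a rough sketch, under our own assumptions, of the kinds of acoustic and lexical features such systems start from: pitch and intensity statistics pulled from the audio, plus simple lexical counts like contraction use pulled from the transcript. This is not the Carnegie Mellon or Columbia software, just an illustration of the feature-extraction step.

```python
# Illustrative feature extraction only -- not the CMU or Columbia systems.
# Acoustic cues (pitch, intensity) come from the audio; lexical cues
# (contraction use, pronoun rate) come from the transcript.
import numpy as np
import librosa

def acoustic_features(wav_path: str) -> dict:
    y, sr = librosa.load(wav_path, sr=16000)
    f0, voiced_flag, _ = librosa.pyin(y, fmin=65.0, fmax=400.0, sr=sr)
    pitch = f0[voiced_flag]                    # pitch on voiced frames only
    intensity = librosa.feature.rms(y=y)[0]    # frame-level energy
    return {
        "pitch_mean": float(np.nanmean(pitch)),
        "pitch_range": float(np.nanmax(pitch) - np.nanmin(pitch)),
        "intensity_mean": float(intensity.mean()),
        "intensity_var": float(intensity.var()),
    }

def lexical_features(transcript: str) -> dict:
    words = transcript.lower().split()
    contractions = sum(1 for w in words if "'" in w)  # "don't", "i'm", ...
    first_person = sum(1 for w in words if w in {"i", "me", "my", "mine"})
    n = max(len(words), 1)
    return {"contraction_rate": contractions / n,
            "first_person_rate": first_person / n}

# Hypothetical inputs; one such feature vector is built per utterance.
features = {**acoustic_features("utterance.wav"),
            **lexical_features("i'm telling you, i was home all night")}
```

Feature vectors like these are then fed to a classifier trained on labeled real-world speech.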
A reliable, speech-based lie detector would be a boon to law enforcement and the military. But broader emotion detection could be useful as well. Our host company, which developed the now-standard law-enforcement "Mobile Prosecutor," is presently experimenting with embedding voice-stress analysis in it.
In another example, a virtual call-center agent that could sense a customer’s mounting frustration and route her to a live agent would save time and money and preserve customer loyalty.
“It’s not quite ready, but it’s coming pretty soon,” says
James Larson, an independent speech application consultant who co-chairs the W3C Voice Browser Working Group.
Companies like
Autonomy eTalk claim to have functioning anger and frustration detection systems already, but experts are skeptical. According to
Julia Hirschberg, a computer scientist at Columbia University, “The systems in place are typically not ones that have been scientifically tested.”
According to Hirschberg, lab-grade systems are currently able to detect anger with accuracy rates in “the mid-70s to the low 80s.”
They are even better at detecting uncertainty, which could be helpful in automated training contexts. (Imagine a computer-based tutorial that was sufficiently savvy to drill you in areas you seemed unsure of.)
Lie detection via voice stress & syntax-pattern deviation analysis is a tougher nut to crack, but progress is being made.
In a study funded by the
National Science Foundation and the Department of Homeland Security, Hirschberg and several colleagues used software tools developed by SRI to scan statements that were known to be either true or false. Scanning for 250 different acoustic and lexical cues, “We were getting accuracy maybe around the mid- to upper-60s,” she says.
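For readers wondering how such accuracy figures are produced, the generic recipe is to assemble one row of cue values per statement, attach the known true/false labels, and cross-validate a classifier. The sketch below shows that recipe on random placeholder data (so it scores at chance); it is not the SRI toolchain, and the study's actual 250 cues are not reproduced here.

```python
# Generic recipe behind reported accuracy figures, on placeholder data.
# X holds one row per statement and one column per acoustic or lexical
# cue; y holds the known true/false labels. This is NOT the SRI software
# used in the study.
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(0)
X = rng.normal(size=(400, 250))    # 400 statements x 250 cues (random here)
y = rng.integers(0, 2, size=400)   # 1 = truthful, 0 = deceptive

clf = LogisticRegression(max_iter=1000)
scores = cross_val_score(clf, X, y, cv=10)  # 10-fold cross-validation
print(f"mean accuracy: {scores.mean():.2f}")
# Random cues score ~0.50; real cues in the study reached the mid-to-upper .60s.
```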
That may not sound so hot, but it’s a lot better than the commercial speech-based lie-detection systems currently on the market. According to independent researchers, such “voice stress analysis” systems are no more reliable than a coin toss.
It may be a while before industrial-strength emotion and lie detection come to a call center near you. But make no mistake: they are just around the proverbial corner. And they will be preceded by a mounting tide of gadgets that you can talk to, argue with, and intelligently discuss topics with.
Don’t be surprised if, some day soon, your Bluetooth headset tells you to calm down. Or informs you that your last caller was lying through his teeth.
Now that Windows Speech Recognition Macros for Windows Vista™ are in feverish development, both in-house (in the Microsoft Speech Components Group [ listen_+at+_microsoft.com ]) and in the beta group inside the Microsoft Speech Yahoo Technical Group, desktop speech recognition is advancing by leaps and bounds daily.
Powerful WSR macros are already evolving and being used and improved daily; a sample macro definition is sketched just after this list. Already you can build macros that, for example:
Open e-mail messages from a specific (non-Inbox) account with the To: / CC: / BCC: and Subject: fields already completed;
Move large blocks of existing text in and out of specific locations inside different applications;
Navigate and move items in and out of various folders inside Vista Explorer;
Perform spoken database lookups.
It will not be long before speech recognition becomes "what we just use" for most of our daily work and living activities.
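For a taste of what these macros look like under the hood, here is a small, hypothetical example in the WSR Macros XML format, where a command pairs a listen-for phrase with one or more actions. The phrases, addresses, and the mailto: approach to pre-filling the message fields are our own illustrative assumptions, not shipped samples:

```xml
<speechMacros>
  <!-- Hypothetical macro: the spoken phrase starts a new message with
       the To:, CC: and Subject: fields pre-filled via a mailto: URL.
       All addresses and phrases are placeholders. -->
  <command>
    <listenFor>Start the weekly status e-mail</listenFor>
    <run command="mailto:boss@example.com?cc=team@example.com&amp;subject=Weekly%20Status"/>
    <speak>Opening your status message</speak>
  </command>
  <!-- Hypothetical macro: dictating the phrase drops boilerplate text
       at the cursor in whichever application has focus. -->
  <command>
    <listenFor>Insert my sign-off</listenFor>
    <insertText>Best regards, The Team</insertText>
  </command>
</speechMacros>
```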
(A detailed post on the powerful new WSR Macro tool and evolving macros is coming soon; we're gathering data, useful macros, and research to be sure it is both interesting and useful to all types of speech recognition users.)
Labels: artificial intelligence, automated speech recognition, Columbia University, Daniel Hong, Datamonitor, Macros, remote speech recognition, speech recognition, Voice biometrics, Windows Vista