speech recognition software


Summary: Software capable of recognizing dictation and transcribing the spoken words into written text.
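Many of the recognition systems cited below build on mel-frequency cepstral coefficients (MFCCs) as acoustic features (e.g., the MFCC-based baselines discussed in several entries). As background, here is a minimal, illustrative sketch of the standard MFCC pipeline using only NumPy; every function name and parameter value below is our own illustration, not taken from any cited system:

```python
import numpy as np

def hz_to_mel(f):
    return 2595.0 * np.log10(1.0 + f / 700.0)

def mel_to_hz(m):
    return 700.0 * (10.0 ** (m / 2595.0) - 1.0)

def mel_filterbank(n_filters, n_fft, sr):
    # Triangular filters spaced evenly on the mel scale.
    mel_pts = np.linspace(hz_to_mel(0.0), hz_to_mel(sr / 2.0), n_filters + 2)
    bins = np.floor((n_fft + 1) * mel_to_hz(mel_pts) / sr).astype(int)
    fb = np.zeros((n_filters, n_fft // 2 + 1))
    for i in range(1, n_filters + 1):
        l, c, r = bins[i - 1], bins[i], bins[i + 1]
        for k in range(l, c):
            fb[i - 1, k] = (k - l) / max(c - l, 1)
        for k in range(c, r):
            fb[i - 1, k] = (r - k) / max(r - c, 1)
    return fb

def mfcc(signal, sr=16000, n_fft=512, hop=160, n_filters=26, n_ceps=13):
    # Frame the signal, apply a Hamming window, take the power spectrum.
    frames = [signal[i:i + n_fft] for i in range(0, len(signal) - n_fft + 1, hop)]
    power = np.abs(np.fft.rfft(np.array(frames) * np.hamming(n_fft), n_fft)) ** 2 / n_fft
    # Apply the mel filterbank and log-compress the band energies.
    log_e = np.log(power @ mel_filterbank(n_filters, n_fft, sr).T + 1e-10)
    # DCT-II (as an explicit matrix, to keep the example NumPy-only) decorrelates
    # the log energies; the first n_ceps coefficients are the MFCCs.
    n = np.arange(n_filters)
    dct = np.cos(np.pi * np.outer(np.arange(n_ceps), (2 * n + 1) / (2.0 * n_filters)))
    return log_e @ dct.T

# Example: MFCCs of one second of a synthetic 440 Hz tone.
t = np.arange(16000) / 16000.0
coeffs = mfcc(np.sin(2 * np.pi * 440 * t))
print(coeffs.shape)  # prints (97, 13): 97 frames, 13 cepstral coefficients
```

Real recognizers typically add pre-emphasis, liftering, and delta features on top of this, but the frame → spectrum → mel filterbank → log → DCT chain above is the common core.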

Top Publications

  1. Bhan S, Coblentz C, Norman G, Ali S. Effect of voice recognition on radiologist reporting time. Can Assoc Radiol J. 2008;59:203-9 pubmed
    ..61). Overall, VR slightly decreases the reporting efficiency of radiologists. However, efficiency may be improved if English is the radiologist's first language and if a headset microphone, macros, and templates are used. ..
  2. Issenman R, Jaffer I. Use of voice recognition software in an outpatient pediatric specialty practice. Pediatrics. 2004;114:e290-3 pubmed
    ..VRS is an intriguing technology. It holds the possibility of streamlining medical practice. However, the learning curve and accuracy of the tested version of the software limit broad physician acceptance at this time. ..
  3. Pezzullo J, Tung G, Rogg J, Davis L, Brody J, Mayo Smith W. Voice recognition dictation: radiologist as transcriptionist. J Digit Imaging. 2008;21:384-9 pubmed
    ..In summary, in non-academic settings, utilizing radiologists as transcriptionists results in more error-ridden radiology reports and increased costs compared with conventional transcription services. ..
  4. Abu Hasaballah K, James A, Aseltine R. Lessons and pitfalls of interactive voice response in medical research. Contemp Clin Trials. 2007;28:593-602 pubmed
    ..Readers will gain a better understanding of this technology, which will enable them to optimize its usage in clinical research. ..
  5. Rana D, Hurst G, Shepstone L, Pilling J, Cockburn J, Crawford M. Voice recognition for radiology reporting: is it good enough? Clin Radiol. 2005;60:1205-12 pubmed
    ..419 and p=0.814, respectively). VR is a viable reporting method for experienced users, with a quicker overall report production time (despite an increase in the radiologists' time) and a tendency to more errors for inexperienced users. ..
  6. Crawford A, Sikirica V, Goldfarb N, Popiel R, Patel M, Wang C, et al. Interactive voice response reminder effects on preventive service utilization. Am J Med Qual. 2005;20:329-36 pubmed
    ..Study limitations include unknown generalizability of results and possible self-selection. There is justification for more IVR interventions and research to enhance MCO members' preventive service utilization. ..
  7. Kauppinen T, Koivikko M, Ahovuo J. Improvement of report workflow and productivity using speech recognition--a follow-up study. J Digit Imaging. 2008;21:378-82 pubmed publisher
    ..SR was easily adopted and well accepted by radiologists. Our findings encourage the utilization of SR, which improves the productivity and accelerates the workflow with excellent end-user satisfaction. ..
  8. Derman Y, Arenovich T, Strauss J. Speech recognition software and electronic psychiatric progress notes: physicians' ratings and preferences. BMC Med Inform Decis Mak. 2010;10:44 pubmed publisher
    ..Future investigations of this nature should use more participants, a broader range of document types, and compare front- and back-end SR methods. ..
  9. Quint L, Quint D, Myles J. Frequency and spectrum of errors in final radiology reports generated with automatic speech recognition technology. J Am Coll Radiol. 2008;5:1196-9 pubmed publisher
    ..Knowledge of the frequency and spectrum of errors should raise awareness of this issue and facilitate methods for report improvement. ..

More Information


  1. Hawley M, Enderby P, Green P, Cunningham S, Brownsell S, Carmichael J, et al. A speech-controlled environmental control system for people with severe dysarthria. Med Eng Phys. 2007;29:586-93 pubmed
    ..7s versus 16.9s, p<0.001). It is concluded that a speech-controlled ECS is a viable alternative to switch-scanning systems for some people with severe dysarthria and would lead, in many cases, to more efficient control of the home...
  2. Rush A, Bernstein I, Trivedi M, Carmody T, Wisniewski S, Mundt J, et al. An evaluation of the quick inventory of depressive symptomatology and the hamilton rating scale for depression: a sequenced treatment alternatives to relieve depression trial report. Biol Psychiatry. 2006;59:493-501 pubmed
    ..In nonpsychotic MDD outpatients without overt cognitive impairment, clinician assessment of depression severity using either the QIDS-C16 or HRSD17 may be successfully replaced by either the self-report or IVR version of the QIDS. ..
  3. Polur P, Miller G. Investigation of an HMM/ANN hybrid structure in pattern recognition application using cepstral analysis of dysarthric (distorted) speech signals. Med Eng Phys. 2006;28:741-8 pubmed
    ..However, its application as a rehabilitation/control tool to assist dysarthric motor impaired individuals holds sufficient promise...
  4. Polur P, Miller G. Effect of high-frequency spectral components in computer recognition of dysarthric speech based on a Mel-cepstral stochastic model. J Rehabil Res Dev. 2005;42:363-71 pubmed
    ..However, its application as a rehabilitation/control tool to assist dysarthric motor-impaired individuals such as cerebral palsy subjects holds sufficient promise. ..
  5. Hawley M, Cunningham S, Green P, Enderby P, Palmer R, Sehgal S, et al. A voice-input voice-output communication aid for people with severe speech impairment. IEEE Trans Neural Syst Rehabil Eng. 2013;21:23-31 pubmed publisher
    ..These limitations will be addressed in future work. ..
  6. Trachtenbarg D. Press 1 to promote health behavior with interactive voice response. Am J Manag Care. 2006;12:305 pubmed
  7. Rytting C, Brew C, Fosler Lussier E. Segmenting words from natural speech: subsegmental variation in segmental cues. J Child Lang. 2010;37:513-43 pubmed publisher
    ..While robustness to phonetic variability may be intrinsically valuable, this finding needs to be complemented by parallel studies of the actual abilities of children to segment phonetically variable speech. ..
  8. Hart J, McBride A, Blunt D, Gishen P, Strickland N. Immediate and sustained benefits of a "total" implementation of speech recognition reporting. Br J Radiol. 2010;83:424-7 pubmed publisher
    ..Our experience demonstrates the dramatic impact that a well-planned, organisation-wide implementation of SRR can have on radiology service delivery. ..
  9. Garcia A, David G, Chand D. Understanding the work of medical transcriptionists in the production of medical records. Health Informatics J. 2010;16:87-100 pubmed publisher
    ..We conclude with a discussion of the implications of these findings for the design and implementation of SRT systems for the production of medical records and for how the work of MTs can help reduce medical errors. ..
  10. Pouplin S, Roche N, Hugeron C, Vaugier I, Bensmail D. Recommendations and settings of word prediction software by health-related professionals for patients with spinal cord injury: a prospective observational study. Eur J Phys Rehabil Med. 2016;52:48-56 pubmed
    ..It thus seems essential to develop information networks and training to disseminate the results of studies and in consequence possibly improve communication for people with cervical SCI who use such devices. ..
  11. Wang Z, Yang C, Wu W, Fan Y. [Research on speech endpoint detection based on box-coupling generalized dimension]. Sheng Wu Yi Xue Gong Cheng Xue Za Zhi. 2008;25:536-41 pubmed
  12. Jafari A, Almasganj F, Bidhendi M. Statistical modeling of speech Poincaré sections in combination of frequency analysis to improve speech recognition performance. Chaos. 2010;20:033106 pubmed publisher
    ..By the proposed feature set, 5.7% absolute isolated phoneme recognition improvement is obtained against only MFCC-based features. ..
  13. Guo H, Chan A. Approximated mutual information training for speech recognition using myoelectric signals. Conf Proc IEEE Eng Med Biol Soc. 2006;1:767-70 pubmed
    ..Our results show that AMMI training consistently reduces the error rates compared to those of ML training, increasing the accuracy by approximately 3% on average. ..
  14. Lee J, Kang H, Choi J, Son Y. An investigation of vocal tract characteristics for acoustic discrimination of pathological voices. Biomed Res Int. 2013;2013:758731 pubmed publisher
  15. Ury A. Speakeasy. Practices can speed productivity and reduce costs by adding speech recognition to their EHRs. Healthc Inform. 2007;24:57-8 pubmed
  16. Swayne L. Re: Voice recognition in the heat of battle. J Am Coll Radiol. 2008;5:228; author reply 228-9 pubmed publisher
  17. Miller D, Bruce H, Gagnon M, Talbot V, Messier C. Improving older adults' experience with interactive voice response systems. Telemed J E Health. 2011;17:452-5 pubmed publisher
  18. Justice L, Breit Smith A, Rogers M. Data recycling: using existing databases to increase research capacity in speech-language development and disorders. Lang Speech Hear Serv Sch. 2010;41:39-43 pubmed publisher
    ..Researchers invested in addressing basic and applied problems of relevance to speech and language services in schools can make use of a variety of extant databases to increase research capacity. ..
  19. Liu X, Hieronymus J, Gales M, Woodland P. Syllable language models for Mandarin speech recognition: exploiting character language models. J Acoust Soc Am. 2013;133:519-28 pubmed publisher
    ..This supports the hypothesis that character or syllable sequence models are useful for improving Mandarin speech recognition performance. ..
  20. Skowronski M, Harris J. Automatic speech recognition using a predictive echo state network classifier. Neural Netw. 2007;20:414-23 pubmed
    ..The simple training algorithm and noise robustness of the predictive ESN classifier make it an attractive classification engine for automatic speech recognition. ..
  21. Morin R, Langer S. Speech recognition system evaluation. J Am Coll Radiol. 2005;2:449-51 pubmed
  22. Johnson A, Sow D, Biem A. A discriminative approach to EEG seizure detection. AMIA Annu Symp Proc. 2011;2011:1309-17 pubmed
    ..The results strongly suggest the possibility of deploying the designed system at the bedside. ..
  23. Conn J. Not dead yet. New technology has not completely eclipsed the medical transcription industry, but its many problems put the industry at risk. Mod Healthc. 2005;35:38, 40, 42-4 pubmed
  24. Twitchell J. Instant connection: wireless voice communication. Nurs Manage. 2007;38:49-51 pubmed
  25. Fleury A, Noury N, Vacher M, Glasson H, Seri J. Sound and speech detection and classification in a Health Smart Home. Conf Proc IEEE Eng Med Biol Soc. 2008;2008:4644-7 pubmed publisher
    ..We introduce the methods for the sound and speech recognition, the post-processing of the data and finally the experimental results obtained in real conditions in the flat. ..
  26. Tassani S, Baruffaldi F, Testi D, Cacciari C, Accarisi S, Viceconti M. Personal Digital Assistant in an orthopaedic wireless ward: the HandHealth project. Comput Methods Programs Biomed. 2007;86:21-9 pubmed
    ..Medical images can be shown on the device display, but also transferred to a high-resolution monitor. Large amounts of data can be dictated and translated by remote continuous speech recognition. ..
  27. Wolpin S, Berry D, Kurth A, Lober W. Improving health literacy: a Web application for evaluating text-to-speech engines. Comput Inform Nurs. 2010;28:198-204 pubmed publisher
    ..Future avenues of research include exploring more complex tasks, usability issues related to implementing text-to-speech features, and applied health promotion and education opportunities among vulnerable populations. ..
  28. Kokkinakis K, Loizou P. Selective-tap blind signal processing for speech separation. Conf Proc IEEE Eng Med Biol Soc. 2009;2009:3150-3 pubmed publisher
  29. D Haene M, Schrauwen B, Van Campenhout J, Stroobandt D. Accelerating event-driven simulation of spiking neurons with multiple synaptic time constants. Neural Comput. 2009;21:1068-99 pubmed publisher
    ..Moreover, our algorithm is highly independent of the complexity (i.e., number of synaptic time constants) of the underlying neuron model. ..
  30. Scharenborg O. Modeling the use of durational information in human spoken-word recognition. J Acoust Soc Am. 2010;127:3758-70 pubmed publisher
    ..Fine-Tracker thus provides the first computational model of human word recognition that is able to extract durational information from the speech signal and to use it to differentiate words. ..
  31. Walter C. How biometrics keep sensitive information secure. Nursing. 2005;35:76 pubmed
  32. Remez R, Dubowski K, Davids M, Thomas E, Paddu N, Grossman Y, et al. Estimating speech spectra for copy synthesis by linear prediction and by hand. J Acoust Soc Am. 2011;130:2173-8 pubmed publisher
    ..The results show a substantial intelligibility cost of reliance on uncorrected linear prediction estimates when phonemic variation approaches natural incidence. ..
  33. Fraiwan L, Lweesy K, Al Nemrawi A, Addabass S, Saifan R. Voiceless Arabic vowels recognition using facial EMG. Med Biol Eng Comput. 2011;49:811-8 pubmed publisher
    ..The random forest classifier with time frequency features showed the best performance with an accuracy of 77% evaluated using a 10-fold cross-validation. ..
  34. Lancioni G, Singh N, O Reilly M, Sigafoos J, Buonocunto F, Sacco V, et al. Microswitch- and VOCA-assisted programs for two post-coma persons with minimally conscious state and pervasive motor disabilities. Res Dev Disabil. 2009;30:1459-67 pubmed publisher
    ..Implications of the findings for improving the situation of post-coma persons with minimally conscious state and pervasive motor disabilities are discussed. ..
  35. Maniwa K, Jongman A, Wade T. Acoustic characteristics of clearly spoken English fricatives. J Acoust Soc Am. 2009;125:3962-73 pubmed publisher
  36. Ariyaeeinia A, Morrison C, Malegaonkar A, Black S. A test of the effectiveness of speaker verification for differentiating between identical twins. Sci Justice. 2008;48:182-6 pubmed publisher
    ..The paper details the problem in speaker verification posed by identical twins, discusses the experimental investigations and provides an analysis of the results. ..
  37. Choi Y, Lee S. Nonlinear spectro-temporal features based on a cochlear model for automatic speech recognition in a noisy situation. Neural Netw. 2013;45:62-9 pubmed publisher
    ..The proposed features were evaluated with a noisy speech database and showed better performance than the baseline methods such as mel-frequency cepstral coefficients (MFCCs) and RASTA-PLP in unknown noisy conditions. ..
  38. Thielst B, Gardner J. Clinical documentation systems: another link between technology and quality. J Healthc Manag. 2008;53:5-7 pubmed
  39. Chang P, Sheng Y, Sang Y, Wang D, Hsu Y, Hou I. Developing and evaluating a wireless speech-and-touch-based interface for intelligent comprehensive triage support systems. Stud Health Technol Inform. 2006;122:693-7 pubmed
    ..A more flexible interface, not efficiency, might be the main reason for this finding. ..
  40. Tsimhoni O, Smith D, Green P. Address entry while driving: speech recognition versus a touch-screen keyboard. Hum Factors. 2004;46:600-10 pubmed
    ..Applications of this research include the design of in-vehicle navigation systems as well as other systems requiring significant driver input, such as E-mail, the Internet, and text messaging. ..
  41. Gonzalez Pacheco V, Malfaz M, Fernandez F, Salichs M. Teaching human poses interactively to a social robot. Sensors (Basel). 2013;13:12406-30 pubmed publisher
    ..Such a natural way of training enables robots to learn from users, even if they are not experts in robotics. ..
  42. Necas P, Hejna P. [Usage of automatic voice transcription in autopsy service]. Soud Lek. 2011;56:40-2 pubmed
    ..Such improvement involves appropriate vocabulary usage and special vocal adaptation. The role of the autopsy secretary is acknowledged. ..
  43. Stelzle F, Maier A, Nöth E, Bocklet T, Knipfer C, Schuster M, et al. Automatic quantification of speech intelligibility in patients after treatment for oral squamous cell carcinoma. J Oral Maxillofac Surg. 2011;69:1493-500 pubmed publisher
    ..Surgical reconstruction techniques seem to have an impact on speech intelligibility. ..
  44. Chong M, Reed M. Data management. Talking technology. Health Serv J. 2008;Suppl:4 pubmed
  45. Luo X, Fu Q. Speaker normalization for Chinese vowel recognition in cochlear implants. IEEE Trans Biomed Eng. 2005;52:1358-61 pubmed
    ..After speaker normalization, subjects' patterns of recognition performance across speakers changed, demonstrating the potential for speaker-dependent effects with the proposed normalization technique. ..
  46. Roup C, Poling G, Harhager K, Krishnamurthy A, Feth L. Evaluation of a telephone speech-enhancement algorithm among older adults with hearing loss. J Speech Lang Hear Res. 2011;54:1477-83 pubmed publisher
    ..The algorithm has the potential to benefit older adults with SNHL who struggle to communicate via the telephone with or without hearing aids. ..
  47. Ghosh A, Shankar B, Meher S. A novel approach to neuro-fuzzy classification. Neural Netw. 2009;22:100-9 pubmed publisher
    ..All these measures supported the superiority of the proposed NF classification model. The proposed model learns well even with a lower percentage of training data that makes the system fast. ..
  48. Singh R, Allen J. The influence of stop consonants' perceptual features on the Articulation Index model. J Acoust Soc Am. 2012;131:3051-68 pubmed publisher
    ..A detailed analysis of the variance from the AI error is provided along with a Bernoulli-trials analysis of the statistical significance. ..
  49. Rose G, Skelly J, Badger G, Naylor M, Helzer J. Interactive voice response for relapse prevention following cognitive-behavioral therapy for alcohol use disorders: a pilot study. Psychol Serv. 2012;9:174-84 pubmed publisher
    ..03), and both self-efficacy and coping significantly improved from pre-CBT to post-ATIVR (p < .01). Results indicate ATIVR is feasible and acceptable. Its efficacy should be evaluated in a randomized controlled trial. ..
  50. Oard D. Social science. Unlocking the potential of the spoken word. Science. 2008;321:1787-8 pubmed publisher
  51. Midanik L, Greenfield T. Interactive voice response versus computer-assisted telephone interviewing (CATI) surveys and sensitive questions: the 2005 National Alcohol Survey. J Stud Alcohol Drugs. 2008;69:580-8 pubmed
    ..However, reports of nonheterosexual sexual orientation identity remain sensitive for older respondents. Embedding IVR within a telephone interview may provide an effective way of helping assure valid responses to sensitive item content. ..
  52. Georgoulas G, Georgopoulos V, Stylios C. Speech sound classification and detection of articulation disorders with support vector machines and wavelets. Conf Proc IEEE Eng Med Biol Soc. 2006;1:2199-202 pubmed
    ..The proposed method is implemented on a data set where different sets of features and different schemes of SVMs are tested leading to satisfactory performance. ..
  53. Lancioni G, O Reilly M, Singh N, Sigafoos J, Oliva D, Montironi G, et al. Extending the evaluation of a computer system used as a microswitch for word utterances of persons with multiple disabilities. J Intellect Disabil Res. 2005;49:639-46 pubmed
    ..The computer system was useful as a microswitch to enable access to favourite stimuli. There is a need to improve the accuracy of the system with respect to its recognition of the participants' utterances. ..