To the point, when asking “Is YouTube auto-generated captioning good enough?”, I would ask “Good enough for what?”

  *   Good enough for accessibility? Plainly, no. Accessibility generally involves providing equivalent service to someone with a disability as someone without. And a mechanically captioned video is by and large incomprehensible. You might make out a few words here and there, enough to understand the subject, but not enough to comprehend the concepts. There are cases where a 99% accurately captioned video will not be considered “accessible” because the captions keep missing one key word, while in other cases 94% accuracy is more than sufficient. When dealing with YouTube, in the best of cases, you’re looking at 70-80% accuracy.

  *   Good enough for discoverability, the ability to search in videos based on a transcription of what was said? As mentioned by others, this is highly dependent on the accent of the speaker and would only go so far. For instance, good luck trying to get a mechanical captioning system to tell the difference between “atom” and “adam”.

In short, when you’re dealing with unscripted dialogue, conversations, Q&A, lectures, etc., you need a professional transcription service to make the captions accessible. When dealing with a prepared statement or a script with no deviation, if the speaker speaks in a level monotone, with no “uhm” or “uhh”, you just might get away with a mechanical caption. But in such cases, you may already have access to the prepared statement or script, so making the transcription available as a separate file may just be enough.
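For what it's worth, the accuracy percentages being thrown around are usually derived from word error rate (WER): edits needed to turn the machine transcript into the reference transcript, divided by the reference length. A minimal sketch, assuming whitespace-tokenized transcripts (the function name and example sentences are illustrative, not from any standard tool):

```python
def wer(reference: str, hypothesis: str) -> float:
    """Word error rate: (substitutions + deletions + insertions) / reference word count."""
    ref = reference.split()
    hyp = hypothesis.split()
    # Levenshtein distance over words, via dynamic programming.
    d = [[0] * (len(hyp) + 1) for _ in range(len(ref) + 1)]
    for i in range(len(ref) + 1):
        d[i][0] = i
    for j in range(len(hyp) + 1):
        d[0][j] = j
    for i in range(1, len(ref) + 1):
        for j in range(1, len(hyp) + 1):
            cost = 0 if ref[i - 1] == hyp[j - 1] else 1
            d[i][j] = min(d[i - 1][j] + 1,         # deletion
                          d[i][j - 1] + 1,         # insertion
                          d[i - 1][j - 1] + cost)  # substitution or match
    return d[len(ref)][len(hyp)] / len(ref)

# One wrong word in a five-word sentence is a 20% error rate,
# i.e. "80% accuracy" -- even though that one word may be the key term.
print(wer("the atom has a nucleus", "the adam has a nucleus"))  # 0.2
```

Note that WER weights every word equally, which is exactly why a raw percentage can mislead: a 99%-accurate transcript that keeps mangling the one technical term is worse for comprehension than the number suggests.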

Raul Burriel | Streaming Media Coordinator
Academic Technology | Oregon State University
541-737-4546 | [log in to unmask] | @rburriel

From: The EDUCAUSE Instructional Technologies Constituent Group Listserv [log in to unmask] On Behalf Of Michael Berta
Sent: Thursday, February 23, 2017 11:08 AM
To: [log in to unmask]
Subject: [INSTTECH] Are Auto-Generated YouTube Captions Good Enough?

Like many institutions, we are wrestling with the question of captioning the videos that the college and faculty produce (through college-owned YouTube accounts). A lot of the conversation is about the amount of time and the cost of services, but there has been very little in the way of specific data. I began a time study by having multiple people follow a transcription process on several videos ranging from 2 to 22 minutes. The aim, of course, is to provide some hard data on the time and cost of such a task.

The question was posed, "Is YouTube auto-generated captioning good enough?", which really has two parts:

  1.  What is the acceptable error rate/accuracy rate for captioning generally?
  2.  Is it defensible to use only YouTube captioning for videos?

I'd like to know if anyone has any experiences, knowledge, or data they could share. I'm also happy to share my time-study results when they are complete; just let me know.


Michael R. Berta, Ed.D.
Director of Educational Development
Center for Excellence in Teaching and Learning

Daemen College
4380 Main Street
Amherst, NY 14226
T. 716.566.7870
E. [log in to unmask]
********** Participation and subscription information for this EDUCAUSE Constituent Group discussion list can be found at