Google adds automatic captions to YouTube

Good news for closed caption advocates like me! Google, in a significant development for the deaf, announced on Thursday it was adding automatic caption capability to videos on YouTube.

Google said machine-generated captions would initially be available only in English and on videos from 13 YouTube “partner channels” but it hopes to extend the feature eventually to all videos uploaded to the site.

“Google believes that the world’s information should be accessible to everyone,” said Vint Cerf, a Google vice president who has been described as the “Father of the Internet.”

“One of the big challenges of the video medium is whether it can be made accessible to everyone,” said Cerf, who also holds the title of “Chief Internet Evangelist” at Google.

Speaking at Google’s Washington office, Cerf noted that he has a “great personal interest” in the closed caption capability. Cerf, 66, is hearing impaired and has been wearing hearing aids since the age of 13.

Since last year, YouTube users have been able to manually add captions to videos but the feature is not widely used and the vast majority of content on the site does not have captions.

Noting that more than 20 hours of video are uploaded to YouTube every minute, Ken Harrenstien, a deaf Google software engineer, said “the majority of user-generated video content online is still inaccessible to people like me.”

Google uses advanced speech recognition technology to generate the automatic captions on YouTube, and Harrenstien noted that it is not perfect: the phrase “sim card,” for example, came out as “salmon” during one demonstration.

But he said the technology “will continue to improve with time.”

“Today I’m more hopeful than ever that we’ll achieve our long-term goal of making videos universally accessible,” he said in a blog post. “Even with its flaws, I see the addition of automatic captioning as a huge step forward.”

Although the automatic captions can only be generated from videos in English for the moment, they can be simultaneously machine-translated into any of the 51 languages Google supports.

In addition to the automatic captions, Google announced a new feature that will make it easier for users to add captions to their videos.

Called automatic caption timing, it involves creating a transcript of the video and uploading it to YouTube. Speech recognition technology is then used to create captions for the video and insert them at the appropriate points.

“This should significantly lower the barriers for video owners who want to add captions, but who don’t have the time or resources to create professional caption tracks,” Harrenstien said.

Both features will be available in English by the end of the week.

The university partners whose videos will allow automatic captioning include the University of California at Berkeley, the University of California at Los Angeles, Columbia University, Duke, the Massachusetts Institute of Technology, Stanford, Yale and the University of New South Wales in Australia.

National Geographic’s YouTube channel will also include the feature as will most of Google and YouTube’s own channels.

Now, if they can only put automatic captions on our sign language videos… That would be the day! 🙂


Closed Caption bill filed in Philippine Senate

Philippine Senator and action movie star Ramon “Bong” Revilla Jr. filed a landmark bill addressing barriers to access to information. Senate Bill 2872, or the Closed Caption Bill, which he filed last November 12, requires news programs on television to be broadcast with subtitles or captions.

This is welcome news, especially for people like me who actively campaign for equal rights to information through the aid of technology. TV captions are of greater importance because of television's wider audience reach. In my case, I advocate for closed captioning on web videos or films that are shown on the Internet.

Captioning does not benefit only the deaf. Other people, such as those who want to learn a new language, those who are starting to learn how to read and those who are in a noisy environment, would also profit from it.

Although this is a very welcome development, certain issues must be considered in implementing this bill. The most important is the language that will be used. Most schooled Filipino deaf don’t understand the Tagalog vernacular. Since popular news broadcasts are in Tagalog, adding captions won’t be of much help to them. That is why some deaf communities advocate for a sign language inset instead of captions.

I hope the good senator will initiate further consultations, especially with the affected sector. I’m sure the deaf community would be very glad to assist him.

Here is Senator Revilla’s press release:


To ensure equal access of deaf Filipinos to public information, Senator Bong Revilla today filed a bill that would require all television networks to put closed captions in their news programs.

In his speech during the National Conference on Sustainable Partnership for Deaf Transformation held today (Wednesday, November 12, 2008) at the Ople Hall of the Department of Labor and Employment (DOLE), the senator said there are an estimated 4.5 million deaf Filipinos, most of them poor, who have no access to programs that will help them realize their full potential. “This is a very sad reality and government makes it worse by turning a blind eye to this fact. This is our biggest hurdle, and we will transcend this if we, the private sector and the government, work together and share in this responsibility,” said Revilla, chairman of the Senate Committee on Public Information and Mass Media.

Revilla stressed that all Filipinos should have equal rights guaranteed under the Constitution. “One of these rights that particularly elude the deaf is access to information. We must uphold Section 7 of our Constitution that says the right of the people to information on matters of public concern shall be recognized,” he explained.

In pushing for an equal access to public information of deaf and hard of hearing Filipinos, Revilla simultaneously filed Senate Bill 2872 that would oblige all franchise holders or operators of television networks or stations and producers of television news programs to have these news programs broadcast with closed caption.

Closed-captioning refers to the method of subtitling television programs by coding statements as vertical interval data signals that are decoded at the receiver and superimposed at the bottom of the television screen.

Under the bill, any owner or operator of television networks or stations and any producer of television news programs who shall violate the requirement shall be punished by a fine of not less than Fifty Thousand Pesos (P50,000.00) but not more than One Hundred Thousand Pesos (P100,000.00), or by imprisonment of not less than six (6) months but not more than one (1) year, or both such fine and imprisonment, at the discretion of the court.

If the offender is a corporation, partnership or association, or any other juridical person, the president, manager, administrator or the person-in-charge of the management of the business shall be liable therefor. In addition, the license or permit to operate the business shall be canceled.

“The passage of this bill will address the constitutional mandate for the state to recognize the basic right of the people to information on matters of public concern,” Revilla pointed out.


Closed Captioning on the Internet

Here is a summary of the lecture I presented during the centennial anniversary celebration of the Philippine School for the Deaf in December 2007.

Closed captioning is one of the priority recommendations of the Web Accessibility Initiative of the World Wide Web Consortium or W3C (Priority 1-1.3 and 1-1.4) and of the Philippine Web Accessibility Group (Maturity Stage 1-5).

If you have AUDIO content on your website, provide a written transcription of it. For example, if your company’s official anthem can be played from a link on your website, create a separate web page where the lyrics of the song can be read.

The same applies to VIDEO clips. You may include closed captions at the bottom while the video is being played so that a deaf person can understand the conversations.

Now, what is a closed caption?
A closed caption is text, usually displayed at the bottom of a video, that transcribes speech and other relevant sounds. As the video plays, the captions describe all significant audio content and non-speech information, such as the identity of speakers and their manner of speaking, along with music or sound effects, using words or symbols.
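
To make the definition above more concrete, here is a minimal Python sketch of how a single caption could be represented and displayed. It is only an illustration and not how any particular player works. The names CaptionCue, render_cue and active_caption, and the bracket and uppercase conventions, are my own assumptions.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class CaptionCue:
    """One caption: when it is shown and what it says (hypothetical structure)."""
    start_ms: int                   # time the caption appears, in milliseconds
    end_ms: int                     # time the caption disappears
    text: str = ""                  # transcribed speech, if any
    speaker: Optional[str] = None   # identity of the speaker
    manner: Optional[str] = None    # manner of speaking, e.g. "whispering"
    sound: Optional[str] = None     # non-speech sound, e.g. "music"


def render_cue(cue: CaptionCue) -> str:
    """Turn a cue into the line of text a viewer would read."""
    if cue.sound and not cue.text:
        # Pure sound effect or music: show it in brackets.
        return f"[{cue.sound}]"
    prefix = ""
    if cue.speaker:
        prefix = cue.speaker.upper()
        if cue.manner:
            prefix += f" ({cue.manner})"
        prefix += ": "
    elif cue.manner:
        prefix = f"({cue.manner}) "
    line = prefix + cue.text
    if cue.sound:
        line += f" [{cue.sound}]"
    return line


def active_caption(cues: List[CaptionCue], position_ms: int) -> Optional[str]:
    """Return the caption to display at the given playback position, if any."""
    for cue in cues:
        if cue.start_ms <= position_ms < cue.end_ms:
            return render_cue(cue)
    return None


cues = [
    CaptionCue(0, 1000, sound="music"),
    CaptionCue(1000, 3000, text="Good morning, class.",
               speaker="Teacher", manner="whispering"),
]
```

With these sample cues, asking for the caption at 1.5 seconds gives the teacher's line with her name and manner of speaking in front, while a position past 3 seconds gives no caption at all.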

How does it help the persons with disabilities?
Closed captioning, symbolized as CC, allows people who are deaf or hard of hearing to read a transcript of the dialog and other audio in a video, film or other presentation.

Are there other people who can benefit from closed captioning?
Aside from hearing impaired users, other people can benefit from closed captioning. These include:

  • People who want to learn a new language;
  • People who are starting to learn how to read;
  • People who are in a noisy environment.

There are also some hearing people who have Central Auditory Processing Disorder. This means they have trouble separating human voices from background noise and determining the direction of a sound.

What is the difference between Closed Caption and Open Caption?
The term “closed” in closed captioning means that not all viewers see the captions, only those who decode or activate them. This distinguishes them from “open captions,” which all viewers see. Captions that are permanently visible in a video, film, or other medium are called “open”, “burned-in” or “hard coded” captions.

Closed captions should not be confused with subtitles, although the two terms are often used interchangeably. SUBTITLES are what we often see on DVD movies. Like CC, they are also seen at the bottom of the screen during dialogues. However, subtitles can be translations of the dialogue into a foreign language, while closed captions are direct transcriptions of the speech.

How do I put closed captions in video files?
There are two formats that can be used in creating closed captions. One is the Synchronized Accessible Media Interchange or SAMI, developed by Microsoft to be compatible with Windows Media Player and its .wmv format. A separate .smi file is created and synchronized with the video. The other is the Synchronized Multimedia Integration Language or SMIL, a W3C Recommended XML markup language for describing multimedia presentations. SMIL is considered an industry standard. It can be used with other popular non-Microsoft formats like RealPlayer (.rm), Apple QuickTime (.mov) or MPEG (.mpg or .mp4) files, and it also uses a separate .sml file that is synchronized with the video. SMIL works in all browsers including Internet Explorer, while SAMI can be viewed only in IE.
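
To give an idea of what goes into the .smi file mentioned above, here is a small Python sketch that builds SAMI text from a list of timed captions. Treat it as a rough illustration under my own assumptions: the function name build_sami, the class name ENUSCC and the style values are just examples, and the layout follows the commonly documented SAMI structure (SYNC elements with start times in milliseconds), so check it against Microsoft's SAMI documentation before relying on it.

```python
from html import escape


def build_sami(cues, title="Captions", css_class="ENUSCC", lang="en-US"):
    """Build SAMI caption text from (start_ms, end_ms, text) tuples.

    css_class and the style values below are illustrative assumptions.
    """
    lines = [
        "<SAMI>",
        "<HEAD>",
        f"<TITLE>{escape(title, quote=False)}</TITLE>",
        '<STYLE TYPE="text/css"><!--',
        "P { font-family: Arial; color: white; background-color: black; }",
        f".{css_class} {{ Name: English; lang: {lang}; SAMIType: CC; }}",
        "--></STYLE>",
        "</HEAD>",
        "<BODY>",
    ]
    for start_ms, end_ms, text in sorted(cues):
        caption = escape(text, quote=False)
        # Show the caption at its start time...
        lines.append(
            f"<SYNC Start={start_ms}><P Class={css_class}>{caption}</P></SYNC>"
        )
        # ...and clear it at its end time with a non-breaking space.
        lines.append(
            f"<SYNC Start={end_ms}><P Class={css_class}>&nbsp;</P></SYNC>"
        )
    lines += ["</BODY>", "</SAMI>"]
    return "\n".join(lines) + "\n"


demo_cues = [
    (0, 2500, "Welcome to our school."),
    (3000, 6000, "[school anthem plays]"),
]
sami_text = build_sami(demo_cues)
```

The resulting text can then be saved with a .smi extension next to the video. Whether a given player picks it up automatically depends on the player, so test it with the one you are targeting.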

Adobe Shockwave Flash videos can also be closed captioned, but the technique is to embed the captions inside the Flash file itself.

Windows Media Player does not support audio descriptions at this time. So the closed caption appears outside the screen and can only be synchronized when the player is embedded as an object in a web page. RealPlayer and QuickTime formats place the closed caption inside the screen.

There are companies that create closed captions for a fee. However, there is free software that can be used to create closed captions. It’s called Magpie 2.0, created by The Carl and Ruth Shapiro Family National Center for Accessible Media (NCAM). NCAM is a research and development facility dedicated to the issues of media and information technology for people with disabilities in their homes, schools, workplaces, and communities. You can visit their site and download their software at:

In my next blog entry, I will post an example of a closed-captioned video file.
