Captioning Options for Your Online Conference

Many conferences have moved online this year due to the pandemic, and many attendees are expecting captions on videos (both live and recorded) to help them understand the content. Captions can help people who are hard of hearing, but they also help people who are trying to watch presentations in noisy environments and those who lack good audio setups as they are watching sessions. Conferences arguably should have been providing live captions for the in-person events they previously held. But since captions are finally becoming a wider topic of concern, I want to discuss how captions work and what to look for when choosing how to caption content for an online conference.

There was a lot of information that I wanted to share about captions, and I wanted it to be available in one place. If you don’t have the time or desire to read this post, there is a summary at the bottom.

Note: I’m not a professional accessibility specialist. I am a former conference organizer and current speaker who has spent many hours learning about accessibility and looking into options for captioning. I’m writing about captions here to share what I’ve learned with other conference organizers and speakers.

Closed Captions, Open Captions, and Subtitles

Closed captions provide the option to turn captions on or off while watching a video. They are usually shown at the bottom of the video. Here’s an example of one of my videos on YouTube with closed captions turned on.

YouTube video with closed captions turned on and the caption text shown along the bottom. The CC button on the bottom has a red line under it indicating it is on.

The placement of the captions may vary based upon the service used and the dimensions of the screen. For instance, if I play this video full screen on my wide screen monitor, the captions cover some of the content instead of being shown below.

Open captions are always displayed with the video – there is no option to turn them off. The experience with open captions is somewhat like watching a subtitled foreign film.

But despite captions often being referred to colloquially as subtitles, there is a difference between the two. Captions are made for those who are hard of hearing or have auditory processing issues. Captions should include any essential non-speech sound in the video as well as speaker differentiation if there are multiple speakers. Subtitles are made for viewers who can hear and just need the dialogue provided in text form.

For online conferences, I would say that closed captions are preferred, so viewers can choose whether or not to show the captions.

How Closed Captions Get Created

Captions can either be created as a sort of timed transcript that gets added to a pre-recorded video, or they can be done in real time. Live captioning is sometimes called communication access real-time translation (CART).

If you are captioning a pre-recorded video, the captions get created as a companion file to your video. There are several formats for caption files, but the most common I have seen are .SRT (SubRip Subtitle) and .VTT (Web Video Text Tracks). These are known as simple closed caption formats because they are human readable: each caption is plain text with a timestamp (and, in .SRT, a sequence number), and a blank line separates one caption from the next.
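To make the two formats concrete, here is a minimal Python sketch that converts .SRT text to .VTT text. The function name `srt_to_vtt` and the sample captions are made up for illustration. It assumes a well-formed .SRT input and ignores optional .VTT features such as styling and positioning.

```python
import re

# An SRT timestamp looks like 00:00:01,000 (comma before the milliseconds).
SRT_TIMESTAMP = re.compile(r"(\d{2}:\d{2}:\d{2}),(\d{3})")


def srt_to_vtt(srt_text: str) -> str:
    """Convert well-formed SRT caption text to WebVTT text (illustrative helper)."""
    # Normalize line endings, then split on blank lines to get one block per caption.
    blocks = re.split(r"\n\s*\n", srt_text.replace("\r\n", "\n").strip())
    cues = []
    for block in blocks:
        lines = block.split("\n")
        # Drop the SRT sequence number; WebVTT does not require one.
        if lines and lines[0].strip().isdigit():
            lines = lines[1:]
        if not lines:
            continue
        # WebVTT uses a period before the milliseconds instead of a comma.
        lines[0] = SRT_TIMESTAMP.sub(r"\1.\2", lines[0])
        cues.append("\n".join(lines))
    # A WebVTT file starts with a WEBVTT header followed by a blank line.
    return "WEBVTT\n\n" + "\n\n".join(cues) + "\n"


# Made-up sample captions, including a non-speech sound in brackets.
sample_srt = """1
00:00:01,000 --> 00:00:04,000
Welcome to the conference.

2
00:00:04,500 --> 00:00:07,250
[applause]
"""

print(srt_to_vtt(sample_srt))
```

The two formats differ mainly in the header line, the sequence numbers, and the millisecond separator, which is why a conversion like this can stay so short.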

Who Does the Captions

There are multiple options for creating captions. The first thing to understand is that captioning is a valuable service and it costs money and/or time.

In general, there are 3 broad options for creating captions on pre-recorded video:

  • Authors or conference organizers manually create a caption file
  • Presentation software creates a caption file using AI
  • A third-party service creates a caption file with human transcription, AI, or a combination of both

Manually creating a caption file

Some video editing applications allow authors to create caption files. For example, Camtasia provides a way to manually add captions or to upload a transcript and sync it to your video.

Alternatively, there is a VTT Creator that lets you upload your video, write your captions with the video shown so you get the timing right, and then output your .VTT file.
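If you write or edit a .VTT file by hand, a quick structural check can catch a mistyped timestamp before you upload the file. The sketch below is a rough illustration, not a full validator. The function name `check_vtt` and its messages are my own, and it only checks the header line and the timing lines.

```python
import re

# WebVTT timestamp: optional hours, then mm:ss.ttt
TIMESTAMP = r"(?:(\d{2,}):)?([0-5]\d):([0-5]\d)\.(\d{3})"
TIMING_LINE = re.compile(rf"^{TIMESTAMP}\s+-->\s+{TIMESTAMP}(?:\s+.*)?$")


def _to_ms(hours, minutes, seconds, millis) -> int:
    """Convert matched timestamp parts to whole milliseconds."""
    return ((int(hours or 0) * 60 + int(minutes)) * 60 + int(seconds)) * 1000 + int(millis)


def check_vtt(text: str) -> list[str]:
    """Return a list of problems found in WebVTT text (illustrative, not exhaustive)."""
    problems = []
    lines = text.replace("\r\n", "\n").split("\n")
    # The first line must begin with the WEBVTT header (a byte order mark is allowed).
    if not lines[0].lstrip("\ufeff").startswith("WEBVTT"):
        problems.append("line 1: file must start with WEBVTT")
    for number, line in enumerate(lines, start=1):
        # Only timing lines may contain the arrow, so check every line that has one.
        if "-->" not in line:
            continue
        match = TIMING_LINE.match(line.strip())
        if match is None:
            problems.append(f"line {number}: malformed timing line")
            continue
        parts = match.groups()
        if _to_ms(*parts[4:]) <= _to_ms(*parts[:4]):
            problems.append(f"line {number}: end time is not after start time")
    return problems


# Made-up file with an SRT-style comma left in the first timing line.
broken = "WEBVTT\n\n00:00:01,000 --> 00:00:04.000\nWelcome.\n\n00:00:04.500 --> 00:00:07.250\nThanks for joining.\n"
for problem in check_vtt(broken):
    print(problem)
```

A check like this will not tell you whether the captions are accurate, only whether the file is likely to load.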

Another approach is to use speech-to-text software to create a transcript of everything said during the presentation and then edit that transcript into a caption file.

Services like YouTube offer auto-captioning, so if you can upload your recording as a private video and download the resulting caption file, that is a good start. Vimeo also offers automatic captioning. With any of these approaches, you will need to go back through and edit the captions to ensure accuracy.

These are valid approaches when you don’t have other options, but they can be very time consuming and the quality may vary. This might be ok for one short video, but is probably not ideal for a conference.

If you are going to make presenters responsible for their own captions, you need to provide them with plenty of time to create the captions and suggest low-cost ways to auto-generate captions. I’ve seen estimates that it can take up to 5 hours for an inexperienced person to create captions for one hour of content. Please be aware of the time commitment you are requesting of your presenters if you put this responsibility on them.
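To put that estimate in perspective, here is a small back-of-the-envelope calculation in Python. The schedule is hypothetical, and the 5-hours-per-hour figure is the upper-end estimate mentioned above, not a measured value.

```python
def captioning_hours(session_minutes, hours_per_content_hour=5.0):
    """Rough upper-bound estimate of manual captioning effort, in hours."""
    content_hours = sum(session_minutes) / 60
    return content_hours * hours_per_content_hour


# Hypothetical schedule: 20 sessions of 45 minutes each (15 hours of content).
schedule = [45] * 20
print(captioning_hours(schedule))  # 75.0
```

Spread across 20 presenters, that is still close to four hours of extra work each, on top of preparing the talk.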

Captions in Your Presentation Software

Depending on the platform you use, your presentation software might provide AI-driven live captioning services. This is also known as Automatic Speech Recognition (ASR). For example, Teams offers a live caption service. As of today (November 2020), my understanding is that Zoom, GoToMeeting, and GoToWebinar do not offer built-in live caption services. Zoom lets you assign someone to type captions or integrate with a third-party caption service. Zoom and GoToMeeting/GoToWebinar do offer transcriptions of meeting audio after the fact using an AI service.

PowerPoint also offers live captioning via its subtitles feature. My friend Echo made a video and blog post to show the effectiveness of PowerPoint subtitles, which you can view here. There are a couple of things to note before using this PowerPoint feature:

  1. It only works while PowerPoint is in presentation mode. If you have demos or need to refer to a document or website, you will lose captions when you open the document or web browser.
  2. If you are recording a session, your subtitles will be open subtitles embedded into your video. Viewers will not be able to turn them off.
  3. The captions will only capture the audio of the presenter who is running the PowerPoint. Other speakers will not have their voice recorded and will not be included in the captions.

Google Slides also offers live captions. The same limitations noted for PowerPoint apply to Google Slides as well.

Third-Party Caption Services

There are many companies that provide captioning services for both recorded and live sessions. This can be a good route to go to ensure consistency and quality. But not all services are created equal, and quality will vary. For recorded sessions, you send them video files and they give you back caption files (.VTT, .SRT, or another caption file format). They generally charge you per minute of content. Some companies offer only AI-generated captions. Others offer AI- or human-generated captions, or AI-generated captions with human review. Human transcription tends to cost more than AI, but it also tends to be more accurate. That said, I have seen some impressively accurate AI captions. Captions on recorded content are often less expensive than live captions (CART).

Below are a few companies I have come across that offer caption services. This is NOT an endorsement. I’m listing them so you can see examples of their offerings and pricing. Most of them offer volume discounts or custom pricing.

  • Otter.ai – offers AI-generated captions for both recorded and live content, bulk import/export, team vocabulary
  • 3PlayMedia – offers AI-generated and human-reviewed captions for recorded content, AI-generated captions for live content. (Their standard pricing is hidden behind a form, but it’s currently $0.60 per minute of live auto-captioning and $2.50 per minute of closed captions for recorded video.)
  • Rev – offers captions for both recorded and live content, shared glossaries and speaker names to improve accuracy.
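Because these services charge per minute, a rough budget is simple arithmetic. The Python sketch below uses the two 3PlayMedia rates quoted above purely as example numbers. The minute counts are hypothetical, and real quotes may include volume discounts or other fees.

```python
from decimal import Decimal

# Example per-minute rates taken from the list above (November 2020); check current pricing.
LIVE_ASR_RATE = Decimal("0.60")   # live auto-captioning
RECORDED_RATE = Decimal("2.50")   # closed captions for recorded video


def caption_cost(live_minutes: int, recorded_minutes: int) -> Decimal:
    """Estimate the total captioning cost in dollars at the example rates."""
    return live_minutes * LIVE_ASR_RATE + recorded_minutes * RECORDED_RATE


# Hypothetical conference: 600 minutes of live sessions, 900 minutes of recordings.
print(caption_cost(600, 900))  # 2610.00
```

Decimal is used instead of float so the dollar amounts do not pick up rounding noise.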

The Described and Captioned Media Program maintains a list of captioning service vendors for your reference. If you have used a caption service for a conference and want to share your opinion to help others, feel free to leave a comment on this post.

Questions for Conference Organizers to Ask When Choosing a Captioning Vendor

For recorded or live video:

  • What is your pricing model/cost? Do you offer bulk discounts or customized pricing?
  • Where/how will captions be shown in my conference platform? (If it will overlay video content, you need to notify speakers to adjust content to make room for it. But try to avoid this issue where possible.)
  • Is there an accuracy guarantee for the captions? How is accuracy measured?
  • Can I provide a list of names and a glossary of technical terms to help improve the caption accuracy?
  • Does the captioning service support multiple speakers? Does it label speakers’ dialogue to attribute it to the right person?
  • Does the captioning service conform to DCMP or WCAG captioning standards? (Helps ensure quality and usability)
  • How does the captioning service keep my files and information secure (platform security, NDAs, etc.)?
  • What languages does the captioning service support? (Important if your sessions are not all in English)
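On the accuracy question: one common way to measure it is word error rate (WER), the number of word substitutions, insertions, and deletions needed to turn the caption text into a trusted reference transcript, divided by the length of the reference. The Python sketch below shows that calculation. It is an assumption that a given vendor measures accuracy this way, so ask them. The sample sentences are made up, and real evaluations usually also normalize punctuation.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level edit distance between two transcripts, divided by reference length."""
    ref = reference.lower().split()
    hyp = hypothesis.lower().split()
    if not ref:
        raise ValueError("reference transcript must not be empty")
    # prev[j] holds the edit distance between the reference words seen so far and hyp[:j].
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr.append(min(
                prev[j] + 1,         # word missing from the captions (deletion)
                curr[j - 1] + 1,     # extra word in the captions (insertion)
                prev[j - 1] + cost,  # matching or substituted word
            ))
        prev = curr
    return prev[-1] / len(ref)


# Made-up example: one wrong word out of six.
reference = "captions help people follow the talk"
hypothesis = "captions help people follow a talk"
wer = word_error_rate(reference, hypothesis)
print(f"WER: {wer:.1%}, accuracy: {1 - wer:.1%}")
```

A vendor that promises "99% accuracy" may be referring to a number like 1 minus WER, but the test audio and the normalization rules matter a lot, which is why it is worth asking how the figure is measured.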

For recorded video:

  • Does my conference platform support closed captions? (If it doesn’t, then open captions encoded into the video will be required.)
  • What file type should captions be delivered in to be added to the conference platform?
  • What is the required lead time for the captioning service to deliver the caption files?
  • How do I get videos to the caption service?

For captions on live sessions:

  • Does the live caption service integrate with my conference/webinar platform?
  • How do I get support if something goes wrong? Is there an SLA?
  • What is the expected delay from the time a word is spoken to when it appears to viewers?

Further Captioning Advice for Conference Organizers

  • Budget constraints are real, especially if you are a small conference run by volunteers that doesn’t make a profit. Low quality captions can be distracting, but no captions means you have made a decision to exclude people who need captions. Do some research on pricing from various vendors, and ask what discounts are available. You can also consider offering a special sponsorship package where a sponsor can be noted as providing captions for the conference.
  • If you are running a large conference, this should be a line item in your budget. Good captions cost money, but that isn’t an excuse to go without them.
  • If your conference includes both live and recorded sessions, you can find a vendor that does both. You’ll just want to check prices to make sure they work for you.
  • If your budget means you have to go with ASR, make sure to allow time to review and edit closed captions on recorded video.
  • Try to get a sample of the captions from your selected vendor to ensure quality beforehand. If possible for recorded videos, allow speakers to preview the captions to ensure quality. Some of them won’t, but some will. And it’s likely a few errors will have slipped through that can be caught and corrected by the speakers or the organizer team. This is especially important for deeply technical or complex topics.
  • Make sure you have plenty of lead time for recorded videos. If a speaker is a few days late delivering a video, make sure their video can still be captioned and confirm if there is an extra fee.

Final Thoughts and Recap

If you’d like more information about captions, 3PlayMedia has an Ultimate Guide to Closed Captioning with tons of good info. Feel free to share any tips or tricks you have for captioning conference sessions in the comments.

I’ve summarized the info in this post below for quick reference.

Terms to Know

  • Closed captions: captions that can be turned on and off by the viewer
  • Open captions: captions that are embedded into the video and cannot be turned off
  • CART: communication access real-time translation, a technical term for live captioning
  • ASR: automatic speech recognition, use of artificial intelligence technology to generate captions
  • .SRT and .VTT: common closed caption file formats

Choosing a Captioning Solution for Your Conference


Diagram summarizing decision points when choosing a captioning solution. For high budget, choose human generated/reviewed captions from a service. For low budget and moderate time, choose ASR captions. For no budget, choose ASR built into presentation/conference software. Otherwise, someone will need to manually create captions. If you can't provide captions, let viewers know in advance.
This diagram represents general trends and common decision points when choosing a captioning solution. Your specific situation may vary from what is shown here.
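For readers who prefer code to diagrams, the same decision points can be sketched as a small Python function. This is one possible reading of the diagram, not a rule. The `Budget` categories, the parameter names, and the order of the checks are all assumptions made for illustration.

```python
from enum import Enum


class Budget(Enum):
    NONE = "none"
    LOW = "low"
    HIGH = "high"


def suggest_caption_solution(budget: Budget, has_time_to_review: bool,
                             software_has_asr: bool) -> str:
    """Return a starting-point suggestion based on the diagram's decision points."""
    if budget is Budget.HIGH:
        return "human-generated or human-reviewed captions from a service"
    if budget is Budget.LOW and has_time_to_review:
        # ASR output usually needs a review pass, hence the time requirement.
        return "ASR captions from a captioning service"
    if software_has_asr:
        return "ASR built into presentation/conference software"
    # Nothing else is available, so someone has to write the captions by hand.
    return "manually created captions"


print(suggest_caption_solution(Budget.LOW, has_time_to_review=True, software_has_asr=False))
```

If even the last option is out of reach, the diagram's final advice still applies: let viewers know in advance that captions will not be available.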

Summary of Caption Solutions

Manual creation of caption files for recorded sessions
Cost: None
Time/Effort: High
Pros:
• Doesn’t require a third-party integration
• Supports closed captions
• Works no matter what application is shown on the screen
• Works no matter what application is used to record and edit video
Cons:
• Accuracy will vary widely
• Manual syntax errors can cause the file to be unusable

Upload to YouTube, Vimeo or another service that offers free captions
Cost: None to Low
Time/Effort: Medium
Pros:
• Supports closed captions
• Works no matter what application is shown on the screen
• Works no matter what application is used to record and edit video
Cons:
• Not available for live sessions
• Requires editing of captions to achieve acceptable accuracy
• Requires an account with the service and (at least temporary) permission to upload the video
• Accuracy will vary widely

Auto-generated captions in presentation software (e.g., PowerPoint, Google Slides)
Cost: Low
Time/Effort: Low
Pros:
• Works for live and recorded sessions
• No third-party integrations required
Cons:
• Requires that all presenters use presentation software with this feature
• Must be enabled by the presenter
• Won’t work when speaker is showing another application
• Often offers only open captions
• Accuracy may vary
• Often only captures one speaker

ASR (AI-generated) captions from captioning service
Cost: Medium
Time/Effort: Low
Pros:
• Works for live and recorded sessions
• Supports closed captions
• Works no matter what application is shown on the screen
• Works no matter what application is used to record and edit video
Cons:
• Accuracy may vary
• Requires planning to meet lead times for recorded sessions
• Poor viewer experience if delay is too large during live sessions

Human-generated or human-reviewed captions from a captioning service
Cost: High
Time/Effort: Low
Pros:
• Ensures the highest quality with the lowest effort from conference organizers and speakers
• Works for live and recorded sessions
• Works no matter what application is shown on the screen
• Works no matter what application is used to record and edit video
Cons:
• Requires planning to meet lead times for recorded sessions
• Poor viewer experience if delay is too large during live sessions

I hope you find this exploration of options for captions in online conference content helpful. Let me know in the comments if you have anything to add to this post to help other conference organizers.
