Captioning Options for Your Online Conference

Many conferences have moved online this year due to the pandemic, and many attendees are expecting captions on videos (both live and recorded) to help them understand the content. Captions help people who are hard of hearing, but they also help people watching presentations in noisy environments and those who lack good audio setups. Conferences arguably should have been providing live captions at the in-person events they previously held. But since captions are finally becoming a wider topic of concern, I want to discuss how captions work and what to look for when choosing how to caption content for an online conference.

There is a lot of information I wanted to share about captions, and I wanted it all to be available in one place. If you don’t have the time or desire to read this whole post, there is a summary at the bottom.

Note: I’m not a professional accessibility specialist. I am a former conference organizer and current speaker who has spent many hours learning about accessibility and looking into options for captioning. I’m writing about captions here to share what I’ve learned with other conference organizers and speakers.

Closed Captions, Open Captions, and Subtitles

Closed captions provide the option to turn captions on or off while watching a video. They are usually shown at the bottom of the video. Here’s an example of one of my videos on YouTube with closed captions turned on.

YouTube video with closed captions turned on and the caption text shown along the bottom. The CC button at the bottom has a red line under it to indicate the captions are on.

The placement of the captions may vary based on the service used and the dimensions of the screen. For instance, if I play this video full screen on my widescreen monitor, the captions cover some of the content instead of being shown below it.

Open captions are always displayed with the video – there is no option to turn them off. The experience with open captions is somewhat like watching a subtitled foreign film.

But despite captions often being referred to colloquially as subtitles, there is a difference between the two. Captions are made for those who are hard of hearing or have auditory processing issues. Captions should include any essential non-speech sound in the video as well as speaker differentiation if there are multiple speakers. Subtitles are made for viewers who can hear and just need the dialogue provided in text form.

For online conferences, I would say that closed captions are preferred, so viewers can choose whether or not to show the captions.

How Closed Captions Get Created

Captions can either be created as a sort of timed transcript that gets added to a pre-recorded video, or they can be done in real time. Live captioning is sometimes called communication access real-time translation (CART).

If you are captioning a pre-recorded video, the captions get created as a companion file to your video. There are several formats for caption files, but the most common I have seen are .SRT (SubRip Subtitle) and .VTT (Web Video Text Tracks). These are known as simple closed caption formats because they are human readable: each caption is plain text preceded by a sequence number and/or a timestamp range, with a blank line between captions.
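To illustrate, here is a minimal .VTT file (the timestamps and text below are made up for this example, not taken from any specific tool). An .SRT file looks very similar but numbers each cue and uses commas before the milliseconds:

    WEBVTT

    00:00:01.000 --> 00:00:04.500
    Welcome to my session on captioning options.

    00:00:04.500 --> 00:00:08.000
    [upbeat intro music]

Each cue is just a start time, an end time, and the caption text, which is why these files are easy to read and edit by hand.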

Who Does the Captions

There are multiple options for creating captions. The first thing to understand is that captioning is a valuable service and it costs money and/or time.

In general, there are 3 broad options for creating captions on pre-recorded video:

  • Authors or conference organizers manually create a caption file
  • Presentation software creates a caption file using AI
  • A third-party service creates a caption file with human transcription, AI, or a combination of both

Manually creating a caption file

Some video editing applications allow authors to create caption files. For example, Camtasia provides a way to manually add captions or to upload a transcript and sync it to your video.

Alternatively, there is a VTT Creator that lets you upload your video, write your captions with the video shown so you get the timing right, and then output your .VTT file.

Another approach is to use speech-to-text software to create a transcript of everything said during the presentation and then edit that transcript into a caption file.

Services like YouTube offer auto-captioning, so if uploading your session as a private or unlisted video and retrieving the auto-generated caption file from there is an option, that is a good start. Vimeo also offers automatic captioning. With either service, you will need to go back through and edit the captions to ensure accuracy.
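As a rough sketch of that workflow (assuming the video is already uploaded as unlisted and you have youtube-dl installed; the URL below is a placeholder), you can download YouTube’s auto-generated captions as a .VTT file without downloading the video itself:

    # fetch only the auto-generated English captions, not the video
    youtube-dl --write-auto-sub --sub-lang en --skip-download "https://youtu.be/VIDEO_ID"

The resulting .VTT file can then be edited for accuracy and delivered alongside your video to the conference platform.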

These are valid approaches when you don’t have other options, but they can be very time consuming and the quality may vary. This might be ok for one short video, but is probably not ideal for a conference.

If you are going to make presenters responsible for their own captions, you need to provide them with plenty of time to create the captions and suggest low-cost ways to auto-generate captions. I’ve seen estimates that it can take up to 5 hours for an inexperienced person to create captions for one hour of content. Please be aware of the time commitment you are requesting of your presenters if you put this responsibility on them.

Captions in Your Presentation Software

Depending on the platform you use, your presentation software might provide AI-driven live captioning, also known as Automatic Speech Recognition (ASR). For example, Teams offers a live caption service. As of today (November 2020), my understanding is that Zoom, GoToMeeting, and GoToWebinar do not offer built-in live caption services. Zoom does allow you to designate someone to type captions or to integrate with a third-party caption service. Zoom and GoToMeeting/GoToWebinar also offer AI-generated transcriptions of meeting audio after the fact.

PowerPoint also offers live captioning via its subtitles feature. My friend Echo made a video and blog post to show the effectiveness of PowerPoint subtitles, which you can view here. There are a couple of things to note before using this PowerPoint feature:

  1. It only works while PowerPoint is in presentation mode. If you have demos or need to refer to a document or website, you will lose captions when you open the document or web browser.
  2. If you are recording a session, your subtitles will be open subtitles embedded into your video. Viewers will not be able to turn them off.
  3. The captions will only capture the audio of the presenter who is running the PowerPoint. If there are other speakers, their words will not be picked up and will not appear in the captions.

Google Slides also offers live captions. The same limitations noted for PowerPoint apply to Google Slides as well.

Third-Party Caption Services

There are many companies that provide captioning services for both recorded and live sessions. This can be a good route to ensure consistency and quality, but not all services are created equal – quality will vary. For recorded sessions, you send them video files and they give you back caption files (.VTT, .SRT, or another caption file format). They generally charge per minute of content. Some companies offer only AI-generated captions; others offer AI- or human-generated captions, or AI-generated captions with human review. Human transcription tends to cost more than AI, but it also tends to be more accurate – though I have seen some impressively accurate AI captions. Captions on recorded content are often less expensive than live captions (CART).

Below are a few companies I have come across that offer caption services. This is NOT an endorsement. I’m listing them so you can see examples of their offerings and pricing. Most of them offer volume discounts or custom pricing.

  • Otter.ai – offers AI-generated captions for both recorded and live content, bulk import/export, team vocabulary
  • 3PlayMedia – offers AI-generated and human-reviewed captions for recorded content, AI-generated captions for live content. (Their standard pricing is hidden behind a form, but it’s currently $0.60 per minute of live auto-captioning and $2.50 per minute of closed captions for recorded video.)
  • Rev – offers captions for both recorded and live content, shared glossaries and speaker names to improve accuracy.

The Described and Captioned Media Program maintains a list of captioning service vendors for your reference. If you have used a caption service for a conference and want to share your opinion to help others, feel free to leave a comment on this post.

Questions for Conference Organizers to Ask When Choosing a Captioning Vendor

For recorded or live video:

  • What is your pricing model/cost? Do you offer bulk discounts or customized pricing?
  • Where/how will captions be shown in my conference platform? (If it will overlay video content, you need to notify speakers to adjust content to make room for it. But try to avoid this issue where possible.)
  • Is there an accuracy guarantee for the captions? How is accuracy measured?
  • Can I provide a list of names and a glossary of technical terms to help improve the caption accuracy?
  • Does the captioning service support multiple speakers? Does it label speakers’ dialogue to attribute it to the right person?
  • Does the captioning service conform to DCMP or WCAG captioning standards? (Helps ensure quality and usability)
  • How does the captioning service keep my files and information secure (platform security, NDAs, etc.)?
  • What languages does the captioning service support? (Important if your sessions are not all in English)

For recorded video:

  • Does my conference platform support closed captions? (If it doesn’t, then open captions encoded into the video will be required – see the ffmpeg sketch after this list.)
  • What file type should captions be delivered in to be added to the conference platform?
  • What is the required lead time for the captioning service to deliver the caption files?
  • How do I get videos to the caption service?
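If you do find yourself needing open captions encoded (burned) into a video, here is a minimal sketch using ffmpeg (input.mp4 and captions.srt are placeholder filenames, and this assumes an ffmpeg build with subtitle support; your platform or vendor may require a different approach):

    # burn the captions permanently into the picture (open captions)
    ffmpeg -i input.mp4 -vf "subtitles=captions.srt" open_captions.mp4

    # or, if the platform plays a separate subtitle track, mux the captions as a toggleable track
    ffmpeg -i input.mp4 -i captions.srt -c copy -c:s mov_text closed_captions.mp4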

For captions on live sessions:

  • Does the live caption service integrate with my conference/webinar platform?
  • How do I get support if something goes wrong? Is there an SLA?
  • What is the expected delay from the time a word is spoken to when it appears to viewers?

Further Captioning Advice for Conference Organizers

  • Budget constraints are real, especially if you are a small conference run by volunteers that doesn’t make a profit. Low quality captions can be distracting, but no captions means you have made a decision to exclude people who need captions. Do some research on pricing from various vendors, and ask what discounts are available. You can also consider offering a special sponsorship package where a sponsor can be noted as providing captions for the conference.
  • If you are running a large conference, this should be a line item in your budget. Good captions cost money, but that isn’t an excuse to go without them.
  • If your conference includes both live and recorded sessions, you can find a vendor that does both. You’ll just want to check prices to make sure they work for you.
  • If your budget means you have to go with ASR, make sure to allow time to review and edit closed captions on recorded video.
  • Try to get a sample of captions from your selected vendor beforehand to check quality. If possible for recorded videos, allow speakers to preview their captions. Some of them won’t, but some will. And it’s likely a few errors will have slipped through that can be caught and corrected by the speakers or the organizer team. This is especially important for deeply technical or complex topics.
  • Make sure you have plenty of lead time for recorded videos. If a speaker is a few days late delivering a video, make sure their video can still be captioned and confirm if there is an extra fee.

Final Thoughts and Recap

If you’d like more information about captions, 3PlayMedia has an Ultimate Guide to Closed Captioning with tons of good info. Feel free to share any tips or tricks you have for captioning conference sessions in the comments.

I’ve summarized the info in this post below for quick reference.

Terms to Know

  • Closed captions: captions that can be turned on and off by the viewer
  • Open captions: captions that are embedded into the video and cannot be turned off
  • CART: communication access real-time translation, a technical term for live captioning
  • ASR: automatic speech recognition, use of artificial intelligence technology to generate captions
  • .SRT and .VTT: common closed caption file formats

Choosing a Captioning Solution for Your Conference


Diagram summarizing decision points when choosing a captioning solution. For high budget, choose human generated/reviewed captions from a service. For low budget and moderate time, choose ASR captions. For no budget, choose ASR built into presentation/conference software. Otherwise, someone will need to manually create captions. If you can't provide captions, let viewers know in advance.
This diagram represents general trends and common decision points when choosing a captioning solution. Your specific situation may vary from what is shown here.

Summary of Caption Solutions

Manual creation of caption files for recorded sessions
Cost: None
Time/Effort: High
Pros:
• Doesn’t require a third-party integration
• Supports closed captions
• Works no matter what application is shown on the screen
• Works no matter what application is used to record and edit video
Cons:
• Accuracy will vary widely
• Manual syntax errors can cause the file to be unusable

Upload to YouTube, Vimeo or another service that offers free captions
Cost: None to Low
Time/Effort: Medium
Pros:
• Supports closed captions
• Works no matter what application is shown on the screen
• Works no matter what application is used to record and edit video
Cons:
• Not available for live sessions
• Requires editing of captions to achieve acceptable accuracy
• Requires an account with the service and (at least temporary) permission to upload the video
• Accuracy will vary widely

Auto-generated captions in presentation software (e.g., PowerPoint, Google Slides)
Cost: Low
Time/Effort: Low
Pros:
• Works for live and recorded sessions
• No third-party integrations required
Cons:
• Requires that all presenters use presentation software with this feature
• Must be enabled by the presenter
• Won’t work when speaker is showing another application
• Often offers only open captions
• Accuracy may vary
• Often only captures one speaker

ASR (AI-generated) captions from captioning service
Cost: Medium
Time/Effort: Low
Pros:
• Works for live and recorded sessions
• Supports closed captions
• Works no matter what application is shown on the screen
• Works no matter what application is used to record and edit video
Cons:
• Accuracy may vary
• Requires planning to meet lead times for recorded sessions
• Poor viewer experience if delay is too large during live sessions

Human-generated or human-reviewed captions from a captioning service
Cost: High
Time/Effort: Low
Pros:
• Ensures the highest quality with the lowest effort from conference organizers and speakers
• Works for live and recorded sessions
• Works no matter what application is shown on the screen
• Works no matter what application is used to record and edit video
Cons:
• Requires planning to meet lead times for recorded sessions
• Poor viewer experience if delay is too large during live sessions

I hope you find this exploration of options for captions in online conference content helpful. Let me know in the comments if you have anything to add to this post to help other conference organizers.


I’m Speaking at Virtual PASS Summit 2020

PASS Summit has gone virtual this year, but that isn’t keeping PASS from delivering a good lineup of speakers and activities. I’m excited to be presenting a pre-con and two regular sessions this year. I know virtual delivery changes the interaction between audience and speaker, and I’m going to do everything I can to make my sessions more than just standard lecture and demo to keep things interesting.

Building Power BI Reports that Communicate Insights and Engage People (Pre-Con)

If you are into Power BI or data visualization, check out my pre-con session. It’s called Building Power BI Reports that Communicate Insights and Engage People. Unless we’ve had data visualization training, the way we learn to make reports is by copying reports that others have made. But that assumes other people were designing intentionally for human consumption. Another issue is that we often mimic example reports from tool vendors. That can be very helpful with the technical aspects of getting content on the page, but we often overlook the design aspects of reports that can make or break their usability and effectiveness in communicating information. My pre-con will begin with a discussion of how humans interpret data visualizations and how you can use that to your advantage to make better, more consumable visualizations. We’ll take those lessons and apply them specifically to Power BI and then add some tips and tricks. Throughout the day, there will be hands-on exercises and opportunities for group conversation. And you’ll receive some resources to take with you to help you continue to improve your report designs.

Agenda slide from my pre-con session: 1) Defining Success, 2) Message & Story, 3) Designing a Visual, 4) Refine Your Report, 5) Applied Power BI, 6) Power BI Tricks, 7) Wrap-Up
Agenda for my PASS Summit pre-con titled Building Power BI Reports that Communicate Insights and Engage People

This session is geared toward people who have at least basic familiarity with Power BI Desktop (if you can populate a bar chart on a report page, that’s good enough). If you have never opened Power BI Desktop, we might move a little fast for you, but you are welcome to join us and give it a try. If you are pretty good with Power BI Desktop but want to improve your data visualization skills, this session could also be a good fit for you. I hope you’ll register and join my pre-con.

Implementing Data-Driven Storytelling Techniques in Power BI

Data storytelling is a popular concept, but the techniques to implement storytelling in Power BI can be a bit elusive, especially when you have data values that change as the data is refreshed. In this session, we’ll talk about what is meant by story. Then I’ll introduce you to tool-agnostic techniques for data storytelling and show you how you can use them in Power BI. We’ll also discuss the visual hierarchy within a page and how that affects your story. You can view my session description here.

Inclusive Presentation Design

I’m also delivering a professional development session for those of us that give presentations. Most speakers have good intentions and are excited to share their knowledge and perspective, but we often exclude audience members with our presentation design. Join me in this session to discuss how to design your presentation materials with appropriate content formatted to maximize learning for your whole audience. You’ll gain a better understanding of how to enhance your delivery to make an impact on those with varying abilities to see, hear, and understand your presentation. You can view my presentation description here.

Other Pre-Cons from My Brilliant Co-Workers

If you aren’t into report design, my DCAC coworkers are delivering pre-cons that may interest you.

Denny Cherry is doing a pre-con session on Microsoft Azure Platform Infrastructure.

John Morehouse is talking about Avoiding the Storms When Migrating to Azure.

I hope you’ll join one of us for a pre-con as well as our regular sessions. With PASS Summit being virtual, the lower price and removal of travel requirements may make this conference more accessible to some who haven’t been able to attend in past years. Be sure to get yourself registered and spread the word to colleagues.


Power Up: Exploring the Power BI Ecosystem, May 27-28

Next week I’m speaking at the Dynamic Communities Power Up event titled “Exploring the Power BI Ecosystem”. It takes place on May 27 & 28, 2020. This 2-day virtual event is designed to ensure attendees have a complete view of the Power BI product and surrounding ecosystem, provide expanded knowledge of the core components, and showcase the possibilities for continued exploration and innovation.

Sessions during the event are 2.5 hours long, to really give you time to get into a topic. There are healthy 45-minute breaks between sessions to give you time to attend to personal matters. And the sessions are recorded to give you a chance to catch anything you miss. Some sessions, including mine, offer a take-home exercise to help solidify concepts discussed during the session.

I’m presenting Data Visualization and Storytelling on May 28 at 9am EDT/1pm UTC. In this session, you will learn how to build eye-catching Power BI reports that support decision making. You will also learn about the importance of data storytelling and see a realistic approach to implementing it.

The following topics will be showcased through practical examples:

  • Creating beautiful reports: prioritizing your KPIs, playing with colors, grid
  • Choosing the best chart to illustrate your point
  • Introduction to the concept of Data Storytelling
  • Implementing quality checks on your report design
  • Implementing navigation in your report: bookmarks, drill-through, page-report tooltips, interactive Q&A

This training is a paid event, but it’s just $399 for the full 2 days. This training is great if you are a beginner-to-intermediate Power BI user trying to round out your skills across the many areas of the Power BI suite. You can head over to the website to register. I hope to see you there!


I Presented with Live Captioning and Sign Language Interpreters

I had the pleasure of presenting a full-day pre-conference session on the Friday before SQLSaturday Austin-BI last weekend. I could spend paragraphs telling you how enjoyable and friendly and inclusive the event was. But I’d like to focus on one really cool aspect of my speaking experience: I had both closed captioning and sign language interpreters in my pre-con session.

First, let’s talk about the captions. While PowerPoint does have live captions/subtitles, that only works when you are using PowerPoint. When you show a demo or go to a web page, taking PowerPoint off the screen, you lose that ability. So we had a special setup provided by Shawn Weisfeld (Twitter|GitHub).

How the Live Captions Worked

A presenter uses a lavalier mic that sends audio to the Epiphan Pearl. The presenter's computer sends video to the Epiphan Pearl. The Epiphan Pearl sends audio to a computer that sends the audio to Azure and receives captions. That computer overlays the captions above the image from the presenter's laptop. That is all sent to the projector.
Technology setup at SQLSaturday Austin- BI Edition 2020 that provided live captions

The presenter connects their laptop to the Epiphan Pearl with an HDMI cable so they can send the video (picture) from the laptop. The speaker wears a lavalier microphone, which sends audio to the Pearl. The transcription green screen computer takes audio from the Pearl, sends it to Azure to be transcribed using Cognitive Services, and overlays the returned transcription text on a green screen input that is sent back to the Pearl. The projector gets the combined output of the transcription text and the presenter’s computer video output.
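I don’t have the actual code from this setup, but as a minimal sketch of the transcription piece (assuming Python and the Azure Cognitive Services Speech SDK; the key and region values are placeholders), continuous recognition from a microphone looks roughly like this:

    import azure.cognitiveservices.speech as speechsdk

    # placeholders – use your own Speech resource key and region
    speech_config = speechsdk.SpeechConfig(subscription="YOUR_KEY", region="YOUR_REGION")
    audio_config = speechsdk.audio.AudioConfig(use_default_microphone=True)
    recognizer = speechsdk.SpeechRecognizer(speech_config=speech_config, audio_config=audio_config)

    # "recognizing" fires with partial results as you speak; the real setup rendered
    # this text over the green screen input rather than printing it to the console
    recognizer.recognizing.connect(lambda evt: print(evt.result.text))

    recognizer.start_continuous_recognition()
    input("Press Enter to stop...")
    recognizer.stop_continuous_recognition()

This only shows the flow of audio in and text out; the production setup also handled the green screen overlay and the video mixing through the Pearl.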

You can see an example of what it looked like from my presentation on Saturday in the tweet below. There are lots more pictures of it on Twitter with the #SqlSatAustinBI hashtag.

While this setup requires a bit more hardware, it worked so well! It took about 10 minutes to get it set up in the morning. As the speaker, I didn’t have to do anything but wear a mic. It transcribed everything I said regardless of what program my laptop was showing. There was very little lag. It seemed to be less than one second between when I would say something and when we would see it on the screen. While I try to speak clearly and slowly, sometimes I slip and fall back into speaking quickly. But the transcription kept up well. Some attendees said it was great to have the captions up on the screen to help them understand what I said when I occasionally spoke too quickly. The captions are placed at the top of the screen, above the image coming from my laptop, so I didn’t have to adjust my slides or anything to allow space for the captions.

The live captions were a big success. They helped not only people who had trouble hearing, but also those who spoke English as a second language and those who weren’t familiar with some of the terms I used and needed to see them spelled out.

Presenting With Sign Language Interpreters

This was my first time presenting with sign language interpreters to help communicate with my audience. Since the pre-con session lasted multiple hours, there were two interpreters in my room. They would switch places about every hour. They were kind enough to answer a few questions for me during breaks.

I asked them if it was difficult to sign all the technical terminology used and if they tried to study up on terms ahead of time. One of them told me that they don’t study the subject and they fingerspell all the technical terms. Most of my terms were spelled on my slides, and I saw the interpreter look at the slide to get the spelling. When someone asked a question about the font I was using, the interpreter asked me to spell it out, since it wasn’t written anywhere. I asked if having printed slides helped (I provided PDFs of the slides to the attendees at the beginning of the session). One of the interpreters told me no, because they were already watching the signer for questions and watching my slides and listening to me.

What I loved most about having the interpreters there was that the person using the service got to fully participate in the session. They asked questions and made comments like anyone else. And they participated in hands-on small group activities.

Check out this great photo of one of the interpreters in action during a small group activity.

5 people sit in a group at a table while a sign language interpreter sits across the table and helps the group communicate
Photo of small group activities during my Power BI pre-con with a sign language interpreter in the group. Photo by Angela Tidwell

Having ASL interpreters didn’t require any extra effort on my part. I didn’t have to practice with them beforehand or provide them with any of my conference materials. They were great professionals and were able to keep up with me through lecture, demos, small group exercises, and Q&A.

Sign language interpreters cost money. And they should – they provide a valuable service. In this case, the interpreters were provided by the State of Texas because the person using the service worked for the state government. Because this was training for their job, the person’s employer was obligated to provide this service. So we were lucky that it didn’t cost us anything.

While the SQLSaturday organizers were coordinating the ASL interpreters, they found out that there is a fund in Texas that can help with accessibility services when a person’s employer doesn’t/can’t provide them. It may not be the same in every state, but it’s definitely something to look into if you need to pay for interpreters for an event like this.

Make Your Next Event More Accessible

I have organized events, and I understand the effort that it requires. I’m so happy that Angela and Mike made the effort to make SQLSaturday Austin-BI a more inclusive event. I would like to challenge you to do the same for the next event you organize or the next presentation you give at a tech conference.

Your conference may not be able to afford the Epiphan Pearl (note: the original model we used is discontinued, but there is a new model) and the Azure costs. I’d like to see SQLSaturdays join together and purchase equipment and share across events – it would be great if PASS would help with this. If we can’t do that, we could always start small with the built-in capabilities in PowerPoint and work our way up from there.

It was a great experience as a speaker and as an audience member to have the live captions. And I was so happy that someone wanted to attend my session and was making the effort to sign up and request the ASL interpreters. I hope we see more of that in the future. But we need to do our part to let people know that we welcome that and we will work to make it happen.
