Open Intro Statistics Videos
David
**Answer**
Yes, our videos' captions have been manually reviewed and edited.
**For anyone else needing to caption their own videos**
We found the following process very efficient. We used it last year on about a dozen videos that didn't yet have manual captions:
1) Use YouTube's auto-captioning to get an initial draft.
2) Run the auto-captions through Gemini with a prompt to fix obvious typos, add punctuation, and apply proper capitalization. Here's the exact prompt we settled on after iterating on a handful of videos:
"""
Please clean up the following caption text. Include proper punctuation and fix typos, but do not change the wording unless there is a clear grammatical error or incorrect word. Add line breaks to make new paragraphs, but do not add headers to the output, keep everything as regular text, do not break out equations separately, and do not add images or video. For any spot where an Em dash is suitable, use two hyphens in place of the Em dash. Because this is caption text for deaf listeners, do not drop text unless it appears to be a transcription error.
Following the instructions above, add a space before and after any instances of two consecutive hyphens.
"
[raw YouTube auto-generated transcript]
"
"""
This cleaned up about 90% of the errors in the output of YouTube's default auto-captioning system, in addition to adding capitalization and punctuation.
3) Manually review the captions to catch and fix the last 10-20% of errors.
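For anyone who would rather script steps 2 and 3 than paste text by hand, here is a minimal Python sketch. It is an illustration, not the exact tooling we used: `build_prompt` and `changed_words` are hypothetical helper names, and the call to Gemini itself is left out because it depends on whether you use the web app or an API client. The first helper wraps the raw transcript in the prompt above; the second lists the wording changes between the raw and cleaned text (ignoring case and punctuation), which gives the manual review in step 3 a short list of spots to check first.

```python
import difflib

# Instruction text copied from the prompt quoted in step 2.
INSTRUCTIONS = (
    "Please clean up the following caption text. Include proper punctuation "
    "and fix typos, but do not change the wording unless there is a clear "
    "grammatical error or incorrect word. Add line breaks to make new "
    "paragraphs, but do not add headers to the output, keep everything as "
    "regular text, do not break out equations separately, and do not add "
    "images or video. For any spot where an Em dash is suitable, use two "
    "hyphens in place of the Em dash. Because this is caption text for deaf "
    "listeners, do not drop text unless it appears to be a transcription "
    "error.\n"
    "Following the instructions above, add a space before and after any "
    "instances of two consecutive hyphens."
)


def build_prompt(raw_transcript: str) -> str:
    """Return the instructions followed by the transcript wrapped in quotes."""
    return f'{INSTRUCTIONS}\n"\n{raw_transcript.strip()}\n"'


def _words(text: str) -> list[str]:
    """Lowercase the words and strip punctuation so only wording is compared."""
    stripped = (w.strip(".,;:!?\"'()-").lower() for w in text.split())
    return [w for w in stripped if w]


def changed_words(raw: str, cleaned: str) -> list[tuple[str, str]]:
    """Return (raw, cleaned) word runs that differ between the two texts.

    Added punctuation, capitalization, and " -- " separators are ignored,
    so what remains are replaced, dropped, or inserted words.
    """
    a, b = _words(raw), _words(cleaned)
    matcher = difflib.SequenceMatcher(a=a, b=b, autojunk=False)
    return [
        (" ".join(a[i1:i2]), " ".join(b[j1:j2]))
        for tag, i1, i2, j1, j2 in matcher.get_opcodes()
        if tag != "equal"
    ]


if __name__ == "__main__":
    raw = "the mean is the some of the values divided by n"
    cleaned = "The mean is the sum of the values, divided by n."
    prompt = build_prompt(raw)  # paste or send this to the model
    for before, after in changed_words(raw, cleaned):
        print(f"raw: {before!r} -> cleaned: {after!r}")
```

An entry with an empty second element means the model dropped words, which is the case the prompt tries hardest to prevent, so those are worth checking against the audio first.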
Interestingly, step #2 is *exactly* the kind of application that an LLM is perfect for: an LLM is essentially a word-prediction algorithm that works from the surrounding text, so it does a really good job when given a good prompt.
Best,
David
lnussdor
Thank you for sharing your process for creating captions.
I used it this morning on one of my videos and it worked beautifully.
Lisa
lnussdor
Hello,
I am using the Open Intro Org videos with my course. Do you know whether the captions on YouTube are auto-captioned? This is coming up for me because of the Title II requirements for manually edited captioning. Thank you!