Videos and OL — Quick turnarounds with the help of AI? Yes, sir!
If you read our last post, you know we are interested in artificial intelligence. This week we want to talk a little bit about how we’re actually using artificial intelligence within the Digital Academy to solve a very real problem or — as marketing people would say — embrace “an opportunity”.
Like all federal organizations, the Digital Academy has an obligation to publish everything in both official languages. We do this because the law requires it, and because it is the right thing to do from an inclusion perspective.
As you can imagine, it’s a non-trivial exercise. We decided to test the artificial intelligence built into YouTube to help us accelerate the process, and we thought you might be interested to know more.
We think we can take content from beginning to end, all the way to pressing publish, in 48 hours. Think we’re crazy? Let us convince you.
Getting a transcript — AI and human touches
We start by shooting a video, then upload it to YouTube. YouTube’s natural language processing will generate an automatic transcript. So far, so good. Except that we know those transcripts are only about 80% to 85% accurate. YouTube knows that too, so it gives us the ability to edit the transcript right there in the platform’s editor and make it 100% right. For example, we can correct punctuation and capitalization, but also some of the words that YouTube just doesn’t recognize. At the end of that process, we have a complete, quality transcript.
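For readers who want to put a number on that "80% to 85%" impression, here is a minimal sketch of one way to compare an automatic transcript with its corrected version. It is our own illustration, not how YouTube scores its transcripts: the function names `tokenize` and `word_accuracy` are hypothetical, and the measure (the share of corrected words that also appear, in order, in the automatic transcript) is just one reasonable choice among several.

```python
import difflib
import re


def tokenize(text):
    # Lowercase and keep only word characters, so that punctuation and
    # capitalization fixes are not counted as misrecognized words.
    return re.findall(r"[\w']+", text.lower())


def word_accuracy(auto_text, corrected_text):
    """Share of words in the corrected transcript that the automatic
    transcript already had right (an illustrative measure only)."""
    auto_words = tokenize(auto_text)
    corrected_words = tokenize(corrected_text)
    if not corrected_words:
        return 1.0 if not auto_words else 0.0
    matcher = difflib.SequenceMatcher(
        None, auto_words, corrected_words, autojunk=False
    )
    matched = sum(block.size for block in matcher.get_matching_blocks())
    return matched / len(corrected_words)


if __name__ == "__main__":
    auto = "we upload it to you tube"
    corrected = "We upload it to YouTube."
    print(f"{word_accuracy(auto, corrected):.0%}")
```

Because punctuation and capitalization are stripped before comparing, this only counts words that were actually misheard, which is usually the slower part of the editing work.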
Closed captioning on original video… and the beginning of a translation
Once we have a quality transcript, we can turn it into high-quality closed captioning for people who choose to read the content as they’re watching the video.
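In our case YouTube does this step for us, but if you are curious what a caption file looks like underneath, the sketch below builds one in SubRip (.srt), a common plain-text caption format. The cue data and the function names `srt_timestamp` and `to_srt` are hypothetical and only there for illustration.

```python
def srt_timestamp(seconds):
    # SubRip timestamps look like HH:MM:SS,mmm (comma before milliseconds).
    total_ms = round(seconds * 1000)
    hours, rest = divmod(total_ms, 3_600_000)
    minutes, rest = divmod(rest, 60_000)
    secs, ms = divmod(rest, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d},{ms:03d}"


def to_srt(cues):
    """Turn (start_seconds, end_seconds, text) cues into SubRip text."""
    blocks = []
    for index, (start, end, text) in enumerate(cues, start=1):
        blocks.append(
            f"{index}\n{srt_timestamp(start)} --> {srt_timestamp(end)}\n{text}\n"
        )
    # A blank line separates consecutive cues.
    return "\n".join(blocks)


if __name__ == "__main__":
    # Made-up cues, for illustration only.
    cues = [
        (0.0, 2.5, "Hello, and welcome."),
        (2.5, 5.0, "Today we talk about captions."),
    ]
    print(to_srt(cues))
```

Each cue is a number, a time range, and the text to display, which is why a clean transcript makes for clean captions.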
During this step, we can ask YouTube to generate real-time translations into other languages, including French, of course. No, we’re not going to pretend that this translation is good enough. Our guess is that the translation is probably 80% to 85% accurate, but it’s not ridiculous and it’s kind of fun to check out.
At this point, we have a final English video with closed captioning, a transcript of that video, and a less-than-perfect French translation, which led us to test another avenue.
Killing two deliverables with one script
To test how much content we can create in 48 hours, we decided to turn the video into a blog post. We handed the English transcript to a human writer, because a raw script is actually not a very interesting read. It needs the magic touch of a skilled writer.
The same script was run through translation software to generate a French-language script. For the purposes of our experiment, we’re testing DeepL, a real-time translation tool that gives us a very good script to hand to the narrator of the French video.
This French script is meant as speaking points, not as something to be read word for word. It just gives the narrator the background on what we’re trying to accomplish, but they do the video the way they want to do it.
Once the French video is recorded
We generate the French-language transcript (with what was actually said) and edit it the same way we did for the English video, that is, through YouTube’s natural language processing and editing tool. Then we generate quality closed captioning.
The final French transcript is given to a human writer, who will produce the French blog post. You read that right: the blog post is not translated from the finished English blog post, but is rather a rewritten, polished version of the French transcript.
Beginning to end: 48 hours. Dare we say: “Not bad!”
At the end of this experiment, we have two videos. We have audio for both. We have two blog posts. We have closed captioning and real-time translation.
As you know, this can usually take weeks, if not longer. So it feels like we’ve had a real breakthrough here. We plan to experiment more with this process in the coming weeks. We’ll tell you all about our experiments as we test more applications of AI and refine the way we generate our content in multiple languages.
Note: This blog post was generated by a human being from the transcript, in less than one hour.