
Using AI to explore the future of news audio

Radio reaches more Americans every week than any other platform. Public radio stations in the United States employ over 3,000 local journalists, who create daily audio news reports about the communities they serve. But news audio today is where newspaper articles were in the 1990s: hard to find, and difficult to sort by topic, source, relevance or recency. News audio cannot afford to delay improving its discoverability.

KQED is the most-listened-to public radio station in the United States, and one of the largest news organizations in the Bay Area. In partnership with Google, KQED and KUNGFU.AI, an AI services provider and leader in applied machine learning, ran a series of tests on KQED's audio to determine how we might reduce transcription errors, shorten the time to publish our news audio transcripts, and ultimately make radio news audio more findable.

"One of the pillars of the Google News Initiative is incubating new approaches to difficult problems," said David Stoller, Partner Lead for News & Publishing at Google. "Once complete, this technology and associated best practices will be openly shared, greatly expanding the anticipated impact."


What makes finding audio so much harder?

For news audio to be searched or sorted, the speech must first be converted to text. This added step is trickier than it seems, and it currently puts news audio at a disadvantage for being found quickly and accurately. Transcription takes time, effort and bandwidth from newsrooms, none of which are in abundance these days. Even though there have been great advances in speech-to-text technology, when it comes to news, the bar for accuracy is very high. As someone who works to make KQED's reporting widely available, I find it frustrating when KQED's audio isn't prominent in search engines and news aggregators.
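To make that prerequisite concrete, here is a minimal sketch of the transcription step in Python, using the open-source Whisper model purely as an illustration; the article doesn't name the specific speech-to-text tooling KQED tested, and the file name below is hypothetical.

import whisper  # open-source speech-to-text; pip install openai-whisper

# Load a small, general-purpose model (larger models trade speed for accuracy).
model = whisper.load_model("base")

# Transcribe a local audio file (hypothetical path).
result = model.transcribe("kqed_news_segment.mp3")

# The plain text is what search engines can actually index and rank.
print(result["text"])

Once the audio exists as text, it can be indexed, sorted and searched like any written article.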


The challenge of correctly identifying who, what and where

For our tests, KQED and KUNGFU.AI applied the latest speech-to-text tools to a collection of KQED's news audio. News stories try to address the "five Ws": who, what, when, where and why. Unfortunately, because AI typically lacks the context in which the speech was made (i.e. the identity of the speaker, or the location of the story), one of the most difficult challenges of automated speech-to-text is correctly identifying these kinds of proper nouns, known as named entities. KQED's local news audio is rich in references to named entities: topics, people, places and organizations specific to the Bay Area. Speakers use shorthand like "CHP" for the California Highway Patrol and "the Peninsula" for the area spanning San Francisco to San Jose. These are more difficult for artificial intelligence to identify.
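A common way to help a recognizer with local vocabulary is speech adaptation: supplying phrase hints that bias the model toward terms it should expect. The sketch below shows the idea with the Google Cloud Speech-to-Text client; the phrase list, boost value and storage URI are illustrative assumptions, not a description of the pipeline KQED actually ran.

from google.cloud import speech  # pip install google-cloud-speech

client = speech.SpeechClient()

# Bias recognition toward Bay Area entities the model tends to mis-hear.
# The phrases and boost strength here are illustrative.
context = speech.SpeechContext(
    phrases=["CHP", "California Highway Patrol", "the Peninsula", "KQED"],
    boost=15.0,
)

config = speech.RecognitionConfig(
    encoding=speech.RecognitionConfig.AudioEncoding.LINEAR16,
    sample_rate_hertz=16000,
    language_code="en-US",
    speech_contexts=[context],
)

# Hypothetical bucket and file name.
audio = speech.RecognitionAudio(uri="gs://example-bucket/news_segment.wav")
response = client.recognize(config=config, audio=audio)

for result in response.results:
    print(result.alternatives[0].transcript)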

When named entities aren't understood, machine learning models make their best guess at what was said. For example, in our test, "The Asia Foundation" was incorrectly transcribed as "age of Foundations," and "misgendered" was incorrectly transcribed as "Miss Gendered." For news publishers, these are not just transcription errors but editorial problems that change the meaning of a story and can embarrass the news outlet. This means people have to go in and correct these transcriptions, which is expensive to do for every audio segment. Without transcriptions, search engines can't find these stories, limiting the amount of quality local news people can find online.
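Until the models themselves improve, one stopgap is a post-processing pass that applies fixes editors have already made for known error patterns. This is a minimal sketch assuming a hand-maintained substitution table built from corrections like the ones above; it is not part of the workflow the article describes.

import re

# Known mis-transcriptions and their editor-approved fixes
# (pairs drawn from the examples above).
CORRECTIONS = {
    "age of Foundations": "The Asia Foundation",
    "Miss Gendered": "misgendered",
}

def apply_corrections(transcript: str) -> str:
    """Replace known error patterns before a transcript is published."""
    for wrong, right in CORRECTIONS.items():
        transcript = re.sub(re.escape(wrong), right, transcript, flags=re.IGNORECASE)
    return transcript

print(apply_corrections("A report from age of Foundations found..."))
# -> "A report from The Asia Foundation found..."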

Illustration: a proposed transcription workflow in which a human corrects mistakes in the model's first pass, and those corrections feed back into the model to make future transcriptions clearer and more accurate.

A machine learning ↔ human ↔ machine learning feedback loop

At KQED, our editors can correct common machine learning errors in our transcripts. But right now, the machine learning model isn’t learning from its mistakes. We’re beginning to test out a feedback loop in which newsrooms could identify common transcription errors to improve the machine learning model. 
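In practice, such a loop could be as simple as logging every editor correction and promoting recurring fixes into phrase hints for the next transcription run (as in the speech adaptation sketch above). The file name, format and threshold below are hypothetical, one possible shape for the loop rather than KQED's actual implementation.

import json
from collections import Counter
from pathlib import Path

# Hypothetical append-only log of editor fixes.
CORRECTIONS_LOG = Path("corrections_log.jsonl")

def log_correction(heard: str, corrected: str) -> None:
    """Record what the model heard and what the editor changed it to."""
    with CORRECTIONS_LOG.open("a") as f:
        f.write(json.dumps({"heard": heard, "corrected": corrected}) + "\n")

def build_phrase_hints(min_count: int = 3) -> list[str]:
    """Promote frequently corrected entities into hints for future runs."""
    counts = Counter()
    with CORRECTIONS_LOG.open() as f:
        for line in f:
            counts[json.loads(line)["corrected"]] += 1
    return [phrase for phrase, n in counts.items() if n >= min_count]

# Editors log fixes as they review transcripts...
log_correction("age of Foundations", "The Asia Foundation")
# ...and recurring corrections become phrase hints for the next run.
print(build_phrase_hints(min_count=1))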

We're confident that in the near future, improvements to these speech-to-text models will help convert audio to text faster, ultimately helping people find audio news more effectively.



by Tim Olson, KQED, via The Keyword
