Editor’s note: This article is sponsored by BeyondWords, a company that offers text-to-speech publishing services for newsrooms and other businesses. In his partner column, co-founder and COO James MacLeod explores why deploying AI to convert text articles into audio is often more efficient than hiring human narrators.
As audiences spend more time listening on the move, audio articles have emerged as a crucial tool in news habit formation. Consequently, many digital publishers are using this format to drive advertising and subscription revenue – and as a valuable sonic branding tool.
While publishers like The Economist, Zetland, and Tortoise provide human narration, many are using voice AI to make their journalism listenable. Others, such as The Irish Times, use a combination of both.
The best approach depends on your newsroom’s setup, resources, and goals. But with ongoing developments in synthetic speech, the balance increasingly tips in AI’s favour.
Danish newspaper Zetland has offered journalist-read versions of its stories since 2017. Members consume articles via audio, rather than text, an incredible 80% of the time.
The publication believes the passion in the writer’s voice “invites you closer in a different way than reading does”, creating a campfire-like atmosphere that promotes engagement.
However, this personability and emotional intensity come at a significant cost. Zetland pays a speech coach to train each journalist, and recording and editing a 15-minute story takes around an hour.
The Economist enlists external broadcasters to record its audio edition, which deputy editor Tom Standage describes as a “very effective retention tool”. But he previously noted that delivering it is “quite a challenge logistically”.
For most publishers, however, human narration simply isn’t feasible. The pace of the news cycle and the need to update articles after publication make this option impossible to sustain.
That’s why more are turning to voice AI.
Integration with BeyondWords means audio versions of articles are published near-instantly and at low cost. There are no additional steps in the publishing workflow, and no delays for listeners. Plus, audio is automatically updated if the written story develops.
Customizable natural language processing (NLP) algorithms ensure text is spoken aloud in a natural manner. Irrelevant sections are automatically stripped and ambiguous elements, such as abbreviations, are converted into their most listener-friendly forms.
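To make the idea concrete, here is a minimal sketch of that kind of text preparation. It is not BeyondWords' actual pipeline: the function name `prepare_for_speech`, the abbreviation table, and the skip patterns are all hypothetical, and a production system would rely on far richer, context-aware rules.

```python
import re

# Hypothetical abbreviation table (an assumption for illustration only).
ABBREVIATIONS = {
    "COO": "chief operating officer",
    "approx.": "approximately",
    "e.g.": "for example",
}

# Hypothetical patterns for lines that should not be read aloud,
# such as share buttons, subscription prompts, and cookie notices.
SKIP_PATTERNS = [
    re.compile(r"^share now$", re.IGNORECASE),
    re.compile(r"^subscribe to ", re.IGNORECASE),
    re.compile(r"cookies", re.IGNORECASE),
]


def prepare_for_speech(article_text: str) -> str:
    """Drop non-article lines and expand abbreviations before synthesis."""
    spoken_lines = []
    for line in article_text.splitlines():
        stripped = line.strip()
        if not stripped:
            continue  # ignore blank lines
        if any(p.search(stripped) for p in SKIP_PATTERNS):
            continue  # strip sections irrelevant to listeners
        for abbr, expansion in ABBREVIATIONS.items():
            # Match the abbreviation only as a standalone token.
            pattern = r"(?<!\w)" + re.escape(abbr) + r"(?!\w)"
            stripped = re.sub(pattern, expansion, stripped)
        spoken_lines.append(stripped)
    return "\n".join(spoken_lines)
```

Given a page containing a "Share now" button, a sentence mentioning "the COO", and a cookie notice, this sketch would keep only the sentence and read "COO" as "chief operating officer".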
Neural voices are more realistic than ever, and a wide selection makes it easy to find a voice that resonates. The Irish Times, for instance, uses an Irish-accented English voice to better engage its regional listeners and ensure superior pronunciation of local terms.
Digital editor Paddy Logue says many listeners don’t realise it’s not a human that’s speaking.
Better yet, there is the option to develop a voice clone. This allows publishers to harness the engagement power of a particular speaker’s voice (such as a journalist’s) – without endless hours in front of the microphone. All that’s required is an upfront script recording, which is used to train the voice model.
Netwerk24, for example, worked with BeyondWords to create a lifelike voice clone of Afrikaans broadcaster Sue Pyler. As deputy site editor Kelly Anderson says, “it’s much more engaging to listen to a voice that sounds like our brand”.
Like human narration, the voice clone gives audio articles a distinctive and consistent sound, which strengthens sonic branding and boosts listener engagement. Unlike human voice-over, it delivers results instantly, tirelessly, and at scale.
Karl Oskar Teien, director of product at Aftenposten, believes delivery at scale is crucial: “If all of a sudden the audio appears in some articles, but not all of them it can cause frustration. We want to give our readers the ability to play every article.”
Aftenposten worked with BeyondWords to create a voice clone of one of its podcast hosts. It has recorded an average listening duration of 75% on its resulting audio articles, showing just how well synthetic speech can keep users engaged.
Developments in quality and tooling, combined with growing audio demand, mean that voice AI now makes business sense. As David Tvrdon wrote: “You should include an audio version of your articles. There are no excuses anymore.”
If you’re interested in making your articles listenable with BeyondWords, book a meeting with the team. We can arrange a demo and propose a plan suited to your newsroom’s unique needs.
Co-Founder and COO at BeyondWords