Scroll’s AI journey: Progress, challenges and some key learnings

2024-03-04. Scroll, an independent Indian news organisation, began exploring ways to incorporate Artificial Intelligence into its workflow in 2021. The digital-only organisation wanted to adapt its long-form reportage to different formats and video seemed to be its best bet to reach a large audience.

by Neha Gupta neha.gupta@wan-ifra.org | March 4, 2024

This decision aligned with projections from a Bain & Company report, which indicated that short-form video is poised to grow to 600-650 million users in India by 2025.

“This was an incredible revelation for us. We are a very small newsroom and the idea was to get a slice of that audience,” said Sannuta Raghu, Lead – AI Working Group for News & Journalism at Scroll Media.

Language constraints in tech: The URL-to-MP4 journey

Scroll has developed a URL-to-MP4 tool that extracts media from a text article, allows the user to add any additional media, and generates an MP4 (video) file ready to be posted on social channels – all in less than three minutes.

This tool enables journalists to condense articles spanning 500 to 2,000 words into concise 30-second videos.

“The generated video needs to be convincing while also upholding the standards of Scroll’s journalism,” Raghu said. “However, initially, these videos only sounded convincing, but didn’t make much sense.”

In the past year, leveraging insights from nearly 200 million contributors to ChatGPT, Scroll has made significant headway with its tool, achieving enhanced contextualisation of text. 

Although the tool has not reached its full potential, instances of hallucinations – erroneous or misleading AI-generated results – have notably decreased.

That said, although the technology exists, Scroll faces a significant challenge in scaling the tool for Indian audiences.

India is home to 22 official languages, but the URL-to-MP4 tool serves only English, German and, partially, Hindi, which is one of the country’s official languages alongside English.

“Since LLMs tend to have regional Indian languages as low resource data, there’s not enough training material at this point,” Raghu said. 

“There’s a very evident American and Western European bias in most popular LLMs. We’ve been testing edge cases and the LLM has performed extremely well in English and German, but fails to understand Hindi and other Indian regional languages,” she added.

Navigating AI integration in newsrooms

Raghu recommends starting formal AI integration by assembling a diverse, multi-stakeholder working group that spans company verticals. Journalists or editors should then secure buy-in from a key decision-maker to proceed.

“In 2018 and 2021, when we were applying for grants, AI was not a topic of conversation in newsrooms. However, the moment you have a key decision maker on your side, and it aligns with the goals of your company, the next few steps become simple,” she said.

Raghu strategically framed the proposal to align with the company’s mission.

“I was aware of the problems we were facing and the goals we were chasing. Traffic was a goal, but we also couldn’t afford to hire more people. So how do you reach larger audiences and more languages with very few people on the job? That was the hurdle we were faced with,” she said. 

Two years into dabbling with the technology, the company is still cautiously optimistic.

“We don’t know what turn this technology is going to take and it is fair for people to be sceptical about it. But at the same time, can we cautiously but optimistically go ahead and try a few things and see what they do for us?” she asked.

And that’s exactly what Scroll has done. The company is conscious of not letting the technology affect its reportage and journalism, and is using it mainly as a traffic tool.

A horde of AI tools in the works

In addition to the URL-to-MP4 tool, here are a few more projects under development at Scroll:

  • A Large Language Model (LLM)-based research tool and chatbot for querying 10 years of Scroll’s journalism in natural language and for fact-checking. It is an internal tool, not meant for external users.
  • An LLM-based research tool to query large datasets and documents from credible government sources. Raghu pointed out that while the accuracy of this tool is not ideal and the model is not able to reason, it extracts information quite well. 
  • A proof of concept for an internal style guide-based autocorrect tool. 

“Our ecosystem approach is intended to make the lives of our journalists easy. The idea is that each of these tools will one day sit within our CMS,” Raghu added. 

Where Scroll was earlier producing two videos a day, AI has enabled the organisation to put out an additional 10, achieving volume and scale, which in turn has brought in more views.

Optimal scenarios for AI usage

Raghu also listed a few early decisions the company took on internal use cases of AI.

  • No to adapting the style of a particular writer to generate related or unrelated text copy in their style.
  • No to photo-realistic avatars of journalists and event recreations.
  • Yes to illustrations and illustrated avatars (not in the likeness of a human – dead or alive).
  • Yes to summarisation, brainstorming, classification, extraction, SEO optimisation in headlines.
  • Yes to re-writing non-core, non-original, non-reported text.

Key learnings in a small newsroom

Here are some of Raghu’s recommendations for other news publishers based on Scroll’s AI experiments:

  • Do an intention vs resource audit first: Identify what your big needle movers could be if you were to use AI. 
  • “Human in the loop” is crucial, but it is going to increase workload initially: Train and support your frontline staff, in this case, Scroll’s social media team.
  • Don’t rush to hit publish.
  • Data in = data out: Don’t expect magic. “If you have bad data, it is going to produce a bad result,” said Raghu.
  • Red teaming – the act of ethical hackers authorised by the organisation to emulate real attackers’ tactics – is very helpful.
  • Get a diverse group of colleagues to build a “use of AI” policy (with the information at hand).

Next up: Raghu’s team at Scroll is developing a bridge product to connect users and journalists, and is trying to work out how AI integration can deliver a net positive user experience.

Join our next AI Unlocked webinar on 26 March, which will address Artificial Intelligence, Trust and Audience.

Neha Gupta

Multimedia Journalist

neha.gupta@wan-ifra.org
