AI is an increasingly hot topic among journalists. Opinions are divided, with two broad camps – those who fear the “robots” will take over and lead to job losses, and those who believe AI will create new opportunities, albeit probably not quite yet.
We spoke to an AI research scientist to get their take. Our questions were simple. Will those machines still need our input or are we done here? How can we make sure that we deploy them in the best possible ways?
Mirabelle Jones has studied creative technology with a focus on ethics and social justice for 10 years and produced works critically investigating the intersection of ethics and artificial intelligence. We discussed different AI tools being used in journalism, and how to work with them in a transparent, ethical, and responsible way.
This interview has been edited and condensed for clarity. Note: An example of an AI-generated text can be found at the end of this article.
AP: You wear many hats, one of which is story-telling. You studied creative writing. How does AI fit into storytelling practices and techniques?
MJ: Some of my earliest work with book art was around gesture recognition, and creative writing based on gesture recognition. It brought up questions of how computers see and interpret that and also (a huge question) of whether they can produce creative output.
AP: Writing calls for creativity but its role in reporting and writing fiction is different. Furthermore, much of the news content nowadays is about curating and replicating existing texts by creating mash-ups, memes, or lists…
MJ: That is very true. I think there is a subset of researchers interested in producing creative content but most people, by far, are looking for realistic, applicable uses of this technology – more like a reliable article, identifying cancer cells etc. which, increasingly, it can do.
AP: What does reliable mean though? Assuming you are using AI to help you gather and analyse data, can it do so objectively?
MJ: There is a difference between how the public defines AI and what a data scientist would tell you AI is. The public usually describes AI as artificial intelligence and assumes there is some actual intelligence inside a machine that is producing content. This is not correct.
You asked me to look at two AI tools for journalists that turn datasets into stories, Automated Insights and Narrative Science. I find it interesting how these companies explain (or don’t explain) machine learning.
Automated Insights says “Natural language generation (NLG) is a technology that transforms data into clear, human-sounding narratives–for any industry and application.” The word “transforms” here is really interesting. It makes it sound like there is a magical process involved.
I feel that using such language to discuss AI and ML (machine learning), as if they were a form of magic, allows companies to skirt offering thorough explanations or taking responsibility for their product.
This is what Kate Crawford and Alexander Campolo call “enchanted determinism” [talking about technology as if it was magic – Editor]. This is not new in terms of how companies discuss AI but it is increasingly used for software services in which AI or ML is part of the product. So no, it is not objective.
AP: How does AI work then?
MJ: Artificial Intelligence usually refers to technologies which aren’t inherently “intelligent” but instead use a learning process to become “smarter.”
The learning process might teach an algorithm how to automatically identify patterns or features. For example, an algorithm might be trained to identify potentially cancerous tissue in medical scans. The learning process usually involves taking a big chunk of data and dividing it into a training set and a testing set. In our cancer example, you’d train an algorithm to identify the cancerous tissue and then you’d test it on the other half of the images to see how well it performs but also to update it and improve it. Once trained, this produces a model that can be deployed and used on other images.
There are ethical concerns that can arise all along the way from the selection of the data set to the labeling of data, to which algorithms are used and how, to finally how the model is deployed and used.
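The training and testing split described above can be sketched in a few lines of Python. This is only an illustration under loud assumptions: the data is randomly generated rather than taken from medical scans, each "image" is reduced to a single made-up measurement, and the names `split_dataset`, `train_threshold` and `accuracy` are hypothetical helpers, not part of any tool mentioned in this interview.

```python
import random


def split_dataset(examples, test_fraction=0.5, seed=0):
    """Shuffle labeled examples, then divide them into a training and a testing set."""
    shuffled = list(examples)
    random.Random(seed).shuffle(shuffled)
    n_test = int(len(shuffled) * test_fraction)
    return shuffled[n_test:], shuffled[:n_test]


def train_threshold(training_set):
    """'Learn' a cut-off: the midpoint between the average measurement of each class."""
    positives = [x for x, label in training_set if label == 1]
    negatives = [x for x, label in training_set if label == 0]
    mean_pos = sum(positives) / len(positives)
    mean_neg = sum(negatives) / len(negatives)
    return (mean_pos + mean_neg) / 2


def accuracy(threshold, dataset):
    """Share of examples where 'measurement above threshold' matches the label."""
    correct = sum((x > threshold) == (label == 1) for x, label in dataset)
    return correct / len(dataset)


# Hypothetical data: (measurement, label), where label 1 stands for "suspicious".
rng = random.Random(42)
data = [(rng.gauss(2.0, 1.0), 0) for _ in range(100)]
data += [(rng.gauss(5.0, 1.0), 1) for _ in range(100)]

training_set, testing_set = split_dataset(data)
threshold = train_threshold(training_set)  # the "model" is just this one number

print(f"learned threshold: {threshold:.2f}")
print(f"accuracy on unseen testing set: {accuracy(threshold, testing_set):.0%}")
```

The point of the sketch is that the model is scored only on examples it never saw during training. Everything it "knows" comes from the labeled training examples, which is where the ethical questions about data selection and labeling enter.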
AP: What is labeling and how do you select a dataset?
MJ: Data sets are created by people and sometimes by other machine learning programs. A large and often tedious step in the creation of a data set is image labeling.
If a model is to be trained to, say, predict from a photo whether or not someone is trustworthy, the data scientist needs a data set of labeled images that includes examples of “trustworthy” and “untrustworthy” people – some awful [bigoted] examples can be found in the Awful AI GitHub repo.
This labeling is often initially done by people, often through services like Mechanical Turk and by people paid very little for their work and paid per labeled image – thus with an incentive to work fast rather than carefully. The opinions and views of these people are translated into the resulting labeled data.
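To make concrete how annotators' opinions end up in the data, here is a minimal sketch of majority voting, one common way of combining several people's labels into a single "ground truth" label. It is an assumption for illustration only: the interview does not say which aggregation method any particular data set used, and the photo names, votes and the `aggregate` helper are all hypothetical.

```python
from collections import Counter

# Hypothetical votes from three paid annotators per photo.
annotations = {
    "photo_001": ["trustworthy", "trustworthy", "untrustworthy"],
    "photo_002": ["untrustworthy", "untrustworthy", "untrustworthy"],
    "photo_003": ["trustworthy", "untrustworthy", "untrustworthy"],
}


def aggregate(votes):
    """Return the most common label and the share of annotators who chose it."""
    counts = Counter(votes)
    label, top_count = counts.most_common(1)[0]  # ties are not handled specially
    return label, top_count / len(votes)


labeled_data = {image: aggregate(votes) for image, votes in annotations.items()}

for image, (label, agreement) in labeled_data.items():
    print(f"{image}: {label} (agreement {agreement:.0%})")
```

Once the votes are collapsed into one label, the disagreement between annotators disappears from the data set. A model trained on `labeled_data` treats a two-to-one split opinion exactly like a unanimous one.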
AP: Many advocates for AI in newsrooms emphasise the opportunity to create personalised content for specific audiences. Do we need more specialised and personalised content? Won’t it just contribute to info bubbles and polarisation?
MJ: I was a journalist for a while and writing articles at some point became really brainless work where I was asked to produce articles in an F-shaped format [aligned to how people scan articles, which is in an “F-shaped” pattern, with hooks on the top and left-side – Editor]. As many photos as possible and easy to read captions. This is obviously a big shift from creating critical contributions that relate the news in a non-biased way but also get people to consider certain things.
Now that doesn’t seem to happen as much. We’re based in a system of creating content, as you mentioned – distribute the content, more is better, get the clicks. That has a lot to do with the current financing model for journalism.
Newspapers don’t exist anymore. In a way, having an AI journalist would inevitably feed this model because you would be able to create more content faster, and it wouldn’t necessarily matter whether it was well written, but it would get you that SEO clickability.
AP: And you could have your own bot to assist readers in deciding what the algorithm should feed them next… BBC opened its chatbot code to help other media outlets create their own chatbots to “reach audiences who engage infrequently with our news content”.
MJ: Their belief, as they explain it, is that a less formal and more conversational style will be less off-putting to younger readers who are hungry for accurate, but accessible, news coverage. I think that’s an interesting angle. I wonder if we shouldn’t focus more in schools on teaching students to use critical thinking skills and how to read news articles, rather than accommodating short attention spans by switching the format of journalism.
I think the best articles are the ones where you really see the attitudes and opinions of the author in a way that is interesting and a lot of that interest is based on a world view and experiences within their community. That is a part of journalism that I think will be lost.
AP: What would be your ethical advice for newsrooms deploying or experimenting with AI? What should they focus on? What should be regulated centrally?
MJ: As a company you should examine what it is that you are trying to do and at which points you can have ethical human intervention. We are all ethical beings, we all have ethics involved in our daily interactions, and there are certain things that will stand out as ethical concerns to all of us, as humans.
So to some extent you can rely upon having a human editor look over an article that has been crafted by, say, GPT3 [Generative Pre-trained Transformer 3 – a language model that uses deep learning to produce human-like text – Editor] and observe where there might be ethical considerations.
You also could consider what data you are feeding your model. Having a two-point check system is a good route.
It gets complicated because a lot of news put out by human journalists these days is very polarised, not necessarily grounded in truth but in the attention economy and the clickability it brings. In a way, the concerns around GPT3 aren’t necessarily being applied to journalists and newsrooms at the moment.
AP: Can AI help us control our biases and fact-check content written by reporters? Associated Press created a tool to examine user-generated content back in 2017, there is also FullFact, a solution that helps spot fake news and “combat misinformation”.
MJ: I would be very interested in seeing this tool in action and learning more about what counts as a “credible source.” Does it include independent press or first-hand accounts? I’m thinking of things like the Occupy Movement, where so many major news sources either weren’t covering events or weren’t covering them in an unbiased way.
To get information at the time, individual accounts were the most useful. So I think this could be an incredibly useful tool, but I also wonder about how it will be implemented in a way that won’t silence or discredit marginalized voices.
It would definitely be interesting to see an ethically trained AI bot to look at articles and spot potential problems. It may be too difficult for an ML program to do that because it lacks the context, the community, the cultural connection to understand what might be problematic.
For example, when a philosopher AI bot is asked to address issues of racism, it starts saying it won’t answer anything that contains the word “black”, it will just redirect people and say “I can’t respond to that”. That is erasing a culture and also instances of cultural oppression as a way to mitigate cultural oppression. It becomes very complicated.
AP: Last year Reuters produced a prototype of a deepfake-like, AI-generated video report on sports. A presenter recorded all possible words and then developed a reporting scheme into which data sets of games would be fed. An AI would match the results with the scheme and apply the audiovisual layer on top. Would you have any ethical concerns about that? How does a presenter let go of editorial control? How do you work with that?
MJ: For me the concerns here are also about job security and employment in journalism. Another reason I left journalism was about not having access to basic workers’ rights. Featuring diversity can become an issue if you have the same person reading weather reports over and over.
And you know, how are we going to compensate those people? Are they going to get paid every time they “appear” on screen? Are they only compensated per recording and that is it?
In terms of acting and deepfake instances of live journalism, where you have an anchor present, it highlights the concern we talked about before, which is authenticity and cultural connection: what makes an article an interesting contribution is specifically the author’s connection to the topic, and that coming through in an interrogation of the topic through a lens, a specific perspective, and an actual standpoint.
I think in some articles it is very interesting to consider who is actually reporting on the issue, whether they are including locals in their reporting and giving their community a voice. That is a very important part – who has the ability to create, share, who “has the mic” and with AI that also becomes more complicated.
AP: Do you see ways in which AI could get past these limitations?
MJ: I think AI has its own stories and that is often overlooked. Examining AI for what it is instead of how it could solve all of our biggest problems, such as a pressure for never-ending productivity on one side and distraction on the other – both of which capitalism is masterfully taking advantage of.
There has always been so much pressure in the history of AI to produce an AI that is so similar to humans that one wouldn’t be able to distinguish between the two. This is why the Turing test [the test shows AI and human generated text to a person to see if they can distinguish which is which – Editor] has been considered such a milestone in AI’s history.
What is most interesting in AI is not to consider its autonomy but to consider the unique process that creates an AI work, be it fiction or nonfiction, and how that process distinguishes it from ours. But very few people do that; we consider it a kind of exciting threat.
There are so many videos online of people relishing when an AI fails, watching robots fall when they try to open doors, etc. The current problem is that most people don’t understand how AI works. Most people cannot write even a simple algorithm because it is not something we teach everyone.
BONUS SECTION! We included a piece of AI-generated text in the interview.
Enjoy GPT-3’s thoughts on the future of AI in journalism.
AP: What do you think is the future of AI in journalism?
[GPT3]: AI is going to be huge in journalism. It may even become indistinguishable from human journalists. The most common form of AI use in journalism is in data science and processing. Reporters and editors will be able to use AI to parse data and find stories that they would have otherwise missed. On the production side, AI will augment journalists’ abilities in ways that will be hard to predict. I believe AI will also be used to allow journalists to write more in-depth stories. Reporters are already using AI to extract data from documents and analyze emails. It will also be used to analyze audio and video files. In the future, AI will be used to create stories from audio and video files. In the end, AI will help journalists do their jobs better.
[GPT-3] What is your advice for journalists who want to get involved with AI? Give me some tips.
[GPT3]: I believe the most important skill for journalists to learn is coding. By coding, I mean understanding how code works. So, if you don’t know how to code, try to learn. AI is a wide technology area. You don’t need to learn everything about AI; you just need to learn the basics.
AP: If you were going to write a newspaper for other AI readers, what subjects would you report on?
[GPT3]: I would report on the fact that I am learning to draw.