Engelberg Center Live!

Synthetic Media: AI and Journalism

Episode Summary

Of all the industries that AI is reshaping, perhaps none is as concerning as journalism, with its already complicated relationship with objectivity and bias, and the role technology plays in its dissemination and curation. What does it mean when the press we depend on is actually a synthetic amalgam of human and machine, using AI to create its words and images and shape its workflows, ethics, and intellectual property? In today’s episode, Professor Mike Ananny joins to unpack these complex issues, explain the study he collaborated on with Knowing Machines team member Jake Karr, and help think through our relationship with journalism even before AI complicated everything.

Episode Notes

Music Used:

The Blue Dot Sessions, “Angel Tooth,” “Paper Feather”

Episode Transcription

Knowing Machines Podcast

Episode 6: “Synthetic Media”



Tamar: From the Engelberg Center on Innovation, Law and Policy at NYU School of Law and USC's Annenberg School for Communication and Journalism, this is Knowing Machines, a podcast and research project about how we train AI systems to interpret the world, supported by the Alfred P. Sloan Foundation. I'm your host, Tamar Avishai.


Of all the industries that AI seems to be seeping into, altering and even dislocating, the most troubling to me is journalism. And not just because I am a journalist, although that's certainly a part of it, but because I, like you, like all of us, consume journalism. We depend on it, on a baseline of investigation into the truth of things probed by human reporters and vetted by human editors. We don't want to think that any of this can be replaced by computers, or that worse, we could agree to meet those computers halfway. But of course, this makes some assumptions about the state of journalism itself that might not be the truth of things. To dive into these issues, I spoke with Professor Mike Ananny, a journalist, author and professor at the University of Southern California's Annenberg School for Communication and Journalism. Mike partnered up with legal scholar Jake Karr, whom you might remember from episode three, to tackle the issues of journalism, law and the role of large language models as they enter this conversation of synthetic media. That is, all the journalistic content, from interview transcriptions to entire written articles, that is produced with the help of generative AI. Together, Mike and Jake embarked on a study [00:02:00] of the way journalism itself was adapting, interviewing reporters and analyzing articles down to the word. In this episode, we explore their findings: how journalism has evolved over time, how generative AI is changing its face, and what we can do about it. Mike, thank you so much for joining me today.


Mike: Thank you. Great to be here.


Tamar: So before we dive into the study, specifically, can you talk about the bigger picture issues that this study is reacting to and responding to?


Mike: Yeah. Like all other industries, journalism is being impacted by generative AI. It's trying to figure out what generative AI means for itself or its professionals. But this question of what generative AI does for media is such a foundational, such a big question that it's not good enough to only answer it through the lens of what journalists do or what news organizations are, which is kind of my background. My training is thinking about the sociology of news production: how news is made and who makes news and what do they think they're doing. We've also got these big, really complicated, open legal questions that are not only technical questions about what the law says or doesn't say, but this is a space where the law is unsettled, and there's a lot of different ideas about what the law even means. You know, I think about and I talk about with my students the idea [00:04:00] that the press is the only industry mentioned in the Constitution. If you look at the Constitution, it's the only thing that we could probably call a business and a public service and an institution that has this constitutional role, that has this constitutional protection.


Mike: And it's very unclear what that protection or that role means in any given era. And that's changed a lot, especially in the last sort of 130 years since the Supreme Court has been thinking about what the First Amendment means for journalism. It's only in sort of the early 1900s that it started thinking about this question. But the question of what the press means to the Constitution is one that's always been entangled with the technologies of the time. So whether that's, you know, cheap printing and the ability to move paper around the country cheaply and easily, whether that's broadcast radio and what it means to own spectrum, and the role of the FCC in trying to figure out the public's airwaves and the protection of it, whether the press should be publicly funded or not, and where the money comes from to make the press. And what is the idea of journalistic objectivity? Is that even a real thing or not, or is that something that's been invented? So there's also this phrase that's being used, sometimes called synthetic media.


Tamar: Yeah. I was going to ask, can you define what you mean by synthetic media?


Mike: Synthetic media is really just kind of a fancy term for saying there's this content, these sounds, these pictures, these words, videos that are being made by large language models, that are being made by generative artificial intelligence. And so some of the news, or some parts of the news that people see, might be coming not from [00:06:00] human journalists, but they might be coming from news organizations using machine learning and generative AI and large language models and all the other stuff that we've talked about in this podcast in different conversations, using those technologies and those infrastructures to create things that we audiences think of as news. So that's kind of the big distinction of this moment, is that we can't necessarily know if the news that we're reading was made by humans or made by machine learning models, or made by some mixture of those two things, and we don't know what that mixture looks like. So synthetic media is really just a fancy way of saying what we think of as the media or the journalism or the press has this potentially synthetic, artificial computational quality to it. And we don't know right now whether it has that, whether all content has that or whether some of it does. And we're trying to figure out what news even is. So that's part of, I think, the big motivation for this project: thinking about what the heck does it mean to live in worlds where news might be coming from a mix of humans and machines?


Tamar: Okay, so now we have this big picture, and believe me, I'm champing at the bit to dig into the issues that you just raised. But could you explain this study and the motivations you had and the methodologies you used?


Mike: So with my colleague and close collaborator Jake Karr, who is at the Engelberg Center at the New York University Law School, we collaborated on this project because we wanted to figure out essentially three things. In this study, we were motivated by three questions. The first is what does journalism think synthetic media is in this moment in 2022, [00:08:00] 2023? What does journalism think synthetic media is? That was our first question. The second question motivating us was sort of why does synthetic media matter to journalism? What are the stakes? Why be concerned about this topic at all? Why do they think that it matters? And the third question we were interested in was how is journalism getting ready for this moment? How is it perceiving the risks of generative AI? How is it thinking about the uncertainties or the possibilities of generative AI? This is sort of a big unknown moment in the field. We wanted to figure out how journalists were getting ready for it. So those are the three questions that we were really motivated by. So Jake and I got together on this project and we said, well, how could we possibly study this? And we decided to focus the study in three different ways. The first way we looked at it was this kind of geeky, academic thing that Matt Carlson calls metajournalistic discourse, which just means journalists talking to themselves about journalism, to make their own profession better. It's sort of the way that they have conversations with themselves. Sometimes that's called trade press, but call it metajournalistic discourse.


Mike: The first thing that Jake and I did was that we basically gathered up pretty much all the examples of articles, news stories, commentaries, op-eds that we could find where journalists were talking mostly to themselves about what generative AI means. We had a huge corpus, hundreds and hundreds and hundreds of articles. We read through them. We then picked 40 articles that we decided to sort of read in detail and focus on, because we thought those were the most important. And we read those really closely, categorized them, analyzed them. That was sort of our first big piece of data that we used to try to answer those three questions. The second big piece of data that we created was this big corpus of the policies and principles that news organizations created to figure out how to regulate the use [00:10:00] of generative AI. So what would they let journalists do? What did they forbid journalists from doing? What did they encourage journalists to do? There's all these documents, these policies, frameworks, principles, guidelines. They go by different names, but they're all focused on trying to put some bumpers and some sort of guidelines on what journalists do with generative AI. So we gathered up 36 of those documents and sort of read those closely. So we've got the 40 in-depth metajournalistic stories. We've got these 36 policies, frameworks, documents. And then the last thing we did was we talked to journalists themselves. So we reached out to journalists that we thought were particularly central or focused on this question of generative AI in journalism.


Mike: And those are interviews that are still ongoing. So what we're reporting on in this first stage of the study is what we learned by talking to seven of those practitioners. And so by journalists, we took a little bit of a broad definition of what we mean by journalists. We actually mean, yes, reporters and editors and people who work in news organizations, but we also talked to product officers. So people who make news products, the technologies that live on news sites. And we also talked to lawyers. So media lawyers, in-house counsels who are trying to think about what generative AI means for this moment. So that was the study. And the other thing to add on this study is that Jake and I are coming from different backgrounds. So Jake is trained as a lawyer. He tries to figure out where is the law right now on a particular topic or question, but he also thinks about where's the law going? Or where could the law go or where should the law go. So he has that legal training and that legal framework and lens, and my own background is more as a communication scholar, a journalism studies person who thinks a lot about the sociology of news production. So why do journalists work the way they do, and how are news organizations created and sustained, and how do they [00:12:00] change over time, and how do they respond to new technologies?


Tamar: Huh. So I want to go back to what you said earlier about what the heck it means to live in a world where news is coming from a mix of humans and machines. What the heck does it mean? What is this moment in media in the first place, and how is AI changing it?


Mike: Some newsrooms have been using AI for years, so organizations like the Associated Press, they've been generating stories like finance stories where, you know, the stock market went up or down, or this stock got more expensive or less expensive. These are kind of well-structured stories that have data that can be told in a pretty straightforward way. Sports stories are a similar kind of thing, right? There's a lot of statistically driven events that happen in sports, and that makes it something that's really ripe for computational journalism, because you can tell the story of what happened in a sports game. It won't have the color or the flavor or the character of like a gripping sports narrative, but you can at least do a first pass of a sports story or a finance story or a story about the weather. These are some of the domains where AI has been used to generate media. But what we're seeing now is this rapidly expanding application of these tools like ChatGPT or Midjourney or DALL-E 2, or all of these tools that can make media, that make synthetic media. These are starting to creep their way into parts of the newsroom that used to not consider them acceptable things to have as part of journalism, but now we're starting to see some blurring of those lines. So we've seen even the production of AI news anchors. So the characters, the people that you see on the screen are starting to be created by AI, by moving images and videos and sounds [00:14:00] that are not real people. So the last thing is that this is a part of journalism, but it's also part of these larger questions about what a cultural industry is, what a media industry is. So at the exact same time that we've seen the Hollywood writers' and the actors' strikes, sort of grappling with this question of what does it mean to be a media producer in this world of generative AI, that's what we're starting to see in journalism.


Tamar: Okay, so fake news anchors aside, what does generative AI in the media actually look like? You know, when I think about AI in journalism, I think about the kind of AI-generated content that's like spammy and bot-filled. You know, the stuff that either looks really fake or the kind of disinformation that everybody thinks tilts election results.


Mike: Yeah. So what we have not seen is this flood of synthetic mis- or disinformation. We haven't seen what's called flooding the zone with synthetic propaganda or fake news. We have not seen that. Interesting. One of our interviewees even said people don't like synthetic content. So in some ways, it's not this widespread phenomenon that, you know, the people were fearing or having a big panic about. But what we have seen is what we call sort of this back end news work. So where we have seen generative AI appear is in all of the kind of invisible work that journalists do, as they're researching stories, as they're trying to figure out, you know, the holes in an argument that might be made or what does academic research say about a topic. We've heard them talk about using generative AI to create interview questions, to craft interview questions, to find sources to interview, like who should they be talking to on a particular topic? We heard them talk about [00:16:00] using generative AI to translate, you know, information from one language to another that they could read, or to translate their stories from one language to another, and to transcribe, to take audio recordings and to quickly, automatically create transcriptions, proofing, style checking, making text quote unquote better, or discovering patterns in data and even sort of mundane things like summarizing meetings.


Tamar: But like, isn't that helpful? Yeah. I mean, I use AI transcription, like, isn't that a good thing? Have AI do the back end, nitty gritty work that leaves reporters free to focus on the real meat of their investigations.


Mike: That's where we've seen journalists say that's where generative AI really helps me do my work. And who cares, right? It's not a big deal. We haven't seen the flooding of it.


Tamar: Yeah. Who cares?


Mike: Who cares? But, you know, we're scholars who are influenced by this field called science and technology studies, and that field, that way of doing work, actually says, you know what? Focus on the stuff that seems boring, because it's the places where things seem boring that the power is. And that for us was sort of this aha moment in this work when we said, okay, we're not seeing this big flood-the-zone with mis- and disinformation, but we are seeing journalism as a craft, as a profession, as a culture, slowly but surely, sort of adopting these generative AI tools. And the question then is, so why does that matter? Right. So you say, oh, they're using these tools. Why does that matter? It matters for the same reasons that it matters for other people, whether it's students writing term papers or doctors trying to make diagnoses: you've given over some of the thinking, some of the pattern making, some of the interpretive work that journalism is really good at, or it should be really good at. You're giving over some of that to these machine learning models.


Mike: And we know from other podcasts and other conversations in this series, we know that those models [00:18:00] have biases to them. We know that they're based on data sets that, you know, privilege certain voices over other voices. We know that it's only, you know, pretty good for English, sort of, but not great for other languages. It's pretty good for some time periods, but not good for other time periods. It's good for some ways of seeing the world, but not other ways of seeing the world. And we know also that journalism has historically, in Western contexts, been, you know, pretty white, pretty male, pretty straight, pretty sort of, you know, quote unquote objective in ways that the journalism field thought itself was objective. The big question we have to ask is, do we want that kind of journalism again? And then we've actually taken a step back from all of the work we've been trying to do to diversify journalism, to hear different kinds of voices. If we use those data sets to drive that work, then we're sort of going backwards in time.


Tamar: Okay, so give me some good news. There has to be some good news.


Mike: Yeah, I think one of the pluses is that it's kind of pushing journalists to think more about what value they provide. And we heard our interviewees say this. We heard them say that, you know, one person said, quote, it's going to force a lot of reporters to provide value in the work that they do. And what we heard our interviewees talk about is to say, you know, yeah, it did make me sort of stop and think. What do I do? What is my role in this? What is the value that I'm providing? What's the value of my background, my interpretations, my judgment? And one of our other interviewees said, I don't want to use my brain power thinking of a prompt rather than actually writing something, and I thought that was an interesting thing to say because this person was sort of saying, yeah, yeah, yeah, there's all this stuff about prompt engineering and trying to figure out how [00:20:00] to ask a good question to get the answer that you want out of it. And this person said, you know, I can spend a lot of time doing that. Or I can lean on my training as a journalist because I kind of know what I should be writing about something.


Mike: I kind of know who I should be interviewing for something. I know what interview questions I should be asking. I know what background research I should be relying on, and that's experience that I've gotten through years and years of doing this craft really well. One of our other interviewees said, you know, there's a bit of a generational divide that seems to be happening in newsrooms where some of the older, more experienced journalists, you know, they don't feel threatened by this backend work. They don't feel threatened by generative AI making incursions into it because they say, oh yeah, I understand that you can do something faster, or you can do something, you know, more efficiently. But at the end of the day, the judgments that I have nurtured and created over years of doing this work are going to be way better than a younger person simply prompting ChatGPT, because the response you're going to get from that is going to feel bland, it's going to feel inert. It's going to feel without that quality of human judgment.


Tamar: Mm. Yeah. That's really interesting, particularly the generational divide. I think about that a lot in media. I'd imagine that I would want my journalists to adapt to new technologies, but not to, you know, radically innovate the form or the process.


Mike: There's this trope in technology worlds to say, oh, experiment, you know, try something out, you know, make a prototype of something. That's the only way that you're going to learn. We heard some journalists say that they're excited to use generative AI in their work. They're excited to try to build new what they would call products, or build new stories, or build new ways of storytelling for audiences. Some journalists are absolutely excited [00:22:00] and motivated by this moment. We also, though, heard some journalists saying, hold on a second. Our job is not to experiment on the public. That's not our job. Our job is to take care of the public. It's to nurture the public. It's to tell the stories that the public needs to hear, that it maybe doesn't want to hear. And that stems from journalistic judgment, that ability to do that. We should not be experimenting on the public. We should not be using these tools in a way that changes how the public sees itself or understands itself. That's pretty dangerous. And additionally, some of these people said, we're not even ready as a field to know what it means to have good guidelines on using these tools or not. And the chance of us making mistakes about journalism while using these tools is just too great. This is our democracy we're playing with. This is our collective life that we're playing with. You cannot get it wrong. And that for me was, I think, a very powerful moment where journalists are trying to figure this moment out and we see the field grappling with this question.


Tamar: Okay, so I want to return to this idea of bias. This comes up over and over again in Knowing Machines, but it feels particularly relevant here because media is also dealing with this eternal issue of objectivity. Yeah. You know, which may or may not be possible. And I think you and I might come down on different sides of that. I think we could talk about it all day. You mentioned earlier that objectivity has been considered an invention, you know, an economic invention. I happen to think it's aspirational, even if it's impossible, you know, and maybe you and I agree with each other more than we disagree. And we've also talked about this generational divide. And I think about, who was it, Peter Baker, right, who wrote that column about how journalists shouldn't even vote [00:24:00] because it tips the scales of their neutrality, et cetera. And so objectivity is a really complicated issue in journalism, even without AI. But then we look at these ostensibly objective AI tools and we say, well, they're problematic because they're biased, but journalists are biased, and we see that as a given. So how do we reconcile this? You know, what do journalists mean by objectivity? What do machine learning researchers mean by objectivity? And is it different?


Mike: Such a great point. So historian Michael Schudson has this idea that he calls mature subjectivity. So he says, instead of having these fights that we have all the time about whether somebody is objective or not or whether neutrality is possible or not, these are kind of age old questions. And the frustrating answer is you're never going to really land somewhere that everyone's going to agree with. Right. So the age old question of is journalism objective? Yes, no, whatever. You know, in some sense, who cares? Who cares? Michael Schudson, this great historian of journalism, would say, actually, what we should be focusing on is what he calls, quote, mature subjectivity. And mature subjectivity he defines really precisely. He says it is a commitment to facts and it's a commitment to values. And it's a commitment to keep them separate and know how they relate to each other. And to tell your audience how they relate to each other. So he says, mature subjectivity is, we're not in the land of just, you know, relativism, of two plus two equals five if we think it does. Like, we're not there. That's not what we're talking about. But nor are we talking about clinging to some kind of unrealistic [00:26:00] model that we are just automatons who are, you know, gathering the world.


Mike: What Daniel Hallin, who's another sociologist, called the, quote, scientism of journalism and the idea that journalism is like a physical science where you can just go out into the world and discover facts and put them in your bag and bring them back and carry them and give them to people. There's always going to be a moment of interpretation because journalists are, in the end, sort of storytellers in the public interest. And it's a good thing to be a storyteller. It's a good thing to have this interpretation. But I go back to this idea of mature subjectivity that Schudson gives us because he says, if it means this commitment to facts and a commitment to values and a commitment to know the difference between them and to be purposeful and intentional, to me, that's exactly the spirit that you should take to generative AI, to machine learning, because there might be parts of this backend work where we say it doesn't matter totally. Like go use ChatGPT to do something. Go use Midjourney to generate something. If it gives you some ideas, great. If not, you know, no big deal. It doesn't mean you totally reject it, but nor does it mean you lean on it as a crutch and say, ah, this is my worldview, right? That you're doing.


Tamar: Yeah, I can see that.


Mike: And the last thing I'll add is there's the other sort of idea that I think could be more present in a lot of these conversations, which is this American school of philosophy called pragmatism. And pragmatism would say, stop trying to define things in some, you know, authoritative, final kind of declarative way that ends the conversation. That's not what pragmatism would say. Pragmatism would say, tell me why it matters to define something a particular way. So pragmatism would say, why does it matter that journalism [00:28:00] uses ChatGPT? Not to say it never should or it always should, but tell me what the stakes are. Tell me why it matters for journalists to use this in one part of their back end work, versus another part of their back end work. And if you shift the conversation from these strict definitions that people want to cling to as ideologies, and you shift the conversation from there to a pragmatic one that says, why does it matter? What's at stake, what's at stake for public life? What's at stake for the stories we tell each other? For me, that's a much more exciting place to have a conversation about generative AI, because I get to go to the space of why does it matter? And people have lots of different answers to why does it matter? But that's a good thing, right? That helps us make the conversation bigger. And what I worry about, you know, debates about objectivity are usually about making the conversation smaller, and I don't want to make conversations smaller. I want to make conversations bigger. So I'd rather use pragmatism.


Tamar: Okay, I know I took us a little far afield here. Let's get back to this study. I want to hear more about what the journalists themselves had to say.


Mike: Yeah, we heard a bunch of fascinating things from talking to these different people who are working in news organizations. One of the first big things that we heard is that this is a field that is really varied and diverse. It kind of doesn't even make sense to talk about what journalism is doing with generative AI without going a step deeper and saying journalism means a lot of different things. And because news organizations are really different, they're really different beasts doing different types of work. So if you have a big archive of news work, of stories, like The New York Times does or Reuters does, if you have a big archive that you're sitting on, then you really care about whether OpenAI is training its system on your work. But if you're a newer organization without necessarily having investment in an archive, maybe you care a little bit less. [00:30:00] Or if you're a news organization that is really invested in doing breaking news stories that require fast, rapid responses, then you might be more okay with using ChatGPT to sort of quickly generate text. But if you're a news organization that does long form investigative reporting, you might be more suspicious of it, or you might be willing to invest your reporters' time in crafting the right language. And we also saw differences in terms of, you know, copyright. If you're publishing under a Creative Commons license as a news organization, then you might look differently at your content being scraped.


Mike: One of the other things we heard was sort of this fight about timing, this fight about pacing. So some news organizations were desperately wanting to move slowly. They wanted to move thoughtfully and slowly in relation to ChatGPT, to figure out when it makes sense or when it doesn't. But they understood that they don't have control over when ChatGPT changes. When does it go from 3 to 4 to 4.5? And what does one license get you versus another license? So they were grappling with the speed that they wanted to work at versus the speed that the technology was changing. And that's a tricky little balance to strike. We also heard journalists talking about like elite journalists who are parts of big fancy news organizations that you probably know about, that have lots of money, that have been around for a long time. They can afford to kind of weather this moment in a way. They can afford to sit back a little bit or, you know, they're not going to be thrown out of business tomorrow because their content is being taken over by a ChatGPT system. We also saw, and this was kind of something you only get from interviews, from talking to people, there are definitely formal consortia. So there are sort of formal, you know, news organization partnerships or professional organizations that are trying to set the scene on what ChatGPT or what generative AI means for journalism.


Mike: But we also heard [00:32:00] sort of these informal consortia, these informal relationships that are happening where some journalists are looking to particular organizations or particular people and saying, I'm going to wait until they do something, or I'm going to wait until that person expresses a perspective. And, you know, within the bounds of confidentiality that we gave our interviewees that we want to respect, we won't sort of get into details exactly, but we heard sort of differences in who has authority or legitimacy to sort of set the tone or set the topic on these things. The last thing we heard in our interviews in terms of this field that I thought was sort of fascinating was there's a little bit of disagreement on how much journalists think audiences need to know about generative AI. So one view is, and it's this very sort of, you know, very sort of American Western view that prizes transparency above all else and says, I'm just going to disclose everything. Transparency is the king. I'm going to give my audience openness to whatever I used, you know, did I use ChatGPT in creating this text? Did I use Midjourney to create this image? Some people said, yeah, I need to tell my audiences what was done.


Mike: Other people said, maybe that doesn't matter. You know, it's the story in the end. That is the thing that the audiences are going to make sense of. They're going to find it valuable or not. And journalists make lots of decisions about what kinds of tools to use, what kinds of people to interview, what kinds of data to rely on. And journalists have not historically told audiences every single little bit of what they did while preparing a story. So some journalists are like, maybe this is sort of like that. We don't open up our newsrooms all the time and tell people what we did or what we worked on. So maybe we don't need to necessarily tell people every single time we use ChatGPT. And that kind of relates to this back-end use versus reader-facing [00:34:00] distinction that we heard, where, you know, if a journalist used something like Otter.ai to transcribe an audio interview that they did with a source, and they used that transcript as part of crafting a story, does that need to be disclosed to an audience, like, oh, the story that you're reading is using text that was generated by a transcription that came from audio? In a way, the journalists are like, I don't know, maybe audiences don't need to know about that, because, look.


Tamar: That's different though. That is so different.


Mike: So that's the difference that we heard, though. That's what I mean when I say generative AI means a lot of different things. That is very, very different, of course, absolutely, from saying to ChatGPT, generate an 800-word story on, you know, the budget debate that's happening in the US House of Representatives, and then just running that story, like running that exact story. Totally different. Absolutely, totally.


Tamar: That's like your asset manager saying, like, okay, well, we use this calculator to do this math. But then, you know, it also helped us decide what to invest in.


Mike: Yeah. But here's the what-if. In using Otter.ai to transcribe something, you know, we know that that's trained on particular models. We know it's able to handle some words better. What if you didn't really check that transcript afterward? What if somebody actually used a different phrase, a different word, a different way of stringing together words that Otter.ai was not expecting? What if your source actually said something that was pretty different and pretty important, but you missed it because you didn't do the traditional thing, which a lot of journalists told us they used to do, which was to transcribe their own interviews? They would sit there for hours and listen, and they would have to do that work. Some journalists said to us, it's exciting. I don't have to do that work anymore.


Tamar: Yeah. [00:36:00] Yeah. I mean, like I said, as somebody who uses AI-generated transcription, like, all the time, it's interesting. Even putting aside the potentially problematic issues of mistranscription, there's the fact that, like you said, this is something that reporters used to spend hours doing on their own, transcribing their own tape. It really brings you closer to your material when you actually transcribe it on your own. And to give that piece of it up, I feel like, does automatically, maybe kind of insidiously, create distance between the reporter and their interviewee and their story. Yeah, yeah. Like I said, it feels insidious, that space between.


Mike: Yeah. I know as an academic, for years I've been interviewing people for academic studies I've been doing. And I go away with these long pieces of audio and I listen to them. I'm really conscious that when I was a grad student, I was transcribing these things by hand, and it was like sitting for hours with old, old equipment, pressing stop, start, stop, start over, like, painful. And I didn't have a good setup. But I will say, and not to romanticize it, you know, but I was with those interviews. I was inside those conversations in a way that I am not now when I read, when I just skim a transcript.


Tamar: Yeah. So having done this study, I'm genuinely curious, like, what do you do with this information? So you know that different journalists at different news outlets, which are already so different from each other, like, you're getting all of these different avenues into this question of whether generative AI in journalism matters. And, like, what do you want to do with this? Is it just something that you put out there and you say, okay, let's draw our own conclusions? And we will. But I'm curious, you know, has this changed [00:38:00] your own thinking on the subject, your own perspective? Do you feel like your perspective in this matters as an educator?


Mike: Yeah. Um, one thing that keeps me up at night when I think about these topics is kind of a fear of being gaslit into a kind of public life. You're never going to point to a particular failure or moment of, like, oh, the New York Times used ChatGPT to generate a transcript of an interview, and therefore that's the downfall of democracy. It's not going to happen like that. Um, what I'm most concerned about, or what I'm interested in as well, is the small creeps of technologies into the work that people do. And that's a long history. And that's not specific to journalism. Lots of industries have had this, right? And it's not a bad thing. I'm not saying, oh, the sky is falling and we need to, like, stop journalists from using generative AI. Um, but what I do worry about is there are forces that want to speed up, make cheaper, make more efficient the generation of, and they would use the word content, not news. To make content that is cheaper, that appeals to more audiences, that's faster, that requires less human oversight. And there is never going to be a single moment when we flip from journalism is human made to journalism is computational. It's going to be this gray area.


Mike: So the motivation for me is, like, I want to understand the details of that gray area, because some of that gray area matters and some of it doesn't matter. So what do I want to do with this work? So the first is kind of a descriptive [00:40:00] contribution, right? It's to say, if you think generative AI is fantastic for news, let me tell you a different way to think about it. If you think the sky is falling and it's going to ruin news, let me tell you a different way to think about it. And to move people a little bit from one place to another. So one is just to change, or to try to reshape, public conversation about it. The other I think about is news organizations are businesses. Right now we're seeing media guilds and news unions and news sort of collective labor or collective action trying to figure out, what should they think about generative AI? What should they push for in collective bargaining moments? What are their bright lines for journalistic labor? And what I'm most motivated and curious and fascinated by is, like, I've been able to have some conversations with people who are in positions to influence what collective action conversations might look like, or what a media guild position should be on generative AI.


Mike: So the chance to help people craft better labor conditions, to craft better ways for journalists to tell strong public stories, for me, that's a real motivator to do this kind of work. If I can help a media guild think smartly about generative AI, that's pretty huge. And then the last thing I think about, you know, why do this work, or why does it matter, is I'm a teacher. I'm also an educator. I'm in classrooms, and I am very conscious that we only have students for, you know, a relatively short number of years, and then they're going out into professional careers. And I am most heartened by emails that I get from students that I taught, you know, ten years ago. It's [00:42:00] rare, but it happens. I've had people email me afterward and say, that thing we talked about in class, like how to approach that ethical conundrum in a newsroom, I actually used that when I was talking with my senior editor in a conversation about whether we should publish that source or not, or whether we should name that person or not. And you know, that lecture that I gave on pragmatism and the role of mature subjectivity in news actually stuck with people in their heads somewhere, and it helped them have a different conversation that's invisible to most people. It's invisible to me most of the time.


Tamar: But when you identify the difference between articles and content, I think that's a really important point, because it really does speak to journalism as a business. Print journalism is in a really tough spot. And something you said earlier was that a place like The New York Times is able to financially withstand whatever generative AI is going to throw at it, or even just, you know, the complexities of the conversation. And I'm curious, you know, you talked about interviewing a diversity of journalists. Is that everything from, you know, big mainstream media outlets down to, like, Substackers? Where do you draw the line of what is journalism? Who are journalists? And from there, what about the smaller outlets that can't afford to have this conversation? You know, who can't financially withstand it?


Mike: Yeah. Yeah, yeah. So for this study, we mostly focused on larger scale mainstream news organizations that most people would have heard of. So what we did not look at is the small scale, like, very artisanal, you know, particular Substacks or people who would identify themselves as journalists but who don't necessarily belong to large news organizations. And that, in a way, absolutely relates to this larger problem that journalism [00:44:00] means a ton of different things. You don't need a license to practice journalism. You don't need to go to school. You don't need anything to call yourself a journalist, right? And that's a good thing. You know, the perspective of the First Amendment in the US would say the power of that is that you get to do journalism and still get protection constitutionally. You don't have to be a member of the New York Times or The Washington Post to get those protections. That's a good thing. What it does, though, is it highlights these power differences. When you're a member of The New York Times, you are a little more insulated from a day-to-day concern about what kind of traffic did one post versus another post get, or what kind of ad revenue are you getting off one piece of content versus another piece of content? The New York Times, for instance, has the luxury of saying, we're sitting on, you know, 150-ish, maybe more, years of content in a library system that is really well designed, that's super well indexed, that they have whole librarians who are tasked with designing ways into this.


Mike: And it makes sense that they would be saying, I don't want ChatGPT to be using this data and indexing it and training new models off it. We'll do that ourselves. Reuters has a long history of innovating in AI. They would quite rightly do the same. The Washington Post has a really sophisticated engineering team that makes new apps and new visualizations, and they're able to do some really fancy stuff. So the conversations happening in those organizations, they get to be a little more wide ranging and imaginative, and they get to be slower in a way, too. They get to be faster and they get to be slower. But the Substacker [00:46:00] is different. One of the things that we did look at, though, is when we did an analysis of the policy guidelines that were happening for different news organizations, we also did look at Medium. So Medium is not really a news organization, but it's a space where a lot of people do publish things that they might consider news, or they self-identify as journalists. So something like Medium is, you know, a bit of an interesting hybrid or gray area for these things. Medium came out originally with a pretty, you know, anti-synthetic-media perspective, right? They said, we will not publish synthetic media or media created through generative AI. We strongly discourage others from doing it. And Medium is a space for individual human expression. And that's what we're doing. Then they were in a little bit of a conundrum because, well, first of all, they had to, not walk it back.


Mike: They didn't walk it back, but they clarified a few months later and said, it's actually really hard for us to know what is synthetic media or not on this. So we can have a principled position till the cows come home, but when the rubber hits the road, we can't really figure out what is synthetic or not. Then they said, well, we will not promote content that we think is synthetically generated. So it can still sit and live on the Medium platform, but we are not going to promote that content as, you know, a meta organization or as a platform. We're not going to promote that content, but we recognize that some people might be creating it for reasons that matter to them. But they tried to double down on that and say, we're not going to do it. And then the last bit that they had some ambivalence about is they said, well, you know, and platforms have always had this conversation. They said, we're both a platform and a host of content and kind of a regulator of content, and [00:48:00] we have to walk a fine line between being a space for independent media producers to put their work, to promote their work, to reach audiences. So we can't get too heavy-handed in policies or infrastructures that prevent our individual members from doing what they want to do. And you saw that with Medium. They sort of went through an arc from, we don't want to do it, it's banned, you know, it's not allowed, to, like, well, actually, maybe we should be more nuanced in that. Yeah.


Tamar: Yeah. You saw Twitter doing that too.


Mike: Yeah, totally.


Tamar: But of course these are private companies.


Mike: They're private companies. Right. And, you know, to speak to the point of what is a news organization or who is a journalist, I think the jury is very much out. If Medium started to claim that it's a news organization and started to try to make claims like the New York Times would make or like the Washington Post would make, we know that that's a fraught territory that's related to, you know, Section 230 of the Communications Decency Act, of whether platforms are speakers or are they hosts. And there's a pretty active debate about that. So it touches on that.


Mike: And so I totally agree that there isn't necessarily a single story to tell about what generative AI means for news. And it goes back to the pragmatic question of when does it matter? And different people are going to give you different answers to when generative AI matters for journalism. And that's a good thing, because this is a field that is always going to be in transition. It's a profession that will always be in transition. And this is a moment where it's figuring out how it's in transition. [00:50:00]


Tamar: Mike, thank you so much for speaking with me today. This was a really fun conversation.


Mike: Thank you so much, Tamar. I really enjoyed it. It was fantastic.




Tamar: Next time on Knowing Machines: what does it look like to have an entire map of, say, your Amazon Echo presented to you in an immersive, comprehensive, floor-to-ceiling gallery space, confronting you with the histories, communications, computations, and classifications that allowed both technology and humanity to bring you into this one moment holding this singular device? Here, Kate, text me a photo.


Kate: I'm doing it. I'm going to send it to you right now.


Tamar: Wow. Oh, wow. We'll see you then.