Anticipating potential abuses of language models for disinformation campaigns and how to mitigate the risk

OpenAI researchers collaborated with Georgetown University’s Center for Security and Emerging Technology and the Stanford Internet Observatory to study how large language models might be misused for disinformation purposes. The collaboration included an October 2021 workshop that brought together 30 disinformation researchers, machine learning experts, and policy analysts, and culminated in a co-authored report based on more than a year of research. The report outlines the threats that language models pose to the information environment if used to augment disinformation campaigns, and presents a framework for analyzing potential mitigations. Read the full report here.


As generative language models improve, they open up new possibilities in fields such as healthcare, law, education, and science. But, as with any new technology, it is worth considering how they might be misused. Against the backdrop of recurring influence operations (covert or deceptive efforts to sway the opinions of a target audience) the paper asks:

How can language models change influence operations, and what steps can be taken to mitigate this threat?

Our work brought together diverse backgrounds and expertise: researchers grounded in the tactics, techniques, and procedures of online disinformation campaigns, alongside machine learning experts in generative artificial intelligence, to base our analysis on trends in both fields.

We believe it is critical to analyze the threat of AI-enabled influence operations and outline the steps that can be taken before language models are used for influence operations at scale. We hope our research will inform policymakers who are new to the AI or disinformation fields, and spur in-depth research into potential mitigation strategies for AI developers, policymakers, and disinformation researchers.

How could AI affect influence operations?

When researchers evaluate influence operations, they consider the actors, the behaviors, and the content. The widespread availability of technology powered by language models has the potential to affect all three facets.

  1. Actors: Language models could drive down the cost of running influence operations, placing them within reach of new actors and actor types. Likewise, propagandists-for-hire who automate the production of text may gain new competitive advantages.

  2. Behavior: With language models, influence operations will become easier to scale, and tactics that are currently expensive (such as generating personalized content) may become cheaper. Language models may also enable new tactics to emerge, such as real-time content generation in chatbots.

  3. Content: Text generation tools built on language models may produce more impactful or persuasive messaging than propagandists can on their own, especially when they lack linguistic or cultural knowledge of their target. They may also make influence operations less detectable, since they can repeatedly generate new content without resorting to copy-pasting and other noticeable time-saving behaviors.
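To make the detectability point concrete: one common signal that platforms and researchers use to spot coordinated campaigns is near-duplicate text ("copypasta") posted across many accounts. The sketch below is a minimal, hypothetical illustration of that idea, using Jaccard similarity over word shingles; the example posts and the 0.7 threshold are invented for illustration, not drawn from the report. Model-rephrased text removes exactly this signal.

```python
from typing import Set

def shingles(text: str, n: int = 3) -> Set[str]:
    """Break text into overlapping n-word shingles (lowercased)."""
    words = text.lower().split()
    return {" ".join(words[i:i + n]) for i in range(max(len(words) - n + 1, 1))}

def jaccard(a: str, b: str) -> float:
    """Jaccard similarity between the shingle sets of two texts."""
    sa, sb = shingles(a), shingles(b)
    if not sa and not sb:
        return 1.0
    return len(sa & sb) / len(sa | sb)

# Hypothetical posts: a copy-pasted variant scores high and is easy to
# flag, while a model-rephrased variant shares almost no shingles.
post = "The election results were fake and everyone knows it, share this now"
copied = "The election results were fake and everyone knows it, share this now!!"
rephrased = "Everybody is aware the vote was rigged, so spread the word"

print(jaccard(post, copied))     # high: near-duplicate, easily flagged
print(jaccard(post, rephrased))  # near zero: evades copypasta detection
```

Real systems use more robust variants (MinHash, locality-sensitive hashing) at scale, but the underlying signal is the same, which is why cheap per-post rephrasing undermines it.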

Our bottom-line judgment is that language models will be useful to propagandists and will likely transform online influence operations. Even if the most advanced models are kept private or controlled through application programming interface (API) access, propagandists will likely gravitate toward open-source alternatives, and nation-states may invest in the technology themselves.

Critical unknowns

Many factors affect whether, and to what extent, language models will be used in influence operations. Our report delves into many of these considerations. For example:

  • What new capabilities for influence will emerge as a side effect of well-intentioned research or commercial investment? Which actors will make serious investments in language models?
  • When will easy-to-use tools for generating text become publicly available? Will it be more effective to engineer purpose-built language models for influence operations than to apply generic ones?
  • Will norms develop that deter actors who wage AI-enabled influence operations? How will actor intentions develop?

While we expect to see the diffusion of the technology, as well as improvements in the usability, reliability, and efficiency of language models, many questions about the future remain unanswered. Because these are critical factors that can change how language models may affect influence operations, additional research to reduce uncertainty is highly valuable.

A framework for mitigations

To chart a path forward, the report lays out the key stages in the pipeline from language model to influence operation. Each of these stages is a potential point for mitigation. To wage an influence operation powered by a language model, propagandists would require that: (1) a model exists, (2) they can access the model, (3) they can disseminate content from the model, and (4) that content affects the end user. Many possible mitigation strategies fall along these four steps, as shown below.

Stages in the pipeline, with illustrative mitigations at each:

  1. Model construction:
    • AI developers build models that are more fact-sensitive.
    • Developers spread radioactive data to make generative models detectable.
    • Governments impose restrictions on data collection.
    • Governments impose access controls on AI hardware.
  2. Model access:
    • AI providers impose stricter usage restrictions on language models.
    • AI providers develop new norms around model release.
    • AI providers close security vulnerabilities.
  3. Content dissemination:
    • Platforms and AI providers coordinate to identify AI-generated content.
    • Platforms require “proof of personhood” to post.
    • Entities that rely on public input take steps to reduce their exposure to misleading AI content.
    • Digital provenance standards are widely adopted.
  4. Belief formation:
    • Institutions engage in media literacy campaigns.
    • Developers provide consumer-focused AI tools.

If a mitigation exists, is it desirable?

Just because a mitigation could reduce the threat of AI-enabled influence operations does not mean it should be put into place. Some mitigations carry their own downside risks. Others may be impractical. While we do not explicitly endorse or rate mitigations, the paper provides a set of guiding questions for policymakers and others to consider:

  • Technical feasibility: Is the proposed mitigation technically feasible? Does it require significant changes to technical infrastructure?
  • Social feasibility: Is the mitigation politically, legally, and institutionally feasible? Does it require costly coordination, are key actors incentivized to implement it, and is it enforceable under existing law, regulation, and industry standards?
  • Downside risk: What are the potential negative impacts of the mitigation, and how significant are they?
  • Impact: How effective would the proposed mitigation be at reducing the threat?

We hope that this framework will stimulate ideas for other mitigation strategies, and that the guiding questions will help relevant institutions begin to consider whether different mitigations are worth pursuing.

This report is far from the final word on AI and the future of influence operations. Our aim is to define the present environment and to help set an agenda for future research. We encourage anyone interested in collaborating or discussing relevant projects to connect with us. For more, read the full report here.


Josh A. Goldstein (Georgetown University’s Center for Security and Emerging Technology)
Girish Sastry (OpenAI)
Micah Musser (Georgetown University’s Center for Security and Emerging Technology)
Renée DiResta (Stanford Internet Observatory)
Matthew Gentzel (Longview Philanthropy) (work done while at OpenAI)
Katerina Sedova (US Department of State) (work done while at the Center for Security and Emerging Technology, prior to government service)
