Research

For the most recent and comprehensive overview of my scientific publications, see Google Scholar. My CV is out of date.

Governance of AI

Progress in artificial intelligence (AI) is likely to be one of the most important developments in the coming century. There is a non-trivial chance that we will see, within two decades, developments that transform society, the economy, and international relations. These will pose radical opportunities and challenges. I seek to anticipate these and identify levers for avoiding the risks. I do this work at Google DeepMind, and previously did so at the Centre for the Governance of AI (GovAI) (GovAI's Google Scholar page).

For my talks, see allandafoe/ai-talks

For an overview of my perspective, I recommend:

February 2025: 80,000 Hours Podcast on Technological Determinism, Cooperative AI, and Frontier Safety (transcript, YouTube, Spotify)
[15 min read] Allan Dafoe. (2020). AI Governance: Opportunity and Theory of Impact. Effective Altruism Forum. (Effective Altruism Forum, html, EA Forum Prize Winner)
[25p] Allan Dafoe. (2022). AI Governance: Overview and Theoretical Lenses. In Oxford Handbook on AI Governance, edited by Bullock, J.B., et al., Oxford: Oxford University Press, 2022. (bit.ly/Dafoe-Handbook)
[50p] Allan Dafoe. (2018). AI Governance: A Research Agenda. Centre for the Governance of AI, Future of Humanity Institute, University of Oxford. (pdf)
[22 min video] "AI Strategy, Policy, and Governance." Beneficial AGI Conference, Puerto Rico, 2019 (Video)

Select work in reverse chronological order:

MacInnes, Morgan, Ben Garfinkel, Allan Dafoe. (2024) "Anarchy as Architect: Competitive Pressure, Technology, and the Internal Structure of States." International Studies Quarterly. 68: 4. (link, pdf)
Gabriel, I., Manzini, A., Keeling, G., Hendricks, L. A., Rieser, V., Iqbal, H., ... & Manyika, J. (2024). The ethics of advanced AI assistants. https://arxiv.org/abs/2404.16244.
Weidinger, L., Barnhart, J., Brennan, J., Butterfield, C., Young, S., Hawkins, W., ... & Dafoe, A., Isaac, W. (2024). Holistic safety and responsibility evaluations of advanced AI models. https://arxiv.org/abs/2404.14068
Jonas B Sandbrink, Hamish Hobbs, Jacob L Swett, Allan Dafoe, Anders Sandberg. "Risk-sensitive innovation: leveraging interactions between technologies to navigate technology risks." Science and Public Policy, (2024). https://doi.org/10.1093/scipol/scae043
Phuong, M., Aitchison, M., Catt, E., Cogan, S., Kaskasoli, A., Krakovna, V., ... & Dafoe, A., Shevlane, T. (2024). Evaluating frontier models for dangerous capabilities. https://arxiv.org/abs/2403.13793
Dragan, Anca, Helen King, Allan Dafoe. (2024). Frontier Safety Framework v1.0 blog
- Google DeepMind. (2024). Frontier Safety Framework technical report.
Morris, Meredith Ringel, Jascha Sohl-dickstein, Noah Fiedel, Tris Warkentin, Allan Dafoe, Aleksandra Faust, Clement Farabet, and Shane Legg. "Levels of AGI: Operationalizing Progress on the Path to AGI." arXiv preprint arXiv:2311.02462 (2023). https://arxiv.org/abs/2311.02462
Seger, Elizabeth, Aviv Ovadya, Divya Siddarth, Ben Garfinkel, and Allan Dafoe. "Democratising AI: Multiple meanings, goals, and methods." In Proceedings of the 2023 AAAI/ACM Conference on AI, Ethics, and Society, pp. 715-722. 2023. (link)
Ho, L., Barnhart, J., Trager, R., Bengio, Y., Brundage, M., Carnegie, A., Chowdhury, R., Dafoe, A., Hadfield, G., Levi, M. and Snidal, D., 2023. International institutions for advanced AI. arXiv:2307.04699. https://arxiv.org/abs/2307.04699 (blog)
Shevlane, T., Farquhar, S., Garfinkel, B., Phuong, M., Whittlestone, J., Leung, J., ... & Dafoe, A. (2023). Model evaluation for extreme risks. arXiv:2305.15324. https://arxiv.org/abs/2305.15324 (blog)
Ding, Jeffrey, and Allan Dafoe. "Engines of power: Electricity, AI, and general-purpose, military transformations." European Journal of International Security 8, no. 3 (2023): 377-394. (journal)
Sandbrink, Jonas, Hamish Hobbs, Jacob Swett, Allan Dafoe, and Anders Sandberg. "Differential technology development: A responsible innovation principle for navigating technology risks." Available at SSRN (2022).
Baobao Zhang, Markus Anderljung, Lauren Kahn, Noemi Dreksler, Michael C. Horowitz, Allan Dafoe. Ethics and Governance of Artificial Intelligence: Evidence from a Survey of Machine Learning Researchers. (2021). Journal of Artificial Intelligence Research. (arXiv) (journal) (Kahn presentation)
Remco Zwetsloot, Baobao Zhang, Noemi Dreksler, Lauren Kahn, Markus Anderljung, Allan Dafoe, and Michael C. Horowitz. Skilled and Mobile: Survey Evidence of Immigration Preferences of AI Researchers. (2021). The Proceedings of the 2021 AAAI/ACM Conference on AI, Ethics, and Society. (arXiv)
Allan Dafoe, Yoram Bachrach, Gillian Hadfield, Eric Horvitz, Kate Larson, & Thore Graepel. (2021). Cooperative AI: machines must learn to find common ground. Nature, 593. (link)
Toby Shevlane & Allan Dafoe. (2021). The Machinery of Power: Artificial Intelligence as a General-Purpose Power Technology. (link to draft)
Sophie-Charlotte Fischer, Jade Leung, Markus Anderljung, Cullen O’Keefe, Stefan Torges, Saif M. Khan, Ben Garfinkel, and Allan Dafoe. (2021). AI Policy Levers: A Review of the US Government’s Tools to Shape AI Research, Development, and Deployment. Centre for the Governance of AI, Future of Humanity Institute, University of Oxford. (link)
Waqar Zaidi & Allan Dafoe. (2021). International Control of Powerful Technology: Lessons from the Baruch Plan for Nuclear Weapons. Centre for the Governance of AI, Future of Humanity Institute, University of Oxford. 2021: 9. (pdf)
- Media: John Thornhill. (2021). Only scientists and voters can change the politics of catastrophe. Financial Times. (link)
R. Daniel Bressler, Robert F. Trager, Allan Dafoe. (2021). The Offense-Defense Balance and the Costs of Anarchy: When Welfare Improves Under Offensive Advantage. (link)
Carina Prunkl, Carolyn Ashurst, Markus Anderljung, Helena Webb, Jan Leike, & Allan Dafoe. (2021). Institutionalizing Ethics in AI through Broader Impact Requirements. Nature Machine Intelligence, 3(2), 104-110. (journal, pdf)
Carolyn Ashurst, Markus Anderljung, Carina Prunkl, Jan Leike, Yarin Gal, Toby Shevlane, Allan Dafoe. (2020). A Guide to Writing the NeurIPS Impact Statement. Medium. (link)
Allan Dafoe, Edward Hughes, Yoram Bachrach, Tantum Collins, Kevin R. McKee, Joel Z. Leibo, Kate Larson, & Thore Graepel. (2020). Open Problems in Cooperative AI. (arxiv)
Gregory Lewis, Jacob Jordan, David Relman, Gregory Koblentz, Jade Leung, Allan Dafoe, Cassidy Nelson et al. (2020). The Biosecurity Benefits of Genetic Engineering Attribution. Nature Communications. 11, no. 6294. (journal)
Allan Dafoe. (2020). AI Governance: Opportunity and Theory of Impact. Effective Altruism Forum. (html, Effective Altruism Forum)
- EA Forum Prize winner. Evaluated as in top 7 most important posts of the year.
Jeffrey Ding & Allan Dafoe. (Forthcoming). The Logic of Strategic Assets: From Oil to AI. Security Studies. (arxiv)
- On recommended reading list for the European Council on Foreign Relations director's podcast (at 31:30)
Toby Shevlane, Ben Garfinkel, & Allan Dafoe. (April 2020). Contact tracing apps can help stop coronavirus. But they can hurt privacy. Monkey Cage, Washington Post. (link)
Miles Brundage, Shahar Avin, Jasmine Wang, et al. (2020). Toward Trustworthy AI Development: Mechanisms for Supporting Verifiable Claims. (pdf)
Toby Shevlane & Allan Dafoe. (2020). The Offense-Defense Balance of Scientific Knowledge: Does Publishing AI Research Reduce Misuse of the Technology? In Proceedings of the 2020 AAAI/ACM Conference on AI, Ethics, and Society (AIES’20). (pp. 173-179). https://bit.ly/ShevlaneDafoeODBK
Cullen O’Keefe, Peter Cihon, Carrick Flynn, Ben Garfinkel, Jade Leung & Allan Dafoe. (2020). The Windfall Clause: Distributing the Benefits of AI for the Common Good. In Proceedings of the 2020 AAAI/ACM Conference on AI, Ethics, and Society (AIES’20) (pp. 327-331). (arxiv)
- Cullen O’Keefe, Peter Cihon, Carrick Flynn, Ben Garfinkel, Jade Leung and Allan Dafoe. (2020). The Windfall Clause: Distributing the Benefits of AI. Centre for the Governance of AI Technical Report. (link)
Aaron Tucker, Markus Anderljung, & Allan Dafoe. Social and Governance Implications of Improved Data Efficiency. (2020). In Proceedings of the 2020 AAAI/ACM Conference on AI, Ethics, and Society (AIES’20). (pp. 378-384). (arxiv)
Remco Zwetsloot & Allan Dafoe. (2019). Thinking About Risks From AI: Accidents, Misuse, and Structure. Lawfare. (link)
Baobao Zhang & Allan Dafoe. (2019). Artificial Intelligence: American Attitudes and Trends. Centre for the Governance of AI, Future of Humanity Institute, University of Oxford. (pdf, html)
- Media: Bloomberg, Vox, Axios, the MIT Technology Review and the Future of Life Institute podcast.
Baobao Zhang & Allan Dafoe. (2020). U.S. Public Opinion on the Governance of Artificial Intelligence. In Proceedings of the 2020 AAAI/ACM Conference on AI, Ethics, and Society (AIES’20) (pp. 187-193). (arxiv, conference)
Ben Garfinkel & Allan Dafoe. (2019). How does the Offense-Defense Balance Scale? Journal of Strategic Studies. 42:6, 736-763 (journal, pdf)
Nick Bostrom, Allan Dafoe, & Carrick Flynn. (2019). Public Policy and Superintelligent AI: A Vector Field Approach. In S. Matthew Liao, ed., Ethics of Artificial Intelligence. New York: Oxford University Press. (pdf, publisher)
Allan Dafoe. (2018). AI Governance: A Research Agenda. Centre for the Governance of AI, Future of Humanity Institute, University of Oxford. (pdf)
Miles Brundage, Shahar Avin, ..., Allan Dafoe, ..., Dario Amodei. (2018). The Malicious Use of Artificial Intelligence: Forecasting, Prevention, and Mitigation. (pdf)
Katja Grace, John Salvatier, Allan Dafoe, Baobao Zhang, & Owain Evans. (2018). Viewpoint: When Will AI Exceed Human Performance? Evidence from AI Experts. Journal of Artificial Intelligence Research. 62: 729-754. (journal, arxiv)
- Media: #16 in Altmetric most discussed articles of 2017, BBC, Newsweek, NewScientist, Tech Review, ZDNet, Musk, Slate Star Codex, The Economist,...
Syllabus for Yale seminar "Global Politics of AI", 2017.
Allan Dafoe & Miles Brundage. (2017). Evidence submitted to Lords Select Committee on Artificial Intelligence on behalf of Future of Humanity Institute. (pdf)
Allan Dafoe & Stuart Russell. (2016). Yes, We Are Worried About the Existential Risk of Artificial Intelligence. MIT Technology Review. (link, replication files)
Allan Dafoe. (2015). On Technological Determinism: A Typology, Scope Conditions, and a Mechanism. Science, Technology & Human Values. 40(6): 1047-1076. (journal) (pdf)

Reputation, Honor, Provocation, Resolve

Leaders and publics care about reputation and honor. This concern seems to be an important cause of war. Is it? I investigate this through survey experiments, natural experiments, and theory.

Allan Dafoe, Remco Zwetsloot, and Matthew Cebul. (2021). Reputations for Resolve and Higher-Order Beliefs in Crisis Bargaining. Journal of Conflict Resolution: 0022002721995549.
Matthew Cebul, Allan Dafoe, & Nuno Monteiro. (2021). Coercion and the Credibility of Assurances. Journal of Politics. 83(3) (pdf, appendix)
Allan Dafoe, Sophia Hatz, & Baobao Zhang. (2020). Coercion and Provocation. Journal of Conflict Resolution. (pdf)
Jessica Weiss & Allan Dafoe. (2019). Authoritarian Audiences and Elite Rhetoric in International Crises: Evidence from China. International Studies Quarterly. 63(4), 963-973. (pdf, prereg, journal).
- Monkey Cage (Washington Post) by Weiss
Allan Dafoe & Jessica Weiss. Provocation, Public Opinion, and Crisis Escalation: Evidence from China. (pdf, prereg)
Allan Dafoe, Remco Zwetsloot, & Matthew Cebul. (2021). Reputations for Resolve and Higher-Order Beliefs in Crisis Bargaining. Journal of Conflict Resolution. (pdf, journal, supplementary materials)
Allan Dafoe & Devin Caughey. (2016). Honor and War: Southern U.S. Presidents and the Effects of Concern for Reputation. World Politics. 68(2): 341-381. (pdf; link to other files)
  - Winner of the 2011 International Studies Association Kenneth E. Boulding Award.
  - Among the six most-cited 2016 World Politics articles
  - Boston Globe Brainiac article
Jonathan Renshon, Allan Dafoe, & Paul Huth. (2018). Leader Influence and Reputation Formation in World Politics. American Journal of Political Science. 62: 325-339 (pdf, journal)
Allan Dafoe, Jonathan Renshon, & Paul Huth. (2014). Reputation and Status as Motives for War. Annual Review of Political Science. 17: 371–393 (pdf)
Allan Dafoe. (2012). Resolve, Reputation, and War: Cultures of Honor and Leaders’ Time-in-Office. UC Berkeley Dissertation. (pdf or here)
Allan Dafoe. (2011). Review of Thomas Lindemann's 'Causes of War: the Struggle for Recognition.' Journal of Peace Research. 48(5): 685-686. (pdf)

The Liberal Peace

The peace amongst liberal countries is one of the most important phenomena for the wellbeing of humanity. I seek to understand what causes it.

Joslyn Barnhart, Allan Dafoe, Elizabeth N. Saunders, & Robert Trager. (2020). The Suffragist Peace. International Organization. 74(4), 633-670 (pdf, journal)
  - Joslyn N. Barnhart, Robert F. Trager, Elizabeth N. Saunders, Allan Dafoe. (2020). "Women’s Suffrage and the Democratic Peace: Female Voters Slow the March to War." Foreign Affairs. August 18. (link)
  - Referenced at https://www.nytimes.com/2022/01/12/opinion/gender-gap-politics.html
Consulted with Steven Pinker, who advised on President Obama's Athens speech, about how to characterize the democratic peace. November 14, 2016. (link)
Allan Dafoe & Nina Kelsey. (2014). Observing the Capitalist Peace: Examining Market-Mediated Signaling and Other Mechanisms. Journal of Peace Research. 51(5): 619–633. (journal, pdf, rep files)
- Runner-up for Nils Petter Gleditsch JPR Article of the Year Award, 2014.
Allan Dafoe, John Oneal, & Bruce Russett. (2013). The Democratic Peace: Weighing the Evidence and Cautious Inference. International Studies Quarterly. 57(1): 201–214. (article pdf) (replication files)
Allan Dafoe & Bruce Russett. (2013). Does Capitalism Account for the Democratic Peace? The Evidence Still Says No. In Assessing the Capitalist Peace, ed. Gerald Schneider & Nils Petter Gleditsch. Routledge. (replication files)
Moderator and contributor for National Intelligence Council website on Global Trends 2030; August 2012. Moderated virtual roundtable involving William Thompson, Jack S. Levy, Richard Rosecrance, Benjamin Fordham, Bradley Thayer, Joshua Goldstein, Steven Pinker, and Erik Gartzke.
Allan Dafoe. (2011). Statistical Critiques of the Democratic Peace: Caveat Emptor. American Journal of Political Science. 55(2): 247–262. (article pdf) (supporting information) (replication files)

Methodology

Causal inference is central to social science. Many of our tools depend on implausible parametric assumptions, are fragile, and are opaque. I seek to develop tools for causal inference, particularly for observational data, that do not depend on implausible assumptions, are more robust, and are more transparent.

Eggers, Andrew C., Guadalupe Tuñón, and Allan Dafoe. "Placebo tests for causal inference." American Journal of Political Science (2023). (journal)
Garret Christensen, Allan Dafoe, Edward Miguel, Don A. Moore, Andrew K Rose. (2019). "A study of the impact of data sharing on article citations using journal policies as a natural experiment." PloS one. Dec 18;14(12):e0225883. (journal open)
Devin Caughey, Allan Dafoe, & Luke Miratrix. Beyond the Sharp Null: Permutation Tests, Heterogeneous Effects, and Bounded Null Hypotheses. (arxiv)
Trang Nguyen, Allan Dafoe, & Elizabeth Ogburn. (2019). The Magnitude and Direction of Collider Bias for Binary Variables. Epidemiologic Methods. (journal, arxiv)
Allan Dafoe, Baobao Zhang, & Devin Caughey. (2018). Information Equivalence in Survey Experiments: Diagnostics and Solutions. Political Analysis. 26(4): 399-416. (link to pdf and pre-analysis plan)
Devin Caughey, Allan Dafoe, & Jason Seawright. (2017). Nonparametric Combination (NPC): A Framework for Testing Elaborate Theories. The Journal of Politics. 79(2): 688-701. (pdf, journal, pdf with appendix)
Allan Dafoe. (2018). Nonparametric Identification of Causal Effects under Temporal Dependence. Sociological Methods & Research. 47(2): 136-168 (pdf)
Allan Dafoe. (2012). "Commentary on John Gerring's Social Science Methodology." Qualitative and Multi-Method Research Newsletter. 10(1):1-4. (pdf)

Transparency

Science depends on transparency. I work to promote better transparency norms and practices: scientists should share complete replication files, should preregister their analyses, should make their analyses transparent, and should evaluate the robustness of their results to reasonable alternative specifications.

Leamer-Rosenthal Prize for Open Social Science: Emerging Researcher. (2015).
Brian Nosek, George Alter,..., Allan Dafoe,..., Tal Yarkoni. (2015). Promoting an Open Research Culture. Science. 348 (6242): 1422-1425. (journal) (pdf)
Allan Dafoe. (2014). Science Deserves Better: The Imperative to Share Complete Replication Files. PS: Political Science & Politics. 47(1): 60–66 (journal) (pdf) (blogs 1, 2, 3) (replication files)

Unpublished Drafts
