Dallas Card

Email: dalc@umich.edu
Office: North Quad 3421
GitHub, Bluesky, Blog
Google Scholar, ORCiD

I am an assistant professor in the School of Information at the University of Michigan. Before that, I was a postdoctoral researcher in the Stanford NLP Group and the Stanford Data Science Institute. I received my Ph.D. from the Machine Learning Department at Carnegie Mellon University, where I was advised by Noah Smith.

My research centers on making machine learning more reliable and responsible, and on using machine learning and natural language processing to learn about society, history, and culture.

Updates

July 2025: I will be attending ACL 2025 in Vienna, as well as the Pre-ACL workshop hosted by the Pioneer Center for AI in Copenhagen.
June 2025: Two papers by my student Ben Litterer (with David Jurgens and me) have been accepted to ACL: the SPoRC podcast dataset paper, and a review of methods for measuring linguistic accommodation/coordination! (Official ACL versions forthcoming)
June 2025: I also had a solo-authored paper accepted to ACL, investigating the use of the Corpus of Founding Era American English (COFEA) for studying questions of historical legal meaning. Please see the accompanying online appendix for more details!
November 2024: We've released the first version of our Structured Podcast Research Corpus (SPoRC), and accompanying paper!
October 2024: I will be speaking at the Center for Law and Social Science Workshop (CLASS) at USC on November 18th.
July 2024: I will be attending IC2S2 2024 in Philadelphia, July 17-20.

Current Ph.D. Students

Ben Litterer (co-advised with David Jurgens)
Lavinia Dunagan
Meera Desai (co-advised with Abigail Jacobs)

Selected Publications

Mapping the Podcast Ecosystem with the Structured Podcast Research Corpus [preprint]
Benjamin Litterer, David Jurgens, Dallas Card
Association for Computational Linguistics (ACL), 2025
[code] [data]

Coordinating Chaos: A Structured Review of Linguistic Coordination Methodologies [preprint]
Benjamin Litterer, David Jurgens, Dallas Card
Association for Computational Linguistics (ACL), 2025

Meaning Variation and Data Quality in the Corpus of Founding Era American English [preprint]
Dallas Card
Association for Computational Linguistics (ACL), 2025
[online appendix] [code]

You don't need a personality test to know these models are unreliable: Assessing the Reliability of Large Language Models on Psychometric Instruments
Bangzhao Shu, Lechen Zhang, Minje Choi, Lavinia Dunagan, Lajanugen Logeswaran, Moontae Lee, Dallas Card, David Jurgens
North American Chapter of the Association for Computational Linguistics (NAACL), 2024
[bib]

An Archival Perspective on Pretraining Data
Meera A. Desai, Irene V. Pasquetto, Abigail Z. Jacobs, Dallas Card
Patterns, March 2024
[bib]

When it Rains, it Pours: Modeling Media Storms and the News Ecosystem
Benjamin Litterer, David Jurgens, Dallas Card
Findings of Empirical Methods in Natural Language Processing (EMNLP), 2023
[code] [bib]

Substitution-based Semantic Change Detection using Contextual Embeddings
Dallas Card
Association for Computational Linguistics (ACL), 2023
[code] [bib]

Whose Language Counts as High Quality? Measuring Language Ideologies in Text Data Selection
Suchin Gururangan, Dallas Card, Sarah K. Dreier, Emily K. Gade, Leroy Z. Wang, Zeyu Wang, Luke Zettlemoyer, Noah A. Smith
Empirical Methods in Natural Language Processing (EMNLP), 2022
[code] [bib]

Computational analysis of 140 years of US political speeches reveals more positive but increasingly polarized framing of immigration
Dallas Card, Serina Chang, Chris Becker, Julia Mendelsohn, Rob Voigt, Leah Boustan, Ran Abramitzky, Dan Jurafsky
Proceedings of the National Academy of Sciences 119(31), 2022
[data and code] [bib] Media coverage: New York Times, Washington Post, NPR

The Values Encoded in Machine Learning Research
Abeba Birhane, Pratyusha Kalluri, Dallas Card, William Agnew, Ravit Dotan, and Michelle Bao
ACM Conference on Fairness, Accountability, and Transparency (FAccT), 2022
[data and code] [bib] Media coverage: VentureBeat
[Distinguished Paper Award]

Modular Domain Adaptation
Junshen Chen, Dallas Card, Dan Jurafsky
Findings of the Association for Computational Linguistics (ACL), 2022
[blog post] [code] [bib]

Problems with Cosine as a Measure of Embedding Similarity for High Frequency Words
Kaitlyn Zhou, Kawin Ethayarajh, Dallas Card, Dan Jurafsky
Association for Computational Linguistics (ACL), 2022
[bib]

On the Opportunities and Risks of Foundation Models
Rishi Bommasani, Drew A. Hudson, Ehsan Adeli, Russ Altman, Simran Arora, Sydney von Arx, Michael S. Bernstein, Jeannette Bohg, Antoine Bosselut, Emma Brunskill, Erik Brynjolfsson, Shyamal Buch, Dallas Card, et al.
arXiv:2108.07258, 2021
[bib]

Expected Validation Performance and Estimation of a Random Variable's Maximum
Jesse Dodge, Suchin Gururangan, Dallas Card, Roy Schwartz, Noah A. Smith
Findings of Empirical Methods in Natural Language Processing (EMNLP), 2021
[bib]

Causal Effects of Linguistic Properties
Reid Pryzant, Dallas Card, Dan Jurafsky, Victor Veitch, and Dhanya Sridhar
North American Chapter of the Association for Computational Linguistics (NAACL), 2021
[bib]

With Little Power Comes Great Responsibility
Dallas Card, Peter Henderson, Urvashi Khandelwal, Robin Jia, Kyle Mahowald, and Dan Jurafsky
Empirical Methods in Natural Language Processing (EMNLP), 2020
[code] [bib]

Detecting Stance in Media On Global Warming
Yiwei Luo, Dallas Card, and Dan Jurafsky
Findings of Empirical Methods in Natural Language Processing (EMNLP), 2020
[code] [bib]

Explain like I am a Scientist: The Linguistic Barriers of Entry to r/science
Tal August, Dallas Card, Gary Hsieh, Noah A. Smith, and Katharina Reinecke
Human Factors in Computing Systems (CHI), 2020
[bib]

On Consequentialism and Fairness
Dallas Card and Noah A. Smith
Frontiers in Artificial Intelligence, 2020
[bib]

Show Your Work: Improved Reporting of Experimental Results
Jesse Dodge, Suchin Gururangan, Dallas Card, Roy Schwartz, and Noah A. Smith
Empirical Methods in Natural Language Processing (EMNLP), 2019
[code] [bib] Media coverage: WIRED

Variational Pretraining for Semi-supervised Text Classification
Suchin Gururangan, Tam Dang, Dallas Card, and Noah A. Smith
Association for Computational Linguistics (ACL), 2019
[code] [bib]

The Risk of Racial Bias in Hate Speech Detection
Maarten Sap, Dallas Card, Saadia Gabriel, Yejin Choi, and Noah A. Smith
Association for Computational Linguistics (ACL), 2019
[bib] Media coverage: VOX

Deep Weighted Averaging Classifiers
Dallas Card, Michael Zhang, and Noah A. Smith
ACM Conference on Fairness, Accountability, and Transparency (FAccT), 2019
[code] [blog post] [bib]

Neural Models for Documents with Metadata
Dallas Card, Chenhao Tan, and Noah A. Smith
Association for Computational Linguistics (ACL), 2018
[code] [tutorial] [bib]

Friendships, Rivalries, and Trysts: Characterizing Relations between Ideas in Texts
Chenhao Tan, Dallas Card, and Noah A. Smith
Association for Computational Linguistics (ACL), 2017
[blog post] [bib]

Analyzing Framing through the Casts of Characters in the News
Dallas Card, Justin H. Gross, Amber E. Boydstun, and Noah A. Smith
Empirical Methods in Natural Language Processing (EMNLP), 2016
[bib]

The Media Frames Corpus: Annotations of Frames Across Issues
Dallas Card, Amber E. Boydstun, Justin H. Gross, Philip Resnik, and Noah A. Smith
Association for Computational Linguistics (ACL), 2015
[bib]

Recent Professional Service

FAccT steering committee member (2023-2025)
EMNLP 2025 publicity chair
Co-organizer of the NLP+CSS workshop at NAACL 2024, June 21 in Mexico City.
Co-organizer of the 2024 Midwest Speech and Language Days, April 15-16th, University of Michigan
Area Chair for ACL Rolling Review (2025, 2024, 2023), FAccT (2025, 2024, 2023), ACL (2023), NAACL (2021)
Reviewer for COLM (2024), ACL Rolling Review (2022, 2021), ACL (2022, 2021), EMNLP (2022, 2021), NAACL (2022, 2021), TACL (2023, 2022, 2021), EMNLP Ethics reviewer (2023, 2022, 2021), FAccT (2022), AAAI (2022, 2021), AIES (2023), International Journal of Communication (2024), The Web Conference (2023), Philosophy and Technology (2021), PeerJ (2021)

About me

I'm originally from Winnipeg, but I have also lived in Toronto, Waterloo, Halifax, Sydney, Kampala, Pittsburgh, Seattle, Palo Alto, and now Ann Arbor.

I am an occasional guest on The Reality Check podcast. You can hear me in episodes #466 (biased algorithms), #382 (deep learning), #362 (Simpson's paradox), and #227 (fMRI and vegetative states).

[short bio for talks]