University of Michigan's New AI Tools

Just before the start of the fall semester, the University of Michigan announced that it was launching a new suite of tools, all focused on “generative AI”, (although so far limited to language models), which will be available to all students, faculty, and staff. This post provides a preliminary exploration of the new offerings.

Altair vs. Bokeh (part 1)

This is the first of what I hope will be a series of posts comparing Altair and Bokeh. Both are actively supported python packages for making interactive visualizations. This post will only scratch the surface, but is intended to show the basic differences in how they approach creating visualizations.

The Gradual Disappearance of Twitter

It was recently reported by inews.co.uk that Twitter is going to start charging academic researchers and institution $42,000/month if they want to maintain their current level of expansive access to data, and – more significantly – require that they delete all Twitter data from their archives if they do not. I’d heard rumors that this might be happening a few weeks ago, but the iNews article is the first independent reporting that I’ve seen about it.

ChatGPT Hobbyists

ChatGPT has understandably garnered a huge amount of attention from all corners of academia, from philosophy to economics. One of the more quixotic examples I’ve encountered recently is Robert W. McGee, and his many papers on this topic, such as “Is Chat Gpt Biased Against Conservatives? An Empirical Study”. A professor of accounting at Fayetteville University, McGee’s biography reads as something like a Marvel Cinematic Universe version of a nerdy academic supervillain.

Samsung's Encounter with ChatGPT

As ChatGPT continues to ricochet through the news cycle, media outlets are surely on the hunt for new angles they can present to the public in order to keep this story in motion. Among other threads, one that has gained some traction is the question of risks to privacy and security presented by these new systems. Last week, a number of US outlets reported on data leaks at Samsung, in which three employees (in separate incidents) apparently entered confidential company information into ChatGPT.

ChatGPT and Sociotechnical Instability

I’ve written about this before, but it’s worth remembering that almost nothing in sociotechnical systems is guaranteed to remain stable for very long. We’ve recently had two great examples of this, with the first being the changes to Twitter, and the second being ChatGPT (and, by extension, the new Bing). In the first case, a platform which had long seemed relatively static, (especially compared to all the rest), rather suddenly changed hands, which led to major changes in what it delivered.

ChatGPT Dominance

I expect that almost anyone reading this will have heard of ChatGPT by now. Released about a month ago, ChatGPT is a system developed by OpenAI which provides text responses to text input. Although details are scarce, under the hood ChatGPT is basically a large language model, trained with some additional tricks (see Yoav Goldberg’s write up for a good summary). In other words, it is a model which maps from the text input (treated as a sequence of tokens), to a distribution over possible next tokens, and generates text by making repeated calls to this function, and sampling tokens from the predicted distributions.

AI, software, and governance

In a recent article covering the FTX collapse, the New York Times described large language models (LLMs) as “an increasingly powerful breed of A.I. that can write tweets, emails and blog posts and even generate computer programs.” There is a lot that we could pick apart in this definition (e.g., what makes LLMs part of a “breed”, what distinguishes the ability to write an email as opposed to a blog post, etc.), but for the moment I’d like to focus on the term “A.I.” (henceforth “AI”). Referring to LLMs as an example of AI is certainly not atypical. Indeed, it increasingly seems like LLMs have become one of the modern canonical examples of this concept. But why is it that we think of these systems as members of this category? And how much rhetorical work is being done by referring to LLMs as a type of “AI”, as opposed to “models”, “programs”, “systems”, or other similar categories?

Hacking LLM bots

For anyone who missed it, a Twitter account named @mkualquiera recently deployed what seems like a kind of adversarial attack in the wild on a large language model (LLM)-based Twitter bot. I’ll link to the key post below, but it’s worth providing a bit of context, as it wasn’t immediately clear to me what was going on when I first saw the tweet.

Report from FAccT 2022

The fifth iteration of FAccT (the ACM Conference on Fairness, Accountability, and Transparency) was held earlier this month (June 21–24) in Seoul, South Korea. More than just a hybrid conference, this was actually a full in-person conference, combined with a full on-line conference. These happened in parallel, with virtual sessions starting before and continuing after the in-person component each day. Around 500 people attended in person, with another 500 participating remotely.