Chatbot Roulette

There was some excitement online recently around “Freysa”, a contest which combined large language models (LLMs) with some sort of blockchain technology. The basic idea was for participants to interact with an LLM to try to get it to do a thing that it had been instructed not to do. The trick was that the cost of each attempt increased after every submission (up to some maximum), as did the potential reward.

What is a Podcast?

What is a podcast? I expect we all have our own slightly idiosyncratic answer to that question. For me, podcasts are defined partly by some software I use on my phone that regularly downloads audio files from various feeds, which I periodically listen to, often as I’m doing other things. Most of these recordings feature two or more people engaged in a long-form conversation. Some of these take the form of interviews, in which a regular host interviews a rotating series of guests. Others feature rotating subsets of the same group of hosts, talking together about various issues. A few shows, like Hardcore History, are just long monologues, not so different from an audiobook.

Bespoke Navigation

I don’t normally do this, but I recently did a short bike trip in Michigan, and I thought I would write a brief post about it, in part because it cued various other ideas which have a more direct connection to the kinds of things I normally write about here.

Credible Estimates and Open Science

Somewhere on social media I encountered the recent story that the Tony Blair Institute for Global Change (a large, avowedly centrist think tank created by Tony Blair) put out a report estimating the potential impact of AI on jobs in the public service, but that, ironically, they arrived at their estimates by basically just asking GPT-4. While this does have a certain kind of poetry to it, it’s not exactly what we think of as a reputable methodology.

Altair vs. Bokeh (part 3): Scatterplots

This is my third entry in a series comparing two interactive data visualization libraries for Python, Bokeh and Altair. In part 1, I went through a basic overview of some of the differences in terms of syntactic style and defaults. In part 2, I went slightly deeper into some of the interactive features and more fundamental limitations. In this part, I want to deepen the comparison further, by exploring the capabilities each package offers for one basic chart type, namely scatterplots.

The OpenAI Library

The New York Times recently ran a brief article about the reading room in OpenAI’s office in San Francisco. The article was heavy on images and light on text, but the overall theme was the tension between the company’s GPT models (which have been trained on vast swaths of human culture, and are therefore able to regurgitate, remix, and approximate it) and its embrace of the design and aesthetic trappings of a traditional library reading room. The article mentioned a handful of books that could be found in the OpenAI library, but many more were clearly visible in the photographs that accompanied it.

Sociotechnical Considerations

A whole genre of podcasts seems to have emerged recently, taking the shape of interviews with CEOs of AI companies, or similar. Although I have not listened to many of these, they mostly seem to be pretty abysmal, both because of the amount of hype they involve, and due to the lack of meaningful specifics.

That being said, of the ones I’ve heard, the recent interview with Dario Amodei of Anthropic on the Ezra Klein Show seems to me to be vastly better than the average. In part, this is because Klein is a smart, curious, and well-informed interviewer, who asks a lot of the right questions, and pushes for details when responses get too mushy. In this case, it also helps that Amodei seems much more straightforwardly honest about limitations than most spokespeople in similar positions.

Financing Common Crawl

Mozilla recently published an excellent new report about Common Crawl, the non-profit whose web crawls have played an important role in the development of numerous large language models (LLMs). Written by Stefan Baack and Mozilla Insights, the report is based on both public documents and new interviews with Common Crawl’s current director and crawl engineer, and goes into some detail about the history of the organization, and how its data is being used.

ChatGPT Prompt Speculations

In a recent tweet that went viral, Dylan Patel claimed to have discovered or revealed the ChatGPT prompt, using a simple hack. The tweet included a link to a text file on pastebin and a screenshot of that same text with newlines removed. More interestingly, the author suggested in a reply that anyone could replicate this finding, and a subsequent tweet included a video of ChatGPT generating text in response to the same trick. That, however, is where things get somewhat strange.

Infinitely Wide Culture

A lot of this is still unresolved in my mind, but I think there is something interesting happening at the intersection of generative AI, art, style, and entertainment. Rather than letting it gestate until more fully formed, I figured I’d just post some preliminary thoughts and come back to this at some later date.

The main reason I’m thinking about this now is the ongoing debate about how generative AI will impact creative fields, such as writing and design (as well as white-collar jobs more broadly). Arguably there have already been some pretty dramatic effects, such as the sci-fi magazine Clarkesworld being suddenly overwhelmed by spammy submissions. At the same time, it’s hard to know the extent to which these disruptions may end up being transient phenomena that broader systems will adapt to.