Updates
- Blog posts for Class 8 (Interpretability) and Class 9 (Extractability) are now posted.
Reading for Tuesday, 24 February and Thursday, 26 February
- Dario Amodei. The Adolescence of Technology: Confronting and Overcoming the Risks of Powerful AI. January 2026.
This essay is intended as a follow-up to Amodei’s Machines of Loving Grace essay we discussed in Class 3. It is the only required reading for next week. It is quite long and covers a lot of ideas, so we will split it over the two classes next week.
(Recap from Earlier) Readings for Thursday, 19 February
- Jessica Ji, Jenny Jun, Maggie Wu, and Rebecca Gelles. Cybersecurity Risks of AI-Generated Code. CSET Report, November 2024. This is a long report, and you are not expected to read all of it, but you should skim it to see what it covers and pick the sections you find interesting to read in more detail. [Report Webpage] [PDF]
Read at least one of the two posts below:
- Simon Willison, How StrongDM’s AI team build serious software without even looking at the code. 7 February 2026.
- Nicholas Carlini, Building a C compiler with a team of parallel Claudes. Anthropic Blog, 5 February 2026.