Blogging Team [5]: Yuezhang Chen, Seth Lifland, Grace Kitthanawong, Iman Mohamed, Salonee Verma
Class Updates
Following up on a question from the previous class, Professor Evans took us through the landscape of social media class action lawsuits. Many firms are bringing these lawsuits against social media companies, and it will be interesting to see how they shape public perception. Lawsuits helped change the perception of tobacco, and since the 1980s there has been a dramatic reduction in smoking with consequent improvements in public health, so it’s possible these suits could do the same today.
We also discussed the Anthropic vs DoW case, in which a preliminary injunction was recently granted. This means that the DoW (at least temporarily) cannot proceed with its designation of Anthropic as a supply chain risk, and all that entails, until the court issues its final ruling. Judge Rita Lin has been assigned to the case, and does not like long PowerPoint presentations.
Project Pitches
Amber
Project team Amber presented first with their project, “A Model’s Ethical Variability”. The team is investigating how different LLMs change their responses in important ways when prompted differently, with a specific emphasis on ethical and moral situations. The results across many different models will ultimately be compiled into a compelling data visualization.
Jade
Project team Jade then presented their work on “Autonomous Game Playing: Terminal”, which aims to test how adjusting different parameters affects a model’s strategy in the game Terminal. The project showcases how small changes to reward functions can cause large differences in model behavior, a dynamic that generalizes to real-world use cases. So far, they have found that tweaking the aggression level has a large impact on model win rate, with sweet spots at both very low and medium aggression.
Cyan
Project team Cyan is exploring the nature of political content moderation among LLMs. They are curious whether LLMs are actually the objective sources that they present themselves as. To do this, they used DeepL, an AI translation platform, to translate 225 political prompts across 10 languages, observing the differences in outputs. They plan to compile their observations using cluster maps and charts to visualize differences in alignment across languages.
Ember
Team Ember presented their project exploring AI in Hiring. They are building ResumeSimulator, an educational tool that screens sample resumes to show what happens during automated resume screening. Their goal is to make the hiring process more transparent and help people understand why their resumes may be filtered out before reaching a recruiter. They are focusing on five common resume evaluation metrics used in automated hiring systems, and the tool will allow users to upload resumes for feedback. Their next steps include integrating more research while polishing the user experience.
Hazel
Project Team Hazel presented “May I AI?”, an exploration of the effects of AI on everyday social writing. They were guided by curiosity about how laypeople with little existing AI knowledge experience AI integration in their personal lives and communication. They plan to interview thirty people about their relationships with AI in social communication, using a personalized survey structure, and will aggregate their research into three Substack articles. So far, most interviewees have said that they do not appreciate AI-generated writing in personal settings.
Indigo
Team Indigo presented “Competing Futures”. They were motivated by the overwhelming amount of information available on the important actors, events, and incentives that could affect our future. By clarifying how vastly different scenarios could impact society, they hope to give people a sense of what the current landscape looks like and what the future might become. To showcase their research, they will create a game that players can use to see how possible futures differ and how different events and assumptions shift which future is most likely.
Bisque
Team Bisque presented “Data Centers in Orbit”, featuring a three-episode podcast. Since solar energy is far more available and efficient in space, several credible groups have proposed putting data centers there. The team plans to interview people with different perspectives, including the person in charge of UVA’s $72 million data center expansion project. Each episode will cover one topic: regulation, environment, or economic feasibility. The results will be presented on an interactive website.
Garnet
Team Garnet’s project is on AI and Deskilling in Education. AI has a growing presence in education, and there is a risk that many students become overreliant on it. Garnet wants to understand the long-term impacts of using AI in school. They found a UK study that surveyed participants of varying ages about how frequently they use AI, which suggests that AI use might impact students’ critical thinking. They are working on a research paper and plan to turn it into a website.
Dahlia
Team Dahlia’s project “Is AI Art ‘Real’ Art?” explores how perceptions of art change when people learn it was created with AI, examining resistance to AI-generated work and its place in art. The next step is an interactive website with interviews and guided questions to capture viewer perspectives. A key finding so far is that tools like Google Gemini embed detectable watermarks in their outputs.
Fuchsia
Team Fuchsia’s project “Unskilled to Autofilled” explores whether AI can generate secure web applications by testing AI-built websites for vulnerabilities and identifying common security flaws. Early findings show that models like Claude Sonnet often use outdated dependencies and default to frameworks like Flask, producing functional but insecure sites. Multiple models, including Sonnet, Gemini, and ChatGPT, were tested, revealing that while all can build working websites, their quality and design vary, with some producing more polished results than others.
Professor Evans’s Feedback
After all the teams had gone, Professor Evans provided the class with some advice and final feedback, emphasizing that teams should take notice of certain aspects of the presentations they had seen from other teams. Many had great ideas and slides that worked really well, and people should think about how to incorporate those effective approaches into their own presentations. But there were also slides and presentations that didn’t work as well; it is worth understanding why a presentation is less effective, observing carefully, and learning to avoid those pitfalls in your own presentations.
He highlighted the value of using screen space effectively, and encouraged groups to use it in the most useful way, such as having a large image that fills most of the screen rather than leaving 80% of the screen unused.
As a reminder, we discussed some ways to make better presentations in Class 14, and if you haven’t already watched it, please watch Patrick Henry Winston’s How to Speak talk.