How is the program structured?

Content

The program runs for 5 weeks. Each week focuses on a different phase of building and testing an alignment method. The goal is to embed values in a system in a way that generalizes and can’t be easily gamed. Participants will form teams during the app process. We recommend teams of 3–5. You can apply solo or with collaborators.

Mentors may support specific teams depending on availability. Teams are expected to coordinate independently and meet regularly. If someone drops out, we’ll help rebalance teams where needed.

</aside>

Week 1: Scoping

In the first week, teams will examine previous attempts within their chosen method. This includes reviewing what has been done before, why they failed or suceeded, and what directions are promising to explore. The goal is to understand what makes this iteration different, and to identify what makes this experiment a success. Teams with assigned mentors will coordinate throughout the week to stress-test their direction.

By the end of Week 1, the team should have:

Reviewed prior work related to their method.
Looked into the novelty in their chosen method.
Drafted a preliminary plan for how to collect and interpret supporting evidence.

<aside> ✍️

Track-specific Examples

Agent Foundations: Define a formal proof under bounded assumptions and sketch a path from proof to implementable architecture
Neuroscience-Based: Identify candidate regions or mechanisms in the brain associated with value encoding, and outline a model for reproducing or testing this behavior
Preference Optimization: Establish a case for how the method improves on prior oversight approaches, supported by references to eval results or known alignment gaps
Open Track: Justify expected scalability and identify alignment evaluations, interpretability tools, and robustness tests appropriate for the method </aside>

Week 2-3: Experimentation

Teams will begin implementation by running tests and iterating based on the research they found from Week 1. We will provide TPU credits and mentorship to help teams build their project from the ground up. In general, every team is expected to test whether their method actually moves the needle on alignment.

By the end of Week 3, the team should have:

At least one working implementation running, even if minimal
Tested the method at increasing scale (larger models, more data, more steps)
Planned for robustness testing in Week 4

<aside> ✍️

For the Agent Foundations tracks

If you’re in this track, you’ll follow one of two subtracks:

Theoretical Focus: Extend proofs, derive constraints, stress-test assumptions
Category Theory: Use string diagrams or string machines to construct and reason about infrakernels </aside>

Week 4: Testing

Teams will critique their alignment method, attempt to break their own evals, and run tests on larger or more adversarial setups. Mentors and the AI-Plans team will advise teams based on prior hackathons in alignment evals.

By the end of Week 4, the team should have:

Attempted to red-team its own method
Documented failure modes and anomalies in the results
Finalizes the version they’ll write up and present

Week 5: Wrap-up

Teams will write their final summary. This includes the method, evidence, assumptions, failure analysis, and proposed next steps. The summary should stand on its own as a falsifiable alignment contribution. Teams will also prepare their poster and get final feedback.

By the end of Week 5, the team should have:

Written a research summary that includes a clear statement of the method, rationale, evaluation, and result
Created poster materials for the final presentation
Integrated the feedback from mentors and peers

Final Day Presentation + Job Fair

The program ends with a public poster session and a job fair.

Presentation Evening: Posters will be presented online in a conference-style format in GatherTown. Each team will have a virtual space to share their work and talk with attendees. Teams will present their work and defend their method. A panel will vote on standout projects.
Job Fair: Research orgs, labs, and startups can host booths, meet researchers, and share open roles.

<aside> 🍿

Attendance and Pricing

All funds go toward program costs and participant stipends.

General admission: €10 for a guaranteed spot
Unpaid participants or below €10 threshold: Free with reservation
Org booths at job fair: €200
15-minute talk slot on main stage: €2000 </aside>

Overview

Content

People

Guides

Apply

Support Us