Ady’s World     Contact

Notes about AI Alignment, Safety & Governance

**I’m still figuring out how to transfer content here.**

Superposition & Polysemanticity
   
Sparse Autoencoders

Alignment Faking


Transformer Circuits

Jailbreaks

Mechanistic Interpretability

Benchmarking

Compute Governance

Scaling Laws

Q&A