5 Comments

The security mitigation and capabilities threshold tables are messed up: security mitigation is missing the first four levels, and capabilities threshold is missing the first big chunk. It looks like the formatting may have cut off everything before the page breaks in the Framework doc.


Otherwise looks good, I always appreciate the summaries. It's really helpful when you say whether or not something is worth reading in full.


Yep, thanks, it was a pasting issue (I was editing in Google Docs and the tables failed to copy back). Should be fixed now.


I think the Mary Phuong talk at EAG London 2024 (https://youtube.com/watch?v=ZTmRT2Hg1oM) can add some detail on their method and considerations. In particular, they are exploring persuasion and misalignment as well, but hadn't got far enough with those to include them in the first publication (5:15-6:10, more on persuasion at 13:05-18:00). The audience questions also bring up where DeepMind's comparative advantage lies for evals (23:15-25:00) and missing threat models (26:05-26:45).
