Overlay
Each dot is one model variant. The x-axis shows release date; the y-axis shows pass rate on the selected traps. Filled dots mean thinking is on; hollow dots mean thinking is off. Hover over any dot for details. Select one or more traps to overlay them on the same chart; "All (avg)" shows the average across every trap.
Pass rate is how often the model answers correctly. Humans pass these questions; a cognitive trap works when models fail it. Models are profiled by architectural-constraint category, so the view stays meaningful as new traps are added. Click any model for its full per-trap breakdown.
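As a rough illustration of the arithmetic behind these numbers, the sketch below computes a per-trap pass rate and an overall average for one model variant. It assumes that "All (avg)" is an unweighted mean of per-trap pass rates, which this page does not confirm; the function names, trap names, and sample data are hypothetical.

```python
from collections import defaultdict


def pass_rates(results):
    """Return {trap: pass rate} from (trap_name, passed) pairs for one model variant."""
    passed = defaultdict(int)
    total = defaultdict(int)
    for trap, ok in results:
        total[trap] += 1
        passed[trap] += int(ok)
    return {trap: passed[trap] / total[trap] for trap in total}


def average_pass_rate(rates):
    """Unweighted mean over traps (an assumption about how "All (avg)" is computed)."""
    if not rates:
        return None
    return sum(rates.values()) / len(rates)


# Hypothetical sample: two attempts on each of two made-up traps.
example = [
    ("trap_a", True),
    ("trap_a", False),
    ("trap_b", True),
    ("trap_b", True),
]
rates = pass_rates(example)
overall = average_pass_rate(rates)
print(rates, overall)
```

With an unweighted mean, every trap counts equally regardless of how many attempts it has; a mean weighted by attempt count would be an equally plausible reading.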