
o1 has a ~1650 rating. At that level, many or most of the problems you will be solving are transplants of a relatively well-known problem.

Since o1 on codeforces just tried hundreds or thousands of solutions, it's not surprising it can solve problems where it is really about finding a relatively simple correspondence to a known problem and regurgitating an algorithm.

In fact, when you run o1 on "non-standard" codeforces problems, it will almost always fail.

See for example this post running o1 multiple times on various problems: https://codeforces.com/blog/entry/133887

So the thesis that it's about recognizing a problem with a known solution and not actually coming up with a solution yourself seems to hold, as o1 seems to fail even on low rated problems which require more than fitting templates.



o3 is what I'm referring to, and it is rated 2700.


It's extremely unlikely that o3 hit 2700 in live contests, as such a rapid increase in rating would have been noticed by the community. Since it clearly wasn't run live, I looked for anything detailing how contamination was avoided and found nothing online, including in their video. Neither could I find details about the methodology (number of submissions being the big one; in contests you can also get 'hacked', especially at a high level), problem selection, etc.

Additionally, people weren't able to straightforwardly replicate the o1-mini results in live contests, often getting ratings between 700 and 1200, which raises questions about the methodology.

Perhaps o3 really is that good, but I just don't see how you can claim what you claimed for o3. We have no idea whether the problems had never been seen before, and the fact that people find much lower Elo scores for o1/o1-mini with proper methodology raises even more questions. That is a long way from conclusively proving these are truly novel tasks it has never seen.



