3 Comments
User's avatar
Joanna Suau's avatar

An interesting article! I wonder what would happen if you told Claude to test the game and write a report with conclusions based on their test case.

Expand full comment
Ronald Ashri's avatar

Thanks :-) In working with Claude Code in particular I’ve found exactly what you are suggesting to be very powerful. Having Claude right spec documents first, then break the work into clear tasks, the execute on the individual tasks with clear acceptance requirements makes the process much more efficient.

Expand full comment
Maaike Coppens's avatar

An interesting experiment after the release of Sonnet 4. I also had a similar impression after the release where the contextual referencing of previous questions was off versus prior to the release. It would be interesting to compare the performance of 4 with the previous model.

Expand full comment