Thanks :-) In working with Claude Code in particular, I’ve found exactly what you’re suggesting to be very powerful. Having Claude write spec documents first, then break the work into clear tasks, then execute the individual tasks against clear acceptance requirements makes the process much more efficient.
An interesting experiment after the release of Sonnet 4. I had a similar impression after the release: the contextual referencing of previous questions seemed worse than it was before. It would be interesting to compare Sonnet 4’s performance with the previous model’s.
An interesting article! I wonder what would happen if you told Claude to test the game and write a report with conclusions based on its test cases.