Claude 3.5 Sonnet Codes Really Well
One of my favorite things to do with language models is to use them to write code.
I’ve been wanting to build a variation on tic-tac-toe involving a bit of game theory.
I called it “Tactic”.
I wasn’t even really sure if the game would be any more interesting than tic-tac-toe itself, which reliably ends in draws for any players who understand the basics of the game.
Rather than explain too much, I’ll show the prompt I wrote for claude-3.5-sonnet using Workbench.
Try it yourself!
You will probably receive a response quite similar to what I got.
Related: I need to start saving my model conversations in a consistent format.
Claude outputted code for the following files:
server.js
public/index.html
public/styles.css
public/script.js
It then instructed me to run:
npm init -y
npm install express socket.io
This prompt alone got me surprisingly close to a working version of the game.
I pasted these files into Cursor, then followed Claude’s directions to initialize the project.
I started the server, then opened two browser tabs.
Both clients connected, and the server matched them as soon as the second one joined.
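To make the matching step concrete, here is a minimal sketch of how a server could pair players as they connect. This is not the code Claude generated; `createMatchmaker`, `join`, `leave`, and the `game-N` ids are hypothetical names I'm using for illustration, and the sketch assumes a single waiting slot is enough for a two-player game.

```javascript
// Hypothetical matchmaking sketch: not the code Claude generated.
// Holds at most one waiting player and pairs them with the next arrival.
function createMatchmaker() {
  let waiting = null; // id of the player waiting for an opponent, if any
  let nextGameId = 1; // counter used to build illustrative game ids

  return {
    // Call when a client connects. Returns a match object once two players
    // are available, or null while the first player is still waiting.
    join(playerId) {
      if (waiting === null) {
        waiting = playerId;
        return null;
      }
      const match = {
        gameId: `game-${nextGameId++}`,
        players: [waiting, playerId],
      };
      waiting = null;
      return match;
    },

    // Call when a client disconnects before being matched.
    leave(playerId) {
      if (waiting === playerId) {
        waiting = null;
      }
    },
  };
}
```

In a socket.io server, something like `join(socket.id)` would presumably be called from the `connection` handler, with both sockets added to a room named after `gameId` once a match comes back.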
I started clicking the cells of the board and they appeared selected but they did not show a marker (e.g. “X” or “O”).
I prompted Claude to fix this.
Next, I started trying to submit moves.
The server seemed to be receiving the moves but not notifying the clients of the new board state after calculating it.
I asked Cursor in chat (also using Sonnet) to look at server.js and public/script.js to figure out why the clients weren’t updating.
It then suggested a code change to server.js to fix this, which just worked.
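For a sense of what this class of bug looks like, here is a rough sketch, not the actual change Cursor suggested. It assumes the server keeps a `game` object with an `id` and a `board` array, and that `io` has the shape of socket.io's `io.to(room).emit(event, payload)`. The function name `applyMoveAndBroadcast` and the `boardUpdate` event name are made up for illustration, and the real Tactic rules for resolving moves are left out.

```javascript
// Hypothetical sketch: not the actual fix from this project.
// `io` is any object shaped like socket.io's server: io.to(room).emit(event, payload).
function applyMoveAndBroadcast(io, game, cell, marker) {
  // Update the server-side board state (placeholder for the real game rules).
  game.board[cell] = marker;

  // The step the server was skipping: push the new board to every client
  // in the game's room so both browsers can re-render.
  io.to(game.id).emit('boardUpdate', { board: game.board });

  return game.board;
}
```

On the client, a matching `socket.on('boardUpdate', ...)` handler would then redraw the cells from the payload. If either side of that pair is missing, the server can compute the right state while the browsers never hear about it.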
At this point, the game was basically working.
Claude feels like a powerful tool to me. I wrote less than 1% of the code characters in this project myself. I molded the project using prompts and my vision for the finished product.
Claude writes code in seconds and the code seems to be mostly correct. When the code isn’t correct, you can often use Claude to find and correct the issue by describing the problem you are seeing.
Takeaways
We’ve seen LLMs writing code for a while now, with varied levels of competence. I haven’t played much with models that are fine-tuned to write code, so it’s possible I am late to the party here, but this version of Sonnet is the best model I’ve used to write code. I’ve seen many folks building cool stuff with “Artifacts”, which I haven’t even had a chance to touch yet (edit: I tried it out), but seeing Claude create a working, non-trivial, multi-file project from a single prompt impressed me. It was a joy to use in Cursor to refine and improve the idea further.
To the skeptics
I’ve frequently heard experienced engineers say projects like this one are easy for anyone who knows how to code. I’ve made several attempts, both with and without models, to get a prototype of this idea working, and got stuck at various points along the way each time. Undoubtedly, those false starts contributed to my ability to get it working this time. It’s also possible that, if I had spent a bit more time on any of those occasions, I would have gotten to a working prototype sooner. For me, the bottom line is that Claude helped me ship this to the degree I had hoped to and meaningfully decreased the time from idea to “thing in the world”.
Here is the code in case you are interested in looking around.