I finally found some time to run a more comprehensive set of evals of Connections, making one guess at a time and using Python code to validate the guesses and give feedback. I ran about 100 puzzles with...
Logs
Some interesting commentary on the behaviors of founders, managers and leaders written by Rands.
There have been a number of small-in-scope but tough problems that I've run into that models haven't been able to crack as I've presented them via prompting. Usually, these are problems with a few...
I'm making another, more thorough pass of course.fast.ai, including all notebooks and videos, and this time I am going to focus more on the projects. I'll also be logging a lot more notes as doing so...
A nice writeup by Eugene on building a simple data viewer webapp with a few different frameworks. I am going to need to try out including llm-ctx.txt next time I write FastHTML to see if it helps make...
I was going to write a quick guide on how to get up and running using Google's Gemini model via API, since I found it quite straightforward and Twitter is currently dunking on Google for how hard...
I am continuing to see a lot of buzz about ColPali and Qwen2-VL. I'd like to try these out but haven't put together enough of the pieces to make sense of it yet. I am also seeing a lot of...
Played around a bit with baml for extracting structured data with a VLM. It's an interesting approach and has better ergonomics and tooling than most things I've tried so far. I like how you can...
Great to see more concrete results published on how different models are "the best" at writing different programming languages.
Language models can't generate instructions for knitting patterns or generate crossword puzzles from scratch