Thought Eddies

2024-12-31

Year in Review: 2024

This year included a lot of writing and learning new things.

2024-12-31

Challenges Building an Electron App

I've been building an Electron app called "Delta". Delta is a tool for knowledge exploration and ideation through branching conversations with language models. I have lots of ideas for how to make it useful and valuable, but today it looks like this.

2024-11-16

Conversation Branching

Language models are more than chatbots - they're tools for thought. The real value lies in using them as intellectual sounding boards to brainstorm, refine and challenge our ideas.

2024-11-02

Models Writing About Coding With Models

I recently found Joe's article, "We All Know AI Can’t Code, Right?"

2024-10-22

Language model random number generator

I had the idea to try and use a language model as a random number generator. I didn't expect it to actually work as a uniform random number generator but was curious to see what the distribution of numbers would look like.

2024-09-21

Claude 3.5 Sonnet Connections Evals

I've continued experimenting with techniques to prompt a language model to solve Connections. At a high level, I set out to design an approach to hold the model to a similar standard as a human player, within the restrictions of the game. These standards and guardrails include the following: The...

2024-08-16

VLMs Hallucinate

I've done some experimentation extracting structured data from documents using VLMs. A summary of one approach I've tried can be found in my repo, impulse. I've found using Protobufs to be a relatively effective approach for extracting values from documents. The high-level idea is you write a...

2024-08-12

Structured Output, Functions and Prompting

I've been prompting models to output JSON for about as long as I've been using models. Since text-davinci-003, getting valid JSON out of OpenAI's models didn't seem like that big of a challenge, but maybe I wasn't seeing the long tails of misbehavior because I hadn't massively scaled up a use...

2024-08-03

VLM data extraction with Protobufs

In light of OpenAI releasing structured output in the model API, let's move output structuring another level up the stack to the microservice/RPC level.

2024-07-21

Making Your Vision Real with Models

Using models for various purposes daily has been a satisfying endeavor for me because they can be used as tools to help make your vision for something come to life. Models are powerful generators that can produce code, writing, images and more based on a user's description of what they...

2024-07-16

VLMs aren't blind

I attempted to reproduce the results for one task from the VLMs are Blind paper. Specifically, Task 1: Counting line intersections. I ran 150 examples of lines generated by the code from the project with line thickness 4.

2024-07-12

Challenges and Opportunities of the Impact of Language Models on Software Engineering

I'm trying something a bit new, writing some of my thoughts about how the future might look based on patterns I've been observing lately.

2024-07-06

Claude Artifacts

I spent some time working with Claude Artifacts for the first time. I started with this prompt: "I want to see what you can do. Can you please create a 2d rendering of fluid moving around obstacles of different shapes?"

2024-06-23

Claude 3.5 Sonnet Codes Really Well

One of my favorite things to do with language models is to use them to write code. I've been wanting to build a variation on tic-tac-toe involving a bit of game theory. I called it "Tactic". I wasn't even really sure if the game would be any more interesting than tic-tac-toe itself, which reliably...

2024-06-18

Language model-based aggregators

Model-based aggregators

2024-06-13

Learning How to Learn

I completed Barbara Oakley's "Learning How to Learn" course on Coursera. The target audience seems to be students, but I found there were helpful takeaways for me as well, as someone who is a decade out of my last university classroom.

2024-06-05

Switching From Pocket to Raindrop for bookmarks

I've been using Pocket for a long time to keep track of things on the web that I want to read later. I save articles on my mobile or from my browser, then revisit them, usually on my desktop. Some articles I get to quickly. Others remain in the stack for a long time and can become...

2024-05-15

Evals: unit testing for language models

Generative AI and language models are fun to play with but you don't really have something you can confidently ship to users until you test what you've built.

2024-01-31

Language Model Streaming With SSE

OpenAI popularized a pattern of streaming results from a backend API in real time with ChatGPT. This approach is useful because the time a language model takes to run inference is often longer than what you want for an API call to feel snappy and fast. By streaming the results as they're produced,...

2024-01-21

Sandboxed Python Environment

Disclaimer: I am not a security expert or a security professional.

2024-01-13

Fine-tuning gpt-3.5-turbo to learn to play "Connections"

I started playing the NYTimes word game "Connections" recently, on the recommendation of a few friends. It has the type of freshness that Wordle lost for me a long time ago. After playing Connections for a few days, I wondered if an OpenAI language model could solve the game (the objective is to...
