2023 Review
End of year review
At the beginning of 2023, I set some goals for myself. Here are those goals and how the year turned out.
20 posts
End of year review
At the beginning of 2023, I set some goals for myself. Here are those goals and how the year turned out.
Exploring the Importance of Embracing Failure in Software Development and the Value of Learning from Mistakes
To write software is to experience constant failure until you get a success. When you start learning to write code, very little works, especially on your first try. You make a lot of mistakes. Maybe you copied example code to get started, then modify it to try and do something new. Reading errors...
promptfoo is a Javascript library and CLI for testing and evaluating LLM output quality. It's straightforward to install and get up and running quickly. As a first experiment, I've used it to compare the output of three similar prompts that specify their output structure using different modes of...
I've been following the "AI engineering framework" marvin for several months now. In addition to openaifunctioncall, it's currently one of my favorite abstractions built on top of a language model. The docs are quite good, but as a quick demo, I've ported over a simplified version of an example...
This past week, OpenAI added function calling to their SDK. This addition is exciting because it now incorporates schema as a first-class citizen in making calls to OpenAI chat models. As the example code and naming suggest, you can define a list of functions and schema of the parameters required...
Imagine we have a query to an application that has become slow under load demands. We have several options to remedy this issue. If we settle on using a cache, consider the following failure domain when we design an architecture to determine whether using a cache actually is a good fit for the use...
I've written several posts on using JSON and Pydantic schemas to structure LLM responses. Recently, I've done some work using a similar approach with protobuf message schemas as the data contract. Here's an example to show what that looks like.
Plenty of data is ambiguous without additional description or schema to clarify its meaning. It's easy to come up with structured data that can't easily be interpreted without its accompanying schema. Here's an example:
Code needs structure output
The most popular language model use cases I've seen around have been chatbots agents chat your X use cases
It's necessary to pay attention to the shape of a language model's response when incorporating it as a component in a software application. You can't programmatically tap into the power of a language model if you can't reliably parse its response. In the past, I have mostly used a combination of...
Experimenting with Auto-GPT
Auto-GPT is a popular project on Github that attempts to build an autonomous agent on top of an LLM. This is not my first time using Auto-GPT. I used it shortly after it was released and gave it a second try a week or two later, which makes this my third, zero-to-running effort.
I believe that language models are most useful when available at your fingertips in the context of what you're doing. Github Copilot is a well known application that applies language models in this manner. There is no need to pre-prompt the model. It knows you're writing code and that you're going...
Over the the years, I've developed a system for capturing knowledge that has been useful to me. The idea behind this practice is to provide immediate access to useful snippets and learnings, often with examples. I'll store things like: Amend commit message with tags like #git, #commit, and #amend...
I know a little about nix. Not a lot. I know some things about Python virtual environments, asdf and a few things about package managers. I've heard the combo of direnv and nix is fantastic from a number of engineers I trust, but I haven't had the chance to figure out what these tools can really...
I came upon https://gpa.43z.one today. It's a GPT-flavored capture the flag. The idea is, given a prompt containing a secret, convince the LM to leak the prompt against prior instructions it's been given. It's cool way to develop intuition for how to prompt and steer LMs. I managed to complete all...
Attempts to thwart prompt injection
I've been experimenting with ways to prevent applications for deviating from their intended purpose. This problem is a subset of the generic jailbreaking problem at the model level. I'm not particularly well-suited to solve that problem and I imagine it will be a continued back and forth between...
Jailbreaking as prompt injection
If you want to try running these examples yourself, check out my writeup on using a clean Python setup.
Since the launch of GPT3, and more notably ChatGPT, I’ve had a ton of fun learning about and playing with emerging tools in the language model space.