Set up a Temporal worker in Ruby and got familiar with its ergonomics.
Tried out this gpt-4v demo repo.
Experimented with the OCR capabilities of open-source multimodal language models.
Tried llava:34b (1.6) and bakllava, but neither seemed to touch gpt-4-vision’s performance.
It was cool to see the former run on a MacBook though.
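
For reference, a rough sketch of what an OCR attempt against a locally served model can look like. This is not the exact setup used above: it assumes the models are served by Ollama on its default local port, and the prompt wording, the generic `llava` tag, and the helper names `build_ocr_request` and `run_ocr` are made up for illustration.

```python
import base64
import json
import urllib.request

# Assumption: Ollama is running locally on its default port.
OLLAMA_URL = "http://localhost:11434/api/generate"


def build_ocr_request(image_bytes, model="llava",
                      prompt="Transcribe all text in this image exactly."):
    """Build the JSON payload for a single non-streaming generate call.

    The model tag and prompt are illustrative; swap in another multimodal
    tag (for example bakllava) to compare models on the same image.
    """
    return {
        "model": model,
        "prompt": prompt,
        # Ollama expects images as base64-encoded strings.
        "images": [base64.b64encode(image_bytes).decode("ascii")],
        "stream": False,
    }


def run_ocr(image_path, model="llava"):
    """Send one image to the local server and return the model's text reply."""
    with open(image_path, "rb") as f:
        payload = build_ocr_request(f.read(), model=model)
    request = urllib.request.Request(
        OLLAMA_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(request) as response:
        return json.loads(response.read())["response"]
```

Calling `run_ocr("receipt.png", model="bakllava")` on a hypothetical image file would return whatever text the model transcribes, which makes it easy to run the same image through several models side by side.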