Mailing List Archive



Re: [tlug] Anyone alive out here ?



>>> What's the current state of the art? Interfacing models with text has big limits...
>> Quite closely related, I've been wondering what the state of the art for open-source OCR is, particularly of Japanese text.
> I'm waiting for the first Llama-like LLM with image recognition similar to ChatGPT.

I'm not sure whether "similar to ChatGPT" rules out the "Llama-like" models at https://ollama.com/search?c=vision, which perform much worse. Still, I've been impressed with llava-phi3, given that it is small enough to run on a phone that I got for free with a 2-month contract 3 years ago. I run it on (unrooted) Android 11 via Termux -> proot -> ollama. I've never tried OCR with it, though, Japanese or otherwise.
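
For anyone who wants to try the OCR question themselves, here is a rough sketch of sending an image to a locally running ollama server over its HTTP API. It assumes ollama is listening on its default port (11434) and that llava-phi3 has already been pulled; the helper names (build_payload, ask_about_image) and the prompt are just illustrative, and I make no claim about how well the model reads Japanese.

```python
import base64
import json
import urllib.request

# Default local ollama endpoint (assumption: server started with default settings).
OLLAMA_URL = "http://localhost:11434/api/generate"


def build_payload(image_bytes, prompt, model="llava-phi3"):
    """Build the JSON body for a single non-streaming request with one image."""
    return {
        "model": model,
        "prompt": prompt,
        # The API expects images as base64-encoded strings.
        "images": [base64.b64encode(image_bytes).decode("ascii")],
        "stream": False,
    }


def ask_about_image(image_path, prompt, model="llava-phi3", url=OLLAMA_URL):
    """Send one image plus a prompt to ollama and return the model's text reply."""
    with open(image_path, "rb") as f:
        payload = build_payload(f.read(), prompt, model)
    request = urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(request) as response:
        return json.loads(response.read().decode("utf-8"))["response"]
```

With the server running, something like ask_about_image("photo.jpg", "Transcribe any text in this image.") would return the model's answer as a string (photo.jpg being a placeholder for your own file). On a phone, expect it to be slow.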
