


Re: [tlug] Anyone alive out here ?



On 2024/09/04 16:54, J. Hart wrote:
> Is anyone out that way doing anything with large language models under Linux? I've got one running here, and have written a voice input for it using the Vosk C++ API.

Back in May, I played with llama.cpp (https://github.com/ggerganov/llama.cpp) to run some smallish models (6B and 70B) directly on my desktop. It was surprisingly easy, even on AMDGPU.
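
If anyone wants to drive it from a script rather than the CLI, a rough sketch with the llama-cpp-python bindings would be something like the following (the model file and parameters are just placeholders; any GGUF model should do):

  # Rough sketch, not my exact setup: load a local GGUF model and run
  # one completion through the llama-cpp-python bindings.
  from llama_cpp import Llama

  llm = Llama(model_path="models/mistral-7b-instruct.Q4_K_M.gguf",  # placeholder path
              n_ctx=2048)
  out = llm("Q: What is the capital of Japan?\nA:",
            max_tokens=64, stop=["Q:"])
  print(out["choices"][0]["text"].strip())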

Later, I installed ollama and blogged about my experience with a few freely available LLMs:

  https://mstdn.io/@codewiz/112527717194517544
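
Once ollama is running, you can also poke at it over its local HTTP API, which is handy for gluing it to other tools. A minimal sketch, assuming the default port and that you've already pulled a model such as llama3:

  # Minimal sketch: ask a local ollama server for a one-shot completion.
  import requests

  r = requests.post("http://localhost:11434/api/generate",
                    json={"model": "llama3",   # any model you've pulled
                          "prompt": "Summarize what TLUG is in one sentence.",
                          "stream": False})
  print(r.json()["response"])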


> I can carry on a spoken conversation with it instead of typing, and get a spoken response if I wish.  It mutes the microphone input to the LLM so it doesn't get confused by hearing itself speak.

Do you have a frontend speech-to-text model that spits out text as input to a regular LLM, and then feed the LLM's output to a TTS model? In my experience, that kind of chain leads to lots of funny mistakes :-)
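
To make the question concrete, here's the kind of loop I imagine, as a Python sketch with vosk + sounddevice + pyttsx3 (your real setup uses the Vosk C++ API, and ask_llm() below is just a stand-in for whatever model you feed the text to), with the mic muted while the TTS is playing:

  import json, queue
  import sounddevice as sd
  import pyttsx3
  from vosk import Model, KaldiRecognizer

  q = queue.Queue()
  listening = True                    # flipped off while the TTS is speaking

  def on_audio(indata, frames, time, status):
      if listening:
          q.put(bytes(indata))        # drop audio captured while muted

  def ask_llm(text):
      return "placeholder reply to: " + text   # swap in your real model call

  model = Model("vosk-model-small-en-us-0.15")  # path to a downloaded Vosk model
  rec = KaldiRecognizer(model, 16000)
  tts = pyttsx3.init()

  with sd.RawInputStream(samplerate=16000, blocksize=8000, dtype="int16",
                         channels=1, callback=on_audio):
      while True:
          data = q.get()
          if rec.AcceptWaveform(data):
              text = json.loads(rec.Result()).get("text", "")
              if not text:
                  continue
              reply = ask_llm(text)
              listening = False       # mute the mic while we speak
              tts.say(reply)
              tts.runAndWait()
              listening = True

The whole muting trick is just the "listening" flag: the audio callback keeps firing while the reply is spoken, but everything it captures gets thrown away.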

OpenAI promised an integrated "multimodal" experience a few months ago, but then failed to launch it.

What's the current state of the art? Interfacing with models through text alone has big limits...

--
_ // Bernie Innocenti
\X/  https://codewiz.org/

