How can I use a local LLM on Linux to generate a long story?

ChasingEnigma@lemmy.world · 8 months ago


PeterPoopshit@lemmy.world · 8 months ago

If you pick the right GGUF model (read the description on the download page to choose the right quantization; I think the variants are called K-quants) and actually use multithreading (llama.cpp supports multithreading, so in theory GPT4All should too), it’s reasonably fast. I’ve achieved roughly half the speed of ChatGPT on just an 8-core AMD FX with DDR3 RAM. Even 20B models can be usably fast.
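For anyone who wants to try the multithreading part, here is a rough sketch of how a llama.cpp run could be assembled from Python. It assumes the `llama-cli` binary from a llama.cpp build is on your PATH; the model path, the prompt, and the helper names `pick_thread_count` and `build_llama_command` are made up for illustration. The sketch only builds and prints the command rather than running it, and the best thread count on your machine may be lower than the number of logical CPUs, so treat that default as a starting point.

```python
import os
import shlex


def pick_thread_count(logical_cpus=None):
    """Return a starting thread count: the logical CPU count, at least 1.

    This is an assumption, not a tuned value; benchmark on your own hardware.
    """
    n = logical_cpus if logical_cpus is not None else (os.cpu_count() or 1)
    return max(1, n)


def build_llama_command(model_path, prompt, threads, max_tokens=512,
                        binary="llama-cli"):
    """Build an argument list for a llama.cpp CLI run.

    -m: path to the GGUF model file
    -t: number of CPU threads
    -n: number of tokens to generate
    -p: the prompt text
    """
    return [
        binary,
        "-m", model_path,
        "-t", str(threads),
        "-n", str(max_tokens),
        "-p", prompt,
    ]


if __name__ == "__main__":
    # Hypothetical model file name; substitute whatever GGUF file you downloaded.
    cmd = build_llama_command(
        "models/story-model.Q4_K_M.gguf",
        "Write the opening chapter of a long story.",
        pick_thread_count(),
    )
    # Print a shell-safe command line you can copy into a terminal.
    print(shlex.join(cmd))
```

To actually run it, pass the list to `subprocess.run(cmd)` instead of printing it.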