I tried the 20b on my PC with Ollama and Open WebUI, and I have to say that for some of my use cases it performs similarly to the online version. I was impressed.
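For anyone wanting to reproduce this setup, a minimal sketch (assuming Ollama is installed and the model is published under the `gpt-oss:20b` tag in the Ollama library):

```shell
# Pull the 20b weights locally (roughly a 13-14 GB download)
ollama pull gpt-oss:20b

# Chat with it directly from the terminal
ollama run gpt-oss:20b

# Or serve it so a frontend like Open WebUI can connect
# (Open WebUI talks to Ollama's API on port 11434 by default)
ollama serve
```

Open WebUI itself is typically run as a Docker container pointed at the Ollama endpoint; see its README for the exact invocation.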
What card are you running it on?
Nvidia RTX 3060, the 12GB VRAM one
Not bad. Does it use all the VRAM?
I'd need to check with that model specifically, but usually yes.
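If you want to check, a quick sketch (assuming an Nvidia GPU and a loaded Ollama model):

```shell
# Show loaded models and how they're split across GPU/CPU
# (the PROCESSOR column reports e.g. "100% GPU" or a mixed split)
ollama ps

# Show actual VRAM usage on the card
nvidia-smi --query-gpu=memory.used,memory.total --format=csv
```

If `ollama ps` reports a partial CPU split, the model didn't fully fit in the 12GB and some layers are being offloaded to system RAM.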
Finally, their company name is starting to make SOME sense
It’s awesome, but that 128k context window is a throwback to Llama 3 days
I bet the closed $ource model has like 2MB context
Crazy but sure ig