It's week 1 of running my AI rig, after the GPU arrived last week. A few first observations:

  • GPUs are big and heavy! And expensive….
  • Having a server that can generate tokens “for free” is, I would say, very liberating. You just test and run things against your own server without worrying about the bill.
  • You can run some fantastic LLMs on five-year-old hardware (my GPU is an RTX 3090). This is probably one of the most fascinating things I found out this week. Today’s models that fit into 24 GB of VRAM are seriously good, on the level of the LLMs we had to pay OpenAI or Google for just last year. The model I’m currently testing is Qwen3-Coder-30B, and the things it can build are crazy good.
  • I have switched from using an LLM as an assistant in VS Code to running agents (semi-)autonomously, like Goose. Goose is currently my favourite agent. It is open source, and you can run it either as a GUI or in the console. It supports tools, sub-tools and MCP, so it can do lots of interesting stuff.
  • Speaking of Goose, I’m learning how to use agents more efficiently. Workflows for planning and implementing with an agent are something I’m starting to learn, and I’d say I’m getting better at it. It is fascinating to watch agents do their own thing. I’m learning how to build Skills and use recipes.
  • I chose vLLM as my platform for running LLMs. It took a little bit of testing, but I think I’ve nailed it down so everything works as it should. One weakness is probably its limited GGUF support.
  • The pace is high. vLLM has a new release every other week with new features, and a new, better model drops on Hugging Face every other month.
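
If you’re wondering how a 30B model can fit into 24 GB of VRAM, the back-of-the-envelope arithmetic is simple. This sketch only counts the quantized weights and ignores KV cache and activation memory, so treat it as a lower bound rather than a precise figure:

```python
def weight_memory_gb(n_params_billion: float, bits_per_weight: float) -> float:
    """Rough weight-only memory footprint in GB.

    Ignores KV cache, activations, and framework overhead,
    so the real footprint at serving time is higher.
    """
    return n_params_billion * 1e9 * bits_per_weight / 8 / 1024**3

# A 30B model quantized to 4 bits: roughly 14 GB of weights,
# leaving headroom for KV cache on a 24 GB RTX 3090.
print(round(weight_memory_gb(30, 4), 1))   # → 14.0

# The same model in fp16 would need ~56 GB and would not fit.
print(round(weight_memory_gb(30, 16), 1))  # → 55.9
```

That gap between 4-bit and fp16 is exactly why aggressive quantization is what makes these models usable on consumer GPUs.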
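
For reference, getting vLLM to serve a model like this is a one-liner. This is a hedged sketch, not my exact setup: the model checkpoint name is an assumption (pick whichever quantized Qwen3-Coder-30B variant fits your card), and the flag values are starting points you’ll want to tune:

```shell
# Launch an OpenAI-compatible server on http://localhost:8000/v1.
# --gpu-memory-utilization caps how much VRAM vLLM grabs;
# --max-model-len bounds the context so the KV cache fits in what's left.
vllm serve Qwen/Qwen3-Coder-30B-A3B-Instruct \
    --gpu-memory-utilization 0.90 \
    --max-model-len 32768

# Any OpenAI-style client can then point at it, e.g.:
curl http://localhost:8000/v1/models
```

Because the endpoint speaks the OpenAI API, agents like Goose can be pointed at your own box instead of a paid provider, which is where the “tokens for free” feeling comes from.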