Hacker News | mcdermott's comments

Update: The very first Playbook I tried (unsloth) didn’t work. There was a bad URL and some package version and dependency issues. I figured out how to get it working and completed the playbook, then submitted a PR that fixes the playbook instructions for others. The PR has not been merged as I write this, so if you run into a problem with the Unsloth playbook, see PR #14: https://github.com/NVIDIA/dgx-spark-playbooks/pull/14


When raw performance matters, vLLM wins, but Ollama often wins on everything else.

My benchmarks showed vLLM delivering up to 3.2x the requests-per-second of Ollama on identical hardware, with noticeably lower latency at high concurrency.

If you're not looking for ultimate performance on the latest GPU hardware, then Ollama is still hard to beat. It installs in minutes, runs on laptops, supports CPU fallback, and provides a curated model hub plus on-the-fly model switching. If your typical load is a handful of concurrent users, batch jobs that can wait an extra second, or local exploration during development, Ollama’s “good-enough” performance is exactly that: good enough.

Ollama is the reliable daily driver that gets almost everyone where they need to go; vLLM is the tuned engine you unleash when the freeway opens up and you really need to fly.
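For anyone who wants to reproduce this kind of comparison, here is a minimal sketch of a concurrent throughput probe. The endpoint URL and model name in the comment are placeholders, not the actual benchmark setup; both vLLM and Ollama expose an OpenAI-compatible HTTP API you could point this at.

```python
# Minimal requests-per-second probe under concurrency. The send_request
# callable is pluggable so the harness itself has no network dependency.
import time
from concurrent.futures import ThreadPoolExecutor

def measure_rps(send_request, n_requests=32, concurrency=8):
    """Fire n_requests via send_request() across a thread pool; return req/s."""
    start = time.perf_counter()
    with ThreadPoolExecutor(max_workers=concurrency) as pool:
        list(pool.map(lambda _: send_request(), range(n_requests)))
    return n_requests / (time.perf_counter() - start)

# Against a real server you would pass something like (hypothetical values):
# def send_request():
#     requests.post("http://localhost:8000/v1/completions",
#                   json={"model": "my-model", "prompt": "hi", "max_tokens": 8})
```

Run it once per backend at increasing concurrency levels and the throughput gap shows up quickly.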


The code is located here: https://github.com/robert-mcdermott/ai-knowledge-graph

An example knowledge graph it created is located here: https://robert-mcdermott.github.io/ai-knowledge-graph/


My company has blocked all access to any Anaconda websites from our managed laptops/systems, so I can't access that URL. The Anaconda block was the event that triggered the migration to UV. I'm loving UV, so I have no interest in using anything from the Anaconda ecosystem again; I just wanted the conda-style centralized environment management as an option, which led to the creation of this (UVE).


While I appreciate UV for its clean, per-project virtual environments, it's still convenient at times to have long-lived, general-purpose, conda-style environments that you can activate from anywhere and that aren't tied to a particular project, for general-purpose hacking. Since I’ve completely switched from conda to UV, I created this companion utility to replicate conda-like workflows when needed, giving me the best of both worlds.
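The centralized-environment idea can be sketched in a few shell commands. This is the concept UVE automates, not its actual layout or commands; it is shown with the stock venv module for portability, and with UV the create step would be `uv venv "$ENV_HOME/scratch"` instead.

```shell
# Conda-style, project-independent environments kept in one central directory.
# ENV_HOME and the environment name "scratch" are illustrative.
ENV_HOME="${ENV_HOME:-$HOME/.venvs}"
mkdir -p "$ENV_HOME"
python3 -m venv "$ENV_HOME/scratch"        # create a named, long-lived env
. "$ENV_HOME/scratch/bin/activate"         # activate it from any directory
python -c 'import sys; print(sys.prefix)'  # confirm the env is active
```

Because the environments live outside any project tree, you can activate one from wherever you happen to be working.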


Author here. This is a simple project I worked on over the long President’s Day weekend. I wanted to create an AI-driven camera and event detection system for personal home security purposes, but the concept could be adapted to any situation where you want to watch for, log, or alert on some type of event. It’s simplistic at this point, but could be expanded beyond logging, audible voice notifications, and email alerts to include triggering webhooks, calling APIs, or running commands.
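The expansion path described above is naturally a pluggable dispatcher: each detected event fans out to registered actions. This is a sketch of that shape, not the project's actual code; the handler names and webhook URL are illustrative.

```python
# Pluggable event-action dispatcher: register handlers (webhook, command, ...)
# and fan each detected event out to all of them.
import json, subprocess, urllib.request

def notify_webhook(event, url="http://localhost:9000/hook"):  # hypothetical URL
    data = json.dumps(event).encode()
    req = urllib.request.Request(url, data=data,
                                 headers={"Content-Type": "application/json"})
    urllib.request.urlopen(req, timeout=5)

def run_command(event, cmd=("logger", "camera-event")):
    subprocess.run(cmd, check=False)

class EventDispatcher:
    def __init__(self):
        self.handlers = []

    def register(self, handler):
        self.handlers.append(handler)

    def dispatch(self, event):
        for handler in self.handlers:
            handler(event)
```

New integrations then become one `register()` call each, without touching the detection loop.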


Inside the Private Thoughts of AI


Given that LLMs are adept at handling a variety of traditional NLP tasks like sentiment analysis, named entity recognition, and text classification, it’s interesting to consider whether multimodal (language and vision) LLMs could supplant traditional image classification methods as well. I worked on this project over the long weekend to explore that question and have documented the effort and my findings at the link provided. I'm not a scientist, I just like learning, so don't laugh too hard if this is bogus.
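One practical step when using an LLM as an image classifier is mapping its free-form reply back onto a fixed label set. Here is a sketch of that normalization; the label set and matching rule are illustrative, and the actual model call (e.g. to a local multimodal endpoint) is elided.

```python
# Map a vision-LLM's free-form reply onto a fixed label set by substring match.
def parse_label(reply, labels):
    """Return the first known label mentioned in the model's reply, else None."""
    text = reply.lower()
    for label in labels:
        if label.lower() in text:
            return label
    return None

labels = ["cat", "dog", "bird"]
parse_label("I believe this image shows a dog sitting on grass.", labels)  # -> "dog"
```

A traditional classifier emits a label index directly; with an LLM you pay for this extra parsing step, which is part of what the comparison has to account for.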


Embedding-plot is a command line utility that can visualize word embeddings using dimensionality reduction techniques (PCA or t-SNE) and clustering in a scatter plot.
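The PCA step of that pipeline can be sketched in plain NumPy, without scikit-learn; this shows the dimensionality reduction conceptually, not embedding-plot's actual implementation.

```python
# PCA via SVD: project high-dimensional embeddings onto their top two
# principal components, yielding (x, y) coordinates for a scatter plot.
import numpy as np

def pca_2d(embeddings):
    """Project (n, d) embedding vectors onto the top two principal components."""
    X = embeddings - embeddings.mean(axis=0)         # center the data
    _, _, vt = np.linalg.svd(X, full_matrices=False)
    return X @ vt[:2].T                              # (n, 2) coordinates

coords = pca_2d(np.random.default_rng(0).normal(size=(100, 50)))
```

t-SNE replaces this linear projection with a neighborhood-preserving nonlinear one, which is why it often separates clusters more cleanly at the cost of interpretability.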


I was learning how the security on garage door openers and Remote Keyless Entry (RKE) systems worked and thought it might be an interesting way to authenticate a client to a server rather than using passwords or API keys.
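The rolling-code idea translates to client/server auth roughly like HOTP (RFC 4226): a shared secret plus a counter, with the server accepting a small look-ahead window so a missed message doesn't desynchronize the pair. A minimal sketch, using SHA-256 HMAC rather than the proprietary ciphers (e.g. KeeLoq) real RKE systems use:

```python
# HOTP-style rolling code: each code derives from a shared secret and a
# monotonically increasing counter, and is valid only once.
import hmac, hashlib, struct

def rolling_code(secret: bytes, counter: int, digits: int = 8) -> str:
    mac = hmac.new(secret, struct.pack(">Q", counter), hashlib.sha256).digest()
    return str(int.from_bytes(mac[:4], "big") % 10**digits).zfill(digits)

def verify(secret: bytes, counter: int, code: str, window: int = 16):
    """Return the new server counter on success (resync within window), else None."""
    for c in range(counter, counter + window):
        if hmac.compare_digest(rolling_code(secret, c), code):
            return c + 1  # advance past the accepted counter: replay now fails
    return None
```

Advancing the counter on success is what gives replay resistance, the property that makes this more interesting than a static API key.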

