A New Toy...
The M5 Max MacBook Pro just arrived. First thing I did was fling llama.cpp, Wilmer and Open WebUI on it.
Honestly, the speeds are really impressive, even considering that llama.cpp hasn't fully integrated the hardware changes yet (at least, that's my understanding). Here's a comparison of Qwen3.5 35b a3b between the M5 Max MacBook and the M3 Ultra Mac Studio.
M5 Max MacBook Pro:
1450 t/s processing, 68 t/s generation

prompt eval time = 3202.80 ms / 4654 tokens (0.69 ms per token, 1453.10 tokens per second)
eval time = 7098.19 ms / 483 tokens (14.70 ms per token, 68.05 tokens per second)
total time = 10300.99 ms / 5137 tokens
M3 Ultra Mac Studio:
1647 t/s processing, 48 t/s generation

prompt eval time = 3810.74 ms / 6280 tokens (0.61 ms per token, 1647.97 tokens per second)
eval time = 14695.00 ms / 704 tokens (20.87 ms per token, 47.91 tokens per second)
total time = 18505.75 ms / 6984 tokens
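If you want to double-check how llama.cpp arrives at those tokens-per-second figures, it's just tokens divided by elapsed seconds. A quick sketch using the raw timings above:

```python
def tokens_per_second(tokens: int, elapsed_ms: float) -> float:
    """Throughput as reported by llama.cpp: tokens / elapsed seconds."""
    return tokens / (elapsed_ms / 1000.0)

# M5 Max MacBook Pro (numbers from the run above)
m5_prompt = tokens_per_second(4654, 3202.80)   # ~1453 t/s prompt processing
m5_gen    = tokens_per_second(483, 7098.19)    # ~68 t/s generation

# M3 Ultra Mac Studio
m3_prompt = tokens_per_second(6280, 3810.74)   # ~1648 t/s prompt processing
m3_gen    = tokens_per_second(704, 14695.00)   # ~48 t/s generation

print(f"M5 Max:   {m5_prompt:.1f} t/s prompt, {m5_gen:.2f} t/s gen")
print(f"M3 Ultra: {m3_prompt:.1f} t/s prompt, {m3_gen:.2f} t/s gen")
```

The per-token milliseconds llama.cpp prints are just the reciprocal of the same ratio (elapsed ms / tokens).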
So yeah, the Studio processes prompts faster (at this model size and this prompt length, though I suspect the M5 Max actually saturates better at larger prompts), but it generates tokens slower than the M5 Max.
Super excited to play with this. I got rid of the M2 Max MacBook, so this is my main travel machine now.