Using SSH Tunnels to Make Up for Lack of HTTPS on LAN
The reality is that a lot of folks who run open source APIs/web front ends on their LAN tend to run them as plain HTTP; whether it's backend LLM APIs, the front end sites, or whatever other stuff you've tossed in: no TLS anywhere in sight. On one machine that's usually fine since it's all loopback, but the second you spread apps across a few different computers (which some of us do), every prompt and every response starts crossing your LAN in plaintext.
Is plaintext on your own network a huge deal? Honestly... a lot of folks would say it's probably low risk. But the moment you've got guests, other people's phones, or random IoT junk sharing that network, your prompts and the model's responses flying around in the clear are more exposure than you'd probably be comfortable with, if you really think about it.
So, with that said: I figured I'd write up how I've dealt with that, because the most direct answer (certs) is annoying enough on a local network that I think a lot of folks just don't bother. SSH tunneling is a lot easier, especially on something like a Mac where you can make sure it kicks off automatically via launchd.
Why not just do TLS
The correct answer is to put TLS on everything; HTTPS everywhere. And you can. But think about what that actually means on a home network full of mixed machines:
- You stand up your own little CA, then sign a cert for each host (unless you want to deal with some code just straight up rejecting the cert).
- You install and trust that CA on every client. Every browser, every OS trust store, and (this is the annoying one) every app that ships its own trust store and ignores the system one. Plenty of Python and Node apps do that.
- A lot of these local LLM apps don't even expose a TLS option, so to add it you front them with something like nginx or Caddy, which is now another moving part on every box (setting up Caddy is what convinced me to go this route lol).
- Then a machine joins and needs the CA installed, or a cert expires and needs reissuing, and you get to do the rounds all over again.
Now... granted: none of that is rocket surgery. But it sure is tedious, and it never quite stays done. Especially across macOS, Windows, Linux and a phone all at once. As far as I know there's no version of this that isn't fiddly on at least one of those.
Some important notes to consider first
Before I start: I don't have this perfect yet myself, and I'm still working through the kinks. I'll find the edge cases and sort them eventually, but go in with your eyes open: this may need some manual intervention from time to time, unless you figure out the fixes I haven't found yet. This really is just a cheap/easy way to work around the headache of local TLS.
Also: if you don't already have SSH enabled on these boxes, this whole thing hinges on turning that on, which is a security consideration to keep in mind. You're standing up a full login service that's reachable by everything on the LAN. The locked-down authorized_keys suggestion further down only restricts that one tunnel key; it won't do anything about the rest of the daemon: any other key on the box, and password login if it's on, still get a normal shell. So the key restriction protects the key, not the machine.
You don't have to go overboard, but at least consider limiting who can even reach it (such as firewalling it to the machines that actually need to connect; just make sure you set static IPs for those machines on your LAN). And make sure to keep the OS patched. sshd is a big target and has had nasty bugs over the years; staying current is most of the battle.
The tunnel in a nutshell
Here's the short version of the setup: SSH local port forwarding lets you open a port on your own machine, say 127.0.0.1:5050, and anything you send there gets pushed through an encrypted SSH connection and comes out on the far machine, talking to a service on its loopback.
Put more simply: the app that's running on your client machine thinks it's talking to a plain local HTTP service on the same computer, when it's actually feeding an encrypted pipe to another box on your network. SSH handles the encryption on the wire between the two machines, and the app just makes its usual unencrypted request to localhost.
Authentication
I would recommend that you use authentication keys for this, and don't jam your username/password everywhere, please lol
A normal SSH key can log in and do anything your user can do, which is way more than a forwarding key needs, so it's good to lock it down. When you install the public key in the destination's authorized_keys, you want to prefix it with restrictions:
restrict,port-forwarding,permitopen="127.0.0.1:5050" ssh-ed25519 AAAA...
Roughly what those do:
- restrict turns everything off (no shell, no PTY, no agent or X11 forwarding, none of it).
- port-forwarding turns just forwarding back on.
- permitopen caps it to the exact loopback ports you list.
This way if that key ever leaks, the worst someone should be able to do is open a forward to those specific loopback ports.
For my setup: I make a dedicated ed25519 key per leg for this, give it a passphrase, and let the OS keychain hand the passphrase over so automation isn't blocked by a prompt. With as many machines as I have in my homelab, I'd go insane after a power outage otherwise.
Note: I believe that permitopen only caps the local (-L) forwards, not the remote (-R) kind, because port-forwarding quietly re-enables both. With the default GatewayPorts no, a remote forward should only bind back to the server's own loopback, so it shouldn't really be exploitable in this setup. That said, if you want to lock it down more, there's a matching permitlisten that caps the -R side too.
Example setup
Here's a generic setup example to peek over to help give you an idea of how to get started.
On the client, ie the machine that opens the tunnel:
- Make sure the SSH server is running on the destination computer that you want to hit. On macOS, that's Remote Login under Sharing. Other OSes have their own toggle for it; if I remember right I had to install it on Linux Mint because it wasn't installed by default. While you're there, ssh into the box once by hand the normal way, so its host key gets pinned into your known_hosts. If you skip this, the automated tunnel later might just hang on a "do you trust this host?" prompt that no script is ever going to answer.
- Generate a dedicated key: ssh-keygen -t ed25519 -f ~/.ssh/my_tunnel, and give it a passphrase.
- Install the public key on the destination with that restrict,port-forwarding,permitopen=... prefix in front of it, scoped to the port(s) you're forwarding.
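As a rough sketch of that last step, here's one way to build the restricted authorized_keys line from the public key file instead of pasting the prefix by hand. The restricted_key_line name is made up for illustration, and the port and key path are just the example values used in this post.

```shell
#!/bin/sh
# Hypothetical helper: print an authorized_keys entry that only allows
# forwarding to one loopback port on the destination.
#   $1 = loopback port to permit (e.g. 5050)
#   $2 = path to the tunnel's PUBLIC key (e.g. ~/.ssh/my_tunnel.pub)
restricted_key_line() {
    printf 'restrict,port-forwarding,permitopen="127.0.0.1:%s" %s\n' \
        "$1" "$(cat "$2")"
}

# Example use (run on the client, after generating the key):
#   restricted_key_line 5050 ~/.ssh/my_tunnel.pub
# Then append the printed line to ~/.ssh/authorized_keys on the destination.
```

Building the line this way mostly just avoids copy/paste mistakes in the prefix, which tend to make sshd quietly ignore the key.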
Before automating anything, confirm the tunnel works by hand:
ssh -i ~/.ssh/my_tunnel -N -L 5050:127.0.0.1:5050 your-user@<destination>
That should ask for the key passphrase and then just sit there, which is what you want (-N means "open the forward, don't run a command"). In a second terminal, you can test by curling it: curl http://127.0.0.1:5050/v1/models. Any HTTP response coming back means traffic is crossing the tunnel.
If that curl just hangs or refuses and you know the service is actually up, check that the destination's sshd allows forwarding at all. AllowTcpForwarding defaults to yes, but a hardened box can have it set to no, in which case it silently refuses the forward and you'll burn an hour chasing the wrong thing.
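If you'd rather check that setting directly than guess, sshd can dump its effective config with -T. A minimal sketch, meant to be run on the destination; the forwarding_setting helper name is an invention for this example, and sshd -T generally needs root.

```shell
#!/bin/sh
# Hypothetical helper: pull the AllowTcpForwarding value out of
# `sshd -T` output (which prints keywords in lowercase) read from stdin.
forwarding_setting() {
    awk 'tolower($1) == "allowtcpforwarding" { print $2 }'
}

# Example use on the destination box:
#   sudo sshd -T | forwarding_setting
# "yes", "all", or "local" allows -L forwards; "no" or "remote" blocks them.
```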
Once that manual test returns something, you're past the hard part and now you just gotta make it permanent.
Making it stick
A manual ssh -N dies the second you close the terminal or the link blips, so you gotta get it supervised. On macOS I use a launchd LaunchAgent with KeepAlive on, which brings the tunnel up at login and restarts it when it drops. On Linux you'd probably reach for a systemd user service instead. Same idea either way: something watches the ssh -N process and respawns it.
One flag worth setting on the tunnel itself is ExitOnForwardFailure yes. Without it, if ssh connects but can't actually bind one of your forwards (say a leftover tunnel is still holding the port), it'll happily sit there running with a dead forward, and your supervisor sees a "live" process and never restarts it. With the flag on, ssh just exits instead, so the supervisor can do its job and relaunch clean.
Two things worth knowing before you set it:
- First: it's scoped to this one tunnel, meaning the ssh process you put it on. It shouldn't touch any other SSH you've got going (an interactive login, some other tool, whatever); those are separate connections and don't care what this config block says.
- Second: it's all-or-nothing for this tunnel: if you're forwarding a whole range of ports and any single one can't bind (say you accidentally kicked off a process on the same port on the client machine), the whole thing bails. Pair that with a supervisor that's eager to relaunch, and you can end up in a tight flap loop, where ssh exits on the stuck port, gets relaunched, hits the same port, and exits again, round and round. The fix is to clear whatever is squatting on that port.
The flap loop is annoying, but it beats the silent half-dead tunnel you get without the flag IMO.
Auto-restart handles clean drops fine (box sleeps, you disconnect, that kind of thing), but it does NOT reliably handle an abrupt mid-connection drop, like a router reboot. I've watched ssh get stuck half-open in that situation: it neither passes traffic nor exits, so the supervisor sees a process that's technically "alive" and never respawns it. A router firmware update wedged two of my tunnels exactly like that once, while the others happened to survive (luck, not some special property of those legs).
You can narrow that window with keepalive settings (ServerAliveInterval and ServerAliveCountMax are the ones doing the real work; TCPKeepAlive is more of a slow backstop since it rides the OS timer), and I think it's worth doing, but as far as I can tell it doesn't fully close it. The fix I went with is a small watchdog that curls each tunnel port every few minutes and force-restarts any that don't answer. Yes, it's crude, but so far it's worked alright for me. Just keep in mind that it isn't perfect.
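The watchdog idea could be sketched roughly like this. It's not the exact script described above: the function names, the port list, and the com.example.tunnel label are placeholders, and the restart line assumes one launchd LaunchAgent per tunnel (on Linux you'd swap in something like systemctl --user restart).

```shell
#!/bin/sh
# Sketch of a tunnel watchdog. Names, ports, and labels are placeholders.

# Succeeds if anything answers HTTP on the local tunnel port within 5 seconds.
# Any HTTP status counts; a refused connection or a timeout does not.
tunnel_alive() {
    curl --silent --output /dev/null --max-time 5 "http://127.0.0.1:$1/"
}

# Print every port from the argument list that fails the probe.
dead_tunnels() {
    for port in "$@"; do
        tunnel_alive "$port" || echo "$port"
    done
}

# Example use from cron or a timer, assuming one LaunchAgent per port
# with a hypothetical label like com.example.tunnel-5050:
#   for port in $(dead_tunnels 5050 5051); do
#       launchctl kickstart -k "gui/$(id -u)/com.example.tunnel-$port"
#   done
```

One caveat with probing this way: if the tunnel is fine but the service behind it is down, the probe fails too, so the tunnel gets restarted for nothing. That's harmless, just noisy.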
One more note: recovery isn't instant even when it does self-heal. With the keepalive values I run, ssh takes something like 30 * 3 = 90 seconds to decide a quiet link is genuinely dead and exit. After that launchd relaunches it pretty much right away. So figure around a minute and a half of gap after a blip, plus a couple seconds to reconnect. That's not something I'd commit to a commercial production network, but for my homelab? Eh... that's good enough for government work.
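To make that arithmetic concrete, here's a hedged sketch of the tunnel command with these options passed via -o. It assumes the 30 and the 3 are ServerAliveInterval and ServerAliveCountMax respectively (the text doesn't spell out which is which). start_tunnel and SSH_BIN are illustrative names, and SSH_BIN only exists so the command can be dry-run with echo.

```shell
#!/bin/sh
# Assumed values: probe every 30 seconds, give up after 3 unanswered probes.
SERVER_ALIVE_INTERVAL=30
SERVER_ALIVE_COUNT_MAX=3

# Rough time for ssh to notice a dead link and exit.
DETECT_SECONDS=$((SERVER_ALIVE_INTERVAL * SERVER_ALIVE_COUNT_MAX))

# Hypothetical launcher for a supervisor to run.
#   $1 = port (same number on both ends), $2 = user@host of the destination
start_tunnel() {
    "${SSH_BIN:-ssh}" -i "$HOME/.ssh/my_tunnel" -N \
        -o ExitOnForwardFailure=yes \
        -o ServerAliveInterval="$SERVER_ALIVE_INTERVAL" \
        -o ServerAliveCountMax="$SERVER_ALIVE_COUNT_MAX" \
        -L "$1:127.0.0.1:$1" "$2"
}

# Dry run, prints the arguments instead of connecting:
#   SSH_BIN=echo start_tunnel 5050 your-user@destination-host
```

The same three options can live in a Host block in ~/.ssh/config instead, which keeps the supervisor's command line short.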
Clean up after
Once you finish, don't forget to actually swap over everything to use the tunnel. This is pointless if you keep hitting the services on their LAN address lol.
- Repoint your clients at 127.0.0.1. If anything is still hitting the destination's LAN IP directly, it's skipping the tunnel and going over the wire in the clear, so the encryption is buying you nothing.
- Just to be sure, I went ahead and did a rebind of the destination services to use loopback only (ie: killed listen/host 0.0.0.0). I mostly did this because rebinding purposefully breaks the apps I forgot to move over to the tunnel, so I'll find them easier. When I need to debug something in a hurry, I'm a big fan of "Let's make the change and see what breaks". (Until you do this, the port is still open on the LAN and anything on the network can hit it directly in plaintext, tunnel or not.)
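One way to double check the rebind is to list the listening sockets on the destination and flag any still bound to every interface. A sketch, assuming lsof is available (on Linux, ss -ltn output should work with the same filter); wildcard_listeners is an illustrative name.

```shell
#!/bin/sh
# Hypothetical filter: read a listening-socket listing on stdin and keep
# only lines whose local address is a wildcard (*, 0.0.0.0, or [::]).
wildcard_listeners() {
    grep -E '(^|[[:space:]])(\*|0\.0\.0\.0|\[::\]):[0-9]+'
}

# Example use on the destination box:
#   lsof -nP -iTCP -sTCP:LISTEN | wildcard_listeners
# Anything printed is still reachable from the LAN.
```

Expect sshd itself to show up on port 22, since it has to be reachable for the tunnel to work. Everything else in the output is a candidate for rebinding to 127.0.0.1.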
Working with larger setups
I've personally found that it's really not much more complex with a bunch of machines than it is with one or two. It's mostly about knowing which direction the data flows and redoing the same effort for each machine pair. I run a handful of Macs and a couple of cheap Linux mini PCs around the house, with one box acting as a Wilmer hub that the others route through. This meant that I ended up tunneling several legs (workstation to hub, then hub out to each inference box, and also workstation to some mini PCs running services for me). It's just rinse and repeat; same general steps every time.
A few things that bit me, or that I planned around, once a hub got involved:
First: the hub is a client too, not only a destination. Everything above about supervising the tunnel, pinning host keys, and the half-open wedge applies to the hub's outbound legs exactly like it does to the workstation. If you only babysit the workstation leg, you've got unmonitored tunnels sitting on the hub.
Second: I had to mind my ports on the hub. The hub is listening on some port for the inbound leg AND opening local forwards for its outbound legs, so those can't be the same number or they'll collide and one of them silently fails to bind. I gave each box its own port range, so one number means the same thing end to end and nothing steps on anything else. (ie: Mac 1 got 5001-5025, Mac 2 got 5101-5125, etc)
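If you go the port-range route, typing out twenty-five -L flags by hand gets old. A small sketch that generates them, keeping the same number on both ends as described above; forward_args is a made-up helper name.

```shell
#!/bin/sh
# Hypothetical helper: print one "-L port:127.0.0.1:port" per line for
# every port from $1 through $2 inclusive.
forward_args() {
    port=$1
    while [ "$port" -le "$2" ]; do
        printf '%s %s:127.0.0.1:%s\n' -L "$port" "$port"
        port=$((port + 1))
    done
}

# Example use (unquoted on purpose so the shell splits it into arguments):
#   ssh -i ~/.ssh/my_tunnel -N $(forward_args 5001 5025) your-user@destination-host
```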
Third: the encryption here works hop by hop. Traffic gets decrypted at the hub (it has to, since the hub is the thing routing it) and re-encrypted on the way back out, so it's not one sealed pipe from end to end. For the thing I'm actually worried about, plaintext sitting on the LAN, that's totally fine since nothing crosses the wire in the clear.
Fourth: when thinking about a setup similar to mine, consider that a lot of LLM backends have no auth out of the box, so anything that can reach my hub's listening port can drive every model behind it, through the hub's legs to the model machines. The permitopen restriction doesn't help here, because it limits where the tunnel can forward, not who's allowed to use the service on the other end. So if something I didn't intend ends up able to hit that port (a rebind I forgot, a service still bound to 0.0.0.0, a sloppy firewall rule, a new leg I added carelessly), it's in. Another reason to do the rebind and kill listen/0.0.0.0.
Anyhow, that's the high level of what I landed on. It's not perfect, but it's a lot less annoying for me than wrangling certs across three operating systems, and it gets the cleartext off the LAN.