🌸 Inspired by Petals & BitTorrent

Run AI in a herd,
not a data centre

OpenHydra splits big language models across volunteer laptops — BitTorrent-style. Your machine serves a slice of Qwen 3.5. You earn HYDRA tokens. Nobody needs a credit card. No cloud required.

$ git clone https://github.com/openhydra-ai/openhydra.git
Get started → ⇩ Desktop app · Soon
🍏

Calling all Mac Mini “OpenClaw” buyers

Did you buy a stack of M4 Mac Minis to run local models? Welcome home. OpenHydra’s architecture is explicitly designed to pool Apple Silicon’s Unified Memory across the internet. Leave your Mac running in the background, seed the swarm, and let your hardware earn HYDRA credits while you sleep.


The five-year-old version

OK but what actually is this?

Glad you asked. It’s honestly not that complicated once you stop calling it “AI infrastructure”.

🎒
Big models are heavy

A 70-billion parameter model weighs ~140 GB. That’s not fitting on your laptop. But split across 8 laptops? Now we’re talking.

🌊
Swarm it, don’t hoard it

Like BitTorrent, everyone in the herd serves a shard. Your laptop handles one piece of the inference. Together the whole model runs.
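The arithmetic behind the split is simple enough to sketch. This is a back-of-the-envelope illustration, not OpenHydra's actual placement algorithm — the layer counts and the fp16 (2 bytes per parameter) assumption are ours:

```python
# Back-of-the-envelope sharding: divide a model's transformer layers
# evenly across peers and estimate per-peer memory at fp16.
# Illustrative only -- not OpenHydra's real scheduler.

def shard_layers(n_layers: int, n_peers: int) -> list[range]:
    """Assign a contiguous block of layers to each peer."""
    base, extra = divmod(n_layers, n_peers)
    shards, start = [], 0
    for peer in range(n_peers):
        size = base + (1 if peer < extra else 0)
        shards.append(range(start, start + size))
        start += size
    return shards

def per_peer_gb(n_params: float, n_peers: int, bytes_per_param: int = 2) -> float:
    """Rough memory each peer must hold, in GB."""
    return n_params * bytes_per_param / n_peers / 1e9

print(per_peer_gb(70e9, 1))  # ~140 GB: the whole 70B model on one machine
print(per_peer_gb(70e9, 8))  # ~17.5 GB per peer across 8 laptops
print(shard_layers(80, 8))   # e.g. 80 layers -> 10 consecutive layers each
```

Quantisation shrinks these numbers further, but the principle is the same: each peer only ever loads its own contiguous slice.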

🪙
And you earn credits

Your node earns barter credits and HYDRA tokens for every request it serves. These are in-network credits — not crypto, not fiat — redeemable for inference on the swarm. A mystery-shopper bot checks quality. Good llamas get priority routing. Cheaters get slashed.

Small models like Qwen 3.5 0.8B run on a single laptop. Bigger ones like Qwen 3 72B need 8 peers. The default install gets you going with Qwen 3.5 immediately — no beefy GPU required.


How it works

Three steps to joining the herd

It’s three commands. Even your grandma could do it. (We’re not sure why your grandma would want distributed AI, but we respect the ambition.)

01
Install and set up

Clone, virtualenv, compile protobufs. Standard Python setup. You’ve done worse.

git clone https://github.com/openhydra-ai/openhydra.git
cd openhydra && make venv && source .venv/bin/activate
make install && make proto
02
Start your node — auto-joins the global swarm

One command. Your node automatically connects to bootstrap nodes on three continents (EU, US, AP) via the Hivemind Kademlia DHT. The default model is Qwen 3.5 0.8B — lightweight enough to run on a potato, smart enough to actually be useful.

openhydra-node --peer-id your-name
03
Chat. Earn. Repeat.

Use the OpenAI-compatible API, earn HYDRA tokens for every request your node serves, and quietly feel good about contributing to decentralised AI.

curl http://localhost:8080/v1/chat/completions \
  -H 'Content-Type: application/json' \
  -d '{"model":"openhydra-qwen3.5-0.8b","messages":[{"role":"user","content":"hi"}]}'

What you get

Features, listed professionally

We also have a proper features page in the docs, but here’s the version where we’re allowed to be slightly smug.

Drop-in OpenAI & Ollama API

Change one URL. Your existing code works. /v1/chat/completions with SSE streaming, plus Ollama-compatible /api/chat for Open WebUI and Continue.dev.
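"Change one URL" in practice: any OpenAI-compatible client will do. Here's a stdlib-only sketch that posts to a local node — the endpoint, port, and model name are taken from this page's own examples and haven't been verified against a live node:

```python
# Minimal OpenAI-compatible client against a local OpenHydra node.
# Endpoint and model name follow the quick-start examples on this page.
import json
from urllib import request

BASE_URL = "http://localhost:8080/v1"

def chat_payload(prompt: str) -> bytes:
    """JSON body for POST /v1/chat/completions."""
    return json.dumps({
        "model": "openhydra-qwen3.5-0.8b",
        "messages": [{"role": "user", "content": prompt}],
    }).encode()

def chat(prompt: str) -> str:
    req = request.Request(
        BASE_URL + "/chat/completions",
        data=chat_payload(prompt),
        headers={"Content-Type": "application/json"},
    )
    with request.urlopen(req) as resp:
        body = json.load(resp)
    return body["choices"][0]["message"]["content"]

if __name__ == "__main__":
    print(chat("hi"))
```

Swap the stdlib call for the official OpenAI SDK by pointing its `base_url` at the same address; existing code should need no other changes.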

🧠
KV cache compaction

4-phase Attention Matching keeps long conversations alive without nuking your VRAM. Based on arXiv:2602.16284 — we read the papers so you don’t have to.

🔗
Dual-stack DHT routing

HTTP DHT + Hivemind Kademlia across three continents. Auto-join on startup. If one bootstrap goes down, the llamas find another way. No single point of failure.
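The failover logic is the easy part to picture: try each bootstrap in turn and join via the first one that answers. A minimal sketch — the addresses and the liveness probe are placeholders, not OpenHydra's real endpoints:

```python
# Multi-continent bootstrap with failover: walk the list and join via
# the first reachable peer. Addresses below are hypothetical.
from typing import Callable, Optional

BOOTSTRAP_PEERS = [
    "bootstrap-eu.example.net:31337",
    "bootstrap-us.example.net:31337",
    "bootstrap-ap.example.net:31337",
]

def pick_bootstrap(peers: list[str], is_alive: Callable[[str], bool]) -> Optional[str]:
    """Return the first reachable bootstrap peer, or None if all are down."""
    for peer in peers:
        if is_alive(peer):
            return peer
    return None

# With the EU node down, the swarm falls through to the US node:
down = {"bootstrap-eu.example.net:31337"}
print(pick_bootstrap(BOOTSTRAP_PEERS, lambda p: p not in down))
# -> bootstrap-us.example.net:31337
```

Once any bootstrap answers, the Kademlia DHT takes over peer discovery, so the bootstrap list only matters at join time.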

🖥
Desktop node app Coming Soon

Tauri v2 app for macOS, Windows, and Linux. Click “Start Node”. Watch credits accumulate. Currently CLI-only — the desktop GUI is in active development.

🛡
Onion routing & encryption

Ed25519 identity, X25519 ECDH + AES-256-GCM per hop, concentric onion routing, and differential privacy noise. No peer sees your full query. Overhead: 0.02%.
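One hop of that onion can be sketched with the `cryptography` package: an X25519 key exchange, a KDF, then AES-256-GCM. The key-schedule details here (HKDF parameters, info label, nonce handling) are illustrative assumptions, not OpenHydra's actual wire format:

```python
# One onion hop: ephemeral X25519 ECDH -> HKDF -> AES-256-GCM.
# Key-schedule details are illustrative, not the real protocol.
import os
from cryptography.hazmat.primitives import hashes
from cryptography.hazmat.primitives.asymmetric.x25519 import X25519PrivateKey
from cryptography.hazmat.primitives.ciphers.aead import AESGCM
from cryptography.hazmat.primitives.kdf.hkdf import HKDF

def hop_key(private, peer_public) -> bytes:
    """Derive a per-hop AES-256 key from an X25519 shared secret."""
    shared = private.exchange(peer_public)
    return HKDF(algorithm=hashes.SHA256(), length=32,
                salt=None, info=b"openhydra-hop").derive(shared)

# The sender wraps one layer for the next relay...
relay_priv = X25519PrivateKey.generate()   # relay's identity key
sender_eph = X25519PrivateKey.generate()   # sender's ephemeral key

key = hop_key(sender_eph, relay_priv.public_key())
nonce = os.urandom(12)
ciphertext = AESGCM(key).encrypt(nonce, b"inner onion layer", None)

# ...and the relay peels it with the same key, derived from its side:
peeled = AESGCM(hop_key(relay_priv, sender_eph.public_key())).decrypt(
    nonce, ciphertext, None)
print(peeled)  # b'inner onion layer'
```

Each relay can only peel its own layer, which is why no single peer ever sees the full query.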

🌎
Python & TypeScript SDKs Coming Soon

Zero-dependency Python client. Browser-native TypeScript SDK. The internal SDK scaffolding exists — public release and docs are coming in v1.1.


Standing on the shoulders of giants

We didn’t invent this. We just added llamas.

OpenHydra builds directly on two brilliant ideas. We want to be upfront about our inspirations, because intellectual honesty is cool (and mandatory if you don’t want to get ratio’d on Hacker News).

Academic inspiration
🌸 Petals

“Run large language models at home, BitTorrent‑style.” Petals proved that volunteer compute can serve real LLM inference across the internet. We took that idea and bolted on a token economy, a desktop app, and a very strong llama motif.

petals.dev →
Protocol inspiration
🌊 BitTorrent

Since 2001, BitTorrent has proved you can distribute enormous files to billions of people without a central server. If it works for a band’s entire discography, it can work for Qwen 3.5 tokens. Same energy.

bittorrent.com →

🦙 Fun llama fact #1: Real llamas are pack animals because they share the load across the herd. The weakest llama doesn’t carry the whole tent. This is also the core architectural principle of OpenHydra.

🦙 Fun llama fact #2: A group of llamas is called a herd. OpenHydra’s network of peers is also called a herd. We are very consistent in our metaphors and proud of this.

🦙 Fun llama fact #3: The Hydra in Greek mythology had multiple heads — cut one off and two grow back. Our bootstrap nodes work the same way. (Please don’t cut our bootstrap nodes.)

🦙 Fun llama fact #4: Llamas can spit up to 10 feet when stressed. Our nodes politely return HTTP 503 instead. Both are valid responses to being overwhelmed.


What runs on it

It’s Qwen all the way down (mostly)

The default is Qwen 3.5 0.8B — tiny enough for any laptop. Larger models shard automatically across multiple peers. NF4 quantisation cuts VRAM by 4x. Add any HuggingFace model by editing models.catalog.json.

Default · 1 peer · 2 GB
Qwen 3.5 0.8B
Runs on a potato. The default.
Compact · 1 peer · 5 GB
Qwen 3.5 2B
Strong multilingual. Single peer.
Mid-range · 1 peer · 9 GB
Qwen 3.5 4B
Reasoning on a single peer.
Advanced · 2 peers · 18 GB
Qwen 3.5 9B
High-quality reasoning. int8 quantised.
Frontier · 4 peers · 16 GB/peer
Qwen 3.5 27B
int4 quantised. Bring your friends.

5 models in the default catalog. Add any HuggingFace model to models.catalog.json. If the requested model lacks peers, the coordinator gracefully degrades to the nearest available smaller model.
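A catalog entry might look something like this — the field names are our guesses at the schema, since the real models.catalog.json isn’t reproduced on this page:

```json
{
  "id": "openhydra-qwen3.5-0.8b",
  "hf_repo": "your-org/your-model",
  "min_peers": 1,
  "quantization": "nf4",
  "memory_gb": 2
}
```

The values mirror the default card above: one peer, NF4 quantisation, ~2 GB of memory.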

Ready to join the herd?

Your laptop is sitting there doing nothing useful. It could be serving AI tokens and earning credits on the swarm. Three-headed llamas are waiting for you.