NebulaDNS M1
Zero unsafe Fuzzed Apache-2.0

The authoritative DNS server you can actually see.

NebulaDNS is a modern, observable, API-first authoritative DNS server written in safe Rust. It replaces 2001-era daemons (TinyDNS, djbdns) and 40-year-old C codebases (BIND) with a single small binary that ships with metrics, a control plane, and a Kubernetes operator, so the next failure is detected in seconds, not by customers.

≤ 40 µs
Target p50 in-process latency
≤ 150 µs
Target p99
≥ 2M qps
Target UDP throughput / 16-core node
0
Bytes of GC-induced tail latency

Why we built this

Two production outages in 22 days, same root cause.

We ran djbdns 1.05 (released 2001) behind a CDN. It sends AXFR responses with QDCOUNT=0, which BIND 9.18+ rejects as FORMERR. Our downstream secondaries silently upgraded to BIND 9.18, and our zone-transfer redundancy eroded invisibly for 13 months until only one working agent remained.

No /metrics, no API, no post-deploy verification, no way to see that half the fleet had stopped transferring. Engineers had to SSH into six servers to reconstruct state during the incident.

NebulaDNS is the direct answer. Every failure mode that caused those incidents is a first-class feature here: propagation verification, peer software fingerprinting, atomic versioned deploys, deterministic SOA serials, and an API for every operator action.

What the next incident looks like
  1. 30s: Dashboard surfaces the FORMERR with peer software fingerprint and packet detail.
  2. 60s: Deploy pipeline reports failure, not "complete". Propagation gate blocks.
  3. 2m: On-call rolls back the zone via UI or API; no SSH required.
  4. ∞: CI's interop matrix would have failed before the bad build ever shipped.

Features

Batteries included. All twelve of them.

🛰️

API-first

Every operator action (create zone, edit record, rotate TSIG key, trigger rollover, roll back) is a single authenticated REST/gRPC call. The CLI and UI are thin clients.

📊

Observable by default

Prometheus /metrics is always-on with zero hot-path allocation. Enum-typed labels cap cardinality at compile time. Structured JSON logs and OTLP traces are included.

✅

Verified propagation

A deploy isn't "done" until every declared downstream secondary reports the new SOA serial. The propagation gate is built in: the single feature missing from every other auth DNS server.
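The gate described above could be sketched roughly as follows. This is a minimal illustration, not NebulaDNS's actual implementation: `evaluate_gate`, `GateResult`, and `serial_at_least` are hypothetical names, and the sketch assumes the verifier has already collected one SOA serial (or a failure) per declared secondary. Serials are compared with RFC 1982 wraparound arithmetic so that a secondary that is already ahead of the target still counts as converged.

```rust
use std::collections::BTreeMap;

/// RFC 1982 serial-number comparison: true when `observed` is equal to or
/// ahead of `target` in 32-bit serial space (wraparound-aware).
fn serial_at_least(observed: u32, target: u32) -> bool {
    observed == target || (observed.wrapping_sub(target) as i32) > 0
}

/// Outcome of one gate evaluation (hypothetical type).
#[derive(Debug, PartialEq)]
enum GateResult {
    /// Every declared secondary serves the target serial or newer.
    Converged,
    /// These peers are behind or did not answer; the deploy is not done.
    Blocked(Vec<String>),
}

/// `observed` maps each declared secondary to the SOA serial it last
/// returned, or `None` if the SOA query failed or timed out.
fn evaluate_gate(target: u32, observed: &BTreeMap<String, Option<u32>>) -> GateResult {
    let lagging: Vec<String> = observed
        .iter()
        .filter(|(_, serial)| match **serial {
            Some(s) => !serial_at_least(s, target),
            None => true, // no answer counts as not converged
        })
        .map(|(peer, _)| peer.clone())
        .collect();

    if lagging.is_empty() {
        GateResult::Converged
    } else {
        GateResult::Blocked(lagging)
    }
}
```

In this sketch an unreachable secondary blocks the gate just like a stale one, which is the behaviour the incident narrative calls for: silence is treated as failure, not as success.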

🧬

Peer software fingerprinting

The server records version.bind CHAOS responses from every declared secondary. Silent upgrades can't erode your redundancy invisibly.
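For readers unfamiliar with the probe, a `version.bind` lookup is an ordinary DNS query for a TXT record in class CHAOS. The sketch below builds the raw query bytes using only the standard library; the function name `version_bind_query` is hypothetical and this is not the project's wire codec, only an illustration of what goes on the wire.

```rust
/// Builds a DNS query for `version.bind` TXT in class CHAOS (CH).
/// `id` is the caller-chosen transaction ID.
fn version_bind_query(id: u16) -> Vec<u8> {
    let mut msg = Vec::with_capacity(30);

    // 12-byte header
    msg.extend_from_slice(&id.to_be_bytes());   // ID
    msg.extend_from_slice(&[0x00, 0x00]);       // flags: standard query
    msg.extend_from_slice(&1u16.to_be_bytes()); // QDCOUNT = 1
    msg.extend_from_slice(&[0u8; 6]);           // ANCOUNT, NSCOUNT, ARCOUNT = 0

    // QNAME: length-prefixed labels, terminated by the root label
    for label in ["version", "bind"] {
        msg.push(label.len() as u8);
        msg.extend_from_slice(label.as_bytes());
    }
    msg.push(0);

    msg.extend_from_slice(&16u16.to_be_bytes()); // QTYPE = TXT (16)
    msg.extend_from_slice(&3u16.to_be_bytes());  // QCLASS = CH (3)
    msg
}
```

The resulting 30-byte buffer can be sent to a secondary over UDP (for example with `std::net::UdpSocket`); the TXT answer, if the peer chooses to disclose it, carries the software name and version. Many operators hide or override this string, so a fingerprint built on it is a hint, not proof.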

🔐

Safe Rust

#![forbid(unsafe_code)] across every crate. Continuous fuzzing on the wire codec, zone parser, and DNSSEC signer.

⚡

Zero-GC tail latency

Deterministic allocation means no p99.9 hiccups. Lock-free zone store served through arc-swap; readers never block.

📦

Atomic versioned configuration

Zone data is content-addressed and deployed atomically. Roll forward and rollback are one-liners. No partial reads. No "edit the file and SIGHUP".

🧮

Deterministic SOA serials

Auto-generated, monotonic serials. Retry-with-same-serial is impossible by construction, so no more wedged recoveries.
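One way such a guarantee could be obtained is to tie the serial to the zone content, so that a retry of identical content returns the serial it already has and any different content is forced onto a new one. The sketch below assumes exactly that; the source does not describe the real allocation scheme, and `SerialAllocator` and `serial_for` are hypothetical names.

```rust
/// Hands out SOA serials keyed by a digest of the zone content
/// (hypothetical sketch; the digest is supplied by the caller).
struct SerialAllocator {
    current: u32,
    last_digest: Option<u64>,
}

impl SerialAllocator {
    fn new(start: u32) -> Self {
        Self { current: start, last_digest: None }
    }

    /// Returns the serial to publish for content with this digest.
    /// Re-submitting the content that is already live is idempotent;
    /// anything else advances the serial by one (wrapping, per RFC 1982).
    fn serial_for(&mut self, content_digest: u64) -> u32 {
        if self.last_digest == Some(content_digest) {
            return self.current;
        }
        self.current = self.current.wrapping_add(1);
        self.last_digest = Some(content_digest);
        self.current
    }
}
```

Under this scheme two different zone bodies can never be published under the same serial, which is the situation that leaves secondaries refusing to transfer.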

☸️

Kubernetes-native

Helm chart, operator, and CRDs (Zone, Record, Secondary, TsigKey, DeployGate). GitOps-ready from day one. Drop-in replacement for CoreDNS as cluster DNS.

🧱

Standards-conformant wire

Strict RFC 1034/1035/5936/1995/1996/8945/4034/4035/7766/7858/8484/9250. Errors are explicit (QdCountMismatch), never silently malformed.
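To make "explicit errors" concrete, here is a small sketch of a header check that surfaces a QDCOUNT problem as a typed error. Only the variant name `QdCountMismatch` comes from the text above; the enum layout, the `Truncated` variant, and the `check_qdcount` function are assumptions made for illustration. The caller passes the expected count, since what is acceptable depends on the message being validated.

```rust
use std::fmt;

/// Typed wire errors (hypothetical layout).
#[derive(Debug, PartialEq)]
enum WireError {
    Truncated { needed: usize, got: usize },
    QdCountMismatch { expected: u16, got: u16 },
}

impl fmt::Display for WireError {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        match self {
            WireError::Truncated { needed, got } => {
                write!(f, "message truncated: need {} bytes, got {}", needed, got)
            }
            WireError::QdCountMismatch { expected, got } => {
                write!(f, "QDCOUNT mismatch: expected {}, got {}", expected, got)
            }
        }
    }
}

/// Reads QDCOUNT from a raw DNS message header (bytes 4..6, big-endian)
/// and compares it with the count the caller expects.
fn check_qdcount(msg: &[u8], expected: u16) -> Result<(), WireError> {
    if msg.len() < 12 {
        return Err(WireError::Truncated { needed: 12, got: msg.len() });
    }
    let got = u16::from_be_bytes([msg[4], msg[5]]);
    if got != expected {
        return Err(WireError::QdCountMismatch { expected, got });
    }
    Ok(())
}
```

A typed error like this can be logged and counted with a `reason` label, which is what lets a dashboard show why a peer rejected or sent a malformed message.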

🧭

Redundancy you can see

The dashboard shows, at a glance, which secondaries last transferred each zone and when. No more silent 13-month erosion.

🧯

Hardened by default

systemd unit with ProtectSystem=strict, CAP_NET_BIND_SERVICE only, seccomp syscall filter, read-only rootfs. Non-root container, distroless base, ≤ 25 MB compressed.

Install

Three ways to try it in the next five minutes.

Docker Single container

Docker Compose

NebulaDNS + Prometheus + Grafana, with a real example zone and a dig-based smoke container.

git clone https://github.com/bwalia/nebuladns
cd nebuladns
docker compose -f deploy/docker/compose/docker-compose.yml \
  up --build -d

# Real DNS over UDP and TCP
dig @127.0.0.1 -p 5353 www.example.com A +short
# 192.0.2.10
# 192.0.2.11

# Dashboards
open http://127.0.0.1:3000  # Grafana
open http://127.0.0.1:9091  # Prometheus
Helm Kubernetes

Helm chart

StatefulSet, Service, PDB, NetworkPolicy, ServiceMonitor, PrometheusRule, and a helm test smoke pod, ready to helm install.

helm install nebuladns \
  oci://ghcr.io/nebuladns/charts/nebuladns \
  --namespace nebuladns --create-namespace \
  --values values.yaml

helm test nebuladns -n nebuladns --logs
kubectl -n nebuladns port-forward svc/nebuladns 9090:9090
curl http://127.0.0.1:9090/metrics | head
systemd Bare metal / VM

Debian / Ubuntu via Ansible

Hardened systemd unit with watchdog + sd_notify. Ansible role with atomic symlink swap, smoke test, and rollback.

cd deploy/ansible
ansible-galaxy collection install -r requirements.yml
ansible-playbook -i inventory.ini playbook.yml \
  -e nebuladns_version=v0.1.0 \
  -e @environments/staging.yml

# Rollback? One command, a few seconds per host.
ansible-playbook -i inventory.ini rollback.yml

Build from source

cargo build --release --bin nebuladns --bin nebulactl
./target/release/nebuladns --config config/nebuladns.demo.toml &

dig @127.0.0.1 -p 15353 www.example.com A +short
./target/release/nebulactl health

Comparison

What's missing from every existing auth DNS server.

| Capability | TinyDNS | BIND 9 | Knot | NSD | PowerDNS | CoreDNS | NebulaDNS |
|---|---|---|---|---|---|---|---|
| Full REST API | ❌ | partial (rndc) | partial (knotc) | ❌ | partial | n/a | ✅ |
| Native Prometheus /metrics | ❌ | XML/JSON only | module | module | module | ✅ | ✅ always-on |
| Built-in propagation gate | ❌ | ❌ | ❌ | ❌ | ❌ | ❌ | ✅ |
| Kubernetes operator + CRDs | ❌ | ❌ | ❌ | ❌ | ❌ | ❌ | ✅ |
| Online DNSSEC signing | ❌ | ✅ | ✅ | ❌ (offline) | ✅ | partial | ✅ (HSM/KMS) |
| Memory-safe language | ❌ (C) | ❌ (C) | ❌ (C) | ❌ (C) | ❌ (C++) | ⚠️ (Go, GC) | ✅ (Rust, no GC) |
| Peer software fingerprinting | ❌ | ❌ | ❌ | ❌ | ❌ | ❌ | ✅ |
| Atomic versioned zone store | ❌ | ❌ | ✅ | ❌ | SQL-backed | n/a | ✅ content-addressed |
| React dashboard included | ❌ | ❌ | ❌ | ❌ | third-party | ❌ | ⬅ v1.0 (M9) |

✅ = shipped today · ⬅ = planned for v1.0 GA · ⚠️ = with caveats. See the full competitive analysis in PROJECT_PROMPT.md §2.5.

Architecture

One binary. One API. Every piece swappable.

                       ┌──────────────────────────────────────────┐
                       │            React Web UI (v1.0)           │
                       └───────────────────┬──────────────────────┘
                                           │ HTTPS (OpenAPI)
                       ┌───────────────────▼──────────────────────┐
                       │          Control Plane API               │
                       │   axum (REST) + tonic (gRPC)             │
                       │   AuthN: mTLS + OIDC   AuthZ: RBAC       │
                       └───────┬──────────────────┬───────────────┘
                               │                  │
                ┌──────────────▼──┐       ┌───────▼────────────┐
                │  Zone Manager   │       │ Propagation        │
                │  validate       │       │ Verifier           │
                │  sign (DNSSEC)  │       │  polls secondaries │
                │  atomic commit  │       │  confirms SOA      │
                └────────┬────────┘       └────────────────────┘
                         │
        ┌────────────────▼─────────────────────────┐
        │       Zone Store (content-addressed)     │
        │       sled / redb / pluggable            │
        └────────────────┬─────────────────────────┘
                         │
        ┌────────────────▼─────────────────────────┐
        │          DNS Data Plane                  │
        │   tokio + io_uring (Linux ≥ 5.19)        │
        │   UDP · TCP · DoT · DoH · DoQ            │
        │   AXFR / IXFR / NOTIFY / TSIG            │
        │   DNSSEC online signer                   │
        └──────────────────────────────────────────┘

    Observability spine: tracing → OpenTelemetry → Prometheus + Loki + Tempo
    
Data plane

Hand-rolled zero-copy wire codec. Lock-free zone snapshots via arc-swap. No allocations on the hot path in steady state.
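The snapshot pattern can be illustrated with the standard library alone. The sketch below is a stand-in, not the real data plane: it uses `RwLock<Arc<_>>` in place of arc-swap, so unlike the arc-swap version it does take a brief lock on each load. `ZoneSnapshot` and `ZoneHandle` are hypothetical names. What it does show is the shape of the design: readers clone a pointer to an immutable snapshot, and a publish replaces the pointer without touching snapshots already in use.

```rust
use std::collections::HashMap;
use std::net::Ipv4Addr;
use std::sync::{Arc, RwLock};

/// One immutable, fully built version of a zone (hypothetical type).
struct ZoneSnapshot {
    serial: u32,
    a_records: HashMap<String, Vec<Ipv4Addr>>,
}

/// Shared handle; std-only stand-in for an arc-swap pointer.
struct ZoneHandle {
    current: RwLock<Arc<ZoneSnapshot>>,
}

impl ZoneHandle {
    fn new(initial: ZoneSnapshot) -> Self {
        Self { current: RwLock::new(Arc::new(initial)) }
    }

    /// Query path: grab a reference-counted pointer to the live snapshot.
    /// The lock is held only for the duration of the pointer clone.
    fn load(&self) -> Arc<ZoneSnapshot> {
        Arc::clone(&self.current.read().unwrap())
    }

    /// Deploy path: swap in a new snapshot. In-flight readers keep the
    /// old one alive until they drop their `Arc`.
    fn publish(&self, next: ZoneSnapshot) {
        *self.current.write().unwrap() = Arc::new(next);
    }
}
```

Because a reader holds a complete snapshot for the whole query, it can never observe a half-applied zone update, whichever pointer-swap primitive sits underneath.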

Control plane

OpenAPI spec is generated from code via utoipa. Every write is idempotent, dry-run-able, and audit-logged.

Storage

Content-addressed, versioned. Rollback is a one-liner. Writes are atomic; readers never block. Optional S3-backed continuous backup.
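As a rough illustration of content addressing with one-step rollback, the sketch below keeps zone bodies keyed by a digest and tracks deploy order in a history list. It is an in-memory toy under stated assumptions: `ZoneStore`, `deploy`, `rollback`, and `live` are hypothetical names, and `DefaultHasher` stands in for the cryptographic hash a real store would need.

```rust
use std::collections::hash_map::DefaultHasher;
use std::collections::HashMap;
use std::hash::{Hash, Hasher};

type Digest = u64;

/// Stand-in digest. A real store would use a cryptographic hash;
/// `DefaultHasher` is used here only to keep the sketch std-only.
fn digest(bytes: &[u8]) -> Digest {
    let mut h = DefaultHasher::new();
    bytes.hash(&mut h);
    h.finish()
}

#[derive(Default)]
struct ZoneStore {
    blobs: HashMap<Digest, Vec<u8>>, // content-addressed zone bodies
    history: Vec<Digest>,            // deploy order; last entry is live
}

impl ZoneStore {
    /// Stores the zone under its digest and makes it the live version.
    /// Deploying the version that is already live is a no-op.
    fn deploy(&mut self, zone: &[u8]) -> Digest {
        let id = digest(zone);
        self.blobs.entry(id).or_insert_with(|| zone.to_vec());
        if self.history.last() != Some(&id) {
            self.history.push(id);
        }
        id
    }

    /// Makes the previous version live again; `None` if there is none.
    fn rollback(&mut self) -> Option<Digest> {
        if self.history.len() < 2 {
            return None;
        }
        self.history.pop();
        self.history.last().copied()
    }

    /// Body of the live version, if any.
    fn live(&self) -> Option<&[u8]> {
        let id = self.history.last()?;
        self.blobs.get(id).map(Vec::as_slice)
    }
}
```

Since a version is identified by its content, deploying the same bytes twice cannot create two different versions, and rollback only moves a pointer; no zone data is rewritten.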

Metrics

Every query. Every transfer. Every deploy.

Metrics are always-on with a compile-time cardinality budget. The design contract: zero hot-path allocation, ≤ 1% CPU overhead, ≤ 1 µs added to p99 latency. The full catalogue is in the planning prompt; here's a slice of what's live today.

# Wire / query pipeline
nebula_dns_queries_total{proto,qtype,rcode}
nebula_dns_query_duration_seconds_bucket{proto,qtype,le}
nebula_dns_dropped_total{reason}
nebula_dns_formerr_total{peer,direction,reason}

# Zone transfer / NOTIFY (M3)
nebula_axfr_attempts_total{peer,zone,direction,result}
nebula_axfr_last_success_timestamp_seconds{peer,zone}
nebula_peer_version_info{peer,software,version}   # the incident 1326 signal

# Propagation verifier (M6)
nebula_zone_current_soa_serial{zone}
nebula_secondary_observed_soa_serial{zone,peer}
nebula_zone_propagation_lag_seconds{zone,peer}
nebula_zone_propagation_converged{zone}           # 0 or 1

# Runtime
nebula_build_info{version,commit,rustc,target} 1
nebula_process_resident_memory_bytes
nebula_memory_hot_path_allocations_total          # MUST stay at 0
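
A compile-time cardinality budget can be approximated by making every label an enum and backing the counter with a fixed-size array of atomics. The sketch below is a simplified, std-only illustration: the type and method names are hypothetical, the `qtype` label of the real metric is omitted for brevity, and only a handful of rcodes are modelled. Incrementing performs no allocation; rendering the text exposition does, but that happens on the scrape path.

```rust
use std::sync::atomic::{AtomicU64, Ordering};

#[derive(Clone, Copy)]
enum Proto { Udp, Tcp }

#[derive(Clone, Copy)]
enum Rcode { NoError, FormErr, ServFail, NxDomain }

impl Proto {
    const ALL: [Proto; 2] = [Proto::Udp, Proto::Tcp];
    fn as_str(self) -> &'static str {
        match self { Proto::Udp => "udp", Proto::Tcp => "tcp" }
    }
}

impl Rcode {
    const ALL: [Rcode; 4] = [Rcode::NoError, Rcode::FormErr, Rcode::ServFail, Rcode::NxDomain];
    fn as_str(self) -> &'static str {
        match self {
            Rcode::NoError => "NOERROR",
            Rcode::FormErr => "FORMERR",
            Rcode::ServFail => "SERVFAIL",
            Rcode::NxDomain => "NXDOMAIN",
        }
    }
}

/// 2 protocols x 4 rcodes = exactly 8 series; the bound is fixed by the types.
#[derive(Default)]
struct QueryCounters {
    cells: [[AtomicU64; 4]; 2],
}

impl QueryCounters {
    /// Hot path: one relaxed atomic add, no allocation, no lock.
    fn inc(&self, proto: Proto, rcode: Rcode) {
        self.cells[proto as usize][rcode as usize].fetch_add(1, Ordering::Relaxed);
    }

    /// Scrape path: render the counter in Prometheus text format.
    fn render(&self) -> String {
        let mut out = String::new();
        for p in Proto::ALL {
            for r in Rcode::ALL {
                let n = self.cells[p as usize][r as usize].load(Ordering::Relaxed);
                out.push_str(&format!(
                    "nebula_dns_queries_total{{proto=\"{}\",rcode=\"{}\"}} {}\n",
                    p.as_str(), r.as_str(), n
                ));
            }
        }
        out
    }
}
```

Because a label value that is not an enum variant cannot be constructed, an attacker-controlled string (a query name, say) has no way to create new time series.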

Roadmap

Fourteen milestones to GA.

Done M0: Skeleton

Workspace, CI, /livez, /readyz, /metrics, systemd unit, distroless container, Helm skeleton, fuzz harness.

Done M1: Wire + Zone

Full RFC 1035 message codec, name compression, RR types (A/AAAA/NS/CNAME/SOA/MX/TXT/PTR/SRV/CAA), TOML zone loader, UDP + TCP listeners, real dig answering end-to-end.

Next M2: Auth zone completeness

Wildcards, CNAME chasing, glue, RFC 2308 negative responses, EDNS negotiation, response-rate-limiting.

Planned M3: Transfers

AXFR, IXFR, NOTIFY, TSIG + daily interop matrix against BIND / Knot / NSD / PowerDNS / CoreDNS.

Planned M4: DNSSEC

Online signing (Ed25519 / ECDSA / RSA), automatic KSK/ZSK rollover (RFC 6781), HSM (PKCS#11) and AWS KMS backends.

Planned M6: Propagation verifier

The feature that makes incidents 1273 and 1326 impossible to recur.

Planned M7: HA + multi-region

Raft in-region, async log-ship cross-region, active-active with CRDTs, explicit failover gates.

Planned M8: Kubernetes operator + CRDs

First-class Zone / Record / Secondary / TsigKey / DeployGate resources. ExternalDNS provider. CoreDNS cluster-DNS drop-in.

Full milestone schedule and SLOs: PROJECT_PROMPT.md §15.

Releases

Versioned, verified, immutable.

Every tagged commit produces a GitHub Release with static Linux (musl) and macOS binaries for nebuladns and nebulactl, per-archive SHA-256 checksums, and auto-generated release notes grouped by PR label. Releases are immutable: a rollback is a new version, never a rewritten tag.

Latest GitHub Releases

Download

Static binaries for x86_64 and aarch64 on Linux (musl) and macOS. Verify with the SHA256SUMS file shipped with every release.

View all releases →
Docs For developers & agents

RELEASE.md

The release runbook. How to cut a version, what the workflow does, how to verify an artifact, how to roll back, plus a dedicated section for AI coding agents.

Read RELEASE.md →
CI Tag-triggered

Release workflow

Pushing a v*.*.* tag re-runs the test gate against the exact SHA, cross-compiles four targets, smoke-tests each binary, and publishes the Release with auto-generated notes.

View workflow →

Verify a downloaded release

VERSION=v0.2.0
TARGET=x86_64-unknown-linux-musl
gh release download "$VERSION" \
  --pattern "nebuladns-${VERSION}-${TARGET}.tar.gz" \
  --pattern "nebuladns-${VERSION}-${TARGET}.tar.gz.sha256"

shasum -a 256 -c "nebuladns-${VERSION}-${TARGET}.tar.gz.sha256"
# nebuladns-v0.2.0-x86_64-unknown-linux-musl.tar.gz: OK

tar -xzf "nebuladns-${VERSION}-${TARGET}.tar.gz"
./nebuladns-${VERSION}-${TARGET}/nebuladns --print-default-config | head

Run it. Break it. File an issue.

NebulaDNS is pre-1.0 and moving fast. If the compose rig doesn't spin up on your machine in under two minutes, or the Helm chart lands a pod that isn't answering real dig queries within thirty seconds, that's a bug and we want the trace.

Star on GitHub Report an issue