Shields AI: Building Sub-Millisecond DNS Security

DNS is the phonebook of the internet—and a prime target for attackers. Shields AI provides intelligent DNS security with machine learning threat detection and sub-millisecond latency. Here's how we built it.

The Challenge

DNS security must be:

**Fast**: Every website load starts with DNS
**Accurate**: Block threats, not legitimate sites
**Intelligent**: Detect new threats automatically
**Private**: Don't log or sell query data

Traditional approaches fail on at least one dimension. Static blocklists, for example, are fast but cannot flag a domain that nobody has reported yet.

Architecture Overview

┌─────────────────────────────────────────────────────────────┐
│                        Shields AI                            │
├─────────────────────────────────────────────────────────────┤
│  ┌───────────┐    ┌───────────┐    ┌───────────┐           │
│  │   eBPF    │───→│   Rust    │───→│    ML     │           │
│  │  Capture  │    │  Filter   │    │  Engine   │           │
│  └───────────┘    └───────────┘    └───────────┘           │
│        │                │                │                   │
│        ▼                ▼                ▼                   │
│  ┌───────────┐    ┌───────────┐    ┌───────────┐           │
│  │  Packet   │    │  Block    │    │  Threat   │           │
│  │  Stats    │    │  Lists    │    │  Score    │           │
│  └───────────┘    └───────────┘    └───────────┘           │
└─────────────────────────────────────────────────────────────┘

Layer 1: High-Performance Packet Processing

Most DNS queries arrive as UDP packets (TCP is the fallback for large responses), and we need to process them with minimal latency.

eBPF for Zero-Copy Processing

eBPF (extended Berkeley Packet Filter) lets us process packets in the kernel:

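// Simplified eBPF DNS parser.
// struct dns_header and is_blocked() are defined elsewhere; qname stands for
// the question name that follows the 12-byte DNS header.
#include <linux/bpf.h>
#include <linux/if_ether.h>
#include <linux/in.h>
#include <linux/ip.h>
#include <linux/udp.h>
#include <bpf/bpf_helpers.h>
#include <bpf/bpf_endian.h>

SEC("xdp")
int dns_filter(struct xdp_md *ctx) {
    void *data = (void *)(long)ctx->data;
    void *data_end = (void *)(long)ctx->data_end;

    // Parse Ethernet header; only IPv4 frames are handled here
    struct ethhdr *eth = data;
    if ((void *)(eth + 1) > data_end)
        return XDP_PASS;
    if (eth->h_proto != bpf_htons(ETH_P_IP))
        return XDP_PASS;

    // Parse IP header
    struct iphdr *ip = (void *)(eth + 1);
    if ((void *)(ip + 1) > data_end)
        return XDP_PASS;

    // Check for UDP port 53 (DNS)
    if (ip->protocol != IPPROTO_UDP)
        return XDP_PASS;

    // Simplification: packets with IP options go to userspace unchanged
    if (ip->ihl != 5)
        return XDP_PASS;

    struct udphdr *udp = (void *)(ip + 1);
    if ((void *)(udp + 1) > data_end)
        return XDP_PASS;
    if (udp->dest != bpf_htons(53))
        return XDP_PASS;

    // Extract domain name and check blocklist
    struct dns_header *dns = (void *)(udp + 1);
    if ((void *)(dns + 1) > data_end)
        return XDP_PASS;
    if (is_blocked(dns->qname))
        return XDP_DROP;  // Block in kernel

    return XDP_PASS;  // Allow to userspace
}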

Benefits:

Packet processing in kernel space
No context switches for blocked domains
Microsecond-level latency
Can handle millions of packets/second
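The "extract domain name" step hides some detail of the DNS wire format: a 12-byte header, then the question name as length-prefixed labels ending in a zero byte. The Python sketch below shows that decoding outside the kernel. `parse_qname` and `build_query` are illustrative names, not part of Shields AI, and the sketch assumes the question name is uncompressed, as it normally is in a plain query.

```python
import struct

DNS_HEADER_LEN = 12  # fixed-size DNS header (RFC 1035)


def parse_qname(packet: bytes) -> str:
    """Decode the first question name from a raw DNS query."""
    if len(packet) < DNS_HEADER_LEN + 1:
        raise ValueError("packet too short for a DNS question")
    labels = []
    offset = DNS_HEADER_LEN
    while True:
        if offset >= len(packet):
            raise ValueError("question name runs past end of packet")
        length = packet[offset]
        offset += 1
        if length == 0:  # the root label terminates the name
            break
        if length > 63:  # top bits set: compression pointer or invalid length
            raise ValueError("unsupported label length")
        if offset + length > len(packet):
            raise ValueError("label runs past end of packet")
        # DNS names are case-insensitive, so normalize to lowercase
        labels.append(packet[offset:offset + length].decode("ascii").lower())
        offset += length
    return ".".join(labels)


def build_query(domain: str, query_id: int = 0x1234) -> bytes:
    """Build a minimal A-record query, used here only to exercise the parser."""
    header = struct.pack("!HHHHHH", query_id, 0x0100, 1, 0, 0, 0)  # RD set, 1 question
    qname = b"".join(
        bytes([len(label)]) + label.encode("ascii") for label in domain.split(".")
    ) + b"\x00"
    return header + qname + struct.pack("!HH", 1, 1)  # QTYPE=A, QCLASS=IN
```

Every read is bounds-checked before it happens, which is the same discipline the eBPF verifier enforces on the in-kernel parser.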

Rust for Safety and Speed

Beyond eBPF, our userspace processing is in Rust:

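use std::sync::Arc;
use tokio::net::UdpSocket;

// BlockList, MlEngine, DnsQuery, THRESHOLD, and the helper functions are defined elsewhere.
async fn dns_server(
    blocklist: Arc<BlockList>,
    ml_engine: Arc<MlEngine>,
) -> Result<(), Box<dyn std::error::Error>> {
    // tokio's UdpSocket is needed here; std's socket has no awaitable recv_from
    let socket = UdpSocket::bind("0.0.0.0:53").await?;
    let mut buf = [0u8; 512]; // classic DNS-over-UDP message limit (no EDNS)

    loop {
        let (len, src) = socket.recv_from(&mut buf).await?;

        // Parse DNS query; skip malformed packets instead of stopping the server
        let query = match DnsQuery::parse(&buf[..len]) {
            Ok(query) => query,
            Err(_) => continue,
        };

        // Check blocklist (hash lookup, then pattern match)
        if blocklist.contains(&query.domain) {
            socket.send_to(&blocked_response(&query), src).await?;
            continue;
        }

        // ML threat scoring (awaiting yields to the runtime instead of blocking the thread)
        let threat_score = ml_engine.score(&query).await;
        if threat_score > THRESHOLD {
            log_threat(&query, threat_score);
            socket.send_to(&blocked_response(&query), src).await?;
            continue;
        }

        // Forward to upstream DNS
        let response = forward_query(&query).await?;
        socket.send_to(&response, src).await?;
    }
}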
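The server above relies on a `blocked_response` helper without showing it. One common way to answer a blocked query is an NXDOMAIN reply that echoes the query ID and question. The Python sketch below illustrates that idea on raw bytes. It is an assumption about how blocking could work, not a description of Shields AI's actual response, and it assumes the query carries only a question section (no EDNS records).

```python
import struct

QR_RESPONSE = 0x8000   # message is a response
OPCODE_MASK = 0x7800   # opcode bits, copied from the query
RD_FLAG = 0x0100       # recursion desired, copied from the query
RA_FLAG = 0x0080       # recursion available
RCODE_NXDOMAIN = 3     # "no such domain"


def blocked_response(query: bytes) -> bytes:
    """Turn a raw DNS query into an NXDOMAIN reply with no answer records."""
    if len(query) < 12:
        raise ValueError("query shorter than the DNS header")
    query_id, flags, qdcount = struct.unpack("!HHH", query[:6])
    reply_flags = (
        QR_RESPONSE
        | (flags & (OPCODE_MASK | RD_FLAG))
        | RA_FLAG
        | RCODE_NXDOMAIN
    )
    # Keep the ID and question count; zero the answer, authority, and additional counts.
    header = struct.pack("!HHHHHH", query_id, reply_flags, qdcount, 0, 0, 0)
    # Echo the question section unchanged so the client can match the reply.
    return header + query[12:]
```

Returning a well-formed reply lets the client fail fast, whereas a silently dropped query leaves it waiting for a timeout.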

Layer 2: Intelligent Blocklists

Static blocklists are the first line of defense.

Blocklist Architecture

use std::collections::{HashMap, HashSet};

use aho_corasick::AhoCorasick;

// BloomFilter and Category are defined elsewhere.
pub struct BlockList {
    // Hash set for exact domain matching
    exact: HashSet<String>,

    // Aho-Corasick automaton for pattern matching
    patterns: AhoCorasick,

    // Bloom filter over the exact set, for quick negative lookups
    bloom: BloomFilter,

    // Categories for fine-grained control
    categories: HashMap<Category, HashSet<String>>,
}

impl BlockList {
    pub fn contains(&self, domain: &str) -> bool {
        // Quick bloom filter check (false = definitely not an exact match),
        // followed by the exact lookup to rule out false positives
        if self.bloom.might_contain(domain) && self.exact.contains(domain) {
            return true;
        }

        // Pattern matching for wildcards; this must run even when the bloom
        // filter misses, because wildcard rules are not in the exact set
        self.patterns.is_match(domain)
    }
}
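To make the "false means definitely not blocked" property concrete, here is a minimal Bloom filter sketch in Python. The class mirrors the `might_contain` name used above, but the sizing, the SHA-256 double-hashing scheme, and the `is_exact_blocked` helper are assumptions chosen for illustration rather than the production design.

```python
import hashlib


class BloomFilter:
    """Fixed-size Bloom filter: no false negatives, some false positives."""

    def __init__(self, num_bits: int = 1 << 20, num_hashes: int = 7):
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = bytearray((num_bits + 7) // 8)

    def _positions(self, item: str):
        # Derive k bit positions from one SHA-256 digest (double hashing).
        digest = hashlib.sha256(item.encode("utf-8")).digest()
        h1 = int.from_bytes(digest[:8], "big")
        h2 = int.from_bytes(digest[8:16], "big") | 1
        for i in range(self.num_hashes):
            yield (h1 + i * h2) % self.num_bits

    def add(self, item: str) -> None:
        for pos in self._positions(item):
            self.bits[pos // 8] |= 1 << (pos % 8)

    def might_contain(self, item: str) -> bool:
        return all(
            self.bits[pos // 8] & (1 << (pos % 8)) for pos in self._positions(item)
        )


def is_exact_blocked(domain: str, bloom: BloomFilter, exact: set) -> bool:
    """The filter rejects most clean domains; the set confirms real hits."""
    domain = domain.lower().rstrip(".")
    return bloom.might_contain(domain) and domain in exact
```

Because a Bloom filter can return false positives, the exact set still has the final say. The filter only saves work on the common case of a domain that is not listed.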

Blocklist Sources

We aggregate from multiple sources:

Community blocklists (updated hourly)
Threat intelligence feeds
User-reported domains
ML-identified threats

Total: 5M+ domains across categories:

Malware & phishing
Advertising & tracking
Adult content
Social media (optional)
Gambling (optional)
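Aggregating several feeds into per-category sets mostly comes down to normalizing and deduplicating entries. The sketch below assumes each feed is a `(category, domains)` pair and that optional categories are filtered by a user-chosen set. Both the feed shape and the function names are hypothetical.

```python
from collections import defaultdict


def normalize(domain: str) -> str:
    """Lowercase and strip surrounding whitespace and the trailing root dot."""
    return domain.strip().lower().rstrip(".")


def merge_sources(sources, enabled_categories):
    """Merge (category, domains) feeds into one deduplicated set per category.

    Only categories the user has enabled are kept.
    """
    merged = defaultdict(set)
    for category, domains in sources:
        if category not in enabled_categories:
            continue
        for raw in domains:
            domain = normalize(raw)
            # Skip blank lines and comment lines that often appear in list files
            if domain and not domain.startswith("#"):
                merged[category].add(domain)
    return dict(merged)
```

For example, a user who leaves gambling disabled would call `merge_sources(feeds, {"malware", "ads"})` and never load that category's domains.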

Layer 3: Machine Learning Threat Detection

Static blocklists can't catch new threats. Our ML engine identifies suspicious domains in real time.

Feature Extraction

import numpy as np

# entropy, consonant_ratio, digit_ratio, extract_ngrams, ngram_frequency,
# domain_age_days, tld_risk_score, and malware_similarity_score are helper
# functions defined elsewhere.
def extract_features(domain: str) -> np.ndarray:
    features = []

    # Lexical features
    features.append(len(domain))
    features.append(domain.count('.'))
    features.append(entropy(domain))
    features.append(consonant_ratio(domain))
    features.append(digit_ratio(domain))

    # N-gram analysis
    bigrams = extract_ngrams(domain, 2)
    features.extend(ngram_frequency(bigrams))

    # Domain age (from WHOIS)
    features.append(domain_age_days(domain))

    # TLD risk score
    features.append(tld_risk_score(domain))

    # Similarity to known malicious patterns
    features.append(malware_similarity_score(domain))

    return np.array(features, dtype=float)
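The lexical helpers are not shown in the post. The sketch below gives one plausible implementation of `entropy`, `consonant_ratio`, and `digit_ratio` using only the standard library. The exact definitions (Shannon entropy in bits per character, ratios over letters or over all characters) are assumptions.

```python
import math
from collections import Counter

VOWELS = set("aeiou")


def entropy(domain: str) -> float:
    """Shannon entropy of the character distribution, in bits per character."""
    if not domain:
        return 0.0
    counts = Counter(domain)
    total = len(domain)
    return -sum((n / total) * math.log2(n / total) for n in counts.values())


def consonant_ratio(domain: str) -> float:
    """Fraction of alphabetic characters that are not vowels."""
    letters = [c for c in domain.lower() if c.isalpha()]
    if not letters:
        return 0.0
    return sum(c not in VOWELS for c in letters) / len(letters)


def digit_ratio(domain: str) -> float:
    """Fraction of all characters that are digits."""
    if not domain:
        return 0.0
    return sum(c.isdigit() for c in domain) / len(domain)
```

A string with one repeated character such as `"aaaa"` has an entropy of 0 bits, while four distinct characters such as `"abcd"` give 2 bits. Algorithmically generated names tend to sit toward the high end of that range, which is why entropy is a useful signal.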

Model Architecture

We use an ensemble of models:

**Random Forest**: Fast, interpretable
**XGBoost**: High accuracy
**Neural Network**: Complex pattern detection

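import numpy as np
from sklearn.ensemble import RandomForestClassifier
from tensorflow.keras.models import load_model
from xgboost import XGBClassifier

class ThreatEnsemble:
    def __init__(self):
        # rf and xgb must be fitted (or loaded from disk) before predict() is called
        self.rf = RandomForestClassifier(n_estimators=100)
        self.xgb = XGBClassifier()
        self.nn = load_model('threat_nn.h5')

    def predict(self, features: np.ndarray) -> float:
        # All three libraries expect a 2-D batch of shape (1, n_features)
        batch = features.reshape(1, -1)

        # Get the malicious-class probability from each model
        rf_pred = self.rf.predict_proba(batch)[0][1]
        xgb_pred = self.xgb.predict_proba(batch)[0][1]
        nn_pred = self.nn.predict(batch)[0][0]

        # Weighted ensemble
        return float(0.3 * rf_pred + 0.4 * xgb_pred + 0.3 * nn_pred)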
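A quick worked example shows how the weighting behaves. The weights below are the ones from the ensemble, while the per-model probabilities are made-up inputs and `ensemble_score` is a hypothetical standalone version of the final line of `predict`.

```python
WEIGHTS = {"rf": 0.3, "xgb": 0.4, "nn": 0.3}  # weights from the ensemble above

# The weights form a convex combination, so the result stays within [0, 1].
assert abs(sum(WEIGHTS.values()) - 1.0) < 1e-9


def ensemble_score(preds: dict) -> float:
    """Weighted average of per-model malicious-class probabilities."""
    return sum(WEIGHTS[name] * preds[name] for name in WEIGHTS)


# Illustrative inputs only:
# 0.3 * 0.9 + 0.4 * 0.7 + 0.3 * 0.5 = 0.27 + 0.28 + 0.15 = 0.70
score = ensemble_score({"rf": 0.9, "xgb": 0.7, "nn": 0.5})
```

One consequence of this scheme is that no single model can push the score past its own weight. If only XGBoost is fully confident and the other two return 0, the combined score is 0.4.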

Training Data

Our models train on:

10M+ known malicious domains
50M+ legitimate domains
Daily updates with new threats
Feedback from user reports

Real-Time Inference

ML inference must not slow down DNS:

// Cached ML scoring: repeat domains skip feature extraction and inference
async fn score_domain(domain: &str) -> f32 {
    // Check cache first
    if let Some(score) = ML_CACHE.get(domain) {
        return score;
    }

    // Extract features (fast)
    let features = extract_features(domain);

    // Run inference (uses ONNX runtime for speed)
    let score = ML_MODEL.run(&features);

    // Cache result
    ML_CACHE.insert(domain, score);

    score
}

Average inference time: <1ms
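The cache is what keeps repeat lookups cheap, so its eviction policy matters. The post does not say how `ML_CACHE` is bounded. The sketch below assumes an in-memory LRU cache with a per-entry time-to-live, and all names and default sizes in it are hypothetical.

```python
import time
from collections import OrderedDict


class ScoreCache:
    """In-memory LRU cache with a per-entry time-to-live for threat scores."""

    def __init__(self, max_entries=100_000, ttl_seconds=3600.0, clock=time.monotonic):
        self.max_entries = max_entries
        self.ttl_seconds = ttl_seconds
        self.clock = clock
        self._entries = OrderedDict()  # domain -> (score, expiry time)

    def get(self, domain: str):
        entry = self._entries.get(domain)
        if entry is None:
            return None
        score, expires_at = entry
        if self.clock() >= expires_at:
            del self._entries[domain]  # stale: force a fresh inference
            return None
        self._entries.move_to_end(domain)  # mark as most recently used
        return score

    def insert(self, domain: str, score: float) -> None:
        self._entries[domain] = (score, self.clock() + self.ttl_seconds)
        self._entries.move_to_end(domain)
        while len(self._entries) > self.max_entries:
            self._entries.popitem(last=False)  # evict the least recently used


def score_domain(domain: str, cache: ScoreCache, model) -> float:
    """Return a cached score if present, otherwise run the model once."""
    cached = cache.get(domain)
    if cached is not None:
        return cached
    score = model(domain)
    cache.insert(domain, score)
    return score
```

The time-to-live matters because a domain's risk can change after retraining or new threat reports, and the size bound keeps memory use predictable under a flood of unique names.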

Performance Results

Latency

Throughput

Accuracy

Privacy by Design

What We Log

Aggregate query counts (no domains)
Blocked category statistics
Performance metrics

What We Don't Log

Individual domains queried
IP addresses
User identifiers
Query timestamps

Zero-Knowledge Architecture

User Query → Shields AI → Response
                ↓
         Statistics only:
         - "Blocked 47 ads today"
         - "Protected from 3 threats"
         - No specific domains stored
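The statistics-only flow above can be sketched as a counter that never receives the domain in the first place. `AggregateStats` and its method names are hypothetical, chosen to show the shape of the idea rather than the real implementation.

```python
from collections import Counter


class AggregateStats:
    """Counts queries and blocks per category; stores no domains, IPs, or timestamps."""

    def __init__(self):
        self.total_queries = 0
        self.blocked_by_category = Counter()

    def record(self, blocked_category=None) -> None:
        # The caller passes only a category label, never the domain itself.
        self.total_queries += 1
        if blocked_category is not None:
            self.blocked_by_category[blocked_category] += 1

    def summary(self) -> list:
        return [
            f"Blocked {count} {category} today"
            for category, count in sorted(self.blocked_by_category.items())
        ]
```

Keeping the domain out of the function signature means the privacy property is enforced by the interface itself, not by a promise to avoid writing a field that is already in hand.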

Deployment Options

Cloud (Default)

Managed infrastructure
Global edge network
Automatic updates
No maintenance required

Self-Hosted

Full source code available
Docker/Kubernetes deployment
Your infrastructure, your rules
Same features, local control

Hybrid

Cloud for threat intelligence
Local for query processing
Best of both worlds

Future Roadmap

Short-term

IPv6 optimization
QUIC/HTTP3 support
Mobile apps

Medium-term

Browser extensions
Router integrations
Threat sharing network

Long-term

Federated threat detection
Decentralized blocklists
Hardware appliances

Conclusion

Building DNS security that's both fast and intelligent requires careful engineering at every layer. By combining eBPF for kernel-level processing, Rust for safe and fast userspace code, and machine learning for threat detection, we've created a system that protects users without compromising performance or privacy.

The result: sub-millisecond DNS security that catches 99%+ of threats while maintaining complete privacy.

*Ready to protect your network? Check out Shields AI and get started for free.*