When Tailscale Meets Alibaba Cloud: Why DNS Stops Working and How to Fix It


One afternoon, our small dev-ops team noticed that a production server on Alibaba Cloud ECS could no longer reach the public Internet—yet we could still SSH into it through Tailscale.
A quick run-through of the usual suspects—routing tables, security-group rules, even a reboot—did nothing.
After two hours of packet tracing, log spelunking, and mild panic, we found a surprisingly simple root cause: the Alibaba Cloud DNS resolvers sit inside 100.64.0.0/10, the same address range that Tailscale drops by default whenever packets from it arrive on any interface other than its own.

Below you will find a complete, step-by-step account of what went wrong, why it happens, and five tested solutions, each with its own trade-offs, that let you keep both Tailscale and Alibaba Cloud DNS working together.


Table of contents

  1. Symptoms: a server that loses the Internet
  2. First clues: DNS, not routing
  3. Deep dive: Tailscale’s hidden firewall rule
  4. The standards clash: RFC 6598 vs Alibaba Cloud
  5. Five practical fixes—ranked by risk
  6. How to test each fix in under 60 seconds
  7. Key takeaways

1. Symptoms: a server that loses the Internet

What we observed

  • SSH still works—but only via the Tailscale IP.
  • apt update, curl, git fetch, and any outbound HTTPS calls time out.
  • Rebooting the ECS instance or restarting Tailscale brings connectivity back for roughly three minutes, then it dies again.
Figure: a terminal showing “Temporary failure in name resolution”.

2. First clues: DNS, not routing

2.1 Ping proves IP works

$ ping 8.8.8.8
64 bytes from 8.8.8.8: icmp_seq=1 ttl=115 time=3.05 ms

The server can reach a public IP address directly, so basic routing and outbound connectivity are fine.

2.2 DNS lookup fails

Alibaba Cloud’s VPC uses 100.100.2.136 and 100.100.2.138 as the default internal DNS resolvers.

$ dig @100.100.2.136 example.com
;; connection timed out; no servers could be reached

3. Deep dive: Tailscale’s hidden firewall rule

3.1 Logging every dropped packet

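We inserted a LOG rule at the top of the INPUT chain so that every inbound packet is logged before any other rule can drop it (remove the rule once you are done, because it logs all inbound traffic):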

sudo iptables -I INPUT 1 -j LOG --log-prefix "PKT_TRACE: "

Within seconds the log showed:

PKT_TRACE: IN=eth0 OUT= MAC=... SRC=100.100.2.136 DST=172.31.x.x PROTO=UDP SPT=53
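An unscoped LOG rule produces a lot of lines on a busy host, so it can help to tally them by source address before reading them one by one. The following is a minimal sketch, assuming the log lines carry a SRC= field as in the excerpt above; the function name count_trace_sources and the sample DST address are made up for illustration.

```shell
#!/bin/sh
# Tally PKT_TRACE log lines by source address.
# Assumes each traced line contains a SRC=<address> field (kernel LOG format).
count_trace_sources() {
  awk '/PKT_TRACE: / {
         for (i = 1; i <= NF; i++)
           if ($i ~ /^SRC=/) seen[substr($i, 5)]++
       }
       END { for (src in seen) print seen[src], src }' | sort -rn
}

# Demo on two made-up sample lines. On a real host, pipe in the kernel log
# instead, for example: sudo dmesg | count_trace_sources
printf '%s\n' \
  'PKT_TRACE: IN=eth0 OUT= SRC=100.100.2.136 DST=172.31.0.10 PROTO=UDP' \
  'PKT_TRACE: IN=eth0 OUT= SRC=100.100.2.136 DST=172.31.0.10 PROTO=UDP' |
  count_trace_sources

# When finished, delete the LOG rule with the same match that inserted it:
#   sudo iptables -D INPUT -j LOG --log-prefix "PKT_TRACE: "
```

A source that shows up in the trace but never gets an answer back to the application is a good candidate for a rule that drops it further down the chain.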

3.2 The rule that blocks Alibaba DNS

$ sudo iptables -S | grep 100.64
-A ts-input -s 100.64.0.0/10 ! -i tailscale0 -j DROP

Tailscale inserts this rule automatically.
It means:

“Drop every packet whose source address is inside the Carrier-Grade NAT range (100.64.0.0/10) unless it arrived on the Tailscale interface.”

Because Alibaba’s DNS resolvers also live in 100.64.0.0/10, and their replies arrive on eth0 rather than tailscale0, the reply packets from 100.100.2.136 and 100.100.2.138 are silently discarded.
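The overlap is easy to verify with a little bit arithmetic: a /10 prefix fixes the top 10 bits of the address, so 100.64.0.0/10 spans 100.64.0.0 through 100.127.255.255. The sketch below checks this for the two resolver addresses; the helper names ip_to_int and in_cidr are illustrative, not part of any tool.

```shell
#!/bin/sh
# Convert a dotted-quad IPv4 address to a 32-bit integer.
ip_to_int() {
  # Split on dots into positional parameters, then combine the four octets.
  local IFS=.
  set -- $1
  echo $(( ($1 << 24) | ($2 << 16) | ($3 << 8) | $4 ))
}

# Return 0 if address $1 falls inside CIDR block $2 (for example 100.64.0.0/10).
in_cidr() {
  local ip net bits mask
  ip=$(ip_to_int "$1")
  net=$(ip_to_int "${2%/*}")
  bits=${2#*/}
  # Build a mask with the top $bits bits set, truncated to 32 bits.
  mask=$(( (0xFFFFFFFF << (32 - bits)) & 0xFFFFFFFF ))
  [ $(( ip & mask )) -eq $(( net & mask )) ]
}

for addr in 100.100.2.136 100.100.2.138 8.8.8.8; do
  if in_cidr "$addr" 100.64.0.0/10; then
    echo "$addr is inside 100.64.0.0/10"
  else
    echo "$addr is outside 100.64.0.0/10"
  fi
done
```

Both Alibaba resolvers land inside the block, while 8.8.8.8 does not, which matches the behavior seen in section 2: ping to 8.8.8.8 works, DNS replies from 100.100.2.136 vanish.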


4. The standards clash: RFC 6598 vs Alibaba Cloud

4.1 What RFC 6598 says

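RFC 6598 reserves 100.64.0.0/10 as shared address space for Carrier-Grade NAT (CGNAT). Addresses in this range are not globally routable and should never appear on the public Internet.
Tailscale assigns its node addresses from this range and assumes that legitimate traffic from 100.64.0.0/10 only arrives over the Tailscale interface, which is a reasonable anti-spoofing measure.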

4.2 Why Alibaba Cloud uses the same range

  • Not routable on the Internet—so packets never leak.
  • Does not overlap with classic private ranges (10.0.0.0/8, 172.16.0.0/12, 192.168.0.0/16).
  • Keeps DNS close to the hypervisor, reducing latency.

The result: two perfectly valid decisions that collide on the same machine.


5. Five practical fixes—ranked by risk

# | Fix | One-line summary | Pros | Cons | When to use
--|-----|------------------|------|------|------------
1 | Delete the rule | Manually remove the DROP rule | Immediate relief | Rule returns on every Tailscale restart | Emergency debugging
2 | Whitelist DNS IPs | Insert two ACCEPT rules above the DROP | Transparent to apps | Rule order may shift after updates | Single server, rare reboots
3 | Automated watchdog | Cron or systemd keeps the whitelist alive | Hands-off | Extra moving part | Fleet of servers
4 | Switch to public DNS | Point /etc/resolv.conf at 8.8.8.8 or 1.1.1.1 | Zero iptables changes | Breaks Alibaba internal domains (OSS, RDS) | No internal cloud services
5 | Disable Tailscale firewall | tailscale up --netfilter-mode=off | Removes every Tailscale rule | Loses subnet routing, exit-node, ACLs | Tailscale for point-to-point only

5.1 Fix 1: Delete the rule (one-off)

sudo iptables -D ts-input -s 100.64.0.0/10 ! -i tailscale0 -j DROP

Test:

dig @100.100.2.136 example.com

Caveat: Every time Tailscale restarts (systemctl restart tailscaled or reboot), the rule is recreated.


5.2 Fix 2: Whitelist Alibaba DNS IPs

Add two ACCEPT rules before the Tailscale DROP:

sudo iptables -I ts-input 1 -s 100.100.2.136/32 -j ACCEPT
sudo iptables -I ts-input 1 -s 100.100.2.138/32 -j ACCEPT

Save the rules so they survive reboots:

# Ubuntu / Debian
sudo apt install iptables-persistent
sudo netfilter-persistent save

Edge case: tailscaled rebuilds its ts-* chains when it restarts, and future Tailscale updates may reorder them, so the ACCEPT rules can disappear or end up below the DROP even after you have saved them.
Mitigation: combine with Fix 3.
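To spot that edge case, it is not enough to know that an ACCEPT rule exists; it has to sit above the DROP. The sketch below reads the output of iptables -S ts-input on stdin and reports whether that ordering holds. The function name accept_precedes_drop is hypothetical, and the sample chain is built from the rules shown in this article rather than captured from a live host.

```shell
#!/bin/sh
# Reads `iptables -S ts-input` output on stdin.
# Exits 0 if an ACCEPT rule for source $1 appears before the CGNAT DROP rule
# (or if there is no such DROP rule at all), non-zero otherwise.
accept_precedes_drop() {
  local ip=$1
  awk -v accept="-s ${ip}/32 -j ACCEPT" '
    index($0, "-s 100.64.0.0/10") && /-j DROP$/ && !drop { drop = NR }
    index($0, accept) && !acc { acc = NR }
    END { exit !(acc && (!drop || acc < drop)) }
  '
}

# Demo on a sample chain. On a real host, run:
#   sudo iptables -S ts-input | accept_precedes_drop 100.100.2.136
sample='-A ts-input -s 100.100.2.136/32 -j ACCEPT
-A ts-input -s 100.64.0.0/10 ! -i tailscale0 -j DROP'

if printf '%s\n' "$sample" | accept_precedes_drop 100.100.2.136; then
  echo "ACCEPT for 100.100.2.136 is above the DROP rule"
else
  echo "ACCEPT for 100.100.2.136 is missing or below the DROP rule"
fi
```

A non-zero exit status is the signal to re-insert the rule at position 1, which is what the watchdog in Fix 3 automates.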


5.3 Fix 3: Automated watchdog script

Create /usr/local/bin/fix-ts-dns.sh:

#!/bin/bash
# Re-insert the ACCEPT rules for the Alibaba Cloud resolvers whenever they
# are missing from Tailscale's ts-input chain. Must run as root.
DNS_IPS=(100.100.2.136 100.100.2.138)

while true; do
  for ip in "${DNS_IPS[@]}"; do
    # -C checks whether the rule exists; -w waits for the xtables lock.
    if ! iptables -w -C ts-input -s "$ip/32" -j ACCEPT 2>/dev/null; then
      iptables -w -I ts-input 1 -s "$ip/32" -j ACCEPT
    fi
  done
  sleep 30
done

Make it executable:

chmod +x /usr/local/bin/fix-ts-dns.sh

Wrap it in a systemd service so it starts after tailscaled:

# /etc/systemd/system/fix-ts-dns.service
[Unit]
Description=Keep Alibaba DNS whitelisted in Tailscale chain
After=tailscaled.service

[Service]
Type=simple
ExecStart=/usr/local/bin/fix-ts-dns.sh
Restart=always

[Install]
WantedBy=multi-user.target

Enable and start:

sudo systemctl daemon-reload
sudo systemctl enable --now fix-ts-dns.service

5.4 Fix 4: Use public DNS resolvers

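Replace the existing /etc/resolv.conf. On many images this file is generated by DHCP, cloud-init, or systemd-resolved and may be overwritten again, so check it after the next reboot:

# Remove the old file (or symlink) first so tee creates a regular file
sudo rm -f /etc/resolv.conf
echo "nameserver 8.8.8.8" | sudo tee /etc/resolv.conf
echo "nameserver 1.1.1.1" | sudo tee -a /etc/resolv.conf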

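If you use NetworkManager, set DNS on the connection instead (eth0 here is the connection name, which may differ on your system) and tell it to ignore the resolvers handed out by DHCP:

nmcli connection modify eth0 ipv4.dns "8.8.8.8 1.1.1.1" ipv4.ignore-auto-dns yes
nmcli connection up eth0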

Trade-off:

  • Alibaba internal endpoints (like bucket-name.oss-internal.aliyuncs.com) may fail to resolve, or may resolve to public IPs, which can incur bandwidth charges and add latency.
  • If your workload never touches Alibaba internal services, this is the simplest permanent fix.
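Because Fix 4 overwrites a system file, it is worth keeping a copy of the old resolver list so you can roll back if an internal endpoint turns out to matter. The sketch below wraps the same steps in a function that takes the target path as an argument; the name write_resolv_conf and the .bak suffix are assumptions for illustration, and the demo writes to a temporary directory rather than /etc.

```shell
#!/bin/sh
# Write a resolv.conf listing the given resolvers, keeping a backup of the old file.
# $1 = target path (normally /etc/resolv.conf), remaining args = resolver IPs.
write_resolv_conf() {
  if [ "$#" -lt 2 ]; then
    echo "usage: write_resolv_conf TARGET RESOLVER..." >&2
    return 2
  fi
  local target=$1
  shift
  if [ -e "$target" ]; then
    # cp follows a symlink here, so the backup holds the real content.
    cp "$target" "$target.bak"
  fi
  # Remove the old file or symlink so the redirect creates a regular file.
  rm -f "$target"
  printf 'nameserver %s\n' "$@" > "$target"
}

# Demo in a scratch directory so nothing on the system is touched.
demo_dir=$(mktemp -d)
printf 'nameserver 100.100.2.136\n' > "$demo_dir/resolv.conf"
write_resolv_conf "$demo_dir/resolv.conf" 8.8.8.8 1.1.1.1
echo "new:"; cat "$demo_dir/resolv.conf"
echo "backup:"; cat "$demo_dir/resolv.conf.bak"
rm -rf "$demo_dir"
```

On a real host the function would be called as root with /etc/resolv.conf as the target; restoring is then a matter of copying the .bak file back.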

5.5 Fix 5: Disable Tailscale’s netfilter mode

sudo tailscale up --netfilter-mode=off

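This tells Tailscale not to touch iptables at all. Note that tailscale up expects you to repeat any non-default flags you set earlier and refuses to run if they are missing.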
Verify:

sudo iptables -S | grep ts-
# Should print nothing

Consequences:

Feature | Status
--------|-------
Tailscale subnet routing | Broken
Exit-node capability | Broken
ACL enforcement | Broken
Point-to-point WireGuard tunnel | Still works

Use this only if you run Tailscale purely for machine-to-machine access and do not need subnet relay or exit-node features.


6. How to test each fix in under 60 seconds

  1. Check DNS

    time dig @100.100.2.136 example.com +short

    A non-empty answer list means DNS is alive.

  2. Check outbound HTTPS

    curl -s -o /dev/null -w "%{http_code}\n" https://example.com

    Should return 200.

  3. Check Tailscale health

    tailscale status
    tailscale ping some-other-node

  4. Reboot test

    sudo reboot

    After the machine comes back, rerun steps 1-3 to ensure the fix is persistent.
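Steps 1 to 3 can be bundled into one script so the same checks run identically before and after the reboot. This is a rough sketch rather than a finished tool: the helper names check, dns_answers, and run_checks are invented here, the timeouts are arbitrary, and it assumes dig, curl, and tailscale are on the PATH.

```shell
#!/bin/sh
# Run a command quietly and print PASS or FAIL with a label.
check() {
  local label=$1
  shift
  if "$@" >/dev/null 2>&1; then
    echo "PASS  $label"
  else
    echo "FAIL  $label"
    return 1
  fi
}

# True only if dig returns at least one answer line from the given resolver.
dns_answers() {
  [ -n "$(dig +short +time=2 +tries=1 "@$1" example.com)" ]
}

run_checks() {
  local failed=0
  check "Alibaba DNS 100.100.2.136" dns_answers 100.100.2.136 || failed=1
  check "Alibaba DNS 100.100.2.138" dns_answers 100.100.2.138 || failed=1
  check "Outbound HTTPS" curl -fsS --max-time 10 -o /dev/null https://example.com || failed=1
  check "Tailscale status" tailscale status || failed=1
  return "$failed"
}

# Only touch the network when asked to, e.g. `sh smoke-test.sh --run`.
if [ "${1:-}" = "--run" ]; then
  run_checks
fi
```

Saved as, say, smoke-test.sh (a hypothetical file name), the script exits non-zero when any check fails, which makes it easy to call from a post-reboot hook or a monitoring job.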


7. Key takeaways

  • Root cause is not a bug in either Tailscale or Alibaba Cloud—both follow their own correct assumptions.
  • Fastest relief: whitelist the two Alibaba DNS IPs in the Tailscale ts-input chain.
  • Cleanest long-term: if your project does not rely on Alibaba internal endpoints, switch to public DNS and forget the conflict ever existed.
  • Automate the workaround with a tiny systemd service if you manage more than a handful of hosts.

If you try one of the fixes above—or invent a better one—let the community know.
Sharing the exact commands you used helps the next engineer spend less time debugging and more time building.