OPNsense at Home, Part 3: Connectivity


This is part 3 of a 4-part series. Part 1: The Migration covers the hardware and initial setup. Part 2: Securing the Network covers DNS, ad blocking, IDS, and hardening. Part 4: Observability covers the logging and monitoring stack.

Dual WAN failover

The fiber connection runs directly from the ONT to the OPNsense WAN port with no intermediate router. My ISP requires VLAN tagging on the WAN interface, so the setup is to create a VLAN sub-interface on the physical WAN port with the ISP-specific tag, then run DHCP on that sub-interface. No PPPoE.

The 5G gateway connects via Ethernet to the second WAN port. That puts this link behind double NAT, which is inelegant but fine for a failover link.

The gateway monitoring trap. The default monitor IP for each gateway is the gateway device itself. For the 5G link, this showed 0.8ms round-trip time, because it was pinging a device two feet away, not the internet. The gateway looked perpetually healthy even when the 5G service was down.

Fix: change the monitor IP to an actual internet target. Each gateway needs a unique monitor IP. If you reuse the same IP, the duplicate monitor route is silently ignored and health checks for the second gateway stop working. I use 1.1.1.1 for one and 8.8.8.8 for the other.
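Both rules (monitor something on the internet, and never reuse a monitor IP) are easy to check mechanically. The following is a minimal sketch, not anything OPNsense provides: the function name, the dictionary layout, and the gateway addresses are all hypothetical and only illustrate the checks.

```python
import ipaddress


def check_monitor_ips(gateways):
    """Return a list of problems found in a gateway -> monitor IP mapping.

    `gateways` maps a gateway name to a dict with two hypothetical keys:
    'address' (the gateway device itself) and 'monitor' (the IP being pinged).
    """
    problems = []
    seen = {}  # monitor IP -> first gateway that claimed it
    for name, gw in gateways.items():
        address = ipaddress.ip_address(gw["address"])
        monitor = ipaddress.ip_address(gw["monitor"])
        if monitor == address:
            problems.append(f"{name}: monitor IP is the gateway itself")
        elif monitor.is_private:
            problems.append(f"{name}: monitor IP {monitor} is not an internet address")
        if monitor in seen:
            problems.append(f"{name}: monitor IP {monitor} already used by {seen[monitor]}")
        else:
            seen[monitor] = name
    return problems


# Example addresses are placeholders, not taken from a real setup.
gateways = {
    "FIBER": {"address": "203.0.113.1", "monitor": "1.1.1.1"},
    "WAN_5G": {"address": "192.168.8.1", "monitor": "8.8.8.8"},
}
print(check_monitor_ips(gateways))
```

An empty list means each gateway is monitored against a distinct, non-private target.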

The step everyone misses. After creating your gateway group (System > Gateways > Group, fiber as Tier 1, 5G as Tier 2, trigger on packet loss or high latency), you have to edit your LAN firewall rule, go into Advanced settings, and set the Gateway to your failover group. Without that step, the gateway group does absolutely nothing. Traffic ignores it entirely. I spent an embarrassing amount of time on this.
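Conceptually, a tiered group picks the healthy gateway with the lowest tier number. The sketch below is a simplified model of that idea only; it is not OPNsense code, the names are hypothetical, and it ignores load balancing between gateways that share a tier.

```python
def pick_gateway(group, healthy):
    """Return the gateway that should carry traffic, or None if all are down.

    group:   list of (name, tier) pairs, lower tier = higher priority
    healthy: set of gateway names whose monitor currently reports them up
    """
    candidates = [(tier, name) for name, tier in group if name in healthy]
    if not candidates:
        return None
    return min(candidates)[1]


group = [("FIBER", 1), ("WAN_5G", 2)]
print(pick_gateway(group, {"FIBER", "WAN_5G"}))  # fiber wins while it is up
print(pick_gateway(group, {"WAN_5G"}))           # falls back to 5G
```

None of this selection applies to traffic unless the firewall rule actually points at the group, which is why the rule edit matters.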

WireGuard VPN

OPNsense has included WireGuard natively since 24.1, so no plugin is needed. Earlier releases shipped it as a separately installed plugin. The setup has a few steps, one of which is critical and easy to skip in the UI.

One thing before the steps: do not use the default WireGuard port (51820). It is the first thing scanners try, and there is no reason to make it easy. Pick something arbitrary in the high range. Port 55555 is used in the examples below; pick your own.

  1. Create an Instance (the server): VPN > WireGuard > Instances > Add. Set a listen port (55555 or whatever you chose), a tunnel address that does not overlap your LAN subnets (172.16.99.1/24 works), and use the gear icon to auto-generate the keypair.

  2. Create a Peer for each client device. Generate keys on the client first using the WireGuard app. Paste the client's public key into OPNsense, set the allowed IP to a unique address in the tunnel subnet (172.16.99.2/32, 172.16.99.3/32, etc.).

  3. Go back to the Instance and select your peers. The UI does not do this automatically, and if you skip it, clients will authenticate but get no traffic.

  4. Assign the WireGuard interface (Interfaces > Assignments > wg0) and enable it. Set IPv4 to None. WireGuard manages its own addressing.

  5. Firewall rules: a WAN rule passing UDP on port 55555, and a rule on the WireGuard interface passing traffic from the WireGuard net to your LAN subnets.
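The addressing rules in steps 1 and 2 (tunnel subnet must not overlap the LAN, one unique /32 per peer) can be sanity-checked before touching the UI. This is a small sketch using Python's standard ipaddress module; the function name and peer names are hypothetical, and the LAN subnet 192.168.20.0/24 is an assumption inferred from the DNS address used later in the client config.

```python
import ipaddress


def plan_peers(tunnel_cidr, lan_cidrs, peer_names):
    """Assign each peer a unique /32 from the tunnel subnet.

    tunnel_cidr is the instance's tunnel address, e.g. "172.16.99.1/24".
    Raises ValueError if the tunnel subnet overlaps any LAN subnet.
    """
    tunnel = ipaddress.ip_interface(tunnel_cidr)
    for lan in lan_cidrs:
        if tunnel.network.overlaps(ipaddress.ip_network(lan)):
            raise ValueError(f"tunnel {tunnel.network} overlaps LAN {lan}")
    # Skip the address the OPNsense instance itself uses.
    free_hosts = [h for h in tunnel.network.hosts() if h != tunnel.ip]
    if len(peer_names) > len(free_hosts):
        raise ValueError("not enough addresses in the tunnel subnet")
    return {name: f"{addr}/32" for name, addr in zip(peer_names, free_hosts)}


print(plan_peers("172.16.99.1/24", ["192.168.20.0/24"], ["phone", "laptop"]))
```

The resulting values are what goes into each peer's allowed IP field.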

Client config for full tunnel (all traffic through VPN):

[Interface]
PrivateKey = <client private key>
Address = 172.16.99.2/32
DNS = 192.168.20.1

[Peer]
PublicKey = <OPNsense public key>
Endpoint = <your public IP or DDNS hostname>:55555
AllowedIPs = 0.0.0.0/0

For split tunnel (only home LAN traffic through VPN), replace AllowedIPs = 0.0.0.0/0 with your LAN subnet ranges.
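The only difference between the two modes is the AllowedIPs line, so one template covers both. The sketch below renders the client config shown above; the function name is hypothetical, the keys and endpoint are placeholders, and the LAN subnets passed for the split-tunnel case are assumptions to replace with your own ranges.

```python
FULL_TUNNEL = ["0.0.0.0/0"]


def client_config(private_key, address, dns, server_public_key, endpoint, allowed_ips):
    """Render a WireGuard client config; allowed_ips is a list of CIDR strings."""
    lines = [
        "[Interface]",
        f"PrivateKey = {private_key}",
        f"Address = {address}",
        f"DNS = {dns}",
        "",
        "[Peer]",
        f"PublicKey = {server_public_key}",
        f"Endpoint = {endpoint}",
        f"AllowedIPs = {', '.join(allowed_ips)}",
    ]
    return "\n".join(lines) + "\n"


# Split tunnel: only the (assumed) home LAN ranges go through the VPN.
print(client_config(
    "<client private key>",
    "172.16.99.2/32",
    "192.168.20.1",
    "<OPNsense public key>",
    "<your public IP or DDNS hostname>:55555",
    ["192.168.20.0/24", "192.168.30.0/24"],
))
```

Passing FULL_TUNNEL instead of the LAN list reproduces the full-tunnel config.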

Fixing bufferbloat with FQ_CoDel

If you have ever noticed your connection feeling sluggish while someone else on the network is running a large download or upload, that is probably bufferbloat. The symptoms are vague: video calls stutter, SSH sessions lag, web pages take a beat longer to load. Speedtest results look fine because speedtests measure throughput, not latency under load. The problem is not bandwidth. It is what happens to your packets while they wait in line.

What bufferbloat actually is

Every network device between you and the internet has a buffer. Your ISP's modem, your router, switches along the path. When traffic exceeds the link capacity, packets queue up in these buffers. A small buffer is fine. The problem is that most consumer equipment ships with buffers that are far too large, sometimes holding hundreds of milliseconds of traffic. This is FIFO (first in, first out) buffering with no intelligence about what is in the queue.

The result: a large file transfer fills the buffer, and every other packet (your DNS lookup, your video call audio, your SSH keystroke) has to wait behind all of it. Latency spikes from 5ms to 200ms+ under load. The connection is not slow. It is full of packets waiting in a dumb queue.
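The arithmetic behind those numbers is simple: a full buffer adds a delay equal to its size divided by the rate at which the link drains it. A quick sketch follows; the buffer size and link speed are illustrative assumptions, not measurements from any particular modem.

```python
def queue_delay_ms(buffer_bytes, link_mbit_per_s):
    """Worst-case delay added by a full FIFO buffer draining at link speed."""
    bits = buffer_bytes * 8
    seconds = bits / (link_mbit_per_s * 1_000_000)
    return seconds * 1000


# A hypothetical 1 MB buffer in front of a 40 Mbit/s link:
print(round(queue_delay_ms(1_000_000, 40)))
```

The same buffer in front of a faster link drains sooner, which is why oversized buffers hurt most on the slowest hop.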

Why your ISP modem is the problem

Even if your router has smart queue management, it does not help if the bottleneck is downstream in a device you do not control. Your ISP's modem (or ONT, depending on your setup) has its own buffer. At full line rate, packets pile up there before they ever hit the internet. That buffer is a dumb FIFO with no queue discipline.

The fix is to make your router the intentional bottleneck. If you cap your router's throughput to slightly below your actual line rate, packets queue up inside your router instead of in the ISP modem. Your router can then apply smart queue management (FQ_CoDel) to decide which packets go first, while the modem's buffer stays nearly empty.

Without shaping, the flow looks like this:

traffic -> ISP modem buffer (dumb FIFO, grows huge) -> Internet
                    ^ bufferbloat happens here

With shaping:

traffic -> OPNsense pipe (FQ_CoDel, bounded) -> ISP modem (nearly empty) -> Internet
                    ^ OPNsense controls this queue
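A crude way to see why the cap moves the queue: under sustained overload, backlog builds at whichever hop first receives more than it can forward. The sketch below is a fluid-model simplification that ignores drops and TCP backing off; the function name and all rates are illustrative assumptions.

```python
def backlog_growth(offered, shaper_cap, line_rate):
    """Return (shaper_growth, modem_growth) in Mbit per second of overload.

    offered:    traffic arriving from the LAN, in Mbit/s
    shaper_cap: OPNsense pipe bandwidth in Mbit/s, or None for no shaping
    line_rate:  what the modem can actually forward, in Mbit/s
    """
    if shaper_cap is None:
        into_modem, shaper_growth = offered, 0
    else:
        into_modem = min(offered, shaper_cap)
        shaper_growth = max(0, offered - shaper_cap)
    modem_growth = max(0, into_modem - line_rate)
    return shaper_growth, modem_growth


print(backlog_growth(1200, None, 940))  # queue builds in the modem
print(backlog_growth(1200, 910, 940))   # queue builds in OPNsense instead
```

With the cap below the line rate, the modem never receives more than it can send, so the only queue left is the one FQ_CoDel manages.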

Why FQ_CoDel and not CAKE

CAKE (Common Applications Kept Enhanced) is the gold standard for bufferbloat mitigation on Linux. It handles shaping, flow isolation, and fairness in one package. But CAKE is CPU-intensive at gigabit speeds, and if you are already running Suricata IDS on the same box (see Part 2), adding CAKE risks making the CPU the bottleneck instead of the link.

FQ_CoDel (Flow Queue CoDel, often expanded as Fair Queuing Controlled Delay) is lighter. It does not do everything CAKE does, but it handles the two things that matter most: it isolates flows so one bulk transfer cannot starve interactive traffic, and it uses the CoDel algorithm to detect and drop packets from flows that are building up queues. On my 1Gbps symmetric connection with Suricata running simultaneously, FQ_CoDel achieved A+ bufferbloat test results without visible CPU impact.
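Flow isolation is the easier half to picture. The toy model below compares a single FIFO with one-packet-per-flow round robin. It is only a sketch of the idea: the real FQ_CoDel hashes flows into a fixed number of queues, schedules by bytes using the quantum, and runs CoDel on each queue, none of which is modeled here.

```python
from collections import deque


def fifo_order(packets):
    """A dumb FIFO sends packets in exactly the order they arrived."""
    return list(packets)


def round_robin_order(packets):
    """Serve one packet per flow per round; packets are (flow, seq) tuples."""
    queues = {}
    for flow, seq in packets:
        queues.setdefault(flow, deque()).append((flow, seq))
    order = []
    while queues:
        for flow in list(queues):
            order.append(queues[flow].popleft())
            if not queues[flow]:
                del queues[flow]
    return order


# 100 bulk-transfer packets arrive just ahead of a single SSH keystroke.
packets = [("bulk", i) for i in range(100)] + [("ssh", 0)]
print(fifo_order(packets).index(("ssh", 0)))         # packets sent before it in FIFO
print(round_robin_order(packets).index(("ssh", 0)))  # packets sent before it with per-flow queues
```

In the FIFO the keystroke waits behind the entire bulk backlog; with per-flow queues it goes out in the first round.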

The setup

This is three steps: create pipes, attach queues, write rules. The OPNsense shaper UI is under Firewall > Shaper. Enable Advanced Mode (toggle in the top left) before starting.

Step 1: Pipes. Under Firewall > Shaper > Pipes, create two pipes:

Download pipe:

  • Bandwidth: 850 Mbit/s (85% of 1000)
  • Scheduler Type: FQ_CoDel
  • CoDel ECN: enabled (useful on the download side, allows endpoints that support ECN to back off before drops are needed)
  • FQ-CoDel Limit: 1000
  • FQ-CoDel Quantum: leave empty (defaults to 1514, matching standard Ethernet MTU)
  • FQ-CoDel Flows: leave empty (defaults to 1024)

Upload pipe: same settings, 850 Mbit/s. ECN is optional on upload, less critical.

Step 2: Queues. Under Firewall > Shaper > Queues, create two queues:

  • Download queue: pipe = Download, weight = 100
  • Upload queue: pipe = Upload, weight = 100

Step 3: Rules. Under Firewall > Shaper > Rules, create two rules:

  • Download rule: Interface = WAN, Direction = in, Protocol/Source/Destination = any, Target = Download queue
  • Upload rule: Interface = WAN, Direction = out, Protocol/Source/Destination = any, Target = Upload queue

Click Apply after each step.

Tuning: from 85% to the real ceiling

85% is where you start, not where you stay. The goal is to find the highest bandwidth cap that still keeps latency at +0ms under full load. This takes a few iterations.

Here is what the tuning process looked like on my 1Gbps symmetric connection, testing with Waveform after each change:

Pipes (Down / Up, Mbit/s)   Download latency under load   Upload latency under load   Grade
850 / 850                   +0ms                          +0ms                        A+
900 / 900                   +0ms                          +0ms                        A+
925 / 925                   +0ms                          +0ms                        A+
950 / 950                   +3ms                          +0ms                        Fail
925 / 950                   flickering                    flickering                  Fail
910 / 935                   +0ms                          +0ms                        A+

At 950 Mbps download, the pipe ceiling is close enough to line rate that under full saturation, occasional bursts overflow into the modem buffer. The modem's FIFO has no CoDel. It just queues everything, and that is where the +3ms came from. Stepping back to 925/950 was not enough either. The latency was flickering at the boundary, sometimes +0ms, sometimes +1ms. Not stable enough to trust.

The final configuration landed at 910 down / 935 up. That retains about 91% of download and 93.5% of upload capacity with zero added latency under full load. The gap between the pipe cap and line rate is the intentional headroom that keeps the modem buffer empty and OPNsense in control of the queue at all times.

A few things I confirmed did not need changing from defaults:

  • FQ-CoDel Target (5ms): sojourn times were staying well under it
  • FQ-CoDel Interval (100ms): reacting fast enough for gigabit traffic
  • FQ-CoDel Flows (1024): sufficient flow isolation for a home network
  • FQ-CoDel Quantum (1514): matches standard Ethernet MTU
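For context on what Target and Interval do: as described in RFC 8289, once packets have been sitting in the queue longer than the target for a full interval, CoDel starts dropping, and each further drop is scheduled interval / sqrt(count) after the previous one. The sketch below shows only that spacing rule; the function name is hypothetical and this is not the dummynet implementation.

```python
import math


def codel_drop_gaps_ms(interval_ms=100.0, drops=4):
    """Gap before each successive drop while the queue stays above target."""
    return [interval_ms / math.sqrt(count) for count in range(1, drops + 1)]


print([round(gap, 1) for gap in codel_drop_gaps_ms()])
```

The gaps shrink as the drop count rises, so a flow that ignores the early signals gets pushed back progressively harder.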

The key insight from tuning is that download and upload can have different ceilings, even on a symmetric link. My upload tolerated a higher percentage (93.5%) than download (91%) before latency crept in. Tune each direction independently, especially if your ISP gives you asymmetric speeds.
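Those percentages are just the pipe cap divided by the nominal line rate. A one-function sketch makes the calculation explicit (the function name is hypothetical; the 1000 Mbit/s figure is the nominal rate of the 1Gbps connection described above):

```python
def retained_percent(cap_mbit, line_rate_mbit):
    """Share of the nominal line rate that the shaper cap keeps, in percent."""
    return 100 * cap_mbit / line_rate_mbit


print(retained_percent(910, 1000))  # download
print(retained_percent(935, 1000))  # upload
```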

The difference is immediately noticeable in daily use. Video calls stop dropping frames when someone starts a download. SSH sessions stay responsive. The throughput number on a speedtest might be slightly lower (you did cap it), but the connection feels faster because latency stays flat under load.

Next: Part 4: Observability covers syslog, NetFlow, and building a dashboard over everything.