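Preface

Why write this article? I have long been interested in soft-router systems built on Linux. I studied network engineering and system service deployment at school, so I am fairly comfortable with both Linux and networking (especially networking: I hold the Datacom HCIE certification). Searching for "soft router" mostly turns up articles about OpenWRT, iKuai, and RouterOS, but I have only spent some time with OpenWRT + iStoreOS.

iStoreOS is convenient in some respects (web-based configuration, for example), but after using it for a while it still did not feel right: it is bloated and turns simple tasks into complicated ones. On reflection, that is probably because OpenWRT was designed from the start for routers managed through a web UI and aims to be beginner-friendly. For me, many of those features are unnecessary, and I can implement most of what OpenWRT offers by hand. I also want to tune every service for performance as far as I reasonably can, so that the soft router runs fast and efficiently. For me that is a must.

For these reasons I settled on the lean Ubuntu Server 25.04, installed with the minimized option. To make sure the installation on the physical machine works the first time, every service is tested in a virtual machine first and only moved to the physical machine once it is confirmed to work. This keeps the physical system clean (I am rather fussy about that).

As an aside: at 22:00 on 5 September 2025 I bought a multi-port mini PC (N5105) on Xianyu to serve as the soft router. I had earlier bought the AMD model of the 天钡 (AOOSTAR) WTR PRO, which has been running 飞牛 (fnOS) ever since, but I had never built a soft router, so I spent a "fortune" on this one.

I am quite fond of it, haha. Passive cooling is too weak in summer, though, and it heats up quickly. I plan to add a quiet fan (like a PC case fan), which should be enough: the real problem is that there is no airflow to carry the heat away, and with airflow the temperature drops back to about room temperature within 5 minutes.

Configuration process

Every step below is tested in a virtual machine before being "moved" to the physical machine. The VM configuration is listed below; it was chosen to mirror the N5105 hardware.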

image

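1. System installation

Writing the system image to a USB drive, plugging it into the machine, and booting from it is skipped here; there are plenty of tutorials online if you are not familiar with it.

Power on and enter the system installer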

image

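Select English. The installer then offers to update itself. The first option, update to the new installer, requires the machine to be online already; otherwise choose the second option.

Then set the keyboard layout to English (US).

Select Ubuntu Server (minimized) for a minimal installation. The network interface screen appears next. The interface facing the public network has already obtained an address via DHCP and can reach the internet; the other three interfaces will be bridged later, so leave them unconfigured for now.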

image

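Skip the proxy and continue to the package update screen; if there is nothing to update, just select Done.

On the storage configuration screen, choose an LVM install to make later partition resizing easier; LVM encryption is not needed.

Accept the partition layout with Done. The device has only one disk and a soft router does not store much, so the whole disk can be used as the root partition.

Set the account and password, then choose to install OpenSSH Server for remote access later.

On the package selection screen, select nothing and continue: Docker will be needed later, but the system cannot reach overseas networks yet, so installing Docker from here may fail. The installation then starts; because this is a minimized install, it does not take long.

Remote connection

After the installation finishes and the machine reboots, OpenSSH Server is already installed, so you can connect from the Windows cmd prompt with the command below (provided your PC is on the same network as the WAN-side interface)

ssh username@interface-address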

image

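Next, set a root password with sudo, switch to the root user, and change the SSH configuration so that root can log in with a key

sudo passwd root                                   # enter the ubuntu user's password once, then the new root password twice
su root
vim /etc/ssh/sshd_config
# Change the following settings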
Port 22
SyslogFacility AUTH
LogLevel VERBOSE
PermitRootLogin without-password
LoginGraceTime 2m
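DenyUsers xxx                                      # deny login for user xxx
MaxAuthTries 3                                     # at most 3 authentication attempts
MaxSessions 2                                      # at most 2 sessions
PasswordAuthentication no                          # disable password login
PermitEmptyPasswords no                            # disallow empty passwords
UseDNS no                                          # skip reverse DNS lookups of the client, which speeds up SSH logins
PubkeyAuthentication yes                           # enable SSH public-key login
AuthorizedKeysFile .ssh/authorized_keys            # path of the user's public keys; for passwordless login from several devices, put each device's public key on its own line in authorized_keys, or list a further file such as .ssh/xxx_keys after this one
RSAAuthentication yes                              # allow RSA authentication (an SSH protocol 1 option that recent OpenSSH releases report as deprecated; it can be omitted)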

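Generate an SSH key in the Windows terminal

ssh-keygen -t rsa -b 4096 						# just press Enter at every prompt

This creates a private key file and a public key file (ending in .pub) under C:\users\username\.ssh. Open the .pub file in a text editor, copy its entire content into /root/.ssh/authorized_keys on the Ubuntu machine, and restart ssh. Logging in over SSH should then succeed without a password prompt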

systemctl restart ssh
systemctl enable ssh
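
The copy-and-paste step above can also be scripted once the .pub file has been transferred to the router. The sketch below is only an illustration: the function name install_pubkey and both arguments are hypothetical and not part of the original setup. It appends the key only if it is not present yet and tightens the permissions that sshd checks by default.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not from the original article): append a public key
# to <home>/.ssh/authorized_keys unless the same line is already there.
# $1 = path to the .pub file, $2 = home directory of the target user
install_pubkey() {
    local pubkey_file="$1" home_dir="$2"
    local ssh_dir="$home_dir/.ssh"
    local auth_file="$ssh_dir/authorized_keys"

    [ -s "$pubkey_file" ] || { echo "no public key in $pubkey_file" >&2; return 1; }

    mkdir -p "$ssh_dir"
    touch "$auth_file"
    # With the default StrictModes setting, sshd refuses key files that
    # other users could modify, so keep the permissions tight.
    chmod 700 "$ssh_dir"
    chmod 600 "$auth_file"

    local key
    key=$(head -n 1 "$pubkey_file")
    # -x matches the whole line, -F treats the key as a fixed string.
    if ! grep -qxF -- "$key" "$auth_file"; then
        printf '%s\n' "$key" >> "$auth_file"
    fi
}
```

For example, after copying the key to /tmp, install_pubkey /tmp/id_rsa.pub /root would register it for root; running it twice leaves a single entry.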

image

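2. Install common tools and set the system time zone

apt update && apt install iputils-ping traceroute unzip wget curl dnsutils iproute2 net-tools -y

Set the system time zone

timedatectl set-timezone Asia/Shanghai

Remove unneeded software

Even the minimized installation ships snapd, which can be removed

apt purge snapd apparmor unattended-upgrades cloud-init -y

Mark a few important packages as manually installed so that the autoremove below does not remove them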

apt-mark manual fdisk  hwctl  lvm2  mdadm  screen sudo sudo-rs
apt autoremove -y
apt clean all
reboot

3. Command-line completion

Install bash-completion

apt install bash-completion -y

Add the following snippet to /etc/bash.bashrc

if ! shopt -oq posix; then
  if [ -f /usr/share/bash-completion/bash_completion ]; then
    . /usr/share/bash-completion/bash_completion
  elif [ -f /etc/bash_completion ]; then
    . /etc/bash_completion
  fi
fi

Log out of the current terminal session and log back in for it to take effect

4. Enable kernel IP forwarding

vim /etc/sysctl.d/90-softrouting.conf
net.ipv4.ip_forward = 1
net.ipv6.conf.all.forwarding = 1
net.ipv4.tcp_congestion_control = bbr
sysctl -p /etc/sysctl.d/90-softrouting.conf
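
To confirm that the values really took effect, the file can be compared against the live kernel state. This is a minimal sketch under the assumption that every key maps to a file under /proc/sys by replacing dots with slashes (which does not hold for interface names that themselves contain dots); sysctl_path and check_sysctl_file are hypothetical helper names.

```shell
#!/usr/bin/env bash
# Map a sysctl key to its file under /proc/sys (dots become slashes).
sysctl_path() {
    printf '/proc/sys/%s\n' "$(printf '%s' "$1" | tr '.' '/')"
}

# Compare every "key = value" line of a sysctl.d file with the live value.
check_sysctl_file() {
    local conf="$1" key value path live
    while IFS='=' read -r key value; do
        key=$(printf '%s' "$key" | tr -d '[:space:]')
        value=$(printf '%s' "$value" | tr -d '[:space:]')
        # Skip blank lines and comment lines.
        case "$key" in ''|'#'*|';'*) continue ;; esac
        path=$(sysctl_path "$key")
        if [ -r "$path" ]; then
            live=$(tr -d '[:space:]' < "$path")
            if [ "$live" = "$value" ]; then
                echo "OK   $key = $live"
            else
                echo "DIFF $key: want $value, live $live"
            fi
        else
            echo "MISS $key (not present on this kernel)"
        fi
    done < "$conf"
}
```

Running check_sysctl_file /etc/sysctl.d/90-softrouting.conf should print three OK lines; a DIFF on net.ipv4.tcp_congestion_control usually means the bbr module is not available on the running kernel.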

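5. Install NetworkManager to manage the network

apt install network-manager -y

Ubuntu Server currently manages the network with netplan by default, which is not very convenient, so hand network management over to NetworkManager

vim /etc/NetworkManager/NetworkManager.conf
# Change the following
[ifupdown]
managed=true
vim /etc/netplan/00-installer-config.yaml                # the file name differs between systems; here it is 00-installer-config.yaml
# Add the line below, indented to the same level as version: 2
renderer: NetworkManager

Apply netplan and restart NetworkManager (the network may drop, so be prepared)

netplan apply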
systemctl restart NetworkManager

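Then check what nmcli has taken over

nmcli con show

Output similar to the following means the takeover is complete

image

Rename each connection so that its connection name matches the interface name. For convenience, NetworkManager's text user interface makes this quick

nmtui

The result looks like this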

image

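6. Create the bridge interface

On a modern optical modem or soft router, the LAN interface is essentially a virtual bridge that behaves like a switch: binding physical NICs to the bridge makes them act as layer 2 switch ports (by default each NIC is a layer 3 interface, and every interface must belong to its own IP subnet)

Create the virtual interface lan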

nmcli con add \
 con-name lan \
 type bridge \
 ifname lan \
 ipv4.method manual \
 ipv4.addr 192.168.100.254/24 \
 ipv6.method auto \
 stp no

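STP is turned off here. When no physical port under the lan bridge is up, the bridge itself is down. With STP enabled, a newly connected port must wait through two forward-delay periods (30 seconds in total) before it starts forwarding, and even then the DHCP service needs a little more time to respond. As a result, a client that has just booted or woken from sleep cannot get an address and go online right away and has to wait 1 to 2 minutes, which is painful.

Note: in a virtual machine, make sure the three internal NICs are not attached to the same vmnet; otherwise, with STP off, you will get a taste of a broadcast storm. On real hardware, with STP off, make sure that only layer 3 endpoints (hosts, routers, and so on) are connected to the soft router. If two ports go to the same switch, enable STP or aggregate the links, or you will create a loop there as well

Bind the three LAN NICs to the bridge lan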

nmcli con modify ens34 master lan
nmcli con modify ens35 master lan
nmcli con modify ens36 master lan

image

image

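7. Configure a lightweight DHCP server

There are two lightweight options: dnsmasq, which provides both DNS and DHCP, and udhcpd, which is dedicated to DHCP. Traffic will later be split so that domestic and overseas traffic take different paths, which requires both route-based and DNS-based splitting, so a DNS service has to be installed anyway. dnsmasq is still not chosen, because by design it becomes bloated and CPU-hungry when handling large domain lists. More on that later; DHCP first.

udhcpd is chosen because it is very lightweight: it was developed for small embedded devices, so it consumes almost no resources at runtime.

Install udhcpd

apt install udhcpd -y

The stock udhcpd.service serves addresses on only one interface. A templated (multi-instance) systemd unit can be created so that udhcpd can serve addresses on several interfaces

vim /etc/systemd/system/udhcpd@.service
# Add the following content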
[Unit]
Description=udhcpd DHCP server for interface %i
After=NetworkManager-wait-online.service
Wants=NetworkManager-wait-online.service

[Service]
Type=simple
ExecStart=/usr/sbin/udhcpd -f /etc/udhcpd/udhcpd@%i.conf
Restart=always
PIDFile=/run/udhcpd@%i.pid

[Install]
WantedBy=multi-user.target

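Reload the configuration

systemctl daemon-reload

Configure the udhcpd server

mkdir /etc/udhcpd
vim /etc/udhcpd/udhcpd@lan.conf

Contents of the configuration file

# Address range to hand out (keep it in the same subnet as the address configured on the lan bridge)
start          192.168.10.100
end            192.168.10.200

# Interface on which to listen for DHCP requests, e.g. eth0, wlan0
interface       lan 

# DHCP lease file; all leases are stored here.
lease_file     /var/lib/misc/udhcpd@lan.leases
max_leases      101

# Lease time
opt     lease   864000     # 10 days of seconds

# Netmask and gateway to hand out
opt     subnet  255.255.255.0
opt     router  192.168.10.254

# DNS server to hand out; if DNS splitting is configured later, this must also point at this device
opt     dns     192.168.10.254

# Broadcast address
opt     broadcast       192.168.10.255

# Domain to hand out
opt     domain  home

# Static leases can also be defined
#static_lease  00:60:08:11:CE:4E 192.168.1.55
#static_lease  AA:BB:CC:DD:EE:FF 192.168.1.75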
systemctl restart udhcpd
systemctl enable udhcpd

Create the lease file

touch /var/lib/misc/udhcpd@lan.leases

Start the service

systemctl restart udhcpd@lan
systemctl enable udhcpd@lan

To serve another interface later, create /etc/udhcpd/udhcpd@xx.conf and then run systemctl restart udhcpd@xx and systemctl enable udhcpd@xx
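
Since every extra instance needs a near-identical file, the configuration can be generated. The following is a sketch only: gen_udhcpd_conf is a hypothetical helper, it assumes a /24 network with the router at .254 and a pool of .100 to .200 (the same layout as the lan example), and it uses only the options that appear in the configuration above.

```shell
#!/usr/bin/env bash
# Hypothetical generator: print a udhcpd config for one interface on one /24.
# $1 = interface name, $2 = first three octets of the subnet (e.g. 192.168.20)
gen_udhcpd_conf() {
    local iface="$1" net="$2"
    cat <<EOF
start           ${net}.100
end             ${net}.200
interface       ${iface}
lease_file      /var/lib/misc/udhcpd@${iface}.leases
max_leases      101
opt     lease   864000
opt     subnet  255.255.255.0
opt     router  ${net}.254
opt     dns     ${net}.254
opt     broadcast       ${net}.255
opt     domain  home
EOF
}
```

A new instance for a hypothetical interface wlan0 would then be gen_udhcpd_conf wlan0 192.168.20 > /etc/udhcpd/udhcpd@wlan0.conf, followed by creating the lease file and enabling udhcpd@wlan0 as shown above.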

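8. Configure the smartdns service

As mentioned, dnsmasq becomes bloated and CPU-hungry when handling large domain lists, so smartdns is used instead. smartdns was designed to handle domain lists with tens of thousands of entries with ease, which matters here because the list configured below contains about 110,000 domestic (China) domains. With a 110,000-line list, dnsmasq keeps the CPU saturated for long stretches and is simply unsuitable, whereas smartdns copes with the same list fairly comfortably.

Why split DNS at all? Major Chinese sites use CDNs and location-aware DNS: domestic and overseas DNS servers return different IP addresses for the same name, with overseas resolvers pointing at overseas nodes and domestic resolvers pointing at domestic nodes. Traffic splitting therefore requires DNS splitting as well.

SmartDNS documentation:

https://pymumu.github.io/smartdns/configuration/

Install smartdns

apt install smartdns -y

Fetch the list of domestic domains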

cd /etc/smartdns
wget -O cn.conf https://raw.githubusercontent.com/felixonmars/dnsmasq-china-list/refs/heads/master/accelerated-domains.china.conf

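One option is to copy the file to a Windows PC, open it in Notepad, and use Ctrl + H to replace server=/ with nothing and /114.114.114.114 with nothing, so that every line holds exactly one domain

Add *.cn at the very top of the list, so that lookups for .cn sites match at the top of the file and do not need to search further down and waste CPU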

*.cn
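
The two manual edits above can also be done directly on the router. This is a minimal sketch, assuming the downloaded file consists of lines of the form server=/domain/114.114.114.114; to_domain_list is a hypothetical function name.

```shell
#!/usr/bin/env bash
# Convert dnsmasq-china-list lines ("server=/example.cn/114.114.114.114")
# read from stdin into one bare domain per line, with *.cn as the first entry.
to_domain_list() {
    echo '*.cn'
    # Keep only the text between the first two slashes; lines that do not
    # match the server=/.../ pattern (comments, blanks) are dropped.
    sed -n 's|^server=/\([^/]*\)/.*$|\1|p'
}
```

Usage: save the download under another name, for example accelerated-domains.china.conf, and run to_domain_list < accelerated-domains.china.conf > /etc/smartdns/cn.conf.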

Configure smartdns

vim /etc/smartdns/smartdns.conf
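# Listen for DNS queries on the lan interface
bind :53@lan

# Create the china group and keep it out of the default group; servers in the default group are used when no DNS server has been explicitly assigned to a domain
server 114.114.114.114  -group china -exclude-default-group
server 223.5.5.5  -group china -exclude-default-group
server 223.6.6.6  -group china -exclude-default-group

# Create the global group, which stays in the default group and therefore answers every domain without an explicit rule (plain DNS here; the optional DoH variant is shown further down)
server 8.8.8.8 -group global
server 8.8.4.4 -group global
server 1.1.1.1 -group global

# Create a domain-set named cnlist and load cn.conf into it for fast lookups
# -a no -speed-check-mode none: do not run a speed test to pick the best server when the china group is used; with a very large cn.conf, speed checks slow resolution down
domain-set -name cnlist -file /etc/smartdns/cn.conf
nameserver /domain-set:cnlist/china -a no -speed-check-mode none

# Return an empty answer for IPv6 (AAAA) queries; with traffic splitting, AAAA answers would let traffic escape the split and would also slow smartdns lookups down
force-AAAA-SOA yes
# DNS cache size; entries are kept in memory so repeated queries need no further lookup
cache-size 65536
# Disable cache persistence. With 200,000 rules, saving and loading the cache file becomes a bottleneck at startup and shutdown and may cause conflicts.
cache-persist no
prefetch-domain yes

# Default TTL for all records
rr-ttl 300
# Minimum allowed TTL
rr-ttl-min 300
# Maximum allowed TTL
rr-ttl-max 86400
# Log fatal errors only
log-level fatal
# Cap the number of IPs per reply to reduce processing overhead
max-reply-ip-num 4
# Disable dual-stack IP selection (speed testing) to save resources
dualstack-ip-selection no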

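A word of explanation:

When assigning upstream DNS servers to specific domains, smartdns offers two approaches: conf-file and domain-set. domain-set is used here. With domain-set, the CPU cost of each DNS query is very low and constant: every domain in the list file is loaded as an independent key into a highly optimized hash table. When a query arrives, SmartDNS computes the hash of the queried name and performs a single lookup in the table, so it knows immediately whether the name is present. conf-file, by contrast, is a linear search: for each DNS query SmartDNS starts at the first rule and matches downward until it finds a match or has walked through all 200,000 rules.

With domain-set, 99% of queries incur no perceptible extra latency from matching against these 200,000 rules. With conf-file, each query may pick up a few milliseconds or more from walking the huge rule list, which is fatal in online games. domain-set also sustains a far higher query rate (QPS) than conf-file and is more stable under heavy load.

In short, once cn.conf reaches tens of thousands of entries, smartdns with domain-set is the best choice for performance!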
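
The difference between a hash lookup and a linear scan can be illustrated with a toy example. The code below is explicitly not smartdns code and makes no claim about its internals; it is a sketch of the general idea using a bash associative array (bash 4.2 or later), and all names in it are hypothetical.

```shell
#!/usr/bin/env bash
# Illustration only, NOT smartdns code: a hash set of domains with suffix matching.
declare -gA DOMAIN_SET=()

# Load one domain per line from stdin into the hash set.
load_domain_set() {
    local d
    while IFS= read -r d; do
        [ -n "$d" ] && DOMAIN_SET["$d"]=1
    done
    return 0
}

# Succeed if the name itself or any parent domain is in the set.
# Each step is a single hash lookup, so the cost depends on the number of
# labels in the queried name, not on the number of entries in the set.
in_domain_set() {
    local name="$1"
    while [ -n "$name" ]; do
        [ -n "${DOMAIN_SET[$name]+x}" ] && return 0
        case "$name" in
            *.*) name="${name#*.}" ;;   # strip the leftmost label
            *)   return 1 ;;            # no parent left to try
        esac
    done
    return 1
}

load_domain_set <<'EOF'
example.cn
example.com
EOF

in_domain_set www.example.cn && echo "www.example.cn -> china group"
in_domain_set example.org    || echo "example.org -> default group"
```

A linear rule list would instead compare the query against every entry in turn, so its cost grows with the size of the list.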

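Stop systemd-resolved from listening on port 53 of the system interfaces while still letting it handle upstream DNS, so that it does not conflict with smartdns

vim /etc/systemd/resolved.conf
DNSStubListener=no
systemctl restart systemd-resolved.service

smartdns starts before NetworkManager, but the lan interface that smartdns binds to is created by NetworkManager. At boot smartdns therefore cannot find the lan interface and fails to start, so order it after NetworkManager

vim /usr/lib/systemd/system/smartdns.service
# Change the following
After=NetworkManager-wait-online.service
Wants=NetworkManager-wait-online.service

Reload the configuration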

systemctl daemon-reload

Optional configuration

DoH: encrypt the DNS queries of the global group so that they do not leak

# The global group stays in the default group and is used whenever no DNS server is assigned to a domain
#server 8.8.8.8 -group global
#server 8.8.4.4 -group global
#server 1.1.1.1 -group global
# The original global group configuration above is commented out
server-https https://8.8.8.8/dns-query -group global
server-https https://dns.google.com/dns-query -group global
server-https https://1.1.1.1/dns-query -group global
server-https https://cloudflare-dns.com/dns-query -group global

server-tcp 8.8.8.8
server-tcp 1.1.1.1
server-https https://dns.google/dns-query -group global
server-https https://doh.opendns.com/dns-query -group global

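9. Bypassing network restrictions and splitting traffic

Using OpenVPN

This is too long to cover here; see my earlier article

https://blog.opennw.cn/archives/openvpnde-you-hua#15%E3%80%81%E5%AE%9E%E7%8E%B0-openvpn-%E5%88%86%E6%B5%81%E5%B9%B6%E8%87%AA%E5%8A%A8%E6%9B%B4%E6%96%B0%E8%B7%AF%E7%94%B1

Note that with openvpn-dco installed there is a small chance that the tun interface will not work smoothly; see the beginning of that article for details. So far I have only run into this on an Intel CPU, and only once across the dozen or so servers I have configured.

Also note:

1. Once DNS and route splitting are in place, an Apple device with Private Relay enabled in iCloud sends its traffic through Apple's servers for another layer of encrypted proxying. This not only breaks splitting on Apple devices but also lowers forwarding efficiency (encryption and tunneling are done twice). If you want splitting to work, turn Private Relay off and restart the device.

2. Even with Private Relay off, you may find that the Douyin app still goes through the tunnel while other apps (CSDN, for example) do not. In that case Douyin has cached an overseas CDN or has been tagged; deleting and reinstalling the app fixes it.

Using WireGuard + wstunnel (recommended)

Introduction

WireGuard can be used to build secure networks or to reach blocked sites and applications by routing all traffic through a VPN. Because it is supported natively in the kernel, it runs in parallel with minimal overhead and is very fast. The problem is that it is relatively easy to block: in some regions it may be blocked temporarily or permanently. How can a block on the WireGuard protocol be bypassed? WebSocket solves this: wstunnel wraps all WireGuard traffic in WebSocket so that the WireGuard protocol itself is no longer visible and cannot be blocked as such.

The benefit of WireGuard over WebSocket + TLS: WireGuard is a kernel-level VPN that can spread its work across all CPU cores, so it avoids the OpenVPN situation where a single saturated core becomes the bottleneck, and wstunnel automatically runs as many workers as WireGuard to encapsulate and carry the data.

This lets WireGuard over wstunnel carry VPN traffic quickly, efficiently, and with low overhead even under heavy filtering;

Configuration process

The steps below create a multi-instance systemd service for wstunnel on Linux. The service also heals itself: if a link flaps, or if the server is fine but the client cannot communicate for some reason, the connection is re-established automatically after two consecutive failed pings.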

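The automatic recovery time works out as follows:
Fastest: the outage begins just as a ping is about to run (the gateway is probed the moment it becomes unreachable): 2 s ping timeout + 10 s sleep + 2 s ping timeout + 3 s systemd restart delay = 17 s
Slowest: a ping has just succeeded (gateway still reachable, failed reset to 0) and the outage starts immediately afterwards: 10 s sleep + 2 s ping timeout + 10 s sleep + 2 s ping timeout + 3 s systemd restart delay = 27 s

The sleep interval and the systemd restart delay can be adjusted as needed, but the defaults are recommended.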
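
When experimenting with other values, the same arithmetic can be expressed as a small helper. This is a sketch; recovery_window is a hypothetical name, and the defaults passed in the example are the ones used in this section (2 s ping timeout, 10 s sleep, RestartSec=3s, two consecutive failures).

```shell
#!/usr/bin/env bash
# Best-case and worst-case recovery time in seconds for the ping watchdog.
# $1 = ping timeout, $2 = sleep between probes, $3 = systemd RestartSec,
# $4 = consecutive failures required (defaults to 2)
recovery_window() {
    local timeout="$1" interval="$2" restart="$3" fails="${4:-2}"
    # Best case: the first failing probe starts right when the outage begins.
    local fastest=$(( fails * timeout + (fails - 1) * interval + restart ))
    # Worst case: a probe has just succeeded, so one full sleep passes first.
    local slowest=$(( fastest + interval ))
    echo "$fastest $slowest"
}

recovery_window 2 10 3   # prints "17 27"
```

Halving the sleep to 5 s, for instance, gives recovery_window 2 5 3, i.e. a window of 12 to 17 seconds, at the price of twice as many probes.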

Create the systemd service

vim /etc/systemd/system/wstunnel@.service
# Add the following content
[Unit]
Description=WSTunnel Service for %i
After=NetworkManager-wait-online.service
Wants=NetworkManager-wait-online.service

[Service]
Type=simple
ExecStart=/usr/bin/wstunnel-init %i
Restart=always
RestartSec=3s
User=root
Group=root

[Install]
WantedBy=multi-user.target
systemctl daemon-reload

Create the directory

mkdir /etc/wstunnel

Create the wstunnel-init script

vim /usr/bin/wstunnel-init
# Add the following content
#!/bin/bash

# Instance name (passed as the first argument)
INSTANCE="$1"
if [ -z "$INSTANCE" ]; then
    echo "[ERROR] $(date "+%Y-%m-%d %H:%M:%S") 未指定实例名"
    exit 1
fi

# Switch the working directory to /etc/wstunnel for the whole script
cd /etc/wstunnel

CONF_FILE="/etc/wstunnel/$INSTANCE.conf"
LOG_FILE="/var/log/$INSTANCE.log"

if [ ! -f "$CONF_FILE" ]; then
    echo "[ERROR] $(date "+%Y-%m-%d %H:%M:%S") 配置文件不存在: $CONF_FILE"
    exit 1
fi

# Remember the main process PID so the ping monitor can signal it to trigger a restart
MAIN_PID=$$

# Parse the configuration file (stripping a trailing &)
WSTUNNEL_CMD=$(awk '/<wstunnel>/,/<\/wstunnel>/' "$CONF_FILE" | sed -e '1d' -e '$d' -e 's/&$//')
START_CMDS=$(awk '/<start>/,/<\/start>/' "$CONF_FILE" | sed -e '1d' -e '$d' -e 's/&$//')
END_CMDS=$(awk '/<end>/,/<\/end>/' "$CONF_FILE" | sed -e '1d' -e '$d' -e 's/&$//')
GATEWAY=$(awk '/<gateway>/,/<\/gateway>/' "$CONF_FILE" | sed -e '1d' -e '$d' | tr -d '[:space:]')

is_ip() {
    echo "$1" | grep -Eq '^([0-9]{1,3}\.){3}[0-9]{1,3}$'
}

echo "========================================================================================================"
echo "[INFO] $(date "+%Y-%m-%d %H:%M:%S") 正在启动 wstunnel 进程......"
# Start wstunnel
bash -c "$WSTUNNEL_CMD" >> "$LOG_FILE" 2>&1 &
WSTUNNEL_PID=$!
sleep 1
echo "[INFO] $(date "+%Y-%m-%d %H:%M:%S") wstunnel 进程启动成功,开始执行后续命令......"
# Run the commands in the <start> section
echo "$START_CMDS" | while read -r line; do
    # Skip empty lines and comment lines starting with #
    [ -z "$line" ] && continue
    case "$line" in
        \#*) continue ;;
    esac
    echo "[INFO] $(date "+%Y-%m-%d %H:%M:%S") → $line"
    # Run the command; a failure is logged but does not abort the script
    bash -c "$line" || echo "[WARNING] $(date "+%Y-%m-%d %H:%M:%S") 命令执行失败: $line"
done

if [ -n "$GATEWAY" ] && is_ip "$GATEWAY"; then
    echo "[INFO] $(date "+%Y-%m-%d %H:%M:%S") 提取到 GATEWAY 地址 $GATEWAY,开始进行ping测试......"
    (
        failed=0
        while true; do
            if ! ping -c 1 -W 2 "$GATEWAY" >/dev/null 2>&1; then
                failed=$((failed + 1))
                if [ "$failed" -ge 2 ]; then
                    echo "[WARNING] $(date "+%Y-%m-%d %H:%M:%S") 连续两次ping  $GATEWAY 失败,正在关闭进程......"
                    kill -TERM $MAIN_PID
                    exit 0
                fi
            else
                failed=0
            fi
            sleep 10
        done
    ) &
fi

# Trap termination signals
trap '
    echo "[INFO] $(date "+%Y-%m-%d %H:%M:%S") 收到终止信号,开始执行结束命令......"
    echo "$END_CMDS" | while read -r line; do
        [ -z "$line" ] && continue
        case "$line" in
            \#*) continue ;;
        esac
        echo "[INFO] $(date "+%Y-%m-%d %H:%M:%S") → $line"
        bash -c "$line" || echo "[WARNING] $(date "+%Y-%m-%d %H:%M:%S") 命令执行失败: $line"
    done
    kill $WSTUNNEL_PID 2>/dev/null || true
    echo "[INFO] $(date "+%Y-%m-%d %H:%M:%S") 执行完毕,正在关闭服务......"
    echo "========================================================================================================"
    exit 0
' SIGTERM SIGINT
# Wait for the wstunnel process to exit
wait $WSTUNNEL_PID

Make it executable

chmod +x /usr/bin/wstunnel-init

Create the configuration file

(for the router command, see the custom route add/query tool in section 27)

vim /etc/wstunnel/ws-wg0.conf
# Add the following content
<wstunnel>
wstunnel client -L udp://51820:localhost:51820?timeout_sec=0 --http-upgrade-path-prefix xxx.xxx.cn@123 --websocket-ping-frequency-sec 10 wss://xxx.cn:443
</wstunnel>

<start>
ip route add 154.xxx.xxx.xxx/32 via 192.168.11.2
router add file china.txt 192.168.11.2
wg-quick up wg0
</start>

<gateway>
10.10.40.1
</gateway>

<end>
ip route del 154.xxx.xxx.xxx/32 via 192.168.11.2
router del file china.txt 192.168.11.2
wg-quick down wg0
</end>
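
To check how wstunnel-init will read a file like this, the section extraction can be tried on its own. The sketch below follows the same awk-then-sed idea as the script, but matches the tags as fixed strings with index() instead of regular expressions so that the tag name can be passed as a parameter; conf_section is a hypothetical helper, not part of the script.

```shell
#!/usr/bin/env bash
# Print the body of one <tag>...</tag> section of a wstunnel-init config file.
# $1 = config file, $2 = tag name (wstunnel, start, gateway or end)
conf_section() {
    local file="$1" tag="$2"
    # awk prints from the line containing <tag> through the line containing
    # </tag>; sed then drops the first and last line (the tags themselves).
    awk -v o="<${tag}>" -v c="</${tag}>" 'index($0, o), index($0, c)' "$file" |
        sed -e '1d' -e '$d'
}
```

For example, conf_section /etc/wstunnel/ws-wg0.conf start lists the commands that will run after wstunnel is up, and conf_section /etc/wstunnel/ws-wg0.conf gateway shows the address that the watchdog pings.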

Start the service

systemctl start wstunnel@ws-wg0
systemctl enable wstunnel@ws-wg0		# start at boot

View the logs

root@Router:/etc/wstunnel# systemctl status wstunnel@ws-wg0
● wstunnel@ws-wg0.service - WSTunnel Service for ws-wg0
     Loaded: loaded (/etc/systemd/system/wstunnel@.service; enabled; preset: enabled)
     Active: active (running) since Fri 2025-12-12 13:33:18 CST; 35s ago
 Invocation: 767eac44d4ba4b82b4ca211b1f80954c
   Main PID: 22947 (wstunnel-init.s)
      Tasks: 8 (limit: 3922)
     Memory: 4.1M (peak: 7M)
        CPU: 208ms
     CGroup: /system.slice/system-wstunnel.slice/wstunnel@ws-wg0.service
             ├─22947 /bin/bash /usr/bin/wstunnel-init.sh ws-wg0
             ├─22962 wstunnel client -L "udp://51820:localhost:51820?timeout_sec=0" --http-upgrade-path-prefix xxxx.xxxxx.cn@123 --websocket-ping-frequency-sec 10 wss:/
/xxx.xxx.cn:443
             ├─23047 /bin/bash /usr/bin/wstunnel-init.sh ws-wg0
             └─23276 sleep 10

Dec 12 13:33:19 Router wstunnel-init.sh[23002]: [#] wg setconf wg0 /dev/fd/63
Dec 12 13:33:19 Router wstunnel-init.sh[23002]: [#] ip -4 address add 10.10.40.10/24 dev wg0
Dec 12 13:33:19 Router wstunnel-init.sh[23002]: [#] ip -6 address add fd00:10:10:40::10/96 dev wg0
Dec 12 13:33:19 Router wstunnel-init.sh[23002]: [#] ip link set mtu 1380 up dev wg0
Dec 12 13:33:19 Router wstunnel-init.sh[23020]: [#] resolvconf -a wg0 -m 0 -x
Dec 12 13:33:19 Router wstunnel-init.sh[23002]: [#] ip -6 route add ::/1 dev wg0
Dec 12 13:33:19 Router wstunnel-init.sh[23002]: [#] ip -6 route add 8000::/1 dev wg0
Dec 12 13:33:19 Router wstunnel-init.sh[23002]: [#] ip -4 route add 128.0.0.0/1 dev wg0
Dec 12 13:33:19 Router wstunnel-init.sh[23002]: [#] ip -4 route add 0.0.0.0/1 dev wg0
Dec 12 13:33:19 Router wstunnel-init.sh[22947]: [INFO] 2025-12-12 13:33:19 提取到 GATEWAY 地址 10.10.40.1,开始进行ping测试......

root@Router:/etc/wstunnel# journalctl -u wstunnel@ws-wg0.service
Dec 12 13:33:18 Router systemd[1]: Started wstunnel@ws-wg0.service - WSTunnel Service for ws-wg0.
Dec 12 13:33:18 Router wstunnel-init.sh[22947]: ========================================================================================================
Dec 12 13:33:18 Router wstunnel-init.sh[22947]: [INFO] 2025-12-12 13:33:18 正在启动 wstunnel 进程......
Dec 12 13:33:19 Router wstunnel-init.sh[22947]: [INFO] 2025-12-12 13:33:19 wstunnel 进程启动成功,开始执行后续命令......
Dec 12 13:33:19 Router wstunnel-init.sh[22976]: [INFO] 2025-12-12 13:33:19 → router add 154.xxx.xxx.134/32 via 192.168.11.2
Dec 12 13:33:19 Router wstunnel-init.sh[22976]: [INFO] 2025-12-12 13:33:19 → router add file /etc/wstunnel/china.txt 192.168.11.2
Dec 12 13:33:19 Router wstunnel-init.sh[22980]: |----------------------------------------------------------|
Dec 12 13:33:19 Router wstunnel-init.sh[22980]: | Using routing table: main (default)                      |
Dec 12 13:33:19 Router wstunnel-init.sh[22980]: | Processing route file: /etc/wstunnel/china.txt           |
Dec 12 13:33:19 Router wstunnel-init.sh[22980]: | Using gateway: 192.168.11.2                              |
Dec 12 13:33:19 Router wstunnel-init.sh[22980]: | Using processes: 1                                       |
Dec 12 13:33:19 Router wstunnel-init.sh[22980]: | Processing routes, please wait...                        |
Dec 12 13:33:19 Router wstunnel-init.sh[22980]: |----------------------------------------------------------|
Dec 12 13:33:19 Router wstunnel-init.sh[22980]: |----------------------------------------------------------|
Dec 12 13:33:19 Router wstunnel-init.sh[22980]: | Processing complete:                                     |
Dec 12 13:33:19 Router wstunnel-init.sh[22980]: | Total lines: 5497                                        |
Dec 12 13:33:19 Router wstunnel-init.sh[22980]: | Valid CIDRs: 5497                                        |
Dec 12 13:33:19 Router wstunnel-init.sh[22980]: | Invalid CIDRs: 0                                         |
Dec 12 13:33:19 Router wstunnel-init.sh[22980]: | Successful adding: 5497                                  |
Dec 12 13:33:19 Router wstunnel-init.sh[22980]: | Failures: 0 (may already exist/not exist)                |
Dec 12 13:33:19 Router wstunnel-init.sh[22980]: | Routes adding finished!                                  |
Dec 12 13:33:19 Router wstunnel-init.sh[22980]: | Execution time: 77 ms                                    |
Dec 12 13:33:19 Router wstunnel-init.sh[22980]: |----------------------------------------------------------|
Dec 12 13:33:19 Router wstunnel-init.sh[22976]: [INFO] 2025-12-12 13:33:19 → wg-quick up wg0
Dec 12 13:33:19 Router wstunnel-init.sh[23002]: [#] ip link add wg0 type wireguard
Dec 12 13:33:19 Router wstunnel-init.sh[23002]: [#] wg setconf wg0 /dev/fd/63
Dec 12 13:33:19 Router wstunnel-init.sh[23002]: [#] ip -4 address add 10.10.40.10/24 dev wg0
Dec 12 13:33:19 Router wstunnel-init.sh[23002]: [#] ip -6 address add fd00:10:10:40::10/96 dev wg0
Dec 12 13:33:19 Router wstunnel-init.sh[23002]: [#] ip link set mtu 1380 up dev wg0
Dec 12 13:33:19 Router wstunnel-init.sh[23020]: [#] resolvconf -a wg0 -m 0 -x
Dec 12 13:33:19 Router wstunnel-init.sh[23002]: [#] ip -6 route add ::/1 dev wg0
Dec 12 13:33:19 Router wstunnel-init.sh[23002]: [#] ip -6 route add 8000::/1 dev wg0
Dec 12 13:33:19 Router wstunnel-init.sh[23002]: [#] ip -4 route add 128.0.0.0/1 dev wg0
Dec 12 13:33:19 Router wstunnel-init.sh[23002]: [#] ip -4 route add 0.0.0.0/1 dev wg0
Dec 12 13:33:19 Router wstunnel-init.sh[22947]: [INFO] 2025-12-12 13:33:19 提取到 GATEWAY 地址 10.10.40.1,开始进行ping测试......

root@Router:/etc/wstunnel# cat /var/log/ws-wg0.log
2025-12-12T05:33:18.027334Z  INFO wstunnel: Starting wstunnel client v10.5.0
2025-12-12T05:33:18.027370Z  INFO wstunnel::protocols::udp::server: Starting UDP server listening cnx on 127.0.0.1:51820 with cnx timeout of 0s
2025-12-12T05:33:19.131430Z  INFO wstunnel::protocols::udp::server: New UDP connection from 127.0.0.1:29748
2025-12-12T05:33:19.131650Z  INFO wstunnel::protocols::tcp::server: Opening TCP connection to xxx.xxxx.cn:443
2025-12-12T05:33:19.331896Z  INFO wstunnel::protocols::tls::server: Doing TLS handshake using SNI DnsName("xxx.xxx.cn") with the server xxx.xxxx.cn:443

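10. Shorten the boot time

Since Ubuntu 18, netplan manages the network, and it has one quirk: when NetworkManager is netplan's backend, the leftover systemd-networkd-wait-online service never sees the network come up, so boot stalls for about 2 minutes until that service times out. Since NetworkManager is in use, simply disable it; it serves no purpose, and NetworkManager has its own wait-online service (NetworkManager-wait-online.service)

systemctl disable  systemd-networkd-wait-online.service

Reboot and the system starts noticeably faster

In addition, disable every related unit so that systemd-networkd can no longer affect the network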

systemctl stop systemd-networkd.socket 
systemctl stop systemd-networkd.service
systemctl disable systemd-networkd.socket 
systemctl disable systemd-networkd.service
systemctl mask systemd-networkd.socket 
systemctl mask systemd-networkd.service

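11. Configure the firewall and NAT

Below is an nftables configuration covering accelerated packet forwarding, NAT, and stateful packet filtering. It is too much to explain piece by piece; if you have not worked with nftables before, please look it up

netfilter website:

https://netfilter.org/
vim /etc/nftables.conf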
#!/usr/sbin/nft -f

flush ruleset

# Predefined variables
define lan_int={ lan, wlp6s0 }
define wan_int={ enp1s0 }
define vpn_ip={ 10.10.10.0/24, 10.10.20.0/24, 10.10.30.0/24  }

table inet filter {
        # flowtable: packet offload. Matching packets bypass the prerouting, routing, forwarding and postrouting hooks and leave netfilter directly, which reduces CPU usage
        flowtable myft {
                hook ingress priority filter;
                devices = { enp1s0, lan, wlp6s0 }
                counter
        }


        chain input {
                type filter hook input priority filter; policy drop;

                # Accept established and related connections (stateful firewall)
                ct state { established, related } accept

                iif "lo" accept

                iif $lan_int accept

                ip saddr $vpn_ip accept

                iif $wan_int ip protocol tcp tcp dport { 22 } accept
        }

        chain forward {
                type filter hook forward priority filter; policy drop;

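                # Offload established and related connections to the flowtable for accelerated forwarding (must come before the accept rule, otherwise packets are accepted first and never reach this rule)
                ct state { established, related } flow add @myft
                # Accept established and related connections (stateful firewall)
                ct state { established, related } accept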

                iif $lan_int accept

                ip saddr $vpn_ip accept

        }

        chain output {
                type filter hook output priority filter; policy accept;

                ct state { established, related } accept
        }

        chain postrouting {
                type nat hook postrouting priority 100; policy accept;

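                # Masquerade (SNAT) traffic that leaves through the WAN interface. fully-random requires switching to the xanmod kernel; otherwise remove fully-random for the rule to take effect
                oif $wan_int masquerade fully-random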

                ct state { established, related } accept
}
}

Load the nftables rules and enable the service at boot

nft -f /etc/nftables.conf
systemctl enable nftables

As before, order nftables after NetworkManager

vim /usr/lib/systemd/system/nftables.service
# Change the following, then reload systemd
After=NetworkManager-wait-online.service
Wants=NetworkManager-wait-online.service
systemctl daemon-reload

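12. Install Docker

Installing Docker from inside China is restricted, so use the Alibaba Cloud mirror

# Download Docker's official GPG key from the Alibaba Cloud mirror (fast from inside China, no SSL connection problems)
# and convert it to the key format apt understands, saving it in the system keyring directory
curl -fsSL https://mirrors.aliyun.com/docker-ce/linux/ubuntu/gpg | sudo gpg --dearmor -o /usr/share/keyrings/docker-archive-keyring.gpg
# Generate an Alibaba Cloud Docker source entry matching the current architecture and release
# where:
# - $(dpkg --print-architecture) detects the system architecture (e.g. amd64)
# - $(lsb_release -cs) detects the Ubuntu release codename (e.g. jammy, focal)
# - signed-by points at the key added above so that the source passes verification
echo "deb [arch=$(dpkg --print-architecture) signed-by=/usr/share/keyrings/docker-archive-keyring.gpg] https://mirrors.aliyun.com/docker-ce/linux/ubuntu $(lsb_release -cs) stable" | sudo tee /etc/apt/sources.list.d/docker.list > /dev/null
# Update the package index (now served from the Alibaba Cloud mirror, fast and stable)
sudo apt update

# Install Docker Engine and related components (docker-ce is the engine, docker-ce-cli the command-line tool, containerd.io the container runtime)
sudo apt install docker-ce docker-ce-cli containerd.io

# Start Docker at boot
sudo systemctl enable docker

Verify that Docker is installed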

docker info
-----------------------------------------------------------------------
Client: Docker Engine - Community
 Version:    28.4.0
 Context:    default
 Debug Mode: false
 Plugins:
  buildx: Docker Buildx (Docker Inc.)
    Version:  v0.27.0
    Path:     /usr/libexec/docker/cli-plugins/docker-buildx
  compose: Docker Compose (Docker Inc.)
    Version:  v2.39.2
    Path:     /usr/libexec/docker/cli-plugins/docker-compose
.......................

Disable Docker's automatic firewall rule generation (the rules will be written by hand later) and set the IP address of Docker's default bridge

vim /etc/docker/daemon.json
{
  "iptables": false,
  "ip6tables": false,
  "bip": "192.168.17.1/24"
}

Restart Docker for the change to take effect

systemctl restart docker

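13. IPv6 configuration

Where IPv6 is available, we naturally want IPv6 addresses so that internal devices are easy to reach. The WAN interface receives ICMPv6 RA messages from the ISP, extracts the IPv6 prefix, and derives an address from it using EUI-64. More importantly, the obtained prefix is also used to hand out IPv6 addresses to devices on the LAN.

Prerequisite for all of this: the optical modem is in bridge mode and the Ubuntu router performs the PPPoE dial-up itself, so that it receives a delegated prefix (PD) from the ISP, usually a /56 or /60. Otherwise it gets at most a /64 address, and a /64 cannot be split any further for downstream use: splitting it would yield /65 prefixes, while handing out IPv6 addresses requires at least a /64;

If the Ubuntu router can only obtain a /64 via RS/RA, devices behind the LAN interface cannot get public IPv6 addresses. The only option is for the WAN interface to take the /64 address while the LAN interface is given a private /64 and sends RA messages, so that LAN hosts generate addresses from that private /64 prefix. For hosts on the WAN side to reach internal hosts over IPv6, DNAT is then required.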
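
The reason a delegated prefix matters is simple arithmetic: a prefix of length n contains 2^(64-n) networks of size /64. A small sketch (count_slash64 is a hypothetical helper name):

```shell
#!/usr/bin/env bash
# Number of /64 subnets contained in a delegated prefix of the given length.
count_slash64() {
    local len="$1"
    if [ "$len" -lt 16 ] || [ "$len" -gt 64 ]; then
        echo "prefix length must be between 16 and 64" >&2
        return 1
    fi
    echo $(( 1 << (64 - len) ))
}

count_slash64 56   # prints 256
count_slash64 60   # prints 16
count_slash64 64   # prints 1 (nothing left to split)
```

So a /56 leaves room for 256 separate LAN segments and a /60 for 16, while a single /64 can serve exactly one network.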

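Configure the WAN interface to obtain an IPv6 address

nmcli con modify ens33 ipv6.method auto \
 ipv6.addr-gen-mode eui64 \
 ipv6.ip6-privacy 0
nmcli con up ens33				# re-activate the WAN interface, then check with ip addr whether it obtained an IPv6 address

Then configure the LAN interface

nmcli con modify lan ipv6.method shared \
 ipv6.address fd00:192:168:100::1/64
nmcli con up lan				# re-activate the LAN interface

Hosts behind the LAN interface with IPv6 enabled can then obtain an IPv6 address and IPv6 DNS information through ICMPv6 RS and RA messages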

image

image

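14. Configure the wireless network

🤔 Wireless tuning notes:

  • If your wireless card slot is Mini PCIe, avoid Intel cards. When used as a hotspot (AP mode) to broadcast Wi-Fi, most of Intel's consumer cards are software-locked by the driver, and the 2.4 GHz signal they emit in AP mode is very poor. Intel designed its Mini PCIe cards to be used primarily as clients rather than as APs. Cards from other vendors, or a USB adapter, solve this well.

  • Generally, Wi-Fi signal strength and user experience result from transmit power (in dBm) and the antenna (gain and coverage pattern, gain in dBi) together. So even with modest transmit power, enough antennas and good coverage can still give a smooth experience. Higher power and higher antenna gain are still preferable.

  • Also, on 5 GHz, leaving aside the card's own capabilities (such as the standards it supports), the number of channels and the channel width are among the most important factors determining Wi-Fi throughput (i.e. speed).

As mentioned, a soft router with a Mini PCIe wireless slot should avoid Intel cards, which can be a trap. I use a MediaTek MT7922. On Linux this card supports AP mode reasonably well and reaches high transfer rates, and kernels 6.8 and later need no manually installed driver. I later bought first-generation extension cables and antennas (3 dBi gain) for the card, and the signal now covers my whole rented flat. If your home is large and needs more coverage, consider 8 or 12 dBi antennas or an external AP.

There are three ways to add wireless to a soft router; pick according to your needs

🤔 Three ways to add wireless to a soft router:

1. Use an onboard slot;
Most mini PCs have a PCIe or Mini PCIe slot for a wireless card, so you can fit a suitable card plus gain antennas to broadcast the signal;

2. Use a USB wireless adapter;
Similar to the onboard slot, but USB adapters are more universal and less hassle; they usually have relatively high transmit power and come with gain antennas.

3. Attach an external AP;
The soft router acts as the core router and a home router (ideally one with an AP-only mode) is plugged into a downstream port. This layout gives the best coverage and speed.

When configuring wireless, NetworkManager offers three methods (the traditional route is the hostapd service, which is rather cumbersome, whereas NetworkManager provides a quick and simple way)

🤔 Three ways to broadcast the wireless signal:

1. Use a "hotspot";

2. Configure all parameters manually;

3. Bridge to the LAN interface (recommended);

The three methods differ as follows:

  • Hotspot: once created, NetworkManager uses its built-in dnsmasq to serve DHCP and DNS, which is quick and convenient. However, the wireless card and the LAN interface are separate layer 3 networks and need different IP subnets, and if you have set up your own DHCP and DNS services on that interface, address assignment fails because the corresponding ports are already in use;

  • Manual DHCP and DNS configuration: every parameter, including DHCP and DNS, is configured by hand. The wireless card and the LAN interface are still separate layer 3 networks with different IP subnets, and a second DHCP instance must be configured to listen for DHCP requests on the wireless card;

  • Bridge to the LAN interface: the LAN interface and the wireless card share one layer 3 IPv4 and IPv6 configuration. The wireless card does not deal with address assignment at all; it only broadcasts the signal, handles the passphrase, and passes client traffic up to the LAN bridge, which serves DHCP, DNS, and so on for everyone. This is also what home routers usually do: wireless clients get addresses in the same subnet as wired clients.

Examples of all three modes follow (5 GHz, channel 36). If activation fails with Error: Connection activation failed: 802.1X supplicant took too long to authenticate, there are several possible causes, for example: a setting that is not permitted under the current country code, the interface is in shared mode but ports 53 and 67/68 are already taken by another service, or the card does not support 5 GHz or wide channels;

First set the wireless regulatory domain (country code) to CN, so that the heavily restricted default 00 does not disable many features (80 MHz channel width and some features that improve wireless security and performance)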

iw reg set CN

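The command above is only temporary. To make it persistent, write it as a kernel module parameter by creating or editing the file below

vim /etc/modprobe.d/cfg80211.conf
# Add the following
options cfg80211 ieee80211_regdom="CN"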

Update the initramfs so that the setting is applied at boot; after a reboot the wireless country code is CN

update-initramfs -u

List nearby Wi-Fi networks to see which channels are most used

nmcli device wifi list

List the channels the card supports; entries that carry neither disabled nor no IR after the [ ] are usable

iw list | grep -A 10 "dBm"

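Based on these results, pick a channel that is unused or least used; channels should be at least 5 apart. Here nobody is using channel 149

Hotspot configuration (shared mode makes NetworkManager hand out addresses automatically)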

nmcli device wifi hotspot \
 con-name SoftRouting \
 mode ap \
 ifname wlx0013ef6f25bd  \
 ssid SoftRouting \
 802-11-wireless-security.key-mgmt wpa-psk \
 802-11-wireless-security.psk xxxx \
 802-11-wireless.band a 802-11-wireless.channel 36 \
 ipv4.method shared \
 ipv4.address 192.168.20.254/24
nmcli con up SoftRouting

Manual configuration:

nmcli con add con-name SoftRouting \
 ifname wlx0013ef6f25bd \
 type wifi \
 mode ap \
 802-11-wireless.ssid SoftRouting \
 802-11-wireless-security.key-mgmt wpa-psk \
 802-11-wireless-security.psk xxxx \
 802-11-wireless.band a 802-11-wireless.channel 36 \
 ipv4.method manual \
 ipv4.address 192.168.20.254/24 
nmcli con up SoftRouting

Bridge-to-LAN configuration

nmcli connection add con-name SoftRouting \
 ifname wlx0013ef6f25bd \
 type wifi \
 master lan \
 wifi.mode ap \
 wifi.ssid SoftRouting  \
 wifi-sec.key-mgmt wpa-psk \
 wifi-sec.psk xxxx \
 802-11-wireless.band a \
 802-11-wireless.channel 149 
nmcli con up SoftRouting
nmcli con up lan

Extension: set the encryption to the more secure WPA2 (AES). The default mixes TKIP and AES, and a client may end up choosing the insecure TKIP

nmcli con modify SoftRouting \
 802-11-wireless-security.proto rsn \
 802-11-wireless-security.pairwise ccmp \
 802-11-wireless-security.group ccmp 
nmcli con up SoftRouting
nmcli con up lan

Enable AP isolation (optional) so that wireless clients cannot reach each other

nmcli con modify SoftRouting 802-11-wireless.ap-isolation true

Disable power saving so that the wireless card always runs at full performance

nmcli con modify SoftRouting 802-11-wireless.powersave disable

View wireless information

iw dev wlx0013ef6f25bd info
-------------------------------------------------------------------------------------------------------
Interface wlx0013ef6f25bd
        ifindex 6
        wdev 0x1
        addr 00:13:ef:6f:25:bd
        ssid SoftRouting
        type AP
        wiphy 0
        channel 149 (5745 MHz), width: 20 MHz, center1: 5745 MHz
        txpower 19.00 dBm
        multicast TXQ:
                qsz-byt qsz-pkt flows   drops   marks   overlmt hashcol tx-bytes        tx-packets
                0       0       6308    0       0       0       0       995117          8377

Transmit power is given in dBm. The higher the value, the wider the coverage. Typical home routers transmit at around 20 dBm (100 mW).
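
dBm is a logarithmic unit relative to 1 mW, so the conversion is mW = 10^(dBm/10). A short sketch using awk for the floating-point math (dbm_to_mw is a hypothetical helper name):

```shell
#!/usr/bin/env bash
# Convert transmit power from dBm to milliwatts: mW = 10^(dBm / 10).
dbm_to_mw() {
    awk -v dbm="$1" 'BEGIN { printf "%.1f\n", 10 ^ (dbm / 10) }'
}

dbm_to_mw 20   # prints 100.0
dbm_to_mw 19   # prints 79.4
dbm_to_mw 3    # prints 2.0
```

Every 3 dB roughly doubles or halves the power, which is why the 19 dBm reported above is already about four fifths of a 20 dBm home router.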

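One pitfall: with an MT792x card on a 6.8 kernel you will most likely see a reported power of only 3 dBm. This was fixed in February 2025; according to the kernel developers behind the fix, the reading is a false report and the actual power is normal. If you do not want to change kernels and your speed tests look normal, ignore it (an MT7921/MT7922 bridged in 5 GHz mode with 80 MHz channel width should deliver about 700 Mbps).

View information about connected clients (requires at least one connected client)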

iw dev wlp6s0 station dump
-------------------------------------------------------------------------------------------------------
Station 5e:7c:0b:29:d6:56 (on wlp6s0)
	inactive time:	29757 ms
	rx bytes:	3435638710
	rx packets:	2569586
	tx bytes:	3683176612
	tx packets:	2643850
	tx retries:	115217
	tx failed:	63
	rx drop misc:	0
	signal:  	-29 [-34, -30] dBm
	signal avg:	-29 [-33, -30] dBm
	tx bitrate:	780.0 MBit/s VHT-MCS 9 80MHz VHT-NSS 2
	tx duration:	140452409 us
	rx bitrate:	866.7 MBit/s VHT-MCS 9 80MHz short GI VHT-NSS 2
	rx duration:	40286512 us
	last ack signal:-28 dBm
	avg ack signal:	-28 dBm
	airtime weight: 256
	authorized:	yes
	authenticated:	yes
	associated:	yes
	preamble:	long
	WMM/WME:	yes
	MFP:		yes
	TDLS peer:	no
	DTIM period:	2
	beacon interval:100
	short slot time:yes
	connected time:	3361 seconds
	associated at [boottime]:	8987.760s
	associated at:	1758546903718 ms
	current time:	1758550264908 ms

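The tx bitrate and rx bitrate fields (780.0 MBit/s and 866.7 MBit/s in the output above) are the transmit and receive link rates currently negotiated with this client, i.e. the upper bound for its download and upload speed

Change the channel width (if the card supports it) for higher throughput. (A standard 80 MHz channel occupies channels 36, 40, 44, 48. With 160 MHz every group may be subject to radar interference, so unless you are in a remote area it often performs worse than 80 MHz; if you do use it, channels 36, 40, 44, 48, 52, 56, 60, 64 are recommended)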

nmcli con modify SoftRouting 802-11-wireless.channel 36
nmcli con modify SoftRouting 802-11-wireless.channel-width 80mhz
nmcli con up SoftRouting 

Check the wireless interface information again

phy#0
	Interface wlx0013ef6f25bd
		ifindex 6
		wdev 0x1
		addr 00:13:ef:6f:25:bd
		ssid SoftRouting
		type AP
		channel 48 (5240 MHz), width: 80 MHz, center1: 5210 MHz
		txpower 18.00 dBm
		multicast TXQ:
			qsz-byt	qsz-pkt	flows	drops	marks	overlmt	hashcol	tx-bytes	tx-packets
			0	0	277	0	0	0	0	38459		332

A speed test then showed roughly 650 Mbps in both directions (measured with homebox).
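The iw output above reports primary channel 48 at 5240 MHz with center1 at 5210 MHz. That follows from the 5 GHz channel plan: frequency in MHz = 5000 + 5 × channel, and each 80 MHz block groups four adjacent 20 MHz channels around a shared center. A small sketch of that arithmetic; the helper names are illustrative, and only the standard 80 MHz blocks between channels 36 and 161 are covered:

```shell
# 5 GHz channel number -> center frequency in MHz.
chan_to_mhz() {
    echo $(( 5000 + 5 * $1 ))
}

# Primary 20 MHz channel -> center channel of the 80 MHz block containing it.
# Blocks start at 36, 52, 100, 116, 132 and 149; the center is the start + 6.
center80_chan() {
    c=$1
    if [ "$c" -ge 36 ] && [ "$c" -le 64 ]; then base=36
    elif [ "$c" -ge 100 ] && [ "$c" -le 144 ]; then base=100
    elif [ "$c" -ge 149 ] && [ "$c" -le 161 ]; then base=149
    else return 1
    fi
    echo $(( base + (c - base) / 16 * 16 + 6 ))
}

chan_to_mhz 48                        # 5240, the primary channel above
chan_to_mhz "$(center80_chan 48)"     # 5210, the center1 value above
```

Channels 36, 40, 44 and 48 all map to center channel 42, which is why setting channel 36 and seeing channel 48 in the output still describes the same 80 MHz block.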

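15. Deploying DDNS-GO

Above, we used RS and RA messages so that the WAN interface obtains a public IPv6 address from the ISP. That address changes unpredictably, which makes remote management awkward. So we deploy DDNS-GO to bind a domain name to the IPv6 address: it periodically checks whether the address has changed and, when it has, automatically updates the DNS record at the domain provider. This requires a domain name. I bought one in advance and completed its ICP filing.

First make sure the interface gets a single IPv6 address, with no "temporary address"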

nmcli con modify ens33 ipv6.method auto \
 ipv6.addr-gen-mode eui64 \
 ipv6.ip6-privacy 0
nmcli con up ens33
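With addr-gen-mode eui64, the low 64 bits of the address are derived from the NIC's MAC address, so the address only changes when the ISP prefix does. A sketch of the modified EUI-64 derivation (flip the universal/local bit of the first byte, insert ff:fe in the middle). mac_to_eui64 is a hypothetical helper and expects a lowercase, colon-separated MAC:

```shell
# Derive the modified EUI-64 interface identifier from a MAC address.
mac_to_eui64() {
    # Split aa:bb:cc:dd:ee:ff into its six bytes.
    IFS=: read -r b1 b2 b3 b4 b5 b6 <<EOF
$1
EOF
    # Flip bit 0x02 (universal/local) of the first byte.
    b1=$(printf '%02x' $(( 0x$b1 ^ 0x02 )))
    # Insert ff:fe between the third and fourth bytes.
    printf '%s%s:%sff:fe%s:%s%s\n' "$b1" "$b2" "$b3" "$b4" "$b5" "$b6"
}

mac_to_eui64 00:13:ef:6f:25:bd
```

The result is printed with leading zeros kept in each group. Tools such as ip usually drop them when displaying the full address.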

Deploy with Docker

docker pull jeessy/ddns-go
mkdir -p /Docker_Data/ddns-go
docker run -d --name ddns-go --restart=always --network=host -v /Docker_Data/ddns-go:/root jeessy/ddns-go

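Once a LAN client has obtained an address, enter the gateway address followed by port 9876 in a browser to reach the configuration page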
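DDNS-GO detects address changes on its own. To check by hand which address it should be publishing, the sketch below picks the first global, non-temporary IPv6 address out of `ip -o -6 addr show` output. global_ipv6 is a hypothetical helper, and the addresses in the test data come from the 2001:db8::/32 documentation range:

```shell
# Read `ip -o -6 addr show dev <iface>` output on stdin and print the first
# global, non-temporary IPv6 address without its prefix length.
global_ipv6() {
    awk '$3 == "inet6" && /scope global/ && !/temporary/ {
        sub(/\/.*/, "", $4)
        print $4
        exit
    }'
}

# Typical use (interface name taken from the commands above):
#   ip -o -6 addr show dev ens33 | global_ipv6
```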

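Next, to make remote management easier, open the SSH port on the WAN side for IPv6. Important: for security, always use key-based login with password login disabled, and ideally configure Fail2ban afterwards to stop SSH brute-force attempts. The modified input chain is shown below (note that it accepts TCP port 22 from the WAN over both IPv6 and IPv4)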

        chain input {
                type filter hook input priority filter; policy drop;

                ct state { established, related } accept

                iif "lo" accept

                iif $lan_int accept

                ip saddr $vpn_ip accept

                iif $wan_int ip6 nexthdr icmpv6 accept

                iif $wan_int ip6 nexthdr tcp tcp dport { 22 } accept

                iif $wan_int ip protocol tcp tcp dport { 22 } accept
        }

Reload the rule set

nft -f /etc/nftables.conf

Try logging in over SSH using the IPv6 address. Prerequisite: the SSH client itself must already have working IPv6 connectivity.

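16. Traffic Logging

There are plenty of professional-grade open-source monitoring systems, but for our router their rich feature sets are dead weight, yet we still want a good-looking traffic monitor. After a day of searching and test deployments, I found a web dashboard for NIC monitoring, written by a developer in China and built on the Docker version of vnstat. It uses very few resources and looks clean. The deployment steps follow.

Build the image (optional)

If you'd rather not build it yourself, you can use the author's prebuilt image me1dlinger/vnstat_dashboard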

git clone https://github.com/me1dlinger/vnstat_dashboard.git
cd vnstat_dashboard/vnstat_assist

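I found that the package mirror configured by the author did not work, and my device can reach the outside internet directly, so I removed the mirror configuration from the Dockerfile and then ran the command below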

docker build -t opennw/vnstat-dashboard .

After the build finishes, first deploy the Docker version of vnstat with the command below

docker run -d \
    --restart=always \
    --network=host \
    -e HTTP_PORT=9695 \
    -v /etc/localtime:/etc/localtime:ro \
    -v /etc/timezone:/etc/timezone:ro \
    --name vnstat \
    vergoh/vnstat

Then deploy the image you just built

mkdir -p /Docker_Data/vnstat/log/python
mkdir -p /Docker_Data/vnstat/backups
docker run -d \
  --name vnstat-dashboard \
  --network=host \
  --restart=always \
  -v /Docker_Data/vnstat/log/python:/app/log/python \
  -v /Docker_Data/vnstat/backups:/app/backups \
  -e VNA_AUTH_ENABLE=1 \
  -e VNSTAT_API_URL=http://127.0.0.1:9695/json.cgi \
  -e VNA_SECRET_KEY=public \
  -e VNA_EXPIRE_SECONDS=3600 \
  -e VNA_USERNAME=admin \
  -e VNA_PASSWORD=admin \
  opennw/vnstat-dashboard

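If you don't want authentication, set VNA_AUTH_ENABLE to 0. VNA_USERNAME, VNA_PASSWORD and VNA_SECRET_KEY can then be left unset, but you must make sure the service is secure enough and reachable only by you. (The web interface still asks for a user name and password, but you can enter anything.)

Once it is running, open http://xxxx:19328 in a browser to reach the login page. First click the settings icon in the top-right corner and enter your web interface address and the default NIC name. Note that this is the web interface address, not the address of vnstat's json.cgi.

After saving, enter the user name and password to reach the traffic monitoring page. Note that traffic is only recorded while the vnstat container is running, and on first deployment you need to wait about 5 minutes before any collected data appears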

17. Blocking Ads

smartdns can block ad domains by resolving them to 0.0.0.0. You can collect ad domains by hand or extract them from this project:

https://github.com/217heidai/adblockfilters
cd /etc/smartdns/
touch blacklist.txt

Add the ad domains to blacklist.txt
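If the list you download is in Adblock syntax, it needs to be reduced to bare domains first. The sketch below assumes that blacklist.txt should hold one domain per line and that the rules of interest have the plain `||domain^` form. Rules with extra options are skipped on purpose. adblock_to_domains is a hypothetical helper name:

```shell
# Extract bare domains from Adblock-style "||domain^" rules read on stdin.
# Comment lines, cosmetic rules and rules with "$options" do not match.
adblock_to_domains() {
    sed -n 's/^||\([A-Za-z0-9._-]\{1,\}\)\^$/\1/p' | sort -u
}

# Typical use (the input file name is a placeholder):
#   adblock_to_domains < downloaded_rules.txt >> /etc/smartdns/blacklist.txt
```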

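Add the following to smartdns.conf. Remember to place it before cnlist so that the blacklist is matched before the split-routing domain lists

# Ad-domain domain-set: resolve matches to 0 to block ads; a domain-set speeds up lookups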
domain-set -name blacklist -file /etc/smartdns/blacklist.txt
address /domain-set:blacklist/0 -a no -speed-check-mode none

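Restart smartdns. From then on, ad domains in the blacklist are blocked. In practice:

1. Ads disappear from some pages;

2. Images or pop-up styles that site authors embed directly in the page cannot be blocked.

There is also the AdGuard project for ad blocking, which is more advanced but of little use to me. Search for it on GitHub if you need it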

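18. Viewing Successful and Failed SSH Logins

On Ubuntu 24.04 and earlier I used last and lastb to list successful and failed SSH logins and check whether anyone was trying to brute-force the machine. For some reason, starting with Ubuntu 24.10 the last and lastb commands are no longer shipped. last can still be had by installing wtmpdb, but lastb can no longer be installed and has no replacement command.

The reason is that Ubuntu wants systemd's journalctl to replace the traditional rsyslog tooling. Below is an efficient, fast and good-looking SSH log viewer built on the systemd/sd-journal.h library

Install the tools and libraries needed for compilation

apt install g++ libsystemd-dev libcap-dev -y

Write the program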

vim sclient_show.cpp
#include <bits/stdc++.h>
#include <cstdio>
#include <cstring>
#include <termios.h>
#include <unistd.h>
#include <sys/select.h>
#include <signal.h>
#include <systemd/sd-journal.h>
#include <time.h>

using namespace std;

const string C_RESET = "\033[0m";
const string C_BRIGHT_WHITE = "\033[97;1m";
const string C_BRIGHT_YELLOW = "\033[93;1m";
const string C_BRIGHT_GREEN = "\033[92;1m";
const string C_BRIGHT_MAGENTA = "\033[95;1m";
const string C_BRIGHT_BLUE = "\033[94;1m";

string get_timestamp(uint64_t usec) {
    time_t sec = usec / 1000000ULL;
    struct tm tm_struct;
    localtime_r(&sec, &tm_struct);
    char buf[32];
    strftime(buf, sizeof(buf), "%Y-%m-%d %H:%M:%S", &tm_struct);
    return string(buf);
}

void sigint_handler(int sig) {
    fprintf(stdout, "%s------------------------------------------------------------------%s\n", C_BRIGHT_WHITE.c_str(), C_RESET.c_str());
    fprintf(stdout, "Terminated by user.\n");
    exit(0);
}

void print_help() {
    fprintf(stdout, "Usage: %s [-s | -f] [-p <num|all>] [-i <keyword>] [-u <user>] [-h]\n", "sclient_show");
    fprintf(stdout, "\nOptions:\n");
    fprintf(stdout, "  -s              Show successful login logs (default)\n");
    fprintf(stdout, "  -f              Show failed login logs\n");
    fprintf(stdout, "  -p <num|all>    Page size (default 30), or 'all' to show everything\n");
    fprintf(stdout, "  -i <keyword>    Filter logs containing keyword\n");
    fprintf(stdout, "  -u <user>       Filter logs for specific user\n");
    fprintf(stdout, "  -h              Show this help message\n");
    fprintf(stdout, "\nStatus field meanings:\n");
    fprintf(stdout, "  - Accepted: Authentication successful, login allowed\n");
    fprintf(stdout, "  - Failed: Authentication failed (e.g., wrong password, connection closed)\n");
    fprintf(stdout, "  - Invalid: Invalid user or credentials (Including empty username attempts, Where User field will be blank. For example: scanning tools or malicious scripts)\n");
    fprintf(stdout, "  - Disallowed: User not listed in AllowUsers, login not allowed\n");
    fprintf(stdout, "  - Closed: Connection closed during pre-authentication\n");
    exit(0);
}

bool wait_space() {
    fprintf(stdout, "......Press <space> for next page, Ctrl+C to quit......");
    fflush(stdout);
    int fd = STDIN_FILENO;
    struct termios oldt, newt;
    if (tcgetattr(fd, &oldt) != 0) return false;
    newt = oldt;
    cfmakeraw(&newt);
    tcsetattr(fd, TCSADRAIN, &newt);
    fd_set rfds;
    FD_ZERO(&rfds);
    FD_SET(fd, &rfds);
    int ret = select(fd + 1, &rfds, NULL, NULL, NULL);
    bool is_space = false;
    if (ret > 0 && FD_ISSET(fd, &rfds)) {
        char ch;
        if (read(fd, &ch, 1) == 1) {
            is_space = (ch == ' ');
        }
    }
    fprintf(stdout, "\r%*s\r", 60, " ");
    fflush(stdout);
    tcsetattr(fd, TCSADRAIN, &oldt);
    return is_space;
}

void process_logs(bool show_success, vector<vector<string>>& successes, vector<vector<string>>& failures) {
    // Optimization: Use systemd journal C API (libsystemd) for direct, efficient access to logs without spawning processes or text streaming
    sd_journal *j = NULL;
    int r = sd_journal_open(&j, SD_JOURNAL_LOCAL_ONLY);
    if (r < 0) {
        fprintf(stderr, "Failed to open journal: %s\n", strerror(-r));
        return;
    }

    // Filter by SSH unit (exact match)
    r = sd_journal_add_match(j, "_SYSTEMD_UNIT=ssh.service", 0);
    if (r < 0) goto close_journal;

    // Seek to the end (newest entries)
    sd_journal_seek_tail(j);

    // Iterate backwards (newest first)
    const void *data;
    size_t length;
    while (sd_journal_previous(j) > 0) {
        uint64_t usec = 0;
        if (sd_journal_get_realtime_usec(j, &usec) < 0) continue;

        string ts = get_timestamp(usec);

        if (sd_journal_get_data(j, "MESSAGE", &data, &length) < 0) continue;

        string message((const char*)data + 8, length - 8);  // Skip "MESSAGE=" prefix

        if (show_success) {
            // Accepted
            size_t pos = message.find("Accepted ");
            if (pos != string::npos) {
                string rest = message.substr(pos + 9);
                size_t for_pos = rest.find(" for ");
                if (for_pos != string::npos) {
                    string after_for = rest.substr(for_pos + 5);
                    istringstream iss(after_for);
                    string user;
                    iss >> user;
                    string token;
                    string ip;
                    while (iss >> token) {
                        if (token == "from") {
                            iss >> ip;
                            break;
                        }
                    }
                    if (!user.empty() && !ip.empty() && user.find("SHA256:") != 0) {
                        successes.push_back({ts, "Accepted", user, ip});
                    }
                }
            }
        } else {
            // Failed
            size_t failed_pos = message.find("Failed ");
            if (failed_pos != string::npos) {
                string rest = message.substr(failed_pos + 7);
                size_t for_pos = rest.find(" for ");
                if (for_pos != string::npos) {
                    string after_for = rest.substr(for_pos + 5);
                    istringstream iss(after_for);
                    string token;
                    iss >> token;
                    string user;
                    if (token == "invalid") {
                        iss >> token; // user
                        iss >> user;
                    } else {
                        user = token;
                    }
                    string ip;
                    while (iss >> token) {
                        if (token == "from") {
                            iss >> ip;
                            break;
                        }
                    }
                    if (!user.empty() && !ip.empty() && user.find("SHA256:") != 0) {
                        failures.push_back({ts, "Failed", user, ip});
                    }
                }
                continue;
            }

            // Invalid
            size_t invalid_pos = message.find("Invalid user ");
            if (invalid_pos != string::npos) {
                string rest = message.substr(invalid_pos + 13);
                istringstream iss(rest);
                string user;
                iss >> user;
                string ip;
                string token;
                if (user == "from") {
                    user = "";
                    iss >> ip;
                } else {
                    while (iss >> token) {
                        if (token == "from") {
                            iss >> ip;
                            break;
                        }
                    }
                }
                if (user.find("SHA256:") != 0) {
                    failures.push_back({ts, "Invalid", user, ip});
                }
                continue;
            }

            // Disallowed
            size_t disallowed_pos = message.find("not allowed because not listed in AllowUsers");
            if (disallowed_pos != string::npos) {
                size_t user_pos = message.find("User ");
                if (user_pos != string::npos) {
                    string rest = message.substr(user_pos + 5);
                    istringstream iss(rest);
                    string user;
                    iss >> user;
                    string token;
                    string ip;
                    while (iss >> token) {
                        if (token == "from") {
                            iss >> ip;
                            break;
                        }
                    }
                    if (!user.empty() && !ip.empty() && user.find("SHA256:") != 0) {
                        failures.push_back({ts, "Disallowed", user, ip});
                    }
                }
                continue;
            }

            // Closed
            size_t closed_pos = message.find("Connection closed by authenticating user ");
            if (closed_pos != string::npos && message.find("[preauth]") != string::npos) {
                string rest = message.substr(closed_pos + 41);
                istringstream iss(rest);
                string user;
                iss >> user;
                string ip;
                iss >> ip;
                if (!user.empty() && !ip.empty() && user.find("SHA256:") != 0) {
                    failures.push_back({ts, "Closed", user, ip});
                }
                continue;
            }
        }
    }

close_journal:
    sd_journal_close(j);
}

void update_colw(const vector<string>& row, vector<size_t>& colw) {
    for (size_t j = 0; j < row.size(); ++j) {
        size_t len = row[j].length();
        if (len > colw[j]) colw[j] = len;
    }
}

void append_row_to_buf(const vector<string>& row, char*& buf_ptr, size_t& remaining, const vector<size_t>& colw, const vector<string>& color_seq, size_t min_pad) {
    // Start with "|"
    memcpy(buf_ptr, C_BRIGHT_WHITE.c_str(), C_BRIGHT_WHITE.length()); buf_ptr += C_BRIGHT_WHITE.length(); remaining -= C_BRIGHT_WHITE.length();
    *buf_ptr++ = '|'; remaining -= 1;
    memcpy(buf_ptr, C_RESET.c_str(), C_RESET.length()); buf_ptr += C_RESET.length(); remaining -= C_RESET.length();

    for (size_t j = 0; j < row.size(); ++j) {
        // For each column (j>0): "|"
        if (j > 0) {
            memcpy(buf_ptr, C_BRIGHT_WHITE.c_str(), C_BRIGHT_WHITE.length()); buf_ptr += C_BRIGHT_WHITE.length(); remaining -= C_BRIGHT_WHITE.length();
            *buf_ptr++ = '|'; remaining -= 1;
            memcpy(buf_ptr, C_RESET.c_str(), C_RESET.length()); buf_ptr += C_RESET.length(); remaining -= C_RESET.length();
        }
        const string& field = row[j];
        size_t field_len = field.length();
        size_t pad_total = colw[j] - field_len + 2 * min_pad;
        size_t pad_left = pad_total / 2;
        size_t pad_right = pad_total - pad_left;
        if (field != "/") {
            size_t color_len = color_seq[j].length();
            memcpy(buf_ptr, color_seq[j].c_str(), color_len); buf_ptr += color_len; remaining -= color_len;
            if (pad_left > 0) {
                memset(buf_ptr, ' ', pad_left); buf_ptr += pad_left; remaining -= pad_left;
            }
            memcpy(buf_ptr, field.c_str(), field_len); buf_ptr += field_len; remaining -= field_len;
            if (pad_right > 0) {
                memset(buf_ptr, ' ', pad_right); buf_ptr += pad_right; remaining -= pad_right;
            }
            memcpy(buf_ptr, C_RESET.c_str(), C_RESET.length()); buf_ptr += C_RESET.length(); remaining -= C_RESET.length();
        } else {
            size_t empty_w = pad_left + pad_right;
            if (empty_w > 0) {
                memset(buf_ptr, ' ', empty_w); buf_ptr += empty_w; remaining -= empty_w;
            }
        }
    }
    // End with "|"
    memcpy(buf_ptr, C_BRIGHT_WHITE.c_str(), C_BRIGHT_WHITE.length()); buf_ptr += C_BRIGHT_WHITE.length(); remaining -= C_BRIGHT_WHITE.length();
    *buf_ptr++ = '|'; remaining -= 1;
    memcpy(buf_ptr, C_RESET.c_str(), C_RESET.length()); buf_ptr += C_RESET.length(); remaining -= C_RESET.length();
    *buf_ptr++ = '\n'; remaining -= 1;
}

void batch_print_rows(const vector<vector<string>>& rows, size_t start, size_t end, int out_fd, const vector<size_t>& colw, const vector<string>& color_seq, size_t min_pad) {
    char page_buf[65536];
    char* page_ptr = page_buf;
    size_t page_remaining = sizeof(page_buf);
    for (size_t j = start; j < end; ++j) {
        append_row_to_buf(rows[j], page_ptr, page_remaining, colw, color_seq, min_pad);
        if (page_remaining < 2048) {
            size_t written = sizeof(page_buf) - page_remaining;
            if (written > 0) {
                ssize_t res = write(out_fd, page_buf, written);
                (void)res;
            }
            page_ptr = page_buf;
            page_remaining = sizeof(page_buf);
        }
    }
    size_t final_written = sizeof(page_buf) - page_remaining;
    if (final_written > 0) {
        ssize_t res = write(out_fd, page_buf, final_written);
        (void)res;
    }
}

void print_header(FILE* out, const vector<size_t>& colw, size_t min_pad) {
    static const char* keys[4] = {"Timestamp", "Status", "User", "Source IP"};
    static const string header_colors[4] = {C_BRIGHT_BLUE, C_BRIGHT_BLUE, C_BRIGHT_BLUE, C_BRIGHT_BLUE};
    char header_buf[4096];
    char* hptr = header_buf;
    size_t hrem = sizeof(header_buf) - 1;

    // Start with "|"
    memcpy(hptr, C_BRIGHT_WHITE.c_str(), C_BRIGHT_WHITE.length()); hptr += C_BRIGHT_WHITE.length(); hrem -= C_BRIGHT_WHITE.length();
    *hptr++ = '|'; hrem -= 1;
    memcpy(hptr, C_RESET.c_str(), C_RESET.length()); hptr += C_RESET.length(); hrem -= C_RESET.length();

    for (size_t i = 0; i < 4; ++i) {
        // For each column (i>0): "|"
        if (i > 0) {
            memcpy(hptr, C_BRIGHT_WHITE.c_str(), C_BRIGHT_WHITE.length()); hptr += C_BRIGHT_WHITE.length(); hrem -= C_BRIGHT_WHITE.length();
            *hptr++ = '|'; hrem -= 1;
            memcpy(hptr, C_RESET.c_str(), C_RESET.length()); hptr += C_RESET.length(); hrem -= C_RESET.length();
        }
        size_t klen = strlen(keys[i]);
        size_t pad_total = colw[i] - klen + 2 * min_pad;
        size_t pad_left = pad_total / 2;
        size_t pad_right = pad_total - pad_left;
        size_t header_color_len = header_colors[i].length();
        memcpy(hptr, header_colors[i].c_str(), header_color_len); hptr += header_color_len; hrem -= header_color_len;
        if (pad_left > 0) {
            memset(hptr, ' ', pad_left); hptr += pad_left; hrem -= pad_left;
        }
        memcpy(hptr, keys[i], klen); hptr += klen; hrem -= klen;
        if (pad_right > 0) {
            memset(hptr, ' ', pad_right); hptr += pad_right; hrem -= pad_right;
        }
        memcpy(hptr, C_RESET.c_str(), C_RESET.length()); hptr += C_RESET.length(); hrem -= C_RESET.length();
    }
    // End with "|"
    memcpy(hptr, C_BRIGHT_WHITE.c_str(), C_BRIGHT_WHITE.length()); hptr += C_BRIGHT_WHITE.length(); hrem -= C_BRIGHT_WHITE.length();
    *hptr++ = '|'; hrem -= 1;
    memcpy(hptr, C_RESET.c_str(), C_RESET.length()); hptr += C_RESET.length(); hrem -= C_RESET.length();
    *hptr++ = '\n'; hrem -= 1;
    *hptr = '\0';
    fputs(header_buf, out);
    fflush(out);
}

void print_section_with_paging(vector<vector<string>>& entries, size_t page_size) {
    // Entries are already newest first from journal API

    vector<size_t> colw(4, 0);
    static const char* keys[4] = {"Timestamp", "Status", "User", "Source IP"};
    for (size_t i = 0; i < 4; ++i) colw[i] = strlen(keys[i]);
    for (const auto& row : entries) {
        update_colw(row, colw);
    }

    size_t min_pad = 4;  // Minimum padding on each side
    size_t num_cols = 4;
    size_t visible_len = num_cols + 1;
    for (auto w : colw) visible_len += w + 2 * min_pad;

    string dashline(visible_len, '-');

    vector<string> color_seq = {C_BRIGHT_YELLOW, C_BRIGHT_GREEN, C_BRIGHT_MAGENTA, C_BRIGHT_BLUE};

    fprintf(stdout, "%s%s%s\n", C_BRIGHT_WHITE.c_str(), dashline.c_str(), C_RESET.c_str());
    print_header(stdout, colw, min_pad);
    fprintf(stdout, "%s%s%s\n", C_BRIGHT_WHITE.c_str(), dashline.c_str(), C_RESET.c_str());
    fflush(stdout);

    size_t display_total = entries.size();
    int out_fd = STDOUT_FILENO;
    if (display_total <= page_size || page_size == SIZE_MAX) {  // SIZE_MAX means all
        batch_print_rows(entries, 0, display_total, out_fd, colw, color_seq, min_pad);
    } else {
        size_t i = 0;
        while (i < display_total) {
            size_t end = i + page_size;
            if (end > display_total) end = display_total;
            batch_print_rows(entries, i, end, out_fd, colw, color_seq, min_pad);
            i = end;
            if (i < display_total && !wait_space()) break;
        }
    }
    fprintf(stdout, "%s%s%s\n", C_BRIGHT_WHITE.c_str(), dashline.c_str(), C_RESET.c_str());
}

// Filter function: Filter entries by keyword and user
void filter_entries(vector<vector<string>>& entries, const string& keyword, const string& target_user) {
    if (keyword.empty() && target_user.empty()) return;

    vector<vector<string>> filtered;
    for (const auto& row : entries) {
        bool match = true;
        if (!keyword.empty()) {
            string full_line;
            for (const auto& field : row) {
                full_line += field + " ";
            }
            if (full_line.find(keyword) == string::npos) {
                match = false;
            }
        }
        if (match && !target_user.empty()) {
            if (row[2] != target_user) {  // row[2] is User
                match = false;
            }
        }
        if (match) {
            filtered.push_back(row);
        }
    }
    entries = std::move(filtered);
}

int main(int argc, char* argv[]) {
    static char buf[BUFSIZ];  // static: stdio still flushes through this buffer after main() returns or exit() runs
    setvbuf(stdout, buf, _IOFBF, BUFSIZ);

    ios::sync_with_stdio(false);
    cin.tie(nullptr);
    cout.tie(nullptr);

    bool show_success = true;
    string keyword;
    string target_user;
    size_t page_size = 30;

    int opt;
    while ((opt = getopt(argc, argv, "sfp:i:u:h")) != -1) {
        switch (opt) {
            case 's':
                show_success = true;
                break;
            case 'f':
                show_success = false;
                break;
            case 'p':
                if (strcmp(optarg, "all") == 0) {
                    page_size = SIZE_MAX;  // Means show all
                } else {
                    page_size = atoi(optarg);
                    if (page_size == 0) page_size = 30;  // Default
                }
                break;
            case 'i':
                keyword = optarg;
                break;
            case 'u':
                target_user = optarg;
                break;
            case 'h':
                print_help();
                break;
            default:
                fprintf(stderr, "Usage: %s [-s | -f] [-p <num|all>] [-i <keyword>] [-u <user>] [-h]\n", argv[0]);
                return 1;
        }
    }

    signal(SIGINT, sigint_handler);

    vector<vector<string>> successes, failures;
    successes.reserve(8192);
    failures.reserve(8192);
    process_logs(show_success, successes, failures);

    vector<vector<string>> entries;
    if (show_success) {
        entries = std::move(successes);  // Already newest first
    } else {
        // Use only failures since all failed logs are now collected there
        entries = std::move(failures);
    }

    // Apply filtering
    filter_entries(entries, keyword, target_user);

    if (entries.empty()) {
        fprintf(stdout, "No matching entries found.\n");
        return 0;
    }

    print_section_with_paging(entries, page_size);

    return 0;
}

Compile command

g++ -std=c++17 -O3 -static sclient_show.cpp -o sclient_show -lsystemd -lcap -static-libgcc -static-libstdc++

Usage:

Show successful logins

sclient_show
# or
sclient_show -s

Show failed logins

sclient_show -f

Show the usage help

sclient_show -h
Usage: sclient_show [-s | -f] [-p <num|all>] [-i <keyword>] [-u <user>] [-h]

Options:
  -s              Show successful login logs (default)
  -f              Show failed login logs
  -p <num|all>    Page size (default 30), or 'all' to show everything
  -i <keyword>    Filter logs containing keyword
  -u <user>       Filter logs for specific user
  -h              Show this help message

Status field meanings:
  - Accepted: Authentication successful, login allowed
  - Failed: Authentication failed (e.g., wrong password, connection closed)
  - Invalid: Invalid user or credentials (Including empty username attempts, Where User field will be blank. For example: scanning tools or malicious scripts)
  - Disallowed: User not listed in AllowUsers, login not allowed
  - Closed: Connection closed during pre-authentication

-i filters by keyword

sclient_show -i <keyword>

-u filters by user name

sclient_show -u <user>
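For a quick check on a machine where the tool has not been compiled, the same "Accepted ... for USER from IP" and "Failed ... for USER from IP" patterns that process_logs matches can be approximated with journalctl and awk. This is only a rough sketch, not a replacement for the program above. ssh_logins is a hypothetical helper, and the sample messages used to test it carry documentation-range IP addresses:

```shell
# Read sshd log messages (one per line) on stdin and print
# "STATUS USER IP" for every Accepted/Failed authentication message.
ssh_logins() {
    awk '
    /^(Accepted|Failed) / {
        status = $1; user = ""; ip = ""
        for (i = 2; i <= NF; i++) {
            if ($i == "for") {
                # "for invalid user NAME" has two extra words before the name
                if ($(i + 1) == "invalid") user = $(i + 3)
                else user = $(i + 1)
            }
            if ($i == "from") { ip = $(i + 1); break }
        }
        if (user != "" && ip != "") print status, user, ip
    }'
}

# Typical use (same unit name as the C++ program filters on):
#   journalctl -u ssh.service -o cat --no-pager | ssh_logins
```

Unlike sclient_show, this sketch prints oldest entries first and ignores the Invalid, Disallowed and Closed cases.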

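19. Protecting the Router with Fail2ban

This would make the post too long, so I won't cover it here. See my original article, which explains in detail how to use Fail2ban against SSH brute-force attempts, Nginx CC attacks and similar behavior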

https://blog.opennw.cn/archives/fail2ban

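20. Switching to the XanMod Kernel

The default Linux kernel is designed as a general-purpose solution that offers broad compatibility across different systems and hardware configurations. It is stable, reliable and widely tested, but it does not always deliver the best performance for a specific use case.

Custom kernels such as XanMod fill that gap. XanMod is based on the latest stable Linux kernel and aims to improve system responsiveness through low latency. It is a community-driven project that combines the best features of other kernels with its own enhancements, focusing on desktop, multimedia and gaming workloads to give a more responsive, smoother Linux experience. It also brings a noticeable gain on a software router

On older Linux distributions, switching to a custom kernel like XanMod can bring significant performance improvements. Note, however, that running a custom kernel takes more technical knowledge, and it may not be as stable as the default kernel.

Advantages of XanMod 6.18.5 over the default 6.18 kernel for packet forwarding and VPN

XanMod is a kernel branch optimized for desktops and servers. It integrates upstream patches, backports (such as BBRv3) and custom tweaks (such as the Cloudflare TCP patch), whereas vanilla 6.18 is Linus's mainline, which prioritizes stability and carries few custom optimizations. XanMod does better in network-intensive tasks, especially routing and VPN.

  • Packet forwarding (routing/NAT) advantages:

    • High timer frequency and preemption: a 1000 Hz tickless kernel with full preemption, versus 300 Hz on vanilla. A higher tick rate reduces scheduling latency; at high PPS (e.g. 10-40 Mpps) context switches are 20-50% faster and small packets (such as DNS) are forwarded more smoothly.

    • Congestion and queue management: built-in BBRv3 (fairer than vanilla BBR) and the CAKE qdisc (against bufferbloat). BBRv3 raises throughput by 10-20% on high-latency links (such as transoceanic routes); FLOWOFFLOAD hardware-accelerates NAT and offloads 15-30% of the CPU work.

    • TCP/Netfilter optimizations: the Cloudflare patch handles small-packet storms (TCP collapse processing, saving about 10% CPU); full-cone NAT is supported natively (vanilla needs extra work). Under high concurrency, conntrack is more efficient and ports are recycled faster.

    • Overall: benchmarks show 5-15% higher forwarding throughput with lower CPU usage (e.g. 10 Gbps line rate without a bottleneck). A good fit for a software router, avoiding vanilla's conservative scheduling.

  • VPN advantages:

    • Low-latency focus: full preemption plus RCU Boost keeps encryption and decryption smooth, lowering RTT by 5-10 ms (e.g. over a WireGuard tunnel).

    • Multitasking: the BFQ I/O scheduler is 5-15% faster under logging/encryption load; AMD/Intel-specific optimizations (such as V-Cache) improve cache hit rates.

    • Throughput: BBRv3 plus TCP tuning raises speed by 10-20% on lossy links; FLOWOFFLOAD accelerates IPsec.

    • Overall: XanMod is steadier with 1000+ tunnels and 10-25% faster, at slightly higher idle power. Vanilla suits simple scenarios, but XanMod is the better match for high-performance VPN.

These advantages come from XanMod's patch set (influenced by Zen and Liquorix), which is updated regularly to stay current.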

Preparation before installation

Update the system

apt update && sudo apt upgrade

Add and register the APT repository

wget -qO - https://dl.xanmod.org/archive.key | sudo gpg --dearmor -vo /etc/apt/keyrings/xanmod-archive-keyring.gpg
echo "deb [signed-by=/etc/apt/keyrings/xanmod-archive-keyring.gpg] http://deb.xanmod.org $(lsb_release -sc) main" | sudo tee /etc/apt/sources.list.d/xanmod-release.list

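Update the apt package index

apt update

Install XanMod

Check which kernel build your CPU supports

Note: be sure to install the build that matches your CPU, otherwise the system will fail to boot!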

awk -f <(wget -O - https://dl.xanmod.org/check_x86-64_psabi.sh)

The output clearly shows a v2, v3 or v4 marker; choose the matching XanMod kernel accordingly

Alternatively, list all supported levels

/lib/x86_64-linux-gnu/ld-linux-x86-64.so.2 --help | grep -i 'supported'
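Both checks come down to comparing the CPU's feature flags with the requirements of each x86-64 microarchitecture level. The sketch below shows that comparison offline, using the flag names from /proc/cpuinfo. It is an illustration of the idea, not the official XanMod script, and has_all and x86_64_level are hypothetical helper names:

```shell
# Succeed if every flag given after the first argument appears in the
# space-separated flag list passed as the first argument.
has_all() {
    flags=" $1 "; shift
    for f in "$@"; do
        case "$flags" in *" $f "*) ;; *) return 1 ;; esac
    done
}

# Print the highest x86-64 level (0-4) satisfied by a flag list.
x86_64_level() {
    level=0
    has_all "$1" lm cmov cx8 fpu fxsr mmx syscall sse2 && level=1
    [ "$level" -eq 1 ] && has_all "$1" cx16 lahf_lm popcnt sse4_1 sse4_2 ssse3 && level=2
    [ "$level" -eq 2 ] && has_all "$1" avx avx2 bmi1 bmi2 f16c fma abm movbe xsave && level=3
    [ "$level" -eq 3 ] && has_all "$1" avx512f avx512bw avx512cd avx512dq avx512vl && level=4
    echo "$level"
}

# Typical use on the router itself:
#   x86_64_level "$(grep -m1 '^flags' /proc/cpuinfo | cut -d: -f2)"
```

A CPU without AVX and AVX2 stops at level 2, in which case the x64v2 package is the one to install rather than x64v3.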

Install the kernel

The commands below install the XanMod x64v3 kernel (package linux-xanmod-x64v3) as an example; adjust them to your needs

apt list 'linux-xanmod*'
apt install linux-xanmod-x64v3

List all kernels

dpkg --list | grep -E -i --color 'linux-image|linux-headers'

If the installation succeeded, the XanMod x64v3 kernel appears in the list

ii  linux-headers-6.14.0-33               6.14.0-33.33                             all          Header files related to Linux kernel version 6.14.0
ii  linux-headers-6.14.0-33-generic       6.14.0-33.33                             amd64        Linux kernel headers for version 6.14.0
ii  linux-headers-6.14.0-36               6.14.0-36.36                             all          Header files related to Linux kernel version 6.14.0
ii  linux-headers-6.14.0-36-generic       6.14.0-36.36                             amd64        Linux kernel headers for version 6.14.0
ii  linux-headers-6.18.5-x64v3-xanmod1    6.18.5-x64v3-xanmod1-0~20260111.g3f3cf7e amd64        Linux kernel headers for 6.18.5-x64v3-xanmod1 on amd64
ii  linux-headers-generic                 6.14.0-36.36                             amd64        Generic Linux kernel headers
ii  linux-image-6.14.0-33-generic         6.14.0-33.33                             amd64        Signed kernel image generic
ii  linux-image-6.14.0-36-generic         6.14.0-36.36                             amd64        Signed kernel image generic
ii  linux-image-6.18.5-x64v3-xanmod1      6.18.5-x64v3-xanmod1-0~20260111.g3f3cf7e amd64        Linux kernel, version 6.18.5-x64v3-xanmod1
ii  linux-image-generic                   6.14.0-36.36                             amd64        Generic Linux kernel image

Reboot the system

reboot

Verify the installation

cat /proc/version

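Remove unneeded kernels and configuration files

The kernel is installed. If the system still boots an older kernel, remove the other kernels so that XanMod takes effect. You can also keep them and select the default boot kernel instead; search for a tutorial on that if needed.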

apt purge linux-headers-6.14.0-33  linux-headers-6.14.0-33-generic linux-headers-6.14.0-36  linux-headers-6.14.0-36-generic linux-headers-generic linux-image-6.14.0-33-generic linux-image-6.14.0-36-generic linux-image-generic  
apt autoremove --purge

Check the running kernel

uname -r

Check the BBRv3 status

modinfo tcp_bbr

If you get the error modinfo: ERROR: Module tcp_bbr not found., run sudo depmod and check again.

The XanMod kernel is now installed and in effect.

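21. Packet Forwarding Tuning

Disable power saving on the wireless adapter so that it always runs at full performance

nmcli con modify SoftRouting 802-11-wireless.powersave disable

Increase the NIC's receive and transmit ring buffer sizes to avoid packet loss under heavy traffic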

nmcli con modify eth0 ethtool.ring-rx 8192 ethtool.ring-tx 8192
nmcli con modify eth1 ethtool.ring-rx 8192 ethtool.ring-tx 8192
nmcli con modify eth2 ethtool.ring-rx 8192 ethtool.ring-tx 8192
nmcli con modify eth3 ethtool.ring-rx 8192 ethtool.ring-tx 8192
nmcli con modify SoftRouting ethtool.ring-rx 8192 ethtool.ring-tx 8192
nmcli con modify lan ethtool.ring-rx 8192 ethtool.ring-tx 8192

NIC feature tuning

# Apply the same offload settings to every connection
for conn in eth0 eth1 eth2 eth3 SoftRouting; do
    nmcli connection modify "$conn" \
        ethtool.feature-gro on \
        ethtool.feature-gso on \
        ethtool.feature-tso on \
        ethtool.feature-rx on \
        ethtool.feature-tx on \
        ethtool.feature-sg on \
        ethtool.feature-rxhash on \
        ethtool.feature-ntuple on \
        ethtool.channels-combined 4 \
        ethtool.coalesce-rx-usecs 0 \
        ethtool.feature-rx-udp-gro-forwarding on \
        ethtool.feature-rx-gro-list off \
        ethtool.feature-tx-udp-segmentation on \
        ethtool.feature-tx-udp_tnl-segmentation on \
        ethtool.feature-tx-udp_tnl-csum-segmentation on \
        ethtool.feature-lro on \
        ethtool.feature-tx-tcp-segmentation on \
        ethtool.feature-tx-tcp-ecn-segmentation on \
        ethtool.feature-tx-tcp6-segmentation on
done

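Explanation

  • ethtool.feature-gro on. Full name: Generic Receive Offload (GRO).
    Effect: on the receive side, merges multiple small packets (segments) of the same flow into larger ones, so the kernel processes fewer packets and spends less CPU. Suited to heavy receive traffic; can raise PPS (packets per second) by roughly 10-20%.
    Use when: downloading or forwarding large volumes of traffic; recommended.

  • ethtool.feature-gso on. Full name: Generic Segmentation Offload (GSO).
    Effect: on the transmit side, lets the kernel keep a large packet (> MTU) in one piece through the stack and split it as late as possible, right before it reaches the driver, or hand it to the NIC when hardware segmentation is available. GSO is the generic counterpart of TSO and also covers protocols such as UDP.
    Use when: high-bandwidth uploads, to lower CPU load.

  • ethtool.feature-tso on. Full name: TCP Segmentation Offload (TSO).
    Effect: the TCP-specific hardware form of GSO; the NIC segments large TCP packets and computes their checksums. Faster than software segmentation, especially on 10G+ links.
    Use when: TCP-dominated traffic (web, file transfer); use together with GSO.

  • ethtool.feature-rx on. Full name: Receive Checksum Offload (RX checksum offload).
    Effect: moves checksum verification of received packets from the CPU to the NIC hardware, which speeds up packet processing.
    Use when: all receive traffic; usually on by default, but setting it explicitly keeps the configuration consistent.

  • ethtool.feature-tx on. Full name: Transmit Checksum Offload (TX checksum offload).
    Effect: the NIC computes and fills in checksums for outgoing packets, taking that work off the CPU.
    Use when: high-throughput sending; pair it with RX checksum offload.

  • ethtool.feature-sg on. Full name: Scatter-Gather I/O.
    Effect: lets the NIC transfer data from non-contiguous (scatter-gather) memory buffers, which makes DMA (direct memory access) more efficient and avoids memory copies.
    Use when: large packets or multi-buffer transfers; improves overall I/O performance.

  • ethtool.feature-rxhash on. Full name: Receive Hashing.
    Effect: computes a hash over the packet headers (IP/TCP/UDP tuple) and uses it to spread received traffic evenly across multiple RX queues (RSS), balancing load across cores.
    Use when: multi-queue NIC plus multi-core CPU, to avoid a single-queue bottleneck.

  • ethtool.feature-ntuple on. Full name: n-Tuple Filtering.
    Effect: extends RX hashing with finer-grained flow rules (n-tuple, e.g. source/destination port) that steer packets to specific queues, further improving traffic classification and load balancing. Needs rxhash.
    Use when: complex traffic (virtualization, NFV) that needs precise distribution.

  • ethtool.channels-combined 4.
    Effect: sets the number of combined channels (queues) on the NIC to 4. "Combined" means the RX (receive) and TX (transmit) queues share these channels instead of being configured separately. This enables RSS (Receive Side Scaling) and multi-queue support, spreading incoming and outgoing packets across several CPU cores. Usually set to the number of CPU cores.
    Use when: load balancing and reducing lock contention, well suited to multi-flow forwarding (e.g. nftables offload). Works even better combined with RPS/RFS.

  • ethtool.coalesce-rx-usecs 0.
    Effect: sets the RX interrupt coalescing delay to 0 microseconds. The NIC raises a CPU interrupt immediately after a packet arrives instead of waiting to accumulate several packets first (the default is often around 50-100 us).
    Use when: heavy forwarding load; shortens the time packets sit in buffers, PPS +10-20%. Good for real-time, low-latency routing, at the cost of higher CPU load.

  • rx-udp-gro-forwarding.
    Effect: designed for the forwarding path; allows GRO (Generic Receive Offload) to be applied to UDP packets that are being forwarded, merging small packets into larger ones.
    Pros: a marked gain in UDP throughput (20-100%) and lower CPU overhead, especially for UDP-heavy traffic such as QUIC/HTTP/3, Tailscale or WireGuard. Recommended on Linux 6.2+ kernels for exit nodes and routers.
    Cons: if rx-gro-list is enabled it takes precedence, which limits the effect of rx-udp-gro-forwarding and may lower UDP performance; in some network namespaces or tunnels (e.g. VXLAN/Geneve) it may break traffic.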

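Re-activate the connections for the settings to take effect. Note that this tuning cannot be applied to the virtual LAN interface itself, but once the physical NICs are tuned the LAN benefits as well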

nmcli con up SoftRouting
nmcli con up eth0
nmcli con up eth1
nmcli con up eth2
nmcli con up eth3
nmcli con up lan
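
Not every driver accepts every feature in the profile, and some are reported by the hardware as fixed. The following is a minimal sketch for checking the result, assuming `ethtool -k` prints its usual `name: on|off [fixed]` lines; the function name `report_disabled_features` is made up for illustration.

```shell
#!/bin/sh
# Sketch: list the offload features that are still off on an interface.
# Reads the output of `ethtool -k <iface>` on stdin.
# report_disabled_features is a hypothetical helper name.
report_disabled_features() {
    awk -F': ' '
        $2 ~ /^off/ {
            gsub(/^[ \t]+/, "", $1)    # sub-features are indented with a tab
            note = ($2 ~ /\[fixed\]/) ? "fixed by driver" : "can be changed"
            printf "%s (%s)\n", $1, note
        }'
}
```

Usage would look like `ethtool -k eth1 | report_disabled_features`. Features marked "fixed by driver" cannot be switched on through nmcli, so they can be dropped from the profile.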

Enable hardware offload tuning for VPN traffic on the NICs

nmcli con modify eth0 ethtool.feature-tx-esp-segmentation on ethtool.feature-tx-gre-segmentation on ethtool.feature-tx-udp_tnl-segmentation on
nmcli con modify eth1 ethtool.feature-tx-esp-segmentation on ethtool.feature-tx-gre-segmentation on ethtool.feature-tx-udp_tnl-segmentation on
nmcli con modify eth2 ethtool.feature-tx-esp-segmentation on ethtool.feature-tx-gre-segmentation on ethtool.feature-tx-udp_tnl-segmentation on
nmcli con modify eth3 ethtool.feature-tx-esp-segmentation on ethtool.feature-tx-gre-segmentation on ethtool.feature-tx-udp_tnl-segmentation on
nmcli con up eth0
nmcli con up eth1
nmcli con up eth2
nmcli con up eth3
nmcli con up lan
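
  • tx-esp-segmentation (ESP transmit segmentation offload): ESP is the encryption protocol of IPsec VPNs, used for secure tunnels. When a packet is too large and has to be split, this feature lets the NIC hardware segment the ESP-encapsulated packets instead of the CPU. Effect: lower CPU overhead and higher IPsec VPN throughput, without the latency of software segmentation. In security-sensitive networks (e.g. corporate VPNs) this can improve performance noticeably.

  • tx-gre-segmentation (GRE transmit segmentation offload): GRE (Generic Routing Encapsulation) creates point-to-point tunnels (e.g. to carry multi-protocol traffic between routers). This feature lets the NIC hardware segment GRE-encapsulated packets. Effect: faster packet handling on GRE tunnels and lower CPU utilization; useful for overlay networks or legacy-protocol migration, and raises overall throughput.

  • tx-gre-csum-segmentation (GRE transmit checksum segmentation offload): like the previous one, but also covers GRE packets that carry a checksum; the NIC computes the GRE checksum and segments at the same time. Effect: takes further load off the CPU, especially with checksum-heavy traffic. Listed for reference; the commands above do not enable it.

  • tx-udp_tnl-segmentation (UDP tunnel transmit segmentation offload): VXLAN (Virtual eXtensible LAN) uses UDP as its tunnel protocol to extend L2 networks. This feature lets the NIC segment UDP tunnel packets such as VXLAN. Effect: faster VXLAN overlay networking and less CPU load in cloud or virtualized environments (Kubernetes, OpenStack), so large VM/container networks do not become a bottleneck.

  • tx-udp_tnl-csum-segmentation (UDP tunnel transmit checksum segmentation offload): like the previous one, but also covers checksum computation for UDP tunnel packets. Effect: combined with checksum offload, further improves the efficiency of VXLAN and other UDP-based tunnels.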

Monitoring

apt install sysstat -y
watch -n1 'ethtool -S eth1 | grep -E "rx|tx" && mpstat -P ALL 1 1'
sar -n DEV 1
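
For a quick number without extra packages, the per-interface counters under /sys/class/net can be sampled twice and divided by the interval. This is only a sketch: `rate_per_sec` and `sample_pps` are hypothetical helper names, and the optional third argument exists purely so the counter directory can be swapped out for testing.

```shell
#!/bin/sh
# rate_per_sec PREV CUR SECONDS -> integer rate per second
rate_per_sec() {
    echo $(( ($2 - $1) / $3 ))
}

# sample_pps IFACE [SECONDS] [SYSFS_ROOT]
# Reads rx_packets/tx_packets twice and prints packets per second.
sample_pps() (
    iface=$1; secs=${2:-1}; root=${3:-/sys/class/net}
    stats="$root/$iface/statistics"
    rx0=$(cat "$stats/rx_packets"); tx0=$(cat "$stats/tx_packets")
    sleep "$secs"
    rx1=$(cat "$stats/rx_packets"); tx1=$(cat "$stats/tx_packets")
    echo "rx_pps=$(rate_per_sec "$rx0" "$rx1" "$secs") tx_pps=$(rate_per_sec "$tx0" "$tx1" "$secs")"
)
```

For example, `sample_pps eth1 5` averages over five seconds, which smooths out short bursts.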

Enable nftables flowtables to speed up forwarding and lower CPU usage

table inet filter {
        # flowtable: packet offload; matching packets bypass the prerouting, routing, forwarding and postrouting hooks and leave netfilter directly, which saves CPU
        flowtable myft {
                hook ingress priority filter;
                devices = { enp1s0, lan, wlp6s0 }
                counter
        }

        chain forward {
                type filter hook forward priority filter; policy drop;

                # Put established and related connections into the flowtable (must come before the accept rule, otherwise the packet is accepted first and never reaches this rule)
                ct state { established, related } flow add @myft
                # Accept established and related connections (stateful firewall)
                ct state { established, related } accept

                iif $lan_int accept

                ip saddr $vpn_ip accept

        }
}

nft -f /etc/nftables.conf
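
To see whether flows actually end up in the flowtable, the conntrack table can be inspected: offloaded entries carry an `[OFFLOAD]` (or `[HW_OFFLOAD]`) flag in `conntrack -L`. A rough sketch, assuming the conntrack tool is installed; `count_offloaded` is a hypothetical helper name.

```shell
#!/bin/sh
# count_offloaded: reads `conntrack -L` output on stdin and prints
# "<offloaded entries> <total entries>".
count_offloaded() {
    awk '/\[(HW_)?OFFLOAD\]/ { off++ } NF { total++ } END { printf "%d %d\n", off, total }'
}
```

Run it as `conntrack -L 2>/dev/null | count_offloaded` while a long download is in progress; `nft list flowtable inet filter myft` shows the flowtable definition itself.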

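Install irqbalance to distribute interrupts across CPU cores and avoid contention

apt install irqbalance -y
systemctl enable irqbalance

Check the irqbalance log

If the message shown above appears, apply the following override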

mkdir -p /etc/systemd/system/irqbalance.service.d
tee /etc/systemd/system/irqbalance.service.d/override.conf > /dev/null <<EOF
[Service]
ProtectKernelTunables=no
EOF
systemctl daemon-reload
systemctl restart irqbalance

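After restarting the service, wait a few minutes and check the log again. If the message shown below appears

it is most likely because those IRQs are claimed by other drivers on the device (NVMe, for example). Leaving them alone is usually the best choice; forcing them may hurt stability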

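Install cpufrequtils and switch to performance mode

Pin the CPU at high frequency so that forwarding is not interrupted by frequency changes

apt install cpufrequtils -y

In /etc/default/cpufrequtils or /etc/init.d/cpufrequtils, change GOVERNOR="ondemand" to GOVERNOR="performance"

Restart cpufrequtils

systemctl restart cpufrequtils
systemctl enable cpufrequtils

If the output shown above appears, the setting has succeeded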
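
The active governor can also be read straight from sysfs. A small sketch, assuming the standard cpufreq layout under /sys/devices/system/cpu; `non_performance_cpus` is a hypothetical helper name and the optional argument only exists to point it at a different directory.

```shell
#!/bin/sh
# non_performance_cpus [CPU_ROOT]
# Prints "cpuN: governor" for every CPU whose governor is not "performance".
# No output means all CPUs run the performance governor.
non_performance_cpus() (
    root=${1:-/sys/devices/system/cpu}
    for f in "$root"/cpu[0-9]*/cpufreq/scaling_governor; do
        [ -r "$f" ] || continue
        gov=$(cat "$f")
        cpu=${f#"$root"/}     # cpu0/cpufreq/scaling_governor
        cpu=${cpu%%/*}        # cpu0
        [ "$gov" = "performance" ] || echo "$cpu: $gov"
    done
)
```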

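Tune kernel parameters

Ubuntu is a general-purpose server system, and some of its default kernel parameters are chosen for use as a server. Since this machine is going to be a router, a number of them need adjusting

Overwrite /etc/sysctl.d/90-softrouting.conf with the following content

# Reverse path filtering, guards against IP spoofing and DDoS (0 = off, suits a VPN server and avoids compatibility problems; without a VPN, 1 is recommended)
net.ipv4.conf.default.rp_filter=0
net.ipv4.conf.all.rp_filter=0

# Enable IPv4 and IPv6 packet forwarding (the core function of a router)
net.ipv4.ip_forward=1
net.ipv6.conf.all.forwarding=1

# ECMP hash policy (1 = hash on layer 4)
net.ipv4.fib_multipath_hash_policy=1

# Do not accept ICMP redirects, guards against MITM attacks
net.ipv4.conf.all.accept_redirects=0
net.ipv6.conf.all.accept_redirects=0
net.ipv4.conf.default.accept_redirects=0
net.ipv6.conf.default.accept_redirects=0

# Do not send ICMP redirects, further hardens routing
net.ipv4.conf.all.send_redirects=0
net.ipv4.conf.default.send_redirects=0

# Do not accept source-routed packets, guards against IP spoofing
net.ipv4.conf.all.accept_source_route=0
net.ipv4.conf.default.accept_source_route=0

# Ignore broadcast ICMP echo requests, guards against amplification attacks
net.ipv4.icmp_echo_ignore_broadcasts=1

# Ignore bogus ICMP error responses
net.ipv4.icmp_ignore_bogus_error_responses=1

# Disable TCP SYN cookies; the XanMod kernel's TCP patches already improve handling here, and leaving cookies on may reduce efficiency in high-performance scenarios
net.ipv4.tcp_syncookies=0
net.ipv4.tcp_rfc1337 = 1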

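# Conntrack timeout for half-open TCP connections (SYN_RECV): 10 s
net.netfilter.nf_conntrack_tcp_timeout_syn_recv=10

# Conntrack timeout for established connections shortened to 1800 s
net.netfilter.nf_conntrack_tcp_timeout_established=1800

# Conntrack UDP timeout
net.netfilter.nf_conntrack_udp_timeout=600

# Shorter FIN timeout so connections close faster
net.ipv4.tcp_fin_timeout=15

# Loose tracking
net.netfilter.nf_conntrack_tcp_loose=1

# Shorter conntrack timeout for TCP CLOSE_WAIT
net.netfilter.nf_conntrack_tcp_timeout_close_wait=60

# Do not save TCP metrics for reuse by new connections, speeds up connection setup
net.ipv4.tcp_no_metrics_save=1

# Shorter conntrack timeout for TCP TIME_WAIT so table entries are released sooner
net.netfilter.nf_conntrack_tcp_timeout_time_wait=30

# Generic conntrack timeout of 600 s for protocols other than TCP/UDP; prevents resource leaks
net.netfilter.nf_conntrack_generic_timeout=600

# Raise the maximum number of tracked connections for highly concurrent routing
net.netfilter.nf_conntrack_max=1048576

# Be more tolerant of out-of-order packets
net.netfilter.nf_conntrack_tcp_be_liberal=1

# Legacy alias of net.netfilter.nf_conntrack_max
net.nf_conntrack_max = 1048576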

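# Core dumps: append the PID to the file name
kernel.core_uses_pid=1

# TCP keepalive time 300 s (5 minutes), detects dead connections
net.ipv4.tcp_keepalive_time=300

# Keepalive probe interval 15 s
net.ipv4.tcp_keepalive_intvl=15

# At most 5 keepalive probes
net.ipv4.tcp_keepalive_probes=5
net.ipv4.tcp_synack_retries=2

# Local port range 2000-65535, allows more concurrent connections
net.ipv4.ip_local_port_range=2000 65535

# TCP receive buffer (min, default, max), tuned for high-latency networks
net.ipv4.tcp_rmem=8192 87380 33554432

# TCP send buffer (min, default, max), improves send performance
net.ipv4.tcp_wmem=8192 87380 33554432

# Maximum socket receive buffer size
net.core.rmem_max=67108864

# Maximum socket send buffer size
net.core.wmem_max=67108864

# Minimum UDP receive buffer, helps UDP traffic (DNS, DHCP); avoids dropping small packets
net.ipv4.udp_rmem_min = 16384

# Minimum UDP send buffer, improves UDP burst performance
net.ipv4.udp_wmem_min = 16384

# Enable receive buffer auto-tuning: sizes adapt to network conditions and memory is not wasted
net.ipv4.tcp_moderate_rcvbuf=1

# Limit the advertised window so it does not over-inflate while still allowing large windows; this comes from Cloudflare's tuning and can reduce retransmissions on lossy links
net.ipv4.tcp_adv_win_scale=-2

# TCP listen queue length, allows more concurrency
net.core.somaxconn=65536

# SYN backlog length, guards against SYN floods
net.ipv4.tcp_max_syn_backlog=8192

# Maximum length of the per-CPU input queue, avoids drops under heavy load
net.core.netdev_max_backlog=500000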

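# Enable TCP window scaling, improves performance over long-distance links
net.ipv4.tcp_window_scaling=1

# Use BBR as the TCP congestion control algorithm, suits high bandwidth-delay networks
net.ipv4.tcp_congestion_control=bbr

# Enable TCP Fast Open (3 = client and server), saves an RTT on connection setup
net.ipv4.tcp_fastopen=3

# Switch to fq_codel, the queue discipline recommended by VyOS
net.core.default_qdisc=fq_codel

# IPv6 route garbage-collection threshold
net.ipv6.route.gc_thresh=1024

# Accept IPv6 RA messages even with forwarding enabled, so an IPv6 address can be obtained via SLAAC
net.ipv6.conf.all.accept_ra=2

# Maximum number of queued fanotify events, improves monitoring performance
fs.fanotify.max_queued_events=65536

# Maximum number of watched files/directories per user
fs.inotify.max_user_watches=524288

# System-wide maximum number of file descriptors
fs.file-max=2097152

# ARP (neighbor) cache garbage-collection thresholds, keeps the table from overflowing
net.ipv4.neigh.default.gc_thresh1=4096
net.ipv4.neigh.default.gc_thresh2=8192
net.ipv4.neigh.default.gc_thresh3=16384

# Allow TIME_WAIT sockets to be reused for new outgoing connections; speeds up port recycling by 15-20%
net.ipv4.tcp_tw_reuse=1

# Raise the TIME_WAIT bucket limit, guards against port exhaustion; required for high concurrency
net.ipv4.tcp_max_tw_buckets=1440000

# Enable MTU probing (1 = only after an ICMP black hole is detected), helps with Jumbo Frames; about 10% more large-packet throughput
net.ipv4.tcp_mtu_probing=1

# Larger ancillary buffer for UDP/TCP options; low-overhead tuning
net.core.optmem_max=25165824
net.ipv4.tcp_mem = 786432 1048576 26777216

# Number of hash buckets = conntrack_max / 4; faster lookups, less CPU
net.netfilter.nf_conntrack_buckets=262144

# Raise the maximum number of IPv4 route entries
net.ipv4.route.max_size=1048576

# Raise the maximum number of IPv6 route entries
net.ipv6.route.max_size=131072

# Timeout for route cache garbage collection (GC)
net.ipv4.route.gc_timeout=100
net.ipv6.route.gc_timeout=100

# Validity period of ARP cache entries (this key is in seconds; use base_reachable_time_ms if milliseconds are intended)
net.ipv4.neigh.default.base_reachable_time=30000

# Interval for checking stale ARP cache entries (in seconds)
net.ipv4.neigh.default.gc_stale_time=75000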

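# Hide kernel pointers, improves security
kernel.kptr_restrict=2

# Disable scheduler autogrouping to reduce scheduling overhead
kernel.sched_autogroup_enabled=0

# Maximum packets per polling cycle (default 300, raised to 6000 to handle high PPS)
net.core.netdev_budget = 6000

# Maximum microseconds per polling cycle (default 2000, raised to 9000 to reduce switching, suits > 1 Gbps)
net.core.netdev_budget_usecs=9000
net.core.rmem_default=67108864
net.core.wmem_default=67108864

# Enable busy polling, suits low-latency VPNs; in high-PPS scenarios (small-packet floods, DDoS) busy polling cuts interrupt overhead and saves 15-30% CPU, at the cost of slightly higher idle CPU usage
net.core.busy_poll=50
net.core.busy_read=50

# NUMA tuning: 0 for a single CPU, 1 or 4 for multiple CPUs
vm.zone_reclaim_mode=0

# Minimize swapping (0 does not switch swap off entirely)
vm.swappiness = 0
vm.dirty_ratio = 60
vm.dirty_background_ratio = 2

# Prefer low latency (legacy option, no effect on kernels since 4.14)
net.ipv4.tcp_low_latency=1

# RPS/RFS: size of the flow table used to steer packets to CPUs
net.core.rps_sock_flow_entries = 65536

# NAPI weight (default 64, raised to aggregate more packets per poll)
net.core.dev_weight = 128

# Allow TCP sockets to accept connections from any VRF
net.ipv4.tcp_l3mdev_accept = 1

# Allow UDP sockets to accept packets from any VRF
net.ipv4.udp_l3mdev_accept = 1

# Allow RAW sockets to accept packets from any VRF
net.ipv4.raw_l3mdev_accept = 1

# Enable Selective ACK (SACK), improves loss recovery; TCP performance +5-10%
net.ipv4.tcp_sack = 1

# Enable D-SACK (duplicate SACK), reports duplicate segments to refine congestion control
net.ipv4.tcp_dsack = 1

# Enable Forward ACK (FACK), works with SACK to improve congestion window calculation (legacy option, no effect on kernels since 4.15)
net.ipv4.tcp_fack = 1


# Flowtable timeouts: 10 minutes, slows down expiry
net.netfilter.nf_flowtable_tcp_timeout=600
net.netfilter.nf_flowtable_udp_timeout=600

# ECN: 2 (the kernel default) accepts ECN when the peer requests it but does not request it on outgoing connections; this is the most compatible choice over VPN tunnels, and peers without ECN support are unaffected
net.ipv4.tcp_ecn = 2

# MPTCP: better receive performance, supports server-side flags and ADD_ADDR handling.
#net.mptcp.enabled = 1

# IPv6 address autoconfiguration from RA prefixes (conf.all) off
net.ipv6.conf.all.autoconf = 0

# Cloudflare patch tuning (tcp_collapse_max_bytes exists only on kernels that carry Cloudflare's patch)
net.ipv4.tcp_notsent_lowat=131072
net.ipv4.tcp_collapse_max_bytes=6291456


# Reserve huge pages (2 MB each) for packet forwarding. Enable only when there is plenty of memory, usually up to half of total RAM; this does not increase power consumption.
# 2048 pages (4096 MB) are enough for 10-40+ Mpps (million packets per second) or 10-100 Gbps line-rate forwarding.
# For 1G and 2.5G links, 1024 pages (2048 MB) are more than enough.
# Note: huge pages are only consumed by software that explicitly requests them (e.g. userspace data planes).
vm.nr_hugepages=1024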

Reload the kernel parameters

sysctl -p /etc/sysctl.d/90-softrouting.conf
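
`sysctl -p` reports an error for every key the running kernel does not know, for instance `net.ipv4.tcp_collapse_max_bytes` on a kernel without that patch, or the `net.netfilter.*` entries before the nf_conntrack module is loaded. The sketch below lists such keys up front by checking for the matching file under /proc/sys. `unknown_sysctl_keys` is a hypothetical helper name, and the second argument is only there so the lookup directory can be replaced.

```shell
#!/bin/sh
# unknown_sysctl_keys CONF [PROC_ROOT]
# Prints every key in CONF that has no matching entry under PROC_ROOT.
unknown_sysctl_keys() (
    conf=$1; root=${2:-/proc/sys}
    while IFS= read -r line; do
        line=${line%%#*}                      # strip comments
        case $line in *=*) ;; *) continue ;; esac
        key=$(printf '%s' "${line%%=*}" | tr -d '[:space:]')
        path=$(printf '%s' "$key" | tr . /)   # net.ipv4.x -> net/ipv4/x
        [ -e "$root/$path" ] || echo "$key"
    done < "$conf"
)
```

Calling `unknown_sysctl_keys /etc/sysctl.d/90-softrouting.conf` before reloading shows which lines would be rejected on the current kernel.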

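Disable the swap partition

In high-PPS scenarios swap can lead to OOM (Out of Memory) crashes or thrashing (excessive paging) instead of graceful degradation. Systems such as Kubernetes also require swap to be disabled to keep performance isolated.

sudo swapoff -a
echo 'vm.swappiness = 0' | sudo tee -a /etc/sysctl.conf
sudo sysctl -p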
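
`swapoff -a` only lasts until the next reboot; a swap entry in /etc/fstab brings swap back. A cautious sketch for commenting such entries out, written as a filter so the result can be reviewed before anything is overwritten. `comment_out_swap` is a hypothetical helper name.

```shell
#!/bin/sh
# comment_out_swap: reads an fstab on stdin and writes it to stdout with
# every active swap entry (third field "swap") commented out.
comment_out_swap() {
    awk '!/^[ \t]*#/ && $3 == "swap" { print "#" $0; next } { print }'
}
```

One way to use it: `comment_out_swap < /etc/fstab > /tmp/fstab.new`, compare the two files with `diff`, then copy the new file into place.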
ost   0       lan              local |
| fe80::62be:b4ff:fe02:1360               /        /        kernel    host   0       eth0             local |
| ff00::/8                                /        /        kernel    link   256     eth0             local |
| ff00::/8                                /        /        kernel    link   256     wg0              local |
| ff00::/8                                /        /        kernel    link   256     eth1             local |
| ff00::/8                                /        /        kernel    link   256     wlx0013ef6f25bd  local |
| ff00::/8                                /        /        kernel    link   256     lan              local |
-------------------------------------------------------------------------------------------------------------

22. Install FRR

Since frr-7 the official FRR repository has shipped dependency packages such as libyang2, and it supports Ubuntu 24.04. This is the simplest and most stable way to install and avoids version conflicts.

  1. Add the GPG key (verifies package signatures):

    curl -s https://deb.frrouting.org/frr/keys.gpg | sudo tee /usr/share/keyrings/frrouting.gpg > /dev/null
  2. Add the FRR repository (stable release):

    FRRVER="frr-stable"
    echo "deb [signed-by=/usr/share/keyrings/frrouting.gpg] https://deb.frrouting.org/frr noble $FRRVER" | sudo tee -a /etc/apt/sources.list.d/frr.list
  3. Update the package lists and install FRR:

    sudo apt update
    sudo apt install frr frr-pythontools -y
  4. Optional add-on packages

    apt install frr-snmp frr-rpki-rtrlib frr-dbgsym prometheus-frr-exporter

If AppArmor (aa) restricts FRR's permission to create files so that it cannot run properly

mkdir -p /etc/apparmor.d/disable/
ln -s /usr/lib/frr/* /etc/apparmor.d/disable/

Then

apparmor_parser -R /etc/apparmor.d
systemctl restart apparmor
systemctl restart frr

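Note that the configuration below assigns IP addresses inside FRR only to keep the explanation clear. In a real production environment, configure addresses from the Linux shell instead of inside FRR, so that they do not disappear when FRR restarts or crashes

OSPF configuration

OSPF is established between Router-1/2/3; Router-1 uses process 1, Router-2 and Router-3 use process 100;

Router-1 advertises a default route to its neighbors;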

Router-1

vim /etc/frr/daemons
# change the following
ospfd=yes
systemctl restart frr
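
The same edit can be scripted instead of done in vim. A minimal sketch, assuming the stock file keeps one `name=no` line per daemon; `enable_frr_daemon` is a hypothetical helper name and GNU sed (`-i`) is assumed.

```shell
#!/bin/sh
# enable_frr_daemon FILE DAEMON
# Flips "<daemon>=no" to "<daemon>=yes" in FILE and fails if the daemon
# is still not enabled afterwards.
enable_frr_daemon() {
    sed -i "s/^$2=no\$/$2=yes/" "$1" && grep -q "^$2=yes\$" "$1"
}
```

For example: `enable_frr_daemon /etc/frr/daemons ospfd && systemctl restart frr`. The anchored pattern leaves similarly named daemons such as ospf6d untouched.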

Enter the vtysh shell

vtysh
configure
!
ip forwarding
!
interface ens32
 ip address 192.168.100.100/24
exit
!
interface lo
 ip address 192.168.1.1/32
exit
!
router ospf
 ospf router-id 192.168.1.1
 network 192.168.1.1/32 area 0
 network 192.168.100.0/24 area 0
 default-information originate always metric-type 1
exit
!
end
!
write memory

Router-2

vim /etc/frr/daemons
# add the following
ospfd=yes
ospfd_instances=100
systemctl restart frr

If AppArmor (aa) restricts FRR's permission to create files so that it cannot run properly

ln -s /usr/lib/frr/* /etc/apparmor.d/disable/

Then

apparmor_parser -R /etc/apparmor.d
systemctl restart apparmor
systemctl restart frr

Enter the vtysh shell

vtysh
configure
!
ip forwarding
!
interface ens32
 ip address 192.168.100.200/24
 ip ospf 100 area 0
exit
!
interface ens34
 ip address 192.168.200.100/24
 ip ospf 100 area 0
exit
!
interface lo
 ip address 192.168.1.2/32
 ip ospf 100 area 0
exit
!
router ospf 100
 ospf router-id 192.168.1.2
exit
!
end
!
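write memory

Note that with multi-instance OSPF, OSPF can only be activated under the interface (ip ospf 100 area 0); it cannot be activated in the router ospf 100 view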

Router-3

vim /etc/frr/daemons
# add the following
ospfd=yes
ospfd_instances=100
systemctl restart frr

If AppArmor (aa) restricts FRR's permission to create files so that it cannot run properly

ln -s /usr/lib/frr/* /etc/apparmor.d/disable/

Then

apparmor_parser -R /etc/apparmor.d
systemctl restart apparmor
systemctl restart frr

Enter the vtysh shell

vtysh
configure
!
ip forwarding
!
interface ens32
 ip address 192.168.200.200/24
 ip ospf 100 area 0
exit
!
interface lo
 ip address 192.168.1.3/32
 ip ospf 100 area 0
exit
!
router ospf 100
 ospf router-id 192.168.1.3
exit
!
end
!
write memory

Check the OSPF neighbor state and routes

ubuntu# show ip ospf neighbor

OSPF Instance: 100

Neighbor ID     Pri State           Up Time         Dead Time Address          Interface                        RXmtL RqstL DBsmL
192.168.1.1       1 Full/DR         36m02s            35.196s 192.168.100.100  ens32:192.168.100.200                0     0     0
192.168.1.3       1 Full/Backup     1m26s             39.331s 192.168.200.200  ens34:192.168.200.100                0     0     0
ubuntu(config)# do show ip route

Codes: K - kernel route, C - connected, L - local, S - static,
       R - RIP, O - OSPF, I - IS-IS, B - BGP, E - EIGRP, N - NHRP,
       T - Table, v - VNC, V - VNC-Direct, A - Babel, F - PBR,
       f - OpenFabric, t - Table-Direct,
       > - selected route, * - FIB route, q - queued, r - rejected, b - backup
       t - trapped, o - offload failure

O[100]>* 0.0.0.0/0 [110/201] via 192.168.200.100, ens32, weight 1, 00:00:30
O[100]>* 192.168.1.1/32 [110/200] via 192.168.200.100, ens32, weight 1, 00:05:19
O[100]>* 192.168.1.2/32 [110/100] via 192.168.200.100, ens32, weight 1, 00:05:19
O[100]   192.168.1.3/32 [110/0] is directly connected, lo, weight 1, 00:05:32
L * 192.168.1.3/32 is directly connected, lo, weight 1, 00:06:15
C>* 192.168.1.3/32 is directly connected, lo, weight 1, 00:06:15
O[100]>* 192.168.100.0/24 [110/200] via 192.168.200.100, ens32, weight 1, 00:05:19
O[100]   192.168.200.0/24 [110/100] is directly connected, ens32, weight 1, 00:05:24
C>* 192.168.200.0/24 [0/100] is directly connected, ens32, weight 1, 00:08:29
L>* 192.168.200.200/32 is directly connected, ens32, weight 1, 00:08:29

Check the routes from the Linux shell

root@ubuntu:~# ip route show table all

default nhid 11 via 192.168.200.100 dev ens32 proto ospf metric 20
192.168.1.1 nhid 11 via 192.168.200.100 dev ens32 proto ospf metric 20
192.168.1.2 nhid 11 via 192.168.200.100 dev ens32 proto ospf metric 20
192.168.100.0/24 nhid 11 via 192.168.200.100 dev ens32 proto ospf metric 20
192.168.200.0/24 dev ens32 proto kernel scope link src 192.168.200.200 metric 100
local 127.0.0.0/8 dev lo table local proto kernel scope host src 127.0.0.1
local 127.0.0.1 dev lo table local proto kernel scope host src 127.0.0.1
broadcast 127.255.255.255 dev lo table local proto kernel scope link src 127.0.0.1
local 192.168.1.3 dev lo table local proto kernel scope host src 192.168.1.3
broadcast 192.168.1.3 dev lo table local proto kernel scope link src 192.168.1.3
local 192.168.200.200 dev ens32 table local proto kernel scope host src 192.168.200.200
broadcast 192.168.200.255 dev ens32 table local proto kernel scope link src 192.168.200.200
fe80::/64 dev ens32 proto kernel metric 1024 pref medium
local ::1 dev lo table local proto kernel metric 0 pref medium
local fe80::bb4f:14f0:3868:6150 dev ens32 table local proto kernel metric 0 pref medium
multicast ff00::/8 dev ens32 table local proto kernel metric 256 pref medium
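
PBR configuration

PBR is handled by the pbrd daemon, so pbrd=yes has to be set in /etc/frr/daemons first. The following is entered in vtysh configuration mode.
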
pbr-map pbr seq 10
 match src-ip 10.20.10.0/24
 set nexthop 10.20.20.1
exit
!
interface ens32
 pbr-policy pbr
!
end
!
write memory

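On Ubuntu, when the netplan backend is systemd-networkd, the policy may stop working after the systemd-networkd service restarts or after the next-hop interface goes up/down. The cause is that systemd-networkd by default manages foreign routes, policy rules and next hops, which can conflict with configuration from outside (such as FRR's). To avoid this, systemd-networkd has to be told not to manage PBR entries created by other services

Edit /etc/systemd/networkd.conf (create it if it does not exist) and add the following under the [Network] section:

[Network]
ManageForeignRoutingPolicyRules=no
ManageForeignRoutes=no
ManageForeignNextHops=no

systemctl restart systemd-networkd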
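
Instead of editing the main file, the same three settings can live in a drop-in under /etc/systemd/networkd.conf.d, which keeps the change separate from the distribution defaults. A sketch only: the file name `10-frr-foreign.conf` and the function name `write_networkd_dropin` are arbitrary choices for illustration.

```shell
#!/bin/sh
# write_networkd_dropin [CONF_DIR]
# Writes a drop-in that stops systemd-networkd from touching routes,
# policy rules and next hops created by other software such as FRR.
write_networkd_dropin() (
    dir=${1:-/etc/systemd/networkd.conf.d}
    mkdir -p "$dir"
    cat > "$dir/10-frr-foreign.conf" <<'EOF'
[Network]
ManageForeignRoutingPolicyRules=no
ManageForeignRoutes=no
ManageForeignNextHops=no
EOF
)
```

After calling `write_networkd_dropin` as root, `systemctl restart systemd-networkd` applies it.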

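Alternatively, install NetworkManager and set it as the netplan backend; this also prevents the PBR policy set by FRR from breaking after the PBR next-hop interface goes down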

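A simple BGP configuration example

Server configuration

root@Server:/# vtysh
Server# configure													# enter configuration mode
Server(config)# router bgp 64512									# start BGP with AS 64512
Server(config-router)# neighbor 10.10.17.2 remote-as 64512			# peer address and AS; same AS means iBGP, different AS means eBGP; the peer address must be reachable through any devices in between
Server(config-router)# neighbor 10.10.17.2 next-hop-self			# routes learned from eBGP neighbors are advertised to iBGP neighbors with the next hop unchanged, so iBGP sessions usually need the next hop set to self; eBGP neighbors do not
Server(config-router)# network 192.168.1.1/32						# advertise this route to neighbors; it must already exist in the routing table (or the BGP table)
Server(config-router)# exit
Server(config)# exit
Server# write														# save the configuration

Client configuration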

root@Client:~# vtysh
Client# configure
Client(config)# router bgp 64512
Client(config-router)# bgp router-id 10.10.17.2
Client(config-router)# neighbor 10.10.17.1 remote-as 64512
Client(config-router)# neighbor 10.10.17.1 next-hop-self
Client(config-router)# network 192.168.10.0/24
Client(config-router)# exit
Client(config)# exit
Client# write

Check the BGP neighbor state

client# do show bgp ipv4 unicast summary								# show the IPv4 unicast neighbor summary

BGP router identifier 10.10.17.2, local AS number 64512 VRF default vrf-id 0
BGP table version 67
RIB entries 3, using 384 bytes of memory
Peers 1, using 24 KiB of memory

Neighbor        V         AS   MsgRcvd   MsgSent   TblVer  InQ OutQ  Up/Down State/PfxRcd   PfxSnt Desc
10.10.17.1      4      64512         6         6       67    0    0 00:02:00            1        1 FRRouting/10.5.0_git
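
In the summary, an established session shows a prefix count in the State/PfxRcd column, while a session that is not up shows a state name such as Active or Idle. The sketch below picks out the latter, assuming the column layout shown above (State/PfxRcd is the tenth field); `bgp_down_peers` is a hypothetical helper name.

```shell
#!/bin/sh
# bgp_down_peers: reads `show bgp ipv4 unicast summary` output on stdin and
# prints "<peer> <state>" for every neighbor whose session is not established.
bgp_down_peers() {
    awk '$1 ~ /^[0-9a-fA-F:.]+$/ && $1 ~ /[.:]/ && NF >= 10 && $10 !~ /^[0-9]+$/ { print $1, $10 }'
}
```

From the Linux shell this could be run as `vtysh -c 'show bgp ipv4 unicast summary' | bgp_down_peers`; empty output means every listed peer is established.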

Show the routes received over BGP

Client# do show bgp ipv4 unicast neighbors 10.10.17.1 routes
BGP table version is 67, local router ID is 10.10.17.2, vrf id 0
Default local pref 100, local AS 64512
Status codes:  s suppressed, d damped, h history, u unsorted, * valid, > best, = multipath,
               i internal, r RIB-failure, S Stale, R Removed
Nexthop codes: @NNN nexthop's vrf id, < announce-nh-self
Origin codes:  i - IGP, e - EGP, ? - incomplete
RPKI validation codes: V valid, I invalid, N Not found

     Network          Next Hop            Metric LocPrf Weight Path
 *>i 192.168.1.1/32   10.10.17.1               0    100      0 i

Displayed 1 routes and 2 total paths

Or

Client# do show ip route bgp										# show all BGP routes in the routing table
Codes: K - kernel route, C - connected, L - local, S - static,
       R - RIP, O - OSPF, I - IS-IS, B - BGP, E - EIGRP, N - NHRP,
       T - Table, v - VNC, V - VNC-Direct, A - Babel, F - PBR,
       f - OpenFabric, t - Table-Direct,
       > - selected route, * - FIB route, q - queued, r - rejected, b - backup
       t - trapped, o - offload failure

IPv4 unicast VRF default:
B>* 192.168.1.1/32 [200/0] via 10.10.17.1, vpls-wg0, weight 1, 00:03:42
Client#

Or

do show ip route			# show all routes

After exiting vtysh, check the route from the Linux shell and test with ping

root@Client:~# ip route show | grep 192.168.1.1
192.168.1.1 nhid 44704 via 10.10.17.1 dev vpls-wg0 proto bgp metric 20
root@Client:~# ping 192.168.1.1
PING 192.168.1.1 (192.168.1.1) 56(84) bytes of data.
64 bytes from 192.168.1.1: icmp_seq=1 ttl=64 time=8.31 ms
64 bytes from 192.168.1.1: icmp_seq=2 ttl=64 time=9.09 ms
64 bytes from 192.168.1.1: icmp_seq=3 ttl=64 time=8.45 ms
64 bytes from 192.168.1.1: icmp_seq=4 ttl=64 time=9.32 ms
^C
--- 192.168.1.1 ping statistics ---
4 packets transmitted, 4 received, 0% packet loss, time 3003ms
rtt min/avg/max/mdev = 8.313/8.793/9.321/0.423 ms
root@Client:~#

Show the full configuration

Server

Server# do show running-config
Building configuration...

Current configuration:
!
frr version 10.5.0_git
frr defaults traditional
hostname NAS-Private
!
router bgp 64512
 bgp router-id 10.10.17.1
 neighbor 10.10.17.2 remote-as 64512
 !
 address-family ipv4 unicast
  network 192.168.1.1/32
  neighbor 10.10.17.2 next-hop-self
 exit-address-family
exit
!
end

Client

Client# do show running-config
Building configuration...

Current configuration:
!
frr version 10.5.0
frr defaults traditional
hostname SoftRouting
log syslog informational
service integrated-vtysh-config
!
router bgp 64512
 bgp router-id 10.10.17.2
 neighbor 10.10.17.1 remote-as 64512
 !
 address-family ipv4 unicast
  network 192.168.10.0/24
  neighbor 10.10.17.1 next-hop-self
 exit-address-family
exit
!
end
Client#

23. Real-time monitoring of packet forwarding rate and bandwidth

#include <iostream>
#include <fstream>
#include <sstream>
#include <vector>
#include <string>
#include <map>
#include <chrono>
#include <algorithm>
#include <iomanip>
#include <cstring>
#include <cstdio>
#include <cstdlib>
#include <unistd.h>
#include <sys/ioctl.h>
#include <net/if.h>
#include <linux/ethtool.h>
#include <linux/sockios.h>
#include <sys/socket.h>
#include <netinet/in.h>
#include <getopt.h>
#include <signal.h>
#include <cmath>
#include <cstdint> // uint64_t, used for the ethtool counters
#include <ctime> // for nanosleep
// ANSI color codes
const std::string C_WHITE = "\033[97m";
const std::string C_RESET = "\033[0m";
const std::string C_CYAN_BOLD = "\033[96;1m";
const std::string C_GREEN = "\033[92m";
const std::string C_GREEN_BOLD = "\033[92;1m";
const std::string C_YELLOW = "\033[93m";
const std::string C_YELLOW_BOLD = "\033[93;1m";
// Per-interface counters read from /proc/net/dev
struct NetStats {
    long long rx_bytes = 0;
    long long rx_packets = 0;
    long long rx_multicast = 0;
    long long tx_bytes = 0;
    long long tx_packets = 0;
};
struct BmStats {
    long long rx_broadcast = -1;
    long long tx_broadcast = -1;
    long long rx_multicast = -1;
    long long tx_multicast = -1;
    bool ethtool_failed = false; // set when ethtool stats are unavailable (permanent fallback)
};
// Global flag for interrupt
volatile sig_atomic_t stop = 0;
void sigint_handler(int signum) {
    stop = 1;
}
// Get net stats (optimized: read once, parse all)
std::map<std::string, NetStats> get_net_stats(const std::string& target_iface = "") {
    std::map<std::string, NetStats> stats;
    std::ifstream file("/proc/net/dev");
    if (!file.is_open()) return stats;
    std::string line;
    std::getline(file, line); // Skip headers
    std::getline(file, line);
    while (std::getline(file, line)) {
        if (line.empty()) continue;
        std::istringstream iss(line);
        std::string iface;
        iss >> iface;
        iface = iface.substr(0, iface.find(':'));
        if (iface == "lo") continue;
        if (!target_iface.empty() && iface != target_iface) continue;
        NetStats ns;
        iss >> ns.rx_bytes >> ns.rx_packets;
        long long dummy;
        for (int i = 0; i < 5; ++i) iss >> dummy;
        iss >> ns.rx_multicast;
        iss >> ns.tx_bytes >> ns.tx_packets;
        stats[iface] = ns;
        if (!target_iface.empty()) break;
    }
    return stats;
}
// Get bm stats (low-level ioctl)
BmStats get_bm_stats(const std::string& iface, bool& ethtool_failed) {
    if (ethtool_failed) { // Permanent fallback
        BmStats bm;
        return bm;
    }
    BmStats bm;
    int sock = socket(AF_INET, SOCK_DGRAM, 0);
    if (sock < 0) {
        ethtool_failed = true;
        return bm;
    }
    struct ifreq ifr;
    std::memset(&ifr, 0, sizeof(ifr));
    std::strncpy(ifr.ifr_name, iface.c_str(), IFNAMSIZ - 1);
    struct ethtool_drvinfo info;
    info.cmd = ETHTOOL_GDRVINFO;
    ifr.ifr_data = reinterpret_cast<char*>(&info);
    if (ioctl(sock, SIOCETHTOOL, &ifr) < 0) {
        ethtool_failed = true;
        close(sock);
        return bm;
    }
    unsigned int n_stats = info.n_stats;
    if (n_stats == 0) {
        ethtool_failed = true;
        close(sock);
        return bm;
    }
    // Get strings (pre-allocate)
    size_t gstr_size = sizeof(struct ethtool_gstrings) + n_stats * ETH_GSTRING_LEN;
    std::vector<char> buf(gstr_size);
    struct ethtool_gstrings* gstrings = reinterpret_cast<struct ethtool_gstrings*>(buf.data());
    gstrings->cmd = ETHTOOL_GSTRINGS;
    gstrings->string_set = ETH_SS_STATS;
    ifr.ifr_data = buf.data();
    if (ioctl(sock, SIOCETHTOOL, &ifr) < 0) {
        ethtool_failed = true;
        close(sock);
        return bm;
    }
    // Get stats
    size_t stats_size = sizeof(struct ethtool_stats) + n_stats * sizeof(uint64_t);
    std::vector<char> stats_buf(stats_size);
    struct ethtool_stats* estats = reinterpret_cast<struct ethtool_stats*>(stats_buf.data());
    estats->cmd = ETHTOOL_GSTATS;
    estats->n_stats = n_stats;
    ifr.ifr_data = stats_buf.data();
    if (ioctl(sock, SIOCETHTOOL, &ifr) < 0) {
        ethtool_failed = true;
        close(sock);
        return bm;
    }
    uint64_t* data = reinterpret_cast<uint64_t*>(stats_buf.data() + sizeof(struct ethtool_stats));
    for (unsigned int i = 0; i < n_stats; ++i) {
        char* str_ptr = reinterpret_cast<char*>(gstrings->data + i * ETH_GSTRING_LEN);
        std::string stat_name(str_ptr, strnlen(str_ptr, ETH_GSTRING_LEN));
        // Support common name variations for robustness
        if (stat_name == "rx_broadcast" || stat_name == "rx_broadcast_pkts" ||
            stat_name == "rx_broadcast_packets" || stat_name == "broadcast") {
            bm.rx_broadcast = data[i];
        } else if (stat_name == "tx_broadcast" || stat_name == "tx_broadcast_pkts" ||
                   stat_name == "tx_broadcast_packets") {
            bm.tx_broadcast = data[i];
        } else if (stat_name == "rx_multicast" || stat_name == "rx_multicast_pkts" ||
                   stat_name == "rx_multicast_packets" || stat_name == "multicast") {
            bm.rx_multicast = data[i];
        } else if (stat_name == "tx_multicast" || stat_name == "tx_multicast_pkts" ||
                   stat_name == "tx_multicast_packets") {
            bm.tx_multicast = data[i];
        }
    }
    close(sock);
    return bm;
}
// Compute max width
int compute_max_width(const std::vector<std::string>& ifaces) {
    int max_w = 12;
    for (const auto& iface : ifaces) {
        max_w = std::max(max_w, static_cast<int>(iface.length()));
    }
    return max_w;
}
// Get separator (const buffer)
void get_separator(char* buf, int width) {
    std::strcpy(buf, C_WHITE.c_str());
    std::memset(buf + C_WHITE.length(), '-', width);
    std::strcpy(buf + C_WHITE.length() + width, C_RESET.c_str());
}
// Get header (use sprintf)
void get_header(char* buf, int W, const int* fws) {
    const char* labels[] = {"rxpck/s", "txpck/s", "rxMbps", "txMbps", "rxbcast/s", "txbcast/s", "rxmcast/s", "txmcast/s"};
    char* ptr = buf;
    std::strcpy(ptr, C_CYAN_BOLD.c_str());
    ptr += C_CYAN_BOLD.length();
    *ptr++ = '|';
    std::sprintf(ptr, "%-*s| ", W, "Interface");
    ptr += W + 2; // |
    for (int i = 0; i < 8; ++i) {
        std::sprintf(ptr, "%*s", fws[i], labels[i]);
        ptr += fws[i];
        std::strcpy(ptr, i < 7 ? "| " : "|");
        ptr += (i < 7 ? 2 : 1);
    }
    std::strcpy(ptr, C_RESET.c_str());
}
// Get row (use sprintf for rates)
void get_row(char* buf, const std::string& iface, const std::vector<double>& rates, int W, const int* fws) {
    double rxpck = rates[0], txpck = rates[1], rxmbps = rates[2], txmbps = rates[3];
    double rxbcast = rates[4], txbcast = rates[5], rxmcast = rates[6], txmcast = rates[7];
    bool rx_active = (rxpck > 0 || rxmbps > 0 || rxbcast > 0 || rxmcast > 0);
    bool tx_active = (txpck > 0 || txmbps > 0 || txbcast > 0 || txmcast > 0);
    const char* rx_attr = rx_active ? C_GREEN_BOLD.c_str() : C_GREEN.c_str();
    const char* tx_attr = tx_active ? C_YELLOW_BOLD.c_str() : C_YELLOW.c_str();
    char* ptr = buf;
    std::strcpy(ptr, C_WHITE.c_str());
    ptr += C_WHITE.length();
    *ptr++ = '|';
    std::strcpy(ptr, C_RESET.c_str());
    ptr += C_RESET.length();
    std::strcpy(ptr, C_WHITE.c_str());
    ptr += C_WHITE.length();
    std::sprintf(ptr, "%-*s", W, iface.c_str());
    ptr += W;
    std::strcpy(ptr, C_RESET.c_str());
    ptr += C_RESET.length();
    std::strcpy(ptr, C_WHITE.c_str());
    ptr += C_WHITE.length();
    std::strcpy(ptr, "| ");
    ptr += 2;
    std::strcpy(ptr, C_RESET.c_str());
    ptr += C_RESET.length();
    const char* attrs[] = {rx_attr, tx_attr, rx_attr, tx_attr, rx_attr, tx_attr, rx_attr, tx_attr};
    double vals[] = {rxpck, txpck, rxmbps, txmbps, rxbcast, txbcast, rxmcast, txmcast};
    bool is_float[] = {false, false, true, true, false, false, false, false};
    for (int i = 0; i < 8; ++i) {
        std::strcpy(ptr, attrs[i]);
        ptr += std::strlen(attrs[i]);
        if (is_float[i]) {
            std::sprintf(ptr, "%*.*f", fws[i], 2, vals[i]);
        } else {
            std::sprintf(ptr, "%*.0f", fws[i], vals[i]);
        }
        ptr += fws[i];
        std::strcpy(ptr, C_RESET.c_str());
        ptr += C_RESET.length();
        std::strcpy(ptr, C_WHITE.c_str());
        ptr += C_WHITE.length();
        std::strcpy(ptr, i < 7 ? "| " : "|");
        ptr += (i < 7 ? 2 : 1);
        std::strcpy(ptr, C_RESET.c_str());
        ptr += C_RESET.length();
    }
}
// Compute total width based on W and fws
int compute_total_width(int W, const int* fws) {
    int sum_fws = 0;
    for (int i = 0; i < 8; ++i) {
        sum_fws += fws[i];
    }
    // 10 '|' + 8 spaces + W + sum_fws
    return 10 + 8 + W + sum_fws;
}
int main(int argc, char* argv[]) {
    std::string target_iface;
    int opt;
    static struct option long_options[] = {
        {"interface", required_argument, 0, 'i'},
        {0, 0, 0, 0}
    };
    while ((opt = getopt_long(argc, argv, "i:", long_options, nullptr)) != -1) {
        if (opt == 'i') target_iface = optarg;
    }
    signal(SIGINT, sigint_handler);
    // Initial stats
    auto initial_stats = get_net_stats(target_iface);
    std::vector<std::string> initial_ifaces;
    if (!target_iface.empty()) {
        if (initial_stats.find(target_iface) == initial_stats.end()) {
            std::cerr << "Interface not found: " << target_iface << std::endl;
            return 1;
        }
        initial_ifaces = {target_iface};
    } else {
        for (const auto& kv : initial_stats) initial_ifaces.push_back(kv.first);
        std::sort(initial_ifaces.begin(), initial_ifaces.end());
    }
    int W = compute_max_width(initial_ifaces);
    // Hide cursor
    [[maybe_unused]] auto ignored1 = write(STDOUT_FILENO, "\033[?25l", 6);
    const int MAX_BUF = 4096; // Fixed buffer size
    char top_sep[MAX_BUF];
    char header_line[MAX_BUF];
    char mid_sep[MAX_BUF];
    char bottom_sep[MAX_BUF];
    auto prev_stats = initial_stats;
    std::map<std::string, BmStats> prev_bm;
    std::map<std::string, bool> ethtool_fail_flags; // Per-iface failure flag
    for (const auto& iface : initial_ifaces) {
        bool failed = false;
        prev_bm[iface] = get_bm_stats(iface, failed);
        ethtool_fail_flags[iface] = failed;
        if (prev_bm[iface].rx_broadcast < 0) prev_bm[iface].rx_broadcast = 0;
        if (prev_bm[iface].tx_broadcast < 0) prev_bm[iface].tx_broadcast = 0;
        if (prev_bm[iface].rx_multicast < 0) prev_bm[iface].rx_multicast = prev_stats[iface].rx_multicast;
        if (prev_bm[iface].tx_multicast < 0) prev_bm[iface].tx_multicast = 0;
    }
    // Initial display with zero rates
    struct winsize ws;
    ioctl(STDOUT_FILENO, TIOCGWINSZ, &ws);
    int term_cols = ws.ws_col;
    int fws[8] = {9, 9, 9, 9, 9, 9, 9, 9};
    int total_width = compute_total_width(W, fws);
    if (total_width > term_cols) {
        int sum_fws = 0;
        for (int i = 0; i < 8; ++i) sum_fws += fws[i];
        int fixed_width = 10 + 8 + W;
        int excess = total_width - term_cols;
        double total_prop = static_cast<double>(sum_fws);
        for (int i = 0; i < 8; ++i) {
            double prop = static_cast<double>(fws[i]) / total_prop;
            int reduce = static_cast<int>(std::round(prop * excess));
            fws[i] = std::max(fws[i] - reduce, 4); // Min width 4 to keep readable
        }
        // Recalculate total_width after adjustment
        total_width = fixed_width;
        for (int i = 0; i < 8; ++i) total_width += fws[i];
    }
    get_separator(top_sep, total_width);
    get_header(header_line, W, fws);
    get_separator(mid_sep, total_width);
    get_separator(bottom_sep, total_width);
    char rows_buf[initial_ifaces.size() * MAX_BUF];
    char* rows_ptr = rows_buf;
    for (const auto& iface : initial_ifaces) {
        std::vector<double> rates(8, 0.0);
        char row[MAX_BUF];
        get_row(row, iface, rates, W, fws);
        std::strcpy(rows_ptr, row);
        rows_ptr += std::strlen(row);
        *rows_ptr++ = '\n';
    }
    // Build full output buffer for initial display
    char full_buf[4 * MAX_BUF + initial_ifaces.size() * MAX_BUF];
    char* ptr = full_buf;
    std::strcpy(ptr, top_sep); ptr += std::strlen(top_sep); *ptr++ = '\n';
    std::strcpy(ptr, header_line); ptr += std::strlen(header_line); *ptr++ = '\n';
    std::strcpy(ptr, mid_sep); ptr += std::strlen(mid_sep); *ptr++ = '\n';
    std::memcpy(ptr, rows_buf, rows_ptr - rows_buf); ptr += (rows_ptr - rows_buf);
    std::strcpy(ptr, bottom_sep); ptr += std::strlen(bottom_sep); *ptr++ = '\n';
    [[maybe_unused]] auto ignored2 = write(STDOUT_FILENO, full_buf, ptr - full_buf);
    int num_lines = 4 + initial_ifaces.size() + 1;
    int prev_num_lines = num_lines;
    auto start_time = std::chrono::steady_clock::now();
    int iface_check_counter = 0; // Check interfaces every 10 loops (~10s)
    while (!stop) {
        // Get terminal size
        ioctl(STDOUT_FILENO, TIOCGWINSZ, &ws);
        term_cols = ws.ws_col;
        auto current_time = std::chrono::steady_clock::now();
        double interval = std::chrono::duration<double>(current_time - start_time).count();
        if (interval < 1.0) {
            struct timespec ts;
            ts.tv_sec = 0;
            ts.tv_nsec = 100000000; // 0.1s
            nanosleep(&ts, nullptr);
            continue;
        }
        auto current_stats = get_net_stats(target_iface);
        std::vector<std::string> current_ifaces = initial_ifaces; // Use cache
        if (target_iface.empty() && ++iface_check_counter >= 10) { // Rare check
            iface_check_counter = 0;
            current_ifaces.clear();
            for (const auto& kv : current_stats) current_ifaces.push_back(kv.first);
            std::sort(current_ifaces.begin(), current_ifaces.end());
            initial_ifaces = current_ifaces; // Update cache
        }
        int new_W = compute_max_width(current_ifaces);
        if (new_W > W) W = new_W;
        // Define field widths
        int fws[8] = {9, 9, 9, 9, 9, 9, 9, 9};
        total_width = compute_total_width(W, fws);
        if (total_width > term_cols) {
            int sum_fws = 0;
            for (int i = 0; i < 8; ++i) sum_fws += fws[i];
            int fixed_width = 10 + 8 + W; // Updated fixed part
            int excess = total_width - term_cols;
            double total_prop = static_cast<double>(sum_fws);
            for (int i = 0; i < 8; ++i) {
                double prop = static_cast<double>(fws[i]) / total_prop;
                int reduce = static_cast<int>(std::round(prop * excess));
                fws[i] = std::max(fws[i] - reduce, 4); // Min width 4 to keep readable
            }
            // Recalculate total_width after adjustment
            total_width = fixed_width;
            for (int i = 0; i < 8; ++i) total_width += fws[i];
            // If still over (due to min), further reduce, but for simplicity, accept
        }
        get_separator(top_sep, total_width);
        get_header(header_line, W, fws);
        get_separator(mid_sep, total_width);
        get_separator(bottom_sep, total_width);
        rows_ptr = rows_buf;
        for (const auto& iface : current_ifaces) {
            NetStats prev = prev_stats[iface];
            NetStats curr = current_stats[iface];
            double rxpck = std::round(std::max(0LL, curr.rx_packets - prev.rx_packets) / interval);
            double txpck = std::round(std::max(0LL, curr.tx_packets - prev.tx_packets) / interval);
            double rx_bytes_rate = std::max(0LL, curr.rx_bytes - prev.rx_bytes) / interval;
            double tx_bytes_rate = std::max(0LL, curr.tx_bytes - prev.tx_bytes) / interval;
            double rxmbps = rx_bytes_rate * 8 / 1000000.0;
            double txmbps = tx_bytes_rate * 8 / 1000000.0;
            bool failed = ethtool_fail_flags[iface];
            BmStats curr_bm = get_bm_stats(iface, failed);
            ethtool_fail_flags[iface] = failed;
            if (curr_bm.rx_broadcast < 0) curr_bm.rx_broadcast = 0;
            if (curr_bm.tx_broadcast < 0) curr_bm.tx_broadcast = 0;
            if (curr_bm.rx_multicast < 0) curr_bm.rx_multicast = curr.rx_multicast;
            if (curr_bm.tx_multicast < 0) curr_bm.tx_multicast = 0;
            BmStats prev_b = prev_bm[iface];
            double rxbcast = std::round(std::max(0LL, curr_bm.rx_broadcast - prev_b.rx_broadcast) / interval);
            double txbcast = std::round(std::max(0LL, curr_bm.tx_broadcast - prev_b.tx_broadcast) / interval);
            double rxmcast = std::round(std::max(0LL, curr_bm.rx_multicast - prev_b.rx_multicast) / interval);
            double txmcast = std::round(std::max(0LL, curr_bm.tx_multicast - prev_b.tx_multicast) / interval);
            std::vector<double> rates = {rxpck, txpck, rxmbps, txmbps, rxbcast, txbcast, rxmcast, txmcast};
            char row[MAX_BUF];
            get_row(row, iface, rates, W, fws);
            std::strcpy(rows_ptr, row);
            rows_ptr += std::strlen(row);
            *rows_ptr++ = '\n';
            prev_bm[iface] = curr_bm;
        }
        // Build full output buffer
        ptr = full_buf;
        std::strcpy(ptr, top_sep); ptr += std::strlen(top_sep); *ptr++ = '\n';
        std::strcpy(ptr, header_line); ptr += std::strlen(header_line); *ptr++ = '\n';
        std::strcpy(ptr, mid_sep); ptr += std::strlen(mid_sep); *ptr++ = '\n';
        std::memcpy(ptr, rows_buf, rows_ptr - rows_buf); ptr += (rows_ptr - rows_buf);
        std::strcpy(ptr, bottom_sep); ptr += std::strlen(bottom_sep); *ptr++ = '\n';
        num_lines = 4 + current_ifaces.size() + 1;
        // Move up and clear, then write
        char move_buf[32];
        std::sprintf(move_buf, "\033[%dA\033[J", prev_num_lines - 1);
        [[maybe_unused]] auto ignored3 = write(STDOUT_FILENO, move_buf, std::strlen(move_buf));
        [[maybe_unused]] auto ignored4 = write(STDOUT_FILENO, full_buf, ptr - full_buf);
        prev_num_lines = num_lines;
        start_time = current_time;
        prev_stats = current_stats;
    }
    // Show cursor
    [[maybe_unused]] auto ignored5 = write(STDOUT_FILENO, "\033[?25h\n", 7);
    return 0;
}

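Compile the code above and install the resulting binary as /usr/bin/packs_show, make it executable with chmod +x, then run it with the packs_show command. The columns in its output are: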

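  • Interface: interface name (for example lo, eth0, lan).

  • rxpck/s: packets received per second (Receive Packets per second).

  • txpck/s: packets sent per second (Transmit Packets per second).

  • rxMbps: receive bandwidth in megabits per second.

  • txMbps: transmit bandwidth in megabits per second.

  • rxbcast/s: broadcast packets received per second.

  • txbcast/s: broadcast packets sent per second.

  • rxmcast/s: multicast packets received per second (Receive Multicast packets per second, excluding broadcast).

  • txmcast/s: multicast packets sent per second (Transmit Multicast packets per second, excluding broadcast).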
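Every rate column above comes from two counter snapshots taken one interval apart. The following is a minimal sketch of that arithmetic. The helper names per_second, packets_per_second and to_mbps are hypothetical and only mirror what the main loop does inline.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>

// Hypothetical helper: counter delta per second. The delta is clamped at
// zero so a counter reset (curr < prev) never yields a negative rate.
double per_second(long long prev, long long curr, double interval_s) {
    assert(interval_s > 0.0);
    return static_cast<double>(std::max(0LL, curr - prev)) / interval_s;
}

// Hypothetical helper: packet rates are rounded to whole packets per second.
double packets_per_second(long long prev, long long curr, double interval_s) {
    return std::round(per_second(prev, curr, interval_s));
}

// Hypothetical helper: bytes per second -> megabits per second (10^6 bits).
double to_mbps(double bytes_per_s) {
    return bytes_per_s * 8.0 / 1000000.0;
}
```

For instance, 125000 bytes received within one second correspond to 1.0 in the rxMbps column.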

Use -i to specify the interface to monitor:

root@SoftRouting:~# packs_show -i eth1
----------------------------------------------------------------------------------------------
|Interface   | rxpck/s| txpck/s|  rxMbps|  txMbps| rxbcast/s| txbcast/s| rxmcast/s| txmcast/s|
----------------------------------------------------------------------------------------------
|eth1        |      14|      20|    0.01|    0.02|         0|         0|         0|         0|
----------------------------------------------------------------------------------------------

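24. Using XDP as the data plane for fast forwarding

1. Installation

Install from the official apt repository (older version)

apt install xdp-tools -y

Build from source (newer version)

Install the packages needed for the build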

apt update && apt install make clang m4 pkg-config zlib1g-dev libelf-dev libpcap-dev -y

Clone the xdp-tools project and build it

git clone https://github.com/xdp-project/xdp-tools
cd xdp-tools
./configure
make
make install

Install the xdp-tools binary package and its dependencies from the Debian repository (newer version)

https://packages.debian.org/sid/amd64/xdp-tools/download

https://packages.debian.org/sid/amd64/libxdp1/download

Taking xdp-tools 1.5.7 as an example

Download the packages

wget http://ftp.cn.debian.org/debian/pool/main/x/xdp-tools/xdp-tools_1.5.7-3_amd64.deb
wget http://ftp.cn.debian.org/debian/pool/main/x/xdp-tools/libxdp1_1.5.7-3_amd64.deb

Install the packages

apt install ./*.deb -y

2. Check what your NIC supports

root@SoftRouting:~# xdp-loader features eth0
NETDEV_XDP_ACT_BASIC: yes
NETDEV_XDP_ACT_REDIRECT: yes
NETDEV_XDP_ACT_NDO_XMIT: no
NETDEV_XDP_ACT_XSK_ZEROCOPY: yes
NETDEV_XDP_ACT_HW_OFFLOAD: no
NETDEV_XDP_ACT_RX_SG: no
NETDEV_XDP_ACT_NDO_XMIT_SG: no

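Feature                     | Value | Meaning
NETDEV_XDP_ACT_BASIC        | yes   | Supports the basic actions such as XDP_TX / XDP_PASS / XDP_DROP
NETDEV_XDP_ACT_REDIRECT     | yes   | The most important one: supports XDP_REDIRECT to another NIC, the core of fast cross-interface forwarding
NETDEV_XDP_ACT_XSK_ZEROCOPY | yes   | Supports AF_XDP zero-copy (optional)
NETDEV_XDP_ACT_NDO_XMIT     | no    | Reports whether the driver implements the ndo_xdp_xmit callback, which is what lets a NIC transmit frames redirected to it; some drivers may only report yes once an XDP program is attached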

As shown, my device supports native-mode XDP, which can greatly reduce CPU usage when forwarding data.
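If you want to check several NICs in a script rather than by eye, the "NAME: yes/no" output shown above is easy to parse. Here is a minimal sketch, assuming the output format stays exactly as printed above. The function names are hypothetical, and the rule "BASIC and REDIRECT must both be yes" is this sketch's own rule of thumb, not something xdp-loader enforces.

```cpp
#include <cassert>
#include <map>
#include <sstream>
#include <string>

// Parse lines of the form "NETDEV_XDP_ACT_X: yes|no" into a name -> bool map.
std::map<std::string, bool> parse_features(const std::string& text) {
    std::map<std::string, bool> features;
    std::istringstream in(text);
    std::string line;
    while (std::getline(in, line)) {
        const auto colon = line.find(':');
        if (colon == std::string::npos) continue;  // not a feature line
        const std::string name = line.substr(0, colon);
        std::string value = line.substr(colon + 1);
        const auto first = value.find_first_not_of(" \t\r");
        if (first == std::string::npos) continue;  // nothing after the colon
        const auto last = value.find_last_not_of(" \t\r");
        value = value.substr(first, last - first + 1);
        features[name] = (value == "yes");
    }
    return features;
}

// True when the named feature is present and reported as "yes".
bool feature_on(const std::map<std::string, bool>& features, const std::string& key) {
    assert(!key.empty());
    const auto it = features.find(key);
    return it != features.end() && it->second;
}

// Assumed rule of thumb: native fast forwarding needs BASIC and REDIRECT.
bool supports_native_redirect(const std::map<std::string, bool>& features) {
    return feature_on(features, "NETDEV_XDP_ACT_BASIC") &&
           feature_on(features, "NETDEV_XDP_ACT_REDIRECT");
}
```

Feed it the text captured from xdp-loader features for each interface; for the eth0 output above, supports_native_redirect returns true.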

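3. Scenarios for adding ports to xdp-forward

3-1. Data center scenarios that do not need Netfilter

When a packet arrives at the NIC, the XDP program looks it up in the routing table. If the packet crosses subnets, a matching route exists, and the egress is a different interface, the packet skips the kernel network stack and is handed straight to the driver of the egress NIC. In this mode forwarding relies entirely on the routing table and bypasses the kernel network stack completely, so Netfilter never sees the traffic and firewall rules and NAT have no effect. It therefore suits networks deployed in a data center.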

xdp-forward load eth0 eth1 eth2 eth3

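After the interfaces are added, xdp-forward attaches in native mode by default (it falls back automatically if the driver does not support it), and the lookup mode is full (forwarding follows a full routing lookup rather than a direct lookup of a single FIB table; traffic within the same subnet is not considered).

Note:

1. Always add multiple ports together, otherwise xdp-forward forwarding will not take effect. The interfaces added together act like a forwarding group: traffic whose lookup resolves within the group can be fast-forwarded through XDP redirect.

2. All traffic forwarded across subnets and across ports bypasses the kernel network stack, which disables Netfilter, so this is only suitable for data center scenarios that do not need NAT.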

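3-2. Home and enterprise gateway scenarios that need Netfilter (recommended)

In a home or enterprise gateway, the router has to perform NAT and apply all kinds of Netfilter firewall rules; if the kernel network stack were bypassed entirely, all of that would stop working. However, the xdp-forward developers have already thought of this case: the -f option sets the forwarding mode to flowtable (the default is fib).

This mode relies on the nftables hardware or software flowtable, so the nftables flowtable has to be configured first. When a flow in the flowtable matches, xdp-forward redirects the packet through XDP according to that flow entry, which achieves fast forwarding.

With NAT, the flow entries stored in the nftables flowtable already hold the post-NAT information, and xdp-forward's flowtable mode forwards based on exactly that information. This is what lets NAT and XDP work together, giving NAT offload performance similar to a hardware router, and it is my first recommendation for home and enterprise gateway egress.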

xdp-forward load -f flowtable eth0 eth1 eth2 eth3

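Once a flowtable entry exists, subsequent traffic of that flow is fast-forwarded through XDP redirect. As a result the entry in the flowtable goes unmatched for a long time and expires (30 seconds by default), so packets may frequently have to go through the kernel again to re-create the entry. To mitigate this, raise the default TCP/UDP flowtable timeout in the kernel parameters from 30 seconds to about 10 minutes.

vim /etc/sysctl.d/90-softrouting.conf
# Flowtable timeout: 10 minutes, to reduce expiry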
net.netfilter.nf_flowtable_tcp_timeout=600
net.netfilter.nf_flowtable_udp_timeout=600
sysctl -p /etc/sysctl.d/90-softrouting.conf
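To confirm that the new timeouts are in effect, you can read the values back from procfs, where each sysctl name maps to a file under /proc/sys with the dots replaced by slashes. Below is a small sketch of that check. The helper names sysctl_path and read_sysctl are hypothetical.

```cpp
#include <cassert>
#include <fstream>
#include <string>

// Map a sysctl name such as "net.netfilter.nf_flowtable_tcp_timeout"
// to its file under a procfs root (dots become slashes).
std::string sysctl_path(const std::string& name, const std::string& root = "/proc/sys") {
    assert(!name.empty());
    std::string path = root + "/";
    for (char c : name) path += (c == '.') ? '/' : c;
    return path;
}

// Read an integer sysctl into `out`. Returns false if the file is missing
// (for example because the module is not loaded) or is not a number.
bool read_sysctl(const std::string& name, long& out, const std::string& root = "/proc/sys") {
    std::ifstream in(sysctl_path(name, root));
    long value = 0;
    if (!(in >> value)) return false;
    out = value;
    return true;
}
```

After sysctl -p, calling read_sysctl with "net.netfilter.nf_flowtable_tcp_timeout" should yield 600 on a system where the setting above was applied.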

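Note:

1. Always add multiple ports together, otherwise xdp-forward forwarding will not take effect. The interfaces added together act like a forwarding group: traffic whose lookup resolves within the group can be fast-forwarded through XDP redirect.

2. The nftables flowtable must be configured before loading interfaces into xdp-forward, otherwise an error is reported.

3. The nftables flowtable currently supports only TCP and UDP. Other protocols are not supported yet, but since most network traffic is TCP or UDP this is enough.

4. When configuring the nftables flowtable, devices = { xxx, xxx } must list physical NIC names only, not bridge interface names. Otherwise the flowtable loads successfully but xdp-forward cannot forward correctly based on it.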
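Since the flowtable has to exist before xdp-forward is loaded, it can help to generate the nftables fragment from the same list of physical NICs that you pass to xdp-forward. The sketch below renders such a fragment as text. The table name softrouting, the flowtable name ft and the minimal forward chain are assumptions made for illustration and are not the ruleset used elsewhere in this article.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Render an nftables fragment with a flowtable bound to the given physical
// NICs and a forward chain that offloads TCP/UDP flows into it.
// Table and flowtable names are hypothetical defaults.
std::string render_flowtable(const std::vector<std::string>& devices,
                             const std::string& table = "softrouting",
                             const std::string& flowtable = "ft") {
    assert(!devices.empty());  // a flowtable needs at least one device

    std::string list;
    for (std::size_t i = 0; i < devices.size(); ++i) {
        if (i > 0) list += ", ";
        list += devices[i];
    }

    std::string out;
    out += "table inet " + table + " {\n";
    out += "    flowtable " + flowtable + " {\n";
    out += "        hook ingress priority 0;\n";
    out += "        devices = { " + list + " };\n";  // physical NICs only
    out += "    }\n";
    out += "    chain forward {\n";
    out += "        type filter hook forward priority 0; policy accept;\n";
    out += "        meta l4proto { tcp, udp } flow add @" + flowtable + "\n";
    out += "    }\n";
    out += "}\n";
    return out;
}
```

Calling render_flowtable({"eth0", "eth1", "eth2", "eth3"}) produces text that can be reviewed and then merged into your own ruleset. Keep bridge names such as lan out of the device list, as note 4 explains.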

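Common xdp-forward questions (version 1.5.7-3)

1. In fib mode, XDP redirect forwarding has two prerequisites: (1) the traffic is forwarded between different subnets, and (2) it is forwarded from one interface to another.

2. In flowtable mode the first fib-mode condition can be ignored, because the nftables flowtable also holds entries for traffic within the same subnet; it is enough that the traffic is forwarded between different interfaces.

3. xdp-forward supports:
NAT (in flowtable mode)
PPPoE (in flowtable mode)
VLAN and VLAN sub-interfaces (in fib mode, but VLAN-tagged frames are forwarded to the other interface as-is, without stripping the tag)
bridge (fib mode requires both the cross-subnet and the cross-port condition; flowtable mode requires only the cross-port condition)
policy routing (-F full mode)
VRF interfaces
However, it can only be loaded on physical NICs, not on virtual interfaces (the same applies to the devices list of the nftables flowtable).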
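The conditions listed above can be condensed into a small decision function. This is a toy model of the rules as described in this section, not the logic of the actual BPF program. All type and field names are hypothetical.

```cpp
#include <cassert>
#include <set>
#include <string>

enum class FwdMode { Fib, Flowtable };
enum class Action { Redirect, Pass };

// Hypothetical summary of what is known about one packet.
struct Packet {
    std::string in_if;    // ingress interface
    std::string out_if;   // egress interface chosen by the lookup
    bool crosses_subnet;  // routed between two different subnets
    bool flow_offloaded;  // a matching nftables flowtable entry exists
};

// Decide whether the packet takes the XDP fast path, following the
// prerequisites listed above. `group` is the set of interfaces that were
// loaded into xdp-forward together.
Action decide(FwdMode mode, const Packet& p, const std::set<std::string>& group) {
    assert(!p.in_if.empty());  // every packet arrives on some interface

    // Both ends must be in the forwarding group, and they must differ.
    if (group.count(p.in_if) == 0 || group.count(p.out_if) == 0 || p.in_if == p.out_if) {
        return Action::Pass;
    }
    if (mode == FwdMode::Fib) {
        return p.crosses_subnet ? Action::Redirect : Action::Pass;
    }
    return p.flow_offloaded ? Action::Redirect : Action::Pass;
}
```

In this model a same-subnet packet between two grouped ports is passed to the kernel in fib mode but redirected in flowtable mode once its flow has been offloaded, which matches points 1 and 2.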

4. Verify the interface status

root@SoftRouting:~# xdp-loader status
CURRENT XDP PROGRAM STATUS:

Interface        Prio  Program name      Mode     ID   Tag               Chain actions
--------------------------------------------------------------------------------------
lo                     <No XDP program loaded!>
eth0                   xdp_dispatcher    native   442  bbf8adf53baeeeb7
 =>              50     xdp_fwd_flow_full          452  c11133030d14e240  XDP_PASS
eth1                   xdp_dispatcher    native   457  bbf8adf53baeeeb7
 =>              50     xdp_fwd_flow_full          452  c11133030d14e240  XDP_PASS
eth2                   xdp_dispatcher    native   463  bbf8adf53baeeeb7
 =>              50     xdp_fwd_flow_full          452  c11133030d14e240  XDP_PASS
eth3                   xdp_dispatcher    native   469  bbf8adf53baeeeb7
 =>              50     xdp_fwd_flow_full          452  c11133030d14e240  XDP_PASS
wlx0013ef6f25bd        <No XDP program loaded!>
lan                    <No XDP program loaded!>
docker0                <No XDP program loaded!>
wg0                    <No XDP program loaded!>

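5. Bridged LAN NICs

FIB mode

Load the program directly on the physical NICs eth1 and eth2 (in native mode), then add eth1 and eth2 to the LAN bridge (for example br-lan). The whole flow looks like this:

First packet

Client A (on eth1) → sends an ARP Request (who has 192.168.10.55?)
   ↓
eth1 receives it → the XDP program sees a broadcast frame → XDP_PASS (xdp_fwd_fib_full does not handle broadcast)
   ↓
Enters the kernel network stack → the bridge handles the broadcast normally → floods it to eth2 and eth3
   ↓
Client B (on eth2) answers with an ARP Reply (I am 192.168.10.55, my MAC is xx:xx:xx:xx:xx:xx)
   ↓
eth2 receives it → the reply is unicast, but ARP is not IP traffic, so it is also XDP_PASS → handed to the bridge
   ↓
The bridge does two things at once:
   1. Learning: it sees client B's MAC on port eth2 → writes it to the FDB (MAC → eth2)
   2. Forwarding: it sends the ARP Reply to eth1
   ↓
Client A receives the Reply and fills in its ARP table

Subsequent packets

Client A → sends unicast data packets directly, destination MAC = client B's MAC
   ↓
eth1 receives it → the XDP program looks up the IPv4 FIB and finds that destination IP 192.168.10.55 is in the local subnet
   ↓
But xdp_fwd_fib_full decides whether to REDIRECT purely from the IP routing table
   → the routing table says "direct delivery" (dev lan scope link), so it sees no need to cross interfaces
   → it therefore still returns XDP_PASS (not REDIRECT!)
   ↓
Handed to the bridge → the bridge looks up the FDB, finds that the destination MAC maps to port eth2 → sends it straight out of eth2

Within a single subnet under one bridge, XDP_REDIRECT almost never triggers. What actually delivers "hardware-switch-class fast forwarding" here is the Linux bridge's own FDB lookup plus the speed-up at the XDP native driver layer, not the REDIRECT of xdp_fwd_fib_full.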
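The learn-then-forward behaviour of the bridge in the two traces above can be modelled in a few lines. The class below is a toy illustration of FDB learning and flooding. It is not kernel code, and the name ToyBridge is made up for this sketch.

```cpp
#include <algorithm>
#include <cassert>
#include <map>
#include <string>
#include <utility>
#include <vector>

// Toy model of a learning bridge, for illustration only.
class ToyBridge {
public:
    explicit ToyBridge(std::vector<std::string> ports) : ports_(std::move(ports)) {}

    // Handle one frame and return the ports it is sent out of.
    std::vector<std::string> receive(const std::string& in_port,
                                     const std::string& src_mac,
                                     const std::string& dst_mac) {
        // The frame must arrive on a port that belongs to the bridge.
        assert(std::find(ports_.begin(), ports_.end(), in_port) != ports_.end());

        fdb_[src_mac] = in_port;  // learning: source MAC -> ingress port

        const std::string broadcast = "ff:ff:ff:ff:ff:ff";
        const auto hit = fdb_.find(dst_mac);
        if (dst_mac != broadcast && hit != fdb_.end()) {
            if (hit->second == in_port) return {};  // already on that segment
            return {hit->second};                   // known unicast: one port
        }
        // Broadcast or unknown destination: flood to every other port.
        std::vector<std::string> out;
        for (const auto& port : ports_) {
            if (port != in_port) out.push_back(port);
        }
        return out;
    }

private:
    std::vector<std::string> ports_;
    std::map<std::string, std::string> fdb_;  // MAC -> port
};
```

Replaying the traces: the ARP request from eth1 is flooded to eth2 and eth3, the reply arriving on eth2 goes only to eth1, and later data from client A leaves only through eth2, all without any IP routing decision.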

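Why can XDP native speed up same-subnet forwarding?

1. Without XDP loaded: when a packet enters the kernel from the NIC RX queue, an skb (struct sk_buff, the Linux packet data structure) has to be allocated immediately, and the packet then enters the network stack through netif_receive_skb(). This involves multiple memory allocations, CPU interrupt handling and context switches. In high-PPS (packets per second) scenarios in particular the overhead is large, which raises CPU utilization and latency.

2. With XDP loaded (native mode), even when it returns XDP_PASS:

  • The XDP program runs at the earliest point in the driver (the RX queue), before the skb is allocated. It can inspect the packet quickly (e.g. parse the headers) and then return PASS.

  • Only after PASS is the skb allocated and the packet sent into the kernel stack (including netfilter prerouting/forward and so on).

Key optimization: in native mode netfilter still works normally after PASS, but XDP pre-filtering means netfilter only handles valid packets, which indirectly improves its performance (less useless traffic enters the chains).

Result: with XDP loaded, forwarding latency between hosts in the same subnet drops by 10-30%, PPS increases (up to line-rate levels), and CPU consumption decreases, especially under high load (such as LAN file sharing or video streaming).

Without XDP: RX → skb allocation → prerouting → bridge soft-forward (high CPU) → TX;

Flowtable mode

Same as in "3-2. Home and enterprise gateway scenarios that need Netfilter (recommended)".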

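6. NICs that need Internet access (NAT required, home/enterprise egress)

In its default fib mode xdp-forward does not support NAT, but a home setup usually needs a NIC that can do NAT. In that case, simply leave the WAN interface out when adding interfaces to xdp-forward.

This only gives fast communication inside the LAN, so it is a transitional setup.

WAN ── eth0 (10GbE, supports native XDP)   ← only handles Internet access, no XDP loaded
                 ▲
                 │  (kernel routing + nf_tables NAT (masquerade))
                 │
lan bridge (br-lan) ── eth1 ┐
                    ── eth2 ├─ all three support native XDP
                    ── eth3 ┘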

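Can NICs that support native mode and NICs that do not be added to xdp-forward together? No! They can be added successfully, but fast forwarding stops working.

Scenario                                     | Still fast-forwarded by XDP (native REDIRECT)? | Actual behaviour
native-capable NIC A ↔ native-capable NIC B | Yes, at full speed                             | Line rate, CPU barely moves
native-capable NIC A ↔ non-native NIC C     | No (fast forwarding fails)                     | Packets fall back to XDP_PASS and go through normal kernel bridging/routing; speed drops sharply
non-native NIC C ↔ non-native NIC D         | Not possible                                   | Only skb/generic mode is available (if forced), with poor performance, or loading fails outright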

7. Check that fast forwarding is working

root@SoftRouting:~# xdp-monitor -s -i 1

Summary                        87 redir/s               0 err,drop/s            0 xmit/s
Summary                        74 redir/s               0 err,drop/s            0 xmit/s
Summary                        66 redir/s               0 err,drop/s            0 xmit/s
Summary                        66 redir/s               0 err,drop/s            0 xmit/s
Summary                        66 redir/s               0 err,drop/s            0 xmit/s
Summary                        66 redir/s               0 err,drop/s            0 xmit/s

When traffic is actually being redirected by XDP, redir/s shows non-zero values.

8. Load automatically at boot with systemd

Taking the home/enterprise gateway scenario as an example

vim /etc/systemd/system/xdp-forward.service
[Unit]
Description=Load XDP Forward on Network Interfaces
After=nftables.service
Wants=nftables.service

[Service]
Type=oneshot
ExecStart=/usr/sbin/xdp-forward load -f flowtable eth0 eth1 eth2 eth3
RemainAfterExit=true
ExecStop=/usr/sbin/xdp-forward unload eth0 eth1 eth2 eth3
[Install]
WantedBy=multi-user.target
systemctl start xdp-forward.service
systemctl enable xdp-forward.service

9. A more readable pps display

vim xdp_pps.cpp
#include <iostream>
#include <string>
#include <cstdio>
#include <cstdlib>
#include <unistd.h>
#include <cstring>
#include <iomanip>
#include <algorithm>
#include <cctype>
#include <chrono>
#include <sstream>
#include <sys/ioctl.h>
#include <cmath>

// ANSI color codes
#define BRIGHT_CYAN "\033[1;36m"
#define BRIGHT_GREEN "\033[1;32m"
#define BRIGHT_WHITE "\033[1;37m"
#define RESET "\033[0m"

// Function to print horizontal border
void print_border(int total_width) {
    std::cout << std::string(total_width, '-') << std::endl;
}

// Function to print header row
void print_header(int col_setw[6]) {
    std::cout << "|" << BRIGHT_CYAN << std::left << std::setw(col_setw[0]) << "Xdp-Monitor "
              << RESET << "| " << BRIGHT_CYAN << std::left << std::setw(col_setw[1]) << "Redir/s"
              << RESET << "| " << BRIGHT_CYAN << std::left << std::setw(col_setw[2]) << "PP/s "
              << RESET << "| " << BRIGHT_CYAN << std::left << std::setw(col_setw[3]) << "Err,Drop/s"
              << RESET << "| " << BRIGHT_CYAN << std::left << std::setw(col_setw[4]) << "Total Redir"
              << RESET << "| " << BRIGHT_CYAN << std::left << std::setw(col_setw[5]) << "Runtime"
              << RESET << "|" << std::endl;
}

// Function to format runtime as "Xs"
std::string format_runtime(std::chrono::steady_clock::time_point start) {
    auto now = std::chrono::steady_clock::now();
    auto duration = std::chrono::duration_cast<std::chrono::seconds>(now - start);
    long seconds = duration.count();
    std::stringstream ss;
    ss << seconds << "s";
    return ss.str();
}

// Function to print data row (summary in bright white, data in bright green)
void print_data(const char* label, const char* redir, const char* pps, const char* err_drop, const char* total_redir, const std::string& runtime, int col_setw[6]) {
    std::cout << "|" << BRIGHT_WHITE << std::left << std::setw(col_setw[0]) << label
              << RESET << "| " << BRIGHT_GREEN << std::left << std::setw(col_setw[1]) << redir
              << RESET << "| " << BRIGHT_GREEN << std::left << std::setw(col_setw[2]) << pps
              << RESET << "| " << BRIGHT_GREEN << std::left << std::setw(col_setw[3]) << err_drop
              << RESET << "| " << BRIGHT_GREEN << std::left << std::setw(col_setw[4]) << total_redir
              << RESET << "| " << BRIGHT_GREEN << std::left << std::setw(col_setw[5]) << runtime
              << RESET << "|" << std::endl;
}

// Function to refresh the data row without clearing the screen
void refresh_data(const char* redir, const char* pps, const char* err_drop, const char* total_redir, const std::string& runtime, int col_setw[6], int total_width) {
    // Move cursor up 2 lines (to overwrite the old data row and bottom border)
    std::cout << "\033[2A";

    // Print new data row
    print_data("Summary ", redir, pps, err_drop, total_redir, runtime, col_setw);

    // Print bottom border again (unchanged)
    print_border(total_width);
    std::cout << std::flush;
}

int main() {
    // Optimize stdout buffering
    setvbuf(stdout, NULL, _IOFBF, 0);

    // Get terminal width
    struct winsize ws;
    int term_width_val = 121; // default
    if (ioctl(STDOUT_FILENO, TIOCGWINSZ, &ws) != -1) {
        term_width_val = ws.ws_col;
    }
    int fixed_sep = 12;
    int orig_content = 109;
    int target_width = std::min(term_width_val, 121);
    int target_content = target_width - fixed_sep;
    double scale = static_cast<double>(target_content) / orig_content;

    int min_widths[6] = {12, 7, 5, 10, 11, 7};
    int orig_setw[6] = {13, 21, 21, 21, 21, 12};
    int col_setw[6];
    int sum_content = 0;

    for (int i = 0; i < 6; ++i) {
        col_setw[i] = std::max(min_widths[i], static_cast<int>(std::round(orig_setw[i] * scale)));
        sum_content += col_setw[i];
    }

    // Adjust if sum_content > target_content
    while (sum_content > target_content) {
        int max_idx = 0;
        for (int i = 1; i < 6; ++i) {
            if (col_setw[i] > col_setw[max_idx]) {
                max_idx = i;
            }
        }
        if (col_setw[max_idx] > min_widths[max_idx]) {
            --col_setw[max_idx];
            --sum_content;
        } else {
            break;
        }
    }

    int total_width = fixed_sep + sum_content;

    // Record start time
    auto start_time = std::chrono::steady_clock::now();

    // Command to execute
    const char* cmd = "xdp-monitor -s -i 1";

    // Open pipe to command output
    FILE* pipe = popen(cmd, "r");
    if (!pipe) {
        std::cerr << "Error opening pipe to command." << std::endl;
        return EXIT_FAILURE;
    }

    char buffer[1024];
    char redir[32] = "0";
    char pps[32] = "0";
    char err_drop[32] = "0";
    long long total_redir_count = 0;
    char total_redir_str[32] = "0";

    // Print the entire table once (header and initial data)
    print_border(total_width);
    print_header(col_setw);
    print_border(total_width);
    print_data("Summary ", redir, pps, err_drop, total_redir_str, format_runtime(start_time), col_setw);
    print_border(total_width);

    // Loop to read output continuously (low CPU as fgets blocks)
    while (fgets(buffer, sizeof(buffer), pipe) != nullptr) {
        size_t len = strlen(buffer);
        if (len == 0 || buffer[len-1] != '\n') continue;  // Skip incomplete/empty

        buffer[len-1] = '\0';  // Trim newline

        // Quick check for "Summary" (case-sensitive as per example)
        if (strncmp(buffer, "Summary", 7) == 0) {
            // Use sscanf for fast parsing (format: "Summary %s redir/s %s err,drop/s %s xmit/s")
            // Adjust fields if exact format varies; assumes values are numeric strings
            char temp1[32], temp2[32], temp3[32];
            if (sscanf(buffer, "Summary %31s redir/s %31s err,drop/s %31s xmit/s", redir, err_drop, pps) == 3 ||
                sscanf(buffer, "Summary %31s redir/s %31s err,drop/s %31s", redir, err_drop, temp3) == 3) {  // Fallback if no xmit/s
                // If no pps/xmit, keep "0"
            } else {
                // Default on parse fail
                strcpy(redir, "0");
                strcpy(pps, "0");
                strcpy(err_drop, "0");
            }

            // Accumulate total redir (assuming redir is per second and updates every second)
            total_redir_count += atoll(redir);
            snprintf(total_redir_str, sizeof(total_redir_str), "%lld", total_redir_count);

            // Refresh only the data row and bottom border with current runtime
            refresh_data(redir, pps, err_drop, total_redir_str, format_runtime(start_time), col_setw, total_width);
        }
    }

    pclose(pipe);
    return EXIT_SUCCESS;
}

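Compile

g++ -std=c++17 -O3 -static xdp_pps.cpp -o xdp_pps

Move it into /usr/bin/ to use it

The result looks like this

Performance evaluation

  • CPU usage: the core loop reads output from the pipe with fgets, which is a blocking call (it consumes no CPU while there is no new data). xdp-monitor -s -i 1 prints one summary per second, and the program only does light work when data arrives (parsing the string, accumulating the counter, refreshing the display). There is no busy waiting or high-frequency polling, so CPU usage is very low: close to 0% when idle and only millisecond-level overhead while processing. The refresh uses ANSI escape codes to rewrite specific lines instead of clearing the screen or redrawing the whole terminal, which further reduces the work.

  • Memory usage: the program uses a fixed-size buffer (1024 bytes) and a few local variables, with no long-lived dynamic allocations and nothing that can leak. The overall memory footprint is small (a few KB) and does not grow over time.

  • I/O optimization: setvbuf(stdout, NULL, _IOFBF, 0) switches stdout to full buffering, which reduces the number of system calls. Parsing uses the efficient sscanf and atoll, which suits real-time processing.

  • Overall efficiency: based on static code analysis and benchmarks of similar tools, this wrapper adds no significant overhead. xdp-monitor itself is designed as a low-overhead tool that uses BPF tracepoints to monitor XDP statistics without interfering with network performance, so it is suitable for high-performance environments.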

25. Custom welcome screen

cd /etc/update-motd.d
rm -rf *
vim 99-softrouting
#!/bin/bash

echo
cat <<'BANNER'

  ███████╗ ██████╗ ███████╗ ████████╗██████╗  ██████╗  ██╗   ██╗ ████████╗ ██╗ ███╗   ██╗  ██████╗
  ██╔════╝██╔═══██╗██╔════╝ ╚══██╔══╝██╔══██╗██╔═══██╗ ██║   ██║ ╚══██╔══╝ ██║ ████╗  ██║ ██╔════╝
  ███████╗██║   ██║█████╗      ██║   ███████║██║   ██║ ██║   ██║    ██║    ██║ ██╔██╗ ██║ ██║  ███╗
  ╚════██║██║   ██║██╔══╝      ██║   ██╔══██║██║   ██║ ██║   ██║    ██║    ██║ ██║╚██╗██║ ██║   ██║
  ███████║╚██████╔╝██║         ██║   ██║  ██║╚██████╔╝ ╚██████╔╝    ██║    ██║ ██║ ╚████║ ╚██████╔╝
  ╚══════╝ ╚═════╝ ╚═╝         ╚═╝   ╚═╝  ╚═╝ ╚═════╝   ╚═════╝     ╚═╝    ╚═╝ ╚═╝  ╚═══╝  ╚═════╝

BANNER
echo

printf "\033[36m Hostname\033[0m    %-25s   \033[36mGateway IP\033[0m   %s\n" "$(uname -n)" "$(ip -4 route show default 2>/dev/null | awk '{print $3}' | head -n1)"
printf "\033[36m OS\033[0m          %-25s   \033[36mKernel\033[0m       %s\n" "$(cat /etc/os-release | grep PRETTY_NAME | cut -d'"' -f2 | awk '{print $1,$2}')" "$(uname -r)"
printf "\033[36m Uptime\033[0m      %-25s   \033[36mTime\033[0m         %s\n" "$(awk '{s=$1; days=s/86400; if(days<1) printf("%.1f hours", s/3600); else printf("%.1f days", days)}' /proc/uptime)" "$(date +'%Y-%m-%d %H:%M:%S')"

echo
chmod +x 99-softrouting
bash 99-softrouting

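The result looks like this

Disable the default motd-news.service. This service pushes the latest security update news to users after login; once the login script has been customized it is no longer needed.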

systemctl mask motd-news.service

26. Custom connection tracking

vim conn_show.cpp
#include <iostream>
#include <string>
#include <vector>
#include <cstdio>
#include <getopt.h>
#include <cctype> // for isupper
#include <sstream>
#include <iomanip>
#include <algorithm>
#include <cstring>
#include <termios.h>
#include <unistd.h>
#include <sys/select.h>
#include <signal.h>
#include <cstdlib> // for system

const std::string BRIGHT_CYAN = "\033[1;36m";
const std::string BRIGHT_YELLOW = "\033[1;33m";
const std::string BRIGHT_GREEN = "\033[1;32m";
const std::string BRIGHT_BLUE = "\033[1;34m";
const std::string BRIGHT_RED = "\033[1;31m";
const std::string BRIGHT_WHITE = "\033[1;37m";
const std::string RESET = "\033[0m";

std::string g_dashline;
std::vector<size_t> g_colw;
std::vector<std::vector<std::string>> g_display_rows; // Store parsed display rows
size_t g_total = 0;

void sigint_handler(int sig) {
    std::cout << BRIGHT_RED << g_dashline << RESET << std::endl;
    std::cout << "Terminated by user." << std::endl;
    exit(0);
}

bool wait_space() {
    std::cout << "......Press <space> for next page, Ctrl+C to quit......";
    std::cout.flush();
    int fd = STDIN_FILENO;
    struct termios oldt, newt;
    if (tcgetattr(fd, &oldt) != 0) return false;
    newt = oldt;
    cfmakeraw(&newt);
    tcsetattr(fd, TCSADRAIN, &newt);
    fd_set rfds;
    FD_ZERO(&rfds);
    FD_SET(fd, &rfds);
    int ret = select(fd + 1, &rfds, NULL, NULL, NULL);
    bool is_space = false;
    if (ret > 0 && FD_ISSET(fd, &rfds)) {
        char ch;
        if (read(fd, &ch, 1) == 1) {
            is_space = (ch == ' ');
        }
    }
    std::cout << "\r" << std::string(60, ' ') << "\r";
    std::cout.flush();
    tcsetattr(fd, TCSADRAIN, &oldt);
    return is_space;
}

void batch_print_rows(const std::vector<std::vector<std::string>>& rows, size_t start, size_t end, int out_fd) {
    const size_t PAGE_BUF_SIZE = 32768; // Large buffer for batch writes
    char page_buf[PAGE_BUF_SIZE];
    char* page_ptr = page_buf;
    size_t page_remaining = PAGE_BUF_SIZE;

    auto flush_buffer = [&]() {
        size_t written = PAGE_BUF_SIZE - page_remaining;
        if (written > 0) {
            ssize_t res = write(out_fd, page_buf, written);
            (void)res; // Suppress unused warning
        }
        page_ptr = page_buf;
        page_remaining = PAGE_BUF_SIZE;
    };

    for (size_t i = start; i < end; ++i) {
        const auto& row = rows[i];
        std::string row_str = "|";

        for (size_t j = 0; j < 10; ++j) {
            std::string disp = row[j];
            bool is_slash = (disp == "/");
            std::string color = "";
            if (!is_slash) {
                if (j == 0) { // Proto
                    color = BRIGHT_GREEN;
                } else if (j == 1 || j == 2 || j == 3 || j == 4) { // Merged IPs:Ports: Ori_Src, Ori_Dst, Rep_Src, Rep_Dst
                    std::string proto = row[0];
                    bool has_port = (proto == "tcp" || proto == "udp");
                    if (has_port) {
                        size_t colon_pos = disp.rfind(':');
                        if (colon_pos != std::string::npos) {
                            std::string ip_part = disp.substr(0, colon_pos);
                            std::string colon = ":";
                            std::string port_part = disp.substr(colon_pos + 1);
                            disp = BRIGHT_YELLOW + ip_part + RESET + BRIGHT_WHITE + colon + RESET + BRIGHT_GREEN + port_part + RESET;
                        }
                    } else {
                        disp = BRIGHT_YELLOW + disp + RESET;
                    }
                } else if (j == 5) { // Use
                    color = BRIGHT_YELLOW;
                } else if (j == 7) { // State
                    color = BRIGHT_BLUE;
                } else if (j == 8) { // Flags
                    color = BRIGHT_GREEN;
                } // Mark and Expire no color
            }
            if (!color.empty()) {
                disp = color + disp + RESET;
            }
            row_str += disp + std::string(g_colw[j] - row[j].length(), ' ') + " |";
        }
        row_str += "\n";

        size_t row_len = row_str.length();
        if (page_remaining < row_len) {
            flush_buffer();
        }
        memcpy(page_ptr, row_str.c_str(), row_len);
        page_ptr += row_len;
        page_remaining -= row_len;
    }

    flush_buffer();
}

int main(int argc, char* argv[]) {
    bool ipv4 = true;
    bool ipv6 = false;
    bool count_only = false;
    std::string filter_state = "";
    std::string protocol_filter = "";
    std::string keyword = "";
    size_t page_size = 30;
    bool show_all = false;
    int opt;

    while ((opt = getopt(argc, argv, "46f:t:p:i:ch")) != -1) {
        switch (opt) {
            case '4':
                ipv4 = true;
                ipv6 = false;
                break;
            case '6':
                ipv4 = false;
                ipv6 = true;
                break;
            case 'f':
                filter_state = optarg;
                break;
            case 't':
                protocol_filter = optarg;
                break;
            case 'p':
                if (std::string(optarg) == "all") {
                    show_all = true;
                } else {
                    page_size = std::stoul(optarg);
                }
                break;
            case 'i':
                keyword = optarg;
                break;
            case 'c':
                count_only = true;
                break;
            case 'h':
                std::cout << "Usage: " << argv[0] << " [-4] [-6] [-f state] [-t protocol] [-p N|all] [-i keyword] [-c] [-h]" << std::endl;
                std::cout << "  -4: Show IPv4 connections (default)" << std::endl;
                std::cout << "  -6: Show IPv6 connections" << std::endl;
                std::cout << "  -f: Filter by state" << std::endl;
                std::cout << "  -t: Filter by protocol" << std::endl;
                std::cout << "  -p: Page size (N) or show all (all)" << std::endl;
                std::cout << "  -i: Filter by keyword" << std::endl;
                std::cout << "  -c: Show total connections count (IPv4 + IPv6)" << std::endl;
                std::cout << "  -h: Show this help message" << std::endl;
                return 0;
            default:
                std::cerr << "Usage: " << argv[0] << " [-4] [-6] [-f state] [-t protocol] [-p N|all] [-i keyword] [-c] [-h]" << std::endl;
                return 1;
        }
    }

    // Check if conntrack is installed
    if (system("command -v conntrack >/dev/null 2>&1") != 0) {
        std::cerr << "conntrack is not installed. Please install conntrack first." << std::endl;
        return 1;
    }

    if (count_only) {
        FILE* pipe = popen("conntrack -C", "r");
        if (!pipe) {
            std::cerr << "Error executing conntrack -C command." << std::endl;
            return 1;
        }
        char buffer[128];
        if (fgets(buffer, sizeof(buffer), pipe) != NULL) {
            std::string count_str = buffer;
            if (count_str.back() == '\n') count_str.pop_back();
            std::cout << "Total connections: " << count_str << std::endl;
        } else {
            std::cout << "Total connections: 0" << std::endl;
        }
        pclose(pipe);
        return 0;
    }

    std::string cmd = "conntrack -L";
    if (ipv4) {
        cmd += " --family ipv4";
    } else if (ipv6) {
        cmd += " --family ipv6";
    }
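    if (!protocol_filter.empty()) {
        // The value is interpolated into a shell command, so accept only alphanumerics
        if (!std::all_of(protocol_filter.begin(), protocol_filter.end(),
                         [](unsigned char c) { return std::isalnum(c) != 0; })) {
            std::cerr << "Invalid protocol: " << protocol_filter << std::endl;
            return 1;
        }
        cmd += " -p " + protocol_filter;
    }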
    cmd += " 2>/dev/null"; // Suppress the conntrack summary message

    FILE* pipe = popen(cmd.c_str(), "r");
    if (!pipe) {
        std::cerr << "Error executing conntrack command." << std::endl;
        return 1;
    }

    char buffer[2048]; // Larger buffer for longer lines (IPv6)
    std::vector<std::string> lines;
    while (fgets(buffer, sizeof(buffer), pipe) != NULL) {
        std::string line = buffer;
        if (line.back() == '\n') line.pop_back();
        lines.push_back(line);
    }
    pclose(pipe);

    std::vector<std::string> headers = {"Proto", "Ori_SrcAddr", "Ori_DstAddr", "Rep_SrcAddr", "Rep_DstAddr", "Use", "Mark", "State", "Flags", "Expire"};

    bool is_ipv6 = ipv6;

    for (const std::string& line : lines) {
        if (line.empty()) continue;

        std::istringstream iss(line);
        std::vector<std::string> tokens;
        std::string token;
        while (iss >> token) {
            tokens.push_back(token);
        }
        if (tokens.size() < 10) continue; // Minimal check

        std::string proto = tokens[0];
        std::string expire;
        size_t idx;
        if (tokens.size() > 2 && std::all_of(tokens[2].begin(), tokens[2].end(), ::isdigit)) {
            expire = tokens[2];
            idx = 3;
        } else {
            expire = "";
            idx = 2;
        }

        std::string state = "";
        if (idx < tokens.size() && tokens[idx].find('=') == std::string::npos && std::isupper(tokens[idx][0])) {
            state = tokens[idx];
            idx++;
        }

        // Filter check for state
        bool match = filter_state.empty();
        if (!match) {
            if (state == filter_state || line.find("[" + filter_state + "]") != std::string::npos) {
                match = true;
            }
        }
        if (!match) continue;

        // Determine if has ports
        bool has_ports = (proto == "tcp" || proto == "udp");
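        // conntrack reports ICMP over IPv6 as "icmpv6"; both carry type/code/id instead of ports
        bool is_icmp = (proto == "icmp" || proto == "icmpv6");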

        // Parse fields
        std::string f_src, f_dst, f_sport, f_dport;
        std::string s_src, s_dst, s_sport, s_dport;
        std::string mark = "";
        std::string use = "";
        std::string flags = "";

        bool parse_success = true;
        try {
            // First direction
            if (idx >= tokens.size() || tokens[idx].substr(0, 4) != "src=") { parse_success = false; }
            else { f_src = tokens[idx++].substr(4); }

            if (idx >= tokens.size() || tokens[idx].substr(0, 4) != "dst=") { parse_success = false; }
            else { f_dst = tokens[idx++].substr(4); }

            if (has_ports) {
                if (idx >= tokens.size() || tokens[idx].substr(0, 6) != "sport=") { parse_success = false; }
                else { f_sport = tokens[idx++].substr(6); }

                if (idx >= tokens.size() || tokens[idx].substr(0, 6) != "dport=") { parse_success = false; }
                else { f_dport = tokens[idx++].substr(6); }
            } else if (is_icmp) {
                // Skip type/code/id for first, but parse to advance idx
                idx += 3; // type= code= id=
                if (idx > tokens.size()) { parse_success = false; }
            }

            // Collect flags after first direction if any
            while (idx < tokens.size() && tokens[idx][0] == '[') {
                flags += tokens[idx];
                idx++;
            }

            // Second direction
            if (idx >= tokens.size() || tokens[idx].substr(0, 4) != "src=") { parse_success = false; }
            else { s_src = tokens[idx++].substr(4); }

            if (idx >= tokens.size() || tokens[idx].substr(0, 4) != "dst=") { parse_success = false; }
            else { s_dst = tokens[idx++].substr(4); }

            if (has_ports) {
                if (idx >= tokens.size() || tokens[idx].substr(0, 6) != "sport=") { parse_success = false; }
                else { s_sport = tokens[idx++].substr(6); }

                if (idx >= tokens.size() || tokens[idx].substr(0, 6) != "dport=") { parse_success = false; }
                else { s_dport = tokens[idx++].substr(6); }
            } else if (is_icmp) {
                // Skip second type/code/id
                idx += 3;
                if (idx > tokens.size()) { parse_success = false; }
            }

            // Collect flags [ ]
            while (idx < tokens.size() && tokens[idx][0] == '[') {
                flags += tokens[idx];
                idx++;
            }

            // Mark
            if (idx < tokens.size() && tokens[idx].substr(0, 5) == "mark=") {
                mark = tokens[idx].substr(5);
                idx++;
            }
            // Use
            if (idx < tokens.size() && tokens[idx].substr(0, 4) == "use=") {
                use = tokens[idx].substr(4);
                idx++;
            }
        } catch (...) {
            parse_success = false;
        }

        if (!parse_success) continue;

        // Keyword filter
        if (!keyword.empty()) {
            bool keyword_match = false;
            std::vector<std::string> fields = {proto, f_src, f_sport, f_dst, f_dport, s_src, s_sport, s_dst, s_dport, use, mark, state, flags, expire};
            for (const auto& field : fields) {
                if (field.find(keyword) != std::string::npos) {
                    keyword_match = true;
                    break;
                }
            }
            if (!keyword_match) continue;
        }

        g_total++;

        // Add row with merged IP:Port
        std::string ori_src = f_src;
        if (!f_sport.empty()) {
            if (is_ipv6) {
                ori_src = "[" + f_src + "]:" + f_sport;
            } else {
                ori_src = f_src + ":" + f_sport;
            }
        }

        std::string ori_dst = f_dst;
        if (!f_dport.empty()) {
            if (is_ipv6) {
                ori_dst = "[" + f_dst + "]:" + f_dport;
            } else {
                ori_dst = f_dst + ":" + f_dport;
            }
        }

        std::string rep_src = s_src;
        if (!s_sport.empty()) {
            if (is_ipv6) {
                rep_src = "[" + s_src + "]:" + s_sport;
            } else {
                rep_src = s_src + ":" + s_sport;
            }
        }

        std::string rep_dst = s_dst;
        if (!s_dport.empty()) {
            if (is_ipv6) {
                rep_dst = "[" + s_dst + "]:" + s_dport;
            } else {
                rep_dst = s_dst + ":" + s_dport;
            }
        }

        g_display_rows.push_back({proto, ori_src, ori_dst, rep_src, rep_dst, use, mark, state, flags, expire});
    }

    // Compute column widths
    g_colw.resize(10, 0);
    for (size_t i = 0; i < 10; ++i) {
        g_colw[i] = headers[i].length();
    }
    for (const auto& row : g_display_rows) {
        for (size_t i = 0; i < 10; ++i) {
            g_colw[i] = std::max(g_colw[i], row[i].length());
        }
    }

    // Lambda to print separator
    auto print_separator = [&]() {
        std::cout << "+";
        for (size_t w : g_colw) {
            std::cout << std::string(w + 1, '-') << "+";
        }
        std::cout << std::endl;
    };

    // Compute total width
    size_t total_width = 1;
    for (size_t w : g_colw) {
        total_width += w + 2;
    }

    // Print total connections line centered and wrapped
    std::string family = ipv4 ? "IPv4" : "IPv6";
    std::string total_str = "Total " + family + " connections: " + std::to_string(g_total);
    size_t inner_width = total_width - 2;
    size_t text_len = total_str.length();
    size_t left_dash = (inner_width - text_len) / 2;
    size_t right_dash = inner_width - text_len - left_dash;
    std::cout << "|" << std::string(left_dash, '-') << BRIGHT_BLUE << total_str << RESET << std::string(right_dash, '-') << "|" << std::endl;

    // Top border
    print_separator();

    // Header row with color
    std::cout << "|";
    std::cout << BRIGHT_CYAN;
    for (size_t i = 0; i < 10; ++i) {
        std::cout << std::left << std::setw(g_colw[i]) << headers[i] << " |";
    }
    std::cout << RESET << std::endl;

    // Header separator
    print_separator();

    // Setup for paging
    size_t display_total = g_display_rows.size();
    int out_fd = STDOUT_FILENO;
    signal(SIGINT, sigint_handler);

    if (show_all || display_total <= page_size) {
        batch_print_rows(g_display_rows, 0, display_total, out_fd);
    } else {
        size_t i = 0;
        while (i < display_total) {
            size_t end = i + page_size;
            if (end > display_total) end = display_total;
            batch_print_rows(g_display_rows, i, end, out_fd);
            i = end;
            if (i < display_total && !wait_space()) break;
        }
    }

    // Bottom border
    print_separator();

    return 0;
}
g++ -std=c++17 -O3 -static conn_show.cpp -o conn_show

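Install:

chmod +x conn_show
mv conn_show /usr/bin/

Options:

-4: show IPv4 connections (default)

-6: show IPv6 connections

-i: show only entries that match a keyword

-p: number of rows per page (default 30); -p all prints everything at once

-t: show only the given protocol (default: all)

-f: show only the given state (default: all)

-c: print only the total connection count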
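
To show where the table columns come from, here is a minimal sketch of how one line of conntrack -L output splits into the original and reply directions. It is a simplified illustration, not the code used by conn_show above: the names Direction, ParsedConn and parse_conntrack_line are made up for this example, and it deliberately ignores ICMP entries, flags, mark and use.

```cpp
#include <cstddef>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical types for illustration only: one direction of a tracked connection.
struct Direction {
    std::string src, dst, sport, dport;
};

struct ParsedConn {
    std::string proto;   // first token, e.g. "tcp"
    std::string state;   // bare upper-case word before the first src=, may be empty
    Direction original;  // first src=/dst=/sport=/dport= group
    Direction reply;     // second src=/dst=/sport=/dport= group
    bool ok = false;     // true only if both directions were seen
};

// Split the line on whitespace, then sort key=value tokens into the two directions.
// Every "src=" token starts a new direction.
ParsedConn parse_conntrack_line(const std::string& line) {
    ParsedConn out;
    std::istringstream iss(line);
    std::vector<std::string> tokens;
    std::string tok;
    while (iss >> tok) tokens.push_back(tok);
    if (tokens.empty()) return out;
    out.proto = tokens[0];

    int src_seen = 0;
    for (std::size_t i = 1; i < tokens.size(); ++i) {
        const std::string& t = tokens[i];
        const std::size_t eq = t.find('=');
        if (eq == std::string::npos) {
            // Tokens without '=' are numbers, flags like [ASSURED], or the state.
            if (src_seen == 0 && t[0] >= 'A' && t[0] <= 'Z') out.state = t;
            continue;
        }
        const std::string key = t.substr(0, eq);
        const std::string val = t.substr(eq + 1);
        if (key == "src") ++src_seen;
        if (src_seen == 0 || src_seen > 2) continue;  // e.g. mark=, use=, or extra groups
        Direction& d = (src_seen == 1) ? out.original : out.reply;
        if (key == "src") d.src = val;
        else if (key == "dst") d.dst = val;
        else if (key == "sport") d.sport = val;
        else if (key == "dport") d.dport = val;
    }
    out.ok = (src_seen == 2);
    return out;
}
```

Given a line like "tcp 6 431999 ESTABLISHED src=192.168.1.10 dst=1.1.1.1 sport=51000 dport=443 src=1.1.1.1 dst=192.168.1.10 sport=443 dport=51000 [ASSURED] mark=0 use=1", the original direction becomes the Ori_SrcAddr/Ori_DstAddr columns and the reply direction becomes Rep_SrcAddr/Rep_DstAddr. On a NAT router the two directions differ, which is exactly why both are worth showing.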

26. Custom program for adding and viewing routes

vim router.cpp
#include <iostream>
#include <string>
#include <vector>
#include <regex>
#include <chrono>
#include <unistd.h>
#include <sys/wait.h>
#include <sys/types.h>
#include <cstdio>
#include <cstdlib>
#include <dirent.h>
#include <fstream>
#include <iomanip>
#include <fcntl.h>
#include <cstring>
#include <asm/types.h>
#include <sys/socket.h>
#include <linux/netlink.h>
#include <linux/rtnetlink.h>
#include <linux/if_link.h>
#include <net/if.h>
#include <arpa/inet.h>
#include <errno.h>
#include <unordered_map>
#include <termios.h>
#include <sys/select.h>
#include <signal.h>
#include <getopt.h>
#include <ctype.h>
#include <sstream>
#include <netinet/in.h>
#include <thread>
#include <mutex>
#include <linux/neighbour.h>  // Added: provides ndmsg and the NUD_* states

using namespace std;

string chomp(const string& s) {
    string t = s;
    size_t pos = t.find_last_not_of("\r\n");
    if (pos != string::npos) {
        t.erase(pos + 1);
    }
    return t;
}

string get_temp(bool is_dir = false) {
    string cmd = is_dir ? "mktemp -d" : "mktemp";
    FILE* fp = popen(cmd.c_str(), "r");
    if (!fp) return "";
    char buf[256];
    if (fgets(buf, sizeof(buf), fp) == nullptr) {
        pclose(fp);
        return "";
    }
    pclose(fp);
    return chomp(buf);
}

struct Cleaner {
    string dir;
    Cleaner(const string& d) : dir(d) {}
    ~Cleaner() {
        string rm_cmd = "rm -rf \"" + dir + "\"";
        if (system(rm_cmd.c_str()) != 0) {
            cerr << "Warning: Failed to remove temporary directory: " << dir << endl;
        }
    }
};

// Netlink helper functions
static void parse_rtattr(struct rtattr **tb, int max, struct rtattr *rta, int len) {
    memset(tb, 0, sizeof(struct rtattr *) * (max + 1));
    while (RTA_OK(rta, len)) {
        if (rta->rta_type <= max)
            tb[rta->rta_type] = rta;
        rta = RTA_NEXT(rta, len);
    }
}

static void addattr_l(struct nlmsghdr *n, int maxlen, int type, const void *data, int alen) {
    int len = RTA_LENGTH(alen);
    struct rtattr *rta;
    if (NLMSG_ALIGN(n->nlmsg_len) + RTA_ALIGN(len) > maxlen) {
        cerr << "addattr_l ERROR: message exceeded bound" << endl;
        return;
    }
    rta = ((struct rtattr *) (((char *) (n)) + NLMSG_ALIGN((n)->nlmsg_len)));
    rta->rta_type = type;
    rta->rta_len = len;  // Unpadded attribute length; the padding is accounted for in nlmsg_len below
    if (alen)
        memcpy(RTA_DATA(rta), data, alen);
    n->nlmsg_len = NLMSG_ALIGN(n->nlmsg_len) + RTA_ALIGN(len);
}

static int nl_request(struct nlmsghdr *nlh, int type, const string& bind_dev = "", int expected_replies = 1) {
    int nl_sock = socket(AF_NETLINK, SOCK_RAW | SOCK_CLOEXEC, NETLINK_ROUTE);
    if (nl_sock < 0) {
        perror("socket");
        return -1;
    }

    // Optimization: enlarge the send/receive buffers
    int bufsize = 1048576;
    setsockopt(nl_sock, SOL_SOCKET, SO_SNDBUF, &bufsize, sizeof(bufsize));
    setsockopt(nl_sock, SOL_SOCKET, SO_RCVBUF, &bufsize, sizeof(bufsize));

    if (!bind_dev.empty()) {
        if (setsockopt(nl_sock, SOL_SOCKET, SO_BINDTODEVICE, bind_dev.c_str(), bind_dev.length() + 1) < 0) {
            perror("setsockopt SO_BINDTODEVICE");
            close(nl_sock);
            return -1;
        }
    }

    struct sockaddr_nl sa = {};
    sa.nl_family = AF_NETLINK;
    if (bind(nl_sock, (struct sockaddr *) &sa, sizeof(sa)) < 0) {
        perror("bind");
        close(nl_sock);
        return -1;
    }

    struct sockaddr_nl kern = {};
    kern.nl_family = AF_NETLINK;

    if (sendto(nl_sock, nlh, nlh->nlmsg_len, 0, (struct sockaddr *) &kern, sizeof(kern)) < 0) {
        perror("sendto");
        close(nl_sock);
        return -1;
    }

    char buf[65536];  // Enlarged receive buffer
    int ret = 0;
    int replies = 0;
    while (replies < expected_replies) {
        struct iovec iov = {buf, sizeof(buf)};
        struct msghdr msg = {(void *) &kern, sizeof(kern), &iov, 1, NULL, 0, 0};
        int len = recvmsg(nl_sock, &msg, MSG_DONTWAIT);  // Non-blocking so the loop cannot hang
        if (len < 0) {
            if (errno == EAGAIN || errno == EWOULDBLOCK) break;  // Nothing more queued; stop reading
            if (errno == EINTR) continue;
            perror("recvmsg");
            ret = -1;
            break;
        }
        if (len == 0) break;

        for (struct nlmsghdr *nh = (struct nlmsghdr *) buf; NLMSG_OK(nh, len); nh = NLMSG_NEXT(nh, len)) {
            if (nh->nlmsg_type == NLMSG_DONE) {
                goto done;
            }
            if (nh->nlmsg_type == NLMSG_ERROR) {
                struct nlmsgerr *err = (struct nlmsgerr *) NLMSG_DATA(nh);
                if (err->error == 0) {
                    // Success
                } else {
                    errno = -err->error;
                    ret = -1;
                }
                replies++;
                if (replies >= expected_replies) goto done;
            }
            if (type == RTM_GETLINK && nh->nlmsg_type == RTM_NEWLINK) {
                // For get vrf table, process here if needed
                replies++;
                if (replies >= expected_replies) goto done;
            }
        }
    }
done:
    close(nl_sock);
    return ret;
}

static int nl_batch_send(int nl_sock, char* batch_buf, size_t batch_len, int* failures, int num_requests) {
    struct sockaddr_nl kern = {};
    kern.nl_family = AF_NETLINK;

    if (sendto(nl_sock, batch_buf, batch_len, 0, (struct sockaddr *) &kern, sizeof(kern)) < 0) {
        perror("sendto");
        return -1;
    }

    char buf[65536];  // Enlarged receive buffer
    int ret = 0;
    int replies = 0;
    while (replies < num_requests) {
        struct iovec iov = {buf, sizeof(buf)};
        struct msghdr msg = {(void *) &kern, sizeof(kern), &iov, 1, NULL, 0, 0};
        int len = recvmsg(nl_sock, &msg, MSG_DONTWAIT);  // Non-blocking
        if (len < 0) {
            if (errno == EAGAIN || errno == EWOULDBLOCK) break;
            if (errno == EINTR) continue;
            perror("recvmsg");
            ret = -1;
            break;
        }
        if (len == 0) break;

        for (struct nlmsghdr *nh = (struct nlmsghdr *) buf; NLMSG_OK(nh, len); nh = NLMSG_NEXT(nh, len)) {
            if (nh->nlmsg_type == NLMSG_DONE)
                goto done;
            if (nh->nlmsg_type == NLMSG_ERROR) {
                struct nlmsgerr *err = (struct nlmsgerr *) NLMSG_DATA(nh);
                if (err->error != 0) {
                    (*failures)++;
                }
                replies++;
                if (replies >= num_requests) goto done;
            }
        }
    }
done:
    return ret;
}

static uint32_t get_vrf_table(int ifindex) {
    struct {
        struct nlmsghdr nl;
        struct ifinfomsg ifi;
        char buf[1024];
    } req = {};

    req.nl.nlmsg_len = NLMSG_LENGTH(sizeof(struct ifinfomsg));
    req.nl.nlmsg_flags = NLM_F_REQUEST;
    req.nl.nlmsg_type = RTM_GETLINK;
    req.ifi.ifi_family = AF_UNSPEC;
    req.ifi.ifi_index = ifindex;

    char buf[8192];
    int nl_sock = socket(AF_NETLINK, SOCK_RAW | SOCK_CLOEXEC, NETLINK_ROUTE);
    if (nl_sock < 0) return 0;

    struct sockaddr_nl kern = {AF_NETLINK, 0, 0};
    if (sendto(nl_sock, &req, req.nl.nlmsg_len, 0, (struct sockaddr *) &kern, sizeof(kern)) < 0) {
        close(nl_sock);
        return 0;
    }

    int replies = 0;
    int expected_replies = 1;  // Expect one NEWLINK or ERROR
    while (replies < expected_replies) {
        struct iovec iov = {buf, sizeof(buf)};
        struct msghdr msg = {(void *) &kern, sizeof(kern), &iov, 1, NULL, 0, 0};
        int len = recvmsg(nl_sock, &msg, 0);
        if (len <= 0) break;

        for (struct nlmsghdr *nh = (struct nlmsghdr *) buf; NLMSG_OK(nh, len); nh = NLMSG_NEXT(nh, len)) {
            // Treat an error reply (e.g. unknown ifindex) like DONE; otherwise the blocking recvmsg() would wait forever
            if (nh->nlmsg_type == NLMSG_DONE || nh->nlmsg_type == NLMSG_ERROR) {
                close(nl_sock);
                return 0;
            }
            if (nh->nlmsg_type == RTM_NEWLINK) {
                struct ifinfomsg *ifi = (struct ifinfomsg *) NLMSG_DATA(nh);
                if (ifi->ifi_index != ifindex) continue;

                struct rtattr *tb[IFLA_MAX + 1];
                parse_rtattr(tb, IFLA_MAX, IFLA_RTA(ifi), IFLA_PAYLOAD(nh));

                // Not a VRF device: return now, since no further reply will arrive for this request
                struct rtattr *linkinfo = tb[IFLA_LINKINFO];
                if (!linkinfo) { close(nl_sock); return 0; }

                struct rtattr *li_tb[IFLA_INFO_MAX + 1];
                parse_rtattr(li_tb, IFLA_INFO_MAX, (struct rtattr *)RTA_DATA(linkinfo), RTA_PAYLOAD(linkinfo));

                if (li_tb[IFLA_INFO_KIND] && strcmp((char *) RTA_DATA(li_tb[IFLA_INFO_KIND]), "vrf") == 0) {
                    struct rtattr *data = li_tb[IFLA_INFO_DATA];
                    if (!data) { close(nl_sock); return 0; }

                    struct rtattr *vd_tb[IFLA_VRF_MAX + 1];
                    parse_rtattr(vd_tb, IFLA_VRF_MAX, (struct rtattr *)RTA_DATA(data), RTA_PAYLOAD(data));

                    if (vd_tb[IFLA_VRF_TABLE]) {
                        uint32_t table = *(uint32_t *) RTA_DATA(vd_tb[IFLA_VRF_TABLE]);
                        close(nl_sock);
                        return table;
                    }
                }
                replies++;
                if (replies >= expected_replies) {
                    close(nl_sock);
                    return 0;
                }
            }
        }
    }
    close(nl_sock);
    return 0;
}

static string get_vrf_name(int ifindex) {
    if (ifindex == 0) return "default";

    struct {
        struct nlmsghdr nl;
        struct ifinfomsg ifi;
        char buf[1024];
    } req = {};

    req.nl.nlmsg_len = NLMSG_LENGTH(sizeof(struct ifinfomsg));
    req.nl.nlmsg_flags = NLM_F_REQUEST;
    req.nl.nlmsg_type = RTM_GETLINK;
    req.ifi.ifi_family = AF_UNSPEC;
    req.ifi.ifi_index = ifindex;

    int nl_sock = socket(AF_NETLINK, SOCK_RAW | SOCK_CLOEXEC, NETLINK_ROUTE);
    if (nl_sock < 0) return "default";

    struct sockaddr_nl kern = {AF_NETLINK, 0, 0};
    if (sendto(nl_sock, &req, req.nl.nlmsg_len, 0, (struct sockaddr *) &kern, sizeof(kern)) < 0) {
        close(nl_sock);
        return "default";
    }

    char buf[8192];
    int replies = 0;
    int expected_replies = 1;  // Expect one NEWLINK or ERROR
    while (replies < expected_replies) {
        struct iovec iov = {buf, sizeof(buf)};
        struct msghdr msg = {(void *) &kern, sizeof(kern), &iov, 1, NULL, 0, 0};
        int len = recvmsg(nl_sock, &msg, 0);
        if (len <= 0) break;

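        for (struct nlmsghdr *nh = (struct nlmsghdr *) buf; NLMSG_OK(nh, len); nh = NLMSG_NEXT(nh, len)) {
            // An error reply (e.g. unknown ifindex) ends the exchange just like DONE
            if (nh->nlmsg_type == NLMSG_DONE || nh->nlmsg_type == NLMSG_ERROR) {
                close(nl_sock);
                return "default";
            }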
            if (nh->nlmsg_type == RTM_NEWLINK) {
                struct ifinfomsg *ifi = (struct ifinfomsg *) NLMSG_DATA(nh);
                if (ifi->ifi_index != ifindex) continue;

                struct rtattr *tb[IFLA_MAX + 1];
                parse_rtattr(tb, IFLA_MAX, IFLA_RTA(ifi), IFLA_PAYLOAD(nh));

                if (!tb[IFLA_MASTER]) {
                    close(nl_sock);
                    return "default";
                }

                int master_idx = *(int *) RTA_DATA(tb[IFLA_MASTER]);

                uint32_t vrf_table = get_vrf_table(master_idx);
                if (vrf_table == 0) {
                    close(nl_sock);
                    return "default";
                }

                // Both outcomes return here, so no reply counting is needed
                char ifname[IF_NAMESIZE];
                const char *name = if_indextoname(master_idx, ifname);
                close(nl_sock);
                return name ? string(name) : string("default");
            }
        }
    }
    close(nl_sock);
    return "default";
}

static int nl_route_generic(int nl_type, int family, const void* dest_addr, int prefixlen, const void* gw_addr, int oif, uint32_t table, const string& bind_vrf, uint32_t metric = 0, const void* prefsrc_addr = nullptr, bool blackhole = false) {
    struct {
        struct nlmsghdr nl;
        struct rtmsg rt;
        char buf[1024];
    } req = {};

    req.nl.nlmsg_len = NLMSG_LENGTH(sizeof(struct rtmsg));
    req.nl.nlmsg_flags = NLM_F_REQUEST | NLM_F_ACK;
    if (nl_type == RTM_NEWROUTE)
        req.nl.nlmsg_flags |= NLM_F_CREATE | NLM_F_EXCL;
    req.nl.nlmsg_type = nl_type;

    req.rt.rtm_dst_len = prefixlen;
    if (table > 255) {
        req.rt.rtm_table = RT_TABLE_UNSPEC;
        addattr_l(&req.nl, sizeof(req), RTA_TABLE, &table, 4);
    } else {
        req.rt.rtm_table = table;
    }
    if (nl_type == RTM_NEWROUTE) {
        req.rt.rtm_protocol = RTPROT_STATIC;
    }
    req.rt.rtm_type = blackhole ? RTN_BLACKHOLE : RTN_UNICAST;
    req.rt.rtm_family = family;

    int addr_size = (family == AF_INET) ? 4 : 16;

    if (prefixlen > 0) {
        addattr_l(&req.nl, sizeof(req), RTA_DST, dest_addr, addr_size);
    }
    if (!blackhole) {
        if (gw_addr) {
            addattr_l(&req.nl, sizeof(req), RTA_GATEWAY, gw_addr, addr_size);
        }
        if (oif)
            addattr_l(&req.nl, sizeof(req), RTA_OIF, &oif, sizeof(oif));
    }
    if (metric > 0)
        addattr_l(&req.nl, sizeof(req), RTA_PRIORITY, &metric, 4);
    if (prefsrc_addr)
        addattr_l(&req.nl, sizeof(req), RTA_PREFSRC, prefsrc_addr, addr_size);

    if (blackhole || gw_addr)
        req.rt.rtm_scope = RT_SCOPE_UNIVERSE;
    else
        req.rt.rtm_scope = RT_SCOPE_LINK;

    return nl_request(&req.nl, 0, bind_vrf);
}

static void usage() {
    cout << "Usage:" << endl;
    cout << "  router add | del [ipv4|ipv6] <network> <mask> [<gateway>] [dev <device>|null] [table <table>] [vrf <vrf|default>] [nexthop-vrf <nexthop_vrf|default>] [metric <metric>] [prefsrc <ip>]" << endl;
    cout << "    - network: IP or network address" << endl;
    cout << "    - mask: Subnet mask (example: 255.255.255.0 or 24 for IPv4, 64 for IPv6)" << endl;
    cout << "    - gateway: Next hop IP (optional if device is specified)" << endl;
    cout << "    - dev: Optional interface or 'null' for blackhole" << endl;
    cout << "    - table: Optional routing table (default: main)" << endl;
    cout << "    - vrf: Optional target VRF name (default: default)" << endl;
    cout << "    - nexthop-vrf: Optional VRF for nexthop resolution (for route leaking, default: default)" << endl;
    cout << "    - metric: Optional metric value" << endl;
    cout << "    - prefsrc: Optional preferred source IP" << endl;
    cout << " " << endl;
    cout << "  router add | del [ipv4|ipv6] file <file> [<gateway>] [dev <device>|null] [proce <num>] [table <table>] [vrf <vrf|default>] [nexthop-vrf <nexthop_vrf|default>] [metric <metric>] [prefsrc <ip>]" << endl;
    cout << "    - file: Path to file with CIDR lines (example: router add ipv6 file cn.txt 2001:db8::1)" << endl;
    cout << "    - proce: Number of processes (default: 1)" << endl;
    cout << "    - prefsrc: Optional preferred source IP for all routes" << endl;
    cout << " " << endl;
    cout << "  router show [-4|-6] [-p N|all] [-i PATTERN] [-t TABLE] [-v NEXTVRF_NAME] [-o PROTOCOL] [neighbor]" << endl;
    cout << "    -4: Show IPv4 routes" << endl;
    cout << "    -6: Show IPv6 routes" << endl;
    cout << "    -p: Page size or all, default size 30" << endl;
    cout << "    -i: Filter by keyword" << endl;
    cout << "    -t: Filter by table name or table id" << endl;
    cout << "    -v: Filter by nexthop-vrf name" << endl;
    cout << "    -o: Filter by protocol" << endl;
    cout << "    neighbor: Show IPv6 neighbors (only with -6)" << endl;
    cout << " " << endl;
    cout << "  router -h: Show this help" << endl;
    cout << " " << endl;
}

// Routel integrated functions start here

const string C_RESET = "\033[0m";
const string C_YELLOW = "\033[33m";
const string C_MAGENTA = "\033[35m";
const string C_WHITE = "\033[37m";
const string C_RED = "\033[31m";
const string C_BRIGHT_CYAN = "\033[96;1m";
const string C_BRIGHT_YELLOW = "\033[93;1m";
const string C_BRIGHT_MAGENTA = "\033[95;1m";
const string C_BRIGHT_RED = "\033[91;1m";
const string C_BRIGHT_BLUE = "\033[94;1m";
const string C_BRIGHT_WHITE = "\033[97;1m";
const string C_BRIGHT_GREEN = "\033[92;1m";  // Added: bright green

string g_dashline;
string g_family_str;
size_t g_total;
size_t g_inner_width;
vector<size_t> g_colw;
vector<string> g_color_seq;
char g_pattern_lower[1024]; // For fast case-insensitive search
vector<vector<string>> g_display_rows; // Store parsed display rows

// Map from route protocol number to name; the entries are hard-coded
unordered_map<uint8_t, string> g_proto_map;

// Table ID to VRF name cache
unordered_map<uint32_t, string> g_table_to_vrf_cache;
bool g_vrf_cache_initialized = false;

// Function to initialize table to VRF cache by dumping all links
static void init_table_to_vrf_cache() {
    if (g_vrf_cache_initialized) return;
    g_vrf_cache_initialized = true;

    int nl_sock = socket(AF_NETLINK, SOCK_RAW | SOCK_CLOEXEC, NETLINK_ROUTE);
    if (nl_sock < 0) return;

    struct {
        struct nlmsghdr nl;
        struct ifinfomsg ifi;
    } req = {};
    req.nl.nlmsg_len = NLMSG_LENGTH(sizeof(struct ifinfomsg));
    req.nl.nlmsg_flags = NLM_F_REQUEST | NLM_F_DUMP;
    req.nl.nlmsg_type = RTM_GETLINK;
    req.ifi.ifi_family = AF_UNSPEC;

    struct sockaddr_nl kern = {AF_NETLINK, 0, 0};
    if (sendto(nl_sock, &req, req.nl.nlmsg_len, 0, (struct sockaddr *) &kern, sizeof(kern)) < 0) {
        close(nl_sock);
        return;
    }

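    char buf[65536];  // Enlarged receive buffer
    int len;
    while ((len = recv(nl_sock, buf, sizeof(buf), 0)) > 0) {
        for (struct nlmsghdr *nh = (struct nlmsghdr *)buf; NLMSG_OK(nh, len); nh = NLMSG_NEXT(nh, len)) {
            // Stop on DONE, and also on an error reply so the blocking recv() cannot wait forever
            if (nh->nlmsg_type == NLMSG_DONE || nh->nlmsg_type == NLMSG_ERROR) goto done;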
            if (nh->nlmsg_type != RTM_NEWLINK) continue;

            struct ifinfomsg *ifi = (struct ifinfomsg *) NLMSG_DATA(nh);
            struct rtattr *tb[IFLA_MAX + 1];
            parse_rtattr(tb, IFLA_MAX, IFLA_RTA(ifi), IFLA_PAYLOAD(nh));

            struct rtattr *linkinfo = tb[IFLA_LINKINFO];
            if (!linkinfo) continue;

            struct rtattr *li_tb[IFLA_INFO_MAX + 1];
            parse_rtattr(li_tb, IFLA_INFO_MAX, (struct rtattr *)RTA_DATA(linkinfo), RTA_PAYLOAD(linkinfo));

            if (li_tb[IFLA_INFO_KIND] && strcmp((char *) RTA_DATA(li_tb[IFLA_INFO_KIND]), "vrf") == 0) {
                struct rtattr *data = li_tb[IFLA_INFO_DATA];
                if (!data) continue;

                struct rtattr *vd_tb[IFLA_VRF_MAX + 1];
                parse_rtattr(vd_tb, IFLA_VRF_MAX, (struct rtattr *)RTA_DATA(data), RTA_PAYLOAD(data));

                if (vd_tb[IFLA_VRF_TABLE]) {
                    uint32_t table = *(uint32_t *) RTA_DATA(vd_tb[IFLA_VRF_TABLE]);
                    if (tb[IFLA_IFNAME]) {
                        string vrf_name = (char *) RTA_DATA(tb[IFLA_IFNAME]);
                        g_table_to_vrf_cache[table] = vrf_name;
                    }
                }
            }
        }
    }
done:
    close(nl_sock);
}

// Function to get VRF name by table ID
static string get_vrf_name_by_table(uint32_t table_id) {
    init_table_to_vrf_cache();
    auto it = g_table_to_vrf_cache.find(table_id);
    if (it != g_table_to_vrf_cache.end()) {
        return it->second;
    }
    return "";
}

// Parse: Use stringstream for efficient split, minimal allocations
vector<string> fast_split(const string& line_str) {
    vector<string> parts;
    parts.reserve(10); // Estimate tokens per line
    stringstream ss(line_str);
    string token;
    while (ss >> token) {
        parts.push_back(std::move(token));
    }
    return parts;
}

void update_colw(const vector<string>& row, vector<size_t>& colw) {
    for (size_t j = 0; j < colw.size(); ++j) {
        size_t len = row[j].length();
        if (len > colw[j]) colw[j] = len;
    }
}

void sigint_handler(int sig) {
    fprintf(stdout, "%s%s%s\n", C_BRIGHT_WHITE.c_str(), g_dashline.c_str(), C_RESET.c_str());
    fprintf(stdout, "Terminated by user.\n");
    exit(0);
}

bool wait_space() {
    fprintf(stdout, "......Press <space> for next page, Ctrl+C to quit......");
    fflush(stdout);
    int fd = STDIN_FILENO;
    struct termios oldt, newt;
    if (tcgetattr(fd, &oldt) != 0) return false;
    newt = oldt;
    cfmakeraw(&newt);
    tcsetattr(fd, TCSADRAIN, &newt);
    fd_set rfds;
    FD_ZERO(&rfds);
    FD_SET(fd, &rfds);
    int ret = select(fd + 1, &rfds, NULL, NULL, NULL);
    bool is_space = false;
    if (ret > 0 && FD_ISSET(fd, &rfds)) {
        char ch;
        if (read(fd, &ch, 1) == 1) {
            is_space = (ch == ' ');
        }
    }
    fprintf(stdout, "\r%*s\r", 60, " ");
    fflush(stdout);
    tcsetattr(fd, TCSADRAIN, &oldt);
    return is_space;
}

// Optimized fast case-insensitive search: assumes line is already lowercased
bool fast_match(const char* line_lower, const char* pat_lower) {
    if (!pat_lower || !*pat_lower) return true;
    return strstr(line_lower, pat_lower) != NULL;
}

// Function to get table ID from string, now supports VRF names
uint32_t get_table_id(const string& table_str) {
    if (table_str == "all") return RT_TABLE_UNSPEC;
    if (table_str == "default") return RT_TABLE_DEFAULT;
    if (table_str == "main") return RT_TABLE_MAIN;
    if (table_str == "local") return RT_TABLE_LOCAL;
    if (table_str == "unspec") return RT_TABLE_UNSPEC;
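    // Accept the string as a numeric table ID only if every character was consumed,
    // so a VRF name that begins with digits is not misparsed as a table number
    try {
        size_t pos = 0;
        unsigned long id = stoul(table_str, &pos);
        if (pos == table_str.size()) return static_cast<uint32_t>(id);
    } catch (...) {
        // Not a number; fall through to the VRF lookup
    }
    // Assume it's a VRF name
    int vrf_idx = if_nametoindex(table_str.c_str());
    if (vrf_idx == 0) {
        cerr << "Invalid table or VRF name: " << table_str << endl;
        exit(1);
    }
    uint32_t vrf_table = get_vrf_table(vrf_idx);
    if (vrf_table == 0) {
        cerr << "Invalid table or VRF name: " << table_str << endl;
        exit(1);
    }
    return vrf_table;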
}

// Function to convert protocol to string (looked up in g_proto_map)
string proto_to_str(uint8_t p) {
    auto it = g_proto_map.find(p);
    if (it != g_proto_map.end()) {
        return it->second;
    }
    char buf[32];
    snprintf(buf, sizeof(buf), "%hhu", p);
    return buf;
}

// Function to convert scope to string
string scope_to_str(uint8_t s) {
    switch (s) {
        case RT_SCOPE_UNIVERSE: return "global";
        case RT_SCOPE_SITE: return "site";
        case RT_SCOPE_LINK: return "link";
        case RT_SCOPE_HOST: return "host";
        case RT_SCOPE_NOWHERE: return "nowhere";
        default: {
            char buf[32];
            snprintf(buf, sizeof(buf), "%hhu", s);
            return buf;
        }
    }
}

// Function to convert table to string, now shows VRF name if applicable
string table_to_str(uint32_t t) {
    switch (t) {
        case RT_TABLE_UNSPEC: return "unspec";
        case RT_TABLE_DEFAULT: return "default";
        case RT_TABLE_MAIN: return "main";
        case RT_TABLE_LOCAL: return "local";
        default: {
            string vrf_name = get_vrf_name_by_table(t);
            if (!vrf_name.empty()) {
                return vrf_name;
            }
            char buf[32];
            snprintf(buf, sizeof(buf), "%u", t);
            return buf;
        }
    }
}

// Build approximate line for filtering (similar to ip route output) using char buffer for efficiency
// Optimized: build lowercased line directly for faster matching
void build_line_for_filter(const vector<string>& row, const string& prefix, char* line, size_t max_len, int af_family) {
    if (max_len == 0) return;
    char* ptr = line;
    size_t rem = max_len - 1; // Leave space for null terminator

    // Append a lowercased copy of s followed by one space, truncating
    // instead of overrunning the buffer when the line is too long
    auto append_lower = [&](const string& s) {
        for (size_t k = 0; k < s.size() && rem > 0; ++k, --rem) {
            *ptr++ = static_cast<char>(tolower(static_cast<unsigned char>(s[k])));
        }
        if (rem > 0) {
            *ptr++ = ' ';
            --rem;
        }
    };

    if (!prefix.empty()) append_lower(prefix);

    // Special handling for default route to include 0.0.0.0/0 or ::/0 for matching
    if (row[0] == "default") {
        append_lower((af_family == AF_INET) ? "default 0.0.0.0/0" : "default ::/0");
    } else {
        append_lower(row[0]);
    }

    for (size_t j = 1; j < row.size(); ++j) {  // Start from 1 since 0 already handled
        if (row[j] != "/") append_lower(row[j]);
    }
    *ptr = '\0';
}

// Read routes directly from kernel via Netlink
void read_routes(int af_family, uint32_t table_id, bool do_filter, const char* pat_lower, const string& vrf_filter, uint8_t filter_proto) {
    int nl_sock = socket(AF_NETLINK, SOCK_RAW, NETLINK_ROUTE);
    if (nl_sock < 0) {
        perror("socket failed");
        return;
    }

    // Optimization: enlarge the socket receive buffer
    int bufsize = 1048576;
    setsockopt(nl_sock, SOL_SOCKET, SO_RCVBUF, &bufsize, sizeof(bufsize));

    struct sockaddr_nl sa = {};
    sa.nl_family = AF_NETLINK;
    if (bind(nl_sock, (struct sockaddr*)&sa, sizeof(sa)) < 0) {
        perror("bind failed");
        close(nl_sock);
        return;
    }

    struct {
        struct nlmsghdr nl;
        struct rtmsg rt;
        char buf[1024]; // Extra space for attributes
    } req = {};
    req.nl.nlmsg_len = NLMSG_LENGTH(sizeof(struct rtmsg));
    req.nl.nlmsg_flags = NLM_F_REQUEST | NLM_F_DUMP;
    req.nl.nlmsg_type = RTM_GETROUTE;
    req.rt.rtm_family = af_family;

    // Set table properly
    if (table_id == RT_TABLE_UNSPEC) {
        req.rt.rtm_table = RT_TABLE_UNSPEC;
    } else {
        req.rt.rtm_table = (table_id <= 255 ? (uint8_t)table_id : RT_TABLE_COMPAT);
        if (table_id > 255) {
            struct rtattr *rta = (struct rtattr *) ((char *) &req + NLMSG_ALIGN(req.nl.nlmsg_len));
            rta->rta_type = RTA_TABLE;
            rta->rta_len = RTA_LENGTH(sizeof(uint32_t));
            *(uint32_t *) RTA_DATA(rta) = table_id;
            req.nl.nlmsg_len = NLMSG_ALIGN(req.nl.nlmsg_len) + RTA_ALIGN(rta->rta_len);
        }
    }

    if (send(nl_sock, &req, req.nl.nlmsg_len, 0) < 0) {
        perror("send failed");
        close(nl_sock);
        return;
    }

    char buf[65536]; // Larger buffer for fewer recv calls
    struct sockaddr_nl sa_rx = {};
    struct iovec iov = {buf, sizeof(buf)};
    struct msghdr msg = {&sa_rx, sizeof(sa_rx), &iov, 1, NULL, 0, 0};
    ssize_t len;

    unordered_map<uint32_t, string> if_cache; // Cache for if_indextoname
    unordered_map<uint32_t, string> vrf_cache; // Cache for VRF names

    while ((len = recvmsg(nl_sock, &msg, MSG_DONTWAIT)) > 0 || (len < 0 && errno == EINTR)) {
        if (len < 0 && errno == EINTR) continue;
        if (len < 0) break;  // Stop on EAGAIN or any other error to avoid hanging
        for (struct nlmsghdr *nh = (struct nlmsghdr*)buf; NLMSG_OK(nh, len); nh = NLMSG_NEXT(nh, len)) {
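            if (nh->nlmsg_type == NLMSG_DONE) {
                close(nl_sock);
                return;
            }
            if (nh->nlmsg_type == NLMSG_ERROR) {
                // Kernel rejected the dump request; report it instead of silently showing nothing
                struct nlmsgerr *err = (struct nlmsgerr*)NLMSG_DATA(nh);
                fprintf(stderr, "netlink error: %s\n", strerror(-err->error));
                close(nl_sock);
                return;
            }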
            if (nh->nlmsg_type != RTM_NEWROUTE) continue;

            struct rtmsg *rt = (struct rtmsg*)NLMSG_DATA(nh);
            if (rt->rtm_family != af_family) continue;

            // Filter by protocol if specified
            if (filter_proto != 0 && rt->rtm_protocol != filter_proto) continue;

            struct rtattr *rta[RTA_MAX + 1] = {};
            int attrlen = nh->nlmsg_len - NLMSG_LENGTH(sizeof(*rt));
            for (struct rtattr *tb = RTM_RTA(rt); RTA_OK(tb, attrlen); tb = RTA_NEXT(tb, attrlen)) {
                if (tb->rta_type <= RTA_MAX) rta[tb->rta_type] = tb;
            }

            // Get actual table
            uint32_t actual_table = rt->rtm_table;
            if (rta[RTA_TABLE]) {
                actual_table = *(uint32_t*)RTA_DATA(rta[RTA_TABLE]);
            }

            if (table_id != RT_TABLE_UNSPEC && actual_table != table_id) continue;

            vector<string> row(9, "/");
            string prefix;
            if (do_filter) {
                switch (rt->rtm_type) {
                    case RTN_LOCAL: prefix = "local"; break;
                    case RTN_ANYCAST: prefix = "anycast"; break;
                    case RTN_MULTICAST: prefix = "multicast"; break;
                    case RTN_BROADCAST: prefix = "broadcast"; break;
                    case RTN_BLACKHOLE: prefix = "blackhole"; break;  // Added for blackhole filter
                }
            }

            // Dst
            if (rt->rtm_dst_len == 0) {
                row[0] = "default";
            } else if (rta[RTA_DST]) {
                char addr_str[INET6_ADDRSTRLEN];
                if (inet_ntop(rt->rtm_family, RTA_DATA(rta[RTA_DST]), addr_str, sizeof(addr_str))) {
                    char dst_buf[INET6_ADDRSTRLEN + 12];
                    size_t addr_len = strlen(addr_str);
                    memcpy(dst_buf, addr_str, addr_len);
                    snprintf(dst_buf + addr_len, sizeof(dst_buf) - addr_len, "/%hhu", rt->rtm_dst_len);
                    row[0] = dst_buf;
                }
            }

            // Gateway
            if (rta[RTA_GATEWAY]) {
                char gw_str[INET6_ADDRSTRLEN];
                if (inet_ntop(rt->rtm_family, RTA_DATA(rta[RTA_GATEWAY]), gw_str, sizeof(gw_str))) {
                    row[1] = gw_str;
                }
            }

            // Prefsrc
            if (rta[RTA_PREFSRC]) {
                char ps_str[INET6_ADDRSTRLEN];
                if (inet_ntop(rt->rtm_family, RTA_DATA(rta[RTA_PREFSRC]), ps_str, sizeof(ps_str))) {
                    row[2] = ps_str;
                }
            }

            // Protocol
            row[3] = proto_to_str(rt->rtm_protocol);

            // Scope
            row[4] = scope_to_str(rt->rtm_scope);

            // Metric
            if (rta[RTA_PRIORITY]) {
                uint32_t prio = *(uint32_t*)RTA_DATA(rta[RTA_PRIORITY]);
                char buf[32];
                snprintf(buf, sizeof(buf), "%u", prio);
                row[5] = buf;
            } else {
                row[5] = "0";
            }

            // Dev: check for the blackhole type first and ignore OIF
            if (rt->rtm_type == RTN_BLACKHOLE) {
                row[6] = "blackhole";
            } else if (rta[RTA_OIF]) {
                uint32_t ifidx = *(uint32_t*)RTA_DATA(rta[RTA_OIF]);
                auto it = if_cache.find(ifidx);
                if (it != if_cache.end()) {
                    row[6] = it->second;
                } else {
                    char ifname[IF_NAMESIZE];
                    if (if_indextoname(ifidx, ifname)) {
                        row[6] = ifname;
                        if_cache[ifidx] = ifname;
                    }
                }
            }

            // Table
            row[7] = table_to_str(actual_table);

            // VRF
            string vrf_str = "/";
            if (rta[RTA_OIF]) {
                uint32_t oif = *(uint32_t*)RTA_DATA(rta[RTA_OIF]);
                auto vit = vrf_cache.find(oif);
                if (vit != vrf_cache.end()) {
                    vrf_str = vit->second;
                } else {
                    string name = get_vrf_name(oif);
                    vrf_str = (name == "default") ? "default" : name;
                    vrf_cache[oif] = vrf_str;
                }
            }
            row[8] = vrf_str;

            bool matches = true;
            if (do_filter) {
                char line[512];
                build_line_for_filter(row, prefix, line, sizeof(line), af_family);
                matches = fast_match(line, pat_lower);
            }

            if (!vrf_filter.empty()) {
                // Case-insensitive compare on std::string copies (no fixed-size buffers to overflow)
                string row8_lower = row[8];
                for (char& c : row8_lower) c = static_cast<char>(tolower(static_cast<unsigned char>(c)));
                string vrf_lower = vrf_filter;
                for (char& c : vrf_lower) c = static_cast<char>(tolower(static_cast<unsigned char>(c)));
                if (row8_lower != vrf_lower) matches = false;
            }

            if (matches) {
                g_total++;
                update_colw(row, g_colw);
                g_display_rows.push_back(std::move(row));
            }
        }
    }
    close(nl_sock);
}

// New: read IPv6 neighbor entries
void read_neighbors(bool do_filter, const char* pat_lower) {
    int nl_sock = socket(AF_NETLINK, SOCK_RAW, NETLINK_ROUTE);
    if (nl_sock < 0) {
        perror("socket failed");
        return;
    }

    // Enlarge the socket receive buffer
    int bufsize = 1048576;
    setsockopt(nl_sock, SOL_SOCKET, SO_RCVBUF, &bufsize, sizeof(bufsize));

    struct sockaddr_nl sa = {};
    sa.nl_family = AF_NETLINK;
    if (bind(nl_sock, (struct sockaddr*)&sa, sizeof(sa)) < 0) {
        perror("bind failed");
        close(nl_sock);
        return;
    }

    struct {
        struct nlmsghdr nl;
        struct ndmsg nd;
    } req = {};
    req.nl.nlmsg_len = NLMSG_LENGTH(sizeof(struct ndmsg));
    req.nl.nlmsg_flags = NLM_F_REQUEST | NLM_F_DUMP;
    req.nl.nlmsg_type = RTM_GETNEIGH;
    req.nd.ndm_family = AF_INET6;

    if (send(nl_sock, &req, req.nl.nlmsg_len, 0) < 0) {
        perror("send failed");
        close(nl_sock);
        return;
    }

    char buf[65536];
    struct sockaddr_nl sa_rx = {};
    struct iovec iov = {buf, sizeof(buf)};
    struct msghdr msg = {&sa_rx, sizeof(sa_rx), &iov, 1, NULL, 0, 0};
    ssize_t len;

    unordered_map<uint32_t, string> if_cache;

    while ((len = recvmsg(nl_sock, &msg, MSG_DONTWAIT)) > 0 || (len < 0 && errno == EINTR)) {
        if (len < 0 && errno == EINTR) continue;
        if (len < 0) break;
        for (struct nlmsghdr *nh = (struct nlmsghdr*)buf; NLMSG_OK(nh, len); nh = NLMSG_NEXT(nh, len)) {
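            if (nh->nlmsg_type == NLMSG_DONE) {
                close(nl_sock);
                return;
            }
            if (nh->nlmsg_type == NLMSG_ERROR) {
                // Kernel rejected the dump request; report it instead of silently showing nothing
                struct nlmsgerr *err = (struct nlmsgerr*)NLMSG_DATA(nh);
                fprintf(stderr, "netlink error: %s\n", strerror(-err->error));
                close(nl_sock);
                return;
            }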
            if (nh->nlmsg_type != RTM_NEWNEIGH) continue;

            struct ndmsg *nd = (struct ndmsg*)NLMSG_DATA(nh);
            if (nd->ndm_family != AF_INET6) continue;

            // Skip NUD_NONE (state == 0)
            if (nd->ndm_state == 0) continue;

            struct rtattr *rta[NDA_MAX + 1] = {};
            int attrlen = nh->nlmsg_len - NLMSG_LENGTH(sizeof(*nd));
            for (struct rtattr *tb = (struct rtattr *) (((char *) (nd)) + NLMSG_ALIGN(sizeof(struct ndmsg))); RTA_OK(tb, attrlen); tb = RTA_NEXT(tb, attrlen)) {
                if (tb->rta_type <= NDA_MAX) rta[tb->rta_type] = tb;
            }

            // Skip multicast addresses (ff00::/8) and entries in the NOARP state
            bool skip = false;
            if (rta[NDA_DST]) {
                uint8_t* addr_bytes = (uint8_t*)RTA_DATA(rta[NDA_DST]);
                if (addr_bytes[0] == 0xff) {  // IPv6 multicast addresses start with ff
                    skip = true;
                }
            }
            if (nd->ndm_state == NUD_NOARP) {
                skip = true;
            }
            if (skip) continue;

            vector<string> row(5, "/");

            // Destination
            if (rta[NDA_DST]) {
                char addr_str[INET6_ADDRSTRLEN];
                if (inet_ntop(AF_INET6, RTA_DATA(rta[NDA_DST]), addr_str, sizeof(addr_str))) {
                    row[0] = addr_str;
                }
            }

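            // LLaddr (formatted only when the attribute carries at least a 6-byte MAC)
            if (rta[NDA_LLADDR] && RTA_PAYLOAD(rta[NDA_LLADDR]) >= 6) {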
                uint8_t *mac = (uint8_t*)RTA_DATA(rta[NDA_LLADDR]);
                char mac_str[18];
                snprintf(mac_str, sizeof(mac_str), "%02x:%02x:%02x:%02x:%02x:%02x",
                         mac[0], mac[1], mac[2], mac[3], mac[4], mac[5]);
                row[1] = mac_str;
            }

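            // Dev (uses if_cache to avoid repeated if_indextoname lookups)
            uint32_t nd_ifidx = static_cast<uint32_t>(nd->ndm_ifindex);
            auto if_it = if_cache.find(nd_ifidx);
            if (if_it != if_cache.end()) {
                row[2] = if_it->second;
            } else {
                char ifname[IF_NAMESIZE];
                if (if_indextoname(nd_ifidx, ifname)) {
                    row[2] = ifname;
                    if_cache[nd_ifidx] = ifname;
                }
            }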

            // State
            string state;
            switch (nd->ndm_state) {
                case NUD_INCOMPLETE: state = "INCOMPLETE"; break;
                case NUD_REACHABLE: state = "REACHABLE"; break;
                case NUD_STALE: state = "STALE"; break;
                case NUD_DELAY: state = "DELAY"; break;
                case NUD_PROBE: state = "PROBE"; break;
                case NUD_FAILED: state = "FAILED"; break;
                case NUD_NOARP: state = "NOARP"; break;
                case NUD_PERMANENT: state = "PERMANENT"; break;
                default: state = "UNKNOWN"; break;
            }
            row[3] = state;

            // Probe (if NDA_PROBES is present)
            if (rta[NDA_PROBES]) {
                uint32_t probes = *(uint32_t*)RTA_DATA(rta[NDA_PROBES]);
                char buf[32];
                snprintf(buf, sizeof(buf), "%u", probes);
                row[4] = buf;
            }

            bool matches = true;
            if (do_filter) {
                char line[512];
                build_line_for_filter(row, "", line, sizeof(line), AF_INET6);  // For neighbors, always IPv6
                matches = fast_match(line, pat_lower);
            }

            if (matches) {
                g_total++;
                update_colw(row, g_colw);
                g_display_rows.push_back(std::move(row));
            }
        }
    }
    close(nl_sock);
}

// NEW: Inline row formatting directly into a pointer (no per-row buf, for batch efficiency)
void append_row_to_buf(const vector<string>& row, char*& buf_ptr, size_t& remaining) {
    // "| "
    // Check the full size of the colored border, not just the two visible characters
    const size_t head_len = C_BRIGHT_WHITE.length() + C_RESET.length() + 2;
    if (remaining >= head_len) {
        memcpy(buf_ptr, C_BRIGHT_WHITE.c_str(), C_BRIGHT_WHITE.length());
        buf_ptr += C_BRIGHT_WHITE.length();
        *buf_ptr++ = '|';
        memcpy(buf_ptr, C_RESET.c_str(), C_RESET.length());
        buf_ptr += C_RESET.length();
        *buf_ptr++ = ' ';
        remaining -= head_len;
    }
    size_t num_cols = row.size();
    for (size_t j = 0; j < num_cols; ++j) {
        if (j > 0) {
            if (remaining >= 2) {
                memcpy(buf_ptr, "  ", 2);
                buf_ptr += 2;
                remaining -= 2;
            }
        }
        const string& field = row[j];
        int field_w = static_cast<int>(g_colw[j]);
        if (field != "/") {
            // Color + field + pad + reset (snprintf for pad calc)
            int pad = field_w - static_cast<int>(field.length());
            size_t color_len = g_color_seq[j % g_color_seq.size()].length();
            size_t reset_len = C_RESET.length();
            size_t est = color_len + field.length() + pad + reset_len;
            if (remaining >= est) {
                // Direct memcpy for fixed parts + snprintf only for pad
                memcpy(buf_ptr, g_color_seq[j % g_color_seq.size()].c_str(), color_len);
                buf_ptr += color_len;
                remaining -= color_len;
                memcpy(buf_ptr, field.c_str(), field.length());
                buf_ptr += field.length();
                remaining -= field.length();
                if (pad > 0 && remaining >= static_cast<size_t>(pad)) {
                    memset(buf_ptr, ' ', pad);  // Faster than snprintf for spaces
                    buf_ptr += pad;
                    remaining -= pad;
                }
                memcpy(buf_ptr, C_RESET.c_str(), reset_len);
                buf_ptr += reset_len;
                remaining -= reset_len;
            }
        } else {
            if (remaining >= static_cast<size_t>(field_w)) {
                memcpy(buf_ptr, field.c_str(), field.length());
                buf_ptr += field.length();
                remaining -= field.length();
                int pad = field_w - static_cast<int>(field.length());
                if (pad > 0 && remaining >= static_cast<size_t>(pad)) {
                    memset(buf_ptr, ' ', pad);
                    buf_ptr += pad;
                    remaining -= pad;
                }
            }
        }
    }
    // " |\n"
    // Check the full size of the colored border, not just the three visible characters
    const size_t tail_len = C_BRIGHT_WHITE.length() + C_RESET.length() + 3;
    if (remaining >= tail_len) {
        *buf_ptr++ = ' ';
        memcpy(buf_ptr, C_BRIGHT_WHITE.c_str(), C_BRIGHT_WHITE.length());
        buf_ptr += C_BRIGHT_WHITE.length();
        *buf_ptr++ = '|';
        memcpy(buf_ptr, C_RESET.c_str(), C_RESET.length());
        buf_ptr += C_RESET.length();
        *buf_ptr++ = '\n';
        remaining -= tail_len;
    }
}

// NEW: Batch print: Inline append rows to large buf, then single write(2)
void batch_print_rows(const vector<vector<string>>& rows, size_t start, size_t end, int out_fd) {
    char page_buf[131072];  // Increased for more routes
    char* page_ptr = page_buf;
    size_t page_remaining = sizeof(page_buf);
    for (size_t j = start; j < end; ++j) {
        append_row_to_buf(rows[j], page_ptr, page_remaining);
        if (page_remaining < 2048) {  // Flush if low (~10 rows left)
            size_t written = sizeof(page_buf) - page_remaining;
            if (written > 0) {
                ssize_t res = write(out_fd, page_buf, written);
                (void)res;  // FIXED: Capture return value and ignore to fully suppress warning
            }
            page_ptr = page_buf;
            page_remaining = sizeof(page_buf);
        }
    }
    // Final flush
    size_t final_written = sizeof(page_buf) - page_remaining;
    if (final_written > 0) {
        ssize_t res = write(out_fd, page_buf, final_written);  // Same descriptor as the in-loop flush
        (void)res;  // FIXED: Capture return value and ignore to fully suppress warning
    }
}

void print_header(FILE* out, const vector<string>& keys) {
    char header_buf[2048];
    char* hptr = header_buf;
    size_t hrem = sizeof(header_buf) - 1;

    // FIXED: Match data rows: "| " (add space after |)
    const size_t head_len = C_BRIGHT_WHITE.length() + C_RESET.length() + 2;
    if (hrem >= head_len) {
        memcpy(hptr, C_BRIGHT_WHITE.c_str(), C_BRIGHT_WHITE.length());
        hptr += C_BRIGHT_WHITE.length();
        *hptr++ = '|';
        memcpy(hptr, C_RESET.c_str(), C_RESET.length());
        hptr += C_RESET.length();
        *hptr++ = ' ';
        hrem -= head_len;
    }
    size_t blue_len = C_BRIGHT_CYAN.length();
    if (hrem >= blue_len) {
        memcpy(hptr, C_BRIGHT_CYAN.c_str(), blue_len);
        hptr += blue_len;
        hrem -= blue_len;
    }

    size_t num_keys = keys.size();
    for (size_t i = 0; i < num_keys; ++i) {
        if (i > 0) {
            if (hrem >= 2) {
                memcpy(hptr, "  ", 2);
                hptr += 2;
                hrem -= 2;
            }
        }
        int kw = static_cast<int>(g_colw[i]);
        size_t klen = keys[i].length();
        if (hrem >= klen) {
            memcpy(hptr, keys[i].c_str(), klen);
            hptr += klen;
            hrem -= klen;
        }
        int pad = kw - static_cast<int>(klen);
        if (pad > 0 && hrem >= static_cast<size_t>(pad)) {
            memset(hptr, ' ', pad);
            hptr += pad;
            hrem -= pad;
        }
    }

    // FIXED: Match data rows: " |\n" (add space before |) + reset color before suffix
    size_t reset_len = C_RESET.length();
    if (hrem >= reset_len) {
        memcpy(hptr, C_RESET.c_str(), reset_len);
        hptr += reset_len;
        hrem -= reset_len;
    }
    const size_t tail_len = C_BRIGHT_WHITE.length() + C_RESET.length() + 3;
    if (hrem >= tail_len) {
        memcpy(hptr, C_BRIGHT_WHITE.c_str(), C_BRIGHT_WHITE.length());
        hptr += C_BRIGHT_WHITE.length();
        *hptr++ = ' ';
        *hptr++ = '|';
        memcpy(hptr, C_RESET.c_str(), C_RESET.length());
        hptr += C_RESET.length();
        *hptr++ = '\n';
        hrem -= tail_len;
    }
    *hptr = '\0';
    fputs(header_buf, out);
    fflush(out);
}

void print_total(FILE* out) {
    char total_buf[2048];
    char text[256];
    snprintf(text, sizeof(text), "%s Entries: %zu", g_family_str.c_str(), g_total);
    size_t text_len = strlen(text);
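    // Guard against size_t underflow when the text is wider than the table
    size_t pad_total = (g_inner_width > text_len) ? g_inner_width - text_len : 0;
    size_t pad_left = pad_total / 2;
    size_t pad_right = pad_total - pad_left;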
    char* tptr = total_buf;
    size_t trem = sizeof(total_buf) - 1;

    const size_t head_len = C_BRIGHT_WHITE.length() + C_RESET.length() + 1;
    if (trem >= head_len) {
        memcpy(tptr, C_BRIGHT_WHITE.c_str(), C_BRIGHT_WHITE.length());
        tptr += C_BRIGHT_WHITE.length();
        *tptr++ = '|';
        memcpy(tptr, C_RESET.c_str(), C_RESET.length());
        tptr += C_RESET.length();
        trem -= head_len;
    }
    size_t blue_len = C_BRIGHT_CYAN.length();
    if (trem >= blue_len) {
        memcpy(tptr, C_BRIGHT_CYAN.c_str(), blue_len);
        tptr += blue_len;
        trem -= blue_len;
    }

    if (trem >= pad_left + text_len + pad_right) {
        memset(tptr, ' ', pad_left);
        tptr += pad_left;
        trem -= pad_left;
        memcpy(tptr, text, text_len);
        tptr += text_len;
        trem -= text_len;
        memset(tptr, ' ', pad_right);
        tptr += pad_right;
        trem -= pad_right;
    }

    size_t reset_len = C_RESET.length();
    if (trem >= reset_len) {
        memcpy(tptr, C_RESET.c_str(), reset_len);
        tptr += reset_len;
        trem -= reset_len;
    }
    const size_t tail_len = C_BRIGHT_WHITE.length() + C_RESET.length() + 2;
    if (trem >= tail_len) {
        memcpy(tptr, C_BRIGHT_WHITE.c_str(), C_BRIGHT_WHITE.length());
        tptr += C_BRIGHT_WHITE.length();
        *tptr++ = '|';
        memcpy(tptr, C_RESET.c_str(), C_RESET.length());
        tptr += C_RESET.length();
        *tptr++ = '\n';
        trem -= tail_len;
    }
    *tptr = '\0';
    fputs(total_buf, out);
    fflush(out);
}

// Routel show logic
void show_routes(int argc, char* argv[]) {
    // Initialize the protocol map (hard-coded)
    g_proto_map = {
        {0, "unspec"},
        {1, "redirect"},
        {2, "kernel"},
        {3, "static"},   // RTPROT_BOOT (iproute2 prints "boot"); shown as "static" here
        {4, "static"},   // RTPROT_STATIC
        {8, "gated"},
        {9, "ra"},
        {10, "mrt"},
        {11, "zebra"},
        {12, "bird"},
        {13, "dnrouted"},
        {14, "xorp"},
        {15, "ntk"},
        {16, "dhcp"},
        {18, "keepalived"},
        {42, "babel"},
        {99, "openr"},
        {186, "bgp"},
        {187, "isis"},
        {188, "ospf"},
        {189, "rip"},
        {190, "ripng"},
        {191, "nhrp"},
        {192, "eigrp"},
        {193, "ldp"},
        {194, "sharp"},
        {195, "pbr"},
        {196, "static"},
        {197, "openfabric"},
        {198, "srte"}
    };

    // Buffered output for low CPU syscalls
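    // static: stdout keeps using this buffer after show_routes() returns (e.g. during the flush at exit)
    static char buf[BUFSIZ];
    setvbuf(stdout, buf, _IOFBF, BUFSIZ);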
    string family = "inet";
    size_t page_size = 30;
    bool page_all = false;
    bool do_filter = false;
    char pattern[1024] = {0};
    string table = "all";
    string vrf_filter;
    string filter_proto_name; // New: for -o option
    uint8_t filter_proto = 0; // New: protocol filter (0 means no filter)
    bool show_neighbor = false;
    int opt;
    optind = 2;  // Skip "show"
    while ((opt = getopt(argc, argv, "h46f:i:p:t:v:o:")) != -1) {
        switch (opt) {
            case 'h':
                usage();
                return;
            case '4': family = "inet"; break;
            case '6': family = "inet6"; break;
            case 'f': family = optarg; break;
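            case 'p':
                if (strcmp(optarg, "all") == 0) {
                    page_all = true;
                } else {
                    // Reject zero or negative sizes; a size of 0 would never advance the paging loop
                    int n = atoi(optarg);
                    if (n <= 0) {
                        cerr << "Invalid page size: " << optarg << endl;
                        exit(1);
                    }
                    page_size = static_cast<size_t>(n);
                }
                break;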
            case 'i': strncpy(pattern, optarg, sizeof(pattern)-1); do_filter = true; break;
            case 't': table = optarg; break;
            case 'v': vrf_filter = optarg; break;
            case 'o': filter_proto_name = optarg; break; // New: parse -o
            default: return;
        }
    }
    // Check whether the remaining arguments include "neighbor"
    for (int i = optind; i < argc; ++i) {
        if (strcmp(argv[i], "neighbor") == 0) {
            show_neighbor = true;
        }
    }

    // Lowercase pattern for fast match
    if (do_filter) {
        strncpy(g_pattern_lower, pattern, sizeof(g_pattern_lower)-1);
        g_pattern_lower[sizeof(g_pattern_lower)-1] = '\0';
        for (int i = 0; g_pattern_lower[i]; ++i) g_pattern_lower[i] = tolower(static_cast<unsigned char>(g_pattern_lower[i]));
    }

    // Validate -v NEXTVRF_NAME
    if (!vrf_filter.empty()) {
        int vrf_idx = if_nametoindex(vrf_filter.c_str());
        if (vrf_idx == 0) {
            cerr << "Invalid NextVRF name: " << vrf_filter << endl;
            exit(1);
        }
        uint32_t vrf_table = get_vrf_table(vrf_idx);
        if (vrf_table == 0) {
            cerr << "Invalid NextVRF name: " << vrf_filter << endl;
            exit(1);
        }
    }

    // New: Validate and convert -o protocol name to number
    // Note: several numbers (3, 4, 196) share the name "static"; only the first match found is used
    if (!filter_proto_name.empty()) {
        bool found = false;
        for (const auto& pair : g_proto_map) {
            if (pair.second == filter_proto_name) {
                filter_proto = pair.first;
                found = true;
                break;
            }
        }
        if (!found) {
            cerr << "Unknown protocol: " << filter_proto_name << endl;
            exit(1);
        }
    }

    int af_family = (family == "inet") ? AF_INET : (family == "inet6") ? AF_INET6 : AF_UNSPEC;
    uint32_t table_id = get_table_id(table);
    g_family_str = (family == "inet" ? std::string("IPv4") : "IPv6") + (show_neighbor ? " Neighbors" : " Routes");
    init_table_to_vrf_cache(); // Pre-init cache for faster first run

    if (show_neighbor) {
        if (af_family != AF_INET6) {
            cerr << "Neighbor only supported for IPv6 (-6)" << endl;
            exit(1);
        }
        g_colw.resize(5);
        vector<string> keys = {"Destination", "LLaddr", "Dev", "State", "Probe"};
        for (size_t i = 0; i < 5; ++i) g_colw[i] = keys[i].length();
        g_display_rows.reserve(8192);
        read_neighbors(do_filter, g_pattern_lower);
    } else {
        g_colw.resize(9);
        vector<string> keys = {"Destination", "Gateway", "Prefsrc", "Protocol", "Scope", "Metric", "Device", "Table", "NextVRF"};
        for (size_t i = 0; i < 9; ++i) g_colw[i] = keys[i].length();
        g_display_rows.reserve(8192);
        read_routes(af_family, table_id, do_filter, g_pattern_lower, vrf_filter, filter_proto); // Pass filter_proto
    }

    size_t num_cols = g_colw.size();
    size_t visible_len = 0;
    for (auto w : g_colw) visible_len += w;
    visible_len += 2 * (num_cols - 1); // Match Python calculation
    g_inner_width = visible_len + 2;
    g_dashline = string(visible_len + 4, '-');
    g_color_seq = {C_BRIGHT_YELLOW, C_BRIGHT_MAGENTA, C_BRIGHT_WHITE, C_BRIGHT_GREEN, C_BRIGHT_MAGENTA, C_BRIGHT_WHITE, C_BRIGHT_BLUE, C_BRIGHT_WHITE, C_BRIGHT_GREEN};
    fprintf(stdout, "%s%s%s\n", C_BRIGHT_WHITE.c_str(), g_dashline.c_str(), C_RESET.c_str());
    print_total(stdout);
    fprintf(stdout, "%s%s%s\n", C_BRIGHT_WHITE.c_str(), g_dashline.c_str(), C_RESET.c_str());
    if (show_neighbor) {
        vector<string> neighbor_keys = {"Destination", "LLaddr", "Dev", "State", "Probe"};
        print_header(stdout, neighbor_keys);
    } else {
        vector<string> route_keys = {"Destination", "Gateway", "Prefsrc", "Protocol", "Scope", "Metric", "Device", "Table", "NextVRF"};
        print_header(stdout, route_keys);
    }
    fprintf(stdout, "%s%s%s\n", C_BRIGHT_WHITE.c_str(), g_dashline.c_str(), C_RESET.c_str());
    fflush(stdout); // Flush header
    signal(SIGINT, sigint_handler);
    if (g_total == 0) {
        fprintf(stdout, "%s%s%s\n", C_BRIGHT_WHITE.c_str(), g_dashline.c_str(), C_RESET.c_str());
        return;
    }
    size_t display_total = g_display_rows.size();
    int out_fd = STDOUT_FILENO;  // For write(2)
    if (page_all || display_total <= page_size) {
        // For small output, use batch (1 big write)
        batch_print_rows(g_display_rows, 0, display_total, out_fd);
    } else {
        size_t i = 0;
        while (i < display_total) {
            size_t end = i + page_size;
            if (end > display_total) end = display_total;
            batch_print_rows(g_display_rows, i, end, out_fd);
            i = end;
            if (i < display_total && !wait_space()) break;
        }
    }
    fprintf(stdout, "%s%s%s\n", C_BRIGHT_WHITE.c_str(), g_dashline.c_str(), C_RESET.c_str());
    fflush(stdout);
}

// Batch processing function for threads
void batch_process_thread(const vector<string>& lines, int nl_type, int family, const void* gw_addr, int oif, uint32_t table_id, const string& bind_vrf, uint32_t metric, const void* prefsrc_addr, int* total, int* invalid, int* valid, int* failures, bool blackhole = false) {
    int nl_sock = socket(AF_NETLINK, SOCK_RAW | SOCK_CLOEXEC, NETLINK_ROUTE);
    if (nl_sock < 0) {
        perror("socket");
        return;
    }

    // Buffer tuning: enlarge the socket buffers for bigger batches, combined with a small MAX_BATCH_SIZE
    int bufsize = 1048576;
    setsockopt(nl_sock, SOL_SOCKET, SO_SNDBUF, &bufsize, sizeof(bufsize));
    setsockopt(nl_sock, SOL_SOCKET, SO_RCVBUF, &bufsize, sizeof(bufsize));

    if (!bind_vrf.empty()) {
        if (setsockopt(nl_sock, SOL_SOCKET, SO_BINDTODEVICE, bind_vrf.c_str(), bind_vrf.length() + 1) < 0) {
            perror("setsockopt SO_BINDTODEVICE");
            close(nl_sock);
            return;
        }
    }

    struct sockaddr_nl sa = {};
    sa.nl_family = AF_NETLINK;
    if (bind(nl_sock, (struct sockaddr *) &sa, sizeof(sa)) < 0) {
        perror("bind");
        close(nl_sock);
        return;
    }

    const size_t MAX_BATCH_SIZE = 16384; // Reduced to 16 KB to avoid EMSGSIZE
    char batch_buf[MAX_BATCH_SIZE];
    size_t offset = 0;
    int num_in_batch = 0;

    int addr_size = (family == AF_INET) ? 4 : 16;
    int max_prefix = (family == AF_INET) ? 32 : 128;

    for (const string& line : lines) {
        string trimmed = line;
        trimmed.erase(0, trimmed.find_first_not_of(" \t"));
        if (trimmed.empty() || trimmed[0] == '#') continue;
        (*total)++;
        size_t slash_pos = trimmed.find('/');
        if (slash_pos == string::npos) { (*invalid)++; continue; }
        string ip_str = trimmed.substr(0, slash_pos);
        string prefix_str = trimmed.substr(slash_pos + 1);
        char dest_addr[16] = {0};
        if (inet_pton(family, ip_str.c_str(), dest_addr) != 1) { (*invalid)++; continue; }
        int prefixlen;
        try { prefixlen = stoi(prefix_str); } catch (...) { (*invalid)++; continue; }
        if (prefixlen < 0 || prefixlen > max_prefix) { (*invalid)++; continue; }

        // Compute network address
        char net_addr[16] = {0};
        if (family == AF_INET) {
            uint32_t d = ntohl(*(uint32_t*)dest_addr);
            uint32_t m = (prefixlen == 0) ? 0 : (~0U << (32 - prefixlen));
            uint32_t net = d & m;
            *(uint32_t*)net_addr = htonl(net);
        } else {
            // For IPv6, apply mask bit by bit
            uint8_t* d_bytes = (uint8_t*)dest_addr;
            uint8_t* net_bytes = (uint8_t*)net_addr;
            int full_bytes = prefixlen / 8;
            int rem_bits = prefixlen % 8;
            memcpy(net_bytes, d_bytes, full_bytes);
            if (rem_bits > 0) {
                net_bytes[full_bytes] = d_bytes[full_bytes] & (0xFF << (8 - rem_bits));
            }
        }

        struct {
            struct nlmsghdr nl;
            struct rtmsg rt;
            char buf[1024];
        } req = {};

        req.nl.nlmsg_len = NLMSG_LENGTH(sizeof(struct rtmsg));
        req.nl.nlmsg_flags = NLM_F_REQUEST | NLM_F_ACK;
        if (nl_type == RTM_NEWROUTE)
            req.nl.nlmsg_flags |= NLM_F_CREATE | NLM_F_EXCL;
        req.nl.nlmsg_type = nl_type;

        req.rt.rtm_dst_len = prefixlen;
        if (table_id > 255) {
            req.rt.rtm_table = RT_TABLE_UNSPEC;
            addattr_l(&req.nl, sizeof(req), RTA_TABLE, &table_id, 4);
        } else {
            req.rt.rtm_table = table_id;
        }
        if (nl_type == RTM_NEWROUTE) {
            req.rt.rtm_protocol = RTPROT_STATIC;
        }
        req.rt.rtm_type = blackhole ? RTN_BLACKHOLE : RTN_UNICAST;
        req.rt.rtm_family = family;

        if (prefixlen > 0) {
            addattr_l(&req.nl, sizeof(req), RTA_DST, net_addr, addr_size);
        }
        if (!blackhole) {
            if (gw_addr) {
                addattr_l(&req.nl, sizeof(req), RTA_GATEWAY, gw_addr, addr_size);
            }
            if (oif)
                addattr_l(&req.nl, sizeof(req), RTA_OIF, &oif, sizeof(oif));
        }
        if (metric > 0)
            addattr_l(&req.nl, sizeof(req), RTA_PRIORITY, &metric, 4);
        if (prefsrc_addr)
            addattr_l(&req.nl, sizeof(req), RTA_PREFSRC, prefsrc_addr, addr_size);

        if (blackhole || gw_addr)
            req.rt.rtm_scope = RT_SCOPE_UNIVERSE;
        else
            req.rt.rtm_scope = RT_SCOPE_LINK;

        size_t msg_len = NLMSG_ALIGN(req.nl.nlmsg_len);
        if (offset + msg_len > MAX_BATCH_SIZE) {
            if (nl_batch_send(nl_sock, batch_buf, offset, failures, num_in_batch) != 0) {
                cerr << "Batch send failed" << endl;
            }
            offset = 0;
            num_in_batch = 0;
        }

        memcpy(batch_buf + offset, &req.nl, req.nl.nlmsg_len);
        offset += msg_len;
        num_in_batch++;
        (*valid)++;
    }

    if (offset > 0) {
        if (nl_batch_send(nl_sock, batch_buf, offset, failures, num_in_batch) != 0) {
            cerr << "Final batch send failed" << endl;
        }
    }

    close(nl_sock);
}

int main(int argc, char* argv[]) {
    if (argc < 2) {
        usage();
        return 1;
    }

    string action = argv[1];
    if (action == "-h") {
        usage();
        return 0;
    }

    if (action == "show") {
        show_routes(argc, argv);
        return 0;
    }

    if (action != "add" && action != "del") {
        cerr << "Invalid action: must be add, del, or show" << endl;
        return 1;
    }

    // Parse family (ipv4 or ipv6), default ipv4
    int family = AF_INET;
    int arg_start = 2;
    if (argc > 2) {
        string fam_str = argv[2];
        if (fam_str == "ipv4") {
            family = AF_INET;
            arg_start = 3;
        } else if (fam_str == "ipv6") {
            family = AF_INET6;
            arg_start = 3;
        }
    }

    bool is_batch = false;
    bool blackhole = false;
    string file, gateway, dev, table_str, target_vrf, nexthop_vrf, dest, mask_str, metric_str, prefsrc;
    int proce = 1;
    uint32_t metric = 0;
    int i = arg_start;

    regex ip_regex(family == AF_INET ? R"(^[0-9]+\.[0-9]+\.[0-9]+\.[0-9]+$)" : R"(^[0-9a-fA-F:]+$)");

    while (i < argc) {
        string arg = argv[i++];
        if (arg == "file") {
            is_batch = true;
            continue;
        }
        if (is_batch) {
            if (file.empty()) {
                file = arg;
                continue;
            }
            if (gateway.empty() && regex_match(arg, ip_regex)) {
                gateway = arg;
                continue;
            }
            if (arg == "proce") {
                if (i >= argc) { cerr << "Missing proce value" << endl; return 1; }
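                // stoi throws on non-numeric input; report it instead of aborting
                try {
                    proce = stoi(argv[i++]);
                } catch (...) {
                    cerr << "Invalid proce value" << endl;
                    return 1;
                }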
            } else if (arg == "table") {
                if (i >= argc) { cerr << "Missing table value" << endl; return 1; }
                table_str = argv[i++];
            } else if (arg == "vrf") {
                if (i >= argc) { cerr << "Missing vrf value" << endl; return 1; }
                target_vrf = argv[i++];
            } else if (arg == "nexthop-vrf") {
                if (i >= argc) { cerr << "Missing nexthop-vrf value" << endl; return 1; }
                nexthop_vrf = argv[i++];
            } else if (arg == "dev") {
                if (i >= argc) { cerr << "Missing dev value" << endl; return 1; }
                dev = argv[i++];
                if (dev == "null") {
                    blackhole = true;
                }
            } else if (arg == "metric") {
                if (i >= argc) { cerr << "Missing metric value" << endl; return 1; }
                metric_str = argv[i++];
            } else if (arg == "prefsrc") {
                if (i >= argc) { cerr << "Missing prefsrc value" << endl; return 1; }
                prefsrc = argv[i++];
            } else {
                cerr << "Unknown parameter: " << arg << endl;
                return 1;
            }
        } else {
            if (dest.empty()) {
                dest = arg;
                continue;
            }
            if (mask_str.empty()) {
                mask_str = arg;
                continue;
            }
            if (gateway.empty() && regex_match(arg, ip_regex)) {
                gateway = arg;
                continue;
            }
            if (arg == "dev") {
                if (i >= argc) { cerr << "Missing dev value" << endl; return 1; }
                dev = argv[i++];
                if (dev == "null") {
                    blackhole = true;
                }
            } else if (arg == "table") {
                if (i >= argc) { cerr << "Missing table value" << endl; return 1; }
                table_str = argv[i++];
            } else if (arg == "vrf") {
                if (i >= argc) { cerr << "Missing vrf value" << endl; return 1; }
                target_vrf = argv[i++];
            } else if (arg == "nexthop-vrf") {
                if (i >= argc) { cerr << "Missing nexthop-vrf value" << endl; return 1; }
                nexthop_vrf = argv[i++];
            } else if (arg == "via") {
                if (i >= argc) { cerr << "Missing gateway after via" << endl; return 1; }
                gateway = argv[i++];
            } else if (arg == "metric") {
                if (i >= argc) { cerr << "Missing metric value" << endl; return 1; }
                metric_str = argv[i++];
            } else if (arg == "prefsrc") {
                if (i >= argc) { cerr << "Missing prefsrc value" << endl; return 1; }
                prefsrc = argv[i++];
            } else {
                cerr << "Unknown parameter: " << arg << endl;
                return 1;
            }
        }
    }

    if (!metric_str.empty()) {
        try {
            metric = stoul(metric_str);
        } catch (...) {
            cerr << "Invalid metric value: " << metric_str << endl;
            return 1;
        }
    }

    char prefsrc_addr[16] = {0};
    if (!prefsrc.empty()) {
        if (inet_pton(family, prefsrc.c_str(), prefsrc_addr) != 1) {
            cerr << "Invalid prefsrc IP: " << prefsrc << endl;
            return 1;
        }
    }

    if (getuid() != 0) {
        cerr << "This program requires root privileges" << endl;
        return 1;
    }

    if (is_batch) {
        if (file.empty()) {
            cerr << "Batch mode requires file" << endl;
            return 1;
        }
        if (dev.empty() && gateway.empty()) {
            cerr << "Batch mode requires either gateway or dev" << endl;
            return 1;
        }
        if (proce < 1) {
            cerr << "Process count must be positive integer" << endl;
            return 1;
        }
        if (proce > 64) {
            cerr << "Warning: High process count (" << proce << ") may consume excessive resources" << endl;
        }
        if (access(file.c_str(), F_OK) != 0) {
            cerr << "File " << file << " does not exist" << endl;
            return 1;
        }

        // Batch mode logic optimized with threads
        auto start_time = chrono::high_resolution_clock::now();

        // Preload file to page cache for faster first read
        {
            ifstream preload(file, ios::binary);
            if (preload) {
                string buffer;
                buffer.resize(1024 * 1024);  // 1MB buffer
                while (preload.read(&buffer[0], buffer.size())) {}
            }  // File content now in page cache
        }

        // Read all lines in main thread once
        vector<string> all_lines;
        all_lines.reserve(100000);  // Preallocate
        ifstream infile(file);
        if (!infile) {
            cerr << "Failed to open file: " << file << endl;
            return 1;
        }
        string line;
        while (getline(infile, line)) {
            all_lines.push_back(line);
        }
        infile.close();
        int lines = all_lines.size();

        int split_lines = (lines + proce - 1) / proce;

        // Prepare common parameters
        int nl_type = (action == "add" ? RTM_NEWROUTE : RTM_DELROUTE);
        char gw_addr[16] = {0};
        void* gw_ptr = nullptr;
        if (blackhole && !gateway.empty()) {
            cerr << "Warning: Ignoring gateway for blackhole route (dev null)" << endl;
            gateway.clear();
        } else if (!gateway.empty()) {
            if (inet_pton(family, gateway.c_str(), gw_addr) != 1) {
                cerr << "Invalid gateway: " << gateway << endl;
                return 1;
            }
            gw_ptr = gw_addr;
        }

        int oif = 0;
        if (!dev.empty() && !blackhole) {
            oif = if_nametoindex(dev.c_str());
            if (oif == 0) {
                cerr << "Invalid device: " << dev << endl;
                return 1;
            }
        }

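        uint32_t table_id = RT_TABLE_MAIN;
        if (!table_str.empty()) {
            // stoul throws on non-numeric input such as a table name
            try {
                table_id = stoul(table_str);
            } catch (...) {
                cerr << "Invalid table (numeric ID expected): " << table_str << endl;
                return 1;
            }
        }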

        string bind_vrf;
        if (nexthop_vrf == "default") {
            bind_vrf = "";
        } else if (!nexthop_vrf.empty()) {
            bind_vrf = nexthop_vrf;
        } else if (target_vrf == "default") {
            bind_vrf = "";
        } else {
            bind_vrf = target_vrf;
        }

        if (target_vrf == "default" && table_str.empty()) {
            table_id = RT_TABLE_MAIN;
        } else if (!target_vrf.empty() && table_str.empty()) {
            int vrf_idx = if_nametoindex(target_vrf.c_str());
            if (!vrf_idx) {
                cerr << "Target VRF not found: " << target_vrf << endl;
                return 1;
            }
            uint32_t vrf_tbl = get_vrf_table(vrf_idx);
            if (vrf_tbl == 0) {
                cerr << "Not a valid target VRF: " << target_vrf << endl;
                return 1;
            }
            table_id = vrf_tbl;
        }

        if (!nexthop_vrf.empty() && nexthop_vrf != "default") {
            int nh_vrf_idx = if_nametoindex(nexthop_vrf.c_str());
            if (!nh_vrf_idx) {
                cerr << "Nexthop VRF not found: " << nexthop_vrf << endl;
                return 1;
            }
        }

        const int box_width = 60;
        string sep = "|" + string(box_width - 2, '-') + "|";

        string op = (action == "add" ? "adding" : "deleting");
        string table_name = table_str.empty() ? "main (default)" : table_str;

        auto print_line = [&](const string& text) {
            int pad = (box_width - 3) - static_cast<int>(text.length());
            if (pad < 0) pad = 0;
            cout << "| " << text << string(pad, ' ') << "|" << endl;
        };

        // Print the job summary before the workers start, so "please wait" appears first
        cout << sep << endl;
        print_line("Using routing table: " + table_name);
        if (!target_vrf.empty()) {
            print_line("Using target VRF: " + target_vrf);
        }
        if (!nexthop_vrf.empty()) {
            print_line("Using nexthop VRF: " + nexthop_vrf);
        }
        if (!metric_str.empty()) {
            print_line("Using metric: " + metric_str);
        }
        if (!prefsrc.empty()) {
            print_line("Using prefsrc: " + prefsrc);
        }
        print_line("Processing route file: " + file);
        if (!gateway.empty()) {
            print_line("Using gateway: " + gateway);
        }
        if (!dev.empty()) {
            print_line("Using dev: " + dev);
        }
        print_line("Using processes: " + to_string(proce));
        print_line("Processing routes, please wait...");
        cout << sep << endl;

        // Parallel processing with threads
        vector<thread> threads;
        vector<int> totals(proce, 0);
        vector<int> valids(proce, 0);
        vector<int> invalids(proce, 0);
        vector<int> failures(proce, 0);

        for (int p = 0; p < proce; ++p) {
            // Clamp start to the line count: with more workers than chunks, the extra workers get an empty range
            size_t start = min(static_cast<size_t>(p) * split_lines, static_cast<size_t>(lines));
            size_t end = min(start + split_lines, static_cast<size_t>(lines));
            vector<string> chunk(all_lines.begin() + start, all_lines.begin() + end);
            threads.emplace_back(batch_process_thread, chunk, nl_type, family, gw_ptr, oif, table_id, bind_vrf, metric, prefsrc.empty() ? nullptr : prefsrc_addr, &totals[p], &invalids[p], &valids[p], &failures[p], blackhole);
        }

        for (auto& th : threads) {
            th.join();
        }

        // Aggregate
        int g_total = 0, g_valid = 0, g_invalid = 0, g_failures = 0;
        for (int p = 0; p < proce; ++p) {
            g_total += totals[p];
            g_valid += valids[p];
            g_invalid += invalids[p];
            g_failures += failures[p];
        }

        int success = g_valid - g_failures;

        cout << sep << endl;
        print_line("Processing complete:");
        print_line("Total lines: " + to_string(g_total));
        print_line("Valid CIDRs: " + to_string(g_valid));
        print_line("Invalid CIDRs: " + to_string(g_invalid));
        print_line("Successful " + op + ": " + to_string(success));
        print_line("Failures: " + to_string(g_failures) + " (may already exist/not exist)");
        print_line("Routes " + op + " finished!");

        auto end_time = chrono::high_resolution_clock::now();
        auto elapsed = chrono::duration_cast<chrono::milliseconds>(end_time - start_time).count();
        print_line("Execution time: " + to_string(elapsed) + " ms");
        cout << sep << endl;

        return 0;
    } else {
        // Single mode
        if (dest.empty() || mask_str.empty() || (gateway.empty() && dev.empty() && !blackhole)) {
            cerr << "Single mode requires network, mask, and either gateway or dev" << endl;
            return 1;
        }

        // Handle if dest has /prefix (compatibility)
        size_t slash_pos = dest.find('/');
        if (slash_pos != string::npos) {
            mask_str = dest.substr(slash_pos + 1);
            dest = dest.substr(0, slash_pos);
        }

        // Parse dest
        char dest_addr[16] = {0};
        if (inet_pton(family, dest.c_str(), dest_addr) != 1) {
            cerr << "Invalid network address: " << dest << endl;
            return 1;
        }

        // Parse mask
        int prefixlen;
        int max_prefix = (family == AF_INET) ? 32 : 128;
        regex cidr_num_regex(R"(^[0-9]{1,3}$)");
        if (regex_match(mask_str, cidr_num_regex)) {
            prefixlen = stoi(mask_str);
            if (prefixlen < 0 || prefixlen > max_prefix) {
                cerr << "Invalid CIDR prefix: " << mask_str << endl;
                return 1;
            }
        } else if (family == AF_INET) {
            char mask_addr[4];
            if (inet_pton(AF_INET, mask_str.c_str(), mask_addr) != 1) {
                cerr << "Invalid mask: " << mask_str << endl;
                return 1;
            }
            uint32_t mask_val = ntohl(*(uint32_t*)mask_addr);
            // Check valid mask (contiguous 1s)
            int count = 0;
            bool seen_zero = false;
            for (int bit = 31; bit >= 0; --bit) {
                if (mask_val & (1U << bit)) {
                    if (seen_zero) {
                        cerr << "Invalid non-contiguous mask: " << mask_str << endl;
                        return 1;
                    }
                    count++;
                } else {
                    seen_zero = true;
                }
            }
            prefixlen = count;
        } else {
            cerr << "IPv6 only supports CIDR prefix notation" << endl;
            return 1;
        }

        // Compute network address
        char net_addr[16] = {0};
        if (family == AF_INET) {
            uint32_t d = ntohl(*(uint32_t*)dest_addr);
            uint32_t m = (prefixlen == 0) ? 0 : (~0U << (32 - prefixlen));
            uint32_t net = d & m;
            *(uint32_t*)net_addr = htonl(net);
        } else {
            uint8_t* d_bytes = (uint8_t*)dest_addr;
            uint8_t* net_bytes = (uint8_t*)net_addr;
            int full_bytes = prefixlen / 8;
            int rem_bits = prefixlen % 8;
            memcpy(net_bytes, d_bytes, full_bytes);
            if (rem_bits > 0) {
                net_bytes[full_bytes] = d_bytes[full_bytes] & (0xFF << (8 - rem_bits));
            }
        }

        // Parse gateway if provided
        char gw_addr[16] = {0};
        void* gw_ptr = nullptr;
        if (blackhole && !gateway.empty()) {
            cerr << "Warning: Ignoring gateway for blackhole route (dev null)" << endl;
        } else if (!gateway.empty()) {
            if (inet_pton(family, gateway.c_str(), gw_addr) != 1) {
                cerr << "Invalid gateway: " << gateway << endl;
                return 1;
            }
            gw_ptr = gw_addr;
        }

        // Check if gateway is IPv6 link-local and dev is not specified
        if (family == AF_INET6 && gw_ptr && dev.empty() && !blackhole) {
            uint8_t* gw_bytes = (uint8_t*)gw_addr;
            if (gw_bytes[0] == 0xfe && (gw_bytes[1] & 0xc0) == 0x80) {
                cerr << "Link-local IPv6 gateway requires 'dev <interface>' option" << endl;
                return 1;
            }
        }

        // Handle dev, table, vrf
        int oif = 0;
        if (!dev.empty() && !blackhole) {
            oif = if_nametoindex(dev.c_str());
            if (!oif) {
                cerr << "Invalid device: " << dev << endl;
                return 1;
            }
        }
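        uint32_t table_id = RT_TABLE_MAIN;
        if (!table_str.empty()) {
            // Use stoul as in batch mode; it throws on non-numeric input such as a table name
            try {
                table_id = stoul(table_str);
            } catch (...) {
                cerr << "Invalid table (numeric ID expected): " << table_str << endl;
                return 1;
            }
        }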
        string bind_vrf;
        if (nexthop_vrf == "default") {
            bind_vrf = "";
        } else if (!nexthop_vrf.empty()) {
            bind_vrf = nexthop_vrf;
        } else if (target_vrf == "default") {
            bind_vrf = "";
        } else {
            bind_vrf = target_vrf;
        }

        if (target_vrf == "default" && table_str.empty()) {
            table_id = RT_TABLE_MAIN;
        } else if (!target_vrf.empty() && table_str.empty()) {
            int vrf_idx = if_nametoindex(target_vrf.c_str());
            if (!vrf_idx) {
                cerr << "Target VRF not found: " << target_vrf << endl;
                return 1;
            }
            uint32_t vrf_tbl = get_vrf_table(vrf_idx);
            if (vrf_tbl == 0) {
                cerr << "Not a valid target VRF: " << target_vrf << endl;
                return 1;
            }
            table_id = vrf_tbl;
        }

        if (!nexthop_vrf.empty() && nexthop_vrf != "default") {
            int nh_vrf_idx = if_nametoindex(nexthop_vrf.c_str());
            if (!nh_vrf_idx) {
                cerr << "Nexthop VRF not found: " << nexthop_vrf << endl;
                return 1;
            }
        }

        // Perform Netlink operation
        int nl_type = (action == "add" ? RTM_NEWROUTE : RTM_DELROUTE);
        if (nl_route_generic(nl_type, family, net_addr, prefixlen, gw_ptr, oif, table_id, bind_vrf, metric, prefsrc.empty() ? nullptr : prefsrc_addr, blackhole) != 0) {
            cerr << "Failed to " << action << " route: " << strerror(errno) << endl;
            return 1;
        }

        return 0;
    }
}
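
The mask handling in single mode (dotted mask to prefix length, then zeroing the host bits) can be isolated into small helpers. The following is a minimal sketch for IPv4 only; the names mask_to_prefix, network_v4 and network_of are hypothetical and do not appear in router.cpp, and the contiguity test uses a bit trick instead of the bit-by-bit loop in the listing above.

```cpp
#include <arpa/inet.h>
#include <netinet/in.h>
#include <cassert>
#include <cstdint>
#include <optional>
#include <string>

// Convert a dotted-decimal netmask to a prefix length.
// Returns std::nullopt when the mask is unparsable or its 1-bits are not contiguous.
std::optional<int> mask_to_prefix(const std::string& mask) {
    in_addr a{};
    if (inet_pton(AF_INET, mask.c_str(), &a) != 1) return std::nullopt;
    uint32_t m = ntohl(a.s_addr);
    uint32_t inv = ~m;
    // A valid mask inverts to 0...01...1; such a value shares no bits with itself plus one.
    if ((inv & (inv + 1)) != 0) return std::nullopt;
    int len = 0;
    while (m & 0x80000000u) { ++len; m <<= 1; }
    return len;
}

// Zero the host bits of an IPv4 address given in host byte order.
uint32_t network_v4(uint32_t addr, int prefixlen) {
    assert(prefixlen >= 0 && prefixlen <= 32);
    // Shifting a 32-bit value by 32 is undefined, so /0 is handled separately.
    uint32_t mask = (prefixlen == 0) ? 0u : (~0u << (32 - prefixlen));
    return addr & mask;
}

// Text-in, text-out wrapper; returns an empty string on a parse failure.
std::string network_of(const std::string& ip, int prefixlen) {
    in_addr a{};
    if (inet_pton(AF_INET, ip.c_str(), &a) != 1) return "";
    a.s_addr = htonl(network_v4(ntohl(a.s_addr), prefixlen));
    char buf[INET_ADDRSTRLEN];
    if (!inet_ntop(AF_INET, &a, buf, sizeof(buf))) return "";
    return buf;
}
```

For instance, mask_to_prefix("255.255.255.0") yields 24, and network_of("10.10.16.100", 24) yields "10.10.16.0", which is why a host address can be passed as the network argument.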

Compile command

g++ -std=c++20 -O3 -static router.cpp -o router

Usage:

# router -h
Usage:
  router add | del <network> <mask> [<gateway>] [dev <device>] [table <table>] [vrf <vrf|default>] [nexthop-vrf <nexthop_vrf|default>] [metric <metric>]
    - network: IP or network address
    - mask: Subnet mask (example: 255.255.255.0 or 24)
    - gateway: Next hop IP (optional if device is specified)
    - dev: Optional interface
    - table: Optional routing table (default: main)
    - vrf: Optional target VRF name (default: default)
    - nexthop-vrf: Optional VRF for nexthop resolution (for route leaking, default: default)
    - metric: Optional metric value

  router add | del file <file> <gateway> [proce <num>] [table <table>] [vrf <vrf|default>] [nexthop-vrf <nexthop_vrf|default>] [metric <metric>]
    - file: Path to file with CIDR lines (example: router add file cn.txt 192.168.10.254)
    - proce: Number of processes (default: 1)

  router show [-4|-6] [-p N|all] [-i PATTERN] [-t TABLE] [-v NEXTVRF_NAME]
    -4: Show IPv4 routes
    -6: Show IPv6 routes
    -p: Page size or all, default size 30
    -i: Filter pattern
    -t: Table (table name or table number)
    -v: Filter by NextVRF name

  router -h: Show this help

The program efficiently adds or deletes a single route, adds or deletes routes in bulk from a file of CIDRs, and displays the routing table.
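
For the file mode, each line is expected to hold one CIDR; blank lines and lines starting with # are skipped. Below is a minimal sketch of that per-line validation. The Cidr struct and parse_cidr_line are hypothetical names, and the sketch is deliberately stricter than the listing (it also trims trailing whitespace and rejects a prefix such as 24abc). Skipped lines and invalid lines both come back as std::nullopt here, whereas router.cpp counts them separately.

```cpp
#include <arpa/inet.h>
#include <sys/socket.h>
#include <cassert>
#include <cstddef>
#include <optional>
#include <string>

struct Cidr {
    std::string address;  // textual address exactly as written in the file
    int prefixlen;        // validated against the address family
};

// Parse one line of a route file; family is AF_INET or AF_INET6.
std::optional<Cidr> parse_cidr_line(const std::string& line, int family) {
    assert(family == AF_INET || family == AF_INET6);
    const char* blanks = " \t\r\n";
    std::size_t first = line.find_first_not_of(blanks);
    // Empty line or comment: nothing to parse.
    if (first == std::string::npos || line[first] == '#') return std::nullopt;
    std::size_t last = line.find_last_not_of(blanks);
    std::string text = line.substr(first, last - first + 1);

    std::size_t slash = text.find('/');
    if (slash == std::string::npos) return std::nullopt;
    std::string address = text.substr(0, slash);
    std::string digits = text.substr(slash + 1);

    // inet_pton returns 1 only for a well-formed address of the given family.
    unsigned char raw[16];
    if (inet_pton(family, address.c_str(), raw) != 1) return std::nullopt;

    // Accept one to three decimal digits only, so std::stoi cannot throw.
    if (digits.empty() || digits.size() > 3 ||
        digits.find_first_not_of("0123456789") != std::string::npos) {
        return std::nullopt;
    }
    int prefixlen = std::stoi(digits);
    int max_prefix = (family == AF_INET) ? 32 : 128;
    if (prefixlen > max_prefix) return std::nullopt;
    return Cidr{address, prefixlen};
}
```

A file such as cn.txt would then contain lines like 192.168.0.0/16, one per line, optionally with # comments between them.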

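In the router show output, pay attention to the scope value: the smaller the number, the wider the route's applicability (more global); the larger the number, the narrower it is (more local). This originates from the router requirements in RFC 1812: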

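  • scope link → RT_SCOPE_LINK (253), routes on the local link

  • scope global → RT_SCOPE_UNIVERSE (0), all other routes not otherwise classified

  • scope host → RT_SCOPE_HOST (254), routes to the host itself

  • RT_SCOPE_SITE = 200 (site-local; common for IPv6, rarely used for IPv4)

  • RT_SCOPE_NOWHERE = 255 (invalid route)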
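
As a quick reference, the numeric values above can be mapped to the keywords shown in the Scope column. This is a minimal sketch: the enum redefines the RT_SCOPE_* values locally (they match <linux/rtnetlink.h>) so it compiles without kernel headers, and scope_name and wider_than are hypothetical helper names, not functions from router.cpp.

```cpp
#include <cassert>
#include <string>

// Local copies of the RT_SCOPE_* values from <linux/rtnetlink.h>.
enum Scope {
    SCOPE_UNIVERSE = 0,
    SCOPE_SITE     = 200,
    SCOPE_LINK     = 253,
    SCOPE_HOST     = 254,
    SCOPE_NOWHERE  = 255
};

// Map a numeric scope to its keyword; unnamed values are printed as numbers.
std::string scope_name(int scope) {
    assert(scope >= 0 && scope <= 255);  // rtm_scope is a single byte
    switch (scope) {
        case SCOPE_UNIVERSE: return "global";
        case SCOPE_SITE:     return "site";
        case SCOPE_LINK:     return "link";
        case SCOPE_HOST:     return "host";
        case SCOPE_NOWHERE:  return "nowhere";
        default:             return std::to_string(scope);
    }
}

// A smaller number means a wider (more global) scope.
bool wider_than(int a, int b) {
    return a < b;
}
```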

In the Linux kernel's FIB (Forwarding Information Base) lookup, scope is considered in addition to longest-prefix match.

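After the PREROUTING hook and before ip rule matching, the scope is computed and recorded as flow_scope; only then are the ip rule entries matched and the routing tables searched. At this stage only static attributes of the packet are used: the destination IP (dst), the input interface (iif) on the input path or the output interface (oif) on the output path, the source IP (saddr), and the routing flags (fib_flags, e.g. RTF_LOCAL). The computation proceeds as follows: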

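Fast local check: the kernel first consults the interface's local route cache (per-interface fib_info) to see whether the destination IP falls within one of the interface's directly connected subnets (a /32 host route or a wider prefix). This takes only O(1) time, because it walks just the few kernel-injected routes of the interface (the ones created by ifconfig or ip addr, whose protocol field is kernel).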

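  • If the destination IP is a local host address (e.g., lo or an interface IP), flow_scope = RT_SCOPE_HOST (254).

  • If the destination IP is in the link-local subnet of the input interface (e.g., the /24 network of iif), flow_scope = RT_SCOPE_LINK (253).

  • Otherwise it defaults to RT_SCOPE_UNIVERSE (0), which applies to global forwarding.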
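
The three cases above can be written down as a small function. This is only a model of the description in this article, not code taken from the kernel; Subnet, covers, classify_flow_scope and the ip helper are hypothetical names, and addresses are plain 32-bit integers in host byte order.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Build an IPv4 address in host byte order from its four octets.
constexpr uint32_t ip(uint32_t a, uint32_t b, uint32_t c, uint32_t d) {
    return (a << 24) | (b << 16) | (c << 8) | d;
}

struct Subnet {
    uint32_t net;  // network address, host byte order
    int len;       // prefix length, 0..32
};

// True when addr falls inside the subnet.
bool covers(const Subnet& s, uint32_t addr) {
    assert(s.len >= 0 && s.len <= 32);
    uint32_t mask = (s.len == 0) ? 0u : (~0u << (32 - s.len));
    return (addr & mask) == (s.net & mask);
}

// Returns 254 (host), 253 (link) or 0 (universe), following the three rules in the text.
int classify_flow_scope(uint32_t dst,
                        const std::vector<uint32_t>& local_addrs,
                        const std::vector<Subnet>& connected) {
    for (uint32_t a : local_addrs) {
        if (a == dst) return 254;          // destination is one of our own addresses
    }
    for (const Subnet& s : connected) {
        if (covers(s, dst)) return 253;    // destination is on a directly connected subnet
    }
    return 0;                              // everything else
}
```

With local address 10.10.16.1 and connected subnet 10.10.16.0/24, a destination of 10.10.16.2 is classified as 253, 10.10.16.1 as 254, and 8.8.8.8 as 0.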

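Next come ip rule matching and the route lookup. A route can be used only if it satisfies both conditions: 1. it is the prefix selected by longest-prefix match (LPM); 2. its scope satisfies route_scope >= flow_scope. This logic prevents improper forwarding, for example handling link-local traffic with a global default route.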

A concrete case:

1. The ip rule entries are as follows

# ip rule show
0:      from all lookup local
304:    from 10.10.16.0/24 iif tun1 lookup 10000 proto zebra
32766:  from all lookup main
32767:  from all lookup default

2. Table 10000 contains the following route

# router show -t 10000
-----------------------------------------------------------------------------------
|                                 IPv4 Routes: 1                                  |
-----------------------------------------------------------------------------------
| Destination  Gateway     Prefsrc  Protocol  Scope   Metric  Dev  Table  NextVRF |
-----------------------------------------------------------------------------------
| default      10.10.40.1  /        pbr       global  20      wg0  10000  default |
-----------------------------------------------------------------------------------

3. 10.10.16.0/24 is the subnet of the tun1 interface, and the main table contains the following

# router show -t main
---------------------------------------------------------------------------------------------------
|                                         IPv4 Routes: 7                                          |
---------------------------------------------------------------------------------------------------
| Destination         Gateway     Prefsrc       Protocol  Scope   Metric  Dev      Table  NextVRF |
---------------------------------------------------------------------------------------------------
| default             172.66.4.1  172.66.5.130  dhcp      global  100     enp3s0   main   default |
| 10.10.15.0/24       /           10.10.15.1    kernel    link    0       tun0     main   default |
| 10.10.16.0/24       /           10.10.16.1    kernel    link    0       tun1     main   default |
| 10.10.40.0/24       /           10.10.40.2    kernel    link    0       wg0      main   default |
| 154.223.182.134/32  172.66.4.1  /             static    global  0       enp3s0   main   default |
| 172.66.4.0/23       /           172.66.5.130  kernel    link    100     enp3s0   main   default |
| 192.168.17.0/24     /           192.168.17.1  kernel    link    0       docker0  main   default |
---------------------------------------------------------------------------------------------------

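4. When source address 10.10.16.100 accesses 10.10.16.2, the packet enters the kernel, passes the PREROUTING hook, and flow_scope is computed first. The computation is based on the interface for the packet's direction (input or output), the kernel-injected routes whose protocol field is kernel, and the source and destination IPs.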

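The computation finds that the destination falls within the protocol kernel route 10.10.16.0/24, a directly connected route, so the packet is tagged with flow_scope (253) and ip rule matching begins.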

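In table 10000, LPM finds that the default route matches, and then the scope check runs. The default route's route_scope (0) is less than flow_scope (253), so the condition route_scope >= flow_scope is not met. The default route in table 10000 is therefore not used, and matching moves on to the next ip rule.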

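In the main table, the 10.10.16.0/24 route matches and its route_scope equals flow_scope (253), so the condition holds and the packet is forwarded over the directly connected route. It is therefore not caught by rule 304 and is not sent to 10.10.40.1 as the next hop.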
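
The walk through the rules can be replayed with a small model. This is a simplified sketch under stated assumptions: flow_scope is given as an input, rules match only on source prefix and input interface, and the local and default tables are omitted because their contents are not listed here. All type and function names (Route, Rule, Packet, lookup_table, resolve, example_rules, example_tables) are hypothetical; the route data is copied from the tables shown earlier.

```cpp
#include <cassert>
#include <cstdint>
#include <map>
#include <optional>
#include <string>
#include <vector>

constexpr uint32_t ip(uint32_t a, uint32_t b, uint32_t c, uint32_t d) {
    return (a << 24) | (b << 16) | (c << 8) | d;
}
uint32_t mask_of(int len) {
    assert(len >= 0 && len <= 32);
    return (len == 0) ? 0u : (~0u << (32 - len));
}

struct Route  { uint32_t net; int len; int scope; std::string gateway; std::string dev; };
struct Rule   { int priority; uint32_t src_net; int src_len; std::string iif; std::string table; };
struct Packet { uint32_t src; uint32_t dst; std::string iif; int flow_scope; };
using Tables = std::map<std::string, std::vector<Route>>;

// Longest-prefix match among routes whose scope satisfies route_scope >= flow_scope.
std::optional<Route> lookup_table(const std::vector<Route>& table, const Packet& p) {
    std::optional<Route> best;
    for (const Route& r : table) {
        if ((p.dst & mask_of(r.len)) != r.net) continue;  // prefix does not cover dst
        if (r.scope < p.flow_scope) continue;             // route scope is too wide
        if (!best || r.len > best->len) best = r;
    }
    return best;
}

// Walk the rules in priority order; the first table that yields a route wins.
std::optional<Route> resolve(const std::vector<Rule>& rules, const Tables& tables, const Packet& p) {
    for (const Rule& rule : rules) {
        if ((p.src & mask_of(rule.src_len)) != rule.src_net) continue;
        if (!rule.iif.empty() && rule.iif != p.iif) continue;
        auto it = tables.find(rule.table);
        if (it == tables.end()) continue;                 // table not modelled here
        if (auto hit = lookup_table(it->second, p)) return hit;
    }
    return std::nullopt;
}

std::vector<Rule> example_rules() {
    return std::vector<Rule>{
        Rule{0,     0,              0,  "",     "local"},
        Rule{304,   ip(10,10,16,0), 24, "tun1", "10000"},
        Rule{32766, 0,              0,  "",     "main"},
        Rule{32767, 0,              0,  "",     "default"},
    };
}

Tables example_tables() {
    std::vector<Route> t10000 = { Route{0, 0, 0, "10.10.40.1", "wg0"} };
    std::vector<Route> main_table = {
        Route{0,                   0,  0,   "172.66.4.1", "enp3s0"},
        Route{ip(10,10,15,0),      24, 253, "",           "tun0"},
        Route{ip(10,10,16,0),      24, 253, "",           "tun1"},
        Route{ip(10,10,40,0),      24, 253, "",           "wg0"},
        Route{ip(154,223,182,134), 32, 0,   "172.66.4.1", "enp3s0"},
        Route{ip(172,66,4,0),      23, 253, "",           "enp3s0"},
        Route{ip(192,168,17,0),    24, 253, "",           "docker0"},
    };
    Tables t;
    t["10000"] = t10000;
    t["main"] = main_table;
    return t;
}
```

Calling resolve with a packet from 10.10.16.100 to 10.10.16.2 on tun1 and flow_scope 253 returns the tun1 route from main, while the same source heading to 8.8.8.8 with flow_scope 0 gets the default route of table 10000 via wg0.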

Command-line completion and hints for the router program

vim /usr/share/bash-completion/completions/router
#!/usr/bin/env bash

_complete_router() {
    local cur prev words cword
    _init_completion || return

    # Common fixed values
    local families=("ipv4" "ipv6")
    local common_opts=("dev" "table" "vrf" "nexthop-vrf" "metric" "prefsrc")
    local tables=("main" "local" "default" "all" "<Table_ID/Name>")  # Common table names; numeric IDs are typed by the user
    local protocols=("unspec" "redirect" "kernel" "static" "gated" "ra" "mrt" "zebra" "bird" "dnrouted" "xorp" "ntk" "dhcp" "keepalived" "babel" "openr" "bgp" "isis" "ospf" "rip" "ripng" "nhrp" "eigrp" "ldp" "sharp" "pbr" "openfabric" "srte")  # Taken from g_proto_map
    local show_opts=("-4" "-6" "-p" "-i" "-t" "-v" "-o" "neighbor")

    # Get the list of network interfaces (used for dev, vrf, etc.);
    # strip the "@parent" suffix that ip prints for VLAN and veth devices
    local interfaces
    interfaces=$(ip -o link show 2>/dev/null | awk -F': ' '{ sub(/@.*/, "", $2); print $2 }')

    # First-level commands: add, del, show, -h
    if [[ $cword -eq 1 ]]; then
        COMPREPLY=($(compgen -W "add del show -h" -- "$cur"))
        return 0
    fi

    # Branch on the subcommand
    case "${words[1]}" in
        add|del)
            # Handle the optional ipv4/ipv6 keyword
            local subcmd_start=2
            if [[ "${words[2]}" == "ipv4" || "${words[2]}" == "ipv6" ]]; then
                subcmd_start=3
            fi

            # Complete according to the previous word
            case "$prev" in
                file)
                    # Complete a file path
                    _filedir
                    return 0
                    ;;
                dev)
                    # Complete an interface name or null
                    COMPREPLY=($(compgen -W "null $interfaces" -- "$cur"))
                    return 0
                    ;;
                table)
                    # Complete a table name (numbers are not completed; the user types them)
                    COMPREPLY=($(compgen -W "${tables[*]}" -- "$cur"))
                    return 0
                    ;;
                vrf|nexthop-vrf)
                    # Complete a VRF (VRFs are assumed to be interfaces, so reuse interfaces; default is special)
                    COMPREPLY=($(compgen -W "default $interfaces" -- "$cur"))
                    return 0
                    ;;
                proce|metric)
                    # Numeric value, no completion
                    return 0
                    ;;
                prefsrc)
                    # IP address, no completion
                    return 0
                    ;;
            esac

            # Default: decide whether this position is network/mask/gateway/file or another option
            if [[ $cword -eq $subcmd_start ]]; then
                COMPREPLY=($(compgen -W "<network> file" -- "$cur"))  # Hint <network> and file
                return 0
            elif [[ $cword -eq $((subcmd_start + 1)) && "${words[$subcmd_start]}" != "file" ]]; then
                # No completion for the mask (the user types a mask or CIDR prefix)
                return 0
            else
                # Other positions: complete the common options (dev, table, etc.); a gateway IP is not completed
                COMPREPLY=($(compgen -W "proce ${common_opts[*]}" -- "$cur"))
                return 0
            fi
            ;;
        show)
            # Option completion for show
            case "$prev" in
                -p)
                    COMPREPLY=($(compgen -W "all" -- "$cur"))  # Or a number, which is not completed
                    return 0
                    ;;
                -i)
                    # Pattern, no completion
                    return 0
                    ;;
                -t)
                    COMPREPLY=($(compgen -W "${tables[*]}" -- "$cur"))
                    return 0
                    ;;
                -v)
                    # NextVRF name, completed from the interfaces
                    COMPREPLY=($(compgen -W "$interfaces" -- "$cur"))
                    return 0
                    ;;
                -o)
                    # Protocol
                    COMPREPLY=($(compgen -W "${protocols[*]}" -- "$cur"))
                    return 0
                    ;;
            esac

            # Default: complete the show options
            COMPREPLY=($(compgen -W "${show_opts[*]}" -- "$cur"))
            return 0
            ;;
        -h)
            # Nothing to complete
            return 0
            ;;
    esac
}

complete -F _complete_router router
chmod +x /usr/share/bash-completion/completions/router

The result looks like this; press Tab twice to complete

root@Router:~# router
-h    add   del   show
root@Router:~# router show
-4        -6        -i        -o        -p        -t        -v        neighbor

28. Create a custom service

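When a Linux system boots, services start in a defined order, and changing that order arbitrarily can break startup. Yet sometimes a service that starts early depends on one that starts later. For example, we created wstunnel@.service above; if the nftables configuration needs to apply firewall policy to the interface that wstunnel@.service creates once it is running, the two conflict. When nftables starts, the wstunnel@.service interface does not exist yet, so nftables fails to start.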

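To keep changes to a minimum, we can create a service that starts last. Once all the other services are up, it starts the failed services again, which resolves the startup dependency problem above.

Create the self-service@.service unit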

vim /etc/systemd/system/self-service@.service
[Unit]
Description=Self-Service for %i
After=multi-user.target

[Service]
Type=simple
ExecStart=/usr/bin/self-service-init %i
Restart=always
RestartSec=3s
User=root
Group=root

[Install]
WantedBy=multi-user.target

Create the startup script

vim /usr/bin/self-service-init
#!/bin/bash

# Make a pipeline fail when the command in front of tee fails,
# so that the warning branch below can actually fire
set -o pipefail

# Instance name (passed in as the first argument)
INSTANCE="$1"
if [ -z "$INSTANCE" ]; then
    echo "[ERROR] $(date "+%Y-%m-%d %H:%M:%S") No instance name specified" >&2
    exit 1
fi

# Switch to the working directory
cd /etc/self-service || exit 1

CONF_FILE="/etc/self-service/$INSTANCE.conf"
LOG_FILE="/var/log/$INSTANCE.log"

if [ ! -f "$CONF_FILE" ]; then
    echo "[ERROR] $(date "+%Y-%m-%d %H:%M:%S") Configuration file not found: $CONF_FILE" | tee -a "$LOG_FILE"
    exit 1
fi

# Parse the configuration file
START_CMDS=$(awk '/<start>/,/<\/start>/' "$CONF_FILE" | sed -e '1d' -e '$d' -e 's/&$//')
END_CMDS=$(awk '/<end>/,/<\/end>/' "$CONF_FILE" | sed -e '1d' -e '$d' -e 's/&$//')

# Run the commands passed in $1, one per line, and log each of them
run_cmds() {
    local line
    while read -r line; do
        # Skip blank lines and comment lines starting with #
        [ -z "$line" ] && continue
        case "$line" in
            \#*) continue ;;
        esac
        echo "[INFO] $(date "+%Y-%m-%d %H:%M:%S") → $line" | tee -a "$LOG_FILE"
        bash -c "$line" 2>&1 | tee -a "$LOG_FILE" || echo "[WARNING] $(date "+%Y-%m-%d %H:%M:%S") Command failed: $line" | tee -a "$LOG_FILE"
    done <<< "$1"
}

echo "====================================================================================" | tee -a "$LOG_FILE"
echo "[INFO] $(date "+%Y-%m-%d %H:%M:%S") Starting self-service instance $INSTANCE ......" | tee -a "$LOG_FILE"

# Run the commands of the start block
run_cmds "$START_CMDS"

echo "[INFO] $(date "+%Y-%m-%d %H:%M:%S") Start commands finished" | tee -a "$LOG_FILE"

# On a termination signal, run the commands of the end block.
# A function is used instead of splicing $END_CMDS into a quoted trap string,
# so quotes or $ inside the commands cannot break the handler.
on_term() {
    echo "[INFO] $(date "+%Y-%m-%d %H:%M:%S") Termination signal received, running end commands ......" | tee -a "$LOG_FILE"
    run_cmds "$END_CMDS"
    echo "[INFO] $(date "+%Y-%m-%d %H:%M:%S") End commands finished" | tee -a "$LOG_FILE"
    echo "====================================================================================" | tee -a "$LOG_FILE"
    exit 0
}
trap on_term SIGTERM SIGINT

# Keep the service running (wait for a signal indefinitely)
while true; do
    sleep 3600 &  # sleep in the background so the CPU stays idle
    wait $!
done
chmod +x /usr/bin/self-service-init
systemctl daemon-reload
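The init script reads its commands from `<start>` and `<end>` blocks in the instance's configuration file. To see what the awk/sed pipeline extracts, it can be tried on a throwaway file first. This is only a sketch: `extract_block` is a hypothetical wrapper around the same pipeline, and the sample commands are made up for illustration.

```shell
#!/bin/bash
# Hypothetical wrapper around the awk/sed pipeline used by self-service-init:
# print the lines between <tag> and </tag>, with a trailing "&" removed.
extract_block() {
    local tag="$1" file="$2"
    awk "/<$tag>/,/<\/$tag>/" "$file" | sed -e '1d' -e '$d' -e 's/&$//'
}

# Throwaway sample configuration (contents are illustrative only)
conf=$(mktemp)
cat > "$conf" <<'EOF'
<start>
# a comment line, skipped later by the script
systemctl restart nftables
ip link set wg0 up &
</start>

<end>
ip link set wg0 down
</end>
EOF

start_cmds=$(extract_block start "$conf")
end_cmds=$(extract_block end "$conf")
printf 'start block:\n%s\n' "$start_cmds"
printf 'end block:\n%s\n' "$end_cmds"
rm -f "$conf"
```

Note that the pipeline only removes the `&` itself, so a line written as `cmd &` keeps its trailing space. That is harmless, because the line is later passed to `bash -c`.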

Create the directory and the configuration file

mkdir /etc/self-service
vim /etc/self-service/nftables.conf
<start>
systemctl restart nftables
</start>

<end>
</end>
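The steps above stop at writing the configuration file. For the instance to run at boot it presumably still has to be enabled, with the instance name matching the file name (`nftables.conf` gives `self-service@nftables`). A small sketch of that step follows. `enable_instance` and the `CONF_DIR` variable are hypothetical conveniences, not part of the original setup.

```shell
#!/bin/bash
# Directory holding the instance configuration files (path taken from this article)
CONF_DIR="${CONF_DIR:-/etc/self-service}"

# Hypothetical helper: enable and start an instance only if its config file exists,
# so a typo in the instance name is caught before systemd is involved.
enable_instance() {
    local name="$1"
    if [ ! -f "$CONF_DIR/$name.conf" ]; then
        echo "missing $CONF_DIR/$name.conf" >&2
        return 1
    fi
    systemctl enable --now "self-service@$name"
}
```

Running `enable_instance nftables` as root would then enable the instance. Afterwards `systemctl status self-service@nftables` and `/var/log/nftables.log` show whether the start commands ran.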

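Set up the interface startup scripts

Routes and addresses added by hand with the ip command can disappear when the physical interface restarts. The outgoing interface of a route therefore needs a hook that adds the route again whenever it comes back up, so that services and data paths depending on it do not break. This matters most for interfaces such as WireGuard or OpenVPN, which nmcli cannot manage.

Interface up/down script (this example listens for dhcp4-change)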

vim /etc/NetworkManager/dispatcher.d/90-eth0-dhcp4.sh
#!/bin/bash

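# Possible event types (from the list supported by the NetworkManager dispatcher):
# - up: the interface is logically up (link ready, but possibly without an IP)
# - down: the interface has gone down
# - pre-up: before the interface comes up (earliest stage, useful for preparation)
# - pre-down: before the interface goes down (useful for cleanup)
# - dhcp4-change: fires only when an IPv4 DHCP lease is first obtained or successfully renewed (recommended when the interface depends on DHCP); it does not fire when the lease expires without a new address or when DHCP fails, although a failure may trigger other states
# - dhcp6-change: fires only when an IPv6 DHCP lease is first obtained or successfully renewed (recommended when the interface depends on DHCP); it does not fire when the lease expires without a new address or when DHCP fails, although a failure may trigger other states
# - vpn-up: a VPN connection has come up
# - vpn-down: a VPN connection has gone down
# - hostname: the hostname has changed

CONNECTION_NAME="eth0"                                  # Interface name, e.g. eth0 or ens192
EVENT_TYPE="dhcp4-change"                               # Event type to listen for, e.g. up, down

COMMANDS="systemctl start self-service@vpls-wg0"        # Commands to run when the event fires; separate several with ;

INTERFACE=$1
EVENT=$2

if [ "$INTERFACE" != "$CONNECTION_NAME" ]; then
    exit 0
fi
# Run the command sequence that matches the event type
case "$EVENT" in
    "$EVENT_TYPE")
        if [ -n "$COMMANDS" ]; then
            eval "$COMMANDS"
        fi
        ;;
    *)
        # Ignore other events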
        ;;
esac
chmod +x /etc/NetworkManager/dispatcher.d/90-eth0-dhcp4.sh
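Before relying on a dispatcher script at boot, the matching logic can be dry-run by hand. NetworkManager passes the interface as the first argument and the action as the second. The sketch below keeps the same variables as the script above but only prints the command instead of running it. `handle_event` is a hypothetical name used for illustration.

```shell
#!/bin/bash
# Same settings as in the dispatcher script above
CONNECTION_NAME="eth0"
EVENT_TYPE="dhcp4-change"
COMMANDS="systemctl start self-service@vpls-wg0"

# Hypothetical dry-run of the dispatcher logic: $1 = interface, $2 = action.
# Prints the command that would be executed instead of running it.
handle_event() {
    local interface="$1" event="$2"
    [ "$interface" = "$CONNECTION_NAME" ] || return 0
    case "$event" in
        "$EVENT_TYPE") echo "would run: $COMMANDS" ;;
        *) ;;   # ignore other events
    esac
}

handle_event eth0 dhcp4-change   # interface and event match
handle_event eth0 up             # other event: prints nothing
handle_event eth1 dhcp4-change   # other interface: prints nothing
```

The installed script can be exercised the same way, for example `/etc/NetworkManager/dispatcher.d/90-eth0-dhcp4.sh eth0 dhcp4-change`, keeping in mind that this really executes the configured commands.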

Interface down script

vim /etc/NetworkManager/dispatcher.d/90-eth0-down.sh
#!/bin/bash

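# Possible event types (from the list supported by the NetworkManager dispatcher):
# - up: the interface is logically up (link ready, but possibly without an IP)
# - down: the interface has gone down
# - pre-up: before the interface comes up (earliest stage, useful for preparation)
# - pre-down: before the interface goes down (useful for cleanup)
# - dhcp4-change: fires only when an IPv4 DHCP lease is first obtained or successfully renewed (recommended when the interface depends on DHCP); it does not fire when the lease expires without a new address or when DHCP fails, although a failure may trigger other states
# - dhcp6-change: fires only when an IPv6 DHCP lease is first obtained or successfully renewed (recommended when the interface depends on DHCP); it does not fire when the lease expires without a new address or when DHCP fails, although a failure may trigger other states
# - vpn-up: a VPN connection has come up
# - vpn-down: a VPN connection has gone down
# - hostname: the hostname has changed

CONNECTION_NAME="eth0"                                  # Interface name, e.g. eth0 or ens192
EVENT_TYPE="down"                                       # Event type to listen for, e.g. up, down

COMMANDS="systemctl stop self-service@vpls-wg0"         # Commands to run when the event fires; separate several with ;

INTERFACE=$1
EVENT=$2

if [ "$INTERFACE" != "$CONNECTION_NAME" ]; then
    exit 0
fi
# Run the command sequence that matches the event type
case "$EVENT" in
    "$EVENT_TYPE")
        if [ -n "$COMMANDS" ]; then
            eval "$COMMANDS"
        fi
        ;;
    *)
        # Ignore other events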
        ;;
esac
chmod +x /etc/NetworkManager/dispatcher.d/90-eth0-down.sh

Interface pre-up script

vim /etc/NetworkManager/dispatcher.d/pre-up.d/90-eth0-preup.sh
#!/bin/bash

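# Possible event types (from the list supported by the NetworkManager dispatcher):
# - up: the interface is logically up (link ready, but possibly without an IP)
# - down: the interface has gone down
# - pre-up: before the interface comes up (earliest stage, useful for preparation)
# - pre-down: before the interface goes down (useful for cleanup)
# - dhcp4-change: fires only when an IPv4 DHCP lease is first obtained or successfully renewed (recommended when the interface depends on DHCP); it does not fire when the lease expires without a new address or when DHCP fails, although a failure may trigger other states
# - dhcp6-change: fires only when an IPv6 DHCP lease is first obtained or successfully renewed (recommended when the interface depends on DHCP); it does not fire when the lease expires without a new address or when DHCP fails, although a failure may trigger other states
# - vpn-up: a VPN connection has come up
# - vpn-down: a VPN connection has gone down
# - hostname: the hostname has changed

CONNECTION_NAME="eth0"                                  # Interface name, e.g. eth0 or ens192
EVENT_TYPE="pre-up"                                     # Event type to listen for, e.g. up, down

COMMANDS="systemctl start self-service@vpls-wg0"        # Commands to run when the event fires; separate several with ;

INTERFACE=$1
EVENT=$2

if [ "$INTERFACE" != "$CONNECTION_NAME" ]; then
    exit 0
fi
# Run the command sequence that matches the event type
case "$EVENT" in
    "$EVENT_TYPE")
        if [ -n "$COMMANDS" ]; then
            eval "$COMMANDS"
        fi
        ;;
    *)
        # Ignore other events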
        ;;
esac
chmod +x /etc/NetworkManager/dispatcher.d/pre-up.d/90-eth0-preup.sh

Interface pre-down script

vim /etc/NetworkManager/dispatcher.d/pre-down.d/90-eth0-predown.sh
#!/bin/bash

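# Possible event types (from the list supported by the NetworkManager dispatcher):
# - up: the interface is logically up (link ready, but possibly without an IP)
# - down: the interface has gone down
# - pre-up: before the interface comes up (earliest stage, useful for preparation)
# - pre-down: before the interface goes down (useful for cleanup)
# - dhcp4-change: fires only when an IPv4 DHCP lease is first obtained or successfully renewed (recommended when the interface depends on DHCP); it does not fire when the lease expires without a new address or when DHCP fails, although a failure may trigger other states
# - dhcp6-change: fires only when an IPv6 DHCP lease is first obtained or successfully renewed (recommended when the interface depends on DHCP); it does not fire when the lease expires without a new address or when DHCP fails, although a failure may trigger other states
# - vpn-up: a VPN connection has come up
# - vpn-down: a VPN connection has gone down
# - hostname: the hostname has changed

CONNECTION_NAME="eth0"                                  # Interface name, e.g. eth0 or ens192
EVENT_TYPE="pre-down"                                   # Event type to listen for, e.g. up, down

COMMANDS="systemctl stop self-service@vpls-wg0"         # Commands to run when the event fires; separate several with ;

INTERFACE=$1
EVENT=$2

if [ "$INTERFACE" != "$CONNECTION_NAME" ]; then
    exit 0
fi
# Run the command sequence that matches the event type
case "$EVENT" in
    "$EVENT_TYPE")
        if [ -n "$COMMANDS" ]; then
            eval "$COMMANDS"
        fi
        ;;
    *)
        # Ignore other events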
        ;;
esac
chmod +x /etc/NetworkManager/dispatcher.d/pre-down.d/90-eth0-predown.sh
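If one of these scripts never seems to fire, its file permissions are worth checking first. NetworkManager expects dispatcher scripts to be executable regular files owned by root and not writable by group or others. The sketch below checks the mode bits only. `check_dispatcher_script` is a hypothetical helper, and `stat -c` assumes GNU coreutils, as shipped with Ubuntu.

```shell
#!/bin/bash
# Hypothetical check for a dispatcher script: regular file, executable,
# and neither group- nor other-writable. Ownership by root is NOT checked here.
check_dispatcher_script() {
    local file="$1" mode
    [ -f "$file" ] && [ -x "$file" ] || return 1
    mode=$(stat -c '%a' "$file")          # octal mode, e.g. 755 (GNU stat)
    # Succeed only if the group and other write bits (022) are both clear
    (( (8#$mode & 8#022) == 0 ))
}
```

A call such as `check_dispatcher_script /etc/NetworkManager/dispatcher.d/90-eth0-down.sh && echo ok` would then report whether the mode bits look acceptable.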

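It is best if everything run from COMMANDS is a systemd service, because that lowers the dependency risk; the self-service service is one example.

When the interface startup scripts are combined with the self-service service, the systemctl start self-service@vpls-wg0 command does run during system boot, but the call does not force the service to start immediately; it is handed over to systemd. Because self-service@.service is ordered after multi-user.target (After=) and multi-user.target is not yet ready at that point, the start job waits until multi-user.target is reached before the service actually starts, so no dependency problem arises.

Conversely, if the commands from wg0.conf were written directly into the interface startup script, they would fail with errors because the other interfaces are not up yet. In other words, the service dependency problem would remain.

One caveat: if the interface's IP address comes from DHCPv4 or DHCPv6, it is best to listen for dhcp4-change or dhcp6-change. Otherwise the service dependencies are still resolved automatically at boot, but the commands in the script may have no effect if the DHCP address arrives only after a long delay.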
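As an extra safeguard against the late-DHCP case, a dispatcher script could check that the interface really has an IPv4 address before starting anything that depends on it. NetworkManager exports `IP4_NUM_ADDRESSES` to dispatcher scripts; the `has_ipv4` helper and the messages below are assumptions made for this sketch.

```shell
#!/bin/bash
# Hypothetical guard for a dispatcher script: succeed only when NetworkManager
# reports at least one IPv4 address on the interface that triggered the event.
# IP4_NUM_ADDRESSES is set by NetworkManager; it is treated as 0 when absent.
has_ipv4() {
    [ "${IP4_NUM_ADDRESSES:-0}" -gt 0 ]
}

if has_ipv4; then
    echo "IPv4 address present, dependent services can be started"
else
    echo "no IPv4 address yet, skipping"
fi
```

In a real script the two `echo` lines would be replaced by the `eval "$COMMANDS"` call and an early `exit 0`, respectively.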