暂无图片
暂无图片
暂无图片
暂无图片
暂无图片

Linux 网卡队列、combined、irqbalance对网络性能的影响和优化

digoal 2019-11-20
2978

作者

digoal

日期

2019-11-20

标签

PostgreSQL , linux , combined , 网卡队列 , irqbalance


背景

中断被限制在某些单核处理成为瓶颈。

网卡没有开队列,高吞吐时产生瓶颈。qps可能从几十万降到几万。

这两种瓶颈如何解决?

中断不均匀问题:

《转载 - Linux软中断不均的调优》

《转载 - Linux 多核下绑定硬件中断到不同 CPU(IRQ Affinity)》

《转载 - 用户空间与内核空间,进程上下文与中断上下文[总结]》

```
也可以直接使用irqbalance解决

service irqbalance start
```

网卡队列问题:

通过ethtool -L 设置

```

ethtool -l eth0

Channel parameters for eth0:
Pre-set maximums:
RX: 0
TX: 0
Other: 0
Combined: 4 最大值4
Current hardware settings:
RX: 0
TX: 0
Other: 0
Combined: 1

设置为最大值4

ethtool -L eth0 combined 4

ethtool -l eth0

Channel parameters for eth0:
Pre-set maximums:
RX: 0
TX: 0
Other: 0
Combined: 4
Current hardware settings:
RX: 0
TX: 0
Other: 0
Combined: 4
```

性能对比

ecs 客户端 16c
ecs 数据库服务器 32c
相同机房

pgbench -i -s 1000 pgbench -M prepared -n -r -P 1 -c 192 -j 192 -T 120 -S

关闭网卡队列时,测试tpcb select only

transaction type: <builtin: select only> scaling factor: 1000 query mode: prepared number of clients: 192 number of threads: 192 duration: 120 s number of transactions actually processed: 34502551 latency average = 0.664 ms latency stddev = 0.134 ms tps = 287512.936228 (including connections establishing) tps = 288934.928415 (excluding connections establishing) statement latencies in milliseconds: 0.001 \set aid random(1, 100000 * :scale) 0.667 SELECT abalance FROM pgbench_accounts WHERE aid = :aid;

开启网卡队列(客户端4, 数据库端8)时,测试tpcb select only

transaction type: <builtin: select only> scaling factor: 1000 query mode: prepared number of clients: 192 number of threads: 192 duration: 120 s number of transactions actually processed: 55518366 latency average = 0.403 ms latency stddev = 0.520 ms tps = 462553.463542 (including connections establishing) tps = 467220.064086 (excluding connections establishing) statement latencies in milliseconds: 0.001 \set aid random(1, 100000 * :scale) 0.403 SELECT abalance FROM pgbench_accounts WHERE aid = :aid;

注意

测试机所在服务器,要开启网卡队列。

数据库所在服务器,也要开启网卡队列。

小结

网卡队列对高并发的业务非常重要。关闭网卡队列,cpu压不满,28.8万 qps。开启网卡队列,46.7万 qps, cpu基本耗尽。开启相比关闭性能差了将近一倍。

参考

man irqbalance

man ethtool

```
-L --set-channels
Changes the numbers of channels of the specified network device.

       rx N   Changes the number of channels with only receive queues.

       tx N   Changes the number of channels with only transmit queues.

       other N  
              Changes the number of channels used only for other purposes e.g. link interrupts or SR-IOV co-ordination.

       combined N  
              Changes the number of multi-purpose channels.
复制

```

《转载 - Linux软中断不均的调优》

《转载 - Linux 多核下绑定硬件中断到不同 CPU(IRQ Affinity)》

《转载 - 用户空间与内核空间,进程上下文与中断上下文[总结]》

PostgreSQL 许愿链接

您的愿望将传达给PG kernel hacker、数据库厂商等, 帮助提高数据库产品质量和功能, 说不定下一个PG版本就有您提出的功能点. 针对非常好的提议,奖励限量版PG文化衫、纪念品、贴纸、PG热门书籍等,奖品丰富,快来许愿。开不开森.

9.9元购买3个月阿里云RDS PostgreSQL实例

PostgreSQL 解决方案集合

德哥 / digoal's github - 公益是一辈子的事.

digoal's wechat

文章转载自digoal,如果涉嫌侵权,请发送邮件至:contact@modb.pro进行举报,并提供相关证据,一经查实,墨天轮将立刻删除相关内容。

评论