SSH login hangs: the cause may be MTU

Today, in a new environment, an ssh login hung and then failed with: Read from socket failed: Connection reset by peer.

With verbose output enabled, the client turned out to be stuck at expecting SSH2_MSG_KEX_ECDH_REPLY:

OpenSSH_6.6.1, OpenSSL 1.0.1f 6 Jan 2014
debug1: Reading configuration data /home/niko/.ssh/config
debug1: Reading configuration data /etc/ssh/ssh_config
debug1: /etc/ssh/ssh_config line 19: Applying options for *
debug1: Connecting to 207.148.74.211 [207.148.74.211] port 22.
debug1: Connection established.
debug1: identity file /home/niko/.ssh/id_rsa_xxx type -1
debug1: identity file /home/niko/.ssh/id_rsa_xxx-cert type -1
...
debug1: Enabling compatibility mode for protocol 2.0
debug1: Local version string SSH-2.0-OpenSSH_6.6.1p1 Ubuntu-2ubuntu2.8
debug1: Remote protocol version 2.0, remote software version OpenSSH_7.4
debug1: match: OpenSSH_7.4 pat OpenSSH* compat 0x04000000
debug1: SSH2_MSG_KEXINIT sent
debug1: SSH2_MSG_KEXINIT received
debug1: kex: server->client aes128-ctr hmac-sha1-etm@openssh.com none
debug1: kex: client->server aes128-ctr hmac-sha1-etm@openssh.com none
debug1: sending SSH2_MSG_KEX_ECDH_INIT
debug1: expecting SSH2_MSG_KEX_ECDH_REPLY
Read from socket failed: Connection reset by peer
rsync: connection unexpectedly closed (0 bytes received so far) [sender]
rsync error: unexplained error (code 255) at io.c(226) [sender=3.1.0]

The OpenSSH version in use:

dpkg -l | grep ssh
ii libssh-4:amd64 0.6.1-0ubuntu3.3 amd64 tiny C SSH library
ii openssh-client 1:6.6p1-2ubuntu2.8 amd64 secure shell (SSH) client, for secure access to remote machines
ii openssh-server 1:6.6p1-2ubuntu2.8 amd64 secure shell (SSH) server, for secure access from remote machines
ii openssh-sftp-server 1:6.6p1-2ubuntu2.8 amd64 secure shell (SSH) sftp server module, for SFTP access from remote machines
ii python-paramiko 1.10.1-1git1ubuntu0.1 all Make ssh v2 connections with Python (Python 2)
ii ssh-askpass-gnome 1:6.6p1-2ubuntu2.8 amd64 interactive X program to prompt users for a passphrase for ssh-add
ii ssh-import-id 3.21-0ubuntu1 all securely retrieve an SSH public key and install it locally
ii sshpass 1.05-1 amd64 Non-interactive ssh password authentication

Fix


After some Googling, this looked like it might be a known bug: Bug #1254085 “ssh fails to connect to VPN host - hangs at 'expec...” : Bugs : openssh package : Ubuntu.

So let's verify:

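The NIC's default MTU is 1500, which is the usual value on a LAN. Normally, the largest ICMP payload that can be sent is 1500 - 20 - 8 = 1472:
the IP header is at least 20 bytes, the ICMP header takes 8 bytes, and the remaining 1472 bytes are ICMP data.
Anything larger than 1472 therefore fails (when fragmentation is disabled). Below, the local eth1 has an MTU of 1500: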

-> % sudo ping baidu.com -s 1472 -c 3 -M do
PING baidu.com (220.181.57.216) 1472(1500) bytes of data.
1480 bytes from 220.181.57.216: icmp_seq=1 ttl=128 time=37.1 ms
1480 bytes from 220.181.57.216: icmp_seq=2 ttl=128 time=41.7 ms
1480 bytes from 220.181.57.216: icmp_seq=3 ttl=128 time=78.3 ms

--- baidu.com ping statistics ---
3 packets transmitted, 3 received, 0% packet loss, time 2002ms
rtt min/avg/max/mdev = 37.162/52.421/78.390/18.458 ms


niko@NOW [10:29:15 PM] [~/dev/shortcuts/shell-scripts/server]
-> % sudo ping baidu.com -s 1473 -c 3 -M do
PING baidu.com (220.181.57.216) 1473(1501) bytes of data.
ping: local error: Message too long, mtu=1500
ping: local error: Message too long, mtu=1500
ping: local error: Message too long, mtu=1500

--- baidu.com ping statistics ---
3 packets transmitted, 0 received, +3 errors, 100% packet loss, time 2010ms
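
The arithmetic behind the two ping runs above can be written as a small shell helper. This is only a sketch: the function name max_icmp_payload is made up for illustration, and it assumes an IPv4 header without options (20 bytes) and an 8-byte ICMP echo header.

```shell
# Largest ICMP echo payload that fits in one unfragmented IPv4 packet.
# Assumption: 20-byte IPv4 header (no options) + 8-byte ICMP header.
# max_icmp_payload is a hypothetical helper name, not a standard tool.
max_icmp_payload() {
    echo $(( $1 - 20 - 8 ))
}

max_icmp_payload 1500   # prints 1472, the value used with "ping -s" above
max_icmp_payload 1200   # prints 1172
```

The result is the value to pass to ping -s when testing a given MTU with -M do.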

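However, in the actual test with the server, sending a 1472-byte payload still reported an error. This suggests that some network device on the path to the remote server has an MTU below 1500 (or that a firewall cannot handle IP packet fragmentation).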
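
To narrow down how large a packet the path actually accepts, the same ping test can be repeated with a binary search over the payload size. The sketch below is not from the original troubleshooting session: the function names probe and find_max_payload are hypothetical, and it assumes the Linux iputils ping (the -M do, -s, -c and -W options) and a remote host that answers ICMP echo requests.

```shell
# probe SIZE HOST: succeeds if one echo request with SIZE payload bytes
# and fragmentation prohibited (-M do) gets a reply within 2 seconds.
# Assumption: iputils ping on Linux.
probe() {
    ping -c 1 -W 2 -M do -s "$1" "$2" >/dev/null 2>&1
}

# find_max_payload HOST [LOW] [HIGH]: binary search for the largest payload
# size in [LOW, HIGH] that passes probe. Prints -1 if no size passes.
find_max_payload() {
    host=$1
    lo=${2:-0}
    hi=${3:-1472}
    best=-1
    while [ "$lo" -le "$hi" ]; do
        mid=$(( (lo + hi) / 2 ))
        if probe "$mid" "$host"; then
            best=$mid            # mid fits; try larger sizes
            lo=$(( mid + 1 ))
        else
            hi=$(( mid - 1 ))    # mid is too large; try smaller sizes
        fi
    done
    echo "$best"
}

# Example (needs network access):
#   find_max_payload 207.148.74.211
```

Adding 28 (IP header plus ICMP header) to the printed payload size gives an estimate of the usable path MTU. If the host or a firewall drops ICMP entirely, every probe fails and the result says nothing about the MTU.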

So we can try lowering the MTU to verify:

sudo ip link set mtu 1200 dev ethX

or

sudo ifconfig ethX mtu 1200

After this change, ssh login worked normally...
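
Before experimenting with MTU values it helps to record the current one, since a value set with ip link or ifconfig lasts only until the interface is reconfigured or the machine reboots. A minimal sketch, assuming a Linux system that exposes interface settings under /sys/class/net; the helper name iface_mtu and the SYSFS_NET variable are hypothetical and exist only for this illustration:

```shell
# Directory where Linux exposes per-interface settings.
# SYSFS_NET is a hypothetical override used only in this sketch.
SYSFS_NET=${SYSFS_NET:-/sys/class/net}

# iface_mtu IFACE: print the current MTU of the given interface.
iface_mtu() {
    cat "$SYSFS_NET/$1/mtu"
}

# Example: remember the old value, test with a lower one, then restore it.
#   old=$(iface_mtu ethX)
#   sudo ip link set mtu 1200 dev ethX
#   ... test ssh ...
#   sudo ip link set mtu "$old" dev ethX
```

ip link show dev ethX prints the same MTU value along with the other link attributes.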
