Tuesday, June 22, 2010

Linux - SSH Login Slow

This post sheds some light on one issue I recently saw on one of our servers. I strongly feel, though, that you should have a very good understanding of how SSH works before you start troubleshooting anything on SSH.

Recently I got a complaint about one server 'being slow'. I opened a session to the server and went to grab a cup of coffee. Back at my seat, I did a general health check, went through the logs and so on, but could not find the box behaving oddly. I replied to the user asking him to recheck whether things had settled down, and logged out. Surprisingly, he complained again that the 'server is slow'. I logged in again, and this time I saw it: after you entered the password, the server took some 15-30 seconds to give you a shell. That was what the user meant by 'server is slow'. Once you actually understand the problem (WHAT), the solution (HOW) is easy!!

Luckily my first suspicion, the network settings, was correct. DNS was off my radar as I was connecting with a raw IP and not a hostname. Yes! I was correct! It was a wrong gateway issue.

On the problematic server I looked up the default gateway (GW) and, surprisingly, found two entries there:

[root@server5 ~]# netstat -rn
Kernel IP routing table
Destination     Gateway         Genmask         Flags   MSS Window  irtt Iface
192.168.2.0     0.0.0.0         255.255.255.0   U         0 0          0 eth0
169.254.0.0     0.0.0.0         255.255.0.0     U         0 0          0 eth0
0.0.0.0         192.168.2.7     0.0.0.0         UG        0 0          0 eth0
0.0.0.0         192.168.2.6     0.0.0.0         UG        0 0          0 eth0
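
Spotting a duplicate default route by eye is easy on one box, but it can also be scripted. The following is a minimal sketch, assuming `netstat -rn` style numeric output on stdin; the function names `default_gateways` and `has_single_default` are hypothetical helpers, not part of any tool.

```shell
#!/usr/bin/env bash
# Sketch only: both function names are made up for illustration.

# Print the gateway of every default route found in `netstat -rn` style
# output read from stdin. A default route has destination 0.0.0.0 and
# carries the G (gateway) flag in the fourth column.
default_gateways() {
    awk '$1 == "0.0.0.0" && $4 ~ /G/ { print $2 }'
}

# Succeed (exit 0) only when exactly one default route is present.
has_single_default() {
    awk '$1 == "0.0.0.0" && $4 ~ /G/ { n++ } END { exit (n == 1 ? 0 : 1) }'
}
```

Typical use would be `netstat -rn | default_gateways`, or `netstat -rn | has_single_default || echo "check default routes"`. Note that this only handles numeric (`-n`) output, where the default route is shown as 0.0.0.0 rather than the word "default".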

[root@server5 ~]# ping 192.168.2.7
PING 192.168.2.7 (192.168.2.7) 56(84) bytes of data.

--- 192.168.2.7 ping statistics ---
2 packets transmitted, 0 received, 100% packet loss, time 1000ms

So the first GW itself is wrong and does not exist.

[root@server5 ~]# ping 192.168.2.6
PING 192.168.2.6 (192.168.2.6) 56(84) bytes of data.
64 bytes from 192.168.2.6: icmp_seq=1 ttl=255 time=1.48 ms
64 bytes from 192.168.2.6: icmp_seq=2 ttl=255 time=1.53 ms


I quickly checked the correct GW on some other servers and found that neither of these two was actually valid, which our network team confirmed. So I decided to change the default GW on the problematic server.

[root@server5 ~]# route del default

[root@server5 ~]# netstat -rn
Kernel IP routing table
Destination     Gateway         Genmask         Flags   MSS Window  irtt Iface
192.168.2.0     0.0.0.0         255.255.255.0   U         0 0          0 eth0
169.254.0.0     0.0.0.0         255.255.0.0     U         0 0          0 eth0
0.0.0.0         192.168.2.6     0.0.0.0         UG        0 0          0 eth0

[root@server5 ~]# route del default

[root@server5 ~]# netstat -rn
Kernel IP routing table
Destination     Gateway         Genmask         Flags   MSS Window  irtt Iface
192.168.2.0     0.0.0.0         255.255.255.0   U         0 0          0 eth0
169.254.0.0     0.0.0.0         255.255.0.0     U         0 0          0 eth0

Added a new valid gateway:

[root@server5 ~]# route add default gw 192.168.2.8

[root@server5 ~]# route
Destination     Gateway         Genmask         Flags Metric Ref    Use Iface
192.168.2.0     *               255.255.255.0   U     0      0        0 eth0
169.254.0.0     *               255.255.0.0     U     0      0        0 eth0
default         192.168.2.8     0.0.0.0         UG    0      0        0 eth0

SSH login looks OK now.
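
For reference, the same swap can be written with the iproute2 `ip route` command, which lets you name the exact gateway to delete instead of removing whichever default comes first. The sketch below only prints the commands so they can be reviewed before running them as root; `plan_gateway_swap` is a hypothetical helper name, and the addresses are the ones from this post.

```shell
#!/usr/bin/env bash
# Sketch only: plan_gateway_swap is a made-up helper. It prints commands
# and never changes the routing table itself.

# Usage: plan_gateway_swap NEW_GW OLD_GW [OLD_GW...]
# Prints one `ip route del` per stale gateway, then one `ip route add`.
plan_gateway_swap() {
    local new_gw="$1" old_gw
    shift
    for old_gw in "$@"; do
        printf 'ip route del default via %s\n' "$old_gw"
    done
    printf 'ip route add default via %s\n' "$new_gw"
}
```

Calling `plan_gateway_swap 192.168.2.8 192.168.2.7 192.168.2.6` prints the three commands for this server. Either way, routes changed with `route` or `ip route` last only until the next reboot or network restart, so the gateway also has to be corrected in the distribution's network configuration.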

My aim in writing this post was not to stop here. Rather, it was to analyze how the wrong GW slowed the login process, to revisit how SSH actually works, and to show how important it is to have the right gateway, even though you might have multiple (kind of) capable GWs in your network. I was accessing this server from outside. What I mean is that, for the reply packets to flow back, a default gateway was needed, as I was on another subnet/network. So this server was actually trying to get through the first gateway, 192.168.2.7, which gave us a request timeout (RTO) on ping. The second one, 192.168.2.6, was somehow capable of passing traffic, but it has a NAT to an ISP vendor whose contract had already expired and could have gone away any time in a day or two; somehow the right gateway was never added to this server. I could not test and validate it, BUT I am almost sure that this issue would not have shown up if I had been doing an SSH from the local network.
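
The return-path argument can be sketched as a simple subnet check: a reply to a client on the server's own subnet is delivered directly, while a reply to any other client has to go through the default gateway. This is a rough illustration in bash, not how the kernel picks routes; the helper names and the example addresses 192.168.2.50 (server) and 10.1.1.5 (client) are assumptions, since the post does not give the real ones.

```shell
#!/usr/bin/env bash
# Sketch only: helper names and example addresses are hypothetical.

# Convert a dotted-quad IPv4 address (no leading zeros) to an integer.
ip_to_int() {
    local IFS=.
    set -- $1          # split the address on dots into $1..$4
    echo $(( ($1 << 24) + ($2 << 16) + ($3 << 8) + $4 ))
}

# Usage: needs_gateway CLIENT_IP SERVER_IP NETMASK
# Succeeds (exit 0) when the client is outside the server's subnet, so
# replies to it must be sent via the default gateway.
needs_gateway() {
    local client server mask
    client=$(ip_to_int "$1")
    server=$(ip_to_int "$2")
    mask=$(ip_to_int "$3")
    [ $(( client & mask )) -ne $(( server & mask )) ]
}
```

With the 255.255.255.0 mask from the routing table, `needs_gateway 10.1.1.5 192.168.2.50 255.255.255.0` succeeds (the reply depends on the default GW), while a client such as 192.168.2.99 on the local network would not need the gateway at all.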

Earlier I ran into the slow SSH login issue on two more occasions: once it was an X-Forwarding issue and the other time a DNS look-up issue. Do leave a comment below if you have faced this issue because of some other factor.
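
Those two earlier causes map onto settings in the server's sshd_config: `UseDNS` controls whether sshd does a DNS lookup on the connecting client, and `X11Forwarding` controls X forwarding. Here is a minimal sketch for checking them, assuming the common `Keyword value` form; `sshd_option` is a hypothetical helper, and it ignores `Match` blocks and the `Keyword=value` spelling.

```shell
#!/usr/bin/env bash
# Sketch only: sshd_option is a made-up helper name.

# Usage: sshd_option KEYWORD < /etc/ssh/sshd_config
# Prints the first value set for KEYWORD (sshd uses the first value it
# finds, and keywords are case-insensitive), or "unset" when the keyword
# appears only in comments or not at all.
sshd_option() {
    awk -v key="$1" '
        /^[[:space:]]*#/ { next }                        # skip comments
        tolower($1) == tolower(key) { print $2; found = 1; exit }
        END { if (!found) print "unset" }
    '
}
```

For example, `sshd_option UseDNS < /etc/ssh/sshd_config`. "unset" means the compiled-in default applies, which differs between OpenSSH versions; on a live server, running `sshd -T` as root prints the effective configuration and is the more reliable check.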



Cheers!!!