Tuesday, August 19, 2008

Simple Network Tuning for the Mainframe Sysprog

There have been some interesting postings on the VSE-L list this week about throughput to Windows-based PCs, so I thought I would cover some basic network tuning points.

First, use Ethereal/Wireshark (a free, open source packet sniffer) to capture a trace of the data transfer (e.g., FTP) you are interested in tuning. The trace shows the packet sizes, TCP window sizes, and ACK timing that the rest of this post looks at.

Now look at the size of the packets being transferred. In the old days of the Internet (dial-up days) it was common to use packets of 576 bytes. In fact, all TCP/IP stacks are required to handle this size packet. However, now that Ethernet is pretty much the standard, look for 1500 byte packets. 1500 bytes includes 40 bytes of headers (a 20 byte IP header plus a 20 byte TCP header, assuming no options) and 1460 bytes of data. I am ignoring the 18 bytes of Ethernet header and trailer in these numbers.
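The header arithmetic above can be sketched in a few lines of Python. This is only an illustration: the function name mss_for_mtu is my own, and it assumes IPv4 and TCP headers of 20 bytes each with no options.

```python
# Assumed header sizes: IPv4 and TCP headers without options.
IP_HEADER = 20   # bytes
TCP_HEADER = 20  # bytes


def mss_for_mtu(mtu):
    """Return the TCP data bytes that fit in one packet of the given MTU."""
    return mtu - IP_HEADER - TCP_HEADER


# 576 = old dial-up size, 1500 = Ethernet, 9000 = typical jumbo frame
for mtu in (576, 1500, 9000):
    print(f"MTU {mtu:5d} -> MSS {mss_for_mtu(mtu):5d}")
```

If the packet sizes in your trace do not line up with one of these values, that is a hint to check the MTU and MSS settings.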

If the two hosts (mainframe and PC) have Gigabit Ethernet adapters AND they are connected to a Gigabit switch, then you might expect to see jumbo Ethernet frames, provided jumbo frames are enabled on both adapters and on the switch. Jumbo Ethernet frames are usually 9000 bytes.

In a bulk data transfer you will normally see a bunch of packets of the same size. If this size is not 1500 bytes then either the MTU size is set incorrectly or the MSS (Maximum Segment Size) is not correct. Usually it is the MSS that is incorrect. When two hosts establish a connection (socket), each host sends what it wants to use as a segment size. The smallest value wins! Windows PCs like to use a segment size of 536 for some types of applications (e.g., web servers). If the MSS is set to 536 bytes it will take about 2.7 of these smaller packets (1460 / 536) to send as much data as can be contained in a single packet with an MSS of 1460 bytes.
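To put a number on the cost of a small MSS, here is a rough Python sketch. It follows the "smallest value wins" rule of thumb from the paragraph above; the function names and the 10 MB payload are made up for illustration.

```python
import math


def negotiated_mss(mss_a, mss_b):
    """Rule of thumb from the text: the smaller advertised MSS is used."""
    return min(mss_a, mss_b)


def segments_needed(payload_bytes, mss):
    """Number of TCP segments needed to carry payload_bytes of data."""
    return math.ceil(payload_bytes / mss)


payload = 10 * 1024 * 1024  # hypothetical 10 MB transfer

mss = negotiated_mss(1460, 536)  # mainframe offers 1460, PC offers 536
small = segments_needed(payload, mss)
large = segments_needed(payload, 1460)

print(f"negotiated MSS: {mss}")
print(f"segments at MSS {mss}: {small}")
print(f"segments at MSS 1460: {large}")
print(f"ratio: {small / large:.2f}")
```

Every one of those extra segments carries its own 40 bytes of headers and has to be processed by both stacks.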

The next item to look for is the TCP window size. This value should be something close to 64K minus one (65535). Windows always wants to use a TCP window size that is an exact multiple of the segment size. With an MSS of 1460 you can get 44 segments in a 64K TCP window. The actual size is 1460 * 44 = 64240 (x'FAF0').
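The window arithmetic is easy to check for other segment sizes. A minimal Python sketch, assuming the 65535 byte limit mentioned above (the function name aligned_window is my own):

```python
def aligned_window(mss, limit=65535):
    """Return (segments, window) for the largest multiple of mss <= limit."""
    segments = limit // mss
    return segments, segments * mss


for mss in (536, 1460):
    segments, window = aligned_window(mss)
    print(f"MSS {mss}: {segments} segments, window {window} ({hex(window)})")
```

For an MSS of 1460 this gives the 44 segments and 64240 byte window you should see advertised in the trace.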

These two values are set by registry entries and there are various scripts and utilities available around the net that will set these values for you.

Another source of throughput problems is TcpAckFrequency.

When Windows receives a packet it will wait for 200ms for another packet to arrive before sending an ACK. Why? Windows hopes that it can send a single ACK for both packets. Why does this slow down a transfer? 200ms is a very long time at network speeds. Why would this come into play at all for an FTP? If a host is sending data and has sent one packet but there is not enough space available in the TCP Window to send the next packet, the sender will wait for an ACK of the data already sent. At the same time, Windows is waiting for 200ms for another packet to come in. This does not have to happen very often for these 200ms delays to add up.
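To see how quickly those delays add up, here is a deliberately simplified Python model. It assumes a fixed link speed and simply adds 200ms for every stall; it ignores latency, slow start, and everything else a real stack does. The link speed, payload size, and stall counts are hypothetical.

```python
DELAYED_ACK_SECONDS = 0.200  # the 200ms delayed ACK timer described above


def transfer_time(payload_bytes, link_bytes_per_sec, stalls):
    """Seconds on the wire plus 200ms for each delayed ACK stall."""
    return payload_bytes / link_bytes_per_sec + stalls * DELAYED_ACK_SECONDS


payload = 100 * 1024 * 1024   # hypothetical 100 MB file
link = 100_000_000 / 8        # hypothetical 100 Mbit/s link, in bytes/sec

for stalls in (0, 10, 100, 1000):
    seconds = transfer_time(payload, link, stalls)
    print(f"{stalls:5d} stalls -> {seconds:7.1f} seconds")
```

Even in this crude model, a few hundred stalls cost far more time than moving the data itself.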

A couple of good web pages to look at are:
http://smallvoid.com/article/winnt-nagle-algorithm.html
http://support.microsoft.com/kb/Q328890

If you follow the instructions on these web pages you can add a registry entry for TcpAckFrequency (type REG_DWORD) with a value of 1. This makes Windows acknowledge every packet immediately instead of waiting on the 200ms delayed ACK timer. I have seen disabling delayed ACKs result in a 10x improvement in throughput. Your mileage will vary!

The last two items to look at for data transfer throughput are your CPU and memory. If data transfer rates are important, get the fastest CPU you can afford and the maximum amount of memory your motherboard will support. If Windows has lots of memory it can cache the data from the transfer in memory and not spend time writing the data to disk until the transfer has finished.

Well, there you have basic network tuning for the mainframe system programmer.