
Xen Network Performance Problem - TX Checksumming

Wednesday, August 19th, 2009

I was stung by this again today. Two virtual servers on the same hardware trying to communicate with each other. One running apache2 & php, and the other a MySQL database server. This happened just after a server reboot.

CPU, disk I/O and memory usage were all minimal on both servers and on the host, and yet the LAMP application was performing like a dog with one leg.

Fortunately I’ve seen this before, and (after banging my head on my keyboard a few times) I recognised the problem. Xen networking.

Basically, when the operating system sends a packet through the network, it computes a checksum of the data so the recipient can tell if it arrived intact. On a physical machine, the operating system offloads this operation to the network card, as its chipset can handle it. In a virtual machine, however, there is no physical card to hand the checksumming off to - the network card is just an abstraction in software - so leaving the offload switched on is terribly inefficient.

This problem only affects two virtual machines on the same hardware talking over the virtual network. [correction: Actually, it affects all network traffic, it’s just more noticeable between two adjacent VMs] So, not usually a problem, but when the application server needs to talk to its database server and they both happen to be on the same hardware, it makes a huge performance difference.

To fix, use ethtool. First, check the settings (do this on each domU):


# ethtool -k eth0
Offload parameters for eth0:
Cannot get device rx csum settings: Operation not supported
Cannot get device udp large send offload settings: Operation not supported
rx-checksumming: off
tx-checksumming: on
scatter-gather: on
tcp segmentation offload: off
udp fragmentation offload: off
generic segmentation offload: off

TX checksumming is on. Turn it off:

ethtool -K eth0 tx off

And verify the result:


# ethtool -k eth0
Offload parameters for eth0:
Cannot get device rx csum settings: Operation not supported
Cannot get device udp large send offload settings: Operation not supported
rx-checksumming: off
tx-checksumming: off
scatter-gather: off
tcp segmentation offload: off
udp fragmentation offload: off
generic segmentation offload: off
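
If you’d rather check this from a script than by eye, something like the following might do. It is only a sketch: it assumes the ethtool -k output looks like the listing above, and tx_csum_state is a name I made up for illustration, not anything ethtool provides.

```shell
# tx_csum_state: read `ethtool -k` output on stdin and print the state
# of TX checksumming ("on" or "off").
# Hypothetical helper, not part of ethtool.
tx_csum_state() {
    awk -F': *' '$1 == "tx-checksumming" {
        sub(/ .*/, "", $2)   # drop anything after the first word of the state
        print $2
        exit
    }'
}
```

On a domU you would call it as: ethtool -k eth0 | tx_csum_state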

I figured this out before, and fixed it, but when I rebooted the server the change was lost. I’ve now fixed it permanently in /etc/rc.local (you could also do this in a post-up network script, although rc.local will run after networking is up anyway).
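
For the record, here is roughly what such an rc.local fragment could look like. Treat it as a sketch rather than the exact lines on my servers: IFACES, DRY_RUN and disable_tx_csum are names invented for this example, and it assumes ethtool is installed and the interface is called eth0.

```shell
# Sketch of an /etc/rc.local fragment: turn TX checksum offload off at boot.
# IFACES, DRY_RUN and disable_tx_csum are illustrative names only.
IFACES="${IFACES:-eth0}"

disable_tx_csum() {
    # $1 = interface name
    if [ -n "${DRY_RUN:-}" ]; then
        # Dry run: print the command instead of running it.
        echo "ethtool -K $1 tx off"
    else
        ethtool -K "$1" tx off
    fi
}

for iface in $IFACES; do
    # Log a failure but carry on, so one bad interface does not abort boot.
    disable_tx_csum "$iface" || echo "rc.local: could not change offload on $iface" >&2
done
```

On Debian-style systems the stock rc.local ends with exit 0, so the fragment needs to go above that line. The post-up alternative would be a single line such as post-up ethtool -K eth0 tx off under the interface’s stanza in /etc/network/interfaces.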

I’ve already put this into the configuration management system (we’re using puppet - I’ll make that the topic of another post), but these are some old VMs that are not yet automated.

So - fixed. And after a hardware upgrade, the servers are now humming along even better than before.

/proc/sys/net/huh?

Saturday, January 13th, 2007

You’ll often come across docs and how-tos that say things like “to enable forwarding issue the following command”:

echo 1 > /proc/sys/net/ipv4/ip_forward

Ever wondered what all that stuff in /proc/sys/net actually does? Ok, a lot of it is pretty logical, but sometimes it’s nice to actually know with a bit more certainty. Today I broke something on a server because I assumed, instead of looking it up. Oops.
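
One cheap habit that would have helped: read the current value before writing a new one. The files under /proc/sys also map onto the dotted key names that the sysctl command uses. The snippet below is a small sketch of both ideas; proc_to_sysctl_key is a made-up helper and the mapping is deliberately naive.

```shell
# proc_to_sysctl_key: turn a /proc/sys path into the dotted key that
# sysctl(8) uses. Hypothetical helper; a naive mapping that assumes no
# path component contains a dot (e.g. a VLAN interface such as eth0.100).
proc_to_sysctl_key() {
    printf '%s\n' "${1#/proc/sys/}" | tr '/' '.'
}

f=/proc/sys/net/ipv4/ip_forward
proc_to_sysctl_key "$f"    # prints net.ipv4.ip_forward

# Show the current value, if this system exposes the file.
if [ -r "$f" ]; then
    cat "$f"
fi
```

With the key in hand, sysctl net.ipv4.ip_forward reads the value and sysctl -w net.ipv4.ip_forward=1 does the same job as the echo above.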

There’s a lot of documentation in the kernel sources which is surprisingly accessible to the non-kernel-hackers among us. First, get yourself a copy of the kernel source if you don’t have one. Take a look in /usr/src. If you don’t see a directory called something like linux-2.6.18, you probably don’t have the kernel source available. If you’re on Debian, Ubuntu or another apt-based distro, you can apt-get the source for your kernel:

cd /usr/src
apt-get source linux-image-2.6.18-3-k7
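
If you don’t know the exact package name, you can usually build it from the running kernel’s release string. This is a sketch under a couple of assumptions: that your image package follows the linux-image-<release> naming used above, and that you have deb-src lines in your apt sources. kernel_source_pkg is a name invented for the example, and some newer releases map the image package to a different source package, so check what apt offers before trusting it.

```shell
# kernel_source_pkg: print the linux-image package name for a kernel
# release string; defaults to the running kernel (uname -r).
# Hypothetical helper, not an apt feature.
kernel_source_pkg() {
    printf 'linux-image-%s\n' "${1:-$(uname -r)}"
}

kernel_source_pkg 2.6.18-3-k7    # prints linux-image-2.6.18-3-k7
```

You would then run: apt-get source "$(kernel_source_pkg)"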

Once your kernel source has downloaded and unpacked, cd into the source directory. You’ll find a directory called Documentation, and inside that a subdirectory called networking. The document we’re looking for in this case is ip-sysctl.txt. Open it in your favorite text editor.

  
/proc/sys/net/ipv4/* Variables:

ip_forward - BOOLEAN
        0 - disabled (default)
        not 0 - enabled

        Forward Packets between interfaces.

        This variable is special, its change resets all configuration
        parameters to their default state (RFC1122 for hosts, RFC1812
        for routers)
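
If you only want one entry rather than the whole file, a little awk will pull it out. This is a sketch that assumes the layout shown above, where each entry starts with an unindented "name - TYPE" line and its description is indented underneath. sysctl_doc_entry is a made-up name.

```shell
# sysctl_doc_entry: print the documentation entry for one variable.
#   $1 = variable name (e.g. ip_forward)
#   $2 = path to ip-sysctl.txt
# Hypothetical helper; relies on entries starting with an unindented
# "name - TYPE" header followed by indented description lines.
sysctl_doc_entry() {
    awk -v name="$1" '
        $0 ~ ("^" name " - ") { printing = 1; print; next }  # entry header
        printing && /^[^ \t]/  { exit }                       # next unindented line ends it
        printing               { print }
    ' "$2"
}
```

From the top of the kernel source tree you would call it as: sysctl_doc_entry ip_forward Documentation/networking/ip-sysctl.txt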

  

Have a browse around - there’s quite a bit of other doco. There’s an index file, 00-INDEX, that lists what’s what.