Archive for the 'Xen' Category

Xen Network Performance Problem - TX Checksumming

Wednesday, August 19th, 2009

I was stung by this again today. Two virtual servers on the same hardware trying to communicate with each other. One running apache2 & php, and the other a MySQL database server. This happened just after a server reboot.

CPU, disk I/O and memory usage were all minimal on both servers and on the host, and yet the LAMP application was performing like a dog with one leg.

Fortunately I’ve seen this before, and (after banging my head on my keyboard a few times) I recognised the problem. Xen networking.

Basically, when the operating system sends a packet through the network, it computes a checksum of the data so the recipient can tell if it arrived intact. On a physical machine, the operating system offloads this operation to the network card, whose chipset can do the work in hardware. In a virtual machine, however, there is no physical card to hand the checksumming off to: the network card is just an abstraction in software, so leaving the offload switched on is terribly inefficient.

This problem only affects two virtual machines on the same hardware talking over the virtual network. [correction: Actually, it affects all network traffic; it’s just more noticeable between two adjacent VMs] So, not usually a problem, but when the application server needs to talk to its database server and they both happen to be on the same hardware, it makes a huge performance difference.

To fix, use ethtool. First, check the settings (do this on each domU):


# ethtool -k eth0
Offload parameters for eth0:
Cannot get device rx csum settings: Operation not supported
Cannot get device udp large send offload settings: Operation not supported
rx-checksumming: off
tx-checksumming: on
scatter-gather: on
tcp segmentation offload: off
udp fragmentation offload: off
generic segmentation offload: off

TX checksumming is on. Turn it off:

# ethtool -K eth0 tx off

And verify the result:


# ethtool -k eth0
Offload parameters for eth0:
Cannot get device rx csum settings: Operation not supported
Cannot get device udp large send offload settings: Operation not supported
rx-checksumming: off
tx-checksumming: off
scatter-gather: off
tcp segmentation offload: off
udp fragmentation offload: off
generic segmentation offload: off

I figured this out before, and fixed it, but when I rebooted the server the change was lost. I’ve now fixed it permanently in /etc/rc.local (you could also do this in a post-up network script, although rc.local will run after networking is up anyway).
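
The rc.local entry can be as small as the single ethtool command shown earlier. A slightly more defensive sketch follows, assuming eth0 is the only interface that needs the fix. The ETHTOOL and IFACES variables and the disable_tx_csum function are names invented for this example (they are not part of ethtool or Xen); they only make it easier to dry-run the snippet or to cover more than one interface.

```shell
#!/bin/sh
# Sketch for /etc/rc.local: turn off TX checksum offload at boot.
# ETHTOOL and IFACES are hypothetical override variables for this example.
ETHTOOL="${ETHTOOL:-ethtool}"   # command to run; set to "echo" for a dry run
IFACES="${IFACES:-eth0}"        # space-separated interfaces (eth0 is an assumption)

disable_tx_csum() {
    # $1 is the interface name. A failure is reported on stderr
    # but does not abort the rest of rc.local.
    "$ETHTOOL" -K "$1" tx off || echo "warning: could not disable tx checksumming on $1" >&2
}

for iface in $IFACES; do
    disable_tx_csum "$iface"
done
```

Setting ETHTOOL=echo before running the snippet prints the arguments that would have been passed to ethtool instead of changing anything. If your rc.local ends with exit 0, the snippet has to go above that line.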

I’ve already put this into the configuration management system (we’re using puppet - I’ll make that the topic of another post), but these are some old VMs that are not yet automated.

So - fixed. And after a hardware upgrade, the servers are now humming along even better than before.

Gladserv

Friday, December 1st, 2006

I notice Google has picked up this blog recently so I guess I’d better start writing in it. Drafts of various articles have been underway for a while, but I’ve had little time to finish them. I expect I’ll be writing more over the Christmas break. Last year I spent considerable time testing and reviewing Open Source LAMP apps between eating various roasted animals and consuming vast quantities of alcohol. Bliss.

Gladserv.com is a step or two closer to being launched as a business hosted services provider. The domains are registered, the website is coming together, the second dedicated server has been ordered from Bytemark. An earlier order from UK2 was aborted when I discovered just how difficult they were to contact. Take a look at “Why Not To Use UK2” if you’re seriously considering them - cheap has more than one meaning. This server will be split into several virtual machines (VMs) using Xen with unused VMs sold off - there are already three other businesses on board.

I went to see the bank yesterday and my bank manager actually told me he thought my revenue estimates for the first year were very conservative. I tend to estimate on the side of caution these days, after previous bitter experience. This project is definitely gathering momentum.

I’m starting to promote the site. At the speed Google moves, I think it best to link first and write afterwards. I’ve put the shell of the site together using Website Baker, which is probably the easiest Content Management System (CMS) to set up and use I’ve come across. Graphics and pretty stuff will follow when someone with more visual talent than I provides them.

For the moment I have no need for the kind of fancy frippery that something like Joomla has built in. I usually spend the first hour on a new Joomla site turning everything off. For a simple business site, Website Baker has everything needed to get off the ground without additional distractions. There are some addons available to perform most commonly required functions, but nothing like the bewildering range of Joomla toys. Maybe later.

Yesterday I bought an incoming phone number from Gradwell and pointed it at an old Asterisk installation on my backup server. I’ve never used Gradwell for VoIP services before, but they came highly recommended to me so I thought I’d try them out. I’ve had less success with some other providers in the past. No problems at all so far. Online signup was straightforward. At one point I needed to phone for an authorisation code. At 17:30 they answered the phone within a few rings and dealt with it on the spot. Provisioning of the line was immediate.

Asterisk setup is a topic for another day, but adding a new number to an existing setup is trivial. Add a few lines like this to iax.conf:

[08708618861]
type=user
username=myusername
secret=mypassword
context=iax-in
host=dynamic

and a line in extensions.conf to tell Asterisk where to direct incoming calls:

[iax-in]
exten => 08708618861,1,Goto(gladserv,s,1)

Easy. No need to get a man in at all.