Saturday, December 5, 2009

Transforming Parenting

Rocker ran out of batteries way too often.

I knew I kept those old DC transformers for a reason.

Monday, August 10, 2009

Making the switch

Well that's funny.

Just a few hours after my last post, which suggested that virtio based networking might be getting bested by the not-in-userspace v-bus, Michael Tsirkin posts an in-kernel backend to virtio. Which puts the two on more or less the same procedural footing.

Fire up the benchmarks?

(not) switching contexts.

A lot of bits have been spilled over virtual network performance in v-bus vs virtio-net/virtio-pci (aka AlacrityVM vs traditional kvm/qemu). This includes some pretty sensational(ist?) performance graphs: here.

There are lots of details (and details do matter) but the first-order issue can probably be summed up thusly, from Avi Kivity on lkml:

The current conjecture is that ioq outperforms virtio because the host side of ioq is implemented in the host kernel, while the host side of virtio is implemented in userspace.

Perhaps context switching isn't such a minor detail after all.

Tuesday, July 21, 2009

Caution - (S)Low Bridge Ahead

This post will not be satisfying. Someone has posted some great datapoints about virtualized packet forwarding, which is great. But they don't make a lot of sense. Which is not great. Nor is it satisfying.

Oh well, I'm sure there will be a followup sometime in the future.

In this thread, Or Gerlitz posts a new networking type for qemu (and by extension kvm), which are of course popular Linux host virtualization packages. The networking type is "raw" and the driver couldn't be simpler: a (v)lan interface on the host is opened with an AF_PACKET socket and all of the packets that appear there are shoved through to the guest interface, and vice versa.

This is a pretty direct way of doing things, but it has the unfortunate side effect that all of the guests and the host itself are aggregated onto one upstream switch port without any kind of bridge, switch, or router in between. This means that unless the upstream switch can do a u-turn when forwarding (and most of them will not), all of the guests and the host are isolated from each other. The normal way of doing things is to attach the guests and host together with a tun/tap socket and run a bridge on the host. This bridge will do all the necessary forwarding so that everybody has full connectivity, and it lets you run iptables and ebtables on the host to boot.

That's all well and good, but the really interesting part was the motivation for running around tun/tap/bridge anyhow: the poster runs a test with short udp transmissions over GigE. Running it between two real (non-vm) hosts he sees 450K packets per second. The post doesn't mention what hardware is involved, so we'll just take it as a black box baseline. Switching the sender to be a qemu guest with traditional tap/bridge networking, it plummets to just 195K. The "raw" interface gets that back up to 240K - which is still a far cry from 450, eh?
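To put those rates in perspective, it can help to turn them into a per-packet time budget. The sketch below is only arithmetic on the three figures quoted from the thread; the function and variable names are mine, not anything from the posts.

```python
def per_packet_microseconds(packets_per_second):
    """Average time available per packet at a given packet rate."""
    return 1_000_000 / packets_per_second

# Rates quoted in the thread, in packets per second.
rates = {"bare metal": 450_000, "tap/bridge": 195_000, "raw": 240_000}

budget = {name: per_packet_microseconds(pps) for name, pps in rates.items()}

# Extra time spent on each packet relative to the non-virtualized baseline.
overhead = {name: budget[name] - budget["bare metal"] for name in rates}
```

That works out to roughly 2.2 microseconds per packet on real hardware, 5.1 with tap/bridge, and 4.2 with raw. So raw mode only wins back about 1 microsecond of the roughly 2.9 microseconds per packet that virtualization added.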

Tap mode has three times as many context switches as the raw version. I don't think I saw a number for the non-vm test. Other than that nothing, including the profiles, really jumps out.

The whole thread is worth reading - but the main data points are here and here

Monday, July 20, 2009

What is eating those Google SYN-ACKs?

In this post, I mentioned google was seeing huge packet loss on syn-acks from their servers. At the time it looked like 2%. That sounded nuts.

It still sounds nuts.

Someone else on the mailing list posted about that, and Jerry Chu of Google confirmed it:

Our overall pkt retransmission rate often goes over 1%. I was
wondering if SYN/SYN-ACK pkts are less likely to be dropped
by some routers due to their smaller size so we collected traces
and computed SYN-ACK retransmissions rate on some servers.
We confirmed it to be consistent with the overall pkt drop rate,
i.e., > 1% often.

You could imagine why the overall retransmission rate might be higher than the real drop rate due to jitter and various fast retransmit algorithms that might retransmit things that just hadn't been acknowledged quite yet. Even SYNs might be dropped at the host (instead of the network) due to queue overflows and such.. but we're talking about SYN-ACKs from busy servers towards what one would expect would be pretty idle google-searching clients. And these SYN-ACKs have giant timeouts (3 seconds - which is why Jerry was writing in the first place) so it certainly isn't a matter of over-aggressive retransmit. The only explanation seems to be packet loss. At greater than 1%.


This probably has more to do with the global nature of google's audience than anything else. But still, TCP can really suck at loss rates that high. It must be very different from the desktop Internet I know (which is a fair-to-middlin cable service, not one of the fancy Fiber-To-The-Home setups that are becoming more common.)

I wonder exactly where those losses happen.

Tuesday, July 14, 2009

Google Thinks TCP should be more Aggressive by Default

Really interesting post from Jerry Chu of Google. He says Google has data which shows that we ought to lower the initial RTO, increase the initial CWND, drop the min RTO, and reduce the delayed ACK timeout in TCP.

Based on my own anecdotal data, I've done stuff like that in products I've worked on. Let's face it - 3 seconds is a freaking eternity. Processors, networks, and busses have all scaled but these constants remain the same. Jerry says Google has data that shows this is important. As the google data set is no doubt much more extensive than any I worked with, that's a really welcome post.

Probably the most important data point Jerry shares is that "up to a few percentage points" in his data set exhibit a SYN-ACK retransmission from the google servers. Wow. (at least) 1 in 50 syn-acks needs to be retransmitted? That's not my experience at all, and if true on google scale it is absolutely fascinating. Are they generally seeing 2% packet loss on google tx? There's no way that they are seeing that.. google would appear to suck! So what's going on... ? Why is syn-ack rexmitted more than anything else? (and I'm assuming they are indeed lost, because otherwise lowering the timeout wouldn't be the right remedy..)
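To get a feel for why a 3 second initial RTO matters at these loss rates, here is a back-of-the-envelope sketch. It assumes each SYN-ACK is lost independently with the same probability and that the timeout doubles after every retransmission; both are simplifications of mine, and the function name is made up for illustration.

```python
def expected_synack_delay(loss_rate, initial_rto, max_retries=5):
    """Mean extra handshake delay in seconds caused by lost SYN-ACKs.

    Assumes independent losses and a timeout that doubles on each retry.
    Handshakes that lose more than max_retries copies are ignored.
    """
    delay = 0.0
    waited = 0.0
    rto = initial_rto
    for lost in range(1, max_retries + 1):
        waited += rto  # total time spent waiting after `lost` timeouts
        # Probability that exactly `lost` copies vanish and the next arrives.
        probability = (loss_rate ** lost) * (1 - loss_rate)
        delay += probability * waited
        rto *= 2
    return delay

three_second_rto = expected_synack_delay(0.02, 3.0)
one_second_rto = expected_synack_delay(0.02, 1.0)
```

With 2% loss that comes to about 60 ms added to the average connection setup at a 3 second RTO, versus about 20 ms at 1 second. The average hides the real pain, of course: 1 in 50 users waits the full 3 seconds.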

Sunday, June 28, 2009

'Violation' is so pejorative

Ben Hutchings says:

>[...] we also have architectural issues in violating layered
> software design

Meanwhile, in the real world, we want to avoid copying data, so an skb doesn't belong to any specific protocol layer.

Thank You Ben!

Abstractions aren't inherently good. They are great if they help you build or maintain things that are otherwise too complex to understand or too tedious to work on - but we have to vigilantly remember that losing those details also sometimes restricts the quality of what we can build, as some things are just inherently complex.

Monday, June 1, 2009

Software - Creative Economy, Blue Collar, or Just Rearranging Bits?

Via Ezra Klein, here is a really interesting bit from this Sunday's New York Times magazine:
The Case for Working with Your Hands.

I have often felt that as a software engineer the work I do is not substantively different from the work done by carpenters, architects, doctors, or in the case of the article above, mechanics. For all of us, the most challenging work we do on a regular basis is that of troubleshooting. Sure, on a handful of occasions I have had an opportunity to contribute a very high value insight to the construction of something new. On a couple of occasions I have even come up with an idea that enabled something that hadn't been done before. But generally, being a software "architect" involves making design choices from a bunch of well understood techniques, making measurements so you understand the problem space, and weaving the two together. Much of the job involves broadening your understanding of those choices and keeping up with the state of the art. The better you are at that, the better an "architect" you are. Writing code is a similar deal. It requires a whole lot of background, and a lot of diligence, but it is no more or less insightful than any of the other trades I mentioned. The quality of the code tends to correlate directly with the background and the diligence of its author. It is a very skillful occupation, but I question just how creative it is.

But solving a good bug - well that's really the litmus test of engineering skill for me. The article I cite above is really about the mental satisfaction of working on motorcycle engine bugs. The details are different, but the process is not. Each is proof of creativity, knowledge, and critical thinking. Of my favorite 10 software engineering experiences, at least 7 have to involve resolution of an inscrutable bug. It can bring together the most unexpected sets of facts and insights and leave you at the end of the day (or week, or month) with a sense of satisfaction that little else we do professionally can match.

A grand design is indeed grand. But most powerpoint architectures are not worth much more than the bits that hold them together. The value is in a robust working implementation. I think we vastly undervalue that as a society - and that's true of software, carpentry, architecture, and medicine alike. Every once in a while a truly unique thought is illustrated in a powerpoint or an academic setting and I don't mean to undervalue the importance of a breakthrough idea. I am just saying that far too often we give the benefit of the doubt to design expressions and at the same time we don't value nearly enough the insight it takes to align all the details and make something run (be it software, an engine, your body, or a building).

To borrow from the article, where the author is discussing his job creating magazine article abstracts:

You might wonder: Wasn’t there any quality control? My supervisor would periodically read a few of my abstracts, and I was sometimes corrected and told not to begin an abstract with a dependent clause. But I was never confronted with an abstract I had written and told that it did not adequately reflect the article. The quality standards were the generic ones of grammar, which could be applied without my supervisor having to read the article at hand. Rather, my supervisor and I both were held to a metric that was conjured by someone remote from the work process — an absentee decision maker armed with a (putatively) profit-maximizing calculus, one that took no account of the intrinsic nature of the job. I wonder whether the resulting perversity really made for maximum profits in the long term. Corporate managers are not, after all, the owners of the businesses they run.

You see it time and time again when a business tries to scale up by "throwing resources" at a problem. As the real work becomes more abstract to the operators of the business, the quality of the work invariably declines. I would suggest the value of powerpoint architecture in such organizations rises at the same time.

Tuesday, May 26, 2009

VOIP Recorder: Phonebook.. aka the "Mom is calling" feature

I am continuing to add little features to VOIP Recorder that help round out the overall functionality.

The newest feature to join the party is a phonebook database. The entries in this database are automatically populated from Caller-ID information. They are designed to be easily edited in order to personalize the names associated with particular numbers.

After personalizing a number that new name is used for the pop-ups and logs anytime that number calls (or is called). The obvious use for this is to rename "Jane Smith" to be "Mom" so that when Mom does call, it is noted immediately!

The phonebook feature is in revision "o" of the VOIP Recorder Preview. It is accessed through the Caller-ID tab of VR's web console.

VOIP Recorder lets you record, block, and manage calls made with the Vonage™ service. Check it out at

Monday, May 11, 2009

VOIP Recorder: Filter Anonymous Calls

I released a fun new feature for VOIP Recorder today: filters based on anonymous calls. Just set the calling number to be "anonymous" and you can block anonymous calls without ever ringing the phone. They will go to voice mail instead. You can of course use the filter to toggle the default record/do-not-record status as well.

Filters have always worked on any Caller-ID based name or number, and now they essentially work on the absence of a number as well.

Anonymous call blocking is in revision N of the VOIP Recorder preview. VR makes more out of your Vonage™ service. Check it out at

Friday, May 1, 2009

VOIP Recorder: Listen Live

I've had the opportunity to add a few new features to my Vonage call recording application, VOIP Recorder.

The most entertaining feature is "Listen Live". That will stream the audio from any active phone call to your desktop in more or less real time. That's neat.

I have also added easy buttons on the "at-a-glance" screen to toggle the recording of an individual call on or off. These buttons complement the touch tone sequences or Caller-ID based programmable filters that provide similar functionality.

Feedback on the first preview release has started to come in. Generally, it has been quite positive. A few people had trouble with the auto-discover portion of the program. I have made some updates to those algorithms to deal with more topologies and it seems more robust now. If you tried out VOIP Recorder earlier and had problems auto-discovering your ATA, try downloading the new release (revision 1-M or greater) and see if that helps. All accounts have been updated with the new release. If you have a problem please be sure to write me so we can make VOIP Recorder even better.

Also, thanks to an idea from Steve, I have added optional courtesy beeps. These are short beeps played periodically to remind everyone about the call recording. You can configure if they are played and, if so, how often they are played. They are off by default. I like the way they sound - they make a nice alternative to the full "recording" announcement insertion.

Last in the new feature department is the addition of a simple "*" filter which matches everything. This lets you write filters that, for instance, whitelist some specific phone numbers but block everyone else. Thanks to Chad for pointing out that omission.

So there is lots going on in the world of VOIP Recorder. You should check out the new release at - Linux, Macs, and Windows are all supported for recording calls made with Vonage(tm), as well as orchestrating pop-up notifications and call blocking based on Caller-ID information.

Sunday, April 19, 2009

device_create() and the linux shifting API

The kernel API for device_create() in 2.6.26 and previous versions was:

extern struct device *device_create(struct class *cls, struct device *parent,
                                    dev_t devt, const char *fmt, ...);

and starting in 2.6.27 it changed to:

struct device *device_create(struct class *cls, struct device *parent,
                             dev_t devt, void *drvdata,
                             const char *fmt, ...);

Note the insertion of a new argument. In this case it is a void * at the fourth position, directly ahead of the format string, in a function that takes a ... argument list.

This is more dangerous than the usual unstable evolutions in kernel APIs in that legacy code may continue to compile without warning on newer kernels, but it will of course crash: the old format string is silently accepted as the void * drvdata, and the first argument that was intended to be substituted into the format string is now treated as the format string itself.

Some code is going to live out of the tree and trip over this. And some code is always going to live out of the tree - if for no other reason than the folks who control the commits have to (and should!) make judgments on what is appropriate, but of course other folks will disagree and carry on with their work. TCP Offload Engines are a good example of that kind of diversity.

Given that, I wonder what the reason was for reusing the device_create() name across two incompatible versions of that function. There actually was an interim version of the new function called device_create_drvdata() that was used to migrate all of the in-tree uses over to the new style. At the end all the drvdata() versions were renamed back to device_create(), when a safer path would seem to have been to simply remove device_create() altogether to avoid confusion.

Oh well, it's not a big deal - but maybe this post will serve as google bait to help someone else resolve the issue more quickly than I could.

Thursday, April 16, 2009

Recording calls made with Vonage

I am looking for early adopters (isn't that a nice euphemism for tester?) for a new project I have been working on: VOIP Recorder.

VOIP Recorder is desktop software (available for Windows, Mac, and Linux) that records normal Vonage calls without any special configuration. Just run it on the same LAN as your Vonage ATA and VR will redirect the VOIP calls through your desktop where it can make a copy. Playback and archive management are through an embedded web interface.

Read all about it and register for a free download at

VR has other features too: pop-ups with Caller-ID info, optional insertion of announcements, touch-tone based triggers, Caller-ID based call blocking, voicemail tracking, and more.

An Example Caller-ID Popup

Friday, February 6, 2009

Increasing Upload Speed from Firefox on Windows

Sometimes bugs are more interesting to work on than features - they have that mysterious quality about them and give a satisfying feeling when you figure it out.

This one was brought to my attention by Mark Finkle.

It basically boiled down to HTTP POSTs from Firefox on Windows being slower than they are in Internet Explorer, and also slower than they are in Firefox on OS X or Linux. (IE on Windows and the non-Windows platforms all perform about the same, with FF on Windows lagging behind).

The culprit turned out to be the TCP send window, which is capped by the socket's send buffer. Firefox never had more than 8KB of un-acked data outstanding. If you have a network path with a high bandwidth-delay product, that isn't going to cut it.
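The reason an 8KB cap hurts is simple arithmetic: a sender can push at most one window of data per round trip. A quick sketch of that relationship follows. The 100 ms round trip and the 10 Mbit/s uplink are example figures I picked, not measurements from the bug, and the function names are just for illustration.

```python
def max_throughput_bps(window_bytes, rtt_seconds):
    """Upper bound on throughput when at most window_bytes may be un-acked."""
    return window_bytes * 8 / rtt_seconds

def window_needed_bytes(link_bps, rtt_seconds):
    """Bandwidth-delay product: bytes that must be in flight to fill the link."""
    return link_bps * rtt_seconds / 8

# An 8KB window over an assumed 100 ms round trip.
ceiling = max_throughput_bps(8 * 1024, 0.100)

# Window needed to fill an assumed 10 Mbit/s uplink at the same round trip.
needed = window_needed_bytes(10_000_000, 0.100)
```

With those example numbers the 8KB window tops out around 655 kbit/s no matter how fast the link is, while filling the 10 Mbit/s uplink would take about 125 KB in flight.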

Windows (up to and including Vista) has an 8KB default sending window. Or so I found out thanks to Google.

Autotuning that buffer size is standard practice on OS X and Linux and has been for a long while. Vista autotunes the receive buffer (but not XP according to what I read), but the send buffer is a small fixed value. IE, realizing that it's a web 2.0 kinda world out there full of User Generated Content, must increase that value from its default - because I can look at the IE tcpdump traces and see >80KB of un-acknowledged data (it could go higher; at whatever value they have it set to, the send buffer is no longer the limiting factor) in the same way I do with a trace of Firefox on Linux.

The Linux default is 128KB for any reasonably modern machine.

Fortunately that can be controlled on a system wide basis through a registry preference, or on a per socket basis by setting SO_SNDBUF. I submitted a patch that does the latter if the network.tcp.sendbuffer preference is set - the patch also sets the pref for windows.
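For anyone who hasn't set SO_SNDBUF before, the per-socket approach looks roughly like this. This is a generic Python sketch of the socket option, not the Firefox patch; the function name and the 128KB request are just examples.

```python
import socket

def make_socket_with_send_buffer(size_bytes):
    """Create a TCP socket and ask the OS for a send buffer of size_bytes.

    Returns the socket and the size the OS reports afterwards, which may
    differ from the request because the kernel can round or cap it.
    """
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_SNDBUF, size_bytes)
    effective = sock.getsockopt(socket.SOL_SOCKET, socket.SO_SNDBUF)
    return sock, effective

sock, effective = make_socket_with_send_buffer(128 * 1024)
sock.close()
```

Don't be surprised if the value read back isn't the one you asked for. Linux, for example, reports double the requested size to account for bookkeeping overhead, and every OS enforces its own upper limit.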

If you suffer from this, I see three options:
1] Wait until my patch (or a later rev of it) ends up in an official build
2] Set the registry property to change it for your whole Windows install - KB 950326.
3] You might be able to build a k3wl binary add-on that does the same thing as my patch in a crazy way. Fame, fortune, and faster flickr and picasa uploads await you.

Friday, January 30, 2009

Getting Vonage Caller-ID display notifications on Linux & Mac without a soft phone

(Update - April 2009: See also and for a one-stop answer to this problem on windows, mac, and linux)

I use vonage. What they really sell you is a POTS<-VOIP->POTS tunnel where they provide you one of the POTS/VOIP bridges that you install in your house in order to bring your old traditional phones on line. They also sell a soft-phone that does not include this bridge, but that isn't what I use.

It's a good service - unmetered calling for the places I call, and it comes with a bunch of phone features for a flat $28/month. The VOIP bits are done with SIP the usual way.

So that's lovely, but by default it doesn't provide any access to the SIP data beyond the POTS bridge and that presents a challenge to unlocking your data.

What I would appreciate would be desktop display notifications of the caller id data when the phone rings. This is pretty standard stuff when dealing with soft phones, but it seems to be a bit trickier in the vonage case.

So I rolled my own for KDE4 and OS X, which are the screens I spend my time staring at.

Step 1: Find the SIP invitations.

The SIP protocol is UDP unicast to the vonage "router". If you install the router (in my case a Motorola vt2142) doing double duty as your broadband gateway router, then it will consume those packets without ever sending them onto your LAN. If they're not on the LAN, then you can't really capture them and display the precious info inside, so a different arrangement is required.

I put the vonage box behind a Linux bridge. The bridge is just a linux box (in this case my file, email, and print server) with 2 interfaces. Those interfaces don't have IP addresses; instead they are brought together into a logical interface commonly called br0. Do this with: "brctl addbr br0; brctl addif br0 eth0; brctl addif br0 eth1" .. once you have done that the machine will act like an ethernet switch, forwarding packets between interfaces as necessary. You could set it up as an IP router instead, but then you would need different subnets and all manner of other duplicated architecture. The bridge is fine. The server doesn't need an IP address to be a bridge, but it does in order to keep doing those file/print things.. I just ran dhcp as normal on the new br0 interface. Now if you run tcpdump on eth1 (or more specifically the interface "behind" the bridge with the vonage device) you will see the vonage traffic crossing the bridge. Reading that data it is easy to see my SIP control runs on UDP port 10000. I hear other routers typically use port 5061.

Step 2: Capture those invitations

Now that you've got access to the SIP data, let's do something with it. I used the NFQUEUE iptables interface. NFQUEUE lets you shunt packets to userspace for filtering while they are still in the network stack. I wrote a simple iptables rule that matches data coming into port 10000 and places those packets into queue number 5061 for consumption by a userspace program: "/sbin/iptables -A FORWARD --protocol udp --dport 10000 -j NFQUEUE --queue-num 5061 -d"

Step 3: Process the invitations and generate network notifications

I wrote a little C program that runs on the bridge which consumes the packets in the NFQUEUE. For each packet it tries to figure out if this is a SIP invitation and if it is, what is the caller id info. All packets are acknowledged back to netfilter/iptables so they are passed onto the vonage router (which is what makes the phone ring!). If you wanted to do some automatic call blocking, this would be a good place to just drop the invite on the floor and then the phone would never ring.
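The interesting part of that processing is picking the caller id out of the SIP INVITE. My program does it in C; the Python below is only a rough sketch of the idea, assuming the caller id lives in the From header in its usual display-name plus sip: URI form. The function name, the regular expression, and the sample addresses are all made up for illustration, and real SIP has more header variations than this handles.

```python
import re

# Matches 'From: "Jane Smith" <sip:15551234567@host>' and the unquoted
# 'From: Anonymous <sip:anonymous@host>' form ('f' is the compact header name).
_FROM_RE = re.compile(
    rb'^(?:From|f)[ \t]*:[ \t]*'
    rb'(?:"(?P<quoted>[^"\r\n]*)"|(?P<bare>[^<\r\n]*?))'
    rb'[ \t]*<sips?:(?P<user>[^@>;\r\n]+)',
    re.IGNORECASE | re.MULTILINE,
)

def parse_caller_id(payload):
    """Return (name, number) from a SIP INVITE payload, or None otherwise."""
    if not payload.startswith(b"INVITE "):
        return None  # some other SIP message; nothing to announce
    match = _FROM_RE.search(payload)
    if match is None:
        return None
    name = match.group("quoted") or match.group("bare") or b""
    number = match.group("user")
    return (name.decode("utf-8", "replace").strip(),
            number.decode("ascii", "replace"))

sample = (
    b"INVITE sip:15559876543@10.0.0.2:10000 SIP/2.0\r\n"
    b"Via: SIP/2.0/UDP 192.0.2.1\r\n"
    b'From: "Jane Smith" <sip:15551234567@192.0.2.1>;tag=abc\r\n'
    b"To: <sip:15559876543@10.0.0.2>\r\n"
    b"\r\n"
)
caller = parse_caller_id(sample)
```

Whatever the parser decides, the packet still has to be given a verdict so netfilter either passes it on to the vonage box or drops it.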

The producer-nfqueue program is available here.

If a piece of caller-id info is found it is broadcast to the local LAN in two different formats. The first format contains just a magic number to identify the format and the caller id strings. It is sent on UDP port 7651. The second broadcast is in Growl network format. Growl is a daemon commonly used on Mac OS X to display system notifications. Anybody running growl with "listen for incoming notifications" and "allow remote application registration" enabled will see a popup as soon as this broadcast takes place.
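To make the first format concrete, here is a sketch of what such a broadcast could look like. The byte layout and the magic number below are invented for illustration and do not match what producer-nfqueue actually sends; only the UDP port comes from my setup. The Growl format is not shown.

```python
import socket
import struct

MAGIC = 0x43494431   # hypothetical magic number, not the real one
NOTIFY_PORT = 7651   # UDP port used for the simple format

def build_notification(name, number):
    """Pack the magic number plus two NUL-terminated strings (assumed layout)."""
    return (struct.pack("!I", MAGIC)
            + name.encode("utf-8") + b"\x00"
            + number.encode("utf-8") + b"\x00")

def parse_notification(datagram):
    """Return (name, number), or None if the datagram is not in this format."""
    if len(datagram) < 4 or struct.unpack("!I", datagram[:4])[0] != MAGIC:
        return None
    fields = datagram[4:].split(b"\x00")
    if len(fields) < 3:  # two terminated strings leave a trailing empty field
        return None
    return (fields[0].decode("utf-8", "replace"),
            fields[1].decode("utf-8", "replace"))

def broadcast_notification(name, number, address="255.255.255.255"):
    """Send one notification datagram to the LAN broadcast address."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.setsockopt(socket.SOL_SOCKET, socket.SO_BROADCAST, 1)
        sock.sendto(build_notification(name, number), (address, NOTIFY_PORT))
```

A listener only has to bind a UDP socket to port 7651 and run each datagram through parse_notification(), ignoring anything that comes back as None.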

Step 4: KDE applet.

On my linux KDE4 environment, I wrote a kapplet that used a QSystemTrayIcon overload to listen for the port 7651 broadcasts. The effect is nice, but I would have rather had something gnome/kde cross platform. From doing some reading it appears I can inject something into dbus and knotify4 will pop it up as will gnotify, but I couldn't get that to work easily. It would also be a potential signal to things like pulseaudio to turn down the volume. oh well, maybe next version. The applet is available here.

And now I can be lazy and find out that the ringing phone isn't one I want to answer without having to break my train of thought. Mission accomplished?