binary protocol notes from the facebook hackathon

Wed Jul 11 11:25:50 UTC 2007

Hi,

I initially wanted to comment only because we need to have a consistent 
protocol, so I was wondering why only the length field was marked as 
big-endian -- was it just a doc omission, or was it intended to be 
that way?

Then comes the bit about performance.   The sole purpose of implementing 
a binary protocol is to reduce parsing overhead, which it will clearly 
accomplish.   Parsing the text protocol most likely uses several hundred 
cycles per request to do what the binary protocol parser can do in a 
few tens of cycles at most.
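
To make the contrast concrete, here is a rough sketch in C of reading one 
length value both ways.  The function names and the field layout are 
made up for illustration and are not taken from any protocol draft; the 
binary variable reads the value in host byte order and the text variant 
does not guard against overflow.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Text style: walk the digits of a decimal length such as "1024\r\n".
 * Returns the number of bytes consumed (0 if no digit was found, in
 * which case *out is set to 0).  Overflow is not checked. */
static size_t parse_text_len(const char *buf, size_t n, uint32_t *out)
{
    uint32_t v = 0;
    size_t i = 0;

    while (i < n && buf[i] >= '0' && buf[i] <= '9') {
        v = v * 10 + (uint32_t)(buf[i] - '0');  /* one multiply-add per digit */
        i++;
    }
    *out = v;
    return i;
}

/* Binary style: the length has a fixed width at a known position, so a
 * single 4-byte copy retrieves it (host byte order in this sketch). */
static uint32_t parse_binary_len(const unsigned char *buf)
{
    uint32_t v;

    memcpy(&v, buf, sizeof v);
    return v;
}
```

The text variant costs a compare, a multiply and an add per digit, plus 
the work of locating the field in the first place; the binary variant 
is a fixed-size load.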

At the 60K requests per second per node that some of us do, we're at 
most losing several hundred thousand cycles per second to byte 
swapping, something I am sure everyone can live with, especially given 
the already low parsing overhead the binary protocol is aiming for.  
So you can safely ignore my brain farts.
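
The estimate is simple multiplication, sketched below in C.  Only the 
60K requests per second comes from the figures above; the number of 
swapped fields per request and the cycles per swap are illustrative 
guesses, not measurements, and the function name is hypothetical.

```c
#include <stdint.h>

/* Cycles per second spent on byte swapping.  All three inputs are
 * assumptions supplied by the caller. */
static uint64_t swap_cycles_per_second(uint64_t requests_per_second,
                                       uint64_t fields_per_request,
                                       uint64_t cycles_per_swap)
{
    return requests_per_second * fields_per_request * cycles_per_swap;
}
```

With 60000 requests per second, 3 swapped fields per request and 2 
cycles per swap, this gives 360000 cycles per second, which on a 
hypothetical 2 GHz core is under 0.02% of the available cycles.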

And as I said, and as Roy and Steve noted, big-endian is the 
traditionally accepted way of doing binary protocols.  It's based 
purely on historical reasons and in my opinion doesn't have to apply to 
new protocols anymore, especially not very simple ones.   Most CPUs are 
now little-endian by default (even if they CAN do both -- PPC).   Sun 
most likely ships more little-endian systems by now.
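
For reference, a minimal sketch in C of what the two wire orders mean 
for a 32-bit field.  The helper names are hypothetical; both functions 
are written with shifts so they give the same result on any host.

```c
#include <stdint.h>

/* Decode a 32-bit big-endian (network order) field: the most
 * significant byte comes first on the wire. */
static uint32_t read_be32(const unsigned char *p)
{
    return ((uint32_t)p[0] << 24) |
           ((uint32_t)p[1] << 16) |
           ((uint32_t)p[2] << 8)  |
            (uint32_t)p[3];
}

/* Decode a 32-bit little-endian field: the least significant byte
 * comes first on the wire. */
static uint32_t read_le32(const unsigned char *p)
{
    return  (uint32_t)p[0]        |
           ((uint32_t)p[1] << 8)  |
           ((uint32_t)p[2] << 16) |
           ((uint32_t)p[3] << 24);
}
```

For the wire bytes 00 00 01 02, read_be32 yields 258 and read_le32 
yields 33619968.  Compilers typically turn whichever function matches 
the host order into a plain load, and the other into a load plus a byte 
swap; that extra swap is the cost being discussed here.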

I would just like to see the binary protocol and its 'parser' as lean 
and mean as possible, that's all.   Coming from a background of .{s,asm} 
programming, little things like that can tick me off sometimes :)

Marc