<HTML>
<HEAD>
<TITLE>Re: binary protocol notes from the facebook hackathon</TITLE>
</HEAD>
<BODY>
<FONT FACE="Verdana, Helvetica, Arial"><SPAN STYLE='font-size:12.0px'>I think a little-endian wire protocol violates the principle of least astonishment. One of the things we considered when going through this was “will this be impossible to read with a protocol analyzer?” I’d rather see 00 00 00 01 than have to remember that 01 00 00 00 is a 4-byte, little-endian field.<BR>
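<BR>
To make the analyzer point concrete, here is a minimal sketch of writing the value 1 in each byte order. The helper names put_be32 and put_le32 are made up for illustration and are not part of the proposed protocol; they use only shifts and masks, so they behave the same on any host.<BR>
<PRE>
```c
/* Hypothetical helpers: store the low 32 bits of v into buf[0..3]. */

/* Big-endian (network order): the value 1 appears as 00 00 00 01. */
static void put_be32(unsigned char *buf, unsigned long v)
{
    buf[0] = (unsigned char)((v >> 24) & 0xff);  /* most significant byte first */
    buf[1] = (unsigned char)((v >> 16) & 0xff);
    buf[2] = (unsigned char)((v >> 8) & 0xff);
    buf[3] = (unsigned char)(v & 0xff);
}

/* Little-endian: the value 1 appears as 01 00 00 00. */
static void put_le32(unsigned char *buf, unsigned long v)
{
    buf[0] = (unsigned char)(v & 0xff);          /* least significant byte first */
    buf[1] = (unsigned char)((v >> 8) & 0xff);
    buf[2] = (unsigned char)((v >> 16) & 0xff);
    buf[3] = (unsigned char)((v >> 24) & 0xff);
}
```
</PRE>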
<BR>
Also, the cost of byte swapping is absolute noise compared to just doing a memory fetch and store. Whether I do<BR>
<BR>
foo->length = ntohl(*(uint32_t*)(buf+4));<BR>
<BR>
Or<BR>
foo->length = *(uint32_t*)(buf+4);<BR>
<BR>
The cost is in the * and the =, not the ntohl.<BR>
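<BR>
For what it's worth, the same read can be written without the pointer cast at all. This is only a sketch: get_be32 is a hypothetical name, and it assumes buf is an unsigned char pointer. It avoids unaligned loads and does not depend on the host's byte order or on how wide a long is, and compilers can generally reduce it to a load plus a byte swap.<BR>
<PRE>
```c
/* Hypothetical helper: read a big-endian 32-bit value from p[0..3]. */
static unsigned long get_be32(const unsigned char *p)
{
    return ((unsigned long)p[0] << 24) |   /* most significant byte */
           ((unsigned long)p[1] << 16) |
           ((unsigned long)p[2] << 8)  |
            (unsigned long)p[3];           /* least significant byte */
}
```
</PRE>
With that, the read above becomes foo->length = get_be32(buf + 4);<BR>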
<BR>
I don’t think alignment is all that important. Values are only byte-aligned anyway, and for the header, casting directly from the header buffer to a struct is unsafe and unportable, and any performance gain is too small to measure.<BR>
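<BR>
As a sketch of the alternative to the cast, the header can be decoded one field at a time. The 8-byte layout below is invented purely for illustration (it is not the layout from the hackathon notes), as are the names struct header and parse_header. Because each field is pulled out of the byte buffer explicitly, struct padding and host byte order never come into it.<BR>
<PRE>
```c
/* Hypothetical 8-byte header layout, for illustration only:
 *   byte 0     magic
 *   byte 1     opcode
 *   bytes 2-3  key length  (big-endian)
 *   bytes 4-7  body length (big-endian)
 */
struct header {
    unsigned char magic;
    unsigned char opcode;
    unsigned int  key_length;
    unsigned long length;
};

/* Decode field by field instead of casting buf to a struct pointer. */
static void parse_header(const unsigned char *buf, struct header *h)
{
    h->magic      = buf[0];
    h->opcode     = buf[1];
    h->key_length = ((unsigned int)buf[2] << 8) | (unsigned int)buf[3];
    h->length     = ((unsigned long)buf[4] << 24) |
                    ((unsigned long)buf[5] << 16) |
                    ((unsigned long)buf[6] << 8)  |
                     (unsigned long)buf[7];
}
```
</PRE>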
<BR>
We discussed word alignment briefly, but since values may not be aligned we’d have to add an additional field to indicate the number of pad bytes and that would have to be word-al <BR>
<BR>
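For reference, the pad count such a field would carry is simple to compute. A minimal sketch, assuming 4-byte words; pad_bytes is a hypothetical name. The result is always between 0 and 3, which is where the "up to three bytes per stored object" figure in the quoted message comes from.<BR>
<PRE>
```c
/* Hypothetical helper: pad bytes needed to round len up to a 4-byte boundary. */
static unsigned long pad_bytes(unsigned long len)
{
    return (4 - (len % 4)) % 4;   /* 0 when already aligned, otherwise 1..3 */
}
```
</PRE>
<BR>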
On 7/11/07 12:45 AM, "marc@corky.net" &lt;marc@corky.net&gt; wrote:<BR>
<BR>
</SPAN></FONT><BLOCKQUOTE><FONT SIZE="2"><FONT FACE="Monaco, Courier New"><SPAN STYLE='font-size:10.0px'>Hi everyone,<BR>
<BR>
I'm happy to see a nice and compact result with zero bloat. I'm also <BR>
happy you guys kept alignment within the request/response struct and <BR>
that would help performance.<BR>
<BR>
I see byte ordering is mentioned twice: for the length field in both the <BR>
request and the response.<BR>
<BR>
While network byte ordering (Big Endian) is traditionally the 'right' <BR>
thing to do (or the default thing to do), in most cases it's a minor <BR>
performance hit due to constant swapping. Since we're implementing a <BR>
binary protocol specifically to avoid/minimize minor performance hits <BR>
and since this is a brand new protocol I would recommend keeping all <BR>
values as Little Endian because:<BR>
<BR>
- It's easier if all values are kept in the same endianness; it reduces <BR>
confusion.<BR>
- Nowadays MOST (but obviously not all) servers are running little <BR>
endian. So this saves byte swapping for most people's cases and thus a <BR>
few cycles are spared on each request -- isn't that the whole point? ;)<BR>
<BR>
And since I mentioned alignment at the top: would the entire packet, <BR>
including its payload, be aligned? It can be a waste of up to three <BR>
bytes per stored object but could potentially improve performance a <BR>
little bit -- something entertaining to benchmark -- think we'll be able <BR>
to notice a timing difference above the noise level? :)<BR>
<BR>
Marc<BR>
</SPAN></FONT></FONT></BLOCKQUOTE>
</BODY>
</HTML>