Command processing

Mon Mar 10 15:41:16 UTC 2008

Roy Lyseng wrote:

In reality, a full blown string compare is not even needed in this 
case.   One can loop through the command array comparing the first 
character only, then the second, etc.  

If the actual length of the command is stored in the array as well and 
the length of the requested command is done beforehand, one needs to at 
most only compare two characters and the length.   

 From a quick look only the get/gets commands both start with the same 
two characters - therefore the length must also be compared.   Otherwise 
the entire string in this particular case.

I could also argue that 'add' is almost certainly less frequency in most 
people's apps than 'set'. 

In reality, we're nit-picking too much, a few tens/hundreds of clock 
cycles more or less per request would not give us any meaningful 
performance benefit.   Interrupts and context switches are the 
bottlenecks when demand is extremely high - tens of thousands of 
requests per second.

Just for fun I started porting memcached into a kernel module with the 
goal of measuring performance differences between the standard one and 
the kernel module - but I haven't finished that yet.    There would be 
pretty large savings on context switching and interrupts - though I 
don't think real world performance gains would be meaningful for 99% of 
the users on this list.   Only a few push (or get close to pushing) 
memcached to its limits.

Marc

>
>
> Håkan Waara wrote:
>> I haven't hacked this code myself, so forgive me if I'm totally 
>> wrong, but couldn't you use a hash table instead below to avoid any 
>> string comparison at all?
>
> I certainly could, but I do not think this is a performance pain point.
>
> But the array of command strings would be a great start for another 
> optimization.
>
> Roy
>>
>> /Håkan
>>
>> Roy Lyseng wrote:
>>> Hi,
>>>
>>> I have been looking at structuring the command processing in 
>>> memcached slightly.
>>>
>>> I am thinking about creating an array with info about available 
>>> commands:
>>>
>>> static struct {
>>>     char *cmdword;    /* Command word string */
>>>     int   cmd;        /* Command shorthand */
>>>     int   mintokens;  /* Minimum number of tokens (required) */
>>>     int   maxtokens;  /* Maximum number of tokens (zero means no 
>>> limit) */
>>> } cmds[] = {
>>>    {"get",        CMD_GET,       3, 0},
>>>    {"delete",     CMD_DELETE,    3, 5},
>>>    {"add",        CMD_ADD,       6, 7},
>>>    {"set",        CMD_SET,       6, 7},
>>>    {"replace",    CMD_REPLACE,   6, 7},
>>>    {"prepend",    CMD_PREPEND,   6, 7},
>>>    {"append",     CMD_APPEND,    6, 7},
>>>    {"gets",       CMD_GETS,      3, 0},
>>>    {"cas",        CMD_CAS,       7, 8},
>>>    {"incr",       CMD_INCR,      4, 5},
>>>    {"decr",       CMD_DECR,      4, 5},
>>>    {"bget",       CMD_BGET,      3, 0},
>>>    {"own",        CMD_OWN,       3, 3},
>>>    {"disown",     CMD_DISOWN,    3, 3},
>>>    {"bg",         CMD_BG,        3, 3},
>>>    {"stats",      CMD_STATS,     2, 0},
>>>    {"flush_all",  CMD_FLUSH,     2, 4},
>>>    {"version",    CMD_VERSION,   2, 2},
>>>    {"quit",       CMD_QUIT,      2, 2},
>>>    {"slabs",      CMD_SLABS,     5, 5},  /* Next token should be 
>>> "reassign" */
>>>    {"verbosity",  CMD_VERBOSITY, 3, 4},
>>>    {NULL,         -1,            0, 0}   /* Terminate with a NULL 
>>> string pointer */
>>> };
>>>
>>> I have tried to sort the presumably most frequent commands first.
>>>
>>> process_commands() will then do:
>>>
>>>     ntokens = tokenize_command(command, tokens, MAX_TOKENS);
>>>
>>>     for (i = 0; cmds[i].cmdword != NULL; i++) {
>>>         if (strcmp(tokens[COMMAND_TOKEN].value, cmds[i].cmdword) == 
>>> 0) {
>>>            cmd = cmds[i].cmd;
>>>            break;
>>>         }
>>>     }
>>>
>>>     if (cmd < 0) {
>>>         out_string(c, "ERROR");              /* Token not matched */
>>>         return;
>>>     }
>>>     if (ntokens < cmds[i].mintokens ||
>>>        (cmds[i].maxtokens > 0 && ntokens > cmds[i].maxtokens)) {
>>>         out_string(c, "ERROR");              /* Invalid number of 
>>> tokens for this cmd */
>>>         return;
>>>     }
>>>
>>>     c->item_comm = cmd;                      /* Command being 
>>> processed on connection */
>>>
>>>     switch (cmd) {
>>>     case CMD_GET:
>>>     case CMD_BGET:
>>>         process_get_command(c, tokens, ntokens, false);
>>>         break;
>>> ...
>>>
>>> Does this look interesting to you guys?
>>>
>>> Before I go any further with this, is there any other information 
>>> that should be used to characterize commands?
>>>
>>> Are there commands that are not in use (BGET?)
>>>
>>> More?
>>>
>>> Thanks,
>>> Roy
>
>
> !DSPAM:47d5500917066856617375!
>