Command processing

Roy Lyseng Roy.Lyseng at Sun.COM
Mon Mar 10 16:35:54 UTC 2008



marc at corky.net wrote:
> Roy Lyseng wrote:
> 
> In reality, a full blown string compare is not even needed in this 
> case.   One can loop through the command array comparing the first 
> character only, then the second, etc. 
> If the actual length of the command is stored in the array as well and 
> the length of the requested command is done beforehand, one needs to at 
> most only compare two characters and the length.  
>  From a quick look only the get/gets commands both start with the same 
> two characters - therefore the length must also be compared.   Otherwise 
> the entire string in this particular case.

Yes - and I guess strcmp() will bail out after the first non-match.

> 
> I could also argue that 'add' is almost certainly less frequency in most 
> people's apps than 'set'.
> In reality, we're nit-picking too much, a few tens/hundreds of clock 
> cycles more or less per request would not give us any meaningful 
> performance benefit.   Interrupts and context switches are the 
> bottlenecks when demand is extremely high - tens of thousands of 
> requests per second.
> 
> Just for fun I started porting memcached into a kernel module with the 
> goal of measuring performance differences between the standard one and 
> the kernel module - but I haven't finished that yet.    There would be 
> pretty large savings on context switching and interrupts - though I 
> don't think real world performance gains would be meaningful for 99% of 
> the users on this list.   Only a few push (or get close to pushing) 
> memcached to its limits.

Just want to add that the benchmarks I see show 20% time in user-space 
and 80% in kernel-space...

Roy
> 
> Marc
> 
>>
>>
>> Håkan Waara wrote:
>>> I haven't hacked this code myself, so forgive me if I'm totally 
>>> wrong, but couldn't you use a hash table instead below to avoid any 
>>> string comparison at all?
>>
>> I certainly could, but I do not think this is a performance pain point.
>>
>> But the array of command strings would be a great start for another 
>> optimization.
>>
>> Roy
>>>
>>> /Håkan
>>>
>>> Roy Lyseng wrote:
>>>> Hi,
>>>>
>>>> I have been looking at structuring the command processing in 
>>>> memcached slightly.
>>>>
>>>> I am thinking about creating an array with info about available 
>>>> commands:
>>>>
>>>> static struct {
>>>>     char *cmdword;    /* Command word string */
>>>>     int   cmd;        /* Command shorthand */
>>>>     int   mintokens;  /* Minimum number of tokens (required) */
>>>>     int   maxtokens;  /* Maximum number of tokens (zero means no 
>>>> limit) */
>>>> } cmds[] = {
>>>>    {"get",        CMD_GET,       3, 0},
>>>>    {"delete",     CMD_DELETE,    3, 5},
>>>>    {"add",        CMD_ADD,       6, 7},
>>>>    {"set",        CMD_SET,       6, 7},
>>>>    {"replace",    CMD_REPLACE,   6, 7},
>>>>    {"prepend",    CMD_PREPEND,   6, 7},
>>>>    {"append",     CMD_APPEND,    6, 7},
>>>>    {"gets",       CMD_GETS,      3, 0},
>>>>    {"cas",        CMD_CAS,       7, 8},
>>>>    {"incr",       CMD_INCR,      4, 5},
>>>>    {"decr",       CMD_DECR,      4, 5},
>>>>    {"bget",       CMD_BGET,      3, 0},
>>>>    {"own",        CMD_OWN,       3, 3},
>>>>    {"disown",     CMD_DISOWN,    3, 3},
>>>>    {"bg",         CMD_BG,        3, 3},
>>>>    {"stats",      CMD_STATS,     2, 0},
>>>>    {"flush_all",  CMD_FLUSH,     2, 4},
>>>>    {"version",    CMD_VERSION,   2, 2},
>>>>    {"quit",       CMD_QUIT,      2, 2},
>>>>    {"slabs",      CMD_SLABS,     5, 5},  /* Next token should be 
>>>> "reassign" */
>>>>    {"verbosity",  CMD_VERBOSITY, 3, 4},
>>>>    {NULL,         -1,            0, 0}   /* Terminate with a NULL 
>>>> string pointer */
>>>> };
>>>>
>>>> I have tried to sort the presumably most frequent commands first.
>>>>
>>>> process_commands() will then do:
>>>>
>>>>     ntokens = tokenize_command(command, tokens, MAX_TOKENS);
>>>>
>>>>     for (i = 0; cmds[i].cmdword != NULL; i++) {
>>>>         if (strcmp(tokens[COMMAND_TOKEN].value, cmds[i].cmdword) == 
>>>> 0) {
>>>>            cmd = cmds[i].cmd;
>>>>            break;
>>>>         }
>>>>     }
>>>>
>>>>     if (cmd < 0) {
>>>>         out_string(c, "ERROR");              /* Token not matched */
>>>>         return;
>>>>     }
>>>>     if (ntokens < cmds[i].mintokens ||
>>>>        (cmds[i].maxtokens > 0 && ntokens > cmds[i].maxtokens)) {
>>>>         out_string(c, "ERROR");              /* Invalid number of 
>>>> tokens for this cmd */
>>>>         return;
>>>>     }
>>>>
>>>>     c->item_comm = cmd;                      /* Command being 
>>>> processed on connection */
>>>>
>>>>     switch (cmd) {
>>>>     case CMD_GET:
>>>>     case CMD_BGET:
>>>>         process_get_command(c, tokens, ntokens, false);
>>>>         break;
>>>> ...
>>>>
>>>> Does this look interesting to you guys?
>>>>
>>>> Before I go any further with this, is there any other information 
>>>> that should be used to characterize commands?
>>>>
>>>> Are there commands that are not in use (BGET?)
>>>>
>>>> More?
>>>>
>>>> Thanks,
>>>> Roy
>>
>>
>> !DSPAM:47d5500917066856617375!
>>
> 


More information about the memcached mailing list