[PATCH] utf8 flag support on perl lib

Sat Jan 12 21:41:52 UTC 2008

On Jan 12, 2008, at 12:59, Tomash Brechko wrote:

> On Sat, Jan 12, 2008 at 12:25:57 -0800, Dustin Sallings wrote:
>> 	I don't understand why you wouldn't want UTF-8 in common.
>
> Because I think that such flag would be a mistake, and want to limit
> the damage it will cause.  Imagine the application that reads a
> key/value pair, and stores it to memcached (Brian may have one).  If
> we decide to rely on UTF-8 flag when fetching data, then such app now
> would have to ask, "Dear user, but what's the meaning of the data we
> are uploading?  Is it binary, or a non-ASCII text?" (the data may come
> from arbitrary file, you can't assume it will always be the text
> encoded according to user's locale).  This information is obviously
> not relevant for the uploading process itself, but we would be bound
> to decide how the data would be retrieved later.  I find this very
> ridiculous.

	I may not be getting your point, but are you saying that the UTF-8
flag is bad because someone might want to store data that isn't UTF-8
and you need to be able to tell the difference?  Isn't that the
purpose of the flag?

	I have separate flags for byte array and string.  I don't think  
that's that unusual.
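
	To make the idea concrete, here is a minimal sketch of a client
that tags each stored value with a flag saying whether it is a raw byte
array or a UTF-8 string.  The flag values and function names are
hypothetical, made up for illustration; they are not taken from any
particular client library.

```python
# Hypothetical flag values, for illustration only.
FLAG_BYTES = 0x0   # value is an opaque byte array
FLAG_UTF8 = 0x1    # value is text that was encoded as UTF-8


def encode_value(value):
    """Return (payload_bytes, flags) for a value about to be stored."""
    if isinstance(value, str):
        return value.encode("utf-8"), FLAG_UTF8
    if isinstance(value, (bytes, bytearray)):
        return bytes(value), FLAG_BYTES
    raise TypeError("only str and bytes-like values are handled in this sketch")


def decode_value(payload, flags):
    """Rebuild the original value from the stored payload and its flags."""
    if flags & FLAG_UTF8:
        return payload.decode("utf-8")
    return payload
```

	With this scheme the type of the value at store time decides the
flag, so a string comes back as a string and a byte array comes back as
a byte array.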

>> 	I was under the impression that CRC32 hashing with
>> modulus bucketing was an LCD.
>
> Even C::M does a bit more:
>
>  ((crc32($key) >> 16) & 0x7fff) % number_of_servers
>
> There's also what's called the Ketama consistent hashing, and there
> you are free to choose any hash function for server names (C::M::F
> uses the same CRC32).


	Yes, I support both of these models and have tests that verify that I
get the same results as the Perl code above and the Ketama C code.
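
	For reference, the quoted expression can be sketched in Python as
below.  This assumes crc32 in that expression is the standard CRC-32
(the one zlib computes); the function name select_server is mine, not
part of any library.

```python
import zlib


def select_server(key: bytes, number_of_servers: int) -> int:
    """Pick a server index as ((crc32(key) >> 16) & 0x7fff) % number_of_servers."""
    crc = zlib.crc32(key) & 0xFFFFFFFF      # force an unsigned 32-bit value
    return ((crc >> 16) & 0x7FFF) % number_of_servers
```

	The CRC-32 of "123456789" is 0xCBF43926, so the intermediate value
is 0x4BF4 (19444), which lands on index 1 with three servers.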

-- 
Dustin Sallings