[From nobody Sun Dec 12 22:28:42 2004
Message-ID: <41BD35F6.2040800@newsmonster.org>
Date: Sun, 12 Dec 2004 22:25:58 -0800
From: "Kevin A. Burton" <burton@newsmonster.org>
To: Greg Whalin <greg@meetup.com>
Cc: dev@rojo.com
Subject: [Dev] memcached option to always use non-serialized primitive types?

(sorry for the long email)

Currently the Memcached driver for Java supports the setSerialize() option. This can increase performance in some situations but has a few issues:

Code that casts the result of get() will throw a ClassCastException when the setSerialize() option is used to turn serialization off. For example:

    mc.set( "foo", new Integer( 1 ) );
    Integer output = (Integer)mc.get( "foo" );

This works just fine when setSerialize is true, but when it's false it throws a ClassCastException.

Also, internally it doesn't support Boolean, and since toString() is called it wastes a lot of memory and causes additional performance issues. For example, an Integer can take anywhere from 1 byte to 10 bytes.

Due to the way the memcached slab allocator works, storing primitive types as serialized objects seems like a LOT of waste (from both a performance and a memory perspective). In our applications we have millions of small objects, and wasted memory would become a big problem.

For example, a serialized Boolean takes 47 bytes, which means it will fit into the 64-byte LRU. Using 1 byte means it will fit into the 8-byte LRU, an 8x memory saving. This also saves CPU, since we don't have to serialize objects back and forth and can compute the byte[] value directly.

One problem would be when the user calls get(), because doing so would require the app to know the type of the object stored as a byte array inside memcached (since the user will probably cast).

If we assume the basic types are interned, we could use the first byte as the type, with the remaining bytes as the value. Then on get() we could read the first byte to determine the type and construct the correct object for it. This would prevent the ClassCastException I talked about above.

We could remove the setSerialize() option and just assume that standard VM types are always interned in this manner.

    mc.set( "foo", Boolean.TRUE );
    Boolean b = (Boolean)mc.get( "foo" );

The type casts would work because internally we would create a new Boolean to return back to the client. This would reduce the memory footprint and allow for a virtual implementation of the Externalizable interface, which is much faster than Serialization.

Currently the memory improvements would be:

    java.lang.Boolean - 8x memory improvement (now just two bytes)
    java.lang.Integer - 16x memory improvement (now just 5 bytes)

Most of the other primitive types would benefit from this optimization.
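
The tagged layout proposed above (first byte names the type, the remaining bytes hold the value) can be sketched in a few lines of Java. This is only an illustration under stated assumptions: the class name TaggedPrimitiveCodec, the tag constants, and the big-endian value layout are all hypothetical and are not part of the memcached Java client, which would need to call something like this from its own set() and get() paths.

```java
import java.nio.ByteBuffer;

/**
 * Hypothetical codec for illustration only (not part of the memcached
 * Java client): one tag byte followed by the raw big-endian value.
 */
final class TaggedPrimitiveCodec {

    // Tag values are arbitrary assumptions made for this sketch.
    static final byte TAG_BOOLEAN = 1;
    static final byte TAG_INTEGER = 2;
    static final byte TAG_CHARACTER = 3;
    static final byte TAG_LONG = 4;

    /** Encodes a supported boxed type as [tag][value bytes]. */
    static byte[] encode(Object value) {
        if (value instanceof Boolean) {
            byte flag = (byte) (((Boolean) value).booleanValue() ? 1 : 0);
            return new byte[] { TAG_BOOLEAN, flag };
        }
        if (value instanceof Integer) {
            return ByteBuffer.allocate(5).put(TAG_INTEGER)
                    .putInt(((Integer) value).intValue()).array();
        }
        if (value instanceof Character) {
            return ByteBuffer.allocate(3).put(TAG_CHARACTER)
                    .putChar(((Character) value).charValue()).array();
        }
        if (value instanceof Long) {
            return ByteBuffer.allocate(9).put(TAG_LONG)
                    .putLong(((Long) value).longValue()).array();
        }
        // Anything else would fall back to normal serialization in a real client.
        throw new IllegalArgumentException("unsupported value: " + value);
    }

    /** Reads the tag byte and rebuilds the matching boxed object. */
    static Object decode(byte[] data) {
        ByteBuffer buf = ByteBuffer.wrap(data);
        byte tag = buf.get();
        switch (tag) {
            case TAG_BOOLEAN:
                return Boolean.valueOf(buf.get() != 0);
            case TAG_INTEGER:
                return Integer.valueOf(buf.getInt());
            case TAG_CHARACTER:
                return Character.valueOf(buf.getChar());
            case TAG_LONG:
                return Long.valueOf(buf.getLong());
            default:
                throw new IllegalArgumentException("unknown type tag: " + tag);
        }
    }

    public static void main(String[] args) {
        byte[] stored = encode(Boolean.TRUE);   // bytes that would be sent to memcached
        Boolean b = (Boolean) decode(stored);   // the cast succeeds on the way back
        System.out.println(b + " in " + stored.length + " bytes");   // true in 2 bytes

        byte[] storedInt = encode(Integer.valueOf(1));
        Integer i = (Integer) decode(storedInt);
        System.out.println(i + " in " + storedInt.length + " bytes"); // 1 in 5 bytes
    }
}
```

With this layout a Boolean occupies two bytes and an Integer five, matching the sizes quoted above, and the caller's cast succeeds because decode() rebuilds the original boxed type from the tag.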
java.lang.Character is another obvious example.

Thoughts? I was thinking about going ahead and performing this optimization, since looking at the code it seems straightforward. I'd of course want you to agree that this is a good idea so you'd be eager to accept the patch :)

I know it seems like I'm being really picky here, but for our application I'd save 1G of memory right off the bat. We'd go down from 1.152G of memory used to 144M, which is much better IMO.

Kevin
]