Sunday, November 18, 2007

Compiling Firefox with tcmalloc

Here's how I compile Firefox with tcmalloc to see whether it helps with the fragmentation problem.

Follow the standard firefox build instructions here: http://developer.mozilla.org/en/docs/Build_Documentation

My .mozconfig:

. $topsrcdir/browser/config/mozconfig

# Options for client.mk.
mk_add_options MOZ_CO_PROJECT=browser
mk_add_options MOZ_OBJDIR=@TOPSRCDIR@/obj-@CONFIG_GUESS@

# Options for 'configure' (same as command-line options).
ac_add_options --with-pthreads
ac_add_options --enable-application=browser


Here's a patch to disable NSPR's zone allocator, because I want everything, even small objects, to be allocated directly via tcmalloc. There's probably a better way, but this was quick and dirty.

Index: nsprpub/pr/include/private/primpl.h
===================================================================
RCS file: /cvsroot/mozilla/nsprpub/pr/include/private/primpl.h,v
retrieving revision 3.87
diff -r3.87 primpl.h
1867c1867
< #define _PR_ZONE_ALLOCATOR
---
> /* # define _PR_ZONE_ALLOCATOR */


Build command to link tcmalloc in:

LDFLAGS=-ltcmalloc make -f client.mk build


To run with the heap profiler I do:

HEAPPROFILE=/tmp/firefox ./firefox


This creates heap dumps in /tmp with names like firefox.0001.heap. By default, a dump is written whenever in-use memory grows by 100 MB, and also whenever a cumulative total of 1 GB of objects has been allocated (regardless of how much is still in use). See http://google-perftools.googlecode.com/svn/trunk/doc/heapprofile.html for the environment variables that change these thresholds, as well as how to examine the heap dumps in pprof.

I don't really have any results to report yet. It would be cool to figure out how to generate something like the images Pavlov has created showing the fragmentation. It shouldn't be that hard to do with tcmalloc or a malloc wrapper. A malloc wrapper would make it easier to compare different implementations, but might be harder to implement without adding a lot of overhead.
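
To make the malloc-wrapper idea a bit more concrete, here is a minimal sketch of what one could look like. It is not anything Firefox or tcmalloc provides: the names wrap_malloc, wrap_free, wrap_stats and g_stats are made up for illustration, it is not thread-safe, and it only tracks byte counts rather than the address-level data a fragmentation picture would need. It does show where the overhead comes from, since every block carries an extra header recording its size.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical bookkeeping; these names are illustrative only. */
typedef struct {
    size_t live_bytes;    /* bytes currently allocated through the wrapper */
    size_t peak_bytes;    /* highest value live_bytes has reached */
    size_t total_allocs;  /* number of successful wrap_malloc calls */
} wrap_stats;

static wrap_stats g_stats;

/* Header placed in front of every block. The union pads it to the
 * strictest alignment so the payload that follows stays aligned. */
typedef union {
    size_t size;
    max_align_t align;
} wrap_header;

/* Allocate through whichever malloc is linked in (libc, tcmalloc, ...). */
void *wrap_malloc(size_t size) {
    if (size > SIZE_MAX - sizeof(wrap_header)) {
        return NULL;  /* header + payload would overflow size_t */
    }
    wrap_header *h = malloc(sizeof(wrap_header) + size);
    if (h == NULL) {
        return NULL;
    }
    h->size = size;
    g_stats.live_bytes += size;
    if (g_stats.live_bytes > g_stats.peak_bytes) {
        g_stats.peak_bytes = g_stats.live_bytes;
    }
    g_stats.total_allocs++;
    return h + 1;  /* payload starts right after the header */
}

/* Free a block obtained from wrap_malloc and update the counters. */
void wrap_free(void *ptr) {
    if (ptr == NULL) {
        return;
    }
    wrap_header *h = (wrap_header *)ptr - 1;
    g_stats.live_bytes -= h->size;
    free(h);
}
```

Because the wrapper just calls whatever malloc and free the program is linked against, the same counters could be collected with the system allocator and with -ltcmalloc and then compared. The per-block header is exactly the kind of overhead mentioned above: it inflates small allocations the most, which may itself change the fragmentation pattern being measured.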

1 comment:

Unknown said...

I'll have images with tcmalloc pretty soon. We've got a new tool to replay allocation logs with different allocators so we can see how they compare. Early reports show both tcmalloc and nedmalloc being about 10% faster on pure malloc/free speed. Not sure how much faster things would be in tests that matter. I'm still hooking up code to get the fragmentation info out of tcmalloc that I need.