On Thu, 1 Oct 2015 09:17:42 +1000
Dave Chinner <david(a)fromorbit.com> wrote:
On Wed, Sep 30, 2015 at 06:03:59AM -0400, Jeff Layton wrote:
> Thanks for testing it and catching the problem in the first place!
> FWIW, the problem seems to have been bad hash distribution generated by
> hash_ptr on struct inode pointers. When the cache had ~10000 entries in
> it total, one of the hash chains had almost 2000 entries. When I
> switched to hashing on inode->i_ino, the distribution was much better.
> I'm not sure if it was just rotten luck or there is something about
> inode pointers that makes hash_ptr generate a lot of duplicates. That
> really could use more investigation...
Inode pointers have no entropy in the lower 9-10 bits because of
their size, and being allocated from a slab they are all going to
have the same set of values in the next 3-4 bits (i.e. offset into
the slab page which is defined by sizeof(inode)). Pointers also
have very similar upper bits, too, because they are all in kernel memory.
hash_64 tries to fold all the entropy from the lower bits into
the upper bits and then takes the result from the upper bits. Hence
if there is no entropy in either the lower or upper bits to start
with, then the hash may not end up with much entropy in it at all...
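To make the point about pointer entropy concrete, here is a minimal userspace sketch, not kernel code. It assumes a hypothetical 1024-byte object packed into one-page slabs laid out back to back from an assumed kernel-style base address; SIM_OBJ_SIZE, SIM_SLAB_SIZE, SIM_BASE and both function names are invented for illustration, and real slab pages are not guaranteed to be contiguous. Under those assumptions, 10000 simulated addresses differ only in bits 10 to 23, so both the low and the high bits are constant.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical layout values, chosen only for illustration. */
#define SIM_OBJ_SIZE   1024u                  /* assumed object size, not the real sizeof(struct inode) */
#define SIM_SLAB_SIZE  4096u                  /* assumed slab size of one page */
#define SIM_BASE       0xffff880000000000ull  /* assumed kernel-style base address */

/* Address of the idx-th object if objects are packed back to back in slabs. */
static uint64_t sim_obj_addr(size_t idx)
{
    size_t per_slab = SIM_SLAB_SIZE / SIM_OBJ_SIZE;

    return SIM_BASE + (uint64_t)(idx / per_slab) * SIM_SLAB_SIZE
                    + (uint64_t)(idx % per_slab) * SIM_OBJ_SIZE;
}

/* Mask of the bit positions that differ across the first n simulated addresses. */
static uint64_t sim_varying_bits(size_t n)
{
    uint64_t first, mask = 0;
    size_t i;

    assert(n > 0);
    first = sim_obj_addr(0);
    for (i = 1; i < n; i++)
        mask |= sim_obj_addr(i) ^ first;  /* collect every bit that ever changes */
    return mask;
}
```

Calling sim_varying_bits(10000) shows which bits a hash function actually has to work with; any bit outside that mask contributes nothing to the result.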
FWIW, see fs/inode.c::hash() to see how the fs code hashes inode
numbers (called from insert_inode_hash()). It's very different
because inode numbers have the majority of their entropy in
the lower bits and (usually) none in the upper bits...
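As a rough illustration of where the entropy sits, the sketch below buckets keys by their low bits only. This is deliberately not the hash() from fs/inode.c, just a plain mask used to compare two kinds of keys: sequential inode-number-like values, and pointer-like values spaced an assumed 1024 bytes apart. SIM_BASE, SIM_STRIDE and the function names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

#define SIM_BASE    0xffff880000000000ull  /* assumed kernel-style base address */
#define SIM_STRIDE  1024u                  /* assumed spacing between objects */

/* i-th simulated key: a pointer-like address, or an inode number starting at 1. */
static uint64_t sim_key(int pointer_like, size_t i)
{
    if (pointer_like)
        return SIM_BASE + (uint64_t)i * SIM_STRIDE;
    return (uint64_t)i + 1;
}

/* Longest chain when n simulated keys are bucketed by their low `bits` bits. */
static size_t sim_longest_chain(int pointer_like, size_t n, unsigned bits)
{
    size_t nbuckets, i, longest = 0;
    size_t *counts;

    assert(bits >= 1 && bits <= 16);
    nbuckets = (size_t)1 << bits;
    counts = calloc(nbuckets, sizeof(*counts));
    assert(counts != NULL);

    for (i = 0; i < n; i++) {
        size_t b = (size_t)(sim_key(pointer_like, i) & (nbuckets - 1));

        if (++counts[b] > longest)
            longest = counts[b];
    }
    free(counts);
    return longest;
}
```

With 10000 keys and 256 buckets, the sequential keys spread almost evenly, while every pointer-like key lands in the same bucket because its low 8 bits never change. That is why a scheme tuned for inode numbers and one tuned for pointers have to look at opposite ends of the value.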
Thanks for the explanation, Dave. That makes sense.
In hindsight I should have looked at how the vfs code hashes inodes in
its hashtable. Given that we're basically creating "shadow" inode
structures here, that would probably work fairly well.
Jeff Layton <jeff.layton(a)primarydata.com>