On Tue, 2018-02-27 at 13:43 +0000, David Howells wrote:
Jeff Layton <jlayton(a)redhat.com> wrote:
> 0xffffffff813ae828 <+136>: je 0xffffffff813ae83a
> 0xffffffff813ae82a <+138>: mov 0x150(%rbp),%rcx
> 0xffffffff813ae831 <+145>: shr %rcx
> 0xffffffff813ae834 <+148>: cmp %rcx,0x20(%rax)
> 0xffffffff813ae838 <+152>: je 0xffffffff813ae862
Is it possible there's a stall between the load of RCX and the subsequent
instructions because they all have to wait for RCX to become available?
The interleaving between operating on RSI and RCX in the older code might
In addition, the load of the 0x20(%rax) value is now done in the CMP instruction
rather than earlier, so it might not get speculatively loaded in time, whereas
the earlier code explicitly loads it up front.
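
For illustration only, here is a rough C sketch of the two orderings being
compared. The struct and function names are hypothetical stand-ins, not the
kernel's real definitions, and the mapping of 0x150(%rbp) and 0x20(%rax) to a
raw version counter and a cached copy is an assumption based on the quoted
disassembly. The compiler is free to schedule either function however it
likes, so this only shows the dependency shape, not a guaranteed instruction
sequence.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-ins for the two objects in the disassembly. */
struct fake_inode { uint64_t i_version; }; /* raw counter, low bit assumed to be a flag */
struct fake_iint  { uint64_t version;   }; /* cached value, assumed already shifted */

/*
 * Shape of the quoted sequence: load the raw counter, shift it, then
 * compare against a memory operand. The shift and the compare both
 * depend on the first load completing.
 */
static bool eq_load_shift_cmp(const struct fake_inode *inode,
                              const struct fake_iint *iint)
{
    uint64_t cur = inode->i_version;   /* mov 0x150(%rbp),%rcx */
    cur >>= 1;                         /* shr %rcx             */
    return cur == iint->version;       /* cmp %rcx,0x20(%rax)  */
}

/*
 * Alternative shape: issue both loads first, so the cached value can be
 * in flight while the shift waits on the raw counter.
 */
static bool eq_loads_up_front(const struct fake_inode *inode,
                              const struct fake_iint *iint)
{
    uint64_t cached = iint->version;   /* explicit early load */
    uint64_t cur = inode->i_version;
    return (cur >> 1) == cached;
}

/* Both shapes must give the same answer; only the timing could differ. */
static void check_variants_agree(void)
{
    struct fake_inode inode = { .i_version = (7u << 1) | 1u };
    struct fake_iint same = { .version = 7 };
    struct fake_iint stale = { .version = 6 };

    assert(eq_load_shift_cmp(&inode, &same) == eq_loads_up_front(&inode, &same));
    assert(eq_load_shift_cmp(&inode, &stale) == eq_loads_up_front(&inode, &stale));
}
```

Whether the second shape is actually any faster would have to be measured;
an out-of-order core may well hide the difference.
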
Thanks David, that makes sense.
At this point, I think we ought to wait and see what the results look
like without IMA compiled in at all.
It's possible we're misunderstanding this completely. At most, we'll be
hitting this once on every close of a file. That doesn't seem like it
ought to be causing something this noticeable, though.
Jeff Layton <jlayton(a)redhat.com>