public inbox for libc-help@sourceware.org
* Build glibc for aarch64 with C-only versions of memcpy?
@ 2021-12-15 20:31 Miller, Tim
  2021-12-15 20:40 ` Florian Weimer
  0 siblings, 1 reply; 6+ messages in thread
From: Miller, Tim @ 2021-12-15 20:31 UTC
  To: libc-help

Hi,

I am doing some software profiling on an aarch64 system, and I’m using the Linux perf tool. The problem I’m running into is that “__GI___memcpy_simd” keeps showing up as the function with the most CPU usage.

Unfortunately, no matter what I do, this function keeps showing up as orphaned. That is, I cannot get a stack trace for it, so I cannot find out what is calling it. I’ve rebuilt all of glibc with -fno-omit-frame-pointer, but since this function is written in assembly, it doesn’t preserve the frame pointer. (I have also tried using DWARF instead of the frame pointer to get stack traces, but that usually gets overloaded, and I still don’t get a stack trace for memcpy.)

I’ve spent about a day now trying to hack it so that the C version in string/memcpy.c gets called instead, but everything I try leads to build errors of some sort, either undefined symbols or multiply defined symbols. Basically, I can’t figure out how to untangle all the abstraction involved in picking one memcpy implementation or another.

What is the proper way to disable all assembly versions of memcpy in a custom-built glibc?

Thanks.
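
For illustration only, and not an answer given anywhere in this thread: the kind of routine the question asks for is a copy loop written in plain C, so that the compiler rather than hand-written assembly decides how the frame is set up. The name plain_c_memcpy is hypothetical, and this sketch says nothing about how glibc's own build selects string/memcpy.c. The compiler flags mentioned in the comments are assumptions to check against your compiler's documentation.

```c
#include <stddef.h>

/*
 * Hypothetical stand-in for a C-only memcpy; not part of glibc.
 *
 * Assumed build flags (verify for your toolchain):
 *   -fno-omit-frame-pointer              keep frame pointers in general
 *   -mno-omit-leaf-frame-pointer         aarch64 GCC: keep them in leaf
 *                                        functions such as this one
 *   -fno-tree-loop-distribute-patterns   stop GCC from turning the loop
 *                                        below back into a memcpy call
 */
void *plain_c_memcpy(void *restrict dst, const void *restrict src, size_t n)
{
    unsigned char *d = dst;
    const unsigned char *s = src;

    /* Copy one byte at a time; slow, but easy for an unwinder to follow. */
    while (n--)
        *d++ = *s++;

    return dst;
}
```

A byte loop like this is far slower than the tuned assembly, so a profile taken with it would show where the calls come from, not what they cost in a normal build.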


* Re: Build glibc for aarch64 with C-only versions of memcpy?
  2021-12-15 20:31 Build glibc for aarch64 with C-only versions of memcpy? Miller, Tim
@ 2021-12-15 20:40 ` Florian Weimer
  2021-12-15 21:13   ` Miller, Tim
  2021-12-15 21:28   ` Miller, Tim
  0 siblings, 2 replies; 6+ messages in thread
From: Florian Weimer @ 2021-12-15 20:40 UTC
  To: Miller, Tim via Libc-help

* Tim Miller via Libc-help:

> I am doing some software profiling on an aarch64 system, and I’m using
> the Linux perf tool. The problem I’m running into is that
> “__GI___memcpy_simd” keeps showing up as the function with the most
> CPU usage.

This sounds like a perf bug.  Surely it can look at the LR register and
see where the call is coming from?

If it shows up so high in profiles, I suggest attaching a debugger,
seeing whether you can get it to stop in the memcpy call, and taking
things from there.

Thanks,
Florian
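
A rough sketch of the link-register idea above, for illustration only and not something proposed in this thread: a wrapper can record its own return address, which on aarch64 is the value the caller left in LR, and count calls per call site. The names counted_memcpy, caller_count, callers and MAX_CALLERS are hypothetical. __builtin_return_address is a GCC/Clang builtin. The table is a fixed-size linear scan and is not thread-safe.

```c
#include <stddef.h>
#include <string.h>

#define MAX_CALLERS 64  /* hypothetical table size */

/* One entry per distinct call site, keyed by return address. */
struct caller_count {
    void *ret_addr;       /* address the wrapper will return to */
    unsigned long calls;  /* number of calls from that site */
    size_t bytes;         /* total bytes copied from that site */
};

static struct caller_count callers[MAX_CALLERS];

/* noinline so that the return address really belongs to the caller. */
__attribute__((noinline))
void *counted_memcpy(void *dst, const void *src, size_t n)
{
    void *ra = __builtin_return_address(0);

    /* Find this call site's slot, or claim the first empty one.
       Call sites beyond MAX_CALLERS are silently not counted. */
    for (size_t i = 0; i < MAX_CALLERS; i++) {
        if (callers[i].ret_addr == ra || callers[i].ret_addr == NULL) {
            callers[i].ret_addr = ra;
            callers[i].calls++;
            callers[i].bytes += n;
            break;
        }
    }

    return memcpy(dst, src, n);
}
```

Dumping the table at exit and resolving the addresses with a symbolizer would show which call sites copy most often. Routing an existing program's memcpy calls through such a wrapper is a separate problem that this sketch does not address.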


* Re: Build glibc for aarch64 with C-only versions of memcpy?
  2021-12-15 20:40 ` Florian Weimer
@ 2021-12-15 21:13   ` Miller, Tim
  2021-12-15 21:28   ` Miller, Tim
  1 sibling, 0 replies; 6+ messages in thread
From: Miller, Tim @ 2021-12-15 21:13 UTC
  To: Florian Weimer, Miller, Tim via Libc-help



On 12/15/21, 3:41 PM, "Florian Weimer" <fweimer@redhat.com> wrote:

    * Tim Miller via Libc-help:

    > I am doing some software profiling on an aarch64 system, and I’m using
    > the Linux perf tool. The problem I’m running into is that
    > “__GI___memcpy_simd” keeps showing up as the function with the most
    > CPU usage.

    This sounds like a perf bug.  Surely it can look at the LR register and
    see where the call is coming from?

Are you referring to the LBR mode that perf supports? When I try to select that, I get this error:

Error:
PMU Hardware doesn't support sampling/overflow-interrupts.

    If it shows up so high in profiles, I suggest attaching a debugger,
    seeing whether you can get it to stop in the memcpy call, and taking
    things from there.

It's not that high. It's 3 to 4 percent. But it's the top user and MAY be one of the things contributing to a 10% performance regression I'm seeing.

I'll try the debugger. Thanks!

    Thanks,
    Florian



* Re: Build glibc for aarch64 with C-only versions of memcpy?
  2021-12-15 20:40 ` Florian Weimer
  2021-12-15 21:13   ` Miller, Tim
@ 2021-12-15 21:28   ` Miller, Tim
  2021-12-15 21:33     ` Florian Weimer
  1 sibling, 1 reply; 6+ messages in thread
From: Miller, Tim @ 2021-12-15 21:28 UTC
  To: Florian Weimer, Miller, Tim via Libc-help



On 12/15/21, 3:41 PM, "Florian Weimer" <fweimer@redhat.com> wrote:


    * Tim Miller via Libc-help:

    > I am doing some software profiling on an aarch64 system, and I’m using
    > the Linux perf tool. The problem I’m running into is that
    > “__GI___memcpy_simd” keeps showing up as the function with the most
    > CPU usage.

    This sounds like a perf bug.  Surely it can look at the LR register and
    see where the call is coming from?

    If it shows up so high in profiles, I suggest attaching a debugger,
    seeing whether you can get it to stop in the memcpy call, and taking
    things from there.

Well, I tried that, but it keeps stopping at that breakpoint over and over again. I do get a stack trace, but what I need to find out is who is calling it MOST.

    Thanks,
    Florian



* Re: Build glibc for aarch64 with C-only versions of memcpy?
  2021-12-15 21:28   ` Miller, Tim
@ 2021-12-15 21:33     ` Florian Weimer
  2021-12-15 21:37       ` Miller, Tim
  0 siblings, 1 reply; 6+ messages in thread
From: Florian Weimer @ 2021-12-15 21:33 UTC
  To: Miller, Tim; +Cc: Miller, Tim via Libc-help

* Tim Miller:

> Are you referring to the LBR mode that perf supports?

Not just that.  There are other call graph modes.  For example, you
could use a distribution that builds everything with DWARF unwinding
data, and use DWARF unwinding.

> Well, I tried that, but it keeps stopping at that breakpoint over and
> over again. I do get a stack trace, but what I need to find out is who
> is calling it MOST.

Don't set a breakpoint.  Just hit ^C from time to time to see if you
land in the function.  But at 3% to 4%, this either won't work or will
be very tedious.

Thanks,
Florian


* Re: Build glibc for aarch64 with C-only versions of memcpy?
  2021-12-15 21:33     ` Florian Weimer
@ 2021-12-15 21:37       ` Miller, Tim
  0 siblings, 0 replies; 6+ messages in thread
From: Miller, Tim @ 2021-12-15 21:37 UTC
  To: Florian Weimer; +Cc: Miller, Tim via Libc-help



On 12/15/21, 4:33 PM, "Florian Weimer" <fweimer@redhat.com> wrote:

    * Tim Miller:

    > Are you referring to the LBR mode that perf supports?

    Not just that.  There are other call graph modes.  For example, you
    could use a distribution that builds everything with DWARF unwinding
    data, and use DWARF unwinding.

I have tried using DWARF mode, but it always gets overloaded.

Anyhow, I'll contact the perf mailing list to see if I can get some help from them.

It would just be nice if glibc had a build option that minimized the use of asm code, for debugging purposes.

    Thanks,
    Florian


