From: Adhemerval Zanella <adhemerval.zanella@linaro.org>
To: libc-alpha@sourceware.org
Subject: Re: [PATCH] powerpc: Use aligned stores in memset
Date: Fri, 18 Aug 2017 12:13:00 -0000
Message-ID: <dd9f2a2a-6e10-9b34-8538-33ac75608459@linaro.org>
In-Reply-To: <e7daca03-3e86-8cdf-9d42-4e7effb02c63@redhat.com>



On 18/08/2017 06:10, Florian Weimer wrote:
> On 08/18/2017 08:51 AM, Rajalakshmi Srinivasaraghavan wrote:
>>
>>
>> On 08/18/2017 11:51 AM, Florian Weimer wrote:
>>> On 08/18/2017 07:11 AM, Rajalakshmi Srinivasaraghavan wrote:
>>>>     * sysdeps/powerpc/powerpc64/power8/memset.S: Store byte by byte
>>>>     for unaligned inputs if size is less than 8.
>>>
>>> This makes me rather nervous.  powerpc64le was supposed to have
>>> reasonable efficient unaligned loads and stores.  GCC happily generates
>>> them, too.
>>
>> This is meant ONLY for caching inhibited accesses.  Caching Inhibited
>> accesses are required to be Guarded and properly aligned.
> 
> The intent is to support memset for such memory regions, right?  This
> change is insufficient.  You have to fix GCC as well because it will
> inline memset of unaligned pointers, like this:
> 
> typedef long __attribute__ ((aligned(1))) long_unaligned;
> 
> void
> clear (long_unaligned *p)
> {
>   memset (p, 0, sizeof (*p));
> }
> 
> clear:
> 	li 9,0
> 	std 9,0(3)
> 	blr
> 
> That's why I think your change is not useful in isolation.


POWER8 does have fast unaligned memory access, and in fact unaligned accesses
could be used to provide a faster memcpy/memmove implementation (I wrote
one some time ago that I never sent upstream [1]). Unaligned accesses are
used extensively in some optimized str* implementations I created for POWER8.
They also allow GCC to use unaligned accesses for builtin mem* operations
without issue in *most* cases.

The problem is that memset/memcpy/memmove *specifically* are used in some
userland drivers for DMA (some XORG drivers, if I recall correctly), and for
these specific use cases unaligned accesses, especially vector ones, will
cause the kernel to trap on *every* unaligned instruction, leading to abysmal
performance. That's why I pushed 87868c2418fb74357757e3b739ce5b76b17a8929
to fix this very issue for POWER7 memcpy.

We already discussed this same issue some time ago [2] to try to overcome
this limitation. I think ideally the drivers that rely on aligned mem*
operations should use their own mem* implementations (similar to what dpdk
does [3]).
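
As a rough illustration of what a driver-private routine could look like,
here is a minimal sketch in C. The name aligned_memset and the u64_alias
typedef are hypothetical and not part of glibc or any driver. It assumes
GCC or a compatible compiler (for the may_alias attribute) and uses
volatile accesses so the compiler does not merge the stores or turn the
loops back into a memset call. It only shows the aligned-store idea and is
not tuned for performance.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper, not part of glibc: fill N bytes at DST with C
   using only naturally aligned stores.  */
typedef uint64_t __attribute__ ((may_alias)) u64_alias;

void *
aligned_memset (void *dst, int c, size_t n)
{
  volatile unsigned char *p = dst;
  unsigned char b = (unsigned char) c;

  /* Byte stores are always aligned; use them until P is 8-byte aligned.  */
  while (n > 0 && ((uintptr_t) p & 7) != 0)
    {
      *p++ = b;
      n--;
    }

  /* Replicate the byte into all eight lanes of a doubleword.  */
  uint64_t pattern = (uint64_t) b * UINT64_C (0x0101010101010101);

  /* P is now 8-byte aligned (or N is 0), so doubleword stores are aligned.  */
  volatile u64_alias *q = (volatile u64_alias *) p;
  while (n >= 8)
    {
      *q++ = pattern;
      n -= 8;
    }

  /* Finish the tail (fewer than 8 bytes) with byte stores.  */
  p = (volatile unsigned char *) q;
  while (n > 0)
    {
      *p++ = b;
      n--;
    }

  return dst;
}
```

A driver would call this instead of memset on the cache-inhibited mapping.
Note that this does not stop GCC from inlining plain memset calls elsewhere
in the driver, which is the point Florian raised above.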

[1] https://github.com/zatrazz/glibc/commits/memopt-power8
[2] https://sourceware.org/ml/libc-alpha/2015-01/msg00130.html
[3] http://dpdk.org/browse/dpdk/tree/lib/librte_eal/common/include/arch/ppc_64/rte_memcpy.h

