public inbox for gcc@gcc.gnu.org
 help / color / mirror / Atom feed
From: Andy Lutomirski <luto@kernel.org>
To: Uros Bizjak <ubizjak@gmail.com>
Cc: Alexander Monakov <amonakov@ispras.ru>,
	Jakub Jelinek <jakub@redhat.com>, X86 ML <x86@kernel.org>,
	GCC Development <gcc@gcc.gnu.org>
Subject: Re: typeof and operands in named address spaces
Date: Thu, 5 Nov 2020 07:27:38 -0800	[thread overview]
Message-ID: <CALCETrW7k=yHLD=M=r0YQLreV+omVjYviA9x7qPivxe13WWx_w@mail.gmail.com> (raw)
In-Reply-To: <CAFULd4b_kBKf0i7f1ytyyJpGwdoWfiGjJ+aY4PLy5cyq5JWxVA@mail.gmail.com>

> On Nov 5, 2020, at 4:26 AM, Uros Bizjak <ubizjak@gmail.com> wrote:
>
> On Thu, Nov 5, 2020 at 1:14 PM Alexander Monakov <amonakov@ispras.ru> wrote:
>
>>> I was also thinking of introducing of operand modifier, but Richi
>>> advises the following:
>>>
>>> --cut here--
>>> typedef __UINTPTR_TYPE__ uintptr_t;
>>>
>>> __seg_fs int x;
>>>
>>> uintptr_t test (void)
>>> {
>>> uintptr_t *p = (uintptr_t *)(uintptr_t) &x;
>>> uintptr_t addr;
>>>
>>> asm volatile ("lea %1, %0" : "=r"(addr) : "m"(*p));
>>>
>>> return addr;
>>> }
>>
>> This is even worse undefined behavior compared to my solution above:
>> this code references memory in uintptr_t type, while mine preserves the
>> original type via __typeof. So this can visibly break with TBAA (though
>> the kernel uses -fno-strict-aliasing, so this particular concern wouldn't
>> apply there).
>
> Agreed, but I was trying to solve this lone use case in the kernel. It
> fits this particular usage, so I found a bit of overkill to implement
> the otherwise useless operand modifier in gcc. As discussed
> previously, these hacks are needed exclusively in asm templates, they
> are not needed in "normal" C code.
>>
>> If you don't care about preserving sizeof and type you can use a cast to char:
>>
>> #define strip_as(mem) (*(char *)(intptr_t)&(mem))
>
> I hope that a developer from kernel can chime in and express their
> opinion on the proposed approaches.
>

I haven’t looked all that closely at precisely what the kernel needs,
but I’ve had bad experiences with passing imprecise things into asm
“m” and “=m” operands. GCC seems to assume, quite reasonably, that if
I pass a value via “m” or “=m”, then I read or write *that value*.
So, if we use type hackery to produce an lvalue or rvalue that has the
address space stripped, then I would imagine I get UB — GCC will try
to understand what value I’m reading or writing, and this will only
match what I’m actually doing by luck.

It’s kind of like doing this (sorry for whitespace damage):

int read_int(int *ptr)
{
int ret; uintptr_t tmp;
asm (
"lea %[val], %[tmp]\n\t"
"mov 4(%[tmp]), %[ret]"
: [ret] "=r" (ret), [tmp] "+r" (tmp)
: [val] "m" (*(ptr - 1)));
return ret;
}

That code is obviously rather contrived, but I think it's
fundamentally the same type of hack as all these typeofs.  I haven't
tested precisely what GCC does, but I suspect we have:

int foo;
read_int(&foo);  // UB

int foo[2];
read_int(foo[1]);  // Maybe UB, but maybe non-UB that returns garbage

So I think a better constraint type would be an improvement.  Or maybe
a more general "pointer" constraint could be invented for this and
other use cases:

[name] "p" (ptr)

With this constraint, ptr must be uintptr_t or intptr_t.  %[name]
refers to ptr, formatted as a dereference operation.  So the generated
asm is identical to [name] "m" (*(char *)ptr), but the semantics are
different.  The problem is that I don't know how to specify the
semantics, but at least the instant UB of building and dereferencing a
garbage pointer would be avoided.

--Andy

  reply	other threads:[~2020-11-05 15:27 UTC|newest]

Thread overview: 28+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2020-11-04 18:31 Uros Bizjak
2020-11-05  7:26 ` Richard Biener
2020-11-05  8:56   ` Uros Bizjak
2020-11-05  9:36     ` Alexander Monakov
2020-11-05 10:33       ` Uros Bizjak
2020-11-05 11:38         ` Alexander Monakov
2020-11-05 12:00           ` Uros Bizjak
2020-11-05 12:14             ` Alexander Monakov
2020-11-05 12:24               ` Richard Biener
2020-11-05 12:32                 ` Uros Bizjak
2020-11-05 12:35                   ` Uros Bizjak
2020-11-05 13:22                     ` Alexander Monakov
2020-11-05 13:39                       ` Alexander Monakov
2020-11-05 13:46                         ` Uros Bizjak
2020-11-05 12:26               ` Uros Bizjak
2020-11-05 15:27                 ` Andy Lutomirski [this message]
2020-11-05 11:03       ` Uros Bizjak
2020-11-05  9:45     ` Richard Biener
2020-11-05  9:51       ` Jakub Jelinek
2020-11-09 12:47 ` Peter Zijlstra
2020-11-09 19:38   ` Segher Boessenkool
2020-11-09 19:50     ` Nick Desaulniers
2020-11-10  7:57       ` Peter Zijlstra
2020-11-10 18:42         ` Nick Desaulniers
2020-11-10 20:11           ` Peter Zijlstra
2020-11-12  0:40             ` Segher Boessenkool
2020-11-12  0:47         ` Segher Boessenkool
2020-11-10  7:52     ` Peter Zijlstra

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to='CALCETrW7k=yHLD=M=r0YQLreV+omVjYviA9x7qPivxe13WWx_w@mail.gmail.com' \
    --to=luto@kernel.org \
    --cc=amonakov@ispras.ru \
    --cc=gcc@gcc.gnu.org \
    --cc=jakub@redhat.com \
    --cc=ubizjak@gmail.com \
    --cc=x86@kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).