public inbox for gcc@gcc.gnu.org
* typeof and operands in named address spaces
@ 2020-11-04 18:31 Uros Bizjak
  2020-11-05  7:26 ` Richard Biener
  2020-11-09 12:47 ` Peter Zijlstra
  0 siblings, 2 replies; 28+ messages in thread
From: Uros Bizjak @ 2020-11-04 18:31 UTC (permalink / raw)
  To: GCC Development; +Cc: X86 ML, Jakub Jelinek, Andy Lutomirski

Hello!

I was looking at the recent Linux patch series [1] where segment
qualifiers (named address spaces) were introduced to handle percpu
variables. In the patch [2], the author mentions that:

--q--
Unfortunately, gcc does not provide a way to remove segment
qualifiers, which is needed to use typeof() to create local instances
of the per-cpu variable. For this reason, do not use the segment
qualifier for per-cpu variables, and do casting using the segment
qualifier instead.
--/q--

The core of the problem can be seen with the following testcase:

--cut here--
#define foo(_var)                    \
  ({                            \
    typeof(_var) tmp__;                    \
    asm ("mov %1, %0" : "=r"(tmp__) : "m"(_var));    \
    tmp__;                        \
  })

__seg_fs int x;

int test (void)
{
  int y;

  y = foo (x);
  return y;
}
--cut here--

When compiled with -O2 for an x86 target, the compiler reports:

pcpu.c: In function ‘test’:
pcpu.c:14:3: error: ‘__seg_fs’ specified for auto variable ‘tmp__’

It seems to me that the compiler should remove address space
information when typeof is used; otherwise, there is no way to use
typeof as intended in the above example.

A related problem is exposed when we want to cast an address from a
named address space to the generic address space (e.g. to use it with
LEA):

--cut here--
typedef __UINTPTR_TYPE__ uintptr_t;

__seg_fs int x;

uintptr_t test (void)
{
  uintptr_t *p = (uintptr_t *) &x;
  uintptr_t addr;

  asm volatile ("lea %1, %0" : "=r"(addr) : "m"(*p));

  return addr;
}
--cut here--

The gcc documentation advises explicit casts:

--q--
This means that explicit casts are required to convert pointers
between these address spaces and the generic address space.  In
practice the application should cast to 'uintptr_t' and apply the
segment base offset that it installed previously.
--/q--

However, a warning is emitted when compiling the above example:

pcpu1.c: In function ‘test’:
pcpu1.c:7:18: warning: cast to generic address space pointer from
disjoint __seg_fs address space pointer

but the desired result is obtained nevertheless.

       lea x(%rip), %rax

As shown in the referenced patch set, named address spaces have
considerable optimization potential; please see [1] for the list.

[1] https://www.mail-archive.com/linux-kernel@vger.kernel.org/msg2053461.html
[2] https://www.mail-archive.com/linux-kernel@vger.kernel.org/msg2053462.html

Uros.


* Re: typeof and operands in named address spaces
  2020-11-04 18:31 typeof and operands in named address spaces Uros Bizjak
@ 2020-11-05  7:26 ` Richard Biener
  2020-11-05  8:56   ` Uros Bizjak
  2020-11-09 12:47 ` Peter Zijlstra
  1 sibling, 1 reply; 28+ messages in thread
From: Richard Biener @ 2020-11-05  7:26 UTC (permalink / raw)
  To: Uros Bizjak; +Cc: GCC Development, Jakub Jelinek, X86 ML, Andy Lutomirski

On Wed, Nov 4, 2020 at 7:33 PM Uros Bizjak via Gcc <gcc@gcc.gnu.org> wrote:
>
> Hello!
>
> I was looking at the recent linux patch series [1] where segment
> qualifiers (named address spaces) were introduced to handle percpu
> variables. In the patch [2], the author mentions that:
>
> --q--
> Unfortunately, gcc does not provide a way to remove segment
> qualifiers, which is needed to use typeof() to create local instances
> of the per-cpu variable. For this reason, do not use the segment
> qualifier for per-cpu variables, and do casting using the segment
> qualifier instead.
> --/q--
>
> The core of the problem can be seen with the following testcase:
>
> --cut here--
> #define foo(_var)                    \
>   ({                            \
>     typeof(_var) tmp__;                    \

Looks like writing

    typeof((typeof(_var))0) tmp__;

makes it work.  This assumes there's a literal zero for the type, of course.
Basically I'm trying to get at an rvalue for the typeof.

Is there a way to query the address space of an object so I can
put another variable in the same address space?

>     asm ("mov %1, %0" : "=r"(tmp__) : "m"(_var));    \
>     tmp__;                        \
>   })
>
> __seg_fs int x;
>
> int test (void)
> {
>   int y;
>
>   y = foo (x);
>   return y;
> }
> --cut here--
>
> when compiled with -O2 for x86 target, the compiler reports:
>
> pcpu.c: In function ‘test’:
> pcpu.c:14:3: error: ‘__seg_fs’ specified for auto variable ‘tmp__’
>
> It looks to me that the compiler should remove address space
> information when typeof is used, otherwise, there is no way to use
> typeof as intended in the above example.
>
> A related problem is exposed when we want to cast address from the
> named address space to a generic address space (e.g. to use it with
> LEA):
>
> --cut here--
> typedef __UINTPTR_TYPE__ uintptr_t;
>
> __seg_fs int x;
>
> uintptr_t test (void)
> {
>   uintptr_t *p = (uintptr_t *) &x;

   uintptr_t *p = (uintptr_t *)(uintptr_t) &x;

works around the warning.  I think the wording you cite
suggests (uintptr_t) &x here; I'm not sure if there's a reliable
way to get the lea with just a uintptr_t operand, though.



* Re: typeof and operands in named address spaces
  2020-11-05  7:26 ` Richard Biener
@ 2020-11-05  8:56   ` Uros Bizjak
  2020-11-05  9:36     ` Alexander Monakov
  2020-11-05  9:45     ` Richard Biener
  0 siblings, 2 replies; 28+ messages in thread
From: Uros Bizjak @ 2020-11-05  8:56 UTC (permalink / raw)
  To: Richard Biener; +Cc: GCC Development, Jakub Jelinek, X86 ML, Andy Lutomirski

On Thu, Nov 5, 2020 at 8:26 AM Richard Biener
<richard.guenther@gmail.com> wrote:
>
> On Wed, Nov 4, 2020 at 7:33 PM Uros Bizjak via Gcc <gcc@gcc.gnu.org> wrote:
> >
> > Hello!
> >
> > I was looking at the recent linux patch series [1] where segment
> > qualifiers (named address spaces) were introduced to handle percpu
> > variables. In the patch [2], the author mentions that:
> >
> > --q--
> > Unfortunately, gcc does not provide a way to remove segment
> > qualifiers, which is needed to use typeof() to create local instances
> > of the per-cpu variable. For this reason, do not use the segment
> > qualifier for per-cpu variables, and do casting using the segment
> > qualifier instead.
> > --/q--
> >
> > The core of the problem can be seen with the following testcase:
> >
> > --cut here--
> > #define foo(_var)                    \
> >   ({                            \
> >     typeof(_var) tmp__;                    \
>
> Looks like writing
>
>     typeof((typeof(_var))0) tmp__;
>
> makes it work.  Assumes there's a literal zero for the type of course.

This is a very limiting assumption, which already breaks for the following test:

--cut here--
typedef struct { short a; short b; } pair_t;

#define foo(_var)                     \
  ({                             \
    typeof((typeof(_var))0) tmp__;             \
    asm ("mov %1, %0" : "=r"(tmp__) : "m"(_var));    \
    tmp__;                         \
  })

__seg_fs pair_t x;

pair_t
test (void)
{
  pair_t y;

  y = foo (x);
  return y;
}
--cut here--

So, what about introducing e.g. typeof_noas (not sure about the name)
that would simply strip the address space from typeof?

> Basically I try to get at a rvalue for the typeof.
>
> Is there a way to query the address space of an object so I can
> put another variable in the same address space?

I think that would go hand in hand with the above typeof_noas. Perhaps
typeof_as, that would return the address space of the variable?

> >     asm ("mov %1, %0" : "=r"(tmp__) : "m"(_var));    \
> >     tmp__;                        \
> >   })
> >
> > __seg_fs int x;
> >
> > int test (void)
> > {
> >   int y;
> >
> >   y = foo (x);
> >   return y;
> > }
> > --cut here--
> >
> > when compiled with -O2 for x86 target, the compiler reports:
> >
> > pcpu.c: In function ‘test’:
> > pcpu.c:14:3: error: ‘__seg_fs’ specified for auto variable ‘tmp__’
> >
> > It looks to me that the compiler should remove address space
> > information when typeof is used, otherwise, there is no way to use
> > typeof as intended in the above example.
> >
> > A related problem is exposed when we want to cast address from the
> > named address space to a generic address space (e.g. to use it with
> > LEA):
> >
> > --cut here--
> > typedef __UINTPTR_TYPE__ uintptr_t;
> >
> > __seg_fs int x;
> >
> > uintptr_t test (void)
> > {
> >   uintptr_t *p = (uintptr_t *) &x;
>
>    uintptr_t *p = (uintptr_t *)(uintptr_t) &x;

Indeed, this works as expected.

> works around the warning.  I think the wording you cite
> suggests (uintptr_t) &x here, not sure if there's a reliable
> way to get the lea with just a uintptr_t operand though.

No, because we have to use the "m" constraint for the LEA. We get the
following error:

as1.c:10:49: error: memory input 1 is not directly addressable

Uros.


* Re: typeof and operands in named address spaces
  2020-11-05  8:56   ` Uros Bizjak
@ 2020-11-05  9:36     ` Alexander Monakov
  2020-11-05 10:33       ` Uros Bizjak
  2020-11-05 11:03       ` Uros Bizjak
  2020-11-05  9:45     ` Richard Biener
  1 sibling, 2 replies; 28+ messages in thread
From: Alexander Monakov @ 2020-11-05  9:36 UTC (permalink / raw)
  To: Uros Bizjak
  Cc: Richard Biener, Jakub Jelinek, GCC Development, X86 ML, Andy Lutomirski

On Thu, 5 Nov 2020, Uros Bizjak via Gcc wrote:

> > Looks like writing
> >
> >     typeof((typeof(_var))0) tmp__;
> >
> > makes it work.  Assumes there's a literal zero for the type of course.
> 
> This is very limiting assumption, which already breaks for the following test:

To elaborate on Richard's idea, you need a way to decay an lvalue to an rvalue
inside the typeof to strip the address space; if you need the macro to work for
more types than just scalar types, the following expression may be useful:

  typeof(0?(_var):(_var))

(though there's a bug: +(_var) should also suffice for scalar types, but
 somehow GCC keeps the address space on the resulting rvalue)

But I wonder if you actually need this at all:

> > works around the warning.  I think the wording you cite
> > suggests (uintptr_t) &x here, not sure if there's a reliable
> > way to get the lea with just a uintptr_t operand though.
> 
> No, because we have to use the "m" constraint for the LEA. We get the
> following error:

What is the use case for stripping the address space for asm operands?
From reading the patch I understand the kernel wants to pass qualified
lvalues to inline assembly to get

  lea <reg>, %fs:<mem>

LEA without the %fs will produce the offset within the segment, which
you can obtain simply by casting the pointer to intptr_t in the first place.

Alexander


* Re: typeof and operands in named address spaces
  2020-11-05  8:56   ` Uros Bizjak
  2020-11-05  9:36     ` Alexander Monakov
@ 2020-11-05  9:45     ` Richard Biener
  2020-11-05  9:51       ` Jakub Jelinek
  1 sibling, 1 reply; 28+ messages in thread
From: Richard Biener @ 2020-11-05  9:45 UTC (permalink / raw)
  To: Uros Bizjak, Joseph S. Myers
  Cc: GCC Development, Jakub Jelinek, X86 ML, Andy Lutomirski

On Thu, Nov 5, 2020 at 9:56 AM Uros Bizjak <ubizjak@gmail.com> wrote:
>
> So, what about introducing e.g. typeof_noas (not sure about the name)
> that would simply strip the address space from typeof?

Well, I think we should fix typeof to not retain the address space.  It's
probably our implementation detail of having those in TYPE_QUALS
that exposes the issue, rather than something the standard mandates.

The rvalue trick is to avoid depending on a "fixed" GCC.

Joseph should know how typeof should behave here.

Richard.



* Re: typeof and operands in named address spaces
  2020-11-05  9:45     ` Richard Biener
@ 2020-11-05  9:51       ` Jakub Jelinek
  0 siblings, 0 replies; 28+ messages in thread
From: Jakub Jelinek @ 2020-11-05  9:51 UTC (permalink / raw)
  To: Richard Biener
  Cc: Uros Bizjak, Joseph S. Myers, GCC Development, X86 ML, Andy Lutomirski

On Thu, Nov 05, 2020 at 10:45:59AM +0100, Richard Biener wrote:
> Well, I think we should fix typeof to not retain the address space.  It's
> probably our implementation detail of having those in TYPE_QUALS
> that exposes the issue and not standard mandated.
> 
> The rvalue trick is to avoid depending on a "fixed" GCC.
> 
> Joseph should know how typeof should behave here.

For other qualifiers like const it has been discussed recently in
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=97702
and I think the address space qualifiers should work the same as others.

	Jakub



* Re: typeof and operands in named address spaces
  2020-11-05  9:36     ` Alexander Monakov
@ 2020-11-05 10:33       ` Uros Bizjak
  2020-11-05 11:38         ` Alexander Monakov
  2020-11-05 11:03       ` Uros Bizjak
  1 sibling, 1 reply; 28+ messages in thread
From: Uros Bizjak @ 2020-11-05 10:33 UTC (permalink / raw)
  To: Alexander Monakov
  Cc: Richard Biener, Jakub Jelinek, GCC Development, X86 ML, Andy Lutomirski

On Thu, Nov 5, 2020 at 10:36 AM Alexander Monakov <amonakov@ispras.ru> wrote:
>
> On Thu, 5 Nov 2020, Uros Bizjak via Gcc wrote:
>
> > > Looks like writing
> > >
> > >     typeof((typeof(_var))0) tmp__;
> > >
> > > makes it work.  Assumes there's a literal zero for the type of course.
> >
> > This is very limiting assumption, which already breaks for the following test:
>
> To elaborate Richard's idea, you need a way to decay lvalue to rvalue inside
> the typeof to strip the address space; if you need the macro to work for
> more types than just scalar types, the following expression may be useful:
>
>   typeof(0?(_var):(_var))
>
> (though there's a bug: +(_var) should also suffice for scalar types, but
>  somehow GCC keeps the address space on the resulting rvalue)
>
> But I wonder if you actually need this at all:
>
> > > works around the warning.  I think the wording you cite
> > > suggests (uintptr_t) &x here, not sure if there's a reliable
> > > way to get the lea with just a uintptr_t operand though.
> >
> > No, because we have to use the "m" constraint for the LEA. We get the
> > following error:
>
> What is the usecase for stripping the address space for asm operands?

Please see the end of [2], where the offset to <mem> is passed in %rsi
to the call to this_cpu_cmpxchg16b_emu. this_cpu_cmpxchg16b_emu
implements access with PER_CPU_VAR((%rsi)), which expands to
%gs:(%rsi), so it is the same as %gs:<mem> in the cmpxchg16b alternative.
The offset is loaded into %rsi by lea <mem>, %rsi.

> From reading the patch I understand the kernel wants to pass qualified
> lvalues to inline assembly to get
>
>   lea <reg>, %fs:<mem>

No, this will emit an assembler warning that "segment override on
'lea' is ineffectual".

Uros.



* Re: typeof and operands in named address spaces
  2020-11-05  9:36     ` Alexander Monakov
  2020-11-05 10:33       ` Uros Bizjak
@ 2020-11-05 11:03       ` Uros Bizjak
  1 sibling, 0 replies; 28+ messages in thread
From: Uros Bizjak @ 2020-11-05 11:03 UTC (permalink / raw)
  To: Alexander Monakov
  Cc: Richard Biener, Jakub Jelinek, GCC Development, X86 ML, Andy Lutomirski

On Thu, Nov 5, 2020 at 10:36 AM Alexander Monakov <amonakov@ispras.ru> wrote:
>
> On Thu, 5 Nov 2020, Uros Bizjak via Gcc wrote:
>
> > > Looks like writing
> > >
> > >     typeof((typeof(_var))0) tmp__;
> > >
> > > makes it work.  Assumes there's a literal zero for the type of course.
> >
> > This is very limiting assumption, which already breaks for the following test:
>
> To elaborate Richard's idea, you need a way to decay lvalue to rvalue inside
> the typeof to strip the address space; if you need the macro to work for
> more types than just scalar types, the following expression may be useful:
>
>   typeof(0?(_var):(_var))

Great, this works well for various operand types.

> (though there's a bug: +(_var) should also suffice for scalar types, but
>  somehow GCC keeps the address space on the resulting rvalue)
>
> But I wonder if you actually need this at all:

The posted example is a bit naive, because assignment and basic
operations can be implemented directly, e.g.:

__seg_fs int x;

void
test (void)
{
  x &= 1;
}

compiles to:

       andl    $1, %fs:x(%rip)

without any macro usage at all. However, several operations, such as
xadd and cmpxchg, are implemented using assembly templates (see e.g.
arch/x86/include/asm/percpu.h), where local instances are needed.
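
As an illustration of why such templates need a local instance, here is
a sketch of an xadd-style helper (hypothetical names xadd_sketch and
fetch_add_sketch; this is not the kernel's implementation). The
temporary must be an ordinary local, hence the address-space-free
typeof. The sketch assumes an operand of int width or wider, since the
conditional expression promotes narrower types to int:

```c
#if defined(__x86_64__) || defined(__i386__)
/* xadd exchanges and adds: afterwards tmp__ holds the old value of
   var, and var holds the old value plus val. */
#define xadd_sketch(var, val)                               \
  ({                                                        \
    __typeof__(0 ? (var) : (var)) tmp__ = (val);            \
    __asm__ __volatile__ ("xadd %0, %1"                     \
                          : "+r" (tmp__), "+m" (var)        \
                          : : "memory", "cc");              \
    tmp__;                                                  \
  })
#else
/* Portable fallback with the same result, without inline asm. */
#define xadd_sketch(var, val)                               \
  ({                                                        \
    __typeof__(0 ? (var) : (var)) tmp__ = (var);            \
    (var) += (val);                                         \
    tmp__;                                                  \
  })
#endif

/* Example user on a plain int.  With a __seg_fs lvalue the memory
   operand would be printed with a %fs: prefix instead. */
int fetch_add_sketch(int *p, int val)
{
    return xadd_sketch(*p, val);
}
```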

Uros.


* Re: typeof and operands in named address spaces
  2020-11-05 10:33       ` Uros Bizjak
@ 2020-11-05 11:38         ` Alexander Monakov
  2020-11-05 12:00           ` Uros Bizjak
  0 siblings, 1 reply; 28+ messages in thread
From: Alexander Monakov @ 2020-11-05 11:38 UTC (permalink / raw)
  To: Uros Bizjak; +Cc: Jakub Jelinek, X86 ML, Andy Lutomirski, GCC Development

On Thu, 5 Nov 2020, Uros Bizjak via Gcc wrote:

> > What is the usecase for stripping the address space for asm operands?
> 
> Please see the end of [2], where the offset to <mem> is passed in %rsi
> to the call to this_cpu_cmpxchg16b_emu. this_cpu_cmpxchg16b_emu
> implements access with PER_CPU_VAR((%rsi)), which expands to
> %gs:(%rsi), so it is the same as %gs:<mem> in cmpxchg16b alternative.
> The offset is loaded by lea <mem>, %rsi to %rsi reg.

I see, thanks. But then with the typeof-stripping-address-space solution
you'd be making a very evil cast (producing address of an object that
does not actually exist in the generic address space). I can write such
a solution, but it is clearly Undefined Behavior:

#define strip_as(mem) (*(__typeof(0?(mem):(mem))*)(intptr_t)&(mem))

void foo(__seg_fs int *x)
{
  asm("# %0" :: "m"(x[1]));
  asm("# %0" :: "m"(strip_as(x[1])));
}

yields

foo:
        # %fs:4(%rdi)
        # 4(%rdi)
        ret


I think a clean future solution is adding an operand modifier that would
print the memory operand without the segment prefix.

Alexander


* Re: typeof and operands in named address spaces
  2020-11-05 11:38         ` Alexander Monakov
@ 2020-11-05 12:00           ` Uros Bizjak
  2020-11-05 12:14             ` Alexander Monakov
  0 siblings, 1 reply; 28+ messages in thread
From: Uros Bizjak @ 2020-11-05 12:00 UTC (permalink / raw)
  To: Alexander Monakov; +Cc: Jakub Jelinek, X86 ML, Andy Lutomirski, GCC Development

On Thu, Nov 5, 2020 at 12:38 PM Alexander Monakov <amonakov@ispras.ru> wrote:
>
> I think a clean future solution is adding an operand modifier that would
> print the memory operand without the segment prefix.

I was also thinking of introducing an operand modifier, but Richi
advises the following:

--cut here--
typedef __UINTPTR_TYPE__ uintptr_t;

__seg_fs int x;

uintptr_t test (void)
{
 uintptr_t *p = (uintptr_t *)(uintptr_t) &x;
 uintptr_t addr;

 asm volatile ("lea %1, %0" : "=r"(addr) : "m"(*p));

 return addr;
}
--cut here--

Please note that the gcc documentation says:

--q--
6.17.4 x86 Named Address Spaces
-------------------------------

On the x86 target, variables may be declared as being relative to the
'%fs' or '%gs' segments.

'__seg_fs'
'__seg_gs'
    The object is accessed with the respective segment override prefix.

    The respective segment base must be set via some method specific to
    the operating system.  Rather than require an expensive system call
    to retrieve the segment base, these address spaces are not
    considered to be subspaces of the generic (flat) address space.
    This means that explicit casts are required to convert pointers
    between these address spaces and the generic address space.  In
    practice the application should cast to 'uintptr_t' and apply the
    segment base offset that it installed previously.

    The preprocessor symbols '__SEG_FS' and '__SEG_GS' are defined when
    these address spaces are supported.
--/q--

Uros.


* Re: typeof and operands in named address spaces
  2020-11-05 12:00           ` Uros Bizjak
@ 2020-11-05 12:14             ` Alexander Monakov
  2020-11-05 12:24               ` Richard Biener
  2020-11-05 12:26               ` Uros Bizjak
  0 siblings, 2 replies; 28+ messages in thread
From: Alexander Monakov @ 2020-11-05 12:14 UTC (permalink / raw)
  To: Uros Bizjak; +Cc: Jakub Jelinek, X86 ML, Andy Lutomirski, GCC Development



On Thu, 5 Nov 2020, Uros Bizjak wrote:

> I was also thinking of introducing an operand modifier, but Richi
> advises the following:
> 
> --cut here--
> typedef __UINTPTR_TYPE__ uintptr_t;
> 
> __seg_fs int x;
> 
> uintptr_t test (void)
> {
>  uintptr_t *p = (uintptr_t *)(uintptr_t) &x;
>  uintptr_t addr;
> 
>  asm volatile ("lea %1, %0" : "=r"(addr) : "m"(*p));
> 
>  return addr;
> }

This is even worse undefined behavior compared to my solution above:
this code references memory in uintptr_t type, while mine preserves the
original type via __typeof. So this can visibly break with TBAA (though
the kernel uses -fno-strict-aliasing, so this particular concern wouldn't
apply there).

If you don't care about preserving sizeof and type you can use a cast to char:

#define strip_as(mem) (*(char *)(intptr_t)&(mem))

Alexander


* Re: typeof and operands in named address spaces
  2020-11-05 12:14             ` Alexander Monakov
@ 2020-11-05 12:24               ` Richard Biener
  2020-11-05 12:32                 ` Uros Bizjak
  2020-11-05 12:26               ` Uros Bizjak
  1 sibling, 1 reply; 28+ messages in thread
From: Richard Biener @ 2020-11-05 12:24 UTC (permalink / raw)
  To: Alexander Monakov
  Cc: Uros Bizjak, Jakub Jelinek, GCC Development, X86 ML, Andy Lutomirski

On Thu, Nov 5, 2020 at 1:16 PM Alexander Monakov via Gcc
<gcc@gcc.gnu.org> wrote:
>
>
>
> On Thu, 5 Nov 2020, Uros Bizjak wrote:
>
> > I was also thinking of introducing an operand modifier, but Richi
> > advises the following:
> >
> > --cut here--
> > typedef __UINTPTR_TYPE__ uintptr_t;
> >
> > __seg_fs int x;
> >
> > uintptr_t test (void)
> > {
> >  uintptr_t *p = (uintptr_t *)(uintptr_t) &x;
> >  uintptr_t addr;
> >
> >  asm volatile ("lea %1, %0" : "=r"(addr) : "m"(*p));
> >
> >  return addr;
> > }
>
> This is even worse undefined behavior compared to my solution above:
> this code references memory in uintptr_t type, while mine preserves the
> original type via __typeof. So this can visibly break with TBAA (though
> the kernel uses -fno-strict-aliasing, so this particular concern wouldn't
> apply there).
>
> If you don't care about preserving sizeof and type you can use a cast to char:
>
> #define strip_as(mem) (*(char *)(intptr_t)&(mem))

But in the end, on x86 the (uintptr_t)&x cast yields you exactly
the offset from the segment register, no?  The casting back
to (uintptr_t *) and the "dereference" is just because the
inline asm is not able to build the lea otherwise?  That said,
something like

 asm volatile ("lea fs:%1, %0" : "=r"(addr) : "r" ((uintptr_t)&x));

with the proper asm template should likely be used.

Of course in case the kernel wants transparent handling of
non-fs and fs-based cases that will be off.

Richard.

>
> Alexander

* Re: typeof and operands in named address spaces
  2020-11-05 12:14             ` Alexander Monakov
  2020-11-05 12:24               ` Richard Biener
@ 2020-11-05 12:26               ` Uros Bizjak
  2020-11-05 15:27                 ` Andy Lutomirski
  1 sibling, 1 reply; 28+ messages in thread
From: Uros Bizjak @ 2020-11-05 12:26 UTC (permalink / raw)
  To: Alexander Monakov; +Cc: Jakub Jelinek, X86 ML, Andy Lutomirski, GCC Development

On Thu, Nov 5, 2020 at 1:14 PM Alexander Monakov <amonakov@ispras.ru> wrote:

> > I was also thinking of introducing of operand modifier, but Richi
> > advises the following:
> >
> > --cut here--
> > typedef __UINTPTR_TYPE__ uintptr_t;
> >
> > __seg_fs int x;
> >
> > uintptr_t test (void)
> > {
> >  uintptr_t *p = (uintptr_t *)(uintptr_t) &x;
> >  uintptr_t addr;
> >
> >  asm volatile ("lea %1, %0" : "=r"(addr) : "m"(*p));
> >
> >  return addr;
> > }
>
> This is even worse undefined behavior compared to my solution above:
> this code references memory in uintptr_t type, while mine preserves the
> original type via __typeof. So this can visibly break with TBAA (though
> the kernel uses -fno-strict-aliasing, so this particular concern wouldn't
> apply there).

Agreed, but I was trying to solve this lone use case in the kernel. It
fits this particular usage, so I found it a bit of overkill to implement
an otherwise useless operand modifier in gcc. As discussed
previously, these hacks are needed exclusively in asm templates, they
are not needed in "normal" C code.
>
> If you don't care about preserving sizeof and type you can use a cast to char:
>
> #define strip_as(mem) (*(char *)(intptr_t)&(mem))

I hope that a kernel developer can chime in and express their
opinion on the proposed approaches.

Uros.

* Re: typeof and operands in named address spaces
  2020-11-05 12:24               ` Richard Biener
@ 2020-11-05 12:32                 ` Uros Bizjak
  2020-11-05 12:35                   ` Uros Bizjak
  0 siblings, 1 reply; 28+ messages in thread
From: Uros Bizjak @ 2020-11-05 12:32 UTC (permalink / raw)
  To: Richard Biener
  Cc: Alexander Monakov, Jakub Jelinek, GCC Development, X86 ML,
	Andy Lutomirski

On Thu, Nov 5, 2020 at 1:24 PM Richard Biener
<richard.guenther@gmail.com> wrote:

> > This is even worse undefined behavior compared to my solution above:
> > this code references memory in uintptr_t type, while mine preserves the
> > original type via __typeof. So this can visibly break with TBAA (though
> > the kernel uses -fno-strict-aliasing, so this particular concern wouldn't
> > apply there).
> >
> > If you don't care about preserving sizeof and type you can use a cast to char:
> >
> > #define strip_as(mem) (*(char *)(intptr_t)&(mem))
>
> But in the end, on x86 the (uintptr_t)&x cast yields you exactly
> the offset from the segment register, no?  The casting back
> to (uintptr_t *) and the "dereference" is just because the
> inline asm is not able to build the lea otherwise?  That said,
> something like
>
>  asm volatile ("lea fs:%1, %0" : "=r"(addr) : "r" ((uintptr_t)&x));

No, this is not how LEA operates. It needs a memory input operand. The
above will report "operand type mismatch for 'lea'" error.

Uros.

* Re: typeof and operands in named address spaces
  2020-11-05 12:32                 ` Uros Bizjak
@ 2020-11-05 12:35                   ` Uros Bizjak
  2020-11-05 13:22                     ` Alexander Monakov
  0 siblings, 1 reply; 28+ messages in thread
From: Uros Bizjak @ 2020-11-05 12:35 UTC (permalink / raw)
  To: Richard Biener
  Cc: Alexander Monakov, Jakub Jelinek, GCC Development, X86 ML,
	Andy Lutomirski

On Thu, Nov 5, 2020 at 1:32 PM Uros Bizjak <ubizjak@gmail.com> wrote:
>
> On Thu, Nov 5, 2020 at 1:24 PM Richard Biener
> <richard.guenther@gmail.com> wrote:
>
> > > This is even worse undefined behavior compared to my solution above:
> > > this code references memory in uintptr_t type, while mine preserves the
> > > original type via __typeof. So this can visibly break with TBAA (though
> > > the kernel uses -fno-strict-aliasing, so this particular concern wouldn't
> > > apply there).
> > >
> > > If you don't care about preserving sizeof and type you can use a cast to char:
> > >
> > > #define strip_as(mem) (*(char *)(intptr_t)&(mem))
> >
> > But in the end, on x86 the (uintptr_t)&x cast yields you exactly
> > the offset from the segment register, no?  The casting back
> > to (uintptr_t *) and the "dereference" is just because the
> > inline asm is not able to build the lea otherwise?  That said,
> > something like
> >
> >  asm volatile ("lea fs:%1, %0" : "=r"(addr) : "r" ((uintptr_t)&x));
>
> No, this is not how LEA operates. It needs a memory input operand. The
> above will report "operand type mismatch for 'lea'" error.

The following will work:

  asm volatile ("lea (%1), %0" : "=r"(addr) : "r"((uintptr_t)&x));

Uros.

* Re: typeof and operands in named address spaces
  2020-11-05 12:35                   ` Uros Bizjak
@ 2020-11-05 13:22                     ` Alexander Monakov
  2020-11-05 13:39                       ` Alexander Monakov
  0 siblings, 1 reply; 28+ messages in thread
From: Alexander Monakov @ 2020-11-05 13:22 UTC (permalink / raw)
  To: Uros Bizjak
  Cc: Richard Biener, Jakub Jelinek, GCC Development, X86 ML, Andy Lutomirski

On Thu, 5 Nov 2020, Uros Bizjak via Gcc wrote:

> > No, this is not how LEA operates. It needs a memory input operand. The
> > above will report "operand type mismatch for 'lea'" error.
> 
> The following will work:
> 
>   asm volatile ("lea (%1), %0" : "=r"(addr) : "r"((uintptr_t)&x));

This is the same as a plain move though, and the cast to uintptr_t doesn't
do anything; you can simply pass "r"(&x) to the same effect.
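
As a minimal sketch of that point, assuming an x86-64 target and GNU-style
inline asm (the helper name addr_via_lea is made up for illustration, and
other targets fall back to an ordinary cast): with a register input,
"lea (%1), %0" only copies the register, so the result equals the plain
pointer value.

```c
#include <stdint.h>

/* Hypothetical helper: returns the value of p, computed with LEA where possible. */
static uintptr_t addr_via_lea(const void *p)
{
    uintptr_t addr;
#if defined(__x86_64__) && defined(__GNUC__)
    /* "r"(p) places the pointer in a register; lea (%reg), %dst copies it. */
    __asm__ ("lea (%1), %0" : "=r"(addr) : "r"(p));
#else
    addr = (uintptr_t)p;   /* portable fallback, same result */
#endif
    return addr;
}
```

Calling addr_via_lea(&obj) on any ordinary object should give the same value
as (uintptr_t)&obj, which is why the register form gains nothing over a move.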

The main advantage of passing a "fake" memory location for use with lea is
avoiding base+offset computation outside the asm. If you're okay with one
extra register tied up by the asm, just pass the address to the asm directly:

void foo(__seg_fs int *x)
{
  asm("# %0 (%1)" :: "m"(x[1]), "r"(&x[1]));
  asm("# %0 (%1)" :: "m"(x[0]), "r"(&x[0]));
}

foo:
        leaq    4(%rdi), %rax
        # %fs:4(%rdi) (%rax)
        # %fs:(%rdi) (%rdi)
        ret

Alexander

* Re: typeof and operands in named address spaces
  2020-11-05 13:22                     ` Alexander Monakov
@ 2020-11-05 13:39                       ` Alexander Monakov
  2020-11-05 13:46                         ` Uros Bizjak
  0 siblings, 1 reply; 28+ messages in thread
From: Alexander Monakov @ 2020-11-05 13:39 UTC (permalink / raw)
  To: Uros Bizjak; +Cc: Jakub Jelinek, X86 ML, Andy Lutomirski, GCC Development

On Thu, 5 Nov 2020, Alexander Monakov via Gcc wrote:

> On Thu, 5 Nov 2020, Uros Bizjak via Gcc wrote:
> 
> > > No, this is not how LEA operates. It needs a memory input operand. The
> > > above will report "operand type mismatch for 'lea'" error.
> > 
> > The following will work:
> > 
> >   asm volatile ("lea (%1), %0" : "=r"(addr) : "r"((uintptr_t)&x));
> 
> This is the same as a plain move though, and the cast to uintptr_t doesn't
> do anything, you can simply pass "r"(&x) to the same effect.
> 
> The main advantage of passing a "fake" memory location for use with lea is
> avoiding base+offset computation outside the asm. If you're okay with one
> extra register tied up by the asm, just pass the address to the asm directly:
> 
> void foo(__seg_fs int *x)
> {
>   asm("# %0 (%1)" :: "m"(x[1]), "r"(&x[1]));
>   asm("# %0 (%1)" :: "m"(x[0]), "r"(&x[0]));
> }

Actually, in the original context the asm ties up %rsi no matter what (because
the operand must be in %rsi to make the call), so the code would just
pass "S"(&var) for the call alternative and "m"(var) for the native instruction.

Then the only disadvantage is a useless mov/lea to %rsi on the common path when
the alternative selected at runtime is native.

Alexander

* Re: typeof and operands in named address spaces
  2020-11-05 13:39                       ` Alexander Monakov
@ 2020-11-05 13:46                         ` Uros Bizjak
  0 siblings, 0 replies; 28+ messages in thread
From: Uros Bizjak @ 2020-11-05 13:46 UTC (permalink / raw)
  To: Alexander Monakov; +Cc: Jakub Jelinek, X86 ML, Andy Lutomirski, GCC Development

On Thu, Nov 5, 2020 at 2:39 PM Alexander Monakov <amonakov@ispras.ru> wrote:
>
> On Thu, 5 Nov 2020, Alexander Monakov via Gcc wrote:
>
> > On Thu, 5 Nov 2020, Uros Bizjak via Gcc wrote:
> >
> > > > No, this is not how LEA operates. It needs a memory input operand. The
> > > > above will report "operand type mismatch for 'lea'" error.
> > >
> > > The following will work:
> > >
> > >   asm volatile ("lea (%1), %0" : "=r"(addr) : "r"((uintptr_t)&x));
> >
> > This is the same as a plain move though, and the cast to uintptr_t doesn't
> > do anything, you can simply pass "r"(&x) to the same effect.
> >
> > The main advantage of passing a "fake" memory location for use with lea is
> > avoiding base+offset computation outside the asm. If you're okay with one
> > extra register tied up by the asm, just pass the address to the asm directly:
> >
> > void foo(__seg_fs int *x)
> > {
> >   asm("# %0 (%1)" :: "m"(x[1]), "r"(&x[1]));
> >   asm("# %0 (%1)" :: "m"(x[0]), "r"(&x[0]));
> > }
>
> Actually, in the original context the asm ties up %rsi no matter what (because
> the operand must be in %rsi to make the call), so the code would just
> pass "S"(&var) for the call alternative and "m"(var) for the native instruction.

Or pass both, "m"(var), and

uintptr_t *p = (uintptr_t *)(uintptr_t) &var;

"m"(*p)  alternatives, similar to what is done in the original patch.
The copy to %rsi can then be a part of the alternative assembly.

Uros.

* Re: typeof and operands in named address spaces
  2020-11-05 12:26               ` Uros Bizjak
@ 2020-11-05 15:27                 ` Andy Lutomirski
  0 siblings, 0 replies; 28+ messages in thread
From: Andy Lutomirski @ 2020-11-05 15:27 UTC (permalink / raw)
  To: Uros Bizjak; +Cc: Alexander Monakov, Jakub Jelinek, X86 ML, GCC Development

> On Nov 5, 2020, at 4:26 AM, Uros Bizjak <ubizjak@gmail.com> wrote:
>
> On Thu, Nov 5, 2020 at 1:14 PM Alexander Monakov <amonakov@ispras.ru> wrote:
>
>>> I was also thinking of introducing of operand modifier, but Richi
>>> advises the following:
>>>
>>> --cut here--
>>> typedef __UINTPTR_TYPE__ uintptr_t;
>>>
>>> __seg_fs int x;
>>>
>>> uintptr_t test (void)
>>> {
>>> uintptr_t *p = (uintptr_t *)(uintptr_t) &x;
>>> uintptr_t addr;
>>>
>>> asm volatile ("lea %1, %0" : "=r"(addr) : "m"(*p));
>>>
>>> return addr;
>>> }
>>
>> This is even worse undefined behavior compared to my solution above:
>> this code references memory in uintptr_t type, while mine preserves the
>> original type via __typeof. So this can visibly break with TBAA (though
>> the kernel uses -fno-strict-aliasing, so this particular concern wouldn't
>> apply there).
>
> Agreed, but I was trying to solve this lone use case in the kernel. It
> fits this particular usage, so I found it a bit of overkill to implement
> an otherwise useless operand modifier in gcc. As discussed
> previously, these hacks are needed exclusively in asm templates, they
> are not needed in "normal" C code.
>>
>> If you don't care about preserving sizeof and type you can use a cast to char:
>>
>> #define strip_as(mem) (*(char *)(intptr_t)&(mem))
>
> I hope that a kernel developer can chime in and express their
> opinion on the proposed approaches.
>

I haven’t looked all that closely at precisely what the kernel needs,
but I’ve had bad experiences with passing imprecise things into asm
“m” and “=m” operands. GCC seems to assume, quite reasonably, that if
I pass a value via “m” or “=m”, then I read or write *that value*.
So, if we use type hackery to produce an lvalue or rvalue that has the
address space stripped, then I would imagine I get UB — GCC will try
to understand what value I’m reading or writing, and this will only
match what I’m actually doing by luck.

It’s kind of like doing this (sorry for whitespace damage):

int read_int(int *ptr)
{
int ret; uintptr_t tmp;
asm (
"lea %[val], %[tmp]\n\t"
"mov 4(%[tmp]), %[ret]"
: [ret] "=r" (ret), [tmp] "+r" (tmp)
: [val] "m" (*(ptr - 1)));
return ret;
}

That code is obviously rather contrived, but I think it's
fundamentally the same type of hack as all these typeofs.  I haven't
tested precisely what GCC does, but I suspect we have:

int foo;
read_int(&foo);  // UB

int foo[2];
read_int(&foo[1]);  // Maybe UB, but maybe non-UB that returns garbage

So I think a better constraint type would be an improvement.  Or maybe
a more general "pointer" constraint could be invented for this and
other use cases:

[name] "p" (ptr)

With this constraint, ptr must be uintptr_t or intptr_t.  %[name]
refers to ptr, formatted as a dereference operation.  So the generated
asm is identical to [name] "m" (*(char *)ptr), but the semantics are
different.  The problem is that I don't know how to specify the
semantics, but at least the instant UB of building and dereferencing a
garbage pointer would be avoided.
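
For contrast with the contrived read_int() above, here is a sketch of a
variant whose "m" operand names exactly the object the asm reads, so the
compiler's view of the access matches what the instruction does. The name
read_int_exact is hypothetical, and x86-64 with GNU-style asm is assumed
(other targets fall back to a plain load).

```c
/* Hypothetical counterpart to read_int(): the operand is the value actually read. */
static int read_int_exact(const int *ptr)
{
    int ret;
#if defined(__x86_64__) && defined(__GNUC__)
    /* [val] is *ptr itself; the asm loads that object and nothing else. */
    __asm__ ("movl %[val], %[ret]"
             : [ret] "=r" (ret)
             : [val] "m" (*ptr));
#else
    ret = *ptr;   /* portable fallback */
#endif
    return ret;
}
```

Here the compiler knows precisely which int is read, so it has no reason to
assume that a neighbouring object is untouched or that *ptr itself is unused.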

--Andy

* Re: typeof and operands in named address spaces
  2020-11-04 18:31 typeof and operands in named address spaces Uros Bizjak
  2020-11-05  7:26 ` Richard Biener
@ 2020-11-09 12:47 ` Peter Zijlstra
  2020-11-09 19:38   ` Segher Boessenkool
  1 sibling, 1 reply; 28+ messages in thread
From: Peter Zijlstra @ 2020-11-09 12:47 UTC (permalink / raw)
  To: Uros Bizjak
  Cc: GCC Development, X86 ML, Jakub Jelinek, Andy Lutomirski,
	linux-toolchains, segher, borntraeger, Will Deacon,
	Linus Torvalds, mpe


+ lots of people and linux-toolchains

On Wed, Nov 04, 2020 at 07:31:42PM +0100, Uros Bizjak wrote:
> Hello!
> 
> I was looking at the recent linux patch series [1] where segment
> qualifiers (named address spaces) were introduced to handle percpu
> variables. In the patch [2], the author mentions that:
> 
> --q--
> Unfortunately, gcc does not provide a way to remove segment
> qualifiers, which is needed to use typeof() to create local instances
> of the per-cpu variable. For this reason, do not use the segment
> qualifier for per-cpu variables, and do casting using the segment
> qualifier instead.
> --/q--

C in general does not provide means to strip qualifiers. We recently had
a _lot_ of 'fun' trying to strip volatile from a type, see here:

  https://lore.kernel.org/lkml/875zimp0ay.fsf@mpe.ellerman.id.au

which resulted in the current __unqual_scalar_typeof() hack.

If we're going to do compiler extensions here, can we pretty please have
a sane means of modifying qualifiers in general?
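
For readers who have not seen that hack, here is a rough sketch in its
spirit. It is not a copy of the kernel's definition, and the macro names are
made up for illustration. It assumes GNU C __typeof__ and statement
expressions plus C11 _Generic, and relies on the controlling expression of
_Generic undergoing lvalue conversion, which drops top-level qualifiers for
the scalar types listed.

```c
/* Sketch only: hypothetical names, GNU C __typeof__ plus C11 _Generic. */
#define SCALAR_CASES(type) \
    unsigned type: (unsigned type)0, signed type: (signed type)0

/* Yields the unqualified type for listed scalars; other types pass through as-is. */
#define unqual_scalar_typeof(x) __typeof__(_Generic((x), \
    char: (char)0,                                        \
    SCALAR_CASES(char),                                   \
    SCALAR_CASES(short),                                  \
    SCALAR_CASES(int),                                    \
    SCALAR_CASES(long),                                   \
    SCALAR_CASES(long long),                              \
    default: (x)))

/* Example use: a macro temporary that must be writable and non-volatile. */
#define plus_one(x) __extension__ ({       \
    unqual_scalar_typeof(x) tmp__ = (x);   \
    tmp__ += 1;                            \
    tmp__; })
```

With a "const volatile int" argument, tmp__ is a plain int, so the += inside
plus_one() compiles. Anything outside the listed scalar types falls through
to the default branch and keeps its qualifiers, which is the limitation being
complained about.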

* Re: typeof and operands in named address spaces
  2020-11-09 12:47 ` Peter Zijlstra
@ 2020-11-09 19:38   ` Segher Boessenkool
  2020-11-09 19:50     ` Nick Desaulniers
  2020-11-10  7:52     ` Peter Zijlstra
  0 siblings, 2 replies; 28+ messages in thread
From: Segher Boessenkool @ 2020-11-09 19:38 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: Uros Bizjak, GCC Development, X86 ML, Jakub Jelinek,
	Andy Lutomirski, linux-toolchains, borntraeger, Will Deacon,
	Linus Torvalds, mpe

On Mon, Nov 09, 2020 at 01:47:13PM +0100, Peter Zijlstra wrote:
> 
> + lots of people and linux-toolchains
> 
> On Wed, Nov 04, 2020 at 07:31:42PM +0100, Uros Bizjak wrote:
> > Hello!
> > 
> > I was looking at the recent linux patch series [1] where segment
> > qualifiers (named address spaces) were introduced to handle percpu
> > variables. In the patch [2], the author mentions that:
> > 
> > --q--
> > Unfortunately, gcc does not provide a way to remove segment
> > qualifiers, which is needed to use typeof() to create local instances
> > of the per-cpu variable. For this reason, do not use the segment
> > qualifier for per-cpu variables, and do casting using the segment
> > qualifier instead.
> > --/q--
> 
> C in general does not provide means to strip qualifiers.

Most ways you can try to use the result are undefined behaviour, even.

> We recently had
> a _lot_ of 'fun' trying to strip volatile from a type, see here:
> 
>   https://lore.kernel.org/lkml/875zimp0ay.fsf@mpe.ellerman.id.au
> 
> which resulted in the current __unqual_scalar_typeof() hack.
> 
> If we're going to do compiler extensions here, can we pretty please have
> a sane means of modifying qualifiers in general?

What do you want to do with it?  It may be more feasible to do a
compiler extension for *that*.


Segher

* Re: typeof and operands in named address spaces
  2020-11-09 19:38   ` Segher Boessenkool
@ 2020-11-09 19:50     ` Nick Desaulniers
  2020-11-10  7:57       ` Peter Zijlstra
  2020-11-10  7:52     ` Peter Zijlstra
  1 sibling, 1 reply; 28+ messages in thread
From: Nick Desaulniers @ 2020-11-09 19:50 UTC (permalink / raw)
  To: Segher Boessenkool
  Cc: Peter Zijlstra, Uros Bizjak, GCC Development, X86 ML,
	Jakub Jelinek, Andy Lutomirski, linux-toolchains,
	Christian Borntraeger, Will Deacon, Linus Torvalds,
	Michael Ellerman

On Mon, Nov 9, 2020 at 11:46 AM Segher Boessenkool
<segher@kernel.crashing.org> wrote:
>
> On Mon, Nov 09, 2020 at 01:47:13PM +0100, Peter Zijlstra wrote:
> >
> > + lots of people and linux-toolchains
> >
> > On Wed, Nov 04, 2020 at 07:31:42PM +0100, Uros Bizjak wrote:
> > > Hello!
> > >
> > > I was looking at the recent linux patch series [1] where segment
> > > qualifiers (named address spaces) were introduced to handle percpu
> > > variables. In the patch [2], the author mentions that:
> > >
> > > --q--
> > > Unfortunately, gcc does not provide a way to remove segment
> > > qualifiers, which is needed to use typeof() to create local instances
> > > of the per-cpu variable. For this reason, do not use the segment
> > > qualifier for per-cpu variables, and do casting using the segment
> > > qualifier instead.
> > > --/q--
> >
> > C in general does not provide means to strip qualifiers.
>
> Most ways you can try to use the result are undefined behaviour, even.

Yes, removing `const` from a `const` declared variable (via cast) then
expecting to use the result is a great way to have clang omit the use
from the final program.  This has bitten us in the past getting MIPS
support up and running, and one of the MTK gfx drivers.
-- 
Thanks,
~Nick Desaulniers

* Re: typeof and operands in named address spaces
  2020-11-09 19:38   ` Segher Boessenkool
  2020-11-09 19:50     ` Nick Desaulniers
@ 2020-11-10  7:52     ` Peter Zijlstra
  1 sibling, 0 replies; 28+ messages in thread
From: Peter Zijlstra @ 2020-11-10  7:52 UTC (permalink / raw)
  To: Segher Boessenkool
  Cc: Uros Bizjak, GCC Development, X86 ML, Jakub Jelinek,
	Andy Lutomirski, linux-toolchains, borntraeger, Will Deacon,
	Linus Torvalds, mpe

On Mon, Nov 09, 2020 at 01:38:51PM -0600, Segher Boessenkool wrote:
> On Mon, Nov 09, 2020 at 01:47:13PM +0100, Peter Zijlstra wrote:
> > 
> > + lots of people and linux-toolchains
> > 
> > On Wed, Nov 04, 2020 at 07:31:42PM +0100, Uros Bizjak wrote:
> > > Hello!
> > > 
> > > I was looking at the recent linux patch series [1] where segment
> > > qualifiers (named address spaces) were introduced to handle percpu
> > > variables. In the patch [2], the author mentions that:
> > > 
> > > --q--
> > > Unfortunately, gcc does not provide a way to remove segment
> > > qualifiers, which is needed to use typeof() to create local instances
> > > of the per-cpu variable. For this reason, do not use the segment
> > > qualifier for per-cpu variables, and do casting using the segment
> > > qualifier instead.
> > > --/q--
> > 
> > C in general does not provide means to strip qualifiers.
> 
> Most ways you can try to use the result are undefined behaviour, even.
> 
> > We recently had
> > a _lot_ of 'fun' trying to strip volatile from a type, see here:
> > 
> >   https://lore.kernel.org/lkml/875zimp0ay.fsf@mpe.ellerman.id.au
> > 
> > which resulted in the current __unqual_scalar_typeof() hack.
> > 
> > If we're going to do compiler extensions here, can we pretty please have
> > a sane means of modifying qualifiers in general?
> 
> What do you want to do with it?  It may be more feasible to do a
> compiler extension for *that*.

Like with the parent use-case it's pretty much always declaring
temporaries in macros. We don't want the temporaries to be volatile, or
as the parent post points out, to have a segment qualifier.


* Re: typeof and operands in named address spaces
  2020-11-09 19:50     ` Nick Desaulniers
@ 2020-11-10  7:57       ` Peter Zijlstra
  2020-11-10 18:42         ` Nick Desaulniers
  2020-11-12  0:47         ` Segher Boessenkool
  0 siblings, 2 replies; 28+ messages in thread
From: Peter Zijlstra @ 2020-11-10  7:57 UTC (permalink / raw)
  To: Nick Desaulniers
  Cc: Segher Boessenkool, Uros Bizjak, GCC Development, X86 ML,
	Jakub Jelinek, Andy Lutomirski, linux-toolchains,
	Christian Borntraeger, Will Deacon, Linus Torvalds,
	Michael Ellerman

On Mon, Nov 09, 2020 at 11:50:15AM -0800, Nick Desaulniers wrote:
> On Mon, Nov 9, 2020 at 11:46 AM Segher Boessenkool
> <segher@kernel.crashing.org> wrote:
> >
> > On Mon, Nov 09, 2020 at 01:47:13PM +0100, Peter Zijlstra wrote:
> > >
> > > + lots of people and linux-toolchains
> > >
> > > On Wed, Nov 04, 2020 at 07:31:42PM +0100, Uros Bizjak wrote:
> > > > Hello!
> > > >
> > > > I was looking at the recent linux patch series [1] where segment
> > > > qualifiers (named address spaces) were introduced to handle percpu
> > > > variables. In the patch [2], the author mentions that:
> > > >
> > > > --q--
> > > > Unfortunately, gcc does not provide a way to remove segment
> > > > qualifiers, which is needed to use typeof() to create local instances
> > > > of the per-cpu variable. For this reason, do not use the segment
> > > > qualifier for per-cpu variables, and do casting using the segment
> > > > qualifier instead.
> > > > --/q--
> > >
> > > C in general does not provide means to strip qualifiers.
> >
> > Most ways you can try to use the result are undefined behaviour, even.
> 
> Yes, removing `const` from a `const` declared variable (via cast) then
> expecting to use the result is a great way to have clang omit the use
> from the final program.  This has bitten us in the past getting MIPS
> support up and running, and one of the MTK gfx drivers.

Stripping const to declare another variable is useful though. Sure C has
sharp edges, esp. if you cast stuff, but since when did that stop anyone
;-)

The point is, C++ has these very nice template helpers that can strip
qualifiers; I want that too, for much the same reasons. We might not
have templates :-(, but we've become very creative with our
pre-processor.

Surely our __unqual_scalar_typeof() cries for a better solution.

* Re: typeof and operands in named address spaces
  2020-11-10  7:57       ` Peter Zijlstra
@ 2020-11-10 18:42         ` Nick Desaulniers
  2020-11-10 20:11           ` Peter Zijlstra
  2020-11-12  0:47         ` Segher Boessenkool
  1 sibling, 1 reply; 28+ messages in thread
From: Nick Desaulniers @ 2020-11-10 18:42 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: Segher Boessenkool, Uros Bizjak, GCC Development, X86 ML,
	Jakub Jelinek, Andy Lutomirski, linux-toolchains,
	Christian Borntraeger, Will Deacon, Linus Torvalds,
	Michael Ellerman

On Mon, Nov 9, 2020 at 11:57 PM Peter Zijlstra <peterz@infradead.org> wrote:
>
> Stripping const to declare another variable is useful though. Sure C has
> sharp edges, esp. if you cast stuff, but since when did that stop anyone
> ;-)
>
> The point is, C++ has these very nice template helpers that can strip
> qualifiers, I want that too, for much of the same reasons. We might not
> have templates :-(, but we've become very creative with our
> pre-processor.
>
> Surely our __unqual_scalar_typeof() cries for a better solution.

Yeah, and those macros bloat the hell out of our compile times, for
both compilers.  I think it's reasonable to provide variants of
typeof() that strip qualifiers.

Some questions to flesh out more of a design.

Would we want such a feature to strip all qualifiers or just specific
individual ones? The more specific variants could be composed, i.e.

nonconst_typeof(x) y = x + 1;
nonvol_typeof(z) w = z + 1;
#define nonqual_typeof(v) nonconst_typeof(nonvol_typeof(v))
nonqual_typeof(v) k = v + 1;
vs just:
nonqual_typeof(v) k = v + 1;

When I think of qualifiers, I think of const and volatile.  I'm not
sure why the first post I'm cc'ed on talks about "segment" qualifiers.
Maybe it's in reference to a variable attribute that the kernel
defines?  Looking at Clang's Qualifier class, I see const, volatile,
restrict (ah, right), some Objective-C stuff, and address space
(TR18037 is referenced, I haven't looked up what that is) though maybe
"segment" pseudo qualifiers the kernel defines expand to address space
variable attributes?

Maybe stripping all qualifiers is fine since you can add them back in
if necessary?

const volatile int foo;
// strips off both qualifiers, then re-adds const to bar
const nonqual_typeof(foo) bar = foo;
-- 
Thanks,
~Nick Desaulniers

* Re: typeof and operands in named address spaces
  2020-11-10 18:42         ` Nick Desaulniers
@ 2020-11-10 20:11           ` Peter Zijlstra
  2020-11-12  0:40             ` Segher Boessenkool
  0 siblings, 1 reply; 28+ messages in thread
From: Peter Zijlstra @ 2020-11-10 20:11 UTC (permalink / raw)
  To: Nick Desaulniers
  Cc: Segher Boessenkool, Uros Bizjak, GCC Development, X86 ML,
	Jakub Jelinek, Andy Lutomirski, linux-toolchains,
	Christian Borntraeger, Will Deacon, Linus Torvalds,
	Michael Ellerman

On Tue, Nov 10, 2020 at 10:42:58AM -0800, Nick Desaulniers wrote:

> When I think of qualifiers, I think of const and volatile.  I'm not
> sure why the first post I'm cc'ed on talks about "segment" qualifiers.
> Maybe it's in reference to a variable attribute that the kernel
> defines?  Looking at Clang's Qualifier class, I see const, volatile,
> restrict (ah, right), some Objective-C stuff, and address space
> (TR18037 is referenced, I haven't looked up what that is) though maybe
> "segment" pseudo qualifiers the kernel defines expand to address space
> variable attributes?

Right, x86 Named Address Space:

  https://gcc.gnu.org/onlinedocs/gcc-10.2.0/gcc/Named-Address-Spaces.html#Named-Address-Spaces

Also, Google found me this:

  https://reviews.llvm.org/D64676

The basic problem seems to be they act exactly like qualifiers in that
typeof() preserves them, so if you have:

( and now I realize the parent isn't Cc'd to LKML, find here:
  https://gcc.gnu.org/pipermail/gcc/2020-November/234119.html )

> --cut here--
> #define foo(_var)                    \
> ({                            \
> typeof(_var) tmp__;                    \
> asm ("mov %1, %0" : "=r"(tmp__) : "m"(_var));    \
> tmp__;                        \
> })
>
> __seg_fs int x;
>
> int test (void)
> {
> int y;
>
> y = foo (x);
> return y;
> }
> --cut here--

> when compiled with -O2 for x86 target, the compiler reports:
>
> pcpu.c: In function ‘test’:
> pcpu.c:14:3: error: ‘__seg_fs’ specified for auto variable ‘tmp__’


> Maybe stripping all qualifiers is fine since you can add them back in
> if necessary?

So far that seems sufficient. Although the Devil's advocate in me is
trying to construct a case where we need to preserve const but strip
volatile and that then means we need to detect if the original has
const or not, because unconditionally adding it will be wrong.



* Re: typeof and operands in named address spaces
  2020-11-10 20:11           ` Peter Zijlstra
@ 2020-11-12  0:40             ` Segher Boessenkool
  0 siblings, 0 replies; 28+ messages in thread
From: Segher Boessenkool @ 2020-11-12  0:40 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: Nick Desaulniers, Uros Bizjak, GCC Development, X86 ML,
	Jakub Jelinek, Andy Lutomirski, linux-toolchains,
	Christian Borntraeger, Will Deacon, Linus Torvalds,
	Michael Ellerman

On Tue, Nov 10, 2020 at 09:11:08PM +0100, Peter Zijlstra wrote:
> On Tue, Nov 10, 2020 at 10:42:58AM -0800, Nick Desaulniers wrote:
> > When I think of qualifiers, I think of const and volatile.  I'm not
> > sure why the first post I'm cc'ed on talks about "segment" qualifiers.
> > Maybe it's in reference to a variable attribute that the kernel
> > defines?  Looking at Clang's Qualifier class, I see const, volatile,
> > restrict (ah, right), some Objective-C stuff, and address space
> > (TR18037 is referenced, I haven't looked up what that is) though maybe
> > "segment" pseudo qualifiers the kernel defines expand to address space
> > variable attributes?
> 
> Right, x86 Named Address Space:
> 
>   https://gcc.gnu.org/onlinedocs/gcc-10.2.0/gcc/Named-Address-Spaces.html#Named-Address-Spaces
> 
> Also, Google found me this:
> 
>   https://reviews.llvm.org/D64676
> 
> The basic problem seems to be they act exactly like qualifiers in that
> typeof() preserves them, so if you have:

GCC has the four standard type qualifiers (const, volatile, restrict,
and _Atomic), but also the address space things yes.

> > Maybe stripping all qualifiers is fine since you can add them back in
> > if necessary?
> 
> So far that seems sufficient. Although the Devil's advocate in me is
> trying to construct a case where we need to preserve const but strip
> volatile and that then means we need to detect if the original has
> const or not, because unconditionally adding it will be wrong.

If you want to drop all qualifiers, you only need a way to convert
something to an rvalue (which always has an unqualified type).  So maybe
make syntax for just *that*?  __builtin_unqualified() perhaps?  Which
could be useful in more places than just doing an unqualified_typeof.


Segher

* Re: typeof and operands in named address spaces
  2020-11-10  7:57       ` Peter Zijlstra
  2020-11-10 18:42         ` Nick Desaulniers
@ 2020-11-12  0:47         ` Segher Boessenkool
  1 sibling, 0 replies; 28+ messages in thread
From: Segher Boessenkool @ 2020-11-12  0:47 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: Nick Desaulniers, Uros Bizjak, GCC Development, X86 ML,
	Jakub Jelinek, Andy Lutomirski, linux-toolchains,
	Christian Borntraeger, Will Deacon, Linus Torvalds,
	Michael Ellerman

On Tue, Nov 10, 2020 at 08:57:42AM +0100, Peter Zijlstra wrote:
> On Mon, Nov 09, 2020 at 11:50:15AM -0800, Nick Desaulniers wrote:
> > On Mon, Nov 9, 2020 at 11:46 AM Segher Boessenkool
> > <segher@kernel.crashing.org> wrote:
> > > On Mon, Nov 09, 2020 at 01:47:13PM +0100, Peter Zijlstra wrote:
> > > > C in general does not provide means to strip qualifiers.
> > >
> > > Most ways you can try to use the result are undefined behaviour, even.
> > 
> > Yes, removing `const` from a `const` declared variable (via cast) then
> > expecting to use the result is a great way to have clang omit the use
> > from the final program.  This has bitten us in the past getting MIPS
> > support up and running, and one of the MTK gfx drivers.
> 
> Stripping const to declare another variable is useful though. Sure C has
> sharp edges, esp. if you cast stuff, but since when did that stop anyone
> ;-)

My point is that removing most qualifiers usually is a problem, so
before doing this, we should think if it is such a good plan, whether
there is a safer / saner solution, etc.


Segher

end of thread, other threads:[~2020-11-12  0:51 UTC | newest]

Thread overview: 28+ messages
2020-11-04 18:31 typeof and operands in named address spaces Uros Bizjak
2020-11-05  7:26 ` Richard Biener
2020-11-05  8:56   ` Uros Bizjak
2020-11-05  9:36     ` Alexander Monakov
2020-11-05 10:33       ` Uros Bizjak
2020-11-05 11:38         ` Alexander Monakov
2020-11-05 12:00           ` Uros Bizjak
2020-11-05 12:14             ` Alexander Monakov
2020-11-05 12:24               ` Richard Biener
2020-11-05 12:32                 ` Uros Bizjak
2020-11-05 12:35                   ` Uros Bizjak
2020-11-05 13:22                     ` Alexander Monakov
2020-11-05 13:39                       ` Alexander Monakov
2020-11-05 13:46                         ` Uros Bizjak
2020-11-05 12:26               ` Uros Bizjak
2020-11-05 15:27                 ` Andy Lutomirski
2020-11-05 11:03       ` Uros Bizjak
2020-11-05  9:45     ` Richard Biener
2020-11-05  9:51       ` Jakub Jelinek
2020-11-09 12:47 ` Peter Zijlstra
2020-11-09 19:38   ` Segher Boessenkool
2020-11-09 19:50     ` Nick Desaulniers
2020-11-10  7:57       ` Peter Zijlstra
2020-11-10 18:42         ` Nick Desaulniers
2020-11-10 20:11           ` Peter Zijlstra
2020-11-12  0:40             ` Segher Boessenkool
2020-11-12  0:47         ` Segher Boessenkool
2020-11-10  7:52     ` Peter Zijlstra
