Hi Kyrill, Thanks for your comments. Answers interleaved and the new patch attached. On 09/18/2015 11:04 AM, Kyrill Tkachov wrote: > > On 15/09/15 11:47, Christian Bruel wrote: >> >> On 09/14/2015 04:30 PM, Christian Bruel wrote: >>> Finally, the final part of the patch set does the attribute target >>> parsing and checking, redefines the preprocessor macros and implements >>> the inlining rules. >>> >>> testcases and documentation included. >>> >> new version to remove a shadowed remnant piece of code. >> >> >> > thanks >> > >> > Christian >> > > > + /* OK to inline between different modes. > + Function with mode specific instructions, e.g using asm, > + must be explicitely protected with noinline. */ > > s/explicitely/explicitly/ > thanks > > + const struct arm_fpu_desc *fpu_desc1 > + = &all_fpus[caller_opts->x_arm_fpu_index]; > + const struct arm_fpu_desc *fpu_desc2 > + = &all_fpus[callee_opts->x_arm_fpu_index]; > > Please call these caller_fpu and callee_fpu, it's much easier to reason about the inlining rules that way ok > > + > + /* Can't inline NEON extension if the caller doesn't support it. */ > + if (ARM_FPU_FSET_HAS (fpu_desc2->features, FPU_FL_NEON) > + && ! ARM_FPU_FSET_HAS (fpu_desc1->features, FPU_FL_NEON)) > + return false; > + > + /* Can't inline CRYPTO extension if the caller doesn't support it. */ > + if (ARM_FPU_FSET_HAS (fpu_desc2->features, FPU_FL_CRYPTO) > + && ! ARM_FPU_FSET_HAS (fpu_desc1->features, FPU_FL_CRYPTO)) > + return false; > + > > We also need to take into account FPU_FL_FP16... > In general what we want is for the callee FPU features to be > a subset of the callers features, similar to the way we handle > the x_aarch64_isa_flags handling in aarch64_can_inline_p from the > aarch64 port. I think that's the way to go here rather than explicitly > writing down a check for each feature. ok, with FL_FP16 now, > > @@ -242,6 +239,8 @@ > > /* Update macros. */ > gcc_assert (cur_opt->x_target_flags == target_flags); > + /* This one can be redefined by the pragma without warning. */ > + cpp_undef (parse_in, "__ARM_FP"); > arm_cpu_builtins (parse_in); > > Could you elaborate why the cpp_undef here? > If you want to undefine __ARM_FP so you can redefine it to a new value > in arm_cpu_builtins then I think you should just undefine it in that function. This is to avoid a warning: "__ARM_FP" redefined when creating a new pragma scope. (See the test attr-crypto.c). We cannot call the cpp_undef inside arm_cpu_builtins, because it is also used for the TARGET_CPU_CPP_BUILTINS hook and then would prevent real illegitimate redefinitions. Alternatively, I thought to reset the warn_builtin_macro_redefined flag, but that doesn't work as the macro is not NODE_BUILTIN (see the definition of warn_of_redefinition in libcpp). We might need to change this later : should target macros be marked as NOTE_BUILTIN ? We can discuss this separately (I can open a defect) as we have the cpp_undep solution for now, if you agree. > > > diff -ruN gnu_trunk.p3/gcc/gcc/doc/invoke.texi gnu_trunk.p4/gcc/gcc/doc/invoke.texi > --- gnu_trunk.p3/gcc/gcc/doc/invoke.texi 2015-09-10 12:21:00.698911244 +0200 > +++ gnu_trunk.p4/gcc/gcc/doc/invoke.texi 2015-09-14 10:27:20.281932581 +0200 > @@ -13360,6 +13363,8 @@ > floating-point arithmetic (in particular denormal values are treated as > zero), so the use of NEON instructions may lead to a loss of precision. > > +You can also set the fpu name at function level by using the @code{target("mfpu=")} function attributes (@pxref{ARM Function Attributes}) or pragmas (@pxref{Function Specific Option Pragmas}). > + > > s/"mfpu="/"fpu=" > thanks > > --- gnu_trunk.p3/gcc/gcc/testsuite/gcc.target/arm/attr-neon.c 1970-01-01 01:00:00.000000000 +0100 > +++ gnu_trunk.p4/gcc/gcc/testsuite/gcc.target/arm/attr-neon.c 2015-09-14 16:12:08.449698268 +0200 > @@ -0,0 +1,26 @@ > +/* { dg-do compile } */ > +/* { dg-require-effective-target arm_neon_ok } */ > +/* { dg-options "-O3 -mfloat-abi=softfp -ftree-vectorize" } */ > + > +void > +f3(int n, int x[], int y[]) { > + int i; > + for (i = 0; i < n; ++i) > + y[i] = x[i] << 3; > +} > + > > What if GCC has been configured with --with-fpu=neon? > Then f3 will be compiled assuming NEON. You should add a -mfpu=vfp to the dg-options. Ah yes. I've added ((target("fpu=vfp")) instead, since we are testing the attribute.