From: "hubicka at ucw dot cz"
To: gcc-bugs@gcc.gnu.org
Subject: [Bug c/97445] Some functions marked static inline in Linux kernel are not inlined
Date: Tue, 20 Oct 2020 11:12:52 +0000

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=97445

--- Comment #34 from Jan Hubicka ---
> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=97445
>
> --- Comment #33 from Jakub Jelinek ---
> (In reply to Jan Hubicka from comment #32)
> > get_order is a wrapper around fls64.  This can be implemented without
> > an asm statement as follows:
> >
> > int
> > my_fls64 (__u64 x)
> > {
> >   if (!x)
> >     return 0;
> >   return 64 - __builtin_clzl (x);
> > }
> >
> > This results in longer assembly than the kernel asm implementation.
> > If that matters, I would replace the __builtin_constant_p part of
> > get_order by this implementation, which is more transparent to the
> > code size estimation, and things will get inlined.
>
> Better __builtin_clzll, so that it also works on 32-bit arches.
> Anyway, if the kernel's fls64 results in better code than my_fls64, we
> should look at GCC's code generation for that case.

The original asm is:

__attribute__ ((noinline))
int fls64 (__u64 x)
{
  int bitpos = -1;
  asm ("bsrq %1,%q0" : "+r" (bitpos) : "rm" (x));
  return bitpos + 1;
}

There seems to be a bug in the bsr{q} pattern.  I can make GCC produce the
same code with:

__attribute__ ((noinline))
int my_fls64 (__u64 x)
{
  asm volatile ("movl $-1, %eax");
  return (__builtin_clzll (x) ^ 63) + 1;
}

but obviously the volatile asm should not be needed.  I think bsrq is
incorrectly modelled as returning the full register:

(define_insn "bsr_rex64"
  [(set (match_operand:DI 0 "register_operand" "=r")
        (minus:DI (const_int 63)
                  (clz:DI (match_operand:DI 1 "nonimmediate_operand" "rm"))))
   (clobber (reg:CC FLAGS_REG))]
  "TARGET_64BIT"
  "bsr{q}\t{%1, %0|%0, %1}"
  [(set_attr "type" "alu1")
   (set_attr "prefix_0f" "1")
   (set_attr "znver1_decode" "vector")
   (set_attr "mode" "DI")])
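
For reference, a self-contained testcase for comparing the code generated
for the two variants.  This is only a sketch: the names fls64_asm and
fls64_builtin are mine, __u64 is spelled unsigned long long so the file
builds outside the kernel headers, and it is x86-64 only because of bsrq.

/* Compile with gcc -O2 -S and diff the assembly of the two functions.  */

__attribute__ ((noinline))
int fls64_asm (unsigned long long x)
{
  /* bsr leaves the destination unchanged when the source is zero
     (AMD documents this behaviour, and the kernel relies on it), so
     presetting bitpos to -1 makes fls64_asm (0) return 0.  */
  int bitpos = -1;
  asm ("bsrq %1,%q0" : "+r" (bitpos) : "rm" (x));
  return bitpos + 1;
}

__attribute__ ((noinline))
int fls64_builtin (unsigned long long x)
{
  if (!x)
    return 0;
  /* For x != 0, __builtin_clzll (x) ^ 63 equals 63 - clz, so this
     computes 64 - clz, the 1-based index of the highest set bit.  */
  return (__builtin_clzll (x) ^ 63) + 1;
}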