From: Uros Bizjak
Date: Fri, 8 Jul 2022 09:59:16 +0200
In-Reply-To: <003b01d8929a$70be2cc0$523a8640$@nextmovesoftware.com>
Subject: Re: [x86 PATCH] Fun with flags: Adding stc/clc instructions to i386.md.
To: Roger Sayle
Cc: "gcc-patches@gcc.gnu.org", Segher Boessenkool

On Fri, Jul 8, 2022 at 9:15 AM Roger Sayle wrote:
>
> This patch adds support for x86's single-byte encoded stc (set carry flag)
> and clc (clear carry flag) instructions to i386.md.
>
> The motivating example is the simple code snippet:
>
> unsigned int foo (unsigned int a, unsigned int b, unsigned int *c)
> {
>   return __builtin_ia32_addcarryx_u32 (1, a, b, c);
> }
>
> which uses the target built-in to generate an adc instruction, adding
> together A and B with the incoming carry flag already set. Currently
> for this mainline GCC generates (with -O2):
>
>         movl    $1, %eax
>         addb    $-1, %al
>         adcl    %esi, %edi
>         setc    %al
>         movl    %edi, (%rdx)
>         movzbl  %al, %eax
>         ret
>
> where the first two instructions (to load 1 into a byte register and
> then add 255 to it) are the idiom used to set the carry flag. This
> is a little inefficient as x86 has a "stc" instruction for precisely
> this purpose. With the attached patch we now generate:
>
>         stc
>         adcl    %esi, %edi
>         setc    %al
>         movl    %edi, (%rdx)
>         movzbl  %al, %eax
>         ret

Please note that STC/CLC are quite suboptimal on some older
architectures. For example, on Pentium 4 they have a latency of 10 due
to a false dependency on the flags [1].

[1] https://agner.org/optimize/instruction_tables.pdf

Uros.
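
For readers following the quoted assembly, the arithmetic behind the
"movl $1 / addb $-1" idiom and the subsequent adc can be modelled in
portable C. This is only an illustrative sketch, not code from the patch:
the helper names add8_carry, adc32, foo_model and carry_idiom_self_check
are hypothetical, and the model ignores everything about the flags
register except CF.

```c
#include <assert.h>
#include <stdint.h>

/* Model of an 8-bit add ("addb"): returns the truncated sum and
   stores the carry-out (the CF bit) in *carry_out. */
uint8_t add8_carry(uint8_t x, uint8_t y, unsigned *carry_out)
{
    unsigned wide = (unsigned)x + (unsigned)y;
    *carry_out = wide >> 8;          /* 1 if the sum did not fit in 8 bits */
    return (uint8_t)wide;
}

/* Model of a 32-bit add-with-carry ("adcl"): stores a + b + carry_in
   in *sum and returns the carry-out. */
unsigned adc32(unsigned carry_in, uint32_t a, uint32_t b, uint32_t *sum)
{
    uint64_t wide = (uint64_t)a + (uint64_t)b + (carry_in ? 1u : 0u);
    *sum = (uint32_t)wide;
    return (unsigned)(wide >> 32);
}

/* Model of foo() from the quoted message, with the carry-in produced
   the way the quoted -O2 output does it: 1 + 0xFF overflows 8 bits. */
unsigned foo_model(uint32_t a, uint32_t b, uint32_t *c)
{
    unsigned cf;
    (void)add8_carry(1, 0xFF, &cf);  /* movl $1, %eax ; addb $-1, %al */
    return adc32(cf, a, b, c);       /* adcl ; setc ; movl to *c */
}

/* Small usage example: the idiom leaves CF set, so foo_model adds a + b + 1. */
void carry_idiom_self_check(void)
{
    unsigned cf;
    uint32_t sum;

    assert(add8_carry(1, 0xFF, &cf) == 0 && cf == 1);
    assert(foo_model(2, 3, &sum) == 0 && sum == 6);
}
```

In this model, replacing the add8_carry call with "cf = 1" corresponds to
what stc does directly; the result of the adc is the same either way, so
the choice between the two sequences is purely a question of encoding
size and per-microarchitecture cost.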