From: "H.J. Lu"
To: gcc-patches@gcc.gnu.org
Subject: [PATCH v3 0/2] Generate offset adjusted operation for op_by_pieces operations
Date: Mon, 26 Apr 2021 18:14:24 -0700
Message-Id: <20210427011426.479089-1-hjl.tools@gmail.com>

Add an overlap_op_by_pieces_p target hook for op_by_pieces operations
between two areas of memory.  When the hook is enabled, the last piece
operation of a memory region generates one offset adjusted operation in
the smallest integer mode for the remaining bytes, which avoids doing
more than one smaller operation.

Pass the RTL information from the previous iteration to m_constfn in
op_by_pieces operations so that builtin_memset_[read|gen]_str can
generate the new RTL from the previous RTL.

The v3 changes are:

1. Split the conversion of the while loop in op_by_pieces_d::run to a
   do-while loop into a separate patch for easier review.
2. Simplify the builtin_memset_read_str change.
3. Document that the offset adjusted operation is unaligned.

The v2 changes are:

1. Added a target hook, TARGET_OVERLAP_OP_BY_PIECES_P.
2. Added a pointer argument to pieces_addr::adjust to pass the RTL
   information from the previous iteration to m_constfn.
3. Updated builtin_memset_read_str and builtin_memset_gen_str to
   generate the new RTL from the previous RTL info.

H.J. Lu (2):
  op_by_pieces_d::run: Change a while loop to a do-while loop
  Generate offset adjusted operation for op_by_pieces operations

 gcc/builtins.c                             |  36 ++++-
 gcc/builtins.h                             |   6 +-
 gcc/config/i386/i386.c                     |   3 +
 gcc/doc/tm.texi                            |   7 +
 gcc/doc/tm.texi.in                         |   2 +
 gcc/expr.c                                 | 171 ++++++++++++++++-----
 gcc/expr.h                                 |  10 +-
 gcc/target.def                             |   9 ++
 gcc/testsuite/g++.dg/pr90773-1.h           |  14 ++
 gcc/testsuite/g++.dg/pr90773-1a.C          |  13 ++
 gcc/testsuite/g++.dg/pr90773-1b.C          |   5 +
 gcc/testsuite/g++.dg/pr90773-1c.C          |   5 +
 gcc/testsuite/g++.dg/pr90773-1d.C          |  19 +++
 gcc/testsuite/gcc.target/i386/pr90773-1.c  |  17 ++
 gcc/testsuite/gcc.target/i386/pr90773-10.c |  13 ++
 gcc/testsuite/gcc.target/i386/pr90773-11.c |  13 ++
 gcc/testsuite/gcc.target/i386/pr90773-12.c |  11 ++
 gcc/testsuite/gcc.target/i386/pr90773-13.c |  11 ++
 gcc/testsuite/gcc.target/i386/pr90773-14.c |  13 ++
 gcc/testsuite/gcc.target/i386/pr90773-2.c  |  20 +++
 gcc/testsuite/gcc.target/i386/pr90773-3.c  |  23 +++
 gcc/testsuite/gcc.target/i386/pr90773-4.c  |  13 ++
 gcc/testsuite/gcc.target/i386/pr90773-5.c  |  13 ++
 gcc/testsuite/gcc.target/i386/pr90773-6.c  |  11 ++
 gcc/testsuite/gcc.target/i386/pr90773-7.c  |  11 ++
 gcc/testsuite/gcc.target/i386/pr90773-8.c  |  13 ++
 gcc/testsuite/gcc.target/i386/pr90773-9.c  |  13 ++
 27 files changed, 446 insertions(+), 49 deletions(-)
 create mode 100644 gcc/testsuite/g++.dg/pr90773-1.h
 create mode 100644 gcc/testsuite/g++.dg/pr90773-1a.C
 create mode 100644 gcc/testsuite/g++.dg/pr90773-1b.C
 create mode 100644 gcc/testsuite/g++.dg/pr90773-1c.C
 create mode 100644 gcc/testsuite/g++.dg/pr90773-1d.C
 create mode 100644 gcc/testsuite/gcc.target/i386/pr90773-1.c
 create mode 100644 gcc/testsuite/gcc.target/i386/pr90773-10.c
 create mode 100644 gcc/testsuite/gcc.target/i386/pr90773-11.c
 create mode 100644 gcc/testsuite/gcc.target/i386/pr90773-12.c
 create mode 100644 gcc/testsuite/gcc.target/i386/pr90773-13.c
 create mode 100644 gcc/testsuite/gcc.target/i386/pr90773-14.c
 create mode 100644 gcc/testsuite/gcc.target/i386/pr90773-2.c
 create mode 100644 gcc/testsuite/gcc.target/i386/pr90773-3.c
 create mode 100644 gcc/testsuite/gcc.target/i386/pr90773-4.c
 create mode 100644 gcc/testsuite/gcc.target/i386/pr90773-5.c
 create mode 100644 gcc/testsuite/gcc.target/i386/pr90773-6.c
 create mode 100644 gcc/testsuite/gcc.target/i386/pr90773-7.c
 create mode 100644 gcc/testsuite/gcc.target/i386/pr90773-8.c
 create mode 100644 gcc/testsuite/gcc.target/i386/pr90773-9.c

-- 
2.31.1
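
For illustration only, the offset adjusted last piece described in the cover letter can be sketched in plain C. This is not code from the patches, which operate on RTL inside the compiler; the helper names smallest_piece_for and copy_by_pieces_overlap are hypothetical, and a power-of-two byte count stands in for an integer mode. The sketch assumes max_piece is a power of two and that overlapping the final piece with bytes already copied is acceptable.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: smallest power-of-two piece size that covers
   `remaining` bytes.  It stands in for "the smallest integer mode for
   the remaining bytes".  */
static size_t smallest_piece_for(size_t remaining)
{
    size_t piece = 1;
    while (piece < remaining)
        piece *= 2;
    return piece;
}

/* Copy `len` bytes using pieces of at most `max_piece` bytes (assumed
   to be a power of two).  Returns the number of piece operations.  */
static unsigned copy_by_pieces_overlap(unsigned char *dst,
                                       const unsigned char *src,
                                       size_t len, size_t max_piece)
{
    unsigned ops = 0;
    size_t offset = 0;

    /* Full-width pieces first.  */
    while (len - offset >= max_piece) {
        memcpy(dst + offset, src + offset, max_piece);
        offset += max_piece;
        ops++;
    }

    size_t remaining = len - offset;
    if (remaining == 0)
        return ops;

    size_t piece = smallest_piece_for(remaining);
    if (piece <= len) {
        /* Move the offset back so that one (possibly unaligned) piece
           ends exactly at `len`.  It may overlap bytes already copied.  */
        offset = len - piece;
        memcpy(dst + offset, src + offset, piece);
        return ops + 1;
    }

    /* The region is shorter than the covering piece, so there is
       nothing to overlap with: fall back to decreasing piece sizes.  */
    for (size_t p = max_piece; p >= 1; p /= 2) {
        while (remaining >= p) {
            memcpy(dst + offset, src + offset, p);
            offset += p;
            remaining -= p;
            ops++;
        }
    }
    return ops;
}
```

With len = 15 and max_piece = 8, the sketch issues one 8-byte piece at offset 0 and one 8-byte piece at offset 7, rather than 8 + 4 + 2 + 1. When the whole region is smaller than the covering piece (for example len = 7), there is no room to move the offset back, so the sketch falls back to 4 + 2 + 1.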