From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: (qmail 20367 invoked by alias); 15 Jan 2020 13:19:10 -0000 Mailing-List: contact gcc-patches-help@gcc.gnu.org; run by ezmlm Precedence: bulk List-Id: List-Archive: List-Post: List-Help: Sender: gcc-patches-owner@gcc.gnu.org Received: (qmail 20236 invoked by uid 89); 15 Jan 2020 13:19:09 -0000 Authentication-Results: sourceware.org; auth=none X-Spam-SWARE-Status: No, score=-26.9 required=5.0 tests=BAYES_00,FREEMAIL_FROM,GIT_PATCH_0,GIT_PATCH_1,GIT_PATCH_2,GIT_PATCH_3,RCVD_IN_DNSWL_NONE,SPF_PASS autolearn=ham version=3.3.1 spammy=s.a, sb, 92765, REF X-HELO: mail-wr1-f52.google.com Received: from mail-wr1-f52.google.com (HELO mail-wr1-f52.google.com) (209.85.221.52) by sourceware.org (qpsmtpd/0.93/v0.84-503-g423c35a) with ESMTP; Wed, 15 Jan 2020 13:18:59 +0000 Received: by mail-wr1-f52.google.com with SMTP id b6so15745247wrq.0 for ; Wed, 15 Jan 2020 05:18:59 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20161025; h=to:from:subject:message-id:date:user-agent:mime-version :content-language; bh=jsuU2DTOopGPPi8/QCRpXM/wUpNbLMJJsKu5u/bV4ik=; b=DIU2FpfmVs3XvENpa0+N934uF5IdIg4q8oNNaBcbmBcbOZSfZANVfyH+YilsJ+v/U8 e62J1ibGwbRIwhmU6izNGXrj4LsH+b74QepoOQjU61LcBRBUAQ7GxXq6iGgez3dRpYpM HxxI5AOxebNI+yoLrMKURBzkWALwN8puGvKPYkRknPXpOC/ypDcqK108Yz7t7YN1o7+5 Cxw0tDEvhpFARfHyFDww5s3c56ghcpP3fYgm9HswLEy3fP9g2yi/1zUTAkIzJP/Sapop lhfj1AFHU9dbMuNTcXQEE6SFq9TIuvDnculNRwYxAH/BJjSYx6IHm8/196RKKaW6CmBt zmKA== Return-Path: Received: from ?IPv6:2a02:8084:d6bd:2b80:df53:c350:cc3e:ee3d? ([2a02:8084:d6bd:2b80:df53:c350:cc3e:ee3d]) by smtp.gmail.com with ESMTPSA id t12sm24186847wrs.96.2020.01.15.05.18.56 for (version=TLS1_3 cipher=TLS_AES_128_GCM_SHA256 bits=128/128); Wed, 15 Jan 2020 05:18:56 -0800 (PST) To: gcc-patches@gcc.gnu.org From: Martin Sebor Subject: [PATCH] adjust object size computation for union accesses and PHIs (PR 92765) Message-ID: <64a504e5-8e1f-682a-0443-e74e62d3aac1@gmail.com> Date: Wed, 15 Jan 2020 13:25:00 -0000 User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:68.0) Gecko/20100101 Thunderbird/68.2.2 MIME-Version: 1.0 Content-Type: multipart/mixed; boundary="------------5A1C54AF85682E78BB2F8CC5" X-IsSubscribed: yes X-SW-Source: 2020-01/txt/msg00883.txt.bz2 This is a multi-part message in MIME format. --------------5A1C54AF85682E78BB2F8CC5 Content-Type: text/plain; charset=utf-8; format=flowed Content-Transfer-Encoding: 7bit Content-length: 1143 The strcmp optimization newly introduced in GCC 10 relies on the size of the smallest referenced array object to determine whether the function can return zero. When the size of the object is smaller than the length of the other string argument the optimization folds the equality to false. The bug report has identified a couple of problems here: 1) when the access to the array object is via a pointer to a (possibly indirect) member of a union, in GIMPLE the pointer may actually point to a different member than the one in the original source code. Thus the size of the array may appear to be smaller than in the source code which can then result in the optimization being invalid. 2) when the pointer in the access may point to two or more arrays of different size (i.e., it's the result of a PHI), assuming it points to the smallest of them can also lead to an incorrect result when the optimization is applied. The attached patch adjusts the optimization to 1) avoid making any assumptions about the sizes of objects accessed via union types, and b) use the size of the largest object in PHI nodes. Tested on x86_64-linux. Martin --------------5A1C54AF85682E78BB2F8CC5 Content-Type: text/x-patch; charset=UTF-8; name="gcc-92765.diff" Content-Transfer-Encoding: 7bit Content-Disposition: attachment; filename="gcc-92765.diff" Content-length: 5119 PR tree-optimization/92765 - wrong code for strcmp of a union member gcc/testsuite/ChangeLog: PR tree-optimization/92765 * gcc.dg/strlenopt-92.c: New test. gcc/ChangeLog: PR tree-optimization/92765 * tree-ssa-strlen.c (component_ref_via_union_p): New function. (determine_min_objsize): Call it. Use the maximum object size for PHI arguments. diff --git a/gcc/testsuite/gcc.dg/strlenopt-92.c b/gcc/testsuite/gcc.dg/strlenopt-92.c new file mode 100644 index 00000000000..44ad5b88854 --- /dev/null +++ b/gcc/testsuite/gcc.dg/strlenopt-92.c @@ -0,0 +1,94 @@ +/* PR tree-optimization/92765 - wrong code for strcmp of a union member + { dg-do run } + { dg-options "-O2 -Wall" } */ + +#include "strlenopt.h" + +extern void free (void*); +extern void* calloc (size_t, size_t); +extern void* malloc (size_t); + + +/* Test from comment #0. */ +struct S0 { + char a[2]; +}; + +union U0 { + char b[4]; + struct S0 s; +}; + +union U0 u0; +union U0 *p0 = &u0; + +int test_0 () +{ + u0.b[0] = 'a'; + u0.b[1] = 'b'; + u0.b[2] = '\0'; + + int x = memcmp (p0->s.a, "x", 2); + + if (strcmp (p0->b, "ab")) + abort (); + + return x; +} + + +/* Test from comment #6. */ +union U1 { struct S1 { char a[2]; char b[2]; char c[2]; } s; char d[6]; } u1; + +__attribute__((noipa)) void +barrier (char *p) +{ + asm volatile ("" : : "g" (p) : "memory"); +} + +__attribute__((noipa)) void +test_1 (union U1 *x) +{ + char *p = (char *) &x->s.b; + barrier (p); + if (strcmp (&x->d[2], "cde")) + abort (); +} + +/* Test from comment #7. */ + +__attribute__((noipa)) int +barrier_copy (char *x, int y) +{ + asm volatile ("" : : "g" (x), "g" (y) : "memory"); + if (y == 0) + strcpy (x, "abcd"); + return y; +} + +__attribute__((noipa)) char * +test_2 (int x) +{ + char *p; + if (x) + p = malloc (2); + else + p = calloc (16, 1); + if (barrier_copy (p, x)) + return p; + if (strcmp (p, "abcd") != 0) + abort (); + return p; +} + + +int main (void) +{ + test_0 (); + + strcpy (u1.d, "abcde"); + test_1 (&u1); + + free (test_2 (0)); + free (test_2 (1)); +} diff --git a/gcc/tree-ssa-strlen.c b/gcc/tree-ssa-strlen.c index ad9e98973b1..f4b6aadae47 100644 --- a/gcc/tree-ssa-strlen.c +++ b/gcc/tree-ssa-strlen.c @@ -4087,6 +4087,29 @@ compute_string_length (int idx) return string_leni; } +/* Returns true if REF is a reference to a member of a union type, + or a member of such a type (traversing all references along + the path). Used to avoid making assumptions about accesses + to members that could also be accessed by other members of + incompatible types. */ + +static bool +component_ref_via_union_p (tree ref) +{ + if (TREE_CODE (ref) == ADDR_EXPR) + ref = TREE_OPERAND (ref, 0); + + while (TREE_CODE (ref) == MEM_REF || handled_component_p (ref)) + { + tree type = TREE_TYPE (ref); + if (TREE_CODE (type) == UNION_TYPE) + return true; + ref = TREE_OPERAND (ref, 0); + } + + return false; +} + /* Determine the minimum size of the object referenced by DEST expression which must have a pointer type. Return the minimum size of the object if successful or HWI_M1U when @@ -4099,14 +4122,18 @@ determine_min_objsize (tree dest) init_object_sizes (); - if (compute_builtin_object_size (dest, 2, &size)) - return size; - /* Try to determine the size of the object through the RHS of the assign statement. */ if (TREE_CODE (dest) == SSA_NAME) { gimple *stmt = SSA_NAME_DEF_STMT (dest); + + /* Determine the size of the largest object when DEST refers + to two or more via a PHI, otherwise the smallest. */ + int ostype = gimple_code (stmt) == GIMPLE_PHI ? 0 : 2; + if (compute_builtin_object_size (dest, ostype, &size)) + return size; + if (!is_gimple_assign (stmt)) return HOST_WIDE_INT_M1U; @@ -4118,6 +4145,10 @@ determine_min_objsize (tree dest) return determine_min_objsize (dest); } + /* Try to determine the size of the referenced object itself. */ + if (compute_builtin_object_size (dest, 2, &size)) + return size; + /* Try to determine the size of the object from its type. */ if (TREE_CODE (dest) != ADDR_EXPR) return HOST_WIDE_INT_M1U; @@ -4129,11 +4160,13 @@ determine_min_objsize (tree dest) type = TYPE_MAIN_VARIANT (type); /* The size of a flexible array cannot be determined. Otherwise, - for arrays with more than one element, return the size of its - type. GCC itself misuses arrays of both zero and one elements - as flexible array members so they are excluded as well. */ + unless the reference involves a union, for arrays with more than + one element, return the size of its type. GCC itself misuses + arrays of both zero and one elements as flexible array members + so they are excluded as well. */ if (TREE_CODE (type) != ARRAY_TYPE - || !array_at_struct_end_p (dest)) + || (!component_ref_via_union_p (dest) + && !array_at_struct_end_p (dest))) { tree type_size = TYPE_SIZE_UNIT (type); if (type_size && TREE_CODE (type_size) == INTEGER_CST --------------5A1C54AF85682E78BB2F8CC5--