From: Krister Walfridsson
Date: Wed, 31 Aug 2022 23:55:42 +0000 (UTC)
To: gcc@gcc.gnu.org
Subject: GIMPLE undefined behavior

I'm implementing a tool for translation validation (similar to Alive2 for
LLVM). The tool uses an SMT solver to verify for each GIMPLE pass that the
output IR is a refinement of the input IR:
 * That each compiled function returns an identical result before/after
   the pass (for input that does not invoke UB)
 * That each function does not have additional UB after the pass
 * That values in global memory are identical after the two versions of
   the function are run

I have reported three bugs (106523, 106613, 106744) where the tool has
found differences in the result, but it is a bit unclear to me what is
considered undefined behavior in GIMPLE, so I have not reported any such
cases yet...

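The three refinement conditions can be illustrated with a small sketch. This is not the tool itself: it replaces the SMT solver with brute-force enumeration over a toy 4-bit signed type, and the names `refines`, `add`, `src`, `tgt_ok`, and `tgt_bad` are hypothetical, chosen only for illustration.

```python
# Toy 4-bit signed int so that every input can be enumerated.
INT_MIN, INT_MAX = -8, 7


def add(a, b):
    """Signed add; returns (value, ub) where ub is True on overflow."""
    r = a + b
    return r, not (INT_MIN <= r <= INT_MAX)


def refines(src, tgt, inputs):
    """Return the first input where tgt fails to refine src, else None.

    Each function returns (return value, invoked_ub, global memory).
    """
    for args in inputs:
        s_ret, s_ub, s_mem = src(*args)
        if s_ub:
            continue  # source already has UB here: the target may do anything
        t_ret, t_ub, t_mem = tgt(*args)
        if t_ub or t_ret != s_ret or t_mem != s_mem:
            return args
    return None


def src(x):
    r, ub = add(x, 1)
    return r, ub, {}


def tgt_ok(x):
    # Same computation with operands swapped: no new UB.
    r, ub = add(1, x)
    return r, ub, {}


def tgt_bad(x):
    # (x + 2) - 1 overflows for x == 6, where x + 1 does not.
    t, ub1 = add(x, 2)
    r, ub2 = add(t, -1)
    return r, ub1 or ub2, {}


inputs = [(x,) for x in range(INT_MIN, INT_MAX + 1)]
```

Here `refines(src, tgt_ok, inputs)` finds no counterexample, while `refines(src, tgt_bad, inputs)` reports the input where the target introduces an overflow that the source did not have.
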
For example, ifcombine optimizes

  int foo (int x, int a, int b)
  {
    int c;
    int _1;
    int _2;
    int _3;
    int _4;

    <bb 2> :
    c_6 = 1 << a_5(D);
    _1 = c_6 & x_7(D);
    if (_1 != 0)
      goto <bb 3>;
    else
      goto <bb 5>;

    <bb 3> :
    _2 = x_7(D) >> b_8(D);
    _3 = _2 & 1;
    if (_3 != 0)
      goto <bb 4>;
    else
      goto <bb 5>;

    <bb 4> :

    <bb 5> :
    # _4 = PHI <2(4), 0(2), 0(3)>
    return _4;
  }

to

  int foo (int x, int a, int b)
  {
    int c;
    int _4;
    int _10;
    int _11;
    int _12;
    int _13;

    <bb 2> :
    _10 = 1 << b_8(D);
    _11 = 1 << a_5(D);
    _12 = _10 | _11;
    _13 = x_7(D) & _12;
    if (_12 == _13)
      goto <bb 3>;
    else
      goto <bb 4>;

    <bb 3> :

    <bb 4> :
    # _4 = PHI <2(3), 0(2)>
    return _4;
  }

Both return the same value for foo(8, 1, 34), but the optimized version
shifts more than 31 bits when calculating _10. tree.def says for
LSHIFT_EXPR that "the result is undefined" if the number of bits to shift
by is larger than the type size, which I interpret as it just should be
considered to return an arbitrary value. I.e., this does not invoke
undefined behavior, so the optimization is allowed. Is my understanding
correct?

What about signed integer wrapping for PLUS_EXPR? This happens for

  int f (int c, int s)
  {
    int y2;
    int y1;
    int x2;
    int x1;
    int _7;

    <bb 2> :
    y1_2 = c_1(D) + 2;
    x1_4 = y1_2 * s_3(D);
    y2_5 = c_1(D) + 4;
    x2_6 = s_3(D) * y2_5;
    _7 = x1_4 + x2_6;
    return _7;
  }

where slsr optimizes this to

  int f (int c, int s)
  {
    int y1;
    int x2;
    int x1;
    int _7;
    int slsr_9;

    <bb 2> :
    y1_2 = c_1(D) + 2;
    x1_4 = y1_2 * s_3(D);
    slsr_9 = s_3(D) * 2;
    x2_6 = x1_4 + slsr_9;
    _7 = x1_4 + x2_6;
    return _7;
  }

Calling f(-3, 0x75181005) makes slsr_9 overflow in the optimized code,
even though the original did not overflow. My understanding is that signed
overflow invokes undefined behavior in GIMPLE, so this is a bug in slsr.
Is my understanding correct?

I would appreciate some comments on which non-memory-related operations I
should treat as invoking undefined behavior (memory operations are more
complicated, and I'll be back with more concrete questions later...).

   /Krister
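
The two concrete inputs above can be replayed with a small Python model. This is a sketch under the interpretation stated in the mail, which is not confirmed GIMPLE semantics: an out-of-range shift yields an arbitrary value (passed in as the hypothetical `arbitrary` parameter), and signed overflow in `+` and `*` is undefined behavior (modeled by the hypothetical `UB` exception). All helper names are illustrative.

```python
INT_MIN, INT_MAX = -2**31, 2**31 - 1


class UB(Exception):
    """Raised when the modeled operation invokes undefined behavior."""


def wrap(v):
    """Reduce a Python int to a signed 32-bit value."""
    v &= 0xFFFFFFFF
    return v - 2**32 if v > INT_MAX else v


def shl(a, n, arbitrary):
    # Assumption: out-of-range shift gives an arbitrary value, not UB.
    return wrap(a << n) if 0 <= n < 32 else arbitrary


def shr(a, n, arbitrary):
    return a >> n if 0 <= n < 32 else arbitrary


def checked(v):
    # Assumption: signed overflow is UB.
    if not INT_MIN <= v <= INT_MAX:
        raise UB(v)
    return v


def foo_before(x, a, b, arbitrary):
    c = shl(1, a, arbitrary)
    if (c & x) != 0 and (shr(x, b, arbitrary) & 1) != 0:
        return 2
    return 0


def foo_after(x, a, b, arbitrary):
    m = shl(1, b, arbitrary) | shl(1, a, arbitrary)
    return 2 if m == (x & m) else 0


def f_before(c, s):
    x1 = checked(checked(c + 2) * s)
    x2 = checked(s * checked(c + 4))
    return checked(x1 + x2)


def f_after(c, s):
    x1 = checked(checked(c + 2) * s)
    slsr = checked(s * 2)
    x2 = checked(x1 + slsr)
    return checked(x1 + x2)


def invokes_ub(fn, *args):
    try:
        fn(*args)
    except UB:
        return True
    return False


# A few sample choices for the arbitrary shift result.
samples = [0, 1, 4, -1, INT_MIN, INT_MAX]
```

Under these assumptions, `foo_before(8, 1, 34, v)` and `foo_after(8, 1, 34, v)` agree for every `v` in `samples`, because bit 1 of the combined mask is set while bit 1 of `x` is clear. `f_before(-3, 0x75181005)` evaluates without overflow, whereas `invokes_ub(f_after, -3, 0x75181005)` is true because `s * 2` does not fit in 32 bits.
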