References: <87a7wra3ba.fsf@linaro.org> <87d11fy971.fsf@linaro.org>
From: Richard Biener
Date: Thu, 08 Feb 2018 15:06:00 -0000
Subject: Re: Use nonzero bits to refine range in split_constant_offset (PR 81635)
To: Richard Biener, GCC Patches, Richard Sandiford

On Thu, Feb 8, 2018 at 1:09 PM, Richard Sandiford wrote:
> Richard Biener writes:
>> On Fri, Feb 2, 2018 at 3:12 PM, Richard Sandiford
>> wrote:
>>> Index: gcc/tree-data-ref.c
>>> ===================================================================
>>> --- gcc/tree-data-ref.c 2018-02-02 14:03:53.964530009 +0000
>>> +++ gcc/tree-data-ref.c 2018-02-02 14:03:54.184521826 +0000
>>> @@ -721,7 +721,13 @@ split_constant_offset_1 (tree type, tree
>>>        if (TREE_CODE (tmp_var) != SSA_NAME)
>>>          return false;
>>>        wide_int var_min, var_max;
>>> -      if (get_range_info (tmp_var, &var_min, &var_max) != VR_RANGE)
>>> +      value_range_type vr_type = get_range_info (tmp_var, &var_min,
>>> +                                                 &var_max);
>>> +      wide_int var_nonzero = get_nonzero_bits (tmp_var);
>>> +      signop sgn = TYPE_SIGN (itype);
>>> +      if (intersect_range_with_nonzero_bits (vr_type, &var_min,
>>> +                                             &var_max, var_nonzero,
>>> +                                             sgn) != VR_RANGE)
>>
>> Above it looks like we could go from VR_RANGE to VR_UNDEFINED.
>> I'm not sure if the original range-info might be useful in this case -
>> if it may be can we simply use only the range info if it was VR_RANGE?
>
> I think we only drop to VR_UNDEFINED if we have contradictory
> information: nonzero bits says some bits must be clear, but the range
> only contains values for which the bits are set.  In that case I think
> we should either be conservative and not use the information, or be
> aggressive and say that we have undefined behaviour, so overflow is OK.
>
> It seems a bit of a fudge to go back to the old range when we know it's
> false, and use it to allow the split some times and not others.

Fine.

> Thanks,
> Richard
>
>>
>> Ok otherwise.
>> Thanks,
>> Richard.
>>
>>>          return false;
>>>
>>>        /* See whether the range of OP0 (i.e. TMP_VAR + TMP_OFF)
>>> @@ -729,7 +735,6 @@ split_constant_offset_1 (tree type, tree
>>>           operations done in ITYPE.  The addition must overflow
>>>           at both ends of the range or at neither.  */
>>>        bool overflow[2];
>>> -      signop sgn = TYPE_SIGN (itype);
>>>        unsigned int prec = TYPE_PRECISION (itype);
>>>        wide_int woff = wi::to_wide (tmp_off, prec);
>>>        wide_int op0_min = wi::add (var_min, woff, sgn, &overflow[0]);
>>> Index: gcc/testsuite/gcc.dg/vect/bb-slp-pr81635-3.c
>>> ===================================================================
>>> --- /dev/null 2018-02-02 09:03:36.168354735 +0000
>>> +++ gcc/testsuite/gcc.dg/vect/bb-slp-pr81635-3.c 2018-02-02 14:03:54.183521863 +0000
>>> @@ -0,0 +1,62 @@
>>> +/* { dg-do compile } */
>>> +/* { dg-additional-options "-fno-tree-loop-vectorize" } */
>>> +/* { dg-require-effective-target vect_double } */
>>> +/* { dg-require-effective-target lp64 } */
>>> +
>>> +void
>>> +f1 (double *p, double *q, unsigned int n)
>>> +{
>>> +  p = (double *) __builtin_assume_aligned (p, sizeof (double) * 2);
>>> +  q = (double *) __builtin_assume_aligned (q, sizeof (double) * 2);
>>> +  for (unsigned int i = 0; i < n; i += 4)
>>> +    {
>>> +      double a = q[i] + p[i];
>>> +      double b = q[i + 1] + p[i + 1];
>>> +      q[i] = a;
>>> +      q[i + 1] = b;
>>> +    }
>>> +}
>>> +
>>> +void
>>> +f2 (double *p, double *q, unsigned int n)
>>> +{
>>> +  p = (double *) __builtin_assume_aligned (p, sizeof (double) * 2);
>>> +  q = (double *) __builtin_assume_aligned (q, sizeof (double) * 2);
>>> +  for (unsigned int i = 0; i < n; i += 2)
>>> +    {
>>> +      double a = q[i] + p[i];
>>> +      double b = q[i + 1] + p[i + 1];
>>> +      q[i] = a;
>>> +      q[i + 1] = b;
>>> +    }
>>> +}
>>> +
>>> +void
>>> +f3 (double *p, double *q, unsigned int n)
>>> +{
>>> +  p = (double *) __builtin_assume_aligned (p, sizeof (double) * 2);
>>> +  q = (double *) __builtin_assume_aligned (q, sizeof (double) * 2);
>>> +  for (unsigned int i = 0; i < n; i += 6)
>>> +    {
>>> +      double a = q[i] + p[i];
>>> +      double b = q[i + 1] + p[i + 1];
>>> +      q[i] = a;
>>> +      q[i + 1] = b;
>>> +    }
>>> +}
>>> +
>>> +void
>>> +f4 (double *p, double *q, unsigned int start, unsigned int n)
>>> +{
>>> +  p = (double *) __builtin_assume_aligned (p, sizeof (double) * 2);
>>> +  q = (double *) __builtin_assume_aligned (q, sizeof (double) * 2);
>>> +  for (unsigned int i = start & -2; i < n; i += 2)
>>> +    {
>>> +      double a = q[i] + p[i];
>>> +      double b = q[i + 1] + p[i + 1];
>>> +      q[i] = a;
>>> +      q[i + 1] = b;
>>> +    }
>>> +}
>>> +
>>> +/* { dg-final { scan-tree-dump-times "basic block vectorized" 4 "slp1" } } */
>>> Index: gcc/testsuite/gcc.dg/vect/bb-slp-pr81635-4.c
>>> ===================================================================
>>> --- /dev/null 2018-02-02 09:03:36.168354735 +0000
>>> +++ gcc/testsuite/gcc.dg/vect/bb-slp-pr81635-4.c 2018-02-02 14:03:54.183521863 +0000
>>> @@ -0,0 +1,47 @@
>>> +/* { dg-do compile } */
>>> +/* { dg-additional-options "-fno-tree-loop-vectorize" } */
>>> +/* { dg-require-effective-target lp64 } */
>>> +
>>> +void
>>> +f1 (double *p, double *q, unsigned int n)
>>> +{
>>> +  p = (double *) __builtin_assume_aligned (p, sizeof (double) * 2);
>>> +  q = (double *) __builtin_assume_aligned (q, sizeof (double) * 2);
>>> +  for (unsigned int i = 0; i < n; i += 1)
>>> +    {
>>> +      double a = q[i] + p[i];
>>> +      double b = q[i + 1] + p[i + 1];
>>> +      q[i] = a;
>>> +      q[i + 1] = b;
>>> +    }
>>> +}
>>> +
>>> +void
>>> +f2 (double *p, double *q, unsigned int n)
>>> +{
>>> +  p = (double *) __builtin_assume_aligned (p, sizeof (double) * 2);
>>> +  q = (double *) __builtin_assume_aligned (q, sizeof (double) * 2);
>>> +  for (unsigned int i = 0; i < n; i += 3)
>>> +    {
>>> +      double a = q[i] + p[i];
>>> +      double b = q[i + 1] + p[i + 1];
>>> +      q[i] = a;
>>> +      q[i + 1] = b;
>>> +    }
>>> +}
>>> +
>>> +void
>>> +f3 (double *p, double *q, unsigned int start, unsigned int n)
>>> +{
>>> +  p = (double *) __builtin_assume_aligned (p, sizeof (double) * 2);
>>> +  q = (double *) __builtin_assume_aligned (q, sizeof (double) * 2);
>>> +  for (unsigned int i = start; i < n; i += 2)
>>> +    {
>>> +      double a = q[i] + p[i];
>>> +      double b = q[i + 1] + p[i + 1];
>>> +      q[i] = a;
>>> +      q[i + 1] = b;
>>> +    }
>>> +}
>>> +
>>> +/* { dg-final { scan-tree-dump-not "basic block vectorized" "slp1" } } */
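
For anyone following the VR_UNDEFINED discussion, here is a minimal standalone sketch in plain C of the idea, not the GCC code itself. Everything in it is an assumption made for illustration: the `u32_range` struct and the helpers `refine_range` and `offset_can_be_split` are hypothetical names, it works on 32-bit unsigned values only, and it only exploits known-zero low bits of the mask (the alignment case). `valid == false` stands in for the contradictory outcome described in the thread.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical simplified range type; not GCC's value_range.  */
struct u32_range
{
  uint32_t min;
  uint32_t max;
  bool valid;	/* false models the contradictory case.  */
};

/* Narrow [MIN, MAX] using NONZERO, a mask of the bits that may be set.
   This sketch only uses the known-zero low bits of the mask, so the
   result is conservative: it never excludes a possible value.  */
static struct u32_range
refine_range (uint32_t min, uint32_t max, uint32_t nonzero)
{
  struct u32_range r = { min, max, min <= max };
  if (!r.valid)
    return r;

  if (nonzero == 0)
    {
      /* No bit may be set, so only the value 0 is possible.  */
      r.valid = (min == 0);
      r.min = r.max = 0;
      return r;
    }

  /* Mask of the low bits that are known to be zero.  */
  uint32_t low = (nonzero & (0u - nonzero)) - 1;

  /* No suitably aligned value exists at or above MIN.  */
  if (min > UINT32_MAX - low)
    {
      r.valid = false;
      return r;
    }

  r.min = (min + low) & ~low;	/* Round MIN up to the alignment.  */
  r.max = max & ~low;		/* Round MAX down to the alignment.  */
  r.valid = r.min <= r.max;	/* Empty means range and mask disagree.  */
  return r;
}

/* Return true if adding OFF wraps at both ends of R or at neither,
   mirroring the condition stated in the patch's comment.  */
static bool
offset_can_be_split (struct u32_range r, uint32_t off)
{
  assert (r.valid);
  bool wraps_min = r.min > UINT32_MAX - off;
  bool wraps_max = r.max > UINT32_MAX - off;
  return wraps_min == wraps_max;
}
```

With these helpers, `refine_range (0, UINT32_MAX, 0xfffffffe)` narrows the range to [0, 0xfffffffe], so adding 1 wraps at neither end and the split is allowed, whereas the unrefined full range wraps only at the upper end. `refine_range (1, 1, 0xfffffffe)` is reported as contradictory, since the only value in the range has a bit set that the mask says must be clear.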