From mboxrd@z Thu Jan  1 00:00:00 1970
Return-Path: <adonis0147@gmail.com>
Received: from mail-ed1-x52f.google.com (mail-ed1-x52f.google.com
 [IPv6:2a00:1450:4864:20::52f])
 by sourceware.org (Postfix) with ESMTPS id 07B9F385C337
 for <gcc-help@gcc.gnu.org>; Tue, 28 Jun 2022 15:24:05 +0000 (GMT)
DMARC-Filter: OpenDMARC Filter v1.4.1 sourceware.org 07B9F385C337
Received: by mail-ed1-x52f.google.com with SMTP id z19so18047397edb.11
 for <gcc-help@gcc.gnu.org>; Tue, 28 Jun 2022 08:24:04 -0700 (PDT)
X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed;
 d=1e100.net; s=20210112;
 h=x-gm-message-state:mime-version:references:in-reply-to:from:date
 :message-id:subject:to:cc;
 bh=bQy9q9+kOTpOqhDyztk0BsSHi/H6pFZukDzTWuj+SE4=;
 b=Kugu7nZZ87Q7WEY1Y+C6NK/2SG2DIuzFDjSBPiaNxCTNYD/QdLwPsrUQhYi/BN8puY
 E6uN3QdhXEQHh+6TH+ZKajGry/VP+/yItPKFlnW5kcADpKyr58kK7chylmmuNUw2qti3
 4Esa4pBIDg9fsTr5zRl9BqUk4ftndIpW9Q8QanzKeYdOaD8Ewss4bmfLoocwncv9Hkpz
 oxNSFWwwBkGzWH8wemo3HDVZBqrQq4VhNMCaCan1M3a6sFACwlO6Nz/kTbO2PB5NCCOY
 30YPggWt57TT3ei1TkR0aEzyC9sKohmNcmWB0HSLv4ZyryB4/RACY9YSMfBJ98qSBnSn
 pISg==
X-Gm-Message-State: AJIora+j+YDOCgTs0U2GpVIVj8gtqd7YKzhDsmD4RmN6HtI0X5wMNtQh
 kCW8HY8ejEeZK7s8jbKDC/Ge7lQRT6N96uib3gY=
X-Google-Smtp-Source: AGRyM1sszAFTyZ2rio96Fr4vcGspXkkr0zFt71pUqi6aQJBVqXpPg31Cz21+mhjLJ7UBvSsJkTdVSpxQrBRmIAlm3Uw=
X-Received: by 2002:a05:6402:1219:b0:437:74dd:640d with SMTP id
 c25-20020a056402121900b0043774dd640dmr20483874edw.312.1656429843666; Tue, 28
 Jun 2022 08:24:03 -0700 (PDT)
MIME-Version: 1.0
References: <CAG5qfXii3dyLDsdULLyZSoigMZjMjk7LmOmjwgZSB44cAQzf-g@mail.gmail.com>
 <3eb44329-3b12-896c-14c4-3473d43aed3d@ispras.ru>
In-Reply-To: <3eb44329-3b12-896c-14c4-3473d43aed3d@ispras.ru>
From: Adonis Ling <adonis0147@gmail.com>
Date: Tue, 28 Jun 2022 23:23:52 +0800
Message-ID: <CAG5qfXi-jE1wtgsKBECR88R8c6FhnWBP3Dj07eTv5d6g4fZHYQ@mail.gmail.com>
Subject: Re: Why does different types of array subscript used to iterate
 affect auto vectorization
To: Alexander Monakov <amonakov@ispras.ru>
Cc: gcc-help@gcc.gnu.org
X-Spam-Status: No, score=-0.6 required=5.0 tests=BAYES_00, DKIM_SIGNED,
 DKIM_VALID, DKIM_VALID_AU, DKIM_VALID_EF, FREEMAIL_ENVFROM_END_DIGIT,
 FREEMAIL_FROM, HTML_MESSAGE, RCVD_IN_DNSWL_NONE, SPF_HELO_NONE, SPF_PASS,
 TXREP, T_SCC_BODY_TEXT_LINE autolearn=ham autolearn_force=no version=3.4.6
X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on
 server2.sourceware.org
Content-Type: text/plain; charset="UTF-8"
X-Content-Filtered-By: Mailman/MimeDel 2.1.29
X-BeenThere: gcc-help@gcc.gnu.org
X-Mailman-Version: 2.1.29
Precedence: list
List-Id: Gcc-help mailing list <gcc-help.gcc.gnu.org>
List-Unsubscribe: <https://gcc.gnu.org/mailman/options/gcc-help>,
 <mailto:gcc-help-request@gcc.gnu.org?subject=unsubscribe>
List-Archive: <https://gcc.gnu.org/pipermail/gcc-help/>
List-Post: <mailto:gcc-help@gcc.gnu.org>
List-Help: <mailto:gcc-help-request@gcc.gnu.org?subject=help>
List-Subscribe: <https://gcc.gnu.org/mailman/listinfo/gcc-help>,
 <mailto:gcc-help-request@gcc.gnu.org?subject=subscribe>
X-List-Received-Date: Tue, 28 Jun 2022 15:24:07 -0000

Hi Alexander, thanks for your reply.

On Tue, Jun 28, 2022 at 9:06 PM Alexander Monakov <amonakov@ispras.ru>
wrote:

> On Mon, 27 Jun 2022, Adonis Ling via Gcc-help wrote:
>
> >  Hi all,
> >
> > Recently, I ran into an issue with auto vectorization.
> >
> > Why does uint32_t in the following code prevent the compiler (GCC 12.1 +
> > O3) from applying auto vectorization? See https://godbolt.org/z/a3GfaKEq6.
> >
> > #include <cstdint>
> >
> > // no auto vectorization
> > void test32(uint32_t *array, uint32_t &nread, uint32_t from, uint32_t to) {
> >     for (uint32_t i = from; i < to; i++) {
> >         array[nread++] = i;
> >     }
> > }
>
> Here the main problem is '*array' and 'nread' have the same type, so they
> might overlap. Ideally the compiler would recognize that that cannot happen
> because it would make 'array[nread++] = i' undefined due to unsequenced
> modifications, but GCC is not sufficiently smart (yet). The secondary issue
> is the same as below:
>

I see your point.

I then tried adding __restrict__ to nread, as shown below, but GCC still
does not optimize the loop.

#include <cstdint>

// no auto vectorization
void test32(uint32_t *array, uint32_t & __restrict__ nread, uint32_t from, uint32_t to) {
    for (uint32_t i = from; i < to; i++) {
        array[nread++] = i;
    }
}

However, when I compiled the same code with Clang, the loop was optimized.
See https://godbolt.org/z/eEz9W7o9z .


> > // no auto vectorization
> > void test_another_32(uint32_t *array, uint32_t &nread, uint32_t from,
> > uint32_t to) {
> >     uint32_t index = nread;
> >     for (uint32_t i = from; i < to; i++) {
> >         array[index++] = i;
> >     }
> >     nread = index;
> > }
>
> ... here: the issue is that index is unsigned and shorter than pointer type,
> it can wrap around from 0xffffffff to 0, making the access non-consecutive.
> When you compile for 32-bit x86, this loop is vectorized.
>
> Alexander
>

Clang also optimizes this function. See https://godbolt.org/z/eEz9W7o9z .
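
To make the wraparound point concrete, here is a minimal sketch. The names
next_index32 and test_wide_index are hypothetical and do not appear earlier
in the thread. The first function shows that a 32-bit unsigned index steps
from 0xffffffff to 0. The second is a variant of test_another_32 that keeps
the running index in a pointer-width std::size_t, so on a 64-bit target it
cannot wrap at 2^32. I assume, but have not verified, that this removes the
obstacle described above; whether GCC actually vectorizes it should be
checked on godbolt.

```cpp
#include <cstddef>
#include <cstdint>

// Returns the index that follows `index` when the counter is 32 bits wide.
// Unsigned arithmetic wraps modulo 2^32, so 0xffffffff is followed by 0.
uint32_t next_index32(uint32_t index) {
    return index + 1;
}

// Hypothetical variant of test_another_32 (not from the thread): the running
// index is a pointer-width std::size_t instead of uint32_t.
void test_wide_index(uint32_t *array, uint32_t &nread, uint32_t from, uint32_t to) {
    std::size_t index = nread;            // widen once, before the loop
    for (uint32_t i = from; i < to; i++) {
        array[index++] = i;               // consecutive stores, no 32-bit wrap
    }
    nread = static_cast<uint32_t>(index); // narrow back for the caller
}
```

Note that this variant matches test_another_32 only as long as the 32-bit
index would not have wrapped. In the wrapping case the original restarts
at array[0], while this version keeps writing past array[0xffffffff].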

-- 
Best regards,
Adonis