From: Richard Biener
Date: Mon, 6 Sep 2021 13:05:06 +0200
Subject: Re: [PATCH] Enable auto-vectorization at O2 with very-cheap cost model.
References: <20210906084614.7974-1-hongtao.liu@intel.com> <20210906094127.GQ920497@tucnak>
In-Reply-To: <20210906094127.GQ920497@tucnak>
To: Jakub Jelinek
Cc: liuhongt, GCC Patches

On Mon, Sep 6, 2021 at 11:41 AM Jakub Jelinek wrote:
>
> On Mon, Sep 06, 2021 at 11:18:47AM +0200, Richard Biener wrote:
> > On Mon, Sep 6, 2021 at 10:47 AM liuhongt via Gcc-patches
> > wrote:
> > >
> > > Hi:
> > >   As discussed in [1], most of (currently unopposed) targets want
> > > auto-vectorization at O2, and IMHO now would be a good time to enable O2
> > > vectorization for GCC trunk, so it would leave enough time to expose
> > > related issues and fix them.
> > >
> > >   Bootstrapped and regtested on x86_64-linux-gnu{-m32,}
> > >   Ok for trunk?
> >
> > It changes the cost model used when the user specifies
> > -O2 -ftree-vectorize which used 'cheap' before but now sticks to
> > 'very-cheap'.  I guess adjusting the cost model in process_options
> > might be possible when any(?) of the vectorizer flags were set
> > explicitly?
>
> process_options would mean it affects only the command line and not
> __attribute__((optimize ("O2", "ftree-vectorize")))
> etc.
> So, shouldn't it be instead done in default_options_optimization, somewhere
> among the
>   if (openacc_mode)
>     SET_OPTION_IF_UNSET (opts, opts_set, flag_ipa_pta, true);
>
>   /* Track fields in field-sensitive alias analysis.  */
>   if (opt2)
>     SET_OPTION_IF_UNSET (opts, opts_set, param_max_fields_for_field_sensitive,
>                          100);
>
>   if (opts->x_optimize_size)
>     /* We want to crossjump as much as possible.  */
>     SET_OPTION_IF_UNSET (opts, opts_set, param_min_crossjump_insns, 1);
>
>   /* Restrict the amount of work combine does at -Og while retaining
>      most of its useful transforms.  */
>   if (opts->x_optimize_debug)
>     SET_OPTION_IF_UNSET (opts, opts_set, param_max_combine_insns, 2);
> in there?
> Like:
>   /* Use -fvect-cost-model=cheap instead of -fvect-cost-model=very-cheap
>      by default with explicit -ftree-{loop,slp}-vectorize.  */
>   if (opts->x_optimize == 2
>       && (opts_set->x_ftree_loop_vectorize
>           || opts_set->x_ftree_slp_vectorize))
>     SET_OPTION_IF_UNSET (opts, opts_set, fvect_cost_model_,
>                          VECT_COST_MODEL_CHEAP);
> Though, unsure if that will work with -O2 -ftree-vectorize which is an
> option without flag with EnabledBy on the other two options.

One needs to check that, yes.

>
> Also, is:
> +    { OPT_LEVELS_2_PLUS, OPT_ftree_loop_vectorize, NULL, 1 },
> +    { OPT_LEVELS_2_PLUS, OPT_ftree_slp_vectorize, NULL, 1 },
> what we really want, isn't that instead:
> +    { OPT_LEVELS_2_PLUS_SPEED_ONLY, OPT_ftree_loop_vectorize, NULL, 1 },
> +    { OPT_LEVELS_2_PLUS_SPEED_ONLY, OPT_ftree_slp_vectorize, NULL, 1 },
> ?  I mean, for -Os vectorization even in very-cheap model I'd think it
> usually enlarges code size, and for -Og it is seriously harmful for
> debugging experience, especially when DWARF <= 5 doesn't have anything that
> would help debugging vectorized loops.

I guess technically SLP vectorization would be fine for -Os, at least
on archs with fixed-size instructions.  I'm unsure whether the
vectorizer cost model is good at tracking size though.

-Og is optimize == 1.

Now the issue is what we'd do for -O2 -ftree-slp-vectorize: do we want
the very-cheap model for loop vectorization but cheap for SLP vectorization?
I think the cost model differences (besides disabling) only make a difference
for loop vectorization, so it should probably test
opts_set->x_ftree_loop_vectorize only, and thus explicit/implicit enabling
of loop vectorization should make the difference.

(but yes, double-check how -ftree-vectorize arrives here)

Richard.

>
>         Jakub
>
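
For illustration only, the selection rule discussed above (keep an explicit
-fvect-cost-model= choice, otherwise fall back to 'cheap' at -O2 when loop
vectorization was requested explicitly) could be modelled roughly as below.
This is a standalone sketch, not GCC code: struct opt_state, enum cost_model
and pick_cost_model are hypothetical stand-ins for opts/opts_set and
SET_OPTION_IF_UNSET, and how -ftree-vectorize would set loop_vect_explicit is
exactly the part that still needs checking.

```c
#include <stdbool.h>

/* Hypothetical stand-ins for illustration; these are not GCC's types.  */
enum cost_model { COST_CHEAP, COST_VERY_CHEAP };

struct opt_state
{
  int optimize;               /* Numeric level, e.g. 2 for -O2.  */
  bool loop_vect_explicit;    /* Assumed: user explicitly enabled loop
                                 vectorization.  */
  bool cost_model_explicit;   /* User passed -fvect-cost-model=...  */
  enum cost_model cost_model; /* Current value (level default or the
                                 user's explicit choice).  */
};

/* Return the cost model to use.  An explicit user choice always wins,
   mirroring the "if unset" semantics; otherwise -O2 plus explicitly
   requested loop vectorization selects the 'cheap' model.  */
static enum cost_model
pick_cost_model (const struct opt_state *s)
{
  if (s->cost_model_explicit)
    return s->cost_model;
  if (s->optimize == 2 && s->loop_vect_explicit)
    return COST_CHEAP;
  return s->cost_model;
}
```

With a default of COST_VERY_CHEAP, plain -O2 keeps very-cheap, -O2 with
explicitly requested loop vectorization yields cheap, and an explicit
cost-model option is left untouched.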