From mboxrd@z Thu Jan 1 00:00:00 1970
MIME-Version: 1.0
References: <20231128075212.3526692-1-hongtao.liu@intel.com>
In-Reply-To: <20231128075212.3526692-1-hongtao.liu@intel.com>
From: Richard Biener
Date: Wed, 29 Nov 2023 08:47:01 +0100
Subject: Re: [PATCH] Take register pressure into account for vec_construct when the components are not loaded from memory.
To: liuhongt
Cc: gcc-patches@gcc.gnu.org, crazylht@gmail.com, hjl.tools@gmail.com
Content-Type: text/plain; charset="UTF-8"

On Tue, Nov 28, 2023 at 8:54 AM liuhongt wrote:
>
> For vec_contruct, the components must be live at the same time if
> they're not loaded from memory, when the number of those components
> exceeds available registers, spill happens. Try to account that with a
> rough estimation.
> ??? Ideally, we should have an overall estimation of register pressure
> if we know the live range of all variables.
>
> The patch can avoid regressions due to .i.e. vec_contruct with 32 char.
> Bootstrapped and regtested on x86_64-pc-linux-gnu{-m32,}.
>
> Ok for trunk?

Hmm, I would suggest you put reg_needed into the class and accumulate it
over all vec_constructs; with your patch you pessimize a single v32qi
relative to two separate v16qi, for example.  Also, the whole block is
currently gated with INTEGRAL_TYPE_P, but register pressure would also be
a concern for floating-point vectors.  With the count accumulated,
finish_cost would then apply the adjustment.

'target_avail_regs' is for GENERAL_REGS; does that include APX regs?
I don't see anything similar for FP regs, but I guess the target should
know, or maybe there's already a #regs-in-regclass query.

That said, this kind of adjustment looks somewhat appealing.

Richard.

> gcc/ChangeLog:
>
>         * config/i386/i386.cc (ix86_vector_costs::add_stmt_cost): Take
>         register pressure into account for vec_construct when the
>         components are not loaded from memory.
> ---
>  gcc/config/i386/i386.cc | 22 +++++++++++++++++++++-
>  1 file changed, 21 insertions(+), 1 deletion(-)
>
> diff --git a/gcc/config/i386/i386.cc b/gcc/config/i386/i386.cc
> index 683ac643bc8..f8417555930 100644
> --- a/gcc/config/i386/i386.cc
> +++ b/gcc/config/i386/i386.cc
> @@ -24706,6 +24706,7 @@ ix86_vector_costs::add_stmt_cost (int count, vect_cost_for_stmt kind,
>        stmt_cost = ix86_builtin_vectorization_cost (kind, vectype, misalign);
>        unsigned i;
>        tree op;
> +      unsigned reg_needed = 0;
>        FOR_EACH_VEC_ELT (SLP_TREE_SCALAR_OPS (node), i, op)
>          if (TREE_CODE (op) == SSA_NAME)
>            TREE_VISITED (op) = 0;
> @@ -24737,11 +24738,30 @@ ix86_vector_costs::add_stmt_cost (int count, vect_cost_for_stmt kind,
>                    && (gimple_assign_rhs_code (def) != BIT_FIELD_REF
>                        || !VECTOR_TYPE_P (TREE_TYPE
>                                  (TREE_OPERAND (gimple_assign_rhs1 (def), 0))))))
> -            stmt_cost += ix86_cost->sse_to_integer;
> +            {
> +              stmt_cost += ix86_cost->sse_to_integer;
> +              reg_needed++;
> +            }
>          }
>        FOR_EACH_VEC_ELT (SLP_TREE_SCALAR_OPS (node), i, op)
>          if (TREE_CODE (op) == SSA_NAME)
>            TREE_VISITED (op) = 0;
> +
> +      /* For vec_contruct, the components must be live at the same time if
> +         they're not loaded from memory, when the number of those components
> +         exceeds available registers, spill happens. Try to account that with a
> +         rough estimation. Currently only handle integral modes since scalar fp
> +         shares sse_regs with vectors.
> +         ??? Ideally, we should have an overall estimation of register pressure
> +         if we know the live range of all variables.  */
> +      if (!fp && kind == vec_construct
> +          && reg_needed > target_avail_regs)
> +        {
> +          unsigned spill_cost = ix86_builtin_vectorization_cost (scalar_store,
> +                                                                 vectype,
> +                                                                 misalign);
> +          stmt_cost += spill_cost * (reg_needed - target_avail_regs);
> +        }
>      }
>    if (stmt_cost == -1)
>      stmt_cost = ix86_builtin_vectorization_cost (kind, vectype, misalign);
> --
> 2.31.1
>
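
To make the v32qi versus 2x v16qi point concrete, here is a minimal standalone sketch. It is not GCC code: kAvailRegs, kSpillCost, per_stmt_penalty, cost_model_sketch and accumulated_penalty are hypothetical names and values chosen only for illustration, and treating every accumulated component as live at once is itself a rough assumption. It compares the per-statement check from the patch with a count accumulated in the class and adjusted once at the end, in the spirit of finish_cost.

```cpp
#include <vector>

// Hypothetical stand-in for the number of available scalar registers;
// this is NOT GCC's target_avail_regs.
constexpr unsigned kAvailRegs = 16;
// Hypothetical cost charged for each component that would be spilled.
constexpr unsigned kSpillCost = 2;

// Per-statement model, as in the patch: each vec_construct is compared
// against the register budget on its own.
unsigned per_stmt_penalty(const std::vector<unsigned>& components) {
  unsigned penalty = 0;
  for (unsigned n : components)
    if (n > kAvailRegs)
      penalty += kSpillCost * (n - kAvailRegs);
  return penalty;
}

// Accumulating model: the count lives in the cost object and a single
// adjustment is applied once every statement has been seen.
struct cost_model_sketch {
  unsigned reg_needed = 0;

  // Record the non-memory components of one vec_construct.
  void add_vec_construct(unsigned components) { reg_needed += components; }

  // One adjustment over the accumulated total.
  unsigned finish_cost() const {
    return reg_needed > kAvailRegs
               ? kSpillCost * (reg_needed - kAvailRegs)
               : 0;
  }
};

// Convenience wrapper: feed every vec_construct in, then finish.
unsigned accumulated_penalty(const std::vector<unsigned>& components) {
  cost_model_sketch model;
  for (unsigned n : components)
    model.add_vec_construct(n);
  return model.finish_cost();
}
```

With these made-up numbers, the per-statement model charges a single 32-component construct but charges nothing for two 16-component constructs, while the accumulating model charges both cases the same amount.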