From: "Hong X"
To: Hongtao Liu
Cc: gcc-help@gcc.gnu.org
Date: Mon, 9 Mar 2020 18:01:39 +0000
Subject: RE: Initializing a vector to zero leads to less efficient assemblies than manually assigning a vector to zero?

-----Hongtao Liu wrote: -----
>To: Hong X
>From: Hongtao Liu
>Date: 03/08/2020 22:54
>Cc: gcc-help@gcc.gnu.org
>Subject: [EXTERNAL] Re: Initializing a vector to zero leads to less
>efficient assemblies than manually assigning a vector to zero?
>
>On Sat, Mar 7, 2020 at 5:20 AM Hong X wrote:
>>
>> Hi all,
>>
>> I tried to compile the following two code snippets with
>> "--std=c++14 -mavx2 -O3" options:
>>
>>     double tmp_values[4] = {0};
>>
>> and
>>
>>     double tmp_values[4];
>>
>>     for (auto i = 0; i < 4; ++i) {
>>         tmp_values[i] = 0.0;
>>     }
>>
>> The first code snippet leads to
>>
>>     vmovaps XMMWORD PTR [rsp], xmm0
>>     vmovaps XMMWORD PTR [rsp+16], xmm0
>>
>> But the second leads to only
>>
>>     vmovapd YMMWORD PTR [rsp], ymm0
>>
>> which is less efficient than the previous one. Am I missing
>> something?
>>
>Assume you're working on Skylake.
>The latency and throughput of vmovaps/vmovapd are:
>
>                    | lat     | throughput  | uops | port  |
>VMOVAPS (XMM, M128) | [≤4;≤7] | 0.50 / 0.50 | 1    | 1*p23 |
>VMOVAPS (YMM, M256) | [≤5;≤8] | 0.50 / 0.50 | 1    | 1*p23 |
>
>Refer to https://uops.info/table.html
>So the latter seems better.

Oops, I said it the other way around. I meant that the second is *more* (not *less*, as in my original post) efficient than the first, even though the two are functionally equivalent, while the first is the form an average C++ programmer is more likely to write. This looks odd to me.

Thanks,
Hong
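
For anyone who wants to reproduce the comparison, the two snippets can be wrapped in a small translation unit along these lines. This is only a sketch: the function names and the `sum4` sink are hypothetical (they are not from the original post) and exist only so that the array has to be stored to memory. The exact instructions emitted depend on the GCC version and the surrounding code, so the output may not match the listings quoted above.

```cpp
// Hypothetical sink, not from the original post: taking the array by pointer
// in a non-inlined function forces the compiler to materialize it on the stack.
// __attribute__((noinline)) is a GCC/Clang extension.
__attribute__((noinline)) double sum4(const double* p) {
    return p[0] + p[1] + p[2] + p[3];
}

// Variant 1: zero the array with an aggregate initializer.
double init_with_braces() {
    double tmp_values[4] = {0};
    return sum4(tmp_values);
}

// Variant 2: zero the array with an explicit loop.
double init_with_loop() {
    double tmp_values[4];
    for (auto i = 0; i < 4; ++i) {
        tmp_values[i] = 0.0;
    }
    return sum4(tmp_values);
}
```

Compiling with `g++ -std=c++14 -mavx2 -O3 -S repro.cpp` (the file name is arbitrary) writes the assembly to `repro.s`, where the stores generated for each function can be compared side by side.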