From: Edgar Mobile
To: "gcc-help@gcc.gnu.org"
Subject: Can this function be auto vectorized with NEON?
Date: Fri, 22 Jul 2022 07:51:03 +0000

Greetings,

I am currently trying to get this matrix-vector multiplication function auto-vectorized to AArch64 (ARMv8) NEON instructions with GCC 9.3.0, without success:

https://github.com/g-truc/glm/blob/b3f87720261d623986f164b2a7f6a0a938430271/glm/detail/type_mat4x4.inl#L536

I also tried replacing it with the explicit version that is commented out there:

    return typename mat<4, 4, T, Q>::col_type(
        m[0][0] * v[0] + m[1][0] * v[1] + m[2][0] * v[2] + m[3][0] * v[3],
        m[0][1] * v[0] + m[1][1] * v[1] + m[2][1] * v[2] + m[3][1] * v[3],
        m[0][2] * v[0] + m[1][2] * v[1] + m[2][2] * v[2] + m[3][2] * v[3],
        m[0][3] * v[0] + m[1][3] * v[1] + m[2][3] * v[2] + m[3][3] * v[3]);

I tried -O3 and -ftree-vectorize, but g++ only produces the usual fmadd etc. instructions, no NEON.

Can someone tell me which options g++ needs to even consider NEON?

Regards
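For anyone who wants to reproduce the question without pulling in GLM, here is a minimal standalone sketch. The names `Vec4`, `Mat4`, `mul_rows` and `mul_cols` are hypothetical stand-ins, not GLM's API; the only thing taken from the message is the column-major `m[column][row]` indexing of the explicit expression. `mul_rows` mirrors that expression, while `mul_cols` computes the same sum column by column, so each inner loop is four independent lanes of multiply-add. Whether GCC 9.3.0 emits NEON for either form is not verified here.

```cpp
// Hypothetical stand-ins for glm::vec4 / glm::mat4 (column-major, like GLM).
struct Vec4 { float e[4]; };
struct Mat4 { Vec4 col[4]; };  // col[j].e[i] corresponds to m[j][i]

// Row-by-row form: the same arithmetic as the explicit expression in the
// message. Each result element is a dot product across four columns.
constexpr Vec4 mul_rows(const Mat4& m, const Vec4& v) {
    return Vec4{{
        m.col[0].e[0] * v.e[0] + m.col[1].e[0] * v.e[1] + m.col[2].e[0] * v.e[2] + m.col[3].e[0] * v.e[3],
        m.col[0].e[1] * v.e[0] + m.col[1].e[1] * v.e[1] + m.col[2].e[1] * v.e[2] + m.col[3].e[1] * v.e[3],
        m.col[0].e[2] * v.e[0] + m.col[1].e[2] * v.e[1] + m.col[2].e[2] * v.e[2] + m.col[3].e[2] * v.e[3],
        m.col[0].e[3] * v.e[0] + m.col[1].e[3] * v.e[1] + m.col[2].e[3] * v.e[2] + m.col[3].e[3] * v.e[3]}};
}

// Column-by-column form: result = sum over j of col[j] * v[j].
// The inner loop touches four contiguous floats and has no dependency
// between lanes, which is the shape a SIMD multiply-accumulate expects.
constexpr Vec4 mul_cols(const Mat4& m, const Vec4& v) {
    Vec4 r{{0.0f, 0.0f, 0.0f, 0.0f}};
    for (int j = 0; j < 4; ++j) {
        for (int i = 0; i < 4; ++i) {
            r.e[i] += m.col[j].e[i] * v.e[j];
        }
    }
    return r;
}
```

To see what the vectorizer decides, GCC's `-fopt-info-vec` and `-fopt-info-vec-missed` options can be added next to `-O3`; they report which loops were vectorized and which were not. Looking for `fmul`/`fmla` on `v` registers with a `.4s` arrangement in the assembly output (`-S`) is another way to tell NEON code from scalar `fmadd`.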