Subject: Re: [PATCH 1/1] sparc: support for -mmisalign in the SPARC M8
From: Qing Zhao
Date: Thu, 03 Aug 2017 18:49:00 -0000
Cc: gcc-patches@gcc.gnu.org
In-Reply-To: <20170803.094054.1765286132092334063.davem@davemloft.net>
To: David Miller

> On Aug 3, 2017, at 11:40 AM, David Miller wrote:
>
> From: Qing Zhao
> Date: Thu, 3 Aug 2017 10:37:15 -0500
>
>> All the special handling of STRICT_ALIGNMENT or
>> SLOW_UNALIGNED_ACCESS in this code follows the same logic:
>>
>> If a memory access is known at compile time to be misaligned, and
>> the target does NOT support fast unaligned memory access, the
>> compiler tries to make the access aligned. Otherwise, if the target
>> does support fast unaligned access, the compiler leaves the
>> known-misaligned access as is, and the hardware support kicks in at
>> run time for these unaligned accesses.
>>
>> This behavior is consistent with the high-level definition of
>> STRICT_ALIGNMENT.
>
> That's exactly the problem.
>
> What you want with this M8 feature is simply to let the compiler know
> that if it is completely impossible to make some memory object
> aligned, then the cpu can handle this with special instructions.
>
> You still want the compiler to make the effort to align data when it
> can, because the accesses will be faster than if it used the
> unaligned loads and stores.

I don't think the above is true.

First, a compile-time-known misaligned memory access can always be
emulated with aligned memory accesses (byte-sized loads/stores).
Then there will be no compile-time-known misaligned memory accesses
left for the special misaligned load/store instructions.

Second, the compiler's effort to turn compile-time-known unaligned
accesses into aligned ones always carries overhead (adding extra
padding, or splitting an unaligned multi-byte access into single-byte
loads/stores); that overhead may well be larger than the overhead of
the special misaligned load/store instructions themselves.

To decide which is better (software emulation or the hardware
misaligned load/store instructions), experiments are needed to
measure the performance impact.

This set of changes provides a way to use the misaligned load/store
instructions to implement compile-time-known unaligned memory
accesses; -mno-misalign can easily disable that behavior if our
performance data shows that the misaligned load/store instructions
are slower than the current software emulation.

Qing

> This is incredibly important for on-stack objects.