* GCC 4.8.0 Status Report (2012-10-29), Stage 1 to end soon
From: Jakub Jelinek @ 2012-10-29 18:08 UTC
To: gcc; +Cc: gcc-patches

Status
======

I'd like to close the stage 1 phase of GCC 4.8 development
on Monday, November 5th. If you have still patches for new features you'd
like to see in GCC 4.8, please post them for review soon. Patches
posted before the freeze, but reviewed shortly after the freeze, may
still go in, further changes should be just bugfixes and documentation
fixes.

Quality Data
============

Priority          #   Change from Last Report
--------        ---   -----------------------
P1               23   + 23
P2               77   +  8
P3               85   + 84
--------        ---   -----------------------
Total           185   +115

Previous Report
===============

http://gcc.gnu.org/ml/gcc/2012-03/msg00011.html

The next report will be sent by me again, announcing end of stage 1.
* Re: GCC 4.8.0 Status Report (2012-10-29), Stage 1 to end soon 2012-10-29 18:08 GCC 4.8.0 Status Report (2012-10-29), Stage 1 to end soon Jakub Jelinek @ 2012-10-29 18:13 ` David Miller 2012-10-29 18:32 ` Eric Botcazou 2012-10-30 8:22 ` Jakub Jelinek 2012-10-29 22:14 ` Magnus Granberg ` (10 subsequent siblings) 11 siblings, 2 replies; 59+ messages in thread From: David Miller @ 2012-10-29 18:13 UTC (permalink / raw) To: jakub; +Cc: gcc, gcc-patches From: Jakub Jelinek <jakub@redhat.com> Date: Mon, 29 Oct 2012 18:56:42 +0100 > I'd like to close the stage 1 phase of GCC 4.8 development > on Monday, November 5th. If you have still patches for new features you'd > like to see in GCC 4.8, please post them for review soon. Patches > posted before the freeze, but reviewed shortly after the freeze, may > still go in, further changes should be just bugfixes and documentation > fixes. I'd like to get the Sparc cbcond stuff in (3 revisions posted) which is waiting for Eric B. to do some Solaris specific work. I'd also like to enable LRA for at least 32-bit sparc, even if I can't find the time to work on auditing 64-bit completely. ^ permalink raw reply [flat|nested] 59+ messages in thread
* Re: GCC 4.8.0 Status Report (2012-10-29), Stage 1 to end soon 2012-10-29 18:13 ` David Miller @ 2012-10-29 18:32 ` Eric Botcazou 2012-10-29 18:42 ` David Miller 2012-10-30 8:22 ` Jakub Jelinek 1 sibling, 1 reply; 59+ messages in thread From: Eric Botcazou @ 2012-10-29 18:32 UTC (permalink / raw) To: David Miller; +Cc: gcc, jakub, gcc-patches > I'd like to get the Sparc cbcond stuff in (3 revisions posted) which > is waiting for Eric B. to do some Solaris specific work. > > I'd also like to enable LRA for at least 32-bit sparc, even if I can't > find the time to work on auditing 64-bit completely. End of stage #1 isn't a hard limit for architecture-specific patches, so we need not make a decision about LRA immediately. I don't think we want to half enable it though, so it's all or nothing. -- Eric Botcazou ^ permalink raw reply [flat|nested] 59+ messages in thread
* Re: GCC 4.8.0 Status Report (2012-10-29), Stage 1 to end soon 2012-10-29 18:32 ` Eric Botcazou @ 2012-10-29 18:42 ` David Miller 0 siblings, 0 replies; 59+ messages in thread From: David Miller @ 2012-10-29 18:42 UTC (permalink / raw) To: ebotcazou; +Cc: gcc, jakub, gcc-patches From: Eric Botcazou <ebotcazou@adacore.com> Date: Mon, 29 Oct 2012 20:25:15 +0100 >> I'd like to get the Sparc cbcond stuff in (3 revisions posted) which >> is waiting for Eric B. to do some Solaris specific work. >> >> I'd also like to enable LRA for at least 32-bit sparc, even if I can't >> find the time to work on auditing 64-bit completely. > > End of stage #1 isn't a hard limit for architecture-specific patches, so we > need not make a decision about LRA immediately. I don't think we want to half > enable it though, so it's all or nothing. Upon further consideration, agreed. I'll only turn this on if I can get the whole backend working. FWIW, I think we should consider delaying stage1 for another reason. A large number of North American developers are about to be hit by a major natural disaster, and may be without power for weeks. ^ permalink raw reply [flat|nested] 59+ messages in thread
* Re: GCC 4.8.0 Status Report (2012-10-29), Stage 1 to end soon
From: Jakub Jelinek @ 2012-10-30 8:22 UTC
To: David Miller; +Cc: gcc, gcc-patches

On Mon, Oct 29, 2012 at 02:07:55PM -0400, David Miller wrote:
> > I'd like to close the stage 1 phase of GCC 4.8 development on Monday, November 5th. If you have still patches for new features you'd like to see in GCC 4.8, please post them for review soon. Patches posted before the freeze, but reviewed shortly after the freeze, may still go in, further changes should be just bugfixes and documentation fixes.
>
> I'd like to get the Sparc cbcond stuff in (3 revisions posted) which is waiting for Eric B. to do some Solaris specific work.

That has been posted in stage 1, so it is certainly ok to commit it even during early stage 3. And exceptions are always possible on a case-by-case basis. This hasn't changed in the last few years. By "reviewed shortly after the freeze" I just mean that e.g. large intrusive patches posted now but reviewed in late December would already be too late.

As for postponing the end of stage 1 by a few weeks because of the storm, I'm afraid that if we want to keep roughly timely releases we don't have that luxury. If you look at http://gcc.gnu.org/develop.html, stage 1 already ended around the end of October for 4.6 and 4.7; for 4.5 it was a month earlier and for 4.4 even two months earlier. The 4.7 bugfixing went IMHO smoothly, but we certainly have to expect lots of bugfixing.

> I'd also like to enable LRA for at least 32-bit sparc, even if I can't find the time to work on auditing 64-bit completely.

I agree with Eric that it is better to enable it for the whole target together, rather than based on some options.
Enabling LRA in early stage 3 for some targets should be ok, if it doesn't require too large and intrusive changes to the generic code that could destabilize other targets. Jakub ^ permalink raw reply [flat|nested] 59+ messages in thread
* Re: GCC 4.8.0 Status Report (2012-10-29), Stage 1 to end soon 2012-10-29 18:08 GCC 4.8.0 Status Report (2012-10-29), Stage 1 to end soon Jakub Jelinek 2012-10-29 18:13 ` David Miller @ 2012-10-29 22:14 ` Magnus Granberg 2012-10-30 7:01 ` Gopalasubramanian, Ganesh ` (9 subsequent siblings) 11 siblings, 0 replies; 59+ messages in thread From: Magnus Granberg @ 2012-10-29 22:14 UTC (permalink / raw) To: gcc-patches måndag 29 oktober 2012 18.56.42 skrev Jakub Jelinek: > Status > ====== > > I'd like to close the stage 1 phase of GCC 4.8 development > on Monday, November 5th. If you have still patches for new features you'd > like to see in GCC 4.8, please post them for review soon. Patches > posted before the freeze, but reviewed shortly after the freeze, may > still go in, further changes should be just bugfixes and documentation > fixes. > I want to get the new configure --enable-espf options included. The patches have been posted some time ago. Gentoo Hardened Project Magnus Granberg ^ permalink raw reply [flat|nested] 59+ messages in thread
* RE: GCC 4.8.0 Status Report (2012-10-29), Stage 1 to end soon
From: Gopalasubramanian, Ganesh @ 2012-10-30 7:01 UTC
To: Jakub Jelinek; +Cc: gcc-patches, gcc, Uros Bizjak (ubizjak@gmail.com)

Hi Jakub,

We are working on the following.

1. bdver3 enablement. Review completed. Changes to be incorporated and checked in. http://gcc.gnu.org/ml/gcc-patches/2012-10/msg01131.html
2. btver2 basic enablement is done (http://gcc.gnu.org/ml/gcc-patches/2012-07/msg01018.html). Scheduler descriptions are being updated.
This is architecture specific and we do not consider it to be stage 1 material.

Regards
Ganesh
* Re: GCC 4.8.0 Status Report (2012-10-29), Stage 1 to end soon 2012-10-29 18:08 GCC 4.8.0 Status Report (2012-10-29), Stage 1 to end soon Jakub Jelinek ` (2 preceding siblings ...) 2012-10-30 7:01 ` Gopalasubramanian, Ganesh @ 2012-10-30 13:47 ` Diego Novillo 2012-10-30 21:31 ` Lawrence Crowl 2012-10-30 21:07 ` Kenneth Zadeck ` (7 subsequent siblings) 11 siblings, 1 reply; 59+ messages in thread From: Diego Novillo @ 2012-10-30 13:47 UTC (permalink / raw) To: Jakub Jelinek; +Cc: gcc, gcc-patches On Mon, Oct 29, 2012 at 1:56 PM, Jakub Jelinek <jakub@redhat.com> wrote: > Status > ====== > > I'd like to close the stage 1 phase of GCC 4.8 development > on Monday, November 5th. If you have still patches for new features you'd > like to see in GCC 4.8, please post them for review soon. Patches > posted before the freeze, but reviewed shortly after the freeze, may > still go in, further changes should be just bugfixes and documentation > fixes. I will be committing the VEC overhaul soon. With any luck this week, but PCH and gengtype are giving me a lot of grief. Diego. ^ permalink raw reply [flat|nested] 59+ messages in thread
* Re: GCC 4.8.0 Status Report (2012-10-29), Stage 1 to end soon 2012-10-30 13:47 ` Diego Novillo @ 2012-10-30 21:31 ` Lawrence Crowl 0 siblings, 0 replies; 59+ messages in thread From: Lawrence Crowl @ 2012-10-30 21:31 UTC (permalink / raw) To: Diego Novillo; +Cc: Jakub Jelinek, gcc, gcc-patches On 10/30/12, Diego Novillo <dnovillo@google.com> wrote: > On Mon, Oct 29, 2012 at 1:56 PM, Jakub Jelinek <jakub@redhat.com> wrote: >> Status >> ====== >> >> I'd like to close the stage 1 phase of GCC 4.8 development >> on Monday, November 5th. If you have still patches for new features >> you'd >> like to see in GCC 4.8, please post them for review soon. Patches >> posted before the freeze, but reviewed shortly after the freeze, may >> still go in, further changes should be just bugfixes and documentation >> fixes. > > I will be committing the VEC overhaul soon. With any luck this week, > but PCH and gengtype are giving me a lot of grief. I have three remaining bitmap patches and the recently approved is_a/symtab/cgraph patch. However, Alexandre Oliva <aoliva@redhat.com> has a patch for bootstrap failure that is biting me. I can either incorporate it into my patches or wait for his patch and then submit. Comments? -- Lawrence Crowl ^ permalink raw reply [flat|nested] 59+ messages in thread
* Re: GCC 4.8.0 Status Report (2012-10-29), Stage 1 to end soon
From: Kenneth Zadeck @ 2012-10-30 21:07 UTC
To: Jakub Jelinek; +Cc: gcc, gcc-patches

jakub,

i am hoping to get the rest of my wide integer conversion posted by nov 5. I am under some adverse conditions here: hurricane sandy hit here pretty badly. my house is hooked up to a small generator, and no one has any power for miles around.

So far richi has promised to review them. he has sent some comments, but so far no reviews. Some time after i get the first round of them posted, i will do a second round that incorporates everyone's comments.

But i would like a little slack here if possible. While this work is a show stopper for my private port, the patches address serious problems for many of the public ports, especially ones that have very flexible vector units. I believe that there is a significant set of latent problems in the existing ports that use TImode that these patches will fix.

However, i will do everything in my power to get the first round of the patches posted by the nov 5 deadline.

kenny

On 10/29/2012 01:56 PM, Jakub Jelinek wrote:
> Status
> ======
>
> I'd like to close the stage 1 phase of GCC 4.8 development on Monday, November 5th. If you have still patches for new features you'd like to see in GCC 4.8, please post them for review soon. Patches posted before the freeze, but reviewed shortly after the freeze, may still go in, further changes should be just bugfixes and documentation fixes.
> > > Quality Data > ============ > > Priority # Change from Last Report > -------- --- ----------------------- > P1 23 + 23 > P2 77 + 8 > P3 85 + 84 > -------- --- ----------------------- > Total 185 +115 > > > Previous Report > =============== > > http://gcc.gnu.org/ml/gcc/2012-03/msg00011.html > > The next report will be sent by me again, announcing end of stage 1. ^ permalink raw reply [flat|nested] 59+ messages in thread
* Re: GCC 4.8.0 Status Report (2012-10-29), Stage 1 to end soon
From: Richard Biener @ 2012-10-31 10:00 UTC
To: Kenneth Zadeck; +Cc: Jakub Jelinek, gcc, gcc-patches

On Tue, Oct 30, 2012 at 10:05 PM, Kenneth Zadeck <zadeck@naturalbridge.com> wrote:
> jakub,
>
> i am hoping to get the rest of my wide integer conversion posted by nov 5. I am under some adverse conditions here: hurricane sandy hit here pretty badly. my house is hooked up to a small generator, and no one has any power for miles around.
>
> So far richi has promised to review them. he has sent some comments, but so far no reviews. Some time after i get the first round of them posted, i will do a second round that incorporates everyone's comments.
>
> But i would like a little slack here if possible. While this work is a show stopper for my private port, the patches address serious problems for many of the public ports, especially ones that have very flexible vector units. I believe that there is a significant set of latent problems in the existing ports that use TImode that these patches will fix.
>
> However, i will do everything in my power to get the first round of the patches posted by the nov 5 deadline.

I suppose you are not going to merge your private port for 4.8 and thus the wide-int changes are not a show-stopper for you.

That said, I considered it appropriate to defer the main conversion to the next stage1. There is no advantage in disrupting the tree more at this stage.

Thanks,
Richard.

> kenny
>
> On 10/29/2012 01:56 PM, Jakub Jelinek wrote:
>> Status
>> ======
>>
>> I'd like to close the stage 1 phase of GCC 4.8 development on Monday, November 5th.
* Re: GCC 4.8.0 Status Report (2012-10-29), Stage 1 to end soon 2012-10-31 10:00 ` Richard Biener @ 2012-10-31 10:02 ` Richard Sandiford 2012-10-31 10:13 ` Richard Biener ` (2 more replies) 2012-10-31 18:34 ` GCC 4.8.0 Status Report (2012-10-29), Stage 1 to end soon Andrew Haley 1 sibling, 3 replies; 59+ messages in thread From: Richard Sandiford @ 2012-10-31 10:02 UTC (permalink / raw) To: Richard Biener; +Cc: Kenneth Zadeck, Jakub Jelinek, gcc, gcc-patches Richard Biener <richard.guenther@gmail.com> writes: > On Tue, Oct 30, 2012 at 10:05 PM, Kenneth Zadeck > <zadeck@naturalbridge.com> wrote: >> jakub, >> >> i am hoping to get the rest of my wide integer conversion posted by nov 5. >> I am under some adverse conditions here: hurricane sandy hit her pretty >> badly. my house is hooked up to a small generator, and no one has any power >> for miles around. >> >> So far richi has promised to review them. he has sent some comments, but >> so far no reviews. Some time after i get the first round of them posted, >> i will do a second round that incorporates everyones comments. >> >> But i would like a little slack here if possible. While this work is a >> show stopper for my private port, the patches address serious problems for >> many of the public ports, especially ones that have very flexible vector >> units. I believe that there are significant set of latent problems >> currently with the existing ports that use ti mode that these patches will >> fix. >> >> However, i will do everything in my power to get the first round of the >> patches posted by nov 5 deadline. > > I suppose you are not going to merge your private port for 4.8 and thus > the wide-int changes are not a show-stopper for you. > > That said, I considered the main conversion to be appropriate to be > defered for the next stage1. There is no advantage in disrupting the > tree more at this stage. I would like the wide_int class and rtl stuff to go in 4.8 though. 
IMO it's a significant improvement in its own right, and Kenny submitted it well before the deadline. Richard ^ permalink raw reply [flat|nested] 59+ messages in thread
* Re: GCC 4.8.0 Status Report (2012-10-29), Stage 1 to end soon 2012-10-31 10:02 ` Richard Sandiford @ 2012-10-31 10:13 ` Richard Biener 2012-10-31 13:54 ` Kenneth Zadeck 2013-02-27 12:39 ` patch to fix constant math - 5th patch - the main rtl work Kenneth Zadeck 2 siblings, 0 replies; 59+ messages in thread From: Richard Biener @ 2012-10-31 10:13 UTC (permalink / raw) To: Richard Biener, Kenneth Zadeck, Jakub Jelinek, gcc, gcc-patches, rdsandiford On Wed, Oct 31, 2012 at 10:59 AM, Richard Sandiford <rdsandiford@googlemail.com> wrote: > Richard Biener <richard.guenther@gmail.com> writes: >> On Tue, Oct 30, 2012 at 10:05 PM, Kenneth Zadeck >> <zadeck@naturalbridge.com> wrote: >>> jakub, >>> >>> i am hoping to get the rest of my wide integer conversion posted by nov 5. >>> I am under some adverse conditions here: hurricane sandy hit her pretty >>> badly. my house is hooked up to a small generator, and no one has any power >>> for miles around. >>> >>> So far richi has promised to review them. he has sent some comments, but >>> so far no reviews. Some time after i get the first round of them posted, >>> i will do a second round that incorporates everyones comments. >>> >>> But i would like a little slack here if possible. While this work is a >>> show stopper for my private port, the patches address serious problems for >>> many of the public ports, especially ones that have very flexible vector >>> units. I believe that there are significant set of latent problems >>> currently with the existing ports that use ti mode that these patches will >>> fix. >>> >>> However, i will do everything in my power to get the first round of the >>> patches posted by nov 5 deadline. >> >> I suppose you are not going to merge your private port for 4.8 and thus >> the wide-int changes are not a show-stopper for you. >> >> That said, I considered the main conversion to be appropriate to be >> defered for the next stage1. 
>> There is no advantage in disrupting the tree more at this stage.
>
> I would like the wide_int class and rtl stuff to go in 4.8 though. IMO it's a significant improvement in its own right, and Kenny submitted it well before the deadline.

If it gets in as-is then we'll have to live with the IMHO broken API (yet another one besides the existing double-int). So _please_ shrink the API down aggressively in favor of using non-member helper functions with more descriptive names for things that lump together multiple operations. Look at double-int and use the same API ideas, since people are familiar with them (like the unsigned flag stuff); consistency always trumps.

I'm going to be on vacation for the next three weeks so somebody else has to pick up the review work. But I really think that the tree has to recover from too many changes already.

Richard.
* Re: GCC 4.8.0 Status Report (2012-10-29), Stage 1 to end soon
From: Kenneth Zadeck @ 2012-10-31 13:54 UTC
To: Richard Biener, Jakub Jelinek, gcc, gcc-patches, rdsandiford

Richi,

Let me explain to you what a broken api is. I have spent the last week screwing around with tree-vrp and as of last night i finally got it to work. In tree-vrp, it is clear that double-int is the precise definition of a broken api.

tree-vrp uses an infinite-precision view of arithmetic. However, that infinite precision is implemented on top of a finite, CARVED IN STONE, base that is, and without a patch like this always will be, 128 bits on an x86-64. However, as was pointed out earlier, tree-vrp needs 2 * the size of a type + 1 bit to work correctly. Until yesterday i did not fully understand the significance of that 1 bit. what this means is that tree-vrp does not work on an x86-64 with __int128 variables.

There are no checks in tree-vrp to back off when it sees something too large; tree-vrp simply gets the wrong answer. To me, this is a broken api and is GCC at its very worst. The patches that required this SHOULD HAVE NEVER GONE INTO GCC. What you have with my patches is someone who is willing to fix a large and complex problem that should have been fixed years ago.

I understand that you do not like several aspects of the wide-int api and i am willing to make some of those improvements. However, what i am worried about is that you are in some ways really attached to the style of programming where everything is dependent on the size of a HWI. I will continue to push back on those comments but have been working the rest in as i have been going along.
To answer your other question, it will be a significant problem if i cannot get these patches in. They are very prone to patch rot and my customer wants a product without many patches to the base code. Also, i fear that the real reason you want to wait is that you really do not like the fact that these patches get rid of double-int and that style of programming, and putting off that day serves no one well.

kenny

On 10/31/2012 05:59 AM, Richard Sandiford wrote:
> Richard Biener<richard.guenther@gmail.com> writes:
>> On Tue, Oct 30, 2012 at 10:05 PM, Kenneth Zadeck <zadeck@naturalbridge.com> wrote:
>>> jakub,
>>>
>>> i am hoping to get the rest of my wide integer conversion posted by nov 5. I am under some adverse conditions here: hurricane sandy hit here pretty badly. my house is hooked up to a small generator, and no one has any power for miles around.
>>>
>>> So far richi has promised to review them. he has sent some comments, but so far no reviews. Some time after i get the first round of them posted, i will do a second round that incorporates everyone's comments.
>>>
>>> But i would like a little slack here if possible. While this work is a show stopper for my private port, the patches address serious problems for many of the public ports, especially ones that have very flexible vector units. I believe that there is a significant set of latent problems in the existing ports that use TImode that these patches will fix.
>>>
>>> However, i will do everything in my power to get the first round of the patches posted by the nov 5 deadline.
>> I suppose you are not going to merge your private port for 4.8 and thus the wide-int changes are not a show-stopper for you.
>>
>> That said, I considered it appropriate to defer the main conversion to the next stage1. There is no advantage in disrupting the tree more at this stage.
> I would like the wide_int class and rtl stuff to go in 4.8 though.
> IMO it's a significant improvement in its own right, and Kenny > submitted it well before the deadline. > > Richard ^ permalink raw reply [flat|nested] 59+ messages in thread
* Re: GCC 4.8.0 Status Report (2012-10-29), Stage 1 to end soon
From: Jakub Jelinek @ 2012-10-31 14:05 UTC
To: Kenneth Zadeck; +Cc: Richard Biener, gcc, gcc-patches, rdsandiford

On Wed, Oct 31, 2012 at 09:44:50AM -0400, Kenneth Zadeck wrote:
> tree-vrp uses an infinite-precision view of arithmetic. However, that infinite precision is implemented on top of a finite, CARVED IN STONE, base that is, and without a patch like this always will be, 128 bits on an x86-64. However, as was pointed out earlier, tree-vrp needs 2 * the size of a type + 1 bit to work correctly. Until yesterday i did not fully understand the significance of that 1 bit. what this means is that tree-vrp does not work on an x86-64 with __int128 variables.

If you see a VRP bug, please file a PR with a testcase, or point to an existing PR. I agree with richi that it would be better to add a clean wide_int implementation for 4.9, rather than rushing something in and introducing lots of bugs just for a port that hasn't been submitted. Nor do I understand why integer types wider than int128_t are so crucial to your port; vector support doesn't generally need very large integers, even if your vector unit is 256-bit, 512-bit or larger.

Jakub
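The "2 * the size of a type + 1 bit" figure quoted above can be checked with a little arithmetic. What follows is a minimal sketch, not code from GCC or from the wide-int patches. The helper names (signed_bits_needed, worst_case_product_width) are made up for illustration, and the sketch assumes the extra bit comes from holding the product of two unsigned N-bit values in a signed two's-complement representation. That is one plausible reading of the thread, not something the thread states.

```python
def signed_bits_needed(value: int) -> int:
    """Smallest two's-complement width (in bits) that can hold value."""
    if value >= 0:
        return value.bit_length() + 1      # one extra bit for the sign
    return (-value - 1).bit_length() + 1


def worst_case_product_width(precision: int) -> int:
    """Signed width needed for any product of two values of a type with
    `precision` bits, whether that type is signed or unsigned."""
    umax = (1 << precision) - 1            # largest unsigned value
    smin = -(1 << (precision - 1))         # most negative signed value
    smax = (1 << (precision - 1)) - 1      # largest signed value
    candidates = [umax * umax, smin * smin, smin * smax, smax * smax]
    return max(signed_bits_needed(c) for c in candidates)


# Hypothetical demonstration over a few common precisions.
for precision in (8, 32, 64, 128):
    print(precision, worst_case_product_width(precision))
```

Under these assumptions the loop reports 17, 65, 129 and 257 bits. For a 128-bit type that is 257 bits, well beyond the 128 bits that the thread says the existing representation provides on x86-64.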
* Re: GCC 4.8.0 Status Report (2012-10-29), Stage 1 to end soon 2012-10-31 14:05 ` Jakub Jelinek @ 2012-10-31 14:06 ` Kenneth Zadeck 2012-10-31 14:31 ` Jakub Jelinek 0 siblings, 1 reply; 59+ messages in thread From: Kenneth Zadeck @ 2012-10-31 14:06 UTC (permalink / raw) To: Jakub Jelinek; +Cc: Richard Biener, gcc, gcc-patches, rdsandiford jakub my port has 256 bit integers. They are done by strapping together all of the elements of a vector unit. if one looks at where intel is going, they are doing exactly the same thing. The difference is that they like to add the operations one at a time rather than just do a clean implementation like we did. Soon they will get there, it is just a matter of time. i understand the tree-vrp code well enough to say that this operation does not work if you have timode, but i do not know how to translate that back into c to generate a test case. My patch to tree-vrp is adaptable in that it looks at the types in the program and adjusts its definition of infinite precision based on the code that it sees. I can point people to that code in tree vrp and am happy to do that, but that is not my priority now. also, richi pointed out that there are places in the tree level constant propagators that require infinite precision so he is really the person who both should know about this and generate proper tests. kenny On 10/31/2012 09:55 AM, Jakub Jelinek wrote: > On Wed, Oct 31, 2012 at 09:44:50AM -0400, Kenneth Zadeck wrote: >> The tree-vpn uses an infinite-precision view of arithmetic. However, >> that infinite precision is implemented on top of a finite, CARVED IN >> STONE, base that is and will always be without a patch like this, >> 128 bits on an x86-64. However, as was pointed out by earlier, >> tree-vrp needs 2 * the size of a type + 1 bit to work correctly. >> Until yesterday i did not fully understand the significance of that >> 1 bit. what this means is that tree-vrp does not work on an x86-64 >> with _int128 variables. 
> If you see a VRP bug, please file a PR with a testcase, or point to existing > PR. I agree with richi that it would be better to add a clean wide_int > implementation for 4.9, rather than rushing something in, introducing > lots of bugs, just for a port that hasn't been submitted, nor I understand > why > int128_t integer types are so crucial to your port, the vector > support doesn't generally need very large integers, even if your > vector unit is 256-bit, 512-bit or larger. > > Jakub ^ permalink raw reply [flat|nested] 59+ messages in thread
* Re: GCC 4.8.0 Status Report (2012-10-29), Stage 1 to end soon 2012-10-31 14:06 ` Kenneth Zadeck @ 2012-10-31 14:31 ` Jakub Jelinek 2012-10-31 14:56 ` Kenneth Zadeck 2012-10-31 18:42 ` Kenneth Zadeck 0 siblings, 2 replies; 59+ messages in thread From: Jakub Jelinek @ 2012-10-31 14:31 UTC (permalink / raw) To: Kenneth Zadeck; +Cc: Richard Biener, gcc, gcc-patches, rdsandiford On Wed, Oct 31, 2012 at 10:04:58AM -0400, Kenneth Zadeck wrote: > if one looks at where intel is going, they are doing exactly the > same thing. The difference is that they like to add the > operations one at a time rather than just do a clean implementation > like we did. Soon they will get there, it is just a matter of > time. All I see on Intel is whole vector register shifts (and like on many other ports and/or/xor/andn could be considered whole register too). And, even if your port has 256-bit integer arithmetics, there is no mangling for __int256_t or similar, so I don't see how you can offer such data type as supported in the 4.8 timeframe. Jakub ^ permalink raw reply [flat|nested] 59+ messages in thread
* Re: GCC 4.8.0 Status Report (2012-10-29), Stage 1 to end soon
From: Kenneth Zadeck @ 2012-10-31 14:56 UTC
To: Jakub Jelinek; +Cc: Richard Biener, gcc, gcc-patches, rdsandiford

I was not planning to do that mangling for 4.8. My primary justification for getting it in publicly now is that there are a large number of places where the current compiler (both at the tree and rtl levels) does not do optimization if the value is larger than a single hwi. My code generalizes all of these places so that they do the transformations independent of the size of the hwi. (in some cases at the rtl level, the transformations were only done on 32 bit or smaller types, but i have seen nothing like that at the tree level.) This provides benefits for cross compilers and for ports that support TImode now.

The fact that i have chosen to do it in such a way that we will never have this problem again is the part of the patch that richi seems to object to.

We have patches that do the mangling for 256 for the front ends but we figured that we would post those for comments. These are likely to be controversial because they require extensions to the syntax to accept large constants.

But there is no reason why the patches that fix the existing problems in a general way should not be considered for this release.

Kenny

On 10/31/2012 10:27 AM, Jakub Jelinek wrote:
> On Wed, Oct 31, 2012 at 10:04:58AM -0400, Kenneth Zadeck wrote:
>> if one looks at where intel is going, they are doing exactly the same thing. The difference is that they like to add the operations one at a time rather than just do a clean implementation like we did. Soon they will get there, it is just a matter of time.
> All I see on Intel is whole vector register shifts (and like on many other ports and/or/xor/andn could be considered whole register too).
> And, even if your port has 256-bit integer arithmetics, there is no mangling > for __int256_t or similar, so I don't see how you can offer such data type > as supported in the 4.8 timeframe. > > Jakub ^ permalink raw reply [flat|nested] 59+ messages in thread
* Re: GCC 4.8.0 Status Report (2012-10-29), Stage 1 to end soon 2012-10-31 14:31 ` Jakub Jelinek 2012-10-31 14:56 ` Kenneth Zadeck @ 2012-10-31 18:42 ` Kenneth Zadeck 2012-11-01 12:44 ` Kenneth Zadeck 1 sibling, 1 reply; 59+ messages in thread From: Kenneth Zadeck @ 2012-10-31 18:42 UTC (permalink / raw) To: Jakub Jelinek; +Cc: Richard Biener, gcc, gcc-patches, rdsandiford [-- Attachment #1: Type: text/plain, Size: 4203 bytes --] Jakub, it is hard from all of the threads to actually distill what the real issues are here. So let me start from a clean slate and state them simply. Richi has three primary objections: 1) that we can do all of this with a templated version of double-int. 2) that we should not be passing in a precision and bitsize into the interface. 3) that the interface is too large. I have attached a fragment of my patch #5 to illustrate the main thrust of my patches and to illustrate the usefulness to gcc right now. In the current trunk, we have code that does simplification when the mode fits in an HWI and we have code that does the simplification if the mode fits in two HWIs. If the mode does not fit in two HWIs, the code does not do the simplification. Thus here and in a large number of other places we have two copies of the code. Richi wants there to be multiple template instantiations of double-int. This means that we are now going to have to have 3 copies of this code to support oi mode on a 64 bit host and 4 copies on a 32 bit host. Further note that there are not as many cases for the 2*hwi in the code as there are for the hwi case and in general this is true throughout the compiler. (CLRSB is missing from the 2hwi case in the patch) We really did not write twice the code when we started supporting 2 hwi, we added about 1.5 times the code (simplify-rtx is better than most of the rest of the compiler). I am using the rtl level as an example here because i have posted all of those patches, but the tree level is no better. 
I do not want to write this code a third time and certainly not a fourth time. Just fixing all of this is quite useful now: it fills in a lot of gaps in our transformations and it removes many edge case crashes because ti mode really is lightly tested. However, this patch becomes crucial as the world gets larger. Richi's second point is that we should be doing everything at "infinite precision" and not passing in an explicit bitsize and precision. That works ok (sans the issues i raised with it in tree-vrp earlier) when the largest precision on the machine fits in a couple of hwis. However, for targets that have large integers or cross compilers, this becomes expensive. The idea behind my set of patches is that for the transformations that can work this way, we do the math in the precision of the type or mode. In general this means that almost all of the math will be done quickly, even on targets that support really big integers. For passes like tree-vrp, the math will be done at some multiple of the largest type seen in the actual program. The amount of the multiple is a function of the optimization, not the target or the host. Currently (on my home computer) the wide-int interface allows the optimization to go 4x the largest mode on the target. I can get rid of this bound at the expense of doing an alloca rather than stack allocating a fixed sized structure. However, given the extremely heavy use of this interface, that does not seem like the best of tradeoffs. The truth is that the vast majority of the compiler actually wants to see the math done the way that it is going to be done on the machine. Tree-vrp and the gimple constant prop do not. But i have made accommodations to handle both needs. I believe that the reason that double-int was never used at the rtl level is that it does not actually do the math in a way that is useful to the target. Richi's third objection is that the interface is too large. I disagree. 
It was designed based on the actual usage of the interface. When i found places where i was writing the same code over and over again, i put it in a function as part of the interface. I later went back and optimized many of these because this is a very heavily used interface. Richi has many other objections, but i have agreed to fix almost all of them, so i am not going to address them here. It really will be a huge burden to have to carry these patches until the next revision. We are currently in stage 1 and i believe that the minor issues that richi raises can be easily addressed. kenny [-- Attachment #2: small.diff --] [-- Type: text/x-patch, Size: 8180 bytes --] @@ -1373,302 +1411,87 @@ simplify_const_unary_operation (enum rtx_code code, enum machine_mode mode, return CONST_DOUBLE_FROM_REAL_VALUE (d, mode); } - if (CONST_INT_P (op) - && width <= HOST_BITS_PER_WIDE_INT && width > 0) + if (CONST_SCALAR_INT_P (op) && width > 0) { - HOST_WIDE_INT arg0 = INTVAL (op); - HOST_WIDE_INT val; + wide_int result; + enum machine_mode imode = op_mode == VOIDmode ? mode : op_mode; + wide_int op0 = wide_int::from_rtx (op, imode); + +#if TARGET_SUPPORTS_WIDE_INT == 0 + /* This assert keeps the simplification from producing a result + that cannot be represented in a CONST_DOUBLE but a lot of + upstream callers expect that this function never fails to + simplify something and so you if you added this to the test + above the code would die later anyway. If this assert + happens, you just need to make the port support wide int. */ + gcc_assert (width <= HOST_BITS_PER_DOUBLE_INT); +#endif switch (code) { case NOT: - val = ~ arg0; + result = ~op0; break; case NEG: - val = - arg0; + result = op0.neg (); break; case ABS: - val = (arg0 >= 0 ? 
arg0 : - arg0); + result = op0.abs (); break; case FFS: - arg0 &= GET_MODE_MASK (mode); - val = ffs_hwi (arg0); + result = op0.ffs (); break; case CLZ: - arg0 &= GET_MODE_MASK (mode); - if (arg0 == 0 && CLZ_DEFINED_VALUE_AT_ZERO (mode, val)) - ; - else - val = GET_MODE_PRECISION (mode) - floor_log2 (arg0) - 1; + result = op0.clz (GET_MODE_BITSIZE (mode), + GET_MODE_PRECISION (mode)); break; case CLRSB: - arg0 &= GET_MODE_MASK (mode); - if (arg0 == 0) - val = GET_MODE_PRECISION (mode) - 1; - else if (arg0 >= 0) - val = GET_MODE_PRECISION (mode) - floor_log2 (arg0) - 2; - else if (arg0 < 0) - val = GET_MODE_PRECISION (mode) - floor_log2 (~arg0) - 2; + result = op0.clrsb (GET_MODE_BITSIZE (mode), + GET_MODE_PRECISION (mode)); break; - + case CTZ: - arg0 &= GET_MODE_MASK (mode); - if (arg0 == 0) - { - /* Even if the value at zero is undefined, we have to come - up with some replacement. Seems good enough. */ - if (! CTZ_DEFINED_VALUE_AT_ZERO (mode, val)) - val = GET_MODE_PRECISION (mode); - } - else - val = ctz_hwi (arg0); + result = op0.ctz (GET_MODE_BITSIZE (mode), + GET_MODE_PRECISION (mode)); break; case POPCOUNT: - arg0 &= GET_MODE_MASK (mode); - val = 0; - while (arg0) - val++, arg0 &= arg0 - 1; + result = op0.popcount (GET_MODE_BITSIZE (mode), + GET_MODE_PRECISION (mode)); break; case PARITY: - arg0 &= GET_MODE_MASK (mode); - val = 0; - while (arg0) - val++, arg0 &= arg0 - 1; - val &= 1; + result = op0.parity (GET_MODE_BITSIZE (mode), + GET_MODE_PRECISION (mode)); break; case BSWAP: - { - unsigned int s; - - val = 0; - for (s = 0; s < width; s += 8) - { - unsigned int d = width - s - 8; - unsigned HOST_WIDE_INT byte; - byte = (arg0 >> s) & 0xff; - val |= byte << d; - } - } + result = op0.bswap (); break; case TRUNCATE: - val = arg0; + result = op0.sext (mode); break; case ZERO_EXTEND: - /* When zero-extending a CONST_INT, we need to know its - original mode. 
*/ - gcc_assert (op_mode != VOIDmode); - if (op_width == HOST_BITS_PER_WIDE_INT) - { - /* If we were really extending the mode, - we would have to distinguish between zero-extension - and sign-extension. */ - gcc_assert (width == op_width); - val = arg0; - } - else if (GET_MODE_BITSIZE (op_mode) < HOST_BITS_PER_WIDE_INT) - val = arg0 & GET_MODE_MASK (op_mode); - else - return 0; + result = op0.zext (mode); break; case SIGN_EXTEND: - if (op_mode == VOIDmode) - op_mode = mode; - op_width = GET_MODE_PRECISION (op_mode); - if (op_width == HOST_BITS_PER_WIDE_INT) - { - /* If we were really extending the mode, - we would have to distinguish between zero-extension - and sign-extension. */ - gcc_assert (width == op_width); - val = arg0; - } - else if (op_width < HOST_BITS_PER_WIDE_INT) - { - val = arg0 & GET_MODE_MASK (op_mode); - if (val_signbit_known_set_p (op_mode, val)) - val |= ~GET_MODE_MASK (op_mode); - } - else - return 0; + result = op0.sext (mode); break; case SQRT: - case FLOAT_EXTEND: - case FLOAT_TRUNCATE: - case SS_TRUNCATE: - case US_TRUNCATE: - case SS_NEG: - case US_NEG: - case SS_ABS: - return 0; - - default: - gcc_unreachable (); - } - - return gen_int_mode (val, mode); - } - - /* We can do some operations on integer CONST_DOUBLEs. Also allow - for a DImode operation on a CONST_INT. 
*/ - else if (width <= HOST_BITS_PER_DOUBLE_INT - && (CONST_DOUBLE_AS_INT_P (op) || CONST_INT_P (op))) - { - double_int first, value; - - if (CONST_DOUBLE_AS_INT_P (op)) - first = double_int::from_pair (CONST_DOUBLE_HIGH (op), - CONST_DOUBLE_LOW (op)); - else - first = double_int::from_shwi (INTVAL (op)); - - switch (code) - { - case NOT: - value = ~first; - break; - - case NEG: - value = -first; - break; - - case ABS: - if (first.is_negative ()) - value = -first; - else - value = first; - break; - - case FFS: - value.high = 0; - if (first.low != 0) - value.low = ffs_hwi (first.low); - else if (first.high != 0) - value.low = HOST_BITS_PER_WIDE_INT + ffs_hwi (first.high); - else - value.low = 0; - break; - - case CLZ: - value.high = 0; - if (first.high != 0) - value.low = GET_MODE_PRECISION (mode) - floor_log2 (first.high) - 1 - - HOST_BITS_PER_WIDE_INT; - else if (first.low != 0) - value.low = GET_MODE_PRECISION (mode) - floor_log2 (first.low) - 1; - else if (! CLZ_DEFINED_VALUE_AT_ZERO (mode, value.low)) - value.low = GET_MODE_PRECISION (mode); - break; - - case CTZ: - value.high = 0; - if (first.low != 0) - value.low = ctz_hwi (first.low); - else if (first.high != 0) - value.low = HOST_BITS_PER_WIDE_INT + ctz_hwi (first.high); - else if (! 
CTZ_DEFINED_VALUE_AT_ZERO (mode, value.low)) - value.low = GET_MODE_PRECISION (mode); - break; - - case POPCOUNT: - value = double_int_zero; - while (first.low) - { - value.low++; - first.low &= first.low - 1; - } - while (first.high) - { - value.low++; - first.high &= first.high - 1; - } - break; - - case PARITY: - value = double_int_zero; - while (first.low) - { - value.low++; - first.low &= first.low - 1; - } - while (first.high) - { - value.low++; - first.high &= first.high - 1; - } - value.low &= 1; - break; - - case BSWAP: - { - unsigned int s; - - value = double_int_zero; - for (s = 0; s < width; s += 8) - { - unsigned int d = width - s - 8; - unsigned HOST_WIDE_INT byte; - - if (s < HOST_BITS_PER_WIDE_INT) - byte = (first.low >> s) & 0xff; - else - byte = (first.high >> (s - HOST_BITS_PER_WIDE_INT)) & 0xff; - - if (d < HOST_BITS_PER_WIDE_INT) - value.low |= byte << d; - else - value.high |= byte << (d - HOST_BITS_PER_WIDE_INT); - } - } - break; - - case TRUNCATE: - /* This is just a change-of-mode, so do nothing. */ - value = first; - break; - - case ZERO_EXTEND: - gcc_assert (op_mode != VOIDmode); - - if (op_width > HOST_BITS_PER_WIDE_INT) - return 0; - - value = double_int::from_uhwi (first.low & GET_MODE_MASK (op_mode)); - break; - - case SIGN_EXTEND: - if (op_mode == VOIDmode - || op_width > HOST_BITS_PER_WIDE_INT) - return 0; - else - { - value.low = first.low & GET_MODE_MASK (op_mode); - if (val_signbit_known_set_p (op_mode, value.low)) - value.low |= ~GET_MODE_MASK (op_mode); - - value.high = HWI_SIGN_EXTEND (value.low); - } - break; - - case SQRT: - return 0; - default: return 0; } - return immed_double_int_const (value, mode); + return immed_wide_int_const (result, mode); } ^ permalink raw reply [flat|nested] 59+ messages in thread
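The point of the message above, one copy of each operation that works for any width, can be illustrated with a small standalone sketch. It is not the wide_int class from the patch: the type wide_value, the helper bit_at and the functions wide_popcount, wide_parity and wide_clz are hypothetical names made up for this illustration, blocks are assumed to be 64 bits wide, and returning the precision for clz of zero is an assumption rather than what CLZ_DEFINED_VALUE_AT_ZERO would give on a particular target.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical stand-in for a multi-word constant (not GCC's wide_int):
// little-endian 64-bit blocks plus the precision of the mode in bits.
struct wide_value {
  std::vector<uint64_t> blocks;  // blocks[0] holds the least significant bits
  unsigned precision;            // number of meaningful bits
};

// Return bit BIT of V.  Bits at or above the precision, or beyond the
// stored blocks, read as zero.
inline bool bit_at(const wide_value &v, unsigned bit) {
  std::size_t i = bit / 64;
  if (bit >= v.precision || i >= v.blocks.size())
    return false;
  return (v.blocks[i] >> (bit % 64)) & 1;
}

// One popcount for every width: 32 bits, 128 bits, 256 bits, ...
// (A real implementation would work a block at a time; the bit loop
// is only here to keep the sketch short.)
unsigned wide_popcount(const wide_value &v) {
  unsigned n = 0;
  for (unsigned bit = 0; bit < v.precision; ++bit)
    n += bit_at(v, bit);
  return n;
}

unsigned wide_parity(const wide_value &v) {
  return wide_popcount(v) & 1;
}

// Leading zeros counted from the top of the precision.  Assumption for
// this sketch: the result for zero is the precision itself.
unsigned wide_clz(const wide_value &v) {
  assert(v.precision > 0);
  for (unsigned bit = v.precision; bit-- > 0;)
    if (bit_at(v, bit))
      return v.precision - 1 - bit;
  return v.precision;
}
```

Nothing in these functions mentions how many blocks the host needs for the value, which is the property being argued for: the same code serves a 32-bit mode in one block and a 256-bit mode in four.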
* Re: GCC 4.8.0 Status Report (2012-10-29), Stage 1 to end soon 2012-10-31 18:42 ` Kenneth Zadeck @ 2012-11-01 12:44 ` Kenneth Zadeck 2012-11-01 13:10 ` Richard Sandiford 0 siblings, 1 reply; 59+ messages in thread From: Kenneth Zadeck @ 2012-11-01 12:44 UTC (permalink / raw) To: Jakub Jelinek; +Cc: Richard Biener, gcc, gcc-patches, rdsandiford [-- Attachment #1: Type: text/plain, Size: 5655 bytes --] richi, I would like you to respond to at least point 1 of this email. In it there is code from the rtl level that was written twice, once for the case when the size of the mode is less than the size of a HWI and once for the case where the size of the mode is less than 2 HWIs. my patch changes this to one instance of the code that works no matter how large the data passed to it is. you have made a specific requirement for wide int to be a template that can be instantiated in several sizes, one for 1 HWI, one for 2 HWI. I would like to know how this particular fragment is to be rewritten in this model? It seems that I would have to retain the structure where there is one version of the code for each size that the template is instantiated. I would like to point out that there are about 125 places where we have two copies of the code for some operation. Many of these places are smaller than this, but some are larger. There are also at least several hundred places where the code only was written for the 1 hwi case. These are harder to find with simple greps. I am very concerned about this particular aspect of your comments because it seems to doom us to write the same code over and over again. kenny On 10/31/2012 02:19 PM, Kenneth Zadeck wrote: > Jakub, > > it is hard from all of the threads to actually distill what the real > issues are here. So let me start from a clean slate and state them > simply. > > Richi has three primary objections: > > 1) that we can do all of this with a templated version of double-int. 
> 2) that we should not be passing in a precision and bitsize into the > interface. > 3) that the interface is too large. > > I have attached a fragment of my patch #5 to illustrate the main > thrust of my patches and to illustrate the usefulness to gcc right now. > > In the current trunk, we have code that does simplification when the > mode fits in an HWI and we have code that does the simplification if > the mode fits in two HWIs. if the mode does not fit in two hwi's the > code does not do the simplification. > > Thus here and in a large number of other places we have two copies of > the code. Richi wants there to be multiple template instantiations > of double-int. This means that we are now going to have to have 3 > copies of this code to support oi mode on a 64 bit host and 4 copies > on a 32 bit host. > > Further note that there are not as many cases for the 2*hwi in the > code as their are for the hwi case and in general this is true through > out the compiler. (CLRSB is missing from the 2hwi case in the patch) > We really did not write twice the code when we stated supporting 2 > hwi, we added about 1.5 times the code (simplify-rtx is better than > most of the rest of the compiler). I am using the rtl level as an > example here because i have posted all of those patches, but the tree > level is no better. > > I do not want to write this code a third time and certainly not a > fourth time. Just fixing all of this is quite useful now: it fills > in a lot of gaps in our transformations and it removes many edge case > crashes because ti mode really is lightly tested. However, this patch > becomes crucial as the world gets larger. > > Richi's second point is that we should be doing everything at > "infinite precision" and not passing in an explicit bitsize and > precision. That works ok (sans the issues i raised with it in > tree-vpn earlier) when the largest precision on the machine fits in a > couple of hwis. 
However, for targets that have large integers or > cross compilers, this becomes expensive. The idea behind my set of > patches is that for the transformations that can work this way, we do > the math in the precision of the type or mode. In general this means > that almost all of the math will be done quickly, even on targets that > support really big integers. For passes like tree-vrp, the math will > be done at some multiple of the largest type seen in the actual > program. The amount of the multiple is a function of the > optimization, not the target or the host. Currently (on my home > computer) the wide-int interface allows the optimization to go 4x the > largest mode on the target. > > I can get rid of this bound at the expense of doing an alloca rather > than stack allocating a fixed sized structure. However, given the > extremely heavy use of this interface, that does not seem like the > best of tradeoffs. > > The truth is that the vast majority of the compiler actually wants to > see the math done the way that it is going to be done on the machine. > Tree-vrp and the gimple constant prop do not. But i have made > accommodations to handle both needs. I believe that the reason that > double-int was never used at the rtl level is that it does not > actually do the math in a way that is useful to the target. > > Richi's third objection is that the interface is too large. I > disagree. It was designed based on the actual usage of the > interface. When i found places where i was writing the same code > over and over again, i put it in a function as part of the > interface. I later went back and optimized many of these because > this is a very heavily used interface. Richi has many other > objections, but i have agreed to fix almost all of them, so i am not > going to address them here. > > It really will be a huge burden to have to carry these patched until > the next revision. 
We are currently in stage 1 and i believe that the > minor issues that richi raises can be easily addressed. > > kenny [-- Attachment #2: small.diff --] [-- Type: text/x-patch, Size: 8180 bytes --] [-- Same small.diff as attached to the parent message --] ^ permalink raw reply [flat|nested] 59+ messages in thread
* Re: GCC 4.8.0 Status Report (2012-10-29), Stage 1 to end soon 2012-11-01 12:44 ` Kenneth Zadeck @ 2012-11-01 13:10 ` Richard Sandiford 2012-11-01 13:18 ` Kenneth Zadeck ` (3 more replies) 0 siblings, 4 replies; 59+ messages in thread From: Richard Sandiford @ 2012-11-01 13:10 UTC (permalink / raw) To: Kenneth Zadeck; +Cc: Jakub Jelinek, Richard Biener, gcc, gcc-patches Kenneth Zadeck <zadeck@naturalbridge.com> writes: > I would like you to respond to at least point 1 of this email. In it > there is code from the rtl level that was written twice, once for the > case when the size of the mode is less than the size of a HWI and once > for the case where the size of the mode is less that 2 HWIs. > > my patch changes this to one instance of the code that works no matter > how large the data passed to it is. > > you have made a specific requirement for wide int to be a template that > can be instantiated in several sizes, one for 1 HWI, one for 2 HWI. I > would like to know how this particular fragment is to be rewritten in > this model? It seems that I would have to retain the structure where > there is one version of the code for each size that the template is > instantiated. I think richi's argument was that wide_int should be split into two. There should be a "bare-metal" class that just has a length and HWIs, and the main wide_int class should be an extension on top of that that does things to a bit precision instead. Presumably with some template magic so that the length (number of HWIs) is a constant for: typedef foo<2> double_int; and a variable for wide_int (because in wide_int the length would be the number of significant HWIs rather than the size of the underlying array). wide_int would also record the precision and apply it after the full HWI operation. So the wide_int class would still provide "as wide as we need" arithmetic, as in your rtl patch. I don't think he was objecting to that. As is probably obvious, I don't agree FWIW. 
It seems like an unnecessary complication without any clear use. Especially since the number of significant HWIs in a wide_int isn't always going to be the same for both operands to a binary operation, and it's not clear to me whether that should be handled in the base class or wide_int. Richard ^ permalink raw reply [flat|nested] 59+ messages in thread
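The layering described in the message above can be sketched roughly as follows. This is only one reading of the proposal, not code from either side of the discussion: fixed_blocks, with_precision and double_int_like are hypothetical names, blocks are assumed to be 64 bits, and the upper layer keeps a fixed capacity instead of tracking a variable number of significant HWIs, which is a simplification.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>

// "Bare-metal" layer: N full 64-bit blocks, no notion of precision.
template <std::size_t N>
struct fixed_blocks {
  std::array<uint64_t, N> val{};  // val[0] is the least significant block

  // Build from a host integer, sign-extending into the upper blocks.
  static fixed_blocks from_int(int64_t x) {
    fixed_blocks r;
    r.val[0] = static_cast<uint64_t>(x);
    for (std::size_t i = 1; i < N; ++i)
      r.val[i] = x < 0 ? ~uint64_t{0} : 0;
    return r;
  }

  // Full-block addition with carry propagation; wraps modulo 2^(64*N).
  fixed_blocks operator+(const fixed_blocks &o) const {
    fixed_blocks r;
    uint64_t carry = 0;
    for (std::size_t i = 0; i < N; ++i) {
      uint64_t s = val[i] + o.val[i];
      uint64_t c1 = s < val[i];      // carry out of the block sum
      r.val[i] = s + carry;
      uint64_t c2 = r.val[i] < s;    // carry out of adding the old carry
      carry = c1 | c2;
    }
    return r;
  }
};

// The fixed-length instantiation mentioned in the message.
typedef fixed_blocks<2> double_int_like;

// Upper layer: performs the full-block operation, then applies the
// precision by clearing every bit at or above it.
template <std::size_t N>
struct with_precision {
  fixed_blocks<N> bits;
  unsigned precision;

  void truncate() {
    for (std::size_t i = 0; i < N; ++i) {
      unsigned lo = static_cast<unsigned>(i) * 64;
      if (precision <= lo)
        bits.val[i] = 0;
      else if (precision < lo + 64)
        bits.val[i] &= ~uint64_t{0} >> (64 - (precision - lo));
    }
  }

  with_precision operator+(const with_precision &o) const {
    assert(precision == o.precision);
    with_precision r{bits + o.bits, precision};
    r.truncate();
    return r;
  }
};
```

With this split, double_int_like arithmetic never looks at a precision, while with_precision<2>{from_int(0xff), 8} plus one wraps to zero because the precision is applied after the full-block add.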
* Re: GCC 4.8.0 Status Report (2012-10-29), Stage 1 to end soon 2012-11-01 13:10 ` Richard Sandiford @ 2012-11-01 13:18 ` Kenneth Zadeck 2012-11-01 13:24 ` Kenneth Zadeck ` (2 subsequent siblings) 3 siblings, 0 replies; 59+ messages in thread From: Kenneth Zadeck @ 2012-11-01 13:18 UTC (permalink / raw) To: Jakub Jelinek, Richard Biener, gcc, gcc-patches, rdsandiford On 11/01/2012 09:10 AM, Richard Sandiford wrote: > Kenneth Zadeck <zadeck@naturalbridge.com> writes: >> I would like you to respond to at least point 1 of this email. In it >> there is code from the rtl level that was written twice, once for the >> case when the size of the mode is less than the size of a HWI and once >> for the case where the size of the mode is less that 2 HWIs. >> >> my patch changes this to one instance of the code that works no matter >> how large the data passed to it is. >> >> you have made a specific requirement for wide int to be a template that >> can be instantiated in several sizes, one for 1 HWI, one for 2 HWI. I >> would like to know how this particular fragment is to be rewritten in >> this model? It seems that I would have to retain the structure where >> there is one version of the code for each size that the template is >> instantiated. > I think richi's argument was that wide_int should be split into two. > There should be a "bare-metal" class that just has a length and HWIs, > and the main wide_int class should be an extension on top of that > that does things to a bit precision instead. Presumably with some > template magic so that the length (number of HWIs) is a constant for: > > typedef foo<2> double_int; > > and a variable for wide_int (because in wide_int the length would be > the number of significant HWIs rather than the size of the underlying > array). wide_int would also record the precision and apply it after > the full HWI operation. > > So the wide_int class would still provide "as wide as we need" arithmetic, > as in your rtl patch. 
I don't think he was objecting to that. > > As is probably obvious, I don't agree FWIW. It seems like an unnecessary > complication without any clear use. Especially since the number of > significant HWIs in a wide_int isn't always going to be the same for > both operands to a binary operation, and it's not clear to me whether > that should be handled in the base class or wide_int. > > Richard There is a certain amount of surprise about all of this on my part. I thought that i was doing such a great thing by looking at the specific port that you are building to determine how to size these data structures. You would think from the response that i am getting that i had murdered someone. Do you think that when he gets around to reading the patch for simplify-rtx.c that he is going to object to this fragment?
@@ -5179,13 +4815,11 @@ static rtx
 simplify_immed_subreg (enum machine_mode outermode, rtx op,
 		       enum machine_mode innermode, unsigned int byte)
 {
-  /* We support up to 512-bit values (for V8DFmode).  */
   enum {
-    max_bitsize = 512,
     value_bit = 8,
     value_mask = (1 << value_bit) - 1
   };
-  unsigned char value[max_bitsize / value_bit];
+  unsigned char value [MAX_BITSIZE_MODE_ANY_MODE/value_bit];
   int value_start;
   int i;
   int elem;
^ permalink raw reply [flat|nested] 59+ messages in thread
* Re: GCC 4.8.0 Status Report (2012-10-29), Stage 1 to end soon 2012-11-01 13:10 ` Richard Sandiford 2012-11-01 13:18 ` Kenneth Zadeck @ 2012-11-01 13:24 ` Kenneth Zadeck 2012-11-01 15:16 ` Richard Sandiford 2012-11-04 16:54 ` Richard Biener 3 siblings, 0 replies; 59+ messages in thread From: Kenneth Zadeck @ 2012-11-01 13:24 UTC (permalink / raw) To: Jakub Jelinek, Richard Biener, gcc, gcc-patches, rdsandiford anyway richard, it does not answer the question as to what you are going to do with a typedef foo<2>. the point of all of this work by me was to leave no traces of the host in the way the compiler works. instantiating a specific size of the double-ints is not going to get you there. kenny On 11/01/2012 09:10 AM, Richard Sandiford wrote: > Kenneth Zadeck <zadeck@naturalbridge.com> writes: >> I would like you to respond to at least point 1 of this email. In it >> there is code from the rtl level that was written twice, once for the >> case when the size of the mode is less than the size of a HWI and once >> for the case where the size of the mode is less that 2 HWIs. >> >> my patch changes this to one instance of the code that works no matter >> how large the data passed to it is. >> >> you have made a specific requirement for wide int to be a template that >> can be instantiated in several sizes, one for 1 HWI, one for 2 HWI. I >> would like to know how this particular fragment is to be rewritten in >> this model? It seems that I would have to retain the structure where >> there is one version of the code for each size that the template is >> instantiated. > I think richi's argument was that wide_int should be split into two. > There should be a "bare-metal" class that just has a length and HWIs, > and the main wide_int class should be an extension on top of that > that does things to a bit precision instead. 
Presumably with some > template magic so that the length (number of HWIs) is a constant for: > > typedef foo<2> double_int; > > and a variable for wide_int (because in wide_int the length would be > the number of significant HWIs rather than the size of the underlying > array). wide_int would also record the precision and apply it after > the full HWI operation. > > So the wide_int class would still provide "as wide as we need" arithmetic, > as in your rtl patch. I don't think he was objecting to that. > > As is probably obvious, I don't agree FWIW. It seems like an unnecessary > complication without any clear use. Especially since the number of > significant HWIs in a wide_int isn't always going to be the same for > both operands to a binary operation, and it's not clear to me whether > that should be handled in the base class or wide_int. > > Richard ^ permalink raw reply [flat|nested] 59+ messages in thread
* Re: GCC 4.8.0 Status Report (2012-10-29), Stage 1 to end soon 2012-11-01 13:10 ` Richard Sandiford 2012-11-01 13:18 ` Kenneth Zadeck 2012-11-01 13:24 ` Kenneth Zadeck @ 2012-11-01 15:16 ` Richard Sandiford 2012-11-04 16:54 ` Richard Biener 3 siblings, 0 replies; 59+ messages in thread From: Richard Sandiford @ 2012-11-01 15:16 UTC (permalink / raw) To: Kenneth Zadeck; +Cc: Jakub Jelinek, Richard Biener, gcc, gcc-patches Richard Sandiford <rdsandiford@googlemail.com> writes: > As is probably obvious, I don't agree FWIW. It seems like an unnecessary > complication without any clear use. Especially since the number of > significant HWIs in a wide_int isn't always going to be the same for > both operands to a binary operation, and it's not clear to me whether > that should be handled in the base class or wide_int. ...and the number of HWIs in the result might be different again. Whether that's true depends on the value as well as the (HWI) size of the operands. Richard ^ permalink raw reply [flat|nested] 59+ messages in thread
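The point that the result length depends on the values, not only on the operand lengths, can be illustrated with a small sketch. It assumes a 64-bit `hwi` typedef as a stand-in for HOST_WIDE_INT and a sign-extended compressed representation; `canon_len` and `add_compressed` are hypothetical names, and the precision handling that wide_int applies is left out.

```cpp
#include <cassert>
#include <cstdint>

typedef std::int64_t hwi;  // assumed stand-in for HOST_WIDE_INT

// Drop high-order elements that only sign-extend the element below them.
inline unsigned canon_len(const hwi *val, unsigned len) {
  while (len > 1) {
    hwi sign_of_below = val[len - 2] < 0 ? -1 : 0;
    if (val[len - 1] != sign_of_below)
      break;
    --len;
  }
  return len;
}

// Element i of a compressed value, sign-extending past its length.
inline std::uint64_t ext_elt(const hwi *val, unsigned len, unsigned i) {
  if (i < len)
    return static_cast<std::uint64_t>(val[i]);
  return val[len - 1] < 0 ? ~static_cast<std::uint64_t>(0) : 0;
}

// Add two compressed values without truncation.  `result` needs room
// for max(alen, blen) + 1 elements.  Returns the canonical length.
inline unsigned add_compressed(const hwi *a, unsigned alen,
                               const hwi *b, unsigned blen, hwi *result) {
  unsigned n = (alen > blen ? alen : blen) + 1;
  std::uint64_t carry = 0;
  for (unsigned i = 0; i < n; ++i) {
    std::uint64_t x = ext_elt(a, alen, i);
    std::uint64_t y = ext_elt(b, blen, i);
    std::uint64_t s = x + y;
    std::uint64_t t = s + carry;
    carry = (s < x) | (t < s);  // carry out of either addition
    result[i] = static_cast<hwi>(t);
  }
  return canon_len(result, n);
}
```

Two one-element operands such as 1 and 2 give a one-element result, while INT64_MAX and 1 give a two-element result, so the length cannot be fixed from the operand lengths alone.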
* Re: GCC 4.8.0 Status Report (2012-10-29), Stage 1 to end soon 2012-11-01 13:10 ` Richard Sandiford ` (2 preceding siblings ...) 2012-11-01 15:16 ` Richard Sandiford @ 2012-11-04 16:54 ` Richard Biener 2012-11-05 13:59 ` Kenneth Zadeck 3 siblings, 1 reply; 59+ messages in thread From: Richard Biener @ 2012-11-04 16:54 UTC (permalink / raw) To: Kenneth Zadeck, Jakub Jelinek, Richard Biener, gcc, gcc-patches, rdsandiford On Thu, Nov 1, 2012 at 2:10 PM, Richard Sandiford <rdsandiford@googlemail.com> wrote: > Kenneth Zadeck <zadeck@naturalbridge.com> writes: >> I would like you to respond to at least point 1 of this email. In it >> there is code from the rtl level that was written twice, once for the >> case when the size of the mode is less than the size of a HWI and once >> for the case where the size of the mode is less that 2 HWIs. >> >> my patch changes this to one instance of the code that works no matter >> how large the data passed to it is. >> >> you have made a specific requirement for wide int to be a template that >> can be instantiated in several sizes, one for 1 HWI, one for 2 HWI. I >> would like to know how this particular fragment is to be rewritten in >> this model? It seems that I would have to retain the structure where >> there is one version of the code for each size that the template is >> instantiated. > > I think richi's argument was that wide_int should be split into two. > There should be a "bare-metal" class that just has a length and HWIs, > and the main wide_int class should be an extension on top of that > that does things to a bit precision instead. Presumably with some > template magic so that the length (number of HWIs) is a constant for: > > typedef foo<2> double_int; > > and a variable for wide_int (because in wide_int the length would be > the number of significant HWIs rather than the size of the underlying > array). wide_int would also record the precision and apply it after > the full HWI operation. 
> > So the wide_int class would still provide "as wide as we need" arithmetic, > as in your rtl patch. I don't think he was objecting to that. That summarizes one part of my complaints / suggestions correctly. In other mails I suggested to not make it a template but a constant over object lifetime 'bitsize' (or maxlen) field. Both suggestions likely require more thought than I put into them. The main reason is that with C++ you can abstract from where wide-int information pieces are stored and thus use the arithmetic / operation workers without copying the (source) "wide-int" objects. Thus you should be able to write adaptors for double-int storage, tree or RTX storage. > As is probably obvious, I don't agree FWIW. It seems like an unnecessary > complication without any clear use. Especially since the number of Maybe the double_int typedef is without any clear use. Properly abstracting from the storage / information providers will save compile-time, memory and code though. I don't see that any thought was spent on how to avoid excessive copying or dealing with long(er)-lived objects and their storage needs. > significant HWIs in a wide_int isn't always going to be the same for > both operands to a binary operation, and it's not clear to me whether > that should be handled in the base class or wide_int. It certainly depends. Richard. > Richard ^ permalink raw reply [flat|nested] 59+ messages in thread
* Re: GCC 4.8.0 Status Report (2012-10-29), Stage 1 to end soon 2012-11-04 16:54 ` Richard Biener @ 2012-11-05 13:59 ` Kenneth Zadeck 2012-11-05 17:00 ` Kenneth Zadeck 2012-11-26 15:03 ` Richard Biener 0 siblings, 2 replies; 59+ messages in thread From: Kenneth Zadeck @ 2012-11-05 13:59 UTC (permalink / raw) To: Richard Biener; +Cc: Jakub Jelinek, gcc, gcc-patches, rdsandiford On 11/04/2012 11:54 AM, Richard Biener wrote: > On Thu, Nov 1, 2012 at 2:10 PM, Richard Sandiford > <rdsandiford@googlemail.com> wrote: >> Kenneth Zadeck <zadeck@naturalbridge.com> writes: >>> I would like you to respond to at least point 1 of this email. In it >>> there is code from the rtl level that was written twice, once for the >>> case when the size of the mode is less than the size of a HWI and once >>> for the case where the size of the mode is less that 2 HWIs. >>> >>> my patch changes this to one instance of the code that works no matter >>> how large the data passed to it is. >>> >>> you have made a specific requirement for wide int to be a template that >>> can be instantiated in several sizes, one for 1 HWI, one for 2 HWI. I >>> would like to know how this particular fragment is to be rewritten in >>> this model? It seems that I would have to retain the structure where >>> there is one version of the code for each size that the template is >>> instantiated. >> I think richi's argument was that wide_int should be split into two. >> There should be a "bare-metal" class that just has a length and HWIs, >> and the main wide_int class should be an extension on top of that >> that does things to a bit precision instead. Presumably with some >> template magic so that the length (number of HWIs) is a constant for: >> >> typedef foo<2> double_int; >> >> and a variable for wide_int (because in wide_int the length would be >> the number of significant HWIs rather than the size of the underlying >> array). 
wide_int would also record the precision and apply it after >> the full HWI operation. >> >> So the wide_int class would still provide "as wide as we need" arithmetic, >> as in your rtl patch. I don't think he was objecting to that. > That summarizes one part of my complaints / suggestions correctly. In other > mails I suggested to not make it a template but a constant over object lifetime > 'bitsize' (or maxlen) field. Both suggestions likely require more thought than > I put into them. The main reason is that with C++ you can abstract from where > wide-int information pieces are stored and thus use the arithmetic / operation > workers without copying the (source) "wide-int" objects. Thus you should > be able to write adaptors for double-int storage, tree or RTX storage. We had considered something along these lines and rejected it. I am not really opposed to doing something like this, but it is not an obvious winning idea and is likely not to be a good idea. Here was our thought process: if you abstract away the storage inside a wide int, then you should be able to copy a pointer to the block of data from either the rtl level integer constant or the tree level one into the wide int. It is certainly true that making a wide_int from one of these is an extremely common operation and doing this would avoid those copies. However, this causes two problems: 1) Mike's first cut at the CONST_WIDE_INT did two ggc allocations to make the object. it created the base object and then it allocated the array. Richard S noticed that we could just allocate one CONST_WIDE_INT that had the array in it. Doing it this way saves one ggc allocation and one indirection when accessing the data within the CONST_WIDE_INT. Our plan is to use the same trick at the tree level. So to avoid the copying, you seem to have to have a more expensive rep for CONST_WIDE_INT and INT_CST. 
2) You are now stuck either ggcing the storage inside a wide_int when it is created as part of an expression, or you have to play some game to represent the two different storage plans inside of wide_int.

Clearly this is where you think that we should be going by suggesting that we abstract away the internal storage. However, this comes at a price: what is currently an array access in my patches would (I believe) become a function call. From a performance point of view, I believe that this is a non-starter. If you can figure out how to design this so that it is not a function call, I would consider this a viable option.

On the other side of this, you are clearly correct that we are copying the data when we are making wide ints from INT_CSTs or CONST_WIDE_INTs. But this is why we represent data inside of the wide_ints, the INT_CSTs and the CONST_WIDE_INTs in a compressed form. Even with very big types, which are generally rare, the constants themselves are very small. So the copy operation is a loop that almost always copies one element, even with tree-vrp, which doubles the sizes of every type.

There is a third option, which is that the storage inside the wide int is just ggced storage. We rejected this because of the functional nature of wide-ints. There are zillions created, they can be stack allocated, and they last for very short periods of time.

>> As is probably obvious, I don't agree FWIW. It seems like an unnecessary
>> complication without any clear use. Especially since the number of
> Maybe the double_int typedef is without any clear use. Properly
> abstracting from the storage / information providers will save
> compile-time, memory and code though. I don't see that any thought
> was spent on how to avoid excessive copying or dealing with
> long(er)-lived objects and their storage needs.

I actually disagree.
Wide ints can use a bloated amount of storage because they are designed to be very short lived and very low cost objects that are stack allocated. For long term storage, there is INT_CST at the tree level and CONST_WIDE_INT at the rtl level. Those use a very compact storage model. The copying entailed is only a small part of the overall performance. Everything that you are suggesting along these lines is adding to the weight of a wide-int object. You have to understand there will be many more wide-ints created in a normal compilation than were ever created with double-int. This is because the rtl level had no object like this at all and at the tree level, many of the places that should have used double int, short cut the code and only did the transformations if the types fit in a HWI. This is why we are extremely defensive about this issue. We really did think a lot about it. Kenny >> significant HWIs in a wide_int isn't always going to be the same for >> both operands to a binary operation, and it's not clear to me whether >> that should be handled in the base class or wide_int. > It certainly depends. > > Richard. > >> Richard ^ permalink raw reply [flat|nested] 59+ messages in thread
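The copy-in model described in this message can be sketched roughly as follows. All names here (`compact_const`, `stack_wide_int`, `from_const`, `kMaxElts`) are hypothetical and chosen for illustration only; the real CONST_WIDE_INT and INT_CST layouts are not reproduced, and the fixed-size array in `compact_const` stands in for a trailing array sized to the constant.

```cpp
#include <cassert>
#include <cstdint>

typedef std::int64_t hwi;     // assumed stand-in for HOST_WIDE_INT
const unsigned kMaxElts = 4;  // arbitrary capacity for this sketch

// Hypothetical long-lived constant kept in compressed form: only `len`
// elements are significant.
struct compact_const {
  unsigned len;
  hwi elts[kMaxElts];
};

// Hypothetical short-lived value that owns a fixed-size buffer and is
// meant to live on the stack.
struct stack_wide_int {
  unsigned len;
  unsigned precision;
  hwi val[kMaxElts];
};

// Copy the significant elements in.  For most constants len is 1, so
// the loop body runs once.
inline void from_const(const compact_const &c, unsigned precision,
                       stack_wide_int &out) {
  assert(c.len >= 1 && c.len <= kMaxElts);
  out.len = c.len;
  out.precision = precision;
  for (unsigned i = 0; i < c.len; ++i)
    out.val[i] = c.elts[i];
}
```

The trade-off under discussion is visible here: the copy itself is short, but every `stack_wide_int` reserves the full buffer whether or not it is needed.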
* Re: GCC 4.8.0 Status Report (2012-10-29), Stage 1 to end soon 2012-11-05 13:59 ` Kenneth Zadeck @ 2012-11-05 17:00 ` Kenneth Zadeck 2012-11-26 15:03 ` Richard Biener 1 sibling, 0 replies; 59+ messages in thread From: Kenneth Zadeck @ 2012-11-05 17:00 UTC (permalink / raw) To: Richard Biener; +Cc: Jakub Jelinek, gcc, gcc-patches, rdsandiford

Jakub and Richi,

At this point I have decided that I am not going to get the rest of the wide-int patches into a stable enough form for this round. The combination of still living without power at my house and some issues that I hit with the front ends has made it impossible to get this finished by today's deadline.

I do want patches 1-7 to go in (after proper review), but I am going to withdraw patch 8 for this round.

Patches 1-5 deal with the rtl level. These have been extensively tested and, with the exception of patch 4, "examined" by Richard Sandiford. They clean up a lot of things at the rtl level that affect every port, as well as fixing some outstanding regressions.

Patches 6 and 7 are general cleanups at the tree level and can be justified on their own without any regard to wide-int. They have also been extensively tested.

I am withdrawing patch 8 because it converted tree-vrp to use wide-ints, but the benefit of this patch really cannot be seen without the rest of the tree level wide-int patches.

In the next couple of days I will resubmit patches 1-7 with the patch rot removed and the public comments folded into them.

Kenny

^ permalink raw reply [flat|nested] 59+ messages in thread
* Re: GCC 4.8.0 Status Report (2012-10-29), Stage 1 to end soon 2012-11-05 13:59 ` Kenneth Zadeck 2012-11-05 17:00 ` Kenneth Zadeck @ 2012-11-26 15:03 ` Richard Biener 2012-11-26 16:03 ` Kenneth Zadeck 1 sibling, 1 reply; 59+ messages in thread From: Richard Biener @ 2012-11-26 15:03 UTC (permalink / raw) To: Kenneth Zadeck; +Cc: Jakub Jelinek, gcc, gcc-patches, rdsandiford On Mon, Nov 5, 2012 at 2:59 PM, Kenneth Zadeck <zadeck@naturalbridge.com> wrote: > > On 11/04/2012 11:54 AM, Richard Biener wrote: >> >> On Thu, Nov 1, 2012 at 2:10 PM, Richard Sandiford >> <rdsandiford@googlemail.com> wrote: >>> >>> Kenneth Zadeck <zadeck@naturalbridge.com> writes: >>>> >>>> I would like you to respond to at least point 1 of this email. In it >>>> there is code from the rtl level that was written twice, once for the >>>> case when the size of the mode is less than the size of a HWI and once >>>> for the case where the size of the mode is less that 2 HWIs. >>>> >>>> my patch changes this to one instance of the code that works no matter >>>> how large the data passed to it is. >>>> >>>> you have made a specific requirement for wide int to be a template that >>>> can be instantiated in several sizes, one for 1 HWI, one for 2 HWI. I >>>> would like to know how this particular fragment is to be rewritten in >>>> this model? It seems that I would have to retain the structure where >>>> there is one version of the code for each size that the template is >>>> instantiated. >>> >>> I think richi's argument was that wide_int should be split into two. >>> There should be a "bare-metal" class that just has a length and HWIs, >>> and the main wide_int class should be an extension on top of that >>> that does things to a bit precision instead. 
Presumably with some >>> template magic so that the length (number of HWIs) is a constant for: >>> >>> typedef foo<2> double_int; >>> >>> and a variable for wide_int (because in wide_int the length would be >>> the number of significant HWIs rather than the size of the underlying >>> array). wide_int would also record the precision and apply it after >>> the full HWI operation. >>> >>> So the wide_int class would still provide "as wide as we need" >>> arithmetic, >>> as in your rtl patch. I don't think he was objecting to that. >> >> That summarizes one part of my complaints / suggestions correctly. In >> other >> mails I suggested to not make it a template but a constant over object >> lifetime >> 'bitsize' (or maxlen) field. Both suggestions likely require more thought >> than >> I put into them. The main reason is that with C++ you can abstract from >> where >> wide-int information pieces are stored and thus use the arithmetic / >> operation >> workers without copying the (source) "wide-int" objects. Thus you should >> be able to write adaptors for double-int storage, tree or RTX storage. > > We had considered something along these lines and rejected it. I am not > really opposed to doing something like this, but it is not an obvious > winning idea and is likely not to be a good idea. Here was our thought > process: > > if you abstract away the storage inside a wide int, then you should be able > to copy a pointer to the block of data from either the rtl level integer > constant or the tree level one into the wide int. It is certainly true > that making a wide_int from one of these is an extremely common operation > and doing this would avoid those copies. > > However, this causes two problems: > 1) Mike's first cut at the CONST_WIDE_INT did two ggc allocations to make > the object. it created the base object and then it allocated the array. > Richard S noticed that we could just allocate one CONST_WIDE_INT that had > the array in it. 
Doing it this way saves one ggc allocation and one > indirection when accessing the data within the CONST_WIDE_INT. Our plan is > to use the same trick at the tree level. So to avoid the copying, you seem > to have to have a more expensive rep for CONST_WIDE_INT and INT_CST. I did not propose having a pointer to the data in the RTX or tree int. Just the short-lived wide-ints (which are on the stack) would have a pointer to the data - which can then obviously point into the RTX and tree data. > 2) You are now stuck either ggcing the storage inside a wide_int when they > are created as part of an expression or you have to play some game to > represent the two different storage plans inside of wide_int. Hm? wide-ints are short-lived and thus never live across a garbage collection point. We create non-GCed objects pointing to GCed objects all the time and everywhere this way. > Clearly this > is where you think that we should be going by suggesting that we abstract > away the internal storage. However, this comes at a price: what is > currently an array access in my patches would (i believe) become a function > call. No, the workers (that perform the array accesses) will simply get a pointer to the first data element. Then whether it's embedded or external is of no interest to them. > From a performance point of view, i believe that this is a non > starter. If you can figure out how to design this so that it is not a > function call, i would consider this a viable option. > > On the other side of this you are clearly correct that we are copying the > data when we are making wide ints from INT_CSTs or CONST_WIDE_INTs. But > this is why we represent data inside of the wide_ints, the INT_CSTs and the > CONST_WIDE_INTs in a compressed form. Even with very big types, which are > generally rare, the constants them selves are very small. So the copy > operation is a loop that almost always copies one element, even with > tree-vrp which doubles the sizes of every type. 
> > There is the third option which is that the storage inside the wide int is > just ggced storage. We rejected this because of the functional nature of > wide-ints. There are zillions created, they can be stack allocated, and > they last for very short periods of time. Of course - GCing wide-ints is a non-starter. > >>> As is probably obvious, I don't agree FWIW. It seems like an unnecessary >>> complication without any clear use. Especially since the number of >> >> Maybe the double_int typedef is without any clear use. Properly >> abstracting from the storage / information providers will save >> compile-time, memory and code though. I don't see that any thought >> was spent on how to avoid excessive copying or dealing with >> long(er)-lived objects and their storage needs. > > I actually disagree. Wide ints can use a bloated amount of storage > because they are designed to be very short lived and very low cost objects > that are stack allocated. For long term storage, there is INT_CST at the > tree level and CONST_WIDE_INT at the rtl level. Those use a very compact > storage model. The copying entailed is only a small part of the overall > performance. Well, but both trees and RTXen are not viable for short-lived things because the are GCed! double-ints were suitable for this kind of stuff because the also have a moderate size. With wide-ints size becomes a problem (or GC, if you instead use trees or RTXen). > Everything that you are suggesting along these lines is adding to the weight > of a wide-int object. On the contrary - it lessens their weight (with external already existing storage) or does not do anything to it (with the embedded storage). > You have to understand there will be many more > wide-ints created in a normal compilation than were ever created with > double-int. 
This is because the rtl level had no object like this at all > and at the tree level, many of the places that should have used double int, > short cut the code and only did the transformations if the types fit in a > HWI. Your argument shows that the copy-in/out from tree/RTX to/from wide-int will become a very frequent operation and thus it is worth optimizing it. > This is why we are extremely defensive about this issue. We really did > think a lot about it. I'm sure you did. Richard. ^ permalink raw reply [flat|nested] 59+ messages in thread
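The alternative argued for in this message, where the workers only see a pointer to the first element, might look roughly like the sketch below. `wide_int_ref`, `is_negative`, and `kMaxElts` are hypothetical names, and this is one possible reading of the proposal rather than code from either party.

```cpp
#include <cassert>
#include <cstdint>

typedef std::int64_t hwi;     // assumed stand-in for HOST_WIDE_INT
const unsigned kMaxElts = 4;  // arbitrary capacity for this sketch

// Worker: sees only a pointer and a length, so it does not care whether
// the elements live in a tree, an RTX, or a stack buffer.
inline bool is_negative(const hwi *val, unsigned len) {
  return val[len - 1] < 0;
}

// Hypothetical short-lived value with two storage plans.
class wide_int_ref {
 public:
  // Borrow elements stored elsewhere (for example inside a constant);
  // nothing is copied.
  wide_int_ref(const hwi *external, unsigned len)
      : val_(external), len_(len) {}

  // Hold a freshly computed value in the embedded buffer.
  explicit wide_int_ref(hwi v) : val_(embedded_), len_(1) {
    embedded_[0] = v;
  }

  // A default copy would leave val_ pointing into the source object's
  // buffer, so copying is disabled in this sketch.
  wide_int_ref(const wide_int_ref &) = delete;
  wide_int_ref &operator=(const wide_int_ref &) = delete;

  const hwi *data() const { return val_; }
  unsigned len() const { return len_; }
  bool borrows() const { return val_ != embedded_; }

 private:
  const hwi *val_;
  unsigned len_;
  hwi embedded_[kMaxElts];
};
```

Element access stays a plain pointer dereference in both cases, which addresses the function-call concern. Note that this sketch still reserves the embedded buffer when borrowing; avoiding that would need the two storage plans to be separate types, which is not shown here.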
* Re: GCC 4.8.0 Status Report (2012-10-29), Stage 1 to end soon 2012-11-26 15:03 ` Richard Biener @ 2012-11-26 16:03 ` Kenneth Zadeck 2012-11-26 16:30 ` Richard Biener 0 siblings, 1 reply; 59+ messages in thread From: Kenneth Zadeck @ 2012-11-26 16:03 UTC (permalink / raw) To: Richard Biener; +Cc: Jakub Jelinek, gcc, gcc-patches, rdsandiford On 11/26/2012 10:03 AM, Richard Biener wrote: > On Mon, Nov 5, 2012 at 2:59 PM, Kenneth Zadeck <zadeck@naturalbridge.com> wrote: >> On 11/04/2012 11:54 AM, Richard Biener wrote: >>> On Thu, Nov 1, 2012 at 2:10 PM, Richard Sandiford >>> <rdsandiford@googlemail.com> wrote: >>>> Kenneth Zadeck <zadeck@naturalbridge.com> writes: >>>>> I would like you to respond to at least point 1 of this email. In it >>>>> there is code from the rtl level that was written twice, once for the >>>>> case when the size of the mode is less than the size of a HWI and once >>>>> for the case where the size of the mode is less that 2 HWIs. >>>>> >>>>> my patch changes this to one instance of the code that works no matter >>>>> how large the data passed to it is. >>>>> >>>>> you have made a specific requirement for wide int to be a template that >>>>> can be instantiated in several sizes, one for 1 HWI, one for 2 HWI. I >>>>> would like to know how this particular fragment is to be rewritten in >>>>> this model? It seems that I would have to retain the structure where >>>>> there is one version of the code for each size that the template is >>>>> instantiated. >>>> I think richi's argument was that wide_int should be split into two. >>>> There should be a "bare-metal" class that just has a length and HWIs, >>>> and the main wide_int class should be an extension on top of that >>>> that does things to a bit precision instead. 
Presumably with some >>>> template magic so that the length (number of HWIs) is a constant for: >>>> >>>> typedef foo<2> double_int; >>>> >>>> and a variable for wide_int (because in wide_int the length would be >>>> the number of significant HWIs rather than the size of the underlying >>>> array). wide_int would also record the precision and apply it after >>>> the full HWI operation. >>>> >>>> So the wide_int class would still provide "as wide as we need" >>>> arithmetic, >>>> as in your rtl patch. I don't think he was objecting to that. >>> That summarizes one part of my complaints / suggestions correctly. In >>> other >>> mails I suggested to not make it a template but a constant over object >>> lifetime >>> 'bitsize' (or maxlen) field. Both suggestions likely require more thought >>> than >>> I put into them. The main reason is that with C++ you can abstract from >>> where >>> wide-int information pieces are stored and thus use the arithmetic / >>> operation >>> workers without copying the (source) "wide-int" objects. Thus you should >>> be able to write adaptors for double-int storage, tree or RTX storage. >> We had considered something along these lines and rejected it. I am not >> really opposed to doing something like this, but it is not an obvious >> winning idea and is likely not to be a good idea. Here was our thought >> process: >> >> if you abstract away the storage inside a wide int, then you should be able >> to copy a pointer to the block of data from either the rtl level integer >> constant or the tree level one into the wide int. It is certainly true >> that making a wide_int from one of these is an extremely common operation >> and doing this would avoid those copies. >> >> However, this causes two problems: >> 1) Mike's first cut at the CONST_WIDE_INT did two ggc allocations to make >> the object. it created the base object and then it allocated the array. 
>> Richard S noticed that we could just allocate one CONST_WIDE_INT that had >> the array in it. Doing it this way saves one ggc allocation and one >> indirection when accessing the data within the CONST_WIDE_INT. Our plan is >> to use the same trick at the tree level. So to avoid the copying, you seem >> to have to have a more expensive rep for CONST_WIDE_INT and INT_CST. > I did not propose having a pointer to the data in the RTX or tree int. Just > the short-lived wide-ints (which are on the stack) would have a pointer to > the data - which can then obviously point into the RTX and tree data. There is the issue then what if some wide-ints are not short lived. It makes me nervous to create internal pointers to gc ed memory. >> 2) You are now stuck either ggcing the storage inside a wide_int when they >> are created as part of an expression or you have to play some game to >> represent the two different storage plans inside of wide_int. > Hm? wide-ints are short-lived and thus never live across a garbage collection > point. We create non-GCed objects pointing to GCed objects all the time > and everywhere this way. Again, this makes me nervous but it could be done. However, it does mean that now the wide ints that are not created from rtxes or trees will be more expensive because they are not going to get their storage "for free", they are going to alloca it. however, it still is not clear, given that 99% of the wide ints are going to fit in a single hwi, that this would be a noticeable win. > >> Clearly this >> is where you think that we should be going by suggesting that we abstract >> away the internal storage. However, this comes at a price: what is >> currently an array access in my patches would (i believe) become a function >> call. > No, the workers (that perform the array accesses) will simply get > a pointer to the first data element. Then whether it's embedded or > external is of no interest to them. 
so is your plan that the wide int constructors from rtx or tree would just copy the pointer to the array on top of the array that is otherwise allocated on the stack? I can easily do this. But as i said, the gain seems quite small. And of course, going the other way still does need the copy. >> From a performance point of view, i believe that this is a non >> starter. If you can figure out how to design this so that it is not a >> function call, i would consider this a viable option. >> >> On the other side of this you are clearly correct that we are copying the >> data when we are making wide ints from INT_CSTs or CONST_WIDE_INTs. But >> this is why we represent data inside of the wide_ints, the INT_CSTs and the >> CONST_WIDE_INTs in a compressed form. Even with very big types, which are >> generally rare, the constants them selves are very small. So the copy >> operation is a loop that almost always copies one element, even with >> tree-vrp which doubles the sizes of every type. >> >> There is the third option which is that the storage inside the wide int is >> just ggced storage. We rejected this because of the functional nature of >> wide-ints. There are zillions created, they can be stack allocated, and >> they last for very short periods of time. > Of course - GCing wide-ints is a non-starter. > >>>> As is probably obvious, I don't agree FWIW. It seems like an unnecessary >>>> complication without any clear use. Especially since the number of >>> Maybe the double_int typedef is without any clear use. Properly >>> abstracting from the storage / information providers will save >>> compile-time, memory and code though. I don't see that any thought >>> was spent on how to avoid excessive copying or dealing with >>> long(er)-lived objects and their storage needs. >> I actually disagree. Wide ints can use a bloated amount of storage >> because they are designed to be very short lived and very low cost objects >> that are stack allocated. 
For long term storage, there is INT_CST at the >> tree level and CONST_WIDE_INT at the rtl level. Those use a very compact >> storage model. The copying entailed is only a small part of the overall >> performance. > Well, but both trees and RTXen are not viable for short-lived things because > the are GCed! double-ints were suitable for this kind of stuff because > the also have a moderate size. With wide-ints size becomes a problem > (or GC, if you instead use trees or RTXen). > >> Everything that you are suggesting along these lines is adding to the weight >> of a wide-int object. > On the contrary - it lessens their weight (with external already > existing storage) > or does not do anything to it (with the embedded storage). > >> You have to understand there will be many more >> wide-ints created in a normal compilation than were ever created with >> double-int. This is because the rtl level had no object like this at all >> and at the tree level, many of the places that should have used double int, >> short cut the code and only did the transformations if the types fit in a >> HWI. > Your argument shows that the copy-in/out from tree/RTX to/from wide-int > will become a very frequent operation and thus it is worth optimizing it. > >> This is why we are extremely defensive about this issue. We really did >> think a lot about it. > I'm sure you did. > > Richard. ^ permalink raw reply [flat|nested] 59+ messages in thread
* Re: GCC 4.8.0 Status Report (2012-10-29), Stage 1 to end soon 2012-11-26 16:03 ` Kenneth Zadeck @ 2012-11-26 16:30 ` Richard Biener 2012-11-27 0:06 ` Kenneth Zadeck 0 siblings, 1 reply; 59+ messages in thread From: Richard Biener @ 2012-11-26 16:30 UTC (permalink / raw) To: Kenneth Zadeck; +Cc: Jakub Jelinek, gcc, gcc-patches, rdsandiford

On Mon, Nov 26, 2012 at 5:03 PM, Kenneth Zadeck <zadeck@naturalbridge.com> wrote:
> On 11/26/2012 10:03 AM, Richard Biener wrote:
>> On Mon, Nov 5, 2012 at 2:59 PM, Kenneth Zadeck <zadeck@naturalbridge.com> wrote:
>>> On 11/04/2012 11:54 AM, Richard Biener wrote:
>>>> On Thu, Nov 1, 2012 at 2:10 PM, Richard Sandiford <rdsandiford@googlemail.com> wrote:
>>>>> Kenneth Zadeck <zadeck@naturalbridge.com> writes:
>>>>>> I would like you to respond to at least point 1 of this email. In it
>>>>>> there is code from the rtl level that was written twice, once for the
>>>>>> case when the size of the mode is less than the size of a HWI and once
>>>>>> for the case where the size of the mode is less than 2 HWIs.
>>>>>>
>>>>>> my patch changes this to one instance of the code that works no matter
>>>>>> how large the data passed to it is.
>>>>>>
>>>>>> you have made a specific requirement for wide int to be a template that
>>>>>> can be instantiated in several sizes, one for 1 HWI, one for 2 HWI. I
>>>>>> would like to know how this particular fragment is to be rewritten in
>>>>>> this model? It seems that I would have to retain the structure where
>>>>>> there is one version of the code for each size that the template is
>>>>>> instantiated.
>>>>>
>>>>> I think richi's argument was that wide_int should be split into two.
>>>>> There should be a "bare-metal" class that just has a length and HWIs,
>>>>> and the main wide_int class should be an extension on top of that
>>>>> that does things to a bit precision instead. Presumably with some
>>>>> template magic so that the length (number of HWIs) is a constant for:
>>>>>
>>>>>   typedef foo<2> double_int;
>>>>>
>>>>> and a variable for wide_int (because in wide_int the length would be
>>>>> the number of significant HWIs rather than the size of the underlying
>>>>> array). wide_int would also record the precision and apply it after
>>>>> the full HWI operation.
>>>>>
>>>>> So the wide_int class would still provide "as wide as we need"
>>>>> arithmetic, as in your rtl patch. I don't think he was objecting to that.
>>>>
>>>> That summarizes one part of my complaints / suggestions correctly. In
>>>> other mails I suggested to not make it a template but a constant over
>>>> object lifetime 'bitsize' (or maxlen) field. Both suggestions likely
>>>> require more thought than I put into them. The main reason is that with
>>>> C++ you can abstract from where wide-int information pieces are stored
>>>> and thus use the arithmetic / operation workers without copying the
>>>> (source) "wide-int" objects. Thus you should be able to write adaptors
>>>> for double-int storage, tree or RTX storage.
>>>
>>> We had considered something along these lines and rejected it. I am not
>>> really opposed to doing something like this, but it is not an obvious
>>> winning idea and is likely not to be a good idea. Here was our thought
>>> process:
>>>
>>> if you abstract away the storage inside a wide int, then you should be
>>> able to copy a pointer to the block of data from either the rtl level
>>> integer constant or the tree level one into the wide int. It is certainly
>>> true that making a wide_int from one of these is an extremely common
>>> operation and doing this would avoid those copies.
>>>
>>> However, this causes two problems:
>>> 1) Mike's first cut at the CONST_WIDE_INT did two ggc allocations to make
>>> the object. it created the base object and then it allocated the array.
>>> Richard S noticed that we could just allocate one CONST_WIDE_INT that had
>>> the array in it. Doing it this way saves one ggc allocation and one
>>> indirection when accessing the data within the CONST_WIDE_INT. Our plan
>>> is to use the same trick at the tree level. So to avoid the copying, you
>>> seem to have to have a more expensive rep for CONST_WIDE_INT and INT_CST.
>>
>> I did not propose having a pointer to the data in the RTX or tree int.
>> Just the short-lived wide-ints (which are on the stack) would have a
>> pointer to the data - which can then obviously point into the RTX and
>> tree data.
>
> There is the issue then what if some wide-ints are not short lived. It
> makes me nervous to create internal pointers to GCed memory.

I thought they were all short-lived.

>>> 2) You are now stuck either ggcing the storage inside a wide_int when
>>> they are created as part of an expression or you have to play some game
>>> to represent the two different storage plans inside of wide_int.
>>
>> Hm? wide-ints are short-lived and thus never live across a garbage
>> collection point. We create non-GCed objects pointing to GCed objects
>> all the time and everywhere this way.
>
> Again, this makes me nervous but it could be done. However, it does mean
> that now the wide ints that are not created from rtxes or trees will be
> more expensive because they are not going to get their storage "for free",
> they are going to alloca it.

No, those would simply use the embedded storage model.

> however, it still is not clear, given that 99% of the wide ints are going
> to fit in a single hwi, that this would be a noticeable win.

Currently even if they fit into a HWI you will still allocate 4 times the
largest integer mode size. You say that doesn't matter because they
are short-lived, but I say it does matter because not all of them are
short-lived enough. If 99% fit in a HWI why allocate 4 times the
largest integer mode size in 99% of the cases?

>>> Clearly this is where you think that we should be going by suggesting
>>> that we abstract away the internal storage. However, this comes at a
>>> price: what is currently an array access in my patches would (i believe)
>>> become a function call.
>>
>> No, the workers (that perform the array accesses) will simply get
>> a pointer to the first data element. Then whether it's embedded or
>> external is of no interest to them.
>
> so is your plan that the wide int constructors from rtx or tree would just
> copy the pointer to the array on top of the array that is otherwise
> allocated on the stack? I can easily do this. But as i said, the gain
> seems quite small.
>
> And of course, going the other way still does need the copy.

The proposal was to template wide_int on a storage model, the embedded
one would work as-is (embedding 4 times the largest integer mode), the
external one would have a pointer to data. All functions that return a
wide_int produce a wide_int with the embedded model. To avoid
the function call penalty you described, the storage model provides
a way to get a pointer to the first element and the templated operations
simply dispatch to a worker that takes this pointer to the first element
(as the storage model is designed as a template its abstraction is going
to be optimized away by means of inlining).

Richard.

>>> From a performance point of view, i believe that this is a non
>>> starter. If you can figure out how to design this so that it is not a
>>> function call, i would consider this a viable option.
>>>
>>> On the other side of this you are clearly correct that we are copying the
>>> data when we are making wide ints from INT_CSTs or CONST_WIDE_INTs. But
>>> this is why we represent data inside of the wide_ints, the INT_CSTs and
>>> the CONST_WIDE_INTs in a compressed form. Even with very big types, which
>>> are generally rare, the constants themselves are very small. So the copy
>>> operation is a loop that almost always copies one element, even with
>>> tree-vrp which doubles the sizes of every type.
>>>
>>> There is the third option which is that the storage inside the wide int
>>> is just ggced storage. We rejected this because of the functional nature
>>> of wide-ints. There are zillions created, they can be stack allocated,
>>> and they last for very short periods of time.
>>
>> Of course - GCing wide-ints is a non-starter.
>>
>>>>> As is probably obvious, I don't agree FWIW. It seems like an
>>>>> unnecessary complication without any clear use. Especially since the
>>>>> number of
>>>>
>>>> Maybe the double_int typedef is without any clear use. Properly
>>>> abstracting from the storage / information providers will save
>>>> compile-time, memory and code though. I don't see that any thought
>>>> was spent on how to avoid excessive copying or dealing with
>>>> long(er)-lived objects and their storage needs.
>>>
>>> I actually disagree. Wide ints can use a bloated amount of storage
>>> because they are designed to be very short lived and very low cost
>>> objects that are stack allocated. For long term storage, there is
>>> INT_CST at the tree level and CONST_WIDE_INT at the rtl level. Those use
>>> a very compact storage model. The copying entailed is only a small part
>>> of the overall performance.
>>
>> Well, but both trees and RTXen are not viable for short-lived things
>> because they are GCed! double-ints were suitable for this kind of stuff
>> because they also have a moderate size. With wide-ints size becomes a
>> problem (or GC, if you instead use trees or RTXen).
>>
>>> Everything that you are suggesting along these lines is adding to the
>>> weight of a wide-int object.
>>
>> On the contrary - it lessens their weight (with external already
>> existing storage) or does not do anything to it (with the embedded
>> storage).
>>
>>> You have to understand there will be many more wide-ints created in a
>>> normal compilation than were ever created with double-int. This is
>>> because the rtl level had no object like this at all and at the tree
>>> level, many of the places that should have used double int, short cut
>>> the code and only did the transformations if the types fit in a HWI.
>>
>> Your argument shows that the copy-in/out from tree/RTX to/from wide-int
>> will become a very frequent operation and thus it is worth optimizing it.
>>
>>> This is why we are extremely defensive about this issue. We really did
>>> think a lot about it.
>>
>> I'm sure you did.
>>
>> Richard.

^ permalink raw reply [flat|nested] 59+ messages in thread
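The storage-model idea discussed in the message above can be sketched in a few lines of C++. This is only a minimal illustration under stated assumptions, not code from the wide-int patches: every name here (`hwi`, `uhwi`, `MAX_LEN`, `embedded_storage`, `external_storage`, `wide_int_sketch`, `from_words`, and the body of `plus_worker`) is hypothetical, precision handling is left out, and the sum simply wraps at the length of the longer operand. The point it tries to show is that the arithmetic worker sees only a pointer to the first word, so it should not matter whether that pointer leads into the object itself or into storage owned elsewhere.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

typedef std::int64_t hwi;        // hypothetical stand-in for HOST_WIDE_INT
typedef std::uint64_t uhwi;      // unsigned counterpart, used for carries
const std::size_t MAX_LEN = 4;   // hypothetical embedded capacity, in words

// Embedded model: the words live inside the object (stack friendly).
struct embedded_storage {
  hwi val[MAX_LEN];
  std::size_t len;
  const hwi *get_storage_ptr() const { return val; }
};

// External model: the object only points at words owned elsewhere,
// for example the payload of a long-lived constant node.
struct external_storage {
  const hwi *ptr;
  std::size_t len;
  const hwi *get_storage_ptr() const { return ptr; }
};

// The integer type is parameterized on where its words are stored.
template <class storage = embedded_storage>
struct wide_int_sketch : public storage {};

// Non-templated worker: it sees only pointers and lengths. Words beyond
// an operand's length are treated as its sign extension.
inline std::size_t plus_worker(hwi *out, const hwi *a, std::size_t alen,
                               const hwi *b, std::size_t blen) {
  assert(alen >= 1 && blen >= 1);
  std::size_t n = alen > blen ? alen : blen;
  assert(n <= MAX_LEN);
  uhwi carry = 0;
  for (std::size_t i = 0; i < n; ++i) {
    uhwi x = i < alen ? uhwi(a[i]) : (a[alen - 1] < 0 ? ~uhwi(0) : uhwi(0));
    uhwi y = i < blen ? uhwi(b[i]) : (b[blen - 1] < 0 ? ~uhwi(0) : uhwi(0));
    uhwi s = x + y;
    uhwi r = s + carry;
    carry = (s < x) || (r < s);  // carry out of either addition
    out[i] = hwi(r);
  }
  return n;
}

// Any mix of storage models is accepted; the result is always embedded.
template <class S1, class S2>
wide_int_sketch<embedded_storage> operator+(const wide_int_sketch<S1> &a,
                                            const wide_int_sketch<S2> &b) {
  wide_int_sketch<embedded_storage> r = wide_int_sketch<embedded_storage>();
  r.len = plus_worker(r.val, a.get_storage_ptr(), a.len,
                      b.get_storage_ptr(), b.len);
  return r;
}

// Wrap existing words without copying them.
inline wide_int_sketch<external_storage> from_words(const hwi *words,
                                                    std::size_t len) {
  wide_int_sketch<external_storage> w;
  w.ptr = words;
  w.len = len;
  return w;
}
```

In this sketch an expression such as `from_words (p, n) + one` reads the external words in place and writes only the embedded result, which is the copy the proposal wants to avoid. Whether the abstraction really disappears after inlining in a production compiler is a claim from the thread, not something this example demonstrates.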
* Re: GCC 4.8.0 Status Report (2012-10-29), Stage 1 to end soon 2012-11-26 16:30 ` Richard Biener @ 2012-11-27 0:06 ` Kenneth Zadeck 2012-11-27 10:03 ` Richard Biener 0 siblings, 1 reply; 59+ messages in thread From: Kenneth Zadeck @ 2012-11-27 0:06 UTC (permalink / raw) To: Richard Biener; +Cc: Jakub Jelinek, gcc, gcc-patches, rdsandiford, Mike Stump

Richard,

I spent a good part of the afternoon talking to Mike about this. He is on
the C++ standards committee and is a much more seasoned C++ programmer than
I am.

He convinced me that with a large amount of engineering and C++
"foolishness" it was indeed possible to get your proposal to POSSIBLY
work as well as what we did.

But now the question is why would anyone want to do this?

At the very least you are talking about instantiating two instances of
wide-ints, one for the stack allocated uses and one for the places where we
just move a pointer from the tree or the rtx. Then you are talking about
creating connectors so that the stack allocated functions can take
parameters of the pointer version and vice versa.

Then there is the issue that rather than just saying that something is a
wide int, the programmer is going to have to track its origin. In
particular, where in the code right now I say

    wide_int foo = wide_int::from_rtx (r1);
    wide_int bar = wide_int::from_rtx (r2) + foo;

now I would have to say

    wide_int_ptr foo = wide_int_ptr::from_rtx (r1);
    wide_int_stack bar = wide_int_ptr::from_rtx (r2) + foo;

Then when I want to call some function using a wide_int ref, that function
now must be either overloaded to take both or I have to choose one of the
two instantiations (presumably based on which is going to be more common)
and just have the compiler fix up everything (which it is likely to do).

And so what is the payoff:
1) No one except the C++ elite is going to understand the code. The rest of
the community will hate me and curse the ground that I walk on.
2) I will end up with a version of wide-int that can be used as a medium
life container (where I define medium life as not allowed to survive a gc
since they will contain pointers into rtxes and trees).
3) And no clients that actually wanted to do this!! I could use as an
example one of your favorite passes, tree-vrp. The current double-int
could have been a medium lifetime container since it has a smaller
footprint, but in fact tree-vrp converts those double-ints back into trees
for medium storage. Why? Because it needs the other fields of a tree-cst
to store the entire state. Wide-ints also "suffer" this problem. Their
only state is the data and the three length fields. They have no type
and none of the other tree info, so the most obvious client for a medium
lifetime object is really not going to be a good match even if you "solve
the storage problem".

The fact is that wide-ints are an excellent short term storage class that
can be very quickly converted into our two long term storage classes. Your
proposal requires a lot of work, will not be easy to use and as far as I
can see has no payoff on the horizon. It could be that there could be
future clients for a medium lifetime value, but asking for this with no
clients in hand is really beyond the scope of a reasonable review.

I remind you that the purpose of these patches is to solve problems that
exist in the current compiler that we have papered over for years. If
someone needs wide-ints in some way that is not foreseen then they can
change it.

kenny On 11/26/2012 11:30 AM, Richard Biener wrote: > On Mon, Nov 26, 2012 at 5:03 PM, Kenneth Zadeck > <zadeck@naturalbridge.com> wrote: >> On 11/26/2012 10:03 AM, Richard Biener wrote: >>> On Mon, Nov 5, 2012 at 2:59 PM, Kenneth Zadeck <zadeck@naturalbridge.com> >>> wrote: >>>> On 11/04/2012 11:54 AM, Richard Biener wrote: >>>>> On Thu, Nov 1, 2012 at 2:10 PM, Richard Sandiford >>>>> <rdsandiford@googlemail.com> wrote: >>>>>> Kenneth Zadeck <zadeck@naturalbridge.com> writes: >>>>>>> I would like you to respond to at least point 1 of this email. In it >>>>>>> there is code from the rtl level that was written twice, once for the >>>>>>> case when the size of the mode is less than the size of a HWI and once >>>>>>> for the case where the size of the mode is less that 2 HWIs. >>>>>>> >>>>>>> my patch changes this to one instance of the code that works no matter >>>>>>> how large the data passed to it is. >>>>>>> >>>>>>> you have made a specific requirement for wide int to be a template >>>>>>> that >>>>>>> can be instantiated in several sizes, one for 1 HWI, one for 2 HWI. >>>>>>> I >>>>>>> would like to know how this particular fragment is to be rewritten in >>>>>>> this model? It seems that I would have to retain the structure where >>>>>>> there is one version of the code for each size that the template is >>>>>>> instantiated. >>>>>> I think richi's argument was that wide_int should be split into two. >>>>>> There should be a "bare-metal" class that just has a length and HWIs, >>>>>> and the main wide_int class should be an extension on top of that >>>>>> that does things to a bit precision instead. Presumably with some >>>>>> template magic so that the length (number of HWIs) is a constant for: >>>>>> >>>>>> typedef foo<2> double_int; >>>>>> >>>>>> and a variable for wide_int (because in wide_int the length would be >>>>>> the number of significant HWIs rather than the size of the underlying >>>>>> array). 
wide_int would also record the precision and apply it after >>>>>> the full HWI operation. >>>>>> >>>>>> So the wide_int class would still provide "as wide as we need" >>>>>> arithmetic, >>>>>> as in your rtl patch. I don't think he was objecting to that. >>>>> That summarizes one part of my complaints / suggestions correctly. In >>>>> other >>>>> mails I suggested to not make it a template but a constant over object >>>>> lifetime >>>>> 'bitsize' (or maxlen) field. Both suggestions likely require more >>>>> thought >>>>> than >>>>> I put into them. The main reason is that with C++ you can abstract from >>>>> where >>>>> wide-int information pieces are stored and thus use the arithmetic / >>>>> operation >>>>> workers without copying the (source) "wide-int" objects. Thus you >>>>> should >>>>> be able to write adaptors for double-int storage, tree or RTX storage. >>>> We had considered something along these lines and rejected it. I am not >>>> really opposed to doing something like this, but it is not an obvious >>>> winning idea and is likely not to be a good idea. Here was our thought >>>> process: >>>> >>>> if you abstract away the storage inside a wide int, then you should be >>>> able >>>> to copy a pointer to the block of data from either the rtl level integer >>>> constant or the tree level one into the wide int. It is certainly true >>>> that making a wide_int from one of these is an extremely common operation >>>> and doing this would avoid those copies. >>>> >>>> However, this causes two problems: >>>> 1) Mike's first cut at the CONST_WIDE_INT did two ggc allocations to >>>> make >>>> the object. it created the base object and then it allocated the array. >>>> Richard S noticed that we could just allocate one CONST_WIDE_INT that had >>>> the array in it. Doing it this way saves one ggc allocation and one >>>> indirection when accessing the data within the CONST_WIDE_INT. Our plan >>>> is >>>> to use the same trick at the tree level. 
So to avoid the copying, you >>>> seem >>>> to have to have a more expensive rep for CONST_WIDE_INT and INT_CST. >>> I did not propose having a pointer to the data in the RTX or tree int. >>> Just >>> the short-lived wide-ints (which are on the stack) would have a pointer to >>> the data - which can then obviously point into the RTX and tree data. >> There is the issue then what if some wide-ints are not short lived. It makes >> me nervous to create internal pointers to gc ed memory. > I thought they were all short-lived. > >>>> 2) You are now stuck either ggcing the storage inside a wide_int when >>>> they >>>> are created as part of an expression or you have to play some game to >>>> represent the two different storage plans inside of wide_int. >>> Hm? wide-ints are short-lived and thus never live across a garbage >>> collection >>> point. We create non-GCed objects pointing to GCed objects all the time >>> and everywhere this way. >> Again, this makes me nervous but it could be done. However, it does mean >> that now the wide ints that are not created from rtxes or trees will be more >> expensive because they are not going to get their storage "for free", they >> are going to alloca it. > No, those would simply use the embedded storage model. > >> however, it still is not clear, given that 99% of the wide ints are going to >> fit in a single hwi, that this would be a noticeable win. > Currently even if they fit into a HWI you will still allocate 4 times the > larges integer mode size. You say that doesn't matter because they > are short-lived, but I say it does matter because not all of them are > short-lived enough. If 99% fit in a HWI why allocate 4 times the > largest integer mode size in 99% of the cases? > >>>> Clearly this >>>> is where you think that we should be going by suggesting that we abstract >>>> away the internal storage. 
However, this comes at a price: what is >>>> currently an array access in my patches would (i believe) become a >>>> function >>>> call. >>> No, the workers (that perform the array accesses) will simply get >>> a pointer to the first data element. Then whether it's embedded or >>> external is of no interest to them. >> so is your plan that the wide int constructors from rtx or tree would just >> copy the pointer to the array on top of the array that is otherwise >> allocated on the stack? I can easily do this. But as i said, the gain >> seems quite small. >> >> And of course, going the other way still does need the copy. > The proposal was to template wide_int on a storage model, the embedded > one would work as-is (embedding 4 times largest integer mode), the > external one would have a pointer to data. All functions that return a > wide_int produce a wide_int with the embedded model. To avoid > the function call penalty you described the storage model provides > a way to get a pointer to the first element and the templated operations > simply dispatch to a worker that takes this pointer to the first element > (as the storage model is designed as a template its abstraction is going > to be optimized away by means of inlining). > > Richard. > >>>> From a performance point of view, i believe that this is a non >>>> starter. If you can figure out how to design this so that it is not a >>>> function call, i would consider this a viable option. >>>> >>>> On the other side of this you are clearly correct that we are copying the >>>> data when we are making wide ints from INT_CSTs or CONST_WIDE_INTs. >>>> But >>>> this is why we represent data inside of the wide_ints, the INT_CSTs and >>>> the >>>> CONST_WIDE_INTs in a compressed form. Even with very big types, which >>>> are >>>> generally rare, the constants them selves are very small. 
So the copy >>>> operation is a loop that almost always copies one element, even with >>>> tree-vrp which doubles the sizes of every type. >>>> >>>> There is the third option which is that the storage inside the wide int >>>> is >>>> just ggced storage. We rejected this because of the functional nature of >>>> wide-ints. There are zillions created, they can be stack allocated, >>>> and >>>> they last for very short periods of time. >>> Of course - GCing wide-ints is a non-starter. >>> >>>>>> As is probably obvious, I don't agree FWIW. It seems like an >>>>>> unnecessary >>>>>> complication without any clear use. Especially since the number of >>>>> Maybe the double_int typedef is without any clear use. Properly >>>>> abstracting from the storage / information providers will save >>>>> compile-time, memory and code though. I don't see that any thought >>>>> was spent on how to avoid excessive copying or dealing with >>>>> long(er)-lived objects and their storage needs. >>>> I actually disagree. Wide ints can use a bloated amount of storage >>>> because they are designed to be very short lived and very low cost >>>> objects >>>> that are stack allocated. For long term storage, there is INT_CST at >>>> the >>>> tree level and CONST_WIDE_INT at the rtl level. Those use a very compact >>>> storage model. The copying entailed is only a small part of the overall >>>> performance. >>> Well, but both trees and RTXen are not viable for short-lived things >>> because >>> the are GCed! double-ints were suitable for this kind of stuff because >>> the also have a moderate size. With wide-ints size becomes a problem >>> (or GC, if you instead use trees or RTXen). >>> >>>> Everything that you are suggesting along these lines is adding to the >>>> weight >>>> of a wide-int object. >>> On the contrary - it lessens their weight (with external already >>> existing storage) >>> or does not do anything to it (with the embedded storage). 
>>> >>>> You have to understand there will be many more >>>> wide-ints created in a normal compilation than were ever created with >>>> double-int. This is because the rtl level had no object like this at >>>> all >>>> and at the tree level, many of the places that should have used double >>>> int, >>>> short cut the code and only did the transformations if the types fit in a >>>> HWI. >>> Your argument shows that the copy-in/out from tree/RTX to/from wide-int >>> will become a very frequent operation and thus it is worth optimizing it. >>> >>>> This is why we are extremely defensive about this issue. We really did >>>> think a lot about it. >>> I'm sure you did. >>> >>> Richard. >> ^ permalink raw reply [flat|nested] 59+ messages in thread
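Kenneth's argument that the copy from an INT_CST or CONST_WIDE_INT is "a loop that almost always copies one element" rests on the compressed form mentioned in the quoted text. A rough sketch of that idea follows, assuming (this is an assumption made here, not something the thread spells out) that compression means dropping top words that are only the sign extension of the word below them. The names `hwi`, `significant_len` and `copy_compressed` are hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

typedef std::int64_t hwi;  // hypothetical stand-in for HOST_WIDE_INT

// Number of words that actually carry information: a top word is
// dropped while it merely repeats the sign of the word below it.
inline std::size_t significant_len(const hwi *val, std::size_t len) {
  while (len > 1) {
    hwi sign_fill = val[len - 2] < 0 ? hwi(-1) : hwi(0);
    if (val[len - 1] != sign_fill)
      break;
    --len;
  }
  return len;
}

// Copy-in: only the significant words are copied, so a small constant
// in a very wide type still costs a single word.
inline std::size_t copy_compressed(hwi *dst, const hwi *src,
                                   std::size_t len) {
  std::size_t n = significant_len(src, len);
  for (std::size_t i = 0; i < n; ++i)
    dst[i] = src[i];
  return n;
}
```

Under this assumption the value 5 stored in a four-word type compresses to one word, while 2^64 - 1 still needs two words because its top word is not the sign extension of the low word. That is consistent with the claim that most copies touch one element, though it says nothing about how much stack the embedded buffer itself occupies, which is the other half of the disagreement.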
* Re: GCC 4.8.0 Status Report (2012-10-29), Stage 1 to end soon 2012-11-27 0:06 ` Kenneth Zadeck @ 2012-11-27 10:03 ` Richard Biener 2012-11-27 13:03 ` Kenneth Zadeck 0 siblings, 1 reply; 59+ messages in thread From: Richard Biener @ 2012-11-27 10:03 UTC (permalink / raw) To: Kenneth Zadeck; +Cc: Jakub Jelinek, gcc, gcc-patches, rdsandiford, Mike Stump

On Tue, Nov 27, 2012 at 1:06 AM, Kenneth Zadeck <zadeck@naturalbridge.com> wrote:
> Richard,
>
> I spent a good part of the afternoon talking to Mike about this. He is on
> the C++ standards committee and is a much more seasoned C++ programmer than
> I am.
>
> He convinced me that with a large amount of engineering and C++
> "foolishness" it was indeed possible to get your proposal to POSSIBLY
> work as well as what we did.
>
> But now the question is why would anyone want to do this?
>
> At the very least you are talking about instantiating two instances of
> wide-ints, one for the stack allocated uses and one for the places where we
> just move a pointer from the tree or the rtx. Then you are talking about
> creating connectors so that the stack allocated functions can take
> parameters of the pointer version and vice versa.
>
> Then there is the issue that rather than just saying that something is a
> wide int, the programmer is going to have to track its origin. In
> particular, where in the code right now I say
>
>     wide_int foo = wide_int::from_rtx (r1);
>     wide_int bar = wide_int::from_rtx (r2) + foo;
>
> now I would have to say
>
>     wide_int_ptr foo = wide_int_ptr::from_rtx (r1);
>     wide_int_stack bar = wide_int_ptr::from_rtx (r2) + foo;

No, you'd say

    wide_int foo = wide_int::from_rtx (r1);

and the static, non-templated from_rtx method would automagically
return (always!) a "wide_int_ptr" kind. The initialization then would
use the assignment operator that mediates between wide_int and
"wide_int_ptr", doing the copying.

The user should get a 'stack' kind by default when specifying wide_int,
like implemented with

    struct wide_int_storage_stack;
    struct wide_int_storage_ptr;

    template <class storage = wide_int_storage_stack>
    class wide_int : public storage
    {
      ...
      static wide_int <wide_int_storage_ptr> from_rtx (rtx);
    }

The whole point of the exercise is to make from_rtx and from_tree avoid
the copying (and excessive stack space allocation) for the rvalue case
like in

    wide_int res = wide_int::from_rtx (x) + 1;

If you save the result into a wide_int temporary first then you are lost
of course (modulo some magic GCC optimization being able to elide
the copy somehow).

And of course for code like VRP that keeps a lattice of wide_ints to be
able to reduce its footprint by using ptr storage and explicit allocations
(that's a secondary concern, of course). And for VRP to specify that
it needs more than the otherwise needed MAX_INT_MODE_SIZE.
ptr storage would not have this arbitrary limitation, only embedded
storage (should) have.

> Then when I want to call some function using a wide_int ref, that function
> now must be either overloaded to take both or I have to choose one of the
> two instantiations (presumably based on which is going to be more common)
> and just have the compiler fix up everything (which it is likely to do).

Nope, they'd be

    class wide_int ...
    {
      template <class storage1, class storage2>
      wide_int operator+(wide_int <storage1> a, wide_int<storage2> b)
      {
        return wide_int::plus_worker (a.precision, a. ...., a.get_storage_ptr (),
                                      b.precision, ..., b.get_storage_ptr ());
      }

> And so what is the payoff:
> 1) No one except the C++ elite is going to understand the code. The rest of
> the community will hate me and curse the ground that I walk on.

Maybe for the implementation - but look at hash-table and vec ... not for
usage certainly.

> 2) I will end up with a version of wide-int that can be used as a medium
> life container (where I define medium life as not allowed to survive a gc
> since they will contain pointers into rtxes and trees).
> 3) And no clients that actually wanted to do this!! I could use as an
> example one of your favorite passes, tree-vrp. The current double-int
> could have been a medium lifetime container since it has a smaller
> footprint, but in fact tree-vrp converts those double-ints back into trees
> for medium storage. Why? Because it needs the other fields of a tree-cst
> to store the entire state. Wide-ints also "suffer" this problem. Their
> only state is the data and the three length fields. They have no type
> and none of the other tree info, so the most obvious client for a medium
> lifetime object is really not going to be a good match even if you "solve
> the storage problem".
>
> The fact is that wide-ints are an excellent short term storage class that
> can be very quickly converted into our two long term storage classes. Your
> proposal requires a lot of work, will not be easy to use and as far as I
> can see has no payoff on the horizon. It could be that there could be
> future clients for a medium lifetime value, but asking for this with no
> clients in hand is really beyond the scope of a reasonable review.
>
> I remind you that the purpose of these patches is to solve problems that
> exist in the current compiler that we have papered over for years. If
> someone needs wide-ints in some way that is not foreseen then they can
> change it.

The patches introduce a lot more temporary wide-ints (your words) and
at the same time make construction of them from tree / rtx very expensive,
both stack space and compile-time wise. Look at how we, for example,
compute TREE_INT_CST + 1: int_cst_binop internally uses double_ints
for the computation and then instantiates a new tree for holding the result.
Now we'd use wide_ints for this, requiring totally unnecessary copying.
Why not try to avoid that in the first place? And try to avoid making
wide_ints 4 times as large as really necessary just for the sake of VRP!
(VRP should have a way to say "_I_ want larger wide_ints", without putting
this burden on all other users.)

Richard.

> kenny
>
> On 11/26/2012 11:30 AM, Richard Biener wrote:
>> On Mon, Nov 26, 2012 at 5:03 PM, Kenneth Zadeck <zadeck@naturalbridge.com> wrote:
>>> On 11/26/2012 10:03 AM, Richard Biener wrote:
>>>> On Mon, Nov 5, 2012 at 2:59 PM, Kenneth Zadeck <zadeck@naturalbridge.com> wrote:
>>>>> On 11/04/2012 11:54 AM, Richard Biener wrote:
>>>>>> On Thu, Nov 1, 2012 at 2:10 PM, Richard Sandiford <rdsandiford@googlemail.com> wrote:
>>>>>>> Kenneth Zadeck <zadeck@naturalbridge.com> writes:
>>>>>>>> I would like you to respond to at least point 1 of this email. In
>>>>>>>> it there is code from the rtl level that was written twice, once
>>>>>>>> for the case when the size of the mode is less than the size of a
>>>>>>>> HWI and once for the case where the size of the mode is less than
>>>>>>>> 2 HWIs.
>>>>>>>>
>>>>>>>> my patch changes this to one instance of the code that works no
>>>>>>>> matter how large the data passed to it is.
>>>>>>>>
>>>>>>>> you have made a specific requirement for wide int to be a template
>>>>>>>> that can be instantiated in several sizes, one for 1 HWI, one for
>>>>>>>> 2 HWI. I would like to know how this particular fragment is to be
>>>>>>>> rewritten in this model? It seems that I would have to retain the
>>>>>>>> structure where there is one version of the code for each size
>>>>>>>> that the template is instantiated.
>>>>>>>
>>>>>>> I think richi's argument was that wide_int should be split into two.
>>>>>>> There should be a "bare-metal" class that just has a length and HWIs, >>>>>>> and the main wide_int class should be an extension on top of that >>>>>>> that does things to a bit precision instead. Presumably with some >>>>>>> template magic so that the length (number of HWIs) is a constant for: >>>>>>> >>>>>>> typedef foo<2> double_int; >>>>>>> >>>>>>> and a variable for wide_int (because in wide_int the length would be >>>>>>> the number of significant HWIs rather than the size of the underlying >>>>>>> array). wide_int would also record the precision and apply it after >>>>>>> the full HWI operation. >>>>>>> >>>>>>> So the wide_int class would still provide "as wide as we need" >>>>>>> arithmetic, >>>>>>> as in your rtl patch. I don't think he was objecting to that. >>>>>> >>>>>> That summarizes one part of my complaints / suggestions correctly. In >>>>>> other >>>>>> mails I suggested to not make it a template but a constant over object >>>>>> lifetime >>>>>> 'bitsize' (or maxlen) field. Both suggestions likely require more >>>>>> thought >>>>>> than >>>>>> I put into them. The main reason is that with C++ you can abstract >>>>>> from >>>>>> where >>>>>> wide-int information pieces are stored and thus use the arithmetic / >>>>>> operation >>>>>> workers without copying the (source) "wide-int" objects. Thus you >>>>>> should >>>>>> be able to write adaptors for double-int storage, tree or RTX storage. >>>>> >>>>> We had considered something along these lines and rejected it. I am >>>>> not >>>>> really opposed to doing something like this, but it is not an obvious >>>>> winning idea and is likely not to be a good idea. Here was our >>>>> thought >>>>> process: >>>>> >>>>> if you abstract away the storage inside a wide int, then you should be >>>>> able >>>>> to copy a pointer to the block of data from either the rtl level >>>>> integer >>>>> constant or the tree level one into the wide int. 
It is certainly >>>>> true >>>>> that making a wide_int from one of these is an extremely common >>>>> operation >>>>> and doing this would avoid those copies. >>>>> >>>>> However, this causes two problems: >>>>> 1) Mike's first cut at the CONST_WIDE_INT did two ggc allocations to >>>>> make >>>>> the object. it created the base object and then it allocated the >>>>> array. >>>>> Richard S noticed that we could just allocate one CONST_WIDE_INT that >>>>> had >>>>> the array in it. Doing it this way saves one ggc allocation and one >>>>> indirection when accessing the data within the CONST_WIDE_INT. Our >>>>> plan >>>>> is >>>>> to use the same trick at the tree level. So to avoid the copying, you >>>>> seem >>>>> to have to have a more expensive rep for CONST_WIDE_INT and INT_CST. >>>> >>>> I did not propose having a pointer to the data in the RTX or tree int. >>>> Just >>>> the short-lived wide-ints (which are on the stack) would have a pointer >>>> to >>>> the data - which can then obviously point into the RTX and tree data. >>> >>> There is the issue then what if some wide-ints are not short lived. It >>> makes >>> me nervous to create internal pointers to gc ed memory. >> >> I thought they were all short-lived. >> >>>>> 2) You are now stuck either ggcing the storage inside a wide_int when >>>>> they >>>>> are created as part of an expression or you have to play some game to >>>>> represent the two different storage plans inside of wide_int. >>>> >>>> Hm? wide-ints are short-lived and thus never live across a garbage >>>> collection >>>> point. We create non-GCed objects pointing to GCed objects all the time >>>> and everywhere this way. >>> >>> Again, this makes me nervous but it could be done. However, it does mean >>> that now the wide ints that are not created from rtxes or trees will be >>> more >>> expensive because they are not going to get their storage "for free", >>> they >>> are going to alloca it. 
>> >> No, those would simply use the embedded storage model. >> >>> however, it still is not clear, given that 99% of the wide ints are going >>> to >>> fit in a single hwi, that this would be a noticeable win. >> >> Currently even if they fit into a HWI you will still allocate 4 times the >> larges integer mode size. You say that doesn't matter because they >> are short-lived, but I say it does matter because not all of them are >> short-lived enough. If 99% fit in a HWI why allocate 4 times the >> largest integer mode size in 99% of the cases? >> >>>>> Clearly this >>>>> is where you think that we should be going by suggesting that we >>>>> abstract >>>>> away the internal storage. However, this comes at a price: what is >>>>> currently an array access in my patches would (i believe) become a >>>>> function >>>>> call. >>>> >>>> No, the workers (that perform the array accesses) will simply get >>>> a pointer to the first data element. Then whether it's embedded or >>>> external is of no interest to them. >>> >>> so is your plan that the wide int constructors from rtx or tree would >>> just >>> copy the pointer to the array on top of the array that is otherwise >>> allocated on the stack? I can easily do this. But as i said, the >>> gain >>> seems quite small. >>> >>> And of course, going the other way still does need the copy. >> >> The proposal was to template wide_int on a storage model, the embedded >> one would work as-is (embedding 4 times largest integer mode), the >> external one would have a pointer to data. All functions that return a >> wide_int produce a wide_int with the embedded model. To avoid >> the function call penalty you described the storage model provides >> a way to get a pointer to the first element and the templated operations >> simply dispatch to a worker that takes this pointer to the first element >> (as the storage model is designed as a template its abstraction is going >> to be optimized away by means of inlining). >> >> Richard. 
>> >>>>> From a performance point of view, i believe that this is a non >>>>> starter. If you can figure out how to design this so that it is not a >>>>> function call, i would consider this a viable option. >>>>> >>>>> On the other side of this you are clearly correct that we are copying >>>>> the >>>>> data when we are making wide ints from INT_CSTs or CONST_WIDE_INTs. >>>>> But >>>>> this is why we represent data inside of the wide_ints, the INT_CSTs and >>>>> the >>>>> CONST_WIDE_INTs in a compressed form. Even with very big types, which >>>>> are >>>>> generally rare, the constants them selves are very small. So the copy >>>>> operation is a loop that almost always copies one element, even with >>>>> tree-vrp which doubles the sizes of every type. >>>>> >>>>> There is the third option which is that the storage inside the wide int >>>>> is >>>>> just ggced storage. We rejected this because of the functional nature >>>>> of >>>>> wide-ints. There are zillions created, they can be stack allocated, >>>>> and >>>>> they last for very short periods of time. >>>> >>>> Of course - GCing wide-ints is a non-starter. >>>> >>>>>>> As is probably obvious, I don't agree FWIW. It seems like an >>>>>>> unnecessary >>>>>>> complication without any clear use. Especially since the number of >>>>>> >>>>>> Maybe the double_int typedef is without any clear use. Properly >>>>>> abstracting from the storage / information providers will save >>>>>> compile-time, memory and code though. I don't see that any thought >>>>>> was spent on how to avoid excessive copying or dealing with >>>>>> long(er)-lived objects and their storage needs. >>>>> >>>>> I actually disagree. Wide ints can use a bloated amount of storage >>>>> because they are designed to be very short lived and very low cost >>>>> objects >>>>> that are stack allocated. For long term storage, there is INT_CST at >>>>> the >>>>> tree level and CONST_WIDE_INT at the rtl level. Those use a very >>>>> compact >>>>> storage model. 
The copying entailed is only a small part of the >>>>> overall >>>>> performance. >>>> >>>> Well, but both trees and RTXen are not viable for short-lived things >>>> because >>>> the are GCed! double-ints were suitable for this kind of stuff because >>>> the also have a moderate size. With wide-ints size becomes a problem >>>> (or GC, if you instead use trees or RTXen). >>>> >>>>> Everything that you are suggesting along these lines is adding to the >>>>> weight >>>>> of a wide-int object. >>>> >>>> On the contrary - it lessens their weight (with external already >>>> existing storage) >>>> or does not do anything to it (with the embedded storage). >>>> >>>>> You have to understand there will be many more >>>>> wide-ints created in a normal compilation than were ever created with >>>>> double-int. This is because the rtl level had no object like this at >>>>> all >>>>> and at the tree level, many of the places that should have used double >>>>> int, >>>>> short cut the code and only did the transformations if the types fit in >>>>> a >>>>> HWI. >>>> >>>> Your argument shows that the copy-in/out from tree/RTX to/from wide-int >>>> will become a very frequent operation and thus it is worth optimizing >>>> it. >>>> >>>>> This is why we are extremely defensive about this issue. We really >>>>> did >>>>> think a lot about it. >>>> >>>> I'm sure you did. >>>> >>>> Richard. >>> >>> > ^ permalink raw reply [flat|nested] 59+ messages in thread
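The copy-cost argument in the message above (the copy is "a loop that almost always copies one element") can be sketched as follows. This is only an illustration under stated assumptions, not code from the patches: it assumes a compressed form in which a value is stored as 64-bit blocks, least significant first, with high blocks that are pure sign extension dropped. The names `hwi`, `MAX_BLOCKS`, `canonical_len` and `stack_int` are hypothetical.

```cpp
#include <cassert>
#include <cstdint>

typedef int64_t hwi;                 // stand-in for HOST_WIDE_INT (assumed 64 bits)
const unsigned MAX_BLOCKS = 8;       // hypothetical upper bound on blocks per value

// Number of blocks needed once redundant sign-extension blocks are dropped.
// A top block is redundant if it equals the sign extension (0 or -1) of the
// block below it.
unsigned canonical_len (const hwi *val, unsigned len)
{
  while (len > 1)
    {
      hwi top = val[len - 1];
      hwi sign_of_next = val[len - 2] < 0 ? -1 : 0;
      if (top != sign_of_next)
        break;
      len--;
    }
  return len;
}

// Short-lived, stack-allocated value.  The array is deliberately not
// cleared; only the live blocks are ever written.
struct stack_int
{
  unsigned len;
  hwi val[MAX_BLOCKS];

  stack_int (const hwi *src, unsigned src_len)
  {
    len = canonical_len (src, src_len);
    for (unsigned i = 0; i < len; i++)   // usually a single iteration
      val[i] = src[i];
  }
};
```

Under these assumptions a 256-bit constant such as 5 is held in four blocks at most, but `stack_int` copies just one of them. The reserved stack space is still `MAX_BLOCKS` blocks, which is the part of the cost Richard objects to.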
* Re: GCC 4.8.0 Status Report (2012-10-29), Stage 1 to end soon 2012-11-27 10:03 ` Richard Biener @ 2012-11-27 13:03 ` Kenneth Zadeck 0 siblings, 0 replies; 59+ messages in thread From: Kenneth Zadeck @ 2012-11-27 13:03 UTC (permalink / raw) To: Richard Biener; +Cc: Jakub Jelinek, gcc, gcc-patches, rdsandiford, Mike Stump i will discuss this with mike when he wakes up. he lives on the west pole so that will not be until after you go to bed. the one point that i will take exception to is the claim that the copying operation is, in practice, any more expensive in time than the pointer copy. I never bother to initialize the storage in the array, i only copy the elements that are live. This is almost always 1 hwi because either most types are small or most constants of large types compress to 1 hwi. So even if a compilation does a zillion ::from_trees, you will most likely never see the difference in time. kenny On 11/27/2012 05:03 AM, Richard Biener wrote: > On Tue, Nov 27, 2012 at 1:06 AM, Kenneth Zadeck > <zadeck@naturalbridge.com> wrote: >> Richard, >> >> I spent a good part of the afternoon talking to Mike about this. He is on >> the c++ standards committee and is a much more seasoned c++ programmer than >> I am. >> >> He convinced me that with a large amount of engineering and c++ >> "foolishness" that it was indeed possible to get your proposal to POSSIBLY >> work as well as what we did. >> >> But now the question is why would anyone want to do this? >> >> At the very least you are talking about instantiating two instances of >> wide-ints, one for the stack allocated uses and one for the places where we >> just move a pointer from the tree or the rtx. Then you are talking about >> creating connectors so that the stack allocated functions can take >> parameters of the pointer version and vice versa. >> >> Then there is the issue that rather than just saying that something is a >> wide int, that the programmer is going to have to track its origin.
In >> particular, where in the code right now i say. >> >> wide_int foo = wide_int::from_rtx (r1); >> wide_int bar = wide_int::from_rtx (r2) + foo; >> >> now i would have to say >> >> wide_int_ptr foo = wide_int_ptr::from_rtx (r1); >> wide_int_stack bar = wide_int_ptr::from_rtx (r2) + foo; > No, you'd say > > wide_int foo = wide_int::from_rtx (r1); > > and the static, non-templated from_rtx method would automagically > return (always!) a "wide_int_ptr" kind. The initialization then would > use the assignment operator that mediates between wide_int and > "wide_int_ptr", doing the copying. > > The user should get a 'stack' kind by default when specifying wide_int, > like implemented with > > struct wide_int_storage_stack; > struct wide_int_storage_ptr; > > template <class storage = wide_int_storage_stack> > class wide_int : public storage > { > ... > static wide_int <wide_int_storage_ptr> from_rtx (rtx); > } > > the whole point of the exercise is to make from_rtx and from_tree avoid > the copying (and excessive stack space allocation) for the rvalue case > like in > > wide_int res = wide_int::from_rtx (x) + 1; > > if you save the result into a wide_int temporary first then you are lost > of course (modulo some magic GCC optimization being able to elide > the copy somehow). > > And of course for code like VRP that keeps a lattice of wide_ints to > be able to reduce its footprint by using ptr storage and explicit allocations > (that's a secondary concern, of course). And for VRP to specify that > it needs more than the otherwise needed MAX_INT_MODE_SIZE. > ptr storage would not have this arbitrary limitation, only embedded > storage (should) have. > >> then when i want to call some function using a wide_int ref that function >> now must be either overloaded to take both or i have to choose one of the >> two instantiations (presumably based on which is going to be more common) >> and just have the compiler fix up everything (which it is likely to do). 
> Nope, they'd be > > class wide_int ... > { > template <class storage1, class storage2> > wide_int operator+(wide_int <storage1> a, wide_int<storage2> b) > { > return wide_int::plus_worker (a.precision, a. ...., a.get_storage_ptr (), > b.precision, ..., > b.get_storage_ptr ()); > } > > >> And so what is the payoff: >> 1) No one except the c++ elite is going to understand the code. The rest of >> the community will hate me and curse the ground that i walk on. > Maybe for the implementation - but look at hash-table and vec ... not for > usage certainly. > >> 2) I will end up with a version of wide-int that can be used as a medium >> life container (where i define medium life as not allowed to survive a gc >> since they will contain pointers into rtxes and trees.) >> 3) And no clients that actually wanted to do this!! I could use as an >> example one of your favorite passes, tree-vrp. The current double-int >> could have been a medium lifetime container since it has a smaller >> footprint, but in fact tree-vrp converts those double-ints back into trees >> for medium storage. Why? Because it needs the other fields of a tree-cst >> to store the entire state. Wide-ints also "suffer" this problem. Their >> only state is the data and the three length fields. They have no type >> and none of the other tree info so the most obvious client for a medium >> lifetime object is really not going to be a good match even if you "solve >> the storage problem". >> >> The fact is that wide-ints are an excellent short term storage class that >> can be very quickly converted into our two long term storage classes. Your >> proposal requires a lot of work, will not be easy to use and as far as i >> can see has no payoff on the horizon. It could be that there could be >> future clients for a medium lifetime value, but asking for this with no >> clients in hand is really beyond the scope of a reasonable review.
>> >> I remind you that the purpose of these patches is to solve problems that >> exist in the current compiler that we have papered over for years. If >> someone needs wide-ints in some way that is not foreseen then they can >> change it. > The patches introduce a lot more temporary wide-ints (your words) and > at the same time makes construction of them from tree / rtx very expensive > both stack space and compile-time wise. Look at how we for example > compute TREE_INT_CST + 1 - int_cst_binop internally uses double_ints > for the computation and then instantiates a new tree for holding the result. > Now we'd use wide_ints for this requring totally unnecessary copying. > Why not in the first place try to avoid that. And try to avoid making > wide_ints 4 times as large as really necessary just for the sake of VRP! > (VRP should have a way to say "_I_ want larger wide_ints", without putting > this burden on all other users). > > Richard. > >> kenny >> >> >> On 11/26/2012 11:30 AM, Richard Biener wrote: >>> On Mon, Nov 26, 2012 at 5:03 PM, Kenneth Zadeck >>> <zadeck@naturalbridge.com> wrote: >>>> On 11/26/2012 10:03 AM, Richard Biener wrote: >>>>> On Mon, Nov 5, 2012 at 2:59 PM, Kenneth Zadeck >>>>> <zadeck@naturalbridge.com> >>>>> wrote: >>>>>> On 11/04/2012 11:54 AM, Richard Biener wrote: >>>>>>> On Thu, Nov 1, 2012 at 2:10 PM, Richard Sandiford >>>>>>> <rdsandiford@googlemail.com> wrote: >>>>>>>> Kenneth Zadeck <zadeck@naturalbridge.com> writes: >>>>>>>>> I would like you to respond to at least point 1 of this email. In >>>>>>>>> it >>>>>>>>> there is code from the rtl level that was written twice, once for >>>>>>>>> the >>>>>>>>> case when the size of the mode is less than the size of a HWI and >>>>>>>>> once >>>>>>>>> for the case where the size of the mode is less that 2 HWIs. >>>>>>>>> >>>>>>>>> my patch changes this to one instance of the code that works no >>>>>>>>> matter >>>>>>>>> how large the data passed to it is. 
>>>>>>>>> >>>>>>>>> you have made a specific requirement for wide int to be a template >>>>>>>>> that >>>>>>>>> can be instantiated in several sizes, one for 1 HWI, one for 2 HWI. >>>>>>>>> I >>>>>>>>> would like to know how this particular fragment is to be rewritten >>>>>>>>> in >>>>>>>>> this model? It seems that I would have to retain the structure >>>>>>>>> where >>>>>>>>> there is one version of the code for each size that the template is >>>>>>>>> instantiated. >>>>>>>> I think richi's argument was that wide_int should be split into two. >>>>>>>> There should be a "bare-metal" class that just has a length and HWIs, >>>>>>>> and the main wide_int class should be an extension on top of that >>>>>>>> that does things to a bit precision instead. Presumably with some >>>>>>>> template magic so that the length (number of HWIs) is a constant for: >>>>>>>> >>>>>>>> typedef foo<2> double_int; >>>>>>>> >>>>>>>> and a variable for wide_int (because in wide_int the length would be >>>>>>>> the number of significant HWIs rather than the size of the underlying >>>>>>>> array). wide_int would also record the precision and apply it after >>>>>>>> the full HWI operation. >>>>>>>> >>>>>>>> So the wide_int class would still provide "as wide as we need" >>>>>>>> arithmetic, >>>>>>>> as in your rtl patch. I don't think he was objecting to that. >>>>>>> That summarizes one part of my complaints / suggestions correctly. In >>>>>>> other >>>>>>> mails I suggested to not make it a template but a constant over object >>>>>>> lifetime >>>>>>> 'bitsize' (or maxlen) field. Both suggestions likely require more >>>>>>> thought >>>>>>> than >>>>>>> I put into them. The main reason is that with C++ you can abstract >>>>>>> from >>>>>>> where >>>>>>> wide-int information pieces are stored and thus use the arithmetic / >>>>>>> operation >>>>>>> workers without copying the (source) "wide-int" objects. 
Thus you >>>>>>> should >>>>>>> be able to write adaptors for double-int storage, tree or RTX storage. >>>>>> We had considered something along these lines and rejected it. I am >>>>>> not >>>>>> really opposed to doing something like this, but it is not an obvious >>>>>> winning idea and is likely not to be a good idea. Here was our >>>>>> thought >>>>>> process: >>>>>> >>>>>> if you abstract away the storage inside a wide int, then you should be >>>>>> able >>>>>> to copy a pointer to the block of data from either the rtl level >>>>>> integer >>>>>> constant or the tree level one into the wide int. It is certainly >>>>>> true >>>>>> that making a wide_int from one of these is an extremely common >>>>>> operation >>>>>> and doing this would avoid those copies. >>>>>> >>>>>> However, this causes two problems: >>>>>> 1) Mike's first cut at the CONST_WIDE_INT did two ggc allocations to >>>>>> make >>>>>> the object. it created the base object and then it allocated the >>>>>> array. >>>>>> Richard S noticed that we could just allocate one CONST_WIDE_INT that >>>>>> had >>>>>> the array in it. Doing it this way saves one ggc allocation and one >>>>>> indirection when accessing the data within the CONST_WIDE_INT. Our >>>>>> plan >>>>>> is >>>>>> to use the same trick at the tree level. So to avoid the copying, you >>>>>> seem >>>>>> to have to have a more expensive rep for CONST_WIDE_INT and INT_CST. >>>>> I did not propose having a pointer to the data in the RTX or tree int. >>>>> Just >>>>> the short-lived wide-ints (which are on the stack) would have a pointer >>>>> to >>>>> the data - which can then obviously point into the RTX and tree data. >>>> There is the issue then what if some wide-ints are not short lived. It >>>> makes >>>> me nervous to create internal pointers to gc ed memory. >>> I thought they were all short-lived. 
>>> >>>>>> 2) You are now stuck either ggcing the storage inside a wide_int when >>>>>> they >>>>>> are created as part of an expression or you have to play some game to >>>>>> represent the two different storage plans inside of wide_int. >>>>> Hm? wide-ints are short-lived and thus never live across a garbage >>>>> collection >>>>> point. We create non-GCed objects pointing to GCed objects all the time >>>>> and everywhere this way. >>>> Again, this makes me nervous but it could be done. However, it does mean >>>> that now the wide ints that are not created from rtxes or trees will be >>>> more >>>> expensive because they are not going to get their storage "for free", >>>> they >>>> are going to alloca it. >>> No, those would simply use the embedded storage model. >>> >>>> however, it still is not clear, given that 99% of the wide ints are going >>>> to >>>> fit in a single hwi, that this would be a noticeable win. >>> Currently even if they fit into a HWI you will still allocate 4 times the >>> larges integer mode size. You say that doesn't matter because they >>> are short-lived, but I say it does matter because not all of them are >>> short-lived enough. If 99% fit in a HWI why allocate 4 times the >>> largest integer mode size in 99% of the cases? >>> >>>>>> Clearly this >>>>>> is where you think that we should be going by suggesting that we >>>>>> abstract >>>>>> away the internal storage. However, this comes at a price: what is >>>>>> currently an array access in my patches would (i believe) become a >>>>>> function >>>>>> call. >>>>> No, the workers (that perform the array accesses) will simply get >>>>> a pointer to the first data element. Then whether it's embedded or >>>>> external is of no interest to them. >>>> so is your plan that the wide int constructors from rtx or tree would >>>> just >>>> copy the pointer to the array on top of the array that is otherwise >>>> allocated on the stack? I can easily do this. 
But as i said, the >>>> gain >>>> seems quite small. >>>> >>>> And of course, going the other way still does need the copy. >>> The proposal was to template wide_int on a storage model, the embedded >>> one would work as-is (embedding 4 times largest integer mode), the >>> external one would have a pointer to data. All functions that return a >>> wide_int produce a wide_int with the embedded model. To avoid >>> the function call penalty you described the storage model provides >>> a way to get a pointer to the first element and the templated operations >>> simply dispatch to a worker that takes this pointer to the first element >>> (as the storage model is designed as a template its abstraction is going >>> to be optimized away by means of inlining). >>> >>> Richard. >>> >>>>>> From a performance point of view, i believe that this is a non >>>>>> starter. If you can figure out how to design this so that it is not a >>>>>> function call, i would consider this a viable option. >>>>>> >>>>>> On the other side of this you are clearly correct that we are copying >>>>>> the >>>>>> data when we are making wide ints from INT_CSTs or CONST_WIDE_INTs. >>>>>> But >>>>>> this is why we represent data inside of the wide_ints, the INT_CSTs and >>>>>> the >>>>>> CONST_WIDE_INTs in a compressed form. Even with very big types, which >>>>>> are >>>>>> generally rare, the constants them selves are very small. So the copy >>>>>> operation is a loop that almost always copies one element, even with >>>>>> tree-vrp which doubles the sizes of every type. >>>>>> >>>>>> There is the third option which is that the storage inside the wide int >>>>>> is >>>>>> just ggced storage. We rejected this because of the functional nature >>>>>> of >>>>>> wide-ints. There are zillions created, they can be stack allocated, >>>>>> and >>>>>> they last for very short periods of time. >>>>> Of course - GCing wide-ints is a non-starter. >>>>> >>>>>>>> As is probably obvious, I don't agree FWIW. 
It seems like an >>>>>>>> unnecessary >>>>>>>> complication without any clear use. Especially since the number of >>>>>>> Maybe the double_int typedef is without any clear use. Properly >>>>>>> abstracting from the storage / information providers will save >>>>>>> compile-time, memory and code though. I don't see that any thought >>>>>>> was spent on how to avoid excessive copying or dealing with >>>>>>> long(er)-lived objects and their storage needs. >>>>>> I actually disagree. Wide ints can use a bloated amount of storage >>>>>> because they are designed to be very short lived and very low cost >>>>>> objects >>>>>> that are stack allocated. For long term storage, there is INT_CST at >>>>>> the >>>>>> tree level and CONST_WIDE_INT at the rtl level. Those use a very >>>>>> compact >>>>>> storage model. The copying entailed is only a small part of the >>>>>> overall >>>>>> performance. >>>>> Well, but both trees and RTXen are not viable for short-lived things >>>>> because >>>>> the are GCed! double-ints were suitable for this kind of stuff because >>>>> the also have a moderate size. With wide-ints size becomes a problem >>>>> (or GC, if you instead use trees or RTXen). >>>>> >>>>>> Everything that you are suggesting along these lines is adding to the >>>>>> weight >>>>>> of a wide-int object. >>>>> On the contrary - it lessens their weight (with external already >>>>> existing storage) >>>>> or does not do anything to it (with the embedded storage). >>>>> >>>>>> You have to understand there will be many more >>>>>> wide-ints created in a normal compilation than were ever created with >>>>>> double-int. This is because the rtl level had no object like this at >>>>>> all >>>>>> and at the tree level, many of the places that should have used double >>>>>> int, >>>>>> short cut the code and only did the transformations if the types fit in >>>>>> a >>>>>> HWI. 
>>>>> Your argument shows that the copy-in/out from tree/RTX to/from wide-int >>>>> will become a very frequent operation and thus it is worth optimizing >>>>> it. >>>>> >>>>>> This is why we are extremely defensive about this issue. We really >>>>>> did >>>>>> think a lot about it. >>>>> I'm sure you did. >>>>> >>>>> Richard. >>>> ^ permalink raw reply [flat|nested] 59+ messages in thread
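The storage-model proposal quoted in this message can be sketched in a few lines of C++. This is a minimal illustration under stated assumptions, not the design that was eventually implemented: all names (`block`, `MAX_BLOCKS`, `embedded_storage`, `pointer_storage`, `wint`, `add_worker`, `wrap`) are hypothetical, precision handling is omitted, values are treated as unsigned, and the carry out of the top block is dropped.

```cpp
#include <cassert>
#include <cstdint>

typedef uint64_t block;              // stand-in for one HOST_WIDE_INT (assumed)
const unsigned MAX_BLOCKS = 4;       // hypothetical embedded capacity

// Embedded model: the blocks live inside the object (on the stack).
struct embedded_storage
{
  block data[MAX_BLOCKS];
  unsigned len;
  const block *get_storage_ptr () const { return data; }
};

// Pointer model: the object only refers to blocks owned by something else
// (in the thread, an rtx or tree constant).  It must not outlive the owner.
struct pointer_storage
{
  const block *ptr;
  unsigned len;
  const block *get_storage_ptr () const { return ptr; }
};

template <class storage = embedded_storage>
struct wint : public storage
{
};

// Non-templated worker: it sees only pointers and lengths, so it does not
// care where the blocks are stored.  Missing high blocks count as zero.
inline void add_worker (block *res, unsigned reslen,
                        const block *a, unsigned alen,
                        const block *b, unsigned blen)
{
  block carry = 0;
  for (unsigned i = 0; i < reslen; i++)
    {
      block x = i < alen ? a[i] : 0;
      block y = i < blen ? b[i] : 0;
      block s = x + y;
      block c1 = s < x;          // carry out of x + y
      block t = s + carry;
      block c2 = t < s;          // carry out of adding the incoming carry
      res[i] = t;
      carry = c1 | c2;
    }
}

// Any mix of storage models can be added; the result always uses the
// embedded model.
template <class s1, class s2>
wint<> operator+ (const wint<s1> &a, const wint<s2> &b)
{
  wint<> r = wint<> ();          // zero-initialized to keep the sketch simple
  r.len = a.len > b.len ? a.len : b.len;
  add_worker (r.data, r.len, a.get_storage_ptr (), a.len,
              b.get_storage_ptr (), b.len);
  return r;
}

// Wrap existing blocks without copying them.
inline wint<pointer_storage> wrap (const block *p, unsigned len)
{
  wint<pointer_storage> w;
  w.ptr = p;
  w.len = len;
  return w;
}
```

With these definitions, an expression such as `wrap (blocks, n) + one` reads the operand blocks in place and copies nothing until the embedded result is produced, which is the rvalue case Richard describes. Because `get_storage_ptr` is a trivial inline member, the array accesses in the worker stay plain array accesses, which addresses Kenny's function-call concern at least in this simplified form.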
* Re: GCC 4.8.0 Status Report (2012-10-29), Stage 1 to end soon 2012-10-31 13:54 ` Kenneth Zadeck 2012-10-31 14:05 ` Jakub Jelinek @ 2012-10-31 19:13 ` Marc Glisse 1 sibling, 0 replies; 59+ messages in thread From: Marc Glisse @ 2012-10-31 19:13 UTC (permalink / raw) To: Kenneth Zadeck Cc: Richard Biener, Jakub Jelinek, gcc, gcc-patches, rdsandiford On Wed, 31 Oct 2012, Kenneth Zadeck wrote: > Richi, > > Let me explain to you what a broken api is. I have spent the last week > screwing around with tree-vrp and as of last night i finally got it to work. > In tree-vrp, it is clear that double-int is the precise definition of a > broken api. > > tree-vrp uses an infinite-precision view of arithmetic. However, that > infinite precision is implemented on top of a finite, CARVED IN STONE, base > that is, and without a patch like this will always be, 128 bits on an x86-64. > However, as was pointed out earlier, tree-vrp needs 2 * the size of a type > + 1 bit to work correctly. Until yesterday i did not fully understand the > significance of that 1 bit. what this means is that tree-vrp does not work > on an x86-64 with __int128 variables. I am a bit surprised by that. AFAIK, the wrapping multiplication case is the only place that uses quad-sized arithmetic, so that must be what you are talking about. But when I wrote that code, I was well aware of the need for that extra bit and worked around it using signed / unsigned as an extra bit of information. So if you found a bug there, I'd like to know (although it becomes moot once the code is replaced with wide_int). Note that my original patch for VRP used the GMP library for computations (it was rejected as likely too slow), so I think simplifying the thing with a multi-precision type is great. And if as you explained you have one (large) fixed size used for all temporaries on the stack but never used for malloc'ed objects, that sounds good too.
Good luck with the useful wide_int work, -- Marc Glisse ^ permalink raw reply [flat|nested] 59+ messages in thread
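One way to see where an extra bit beyond twice the type size can come from is to scale the problem down. The sketch below uses 8-bit operands as a stand-in for a wide type; that choice and the helper name `signed_bits_needed` are assumptions made for illustration, and the code does not reproduce anything in tree-vrp. A full unsigned product of two 8-bit values needs 16 bits, but a single signed representation that must also be able to hold negative products needs a 17th bit for the largest unsigned product.

```cpp
#include <cassert>
#include <cstdint>

// Smallest width, in bits, of a two's-complement signed type that can hold v.
// A signed type with `bits` bits holds [-2^(bits-1), 2^(bits-1) - 1].
unsigned signed_bits_needed (int64_t v)
{
  unsigned bits = 1;
  while (v < -(int64_t (1) << (bits - 1))
         || v > (int64_t (1) << (bits - 1)) - 1)
    bits++;
  return bits;
}

// Extreme products of two 8-bit operands.
const int64_t max_unsigned_product = int64_t (255) * 255;     // 65025
const int64_t max_signed_product = int64_t (-128) * -128;     // 16384
const int64_t min_signed_product = int64_t (-128) * 127;      // -16256
```

Here `signed_bits_needed (max_unsigned_product)` is 17, that is 2 * 8 + 1, while the signed extremes fit in 16 bits. Keeping the signedness as separate information, as Marc describes, is one way to avoid paying for that extra bit in the representation itself.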
* patch to fix constant math - 5th patch - the main rtl work 2012-10-31 10:02 ` Richard Sandiford 2012-10-31 10:13 ` Richard Biener 2012-10-31 13:54 ` Kenneth Zadeck @ 2013-02-27 12:39 ` Kenneth Zadeck 2 siblings, 0 replies; 59+ messages in thread From: Kenneth Zadeck @ 2013-02-27 12:39 UTC (permalink / raw) To: Richard Biener, Jakub Jelinek, gcc, gcc-patches, rdsandiford, Ian Lance Taylor [-- Attachment #1: Type: text/plain, Size: 3491 bytes --]

This patch fixes the rtl level so that the constant math performed is independent of the host compiler. This patch improves the rtl level in two ways:

1) This patch unifies the way that constant math is performed. Without this patch, there are a large number of checks to see if a constant fits in one or two HOST_WIDE_INTs. In many cases, transformations were not done or were done differently depending on the results of the test. Now, virtually all constant math at the rtl level uses the wide-int class and so there are no host dependent differences in how the math is done. This means that TImode is now better supported on 64-bit hosts compiling to 64-bit targets.

2) This patch conditionally introduces a new rtl class, the CONST_WIDE_INT, that holds integer constants that do not fit into a CONST_INT. For those targets that define TARGET_SUPPORTS_WIDE_INT, this removes the punning of using CONST_DOUBLE to hold both floats and ints that are larger than two HOST_WIDE_INTs. If the target defines this, then (at least at the rtl level) TImode can be used without ICEing or getting the wrong answer on a 32-bit host and it makes it possible for the target to use modes larger than TImode.

Note that we already have 2 public platforms that are beginning to make use of modes larger than 128 bits. For instance, the x86-64 can now do vector-wide shifts which require 256-bit data types. It would be unsurprising to see more vector-wide operations in the future. This patch fixes the rtl level so that GCC can support these operations.

This patch was heavily reviewed by Richard Sandiford before he resigned as a reviewer. It was mostly just waiting on patch 4, on which it depends very heavily, to be accepted. Ok to commit when stage1 opens? kenny On 10/31/2012 05:59 AM, Richard Sandiford wrote: > Richard Biener <richard.guenther@gmail.com> writes: >> On Tue, Oct 30, 2012 at 10:05 PM, Kenneth Zadeck >> <zadeck@naturalbridge.com> wrote: >>> jakub, >>> >>> i am hoping to get the rest of my wide integer conversion posted by nov 5. >>> I am under some adverse conditions here: hurricane sandy hit here pretty >>> badly. my house is hooked up to a small generator, and no one has any power >>> for miles around. >>> >>> So far richi has promised to review them. he has sent some comments, but >>> so far no reviews. Some time after i get the first round of them posted, >>> i will do a second round that incorporates everyone's comments. >>> >>> But i would like a little slack here if possible. While this work is a >>> show stopper for my private port, the patches address serious problems for >>> many of the public ports, especially ones that have very flexible vector >>> units. I believe that there is a significant set of latent problems >>> currently with the existing ports that use ti mode that these patches will >>> fix. >>> >>> However, i will do everything in my power to get the first round of the >>> patches posted by the nov 5 deadline. >> I suppose you are not going to merge your private port for 4.8 and thus >> the wide-int changes are not a show-stopper for you. >> >> That said, I considered the main conversion to be appropriate to be >> deferred for the next stage1. There is no advantage in disrupting the >> tree more at this stage. > I would like the wide_int class and rtl stuff to go in 4.8 though. > IMO it's a significant improvement in its own right, and Kenny > submitted it well before the deadline.
> > Richard [-- Attachment #2: p5-3.diff --] [-- Type: text/x-patch, Size: 135195 bytes --] diff --git a/gcc/alias.c b/gcc/alias.c index e18dd34..58e4eac 100644 --- a/gcc/alias.c +++ b/gcc/alias.c @@ -1471,9 +1471,7 @@ rtx_equal_for_memref_p (const_rtx x, const_rtx y) case VALUE: CASE_CONST_UNIQUE: - /* There's no need to compare the contents of CONST_DOUBLEs or - CONST_INTs because pointer equality is a good enough - comparison for these nodes. */ + /* Pointer equality guarantees equality for these nodes. */ return 0; default: diff --git a/gcc/builtins.c b/gcc/builtins.c index 68b6a2c..f076cee 100644 --- a/gcc/builtins.c +++ b/gcc/builtins.c @@ -669,20 +669,24 @@ c_getstr (tree src) return TREE_STRING_POINTER (src) + tree_low_cst (offset_node, 1); } -/* Return a CONST_INT or CONST_DOUBLE corresponding to target reading +/* Return a constant integer corresponding to target reading GET_MODE_BITSIZE (MODE) bits from string constant STR. */ static rtx c_readstr (const char *str, enum machine_mode mode) { - HOST_WIDE_INT c[2]; + wide_int c; HOST_WIDE_INT ch; unsigned int i, j; + HOST_WIDE_INT tmp[MAX_BITSIZE_MODE_ANY_INT / HOST_BITS_PER_WIDE_INT]; + unsigned int len = (GET_MODE_PRECISION (mode) + HOST_BITS_PER_WIDE_INT - 1) + / HOST_BITS_PER_WIDE_INT; + + for (i = 0; i < len; i++) + tmp[i] = 0; gcc_assert (GET_MODE_CLASS (mode) == MODE_INT); - c[0] = 0; - c[1] = 0; ch = 1; for (i = 0; i < GET_MODE_SIZE (mode); i++) { @@ -693,13 +697,14 @@ c_readstr (const char *str, enum machine_mode mode) && GET_MODE_SIZE (mode) >= UNITS_PER_WORD) j = j + UNITS_PER_WORD - 2 * (j % UNITS_PER_WORD) - 1; j *= BITS_PER_UNIT; - gcc_assert (j < HOST_BITS_PER_DOUBLE_INT); if (ch) ch = (unsigned char) str[i]; - c[j / HOST_BITS_PER_WIDE_INT] |= ch << (j % HOST_BITS_PER_WIDE_INT); + tmp[j / HOST_BITS_PER_WIDE_INT] |= ch << (j % HOST_BITS_PER_WIDE_INT); } - return immed_double_const (c[0], c[1], mode); + + c = wide_int::from_array (tmp, len, mode); + return immed_wide_int_const (c, mode); } /* 
Cast a target constant CST to target CHAR and if that value fits into @@ -4991,12 +4996,12 @@ expand_builtin_signbit (tree exp, rtx target) if (bitpos < GET_MODE_BITSIZE (rmode)) { - double_int mask = double_int_zero.set_bit (bitpos); + wide_int mask = wide_int::set_bit_in_zero (bitpos, rmode); if (GET_MODE_SIZE (imode) > GET_MODE_SIZE (rmode)) temp = gen_lowpart (rmode, temp); temp = expand_binop (rmode, and_optab, temp, - immed_double_int_const (mask, rmode), + immed_wide_int_const (mask, rmode), NULL_RTX, 1, OPTAB_LIB_WIDEN); } else diff --git a/gcc/combine.c b/gcc/combine.c index 98ca4a8..7dd29b8 100644 --- a/gcc/combine.c +++ b/gcc/combine.c @@ -2669,23 +2669,15 @@ try_combine (rtx i3, rtx i2, rtx i1, rtx i0, int *new_direct_jump_p, offset = -1; } - if (offset >= 0 - && (GET_MODE_PRECISION (GET_MODE (SET_DEST (temp))) - <= HOST_BITS_PER_DOUBLE_INT)) + if (offset >= 0) { - double_int m, o, i; + wide_int o; rtx inner = SET_SRC (PATTERN (i3)); rtx outer = SET_SRC (temp); - - o = rtx_to_double_int (outer); - i = rtx_to_double_int (inner); - - m = double_int::mask (width); - i &= m; - m = m.llshift (offset, HOST_BITS_PER_DOUBLE_INT); - i = i.llshift (offset, HOST_BITS_PER_DOUBLE_INT); - o = o.and_not (m) | i; - + + o = (wide_int::from_rtx (outer, GET_MODE (SET_DEST (temp))) + .insert (wide_int::from_rtx (inner, GET_MODE (dest)), + offset, width)); combine_merges++; subst_insn = i3; subst_low_luid = DF_INSN_LUID (i2); @@ -2696,8 +2688,8 @@ try_combine (rtx i3, rtx i2, rtx i1, rtx i0, int *new_direct_jump_p, /* Replace the source in I2 with the new constant and make the resulting insn the new pattern for I3. Then skip to where we validate the pattern. Everything was set up above. 
*/ - SUBST (SET_SRC (temp), - immed_double_int_const (o, GET_MODE (SET_DEST (temp)))); + SUBST (SET_SRC (temp), + immed_wide_int_const (o, GET_MODE (SET_DEST (temp)))); newpat = PATTERN (i2); @@ -5113,7 +5105,7 @@ subst (rtx x, rtx from, rtx to, int in_dest, int in_cond, int unique_copy) if (! x) x = gen_rtx_CLOBBER (mode, const0_rtx); } - else if (CONST_INT_P (new_rtx) + else if (CONST_SCALAR_INT_P (new_rtx) && GET_CODE (x) == ZERO_EXTEND) { x = simplify_unary_operation (ZERO_EXTEND, GET_MODE (x), diff --git a/gcc/coretypes.h b/gcc/coretypes.h index 320b4dd..3ea8920 100644 --- a/gcc/coretypes.h +++ b/gcc/coretypes.h @@ -55,6 +55,9 @@ typedef const struct rtx_def *const_rtx; struct rtvec_def; typedef struct rtvec_def *rtvec; typedef const struct rtvec_def *const_rtvec; +struct hwivec_def; +typedef struct hwivec_def *hwivec; +typedef const struct hwivec_def *const_hwivec; union tree_node; typedef union tree_node *tree; typedef const union tree_node *const_tree; diff --git a/gcc/cse.c b/gcc/cse.c index b200fef..db57f33 100644 --- a/gcc/cse.c +++ b/gcc/cse.c @@ -2331,15 +2331,23 @@ hash_rtx_cb (const_rtx x, enum machine_mode mode, + (unsigned int) INTVAL (x)); return hash; + case CONST_WIDE_INT: + { + int i; + for (i = 0; i < CONST_WIDE_INT_NUNITS (x); i++) + hash += CONST_WIDE_INT_ELT (x, i); + } + return hash; + case CONST_DOUBLE: /* This is like the general case, except that it only counts the integers representing the constant. */ hash += (unsigned int) code + (unsigned int) GET_MODE (x); - if (GET_MODE (x) != VOIDmode) - hash += real_hash (CONST_DOUBLE_REAL_VALUE (x)); - else + if (TARGET_SUPPORTS_WIDE_INT == 0 && GET_MODE (x) == VOIDmode) hash += ((unsigned int) CONST_DOUBLE_LOW (x) + (unsigned int) CONST_DOUBLE_HIGH (x)); + else + hash += real_hash (CONST_DOUBLE_REAL_VALUE (x)); return hash; case CONST_FIXED: @@ -3756,6 +3764,7 @@ equiv_constant (rtx x) /* See if we previously assigned a constant value to this SUBREG. 
*/ if ((new_rtx = lookup_as_function (x, CONST_INT)) != 0 + || (new_rtx = lookup_as_function (x, CONST_WIDE_INT)) != 0 || (new_rtx = lookup_as_function (x, CONST_DOUBLE)) != 0 || (new_rtx = lookup_as_function (x, CONST_FIXED)) != 0) return new_rtx; diff --git a/gcc/cselib.c b/gcc/cselib.c index dcad9741..3f7c156 100644 --- a/gcc/cselib.c +++ b/gcc/cselib.c @@ -923,8 +923,7 @@ rtx_equal_for_cselib_1 (rtx x, rtx y, enum machine_mode memmode) /* These won't be handled correctly by the code below. */ switch (GET_CODE (x)) { - case CONST_DOUBLE: - case CONST_FIXED: + CASE_CONST_UNIQUE: case DEBUG_EXPR: return 0; @@ -1118,15 +1117,23 @@ cselib_hash_rtx (rtx x, int create, enum machine_mode memmode) hash += ((unsigned) CONST_INT << 7) + INTVAL (x); return hash ? hash : (unsigned int) CONST_INT; + case CONST_WIDE_INT: + { + int i; + for (i = 0; i < CONST_WIDE_INT_NUNITS (x); i++) + hash += CONST_WIDE_INT_ELT (x, i); + } + return hash; + case CONST_DOUBLE: /* This is like the general case, except that it only counts the integers representing the constant. */ hash += (unsigned) code + (unsigned) GET_MODE (x); - if (GET_MODE (x) != VOIDmode) - hash += real_hash (CONST_DOUBLE_REAL_VALUE (x)); - else + if (TARGET_SUPPORTS_WIDE_INT == 0 && GET_MODE (x) == VOIDmode) hash += ((unsigned) CONST_DOUBLE_LOW (x) + (unsigned) CONST_DOUBLE_HIGH (x)); + else + hash += real_hash (CONST_DOUBLE_REAL_VALUE (x)); return hash ? hash : (unsigned int) CONST_DOUBLE; case CONST_FIXED: diff --git a/gcc/defaults.h b/gcc/defaults.h index 4f43f6f0..0801073 100644 --- a/gcc/defaults.h +++ b/gcc/defaults.h @@ -1404,6 +1404,14 @@ see the files COPYING3 and COPYING.RUNTIME respectively. If not, see #define SWITCHABLE_TARGET 0 #endif +/* If the target supports integers that are wider than two + HOST_WIDE_INTs on the host compiler, then the target should define + TARGET_SUPPORTS_WIDE_INT and make the appropriate fixups. + Otherwise the compiler really is not robust. 
*/ +#ifndef TARGET_SUPPORTS_WIDE_INT +#define TARGET_SUPPORTS_WIDE_INT 0 +#endif + #endif /* GCC_INSN_FLAGS_H */ #endif /* ! GCC_DEFAULTS_H */ diff --git a/gcc/doc/rtl.texi b/gcc/doc/rtl.texi index 095a642..a4e4381 100644 --- a/gcc/doc/rtl.texi +++ b/gcc/doc/rtl.texi @@ -1531,17 +1531,22 @@ Similarly, there is only one object for the integer whose value is @findex const_double @item (const_double:@var{m} @var{i0} @var{i1} @dots{}) -Represents either a floating-point constant of mode @var{m} or an -integer constant too large to fit into @code{HOST_BITS_PER_WIDE_INT} -bits but small enough to fit within twice that number of bits (GCC -does not provide a mechanism to represent even larger constants). In -the latter case, @var{m} will be @code{VOIDmode}. For integral values -constants for modes with more bits than twice the number in -@code{HOST_WIDE_INT} the implied high order bits of that constant are -copies of the top bit of @code{CONST_DOUBLE_HIGH}. Note however that -integral values are neither inherently signed nor inherently unsigned; -where necessary, signedness is determined by the rtl operation -instead. +This represents either a floating-point constant of mode @var{m} or +(on older ports that do not define +@code{TARGET_SUPPORTS_WIDE_INT}) an integer constant too large to fit +into @code{HOST_BITS_PER_WIDE_INT} bits but small enough to fit within +twice that number of bits (GCC does not provide a mechanism to +represent even larger constants). In the latter case, @var{m} will be +@code{VOIDmode}. For integral constants for modes with more +bits than twice the number in @code{HOST_WIDE_INT} the implied high +order bits of that constant are copies of the top bit of +@code{CONST_DOUBLE_HIGH}. Note however that integral values are +neither inherently signed nor inherently unsigned; where necessary, +signedness is determined by the rtl operation instead. + +On more modern ports, @code{CONST_DOUBLE} only represents floating +point values.
New ports define @code{TARGET_SUPPORTS_WIDE_INT} to +make this designation. @findex CONST_DOUBLE_LOW If @var{m} is @code{VOIDmode}, the bits of the value are stored in @@ -1556,6 +1561,37 @@ machine's or host machine's floating point format. To convert them to the precise bit pattern used by the target machine, use the macro @code{REAL_VALUE_TO_TARGET_DOUBLE} and friends (@pxref{Data Output}). +@findex CONST_WIDE_INT +@item (const_wide_int:@var{m} @var{nunits} @var{elt0} @dots{}) +This contains an array of @code{HOST_WIDE_INT}s that is large enough +to hold any constant that can be represented on the target. This form +of rtl is only used on targets that define +@code{TARGET_SUPPORTS_WIDE_INT} to be nonzero and then +@code{CONST_DOUBLE}s are only used to hold floating point values. If +the target leaves @code{TARGET_SUPPORTS_WIDE_INT} defined as 0, +@code{CONST_WIDE_INT}s are not used and @code{CONST_DOUBLE}s are as +they were before. + +The values are stored in a compressed format. The higher order +0s or -1s are not represented if they are just the logical sign +extension of the number that is represented. + +@findex CONST_WIDE_INT_VEC +@item CONST_WIDE_INT_VEC (@var{code}) +Returns the entire array of @code{HOST_WIDE_INT}s that are used to +store the value. This macro should be rarely used. + +@findex CONST_WIDE_INT_NUNITS +@item CONST_WIDE_INT_NUNITS (@var{code}) +The number of @code{HOST_WIDE_INT}s used to represent the number. +Note that this will generally be smaller than the number of +@code{HOST_WIDE_INT}s implied by the mode size. + +@findex CONST_WIDE_INT_ELT +@item CONST_WIDE_INT_ELT (@var{code},@var{i}) +Returns the @code{i}th element of the array. Element 0 contains +the low order bits of the constant. + @findex const_fixed @item (const_fixed:@var{m} @dots{}) Represents a fixed-point constant of mode @var{m}.
diff --git a/gcc/doc/tm.texi b/gcc/doc/tm.texi index ce2b44d..dc123c9 100644 --- a/gcc/doc/tm.texi +++ b/gcc/doc/tm.texi @@ -11341,3 +11341,48 @@ memory model bits are allowed. @deftypevr {Target Hook} {unsigned char} TARGET_ATOMIC_TEST_AND_SET_TRUEVAL This value should be set if the result written by @code{atomic_test_and_set} is not exactly 1, i.e. the @code{bool} @code{true}. @end deftypevr +@defmac TARGET_SUPPORTS_WIDE_INT + +On older ports, large integers are stored in @code{CONST_DOUBLE} rtl +objects. Newer ports define @code{TARGET_SUPPORTS_WIDE_INT} to be +nonzero to indicate that large integers are stored in +@code{CONST_WIDE_INT} rtl objects. The @code{CONST_WIDE_INT} allows +very large integer constants to be represented. @code{CONST_DOUBLE}s +are limited to twice the size of the host's @code{HOST_WIDE_INT} +representation. + +Converting a port mostly requires looking for the places where +@code{CONST_DOUBLE}s are used with @code{VOIDmode} and replacing that +code with code that accesses @code{CONST_WIDE_INT}s. @samp{"grep -i +const_double"} at the port level gets you to 95% of the changes that +need to be made. There are a few places that require a deeper look. + +@itemize @bullet +@item +There is no equivalent to @code{hval} and @code{lval} for +@code{CONST_WIDE_INT}s. This would be difficult to express in the md +language since there are a variable number of elements. + +Most ports only check that @code{hval} is either 0 or -1 to see if the +value is small. As mentioned above, this will no longer be necessary +since small constants are always @code{CONST_INT}. Of course there +are still a few exceptions; the alpha's constraint used by the zap +instruction certainly requires careful examination by C code. +However, all the current code does is pass the hval and lval to C +code, so evolving the C code to look at the @code{CONST_WIDE_INT} is +not really a large change.
+ +@item +Because there is no standard template that ports use to materialize +constants, there is likely to be some futzing that is unique to each +port in this code. + +@item +The rtx costs may have to be adjusted to properly account for larger +constants that are represented as @code{CONST_WIDE_INT}. +@end itemize + +All in all, it does not take long to convert ports that the +maintainer is familiar with. + +@end defmac diff --git a/gcc/doc/tm.texi.in b/gcc/doc/tm.texi.in index d6e7ce7..6345fcb 100644 --- a/gcc/doc/tm.texi.in +++ b/gcc/doc/tm.texi.in @@ -11177,3 +11177,48 @@ memory model bits are allowed. @end deftypefn @hook TARGET_ATOMIC_TEST_AND_SET_TRUEVAL +@defmac TARGET_SUPPORTS_WIDE_INT + +On older ports, large integers are stored in @code{CONST_DOUBLE} rtl +objects. Newer ports define @code{TARGET_SUPPORTS_WIDE_INT} to be +nonzero to indicate that large integers are stored in +@code{CONST_WIDE_INT} rtl objects. The @code{CONST_WIDE_INT} allows +very large integer constants to be represented. @code{CONST_DOUBLE}s +are limited to twice the size of the host's @code{HOST_WIDE_INT} +representation. + +Converting a port mostly requires looking for the places where +@code{CONST_DOUBLE}s are used with @code{VOIDmode} and replacing that +code with code that accesses @code{CONST_WIDE_INT}s. @samp{"grep -i +const_double"} at the port level gets you to 95% of the changes that +need to be made. There are a few places that require a deeper look. + +@itemize @bullet +@item +There is no equivalent to @code{hval} and @code{lval} for +@code{CONST_WIDE_INT}s. This would be difficult to express in the md +language since there are a variable number of elements. + +Most ports only check that @code{hval} is either 0 or -1 to see if the +value is small. As mentioned above, this will no longer be necessary +since small constants are always @code{CONST_INT}.
Of course there +are still a few exceptions; the alpha's constraint used by the zap +instruction certainly requires careful examination by C code. +However, all the current code does is pass the hval and lval to C +code, so evolving the C code to look at the @code{CONST_WIDE_INT} is +not really a large change. + +@item +Because there is no standard template that ports use to materialize +constants, there is likely to be some futzing that is unique to each +port in this code. + +@item +The rtx costs may have to be adjusted to properly account for larger +constants that are represented as @code{CONST_WIDE_INT}. +@end itemize + +All in all, it does not take long to convert ports that the +maintainer is familiar with. + +@end defmac diff --git a/gcc/dojump.c b/gcc/dojump.c index 3f04eac..ecbec40 100644 --- a/gcc/dojump.c +++ b/gcc/dojump.c @@ -142,6 +142,7 @@ static bool prefer_and_bit_test (enum machine_mode mode, int bitnum) { bool speed_p; + wide_int mask = wide_int::set_bit_in_zero (bitnum, mode); if (and_test == 0) { @@ -162,8 +163,7 @@ prefer_and_bit_test (enum machine_mode mode, int bitnum) } /* Fill in the integers. */ - XEXP (and_test, 1) - = immed_double_int_const (double_int_zero.set_bit (bitnum), mode); + XEXP (and_test, 1) = immed_wide_int_const (mask, mode); XEXP (XEXP (shift_test, 0), 1) = GEN_INT (bitnum); speed_p = optimize_insn_for_speed_p (); diff --git a/gcc/dwarf2out.c b/gcc/dwarf2out.c index 4e75407..40836ce 100644 --- a/gcc/dwarf2out.c +++ b/gcc/dwarf2out.c @@ -323,6 +323,17 @@ dump_struct_debug (tree type, enum debug_info_usage usage, #endif + +/* Get the number of host wide ints needed to represent the precision + of the number.
*/ + +static unsigned int +get_full_len (const wide_int &op) +{ + return ((op.get_precision () + HOST_BITS_PER_WIDE_INT - 1) + / HOST_BITS_PER_WIDE_INT); +} + static bool should_emit_struct_debug (tree type, enum debug_info_usage usage) { @@ -1354,6 +1365,9 @@ dw_val_equal_p (dw_val_node *a, dw_val_node *b) return (a->v.val_double.high == b->v.val_double.high && a->v.val_double.low == b->v.val_double.low); + case dw_val_class_wide_int: + return a->v.val_wide == b->v.val_wide; + case dw_val_class_vec: { size_t a_len = a->v.val_vec.elt_size * a->v.val_vec.length; @@ -1610,6 +1624,10 @@ size_of_loc_descr (dw_loc_descr_ref loc) case dw_val_class_const_double: size += HOST_BITS_PER_DOUBLE_INT / BITS_PER_UNIT; break; + case dw_val_class_wide_int: + size += (get_full_len (loc->dw_loc_oprnd2.v.val_wide) + * HOST_BITS_PER_WIDE_INT / BITS_PER_UNIT); + break; default: gcc_unreachable (); } @@ -1787,6 +1805,20 @@ output_loc_operands (dw_loc_descr_ref loc, int for_eh_or_skip) second, NULL); } break; + case dw_val_class_wide_int: + { + int i; + int len = get_full_len (val2->v.val_wide); + if (WORDS_BIG_ENDIAN) + for (i = len - 1; i >= 0; --i) + dw2_asm_output_data (HOST_BITS_PER_WIDE_INT / HOST_BITS_PER_CHAR, + val2->v.val_wide.elt (i), NULL); + else + for (i = 0; i < len; ++i) + dw2_asm_output_data (HOST_BITS_PER_WIDE_INT / HOST_BITS_PER_CHAR, + val2->v.val_wide.elt (i), NULL); + } + break; case dw_val_class_addr: gcc_assert (val1->v.val_unsigned == DWARF2_ADDR_SIZE); dw2_asm_output_addr_rtx (DWARF2_ADDR_SIZE, val2->v.val_addr, NULL); @@ -1996,6 +2028,21 @@ output_loc_operands (dw_loc_descr_ref loc, int for_eh_or_skip) dw2_asm_output_data (l, second, NULL); } break; + case dw_val_class_wide_int: + { + int i; + int len = get_full_len (val2->v.val_wide); + l = HOST_BITS_PER_WIDE_INT / HOST_BITS_PER_CHAR; + + dw2_asm_output_data (1, len * l, NULL); + if (WORDS_BIG_ENDIAN) + for (i = len - 1; i >= 0; --i) + dw2_asm_output_data (l, val2->v.val_wide.elt (i), NULL); + else + for (i = 0; i < 
len; ++i) + dw2_asm_output_data (l, val2->v.val_wide.elt (i), NULL); + } + break; default: gcc_unreachable (); } @@ -3095,7 +3142,7 @@ static void add_AT_location_description (dw_die_ref, enum dwarf_attribute, static void add_data_member_location_attribute (dw_die_ref, tree); static bool add_const_value_attribute (dw_die_ref, rtx); static void insert_int (HOST_WIDE_INT, unsigned, unsigned char *); -static void insert_double (double_int, unsigned char *); +static void insert_wide_int (const wide_int &, unsigned char *); static void insert_float (const_rtx, unsigned char *); static rtx rtl_for_decl_location (tree); static bool add_location_or_const_value_attribute (dw_die_ref, tree, bool, @@ -3720,6 +3767,20 @@ AT_unsigned (dw_attr_ref a) /* Add an unsigned double integer attribute value to a DIE. */ static inline void +add_AT_wide (dw_die_ref die, enum dwarf_attribute attr_kind, + wide_int w) +{ + dw_attr_node attr; + + attr.dw_attr = attr_kind; + attr.dw_attr_val.val_class = dw_val_class_wide_int; + attr.dw_attr_val.v.val_wide = w; + add_dwarf_attr (die, &attr); +} + +/* Add an unsigned double integer attribute value to a DIE. 
*/ + +static inline void add_AT_double (dw_die_ref die, enum dwarf_attribute attr_kind, HOST_WIDE_INT high, unsigned HOST_WIDE_INT low) { @@ -5273,6 +5334,19 @@ print_die (dw_die_ref die, FILE *outfile) a->dw_attr_val.v.val_double.high, a->dw_attr_val.v.val_double.low); break; + case dw_val_class_wide_int: + { + int i = a->dw_attr_val.v.val_wide.get_len (); + fprintf (outfile, "constant ("); + gcc_assert (i > 0); + if (a->dw_attr_val.v.val_wide.elt (i - 1) == 0) + fprintf (outfile, "0x"); + fprintf (outfile, HOST_WIDE_INT_PRINT_HEX, a->dw_attr_val.v.val_wide.elt (--i)); + while (--i >= 0) + fprintf (outfile, HOST_WIDE_INT_PRINT_PADDED_HEX, a->dw_attr_val.v.val_wide.elt (i)); + fprintf (outfile, ")"); + break; + } case dw_val_class_vec: fprintf (outfile, "floating-point or vector constant"); break; @@ -5428,6 +5502,9 @@ attr_checksum (dw_attr_ref at, struct md5_ctx *ctx, int *mark) case dw_val_class_const_double: CHECKSUM (at->dw_attr_val.v.val_double); break; + case dw_val_class_wide_int: + CHECKSUM (at->dw_attr_val.v.val_wide); + break; case dw_val_class_vec: CHECKSUM (at->dw_attr_val.v.val_vec); break; @@ -5698,6 +5775,12 @@ attr_checksum_ordered (enum dwarf_tag tag, dw_attr_ref at, CHECKSUM (at->dw_attr_val.v.val_double); break; + case dw_val_class_wide_int: + CHECKSUM_ULEB128 (DW_FORM_block); + CHECKSUM_ULEB128 (sizeof (at->dw_attr_val.v.val_wide)); + CHECKSUM (at->dw_attr_val.v.val_wide); + break; + case dw_val_class_vec: CHECKSUM_ULEB128 (DW_FORM_block); CHECKSUM_ULEB128 (sizeof (at->dw_attr_val.v.val_vec)); @@ -6162,6 +6245,8 @@ same_dw_val_p (const dw_val_node *v1, const dw_val_node *v2, int *mark) case dw_val_class_const_double: return v1->v.val_double.high == v2->v.val_double.high && v1->v.val_double.low == v2->v.val_double.low; + case dw_val_class_wide_int: + return v1->v.val_wide == v2->v.val_wide; case dw_val_class_vec: if (v1->v.val_vec.length != v2->v.val_vec.length || v1->v.val_vec.elt_size != v2->v.val_vec.elt_size) @@ -7624,6 +7709,13 @@ size_of_die 
(dw_die_ref die) if (HOST_BITS_PER_WIDE_INT >= 64) size++; /* block */ break; + case dw_val_class_wide_int: + size += (get_full_len (a->dw_attr_val.v.val_wide) + * HOST_BITS_PER_WIDE_INT / HOST_BITS_PER_CHAR); + if (get_full_len (a->dw_attr_val.v.val_wide) * HOST_BITS_PER_WIDE_INT + > 64) + size++; /* block */ + break; case dw_val_class_vec: size += constant_size (a->dw_attr_val.v.val_vec.length * a->dw_attr_val.v.val_vec.elt_size) @@ -7960,6 +8052,20 @@ value_format (dw_attr_ref a) default: return DW_FORM_block1; } + case dw_val_class_wide_int: + switch (get_full_len (a->dw_attr_val.v.val_wide) * HOST_BITS_PER_WIDE_INT) + { + case 8: + return DW_FORM_data1; + case 16: + return DW_FORM_data2; + case 32: + return DW_FORM_data4; + case 64: + return DW_FORM_data8; + default: + return DW_FORM_block1; + } case dw_val_class_vec: switch (constant_size (a->dw_attr_val.v.val_vec.length * a->dw_attr_val.v.val_vec.elt_size)) @@ -8399,6 +8505,32 @@ output_die (dw_die_ref die) } break; + case dw_val_class_wide_int: + { + int i; + int len = get_full_len (a->dw_attr_val.v.val_wide); + int l = HOST_BITS_PER_WIDE_INT / HOST_BITS_PER_CHAR; + if (len * HOST_BITS_PER_WIDE_INT > 64) + dw2_asm_output_data (1, get_full_len (a->dw_attr_val.v.val_wide) * l, + NULL); + + if (WORDS_BIG_ENDIAN) + for (i = len - 1; i >= 0; --i) + { + dw2_asm_output_data (l, a->dw_attr_val.v.val_wide.elt (i), + name); + name = NULL; + } + else + for (i = 0; i < len; ++i) + { + dw2_asm_output_data (l, a->dw_attr_val.v.val_wide.elt (i), + name); + name = NULL; + } + } + break; + case dw_val_class_vec: { unsigned int elt_size = a->dw_attr_val.v.val_vec.elt_size; @@ -11524,9 +11656,8 @@ clz_loc_descriptor (rtx rtl, enum machine_mode mode, msb = GEN_INT ((unsigned HOST_WIDE_INT) 1 << (GET_MODE_BITSIZE (mode) - 1)); else - msb = immed_double_const (0, (unsigned HOST_WIDE_INT) 1 - << (GET_MODE_BITSIZE (mode) - - HOST_BITS_PER_WIDE_INT - 1), mode); + msb = immed_wide_int_const + (wide_int::set_bit_in_zero 
(GET_MODE_PRECISION (mode) - 1, mode), mode); if (GET_CODE (msb) == CONST_INT && INTVAL (msb) < 0) tmp = new_loc_descr (HOST_BITS_PER_WIDE_INT == 32 ? DW_OP_const4u : HOST_BITS_PER_WIDE_INT == 64 @@ -12467,7 +12598,16 @@ mem_loc_descriptor (rtx rtl, enum machine_mode mode, mem_loc_result->dw_loc_oprnd1.val_class = dw_val_class_die_ref; mem_loc_result->dw_loc_oprnd1.v.val_die_ref.die = type_die; mem_loc_result->dw_loc_oprnd1.v.val_die_ref.external = 0; - if (SCALAR_FLOAT_MODE_P (mode)) +#if TARGET_SUPPORTS_WIDE_INT == 0 + if (!SCALAR_FLOAT_MODE_P (mode)) + { + mem_loc_result->dw_loc_oprnd2.val_class + = dw_val_class_const_double; + mem_loc_result->dw_loc_oprnd2.v.val_double + = rtx_to_double_int (rtl); + } + else +#endif { unsigned int length = GET_MODE_SIZE (mode); unsigned char *array @@ -12479,13 +12619,26 @@ mem_loc_descriptor (rtx rtl, enum machine_mode mode, mem_loc_result->dw_loc_oprnd2.v.val_vec.elt_size = 4; mem_loc_result->dw_loc_oprnd2.v.val_vec.array = array; } - else - { - mem_loc_result->dw_loc_oprnd2.val_class - = dw_val_class_const_double; - mem_loc_result->dw_loc_oprnd2.v.val_double - = rtx_to_double_int (rtl); - } + } + break; + + case CONST_WIDE_INT: + if (!dwarf_strict) + { + dw_die_ref type_die; + + type_die = base_type_for_mode (mode, + GET_MODE_CLASS (mode) == MODE_INT); + if (type_die == NULL) + return NULL; + mem_loc_result = new_loc_descr (DW_OP_GNU_const_type, 0, 0); + mem_loc_result->dw_loc_oprnd1.val_class = dw_val_class_die_ref; + mem_loc_result->dw_loc_oprnd1.v.val_die_ref.die = type_die; + mem_loc_result->dw_loc_oprnd1.v.val_die_ref.external = 0; + mem_loc_result->dw_loc_oprnd2.val_class + = dw_val_class_wide_int; + mem_loc_result->dw_loc_oprnd2.v.val_wide + = wide_int::from_rtx (rtl, mode); } break; @@ -12956,7 +13109,15 @@ loc_descriptor (rtx rtl, enum machine_mode mode, adequately represented. We output CONST_DOUBLEs as blocks. 
*/ loc_result = new_loc_descr (DW_OP_implicit_value, GET_MODE_SIZE (mode), 0); - if (SCALAR_FLOAT_MODE_P (mode)) +#if TARGET_SUPPORTS_WIDE_INT == 0 + if (!SCALAR_FLOAT_MODE_P (mode)) + { + loc_result->dw_loc_oprnd2.val_class = dw_val_class_const_double; + loc_result->dw_loc_oprnd2.v.val_double + = rtx_to_double_int (rtl); + } + else +#endif { unsigned int length = GET_MODE_SIZE (mode); unsigned char *array @@ -12968,12 +13129,26 @@ loc_descriptor (rtx rtl, enum machine_mode mode, loc_result->dw_loc_oprnd2.v.val_vec.elt_size = 4; loc_result->dw_loc_oprnd2.v.val_vec.array = array; } - else - { - loc_result->dw_loc_oprnd2.val_class = dw_val_class_const_double; - loc_result->dw_loc_oprnd2.v.val_double - = rtx_to_double_int (rtl); - } + } + break; + + case CONST_WIDE_INT: + if (mode == VOIDmode) + mode = GET_MODE (rtl); + + if (mode != VOIDmode && (dwarf_version >= 4 || !dwarf_strict)) + { + gcc_assert (mode == GET_MODE (rtl) || VOIDmode == GET_MODE (rtl)); + + /* Note that a CONST_DOUBLE rtx could represent either an integer + or a floating-point constant. A CONST_DOUBLE is used whenever + the constant requires more than one word in order to be + adequately represented. We output CONST_DOUBLEs as blocks. 
*/ + loc_result = new_loc_descr (DW_OP_implicit_value, + GET_MODE_SIZE (mode), 0); + loc_result->dw_loc_oprnd2.val_class = dw_val_class_wide_int; + loc_result->dw_loc_oprnd2.v.val_wide + = wide_int::from_rtx (rtl, mode); } break; @@ -12989,6 +13164,7 @@ loc_descriptor (rtx rtl, enum machine_mode mode, ggc_alloc_atomic (length * elt_size); unsigned int i; unsigned char *p; + enum machine_mode imode = GET_MODE_INNER (mode); gcc_assert (mode == GET_MODE (rtl) || VOIDmode == GET_MODE (rtl)); switch (GET_MODE_CLASS (mode)) @@ -12997,15 +13173,8 @@ loc_descriptor (rtx rtl, enum machine_mode mode, for (i = 0, p = array; i < length; i++, p += elt_size) { rtx elt = CONST_VECTOR_ELT (rtl, i); - double_int val = rtx_to_double_int (elt); - - if (elt_size <= sizeof (HOST_WIDE_INT)) - insert_int (val.to_shwi (), elt_size, p); - else - { - gcc_assert (elt_size == 2 * sizeof (HOST_WIDE_INT)); - insert_double (val, p); - } + wide_int val = wide_int::from_rtx (elt, imode); + insert_wide_int (val, p); } break; @@ -14630,22 +14799,27 @@ extract_int (const unsigned char *src, unsigned int size) return val; } -/* Writes double_int values to dw_vec_const array. */ +/* Writes wide_int values to dw_vec_const array. 
*/ static void -insert_double (double_int val, unsigned char *dest) +insert_wide_int (const wide_int &val, unsigned char *dest) { - unsigned char *p0 = dest; - unsigned char *p1 = dest + sizeof (HOST_WIDE_INT); + int i; if (WORDS_BIG_ENDIAN) - { - p0 = p1; - p1 = dest; - } - - insert_int ((HOST_WIDE_INT) val.low, sizeof (HOST_WIDE_INT), p0); - insert_int ((HOST_WIDE_INT) val.high, sizeof (HOST_WIDE_INT), p1); + for (i = (int)get_full_len (val) - 1; i >= 0; i--) + { + insert_int ((HOST_WIDE_INT) val.elt (i), + sizeof (HOST_WIDE_INT), dest); + dest += sizeof (HOST_WIDE_INT); + } + else + for (i = 0; i < (int)get_full_len (val); i++) + { + insert_int ((HOST_WIDE_INT) val.elt (i), + sizeof (HOST_WIDE_INT), dest); + dest += sizeof (HOST_WIDE_INT); + } } /* Writes floating point values to dw_vec_const array. */ @@ -14690,6 +14864,11 @@ add_const_value_attribute (dw_die_ref die, rtx rtl) } return true; + case CONST_WIDE_INT: + add_AT_wide (die, DW_AT_const_value, + wide_int::from_rtx (rtl, GET_MODE (rtl))); + return true; + case CONST_DOUBLE: /* Note that a CONST_DOUBLE rtx could represent either an integer or a floating-point constant. 
A CONST_DOUBLE is used whenever the @@ -14698,7 +14877,10 @@ add_const_value_attribute (dw_die_ref die, rtx rtl) { enum machine_mode mode = GET_MODE (rtl); - if (SCALAR_FLOAT_MODE_P (mode)) + if (TARGET_SUPPORTS_WIDE_INT == 0 && !SCALAR_FLOAT_MODE_P (mode)) + add_AT_double (die, DW_AT_const_value, + CONST_DOUBLE_HIGH (rtl), CONST_DOUBLE_LOW (rtl)); + else { unsigned int length = GET_MODE_SIZE (mode); unsigned char *array = (unsigned char *) ggc_alloc_atomic (length); @@ -14706,9 +14888,6 @@ add_const_value_attribute (dw_die_ref die, rtx rtl) insert_float (rtl, array); add_AT_vec (die, DW_AT_const_value, length / 4, 4, array); } - else - add_AT_double (die, DW_AT_const_value, - CONST_DOUBLE_HIGH (rtl), CONST_DOUBLE_LOW (rtl)); } return true; @@ -14721,6 +14900,7 @@ add_const_value_attribute (dw_die_ref die, rtx rtl) (length * elt_size); unsigned int i; unsigned char *p; + enum machine_mode imode = GET_MODE_INNER (mode); switch (GET_MODE_CLASS (mode)) { @@ -14728,15 +14908,8 @@ add_const_value_attribute (dw_die_ref die, rtx rtl) for (i = 0, p = array; i < length; i++, p += elt_size) { rtx elt = CONST_VECTOR_ELT (rtl, i); - double_int val = rtx_to_double_int (elt); - - if (elt_size <= sizeof (HOST_WIDE_INT)) - insert_int (val.to_shwi (), elt_size, p); - else - { - gcc_assert (elt_size == 2 * sizeof (HOST_WIDE_INT)); - insert_double (val, p); - } + wide_int val = wide_int::from_rtx (elt, imode); + insert_wide_int (val, p); } break; @@ -22869,6 +23042,9 @@ hash_loc_operands (dw_loc_descr_ref loc, hashval_t hash) hash = iterative_hash_object (val2->v.val_double.low, hash); hash = iterative_hash_object (val2->v.val_double.high, hash); break; + case dw_val_class_wide_int: + hash = iterative_hash_object (val2->v.val_wide, hash); + break; case dw_val_class_addr: hash = iterative_hash_rtx (val2->v.val_addr, hash); break; @@ -22958,6 +23134,9 @@ hash_loc_operands (dw_loc_descr_ref loc, hashval_t hash) hash = iterative_hash_object (val2->v.val_double.low, hash); hash = 
iterative_hash_object (val2->v.val_double.high, hash); break; + case dw_val_class_wide_int: + hash = iterative_hash_object (val2->v.val_wide, hash); + break; default: gcc_unreachable (); } @@ -23106,6 +23285,8 @@ compare_loc_operands (dw_loc_descr_ref x, dw_loc_descr_ref y) case dw_val_class_const_double: return valx2->v.val_double.low == valy2->v.val_double.low && valx2->v.val_double.high == valy2->v.val_double.high; + case dw_val_class_wide_int: + return valx2->v.val_wide == valy2->v.val_wide; case dw_val_class_addr: return rtx_equal_p (valx2->v.val_addr, valy2->v.val_addr); default: @@ -23149,6 +23330,8 @@ compare_loc_operands (dw_loc_descr_ref x, dw_loc_descr_ref y) case dw_val_class_const_double: return valx2->v.val_double.low == valy2->v.val_double.low && valx2->v.val_double.high == valy2->v.val_double.high; + case dw_val_class_wide_int: + return valx2->v.val_wide == valy2->v.val_wide; default: gcc_unreachable (); } diff --git a/gcc/dwarf2out.h b/gcc/dwarf2out.h index f68d0e4..7c5f142 100644 --- a/gcc/dwarf2out.h +++ b/gcc/dwarf2out.h @@ -21,6 +21,7 @@ along with GCC; see the file COPYING3. If not see #define GCC_DWARF2OUT_H 1 #include "dwarf2.h" /* ??? Remove this once only used by dwarf2foo.c. 
*/ +#include "wide-int.h" typedef struct die_struct *dw_die_ref; typedef const struct die_struct *const_dw_die_ref; @@ -139,6 +140,7 @@ enum dw_val_class dw_val_class_const, dw_val_class_unsigned_const, dw_val_class_const_double, + dw_val_class_wide_int, dw_val_class_vec, dw_val_class_flag, dw_val_class_die_ref, @@ -180,6 +182,7 @@ typedef struct GTY(()) dw_val_struct { HOST_WIDE_INT GTY ((default)) val_int; unsigned HOST_WIDE_INT GTY ((tag ("dw_val_class_unsigned_const"))) val_unsigned; double_int GTY ((tag ("dw_val_class_const_double"))) val_double; + wide_int GTY ((tag ("dw_val_class_wide_int"))) val_wide; dw_vec_const GTY ((tag ("dw_val_class_vec"))) val_vec; struct dw_val_die_union { diff --git a/gcc/emit-rtl.c b/gcc/emit-rtl.c index 2c70fb1..a234e39 100644 --- a/gcc/emit-rtl.c +++ b/gcc/emit-rtl.c @@ -124,6 +124,9 @@ rtx cc0_rtx; static GTY ((if_marked ("ggc_marked_p"), param_is (struct rtx_def))) htab_t const_int_htab; +static GTY ((if_marked ("ggc_marked_p"), param_is (struct rtx_def))) + htab_t const_wide_int_htab; + /* A hash table storing memory attribute structures. */ static GTY ((if_marked ("ggc_marked_p"), param_is (struct mem_attrs))) htab_t mem_attrs_htab; @@ -149,6 +152,11 @@ static void set_used_decls (tree); static void mark_label_nuses (rtx); static hashval_t const_int_htab_hash (const void *); static int const_int_htab_eq (const void *, const void *); +#if TARGET_SUPPORTS_WIDE_INT +static hashval_t const_wide_int_htab_hash (const void *); +static int const_wide_int_htab_eq (const void *, const void *); +static rtx lookup_const_wide_int (rtx); +#endif static hashval_t const_double_htab_hash (const void *); static int const_double_htab_eq (const void *, const void *); static rtx lookup_const_double (rtx); @@ -185,6 +193,43 @@ const_int_htab_eq (const void *x, const void *y) return (INTVAL ((const_rtx) x) == *((const HOST_WIDE_INT *) y)); } +#if TARGET_SUPPORTS_WIDE_INT +/* Returns a hash code for X (which is a really a CONST_WIDE_INT). 
*/ + +static hashval_t +const_wide_int_htab_hash (const void *x) +{ + int i; + HOST_WIDE_INT hash = 0; + const_rtx xr = (const_rtx) x; + + for (i = 0; i < CONST_WIDE_INT_NUNITS (xr); i++) + hash += CONST_WIDE_INT_ELT (xr, i); + + return (hashval_t) hash; +} + +/* Returns nonzero if the value represented by X (which is really a + CONST_WIDE_INT) is the same as that given by Y (which is really a + CONST_WIDE_INT). */ + +static int +const_wide_int_htab_eq (const void *x, const void *y) +{ + int i; + const_rtx xr = (const_rtx)x; + const_rtx yr = (const_rtx)y; + if (CONST_WIDE_INT_NUNITS (xr) != CONST_WIDE_INT_NUNITS (yr)) + return 0; + + for (i = 0; i < CONST_WIDE_INT_NUNITS (xr); i++) + if (CONST_WIDE_INT_ELT (xr, i) != CONST_WIDE_INT_ELT (yr, i)) + return 0; + + return 1; +} +#endif + /* Returns a hash code for X (which is really a CONST_DOUBLE). */ static hashval_t const_double_htab_hash (const void *x) @@ -192,7 +237,7 @@ const_double_htab_hash (const void *x) const_rtx const value = (const_rtx) x; hashval_t h; - if (GET_MODE (value) == VOIDmode) + if (TARGET_SUPPORTS_WIDE_INT == 0 && GET_MODE (value) == VOIDmode) h = CONST_DOUBLE_LOW (value) ^ CONST_DOUBLE_HIGH (value); else { @@ -212,7 +257,7 @@ const_double_htab_eq (const void *x, const void *y) if (GET_MODE (a) != GET_MODE (b)) return 0; - if (GET_MODE (a) == VOIDmode) + if (TARGET_SUPPORTS_WIDE_INT == 0 && GET_MODE (a) == VOIDmode) return (CONST_DOUBLE_LOW (a) == CONST_DOUBLE_LOW (b) && CONST_DOUBLE_HIGH (a) == CONST_DOUBLE_HIGH (b)); else @@ -478,6 +523,7 @@ const_fixed_from_fixed_value (FIXED_VALUE_TYPE value, enum machine_mode mode) return lookup_const_fixed (fixed); } +#if TARGET_SUPPORTS_WIDE_INT == 0 /* Constructs double_int from rtx CST. */ double_int @@ -497,17 +543,61 @@ rtx_to_double_int (const_rtx cst) return r; } +#endif +#if TARGET_SUPPORTS_WIDE_INT +/* Determine whether WIDE_INT, already exists in the hash table. If + so, return its counterpart; otherwise add it to the hash table and + return it. 
*/ + +static rtx +lookup_const_wide_int (rtx wint) +{ + void **slot = htab_find_slot (const_wide_int_htab, wint, INSERT); + if (*slot == 0) + *slot = wint; -/* Return a CONST_DOUBLE or CONST_INT for a value specified as - a double_int. */ + return (rtx) *slot; +} +#endif +/* V contains a wide_int. A CONST_INT or CONST_WIDE_INT (if + TARGET_SUPPORTS_WIDE_INT is defined) or CONST_DOUBLE if + TARGET_SUPPORTS_WIDE_INT is not defined is produced based on the + number of HOST_WIDE_INTs that are necessary to represent the value + in compact form. */ rtx -immed_double_int_const (double_int i, enum machine_mode mode) +immed_wide_int_const (const wide_int &v, enum machine_mode mode) { - return immed_double_const (i.low, i.high, mode); + unsigned int len = v.get_len (); + + if (len < 2) + return gen_int_mode (v.elt (0), mode); + + gcc_assert (GET_MODE_PRECISION (mode) == v.get_precision ()); + gcc_assert (GET_MODE_BITSIZE (mode) == v.get_bitsize ()); + +#if TARGET_SUPPORTS_WIDE_INT + { + rtx value = const_wide_int_alloc (len); + unsigned int i; + + /* It is so tempting to just put the mode in here. Must control + myself ... */ + PUT_MODE (value, VOIDmode); + HWI_PUT_NUM_ELEM (CONST_WIDE_INT_VEC (value), len); + + for (i = 0; i < len; i++) + CONST_WIDE_INT_ELT (value, i) = v.elt (i); + + return lookup_const_wide_int (value); + } +#else + return immed_double_const (v.elt (0), v.elt (1), mode); +#endif } +#if TARGET_SUPPORTS_WIDE_INT == 0 /* Return a CONST_DOUBLE or CONST_INT for a value specified as a pair of ints: I0 is the low-order word and I1 is the high-order word. 
For values that are larger than HOST_BITS_PER_DOUBLE_INT, the @@ -559,6 +649,7 @@ immed_double_const (HOST_WIDE_INT i0, HOST_WIDE_INT i1, enum machine_mode mode) return lookup_const_double (value); } +#endif rtx gen_rtx_REG (enum machine_mode mode, unsigned int regno) @@ -5626,11 +5717,15 @@ init_emit_once (void) enum machine_mode mode; enum machine_mode double_mode; - /* Initialize the CONST_INT, CONST_DOUBLE, CONST_FIXED, and memory attribute - hash tables. */ + /* Initialize the CONST_INT, CONST_WIDE_INT, CONST_DOUBLE, + CONST_FIXED, and memory attribute hash tables. */ const_int_htab = htab_create_ggc (37, const_int_htab_hash, const_int_htab_eq, NULL); +#if TARGET_SUPPORTS_WIDE_INT + const_wide_int_htab = htab_create_ggc (37, const_wide_int_htab_hash, + const_wide_int_htab_eq, NULL); +#endif const_double_htab = htab_create_ggc (37, const_double_htab_hash, const_double_htab_eq, NULL); diff --git a/gcc/explow.c b/gcc/explow.c index 08a6653..c154472 100644 --- a/gcc/explow.c +++ b/gcc/explow.c @@ -95,38 +95,9 @@ plus_constant (enum machine_mode mode, rtx x, HOST_WIDE_INT c) switch (code) { - case CONST_INT: - if (GET_MODE_BITSIZE (mode) > HOST_BITS_PER_WIDE_INT) - { - double_int di_x = double_int::from_shwi (INTVAL (x)); - double_int di_c = double_int::from_shwi (c); - - bool overflow; - double_int v = di_x.add_with_sign (di_c, false, &overflow); - if (overflow) - gcc_unreachable (); - - return immed_double_int_const (v, VOIDmode); - } - - return GEN_INT (INTVAL (x) + c); - - case CONST_DOUBLE: - { - double_int di_x = double_int::from_pair (CONST_DOUBLE_HIGH (x), - CONST_DOUBLE_LOW (x)); - double_int di_c = double_int::from_shwi (c); - - bool overflow; - double_int v = di_x.add_with_sign (di_c, false, &overflow); - if (overflow) - /* Sorry, we have no way to represent overflows this wide. - To fix, add constant support wider than CONST_DOUBLE. 
*/ - gcc_assert (GET_MODE_BITSIZE (mode) <= HOST_BITS_PER_DOUBLE_INT); - - return immed_double_int_const (v, VOIDmode); - } - + CASE_CONST_SCALAR_INT: + return immed_wide_int_const (wide_int::from_rtx (x, mode) + + wide_int::from_shwi (c, mode), mode); case MEM: /* If this is a reference to the constant pool, try replacing it with a reference to a new constant. If the resulting address isn't diff --git a/gcc/expmed.c b/gcc/expmed.c index 954a360..a1b7fb4 100644 --- a/gcc/expmed.c +++ b/gcc/expmed.c @@ -55,7 +55,6 @@ static void store_split_bit_field (rtx, unsigned HOST_WIDE_INT, static rtx extract_fixed_bit_field (enum machine_mode, rtx, unsigned HOST_WIDE_INT, unsigned HOST_WIDE_INT, rtx, int, bool); -static rtx mask_rtx (enum machine_mode, int, int, int); static rtx lshift_value (enum machine_mode, rtx, int, int); static rtx extract_split_bit_field (rtx, unsigned HOST_WIDE_INT, unsigned HOST_WIDE_INT, int); @@ -63,6 +62,18 @@ static void do_cmp_and_jump (rtx, rtx, enum rtx_code, enum machine_mode, rtx); static rtx expand_smod_pow2 (enum machine_mode, rtx, HOST_WIDE_INT); static rtx expand_sdiv_pow2 (enum machine_mode, rtx, HOST_WIDE_INT); +/* Return a constant integer mask value of mode MODE with BITSIZE ones + followed by BITPOS zeros, or the complement of that if COMPLEMENT. + The mask is truncated if necessary to the width of mode MODE. The + mask is zero-extended if BITSIZE+BITPOS is too small for MODE. */ + +static inline rtx +mask_rtx (enum machine_mode mode, int bitpos, int bitsize, bool complement) +{ + return immed_wide_int_const + (wide_int::shifted_mask (bitpos, bitsize, complement, mode), mode); +} + /* Test whether a value is zero of a power of two. 
*/ #define EXACT_POWER_OF_2_OR_ZERO_P(x) (((x) & ((x) - 1)) == 0) @@ -1831,39 +1842,16 @@ extract_fixed_bit_field (enum machine_mode tmode, rtx op0, return expand_shift (RSHIFT_EXPR, mode, op0, GET_MODE_BITSIZE (mode) - bitsize, target, 0); } -\f -/* Return a constant integer (CONST_INT or CONST_DOUBLE) mask value - of mode MODE with BITSIZE ones followed by BITPOS zeros, or the - complement of that if COMPLEMENT. The mask is truncated if - necessary to the width of mode MODE. The mask is zero-extended if - BITSIZE+BITPOS is too small for MODE. */ - -static rtx -mask_rtx (enum machine_mode mode, int bitpos, int bitsize, int complement) -{ - double_int mask; - - mask = double_int::mask (bitsize); - mask = mask.llshift (bitpos, HOST_BITS_PER_DOUBLE_INT); - - if (complement) - mask = ~mask; - - return immed_double_int_const (mask, mode); -} - -/* Return a constant integer (CONST_INT or CONST_DOUBLE) rtx with the value - VALUE truncated to BITSIZE bits and then shifted left BITPOS bits. */ +/* Return a constant integer rtx with the value VALUE truncated to + BITSIZE bits and then shifted left BITPOS bits. */ static rtx lshift_value (enum machine_mode mode, rtx value, int bitpos, int bitsize) { - double_int val; - - val = double_int::from_uhwi (INTVAL (value)).zext (bitsize); - val = val.llshift (bitpos, HOST_BITS_PER_DOUBLE_INT); - - return immed_double_int_const (val, mode); + return + immed_wide_int_const (wide_int::from_rtx (value, mode) + .zext (bitsize) + .lshift (bitpos, wide_int::NONE), mode); } \f /* Extract a bit field that is split across two words @@ -3068,34 +3056,41 @@ expand_mult (enum machine_mode mode, rtx op0, rtx op1, rtx target, only if the constant value exactly fits in an `unsigned int' without any truncation. This means that multiplying by negative values does not work; results are off by 2^32 on a 32 bit machine. 
*/ - if (CONST_INT_P (scalar_op1)) { coeff = INTVAL (scalar_op1); is_neg = coeff < 0; } +#if TARGET_SUPPORTS_WIDE_INT + else if (CONST_WIDE_INT_P (scalar_op1)) +#else else if (CONST_DOUBLE_AS_INT_P (scalar_op1)) +#endif { - /* If we are multiplying in DImode, it may still be a win - to try to work with shifts and adds. */ - if (CONST_DOUBLE_HIGH (scalar_op1) == 0 - && CONST_DOUBLE_LOW (scalar_op1) > 0) + int p = GET_MODE_PRECISION (mode); + wide_int val = wide_int::from_rtx (scalar_op1, mode); + int shift = val.exact_log2 (); + /* Perfect power of 2. */ + is_neg = false; + if (shift > 0) { - coeff = CONST_DOUBLE_LOW (scalar_op1); - is_neg = false; + /* Do the shift count truncation against the bitsize, not + the precision. See the comment above + wide-int.c:trunc_shift for details. */ + if (SHIFT_COUNT_TRUNCATED) + shift &= GET_MODE_BITSIZE (mode) - 1; + /* We could consider adding just a move of 0 to target + if the shift >= p */ + if (shift < p) + return expand_shift (LSHIFT_EXPR, mode, op0, + shift, target, unsignedp); + /* Any positive number that fits in a word. */ + coeff = CONST_WIDE_INT_ELT (scalar_op1, 0); } - else if (CONST_DOUBLE_LOW (scalar_op1) == 0) + else if (val.sign_mask () == 0) { - coeff = CONST_DOUBLE_HIGH (scalar_op1); - if (EXACT_POWER_OF_2_OR_ZERO_P (coeff)) - { - int shift = floor_log2 (coeff) + HOST_BITS_PER_WIDE_INT; - if (shift < HOST_BITS_PER_DOUBLE_INT - 1 - || mode_bitsize <= HOST_BITS_PER_DOUBLE_INT) - return expand_shift (LSHIFT_EXPR, mode, op0, - shift, target, unsignedp); - } - goto skip_synth; + /* Any positive number that fits in a word. 
*/ + coeff = CONST_WIDE_INT_ELT (scalar_op1, 0); } else goto skip_synth; @@ -3585,9 +3580,10 @@ expmed_mult_highpart (enum machine_mode mode, rtx op0, rtx op1, static rtx expand_smod_pow2 (enum machine_mode mode, rtx op0, HOST_WIDE_INT d) { - unsigned HOST_WIDE_INT masklow, maskhigh; rtx result, temp, shift, label; int logd; + wide_int mask; + int prec = GET_MODE_PRECISION (mode); logd = floor_log2 (d); result = gen_reg_rtx (mode); @@ -3600,8 +3596,8 @@ expand_smod_pow2 (enum machine_mode mode, rtx op0, HOST_WIDE_INT d) mode, 0, -1); if (signmask) { + HOST_WIDE_INT masklow = ((HOST_WIDE_INT) 1 << logd) - 1; signmask = force_reg (mode, signmask); - masklow = ((HOST_WIDE_INT) 1 << logd) - 1; shift = GEN_INT (GET_MODE_BITSIZE (mode) - logd); /* Use the rtx_cost of a LSHIFTRT instruction to determine @@ -3646,19 +3642,11 @@ expand_smod_pow2 (enum machine_mode mode, rtx op0, HOST_WIDE_INT d) modulus. By including the signbit in the operation, many targets can avoid an explicit compare operation in the following comparison against zero. 
*/ - - masklow = ((HOST_WIDE_INT) 1 << logd) - 1; - if (GET_MODE_BITSIZE (mode) <= HOST_BITS_PER_WIDE_INT) - { - masklow |= (HOST_WIDE_INT) -1 << (GET_MODE_BITSIZE (mode) - 1); - maskhigh = -1; - } - else - maskhigh = (HOST_WIDE_INT) -1 - << (GET_MODE_BITSIZE (mode) - HOST_BITS_PER_WIDE_INT - 1); + mask = wide_int::mask (logd, false, mode); + mask = mask.set_bit (prec - 1); temp = expand_binop (mode, and_optab, op0, - immed_double_const (masklow, maskhigh, mode), + immed_wide_int_const (mask, mode), result, 1, OPTAB_LIB_WIDEN); if (temp != result) emit_move_insn (result, temp); @@ -3668,10 +3656,10 @@ expand_smod_pow2 (enum machine_mode mode, rtx op0, HOST_WIDE_INT d) temp = expand_binop (mode, sub_optab, result, const1_rtx, result, 0, OPTAB_LIB_WIDEN); - masklow = (HOST_WIDE_INT) -1 << logd; - maskhigh = -1; + + mask = wide_int::mask (logd, true, mode); temp = expand_binop (mode, ior_optab, temp, - immed_double_const (masklow, maskhigh, mode), + immed_wide_int_const (mask, mode), result, 1, OPTAB_LIB_WIDEN); temp = expand_binop (mode, add_optab, temp, const1_rtx, result, 0, OPTAB_LIB_WIDEN); @@ -4925,8 +4913,12 @@ make_tree (tree type, rtx x) return t; } + case CONST_WIDE_INT: + t = wide_int_to_tree (type, wide_int::from_rtx (x, TYPE_MODE (type))); + return t; + case CONST_DOUBLE: - if (GET_MODE (x) == VOIDmode) + if (TARGET_SUPPORTS_WIDE_INT == 0 && GET_MODE (x) == VOIDmode) t = build_int_cst_wide (type, CONST_DOUBLE_LOW (x), CONST_DOUBLE_HIGH (x)); else diff --git a/gcc/expr.c b/gcc/expr.c index 08c5c9d..5478b83 100644 --- a/gcc/expr.c +++ b/gcc/expr.c @@ -710,23 +710,23 @@ convert_modes (enum machine_mode mode, enum machine_mode oldmode, rtx x, int uns if (mode == oldmode) return x; - /* There is one case that we must handle specially: If we are converting - a CONST_INT into a mode whose size is twice HOST_BITS_PER_WIDE_INT and - we are to interpret the constant as unsigned, gen_lowpart will do - the wrong if the constant appears negative. 
What we want to do is - make the high-order word of the constant zero, not all ones. */ + /* There is one case that we must handle specially: If we are + converting a CONST_INT into a mode whose size is larger than + HOST_BITS_PER_WIDE_INT and we are to interpret the constant as + unsigned, gen_lowpart will do the wrong thing if the constant + appears negative. What we want to do is make the high-order word + of the constant zero, not all ones. */ if (unsignedp && GET_MODE_CLASS (mode) == MODE_INT - && GET_MODE_BITSIZE (mode) == HOST_BITS_PER_DOUBLE_INT + && GET_MODE_BITSIZE (mode) > HOST_BITS_PER_WIDE_INT && CONST_INT_P (x) && INTVAL (x) < 0) { - double_int val = double_int::from_uhwi (INTVAL (x)); - + HOST_WIDE_INT val = INTVAL (x); /* We need to zero extend VAL. */ if (oldmode != VOIDmode) - val = val.zext (GET_MODE_BITSIZE (oldmode)); + val &= GET_MODE_MASK (oldmode); - return immed_double_int_const (val, mode); + return immed_wide_int_const (wide_int::from_uhwi (val, mode), mode); } /* We can do this with a gen_lowpart if both desired and current modes @@ -738,7 +738,11 @@ convert_modes (enum machine_mode mode, enum machine_mode oldmode, rtx x, int uns && GET_MODE_PRECISION (mode) <= HOST_BITS_PER_WIDE_INT) || (GET_MODE_CLASS (mode) == MODE_INT && GET_MODE_CLASS (oldmode) == MODE_INT - && (CONST_DOUBLE_AS_INT_P (x) +#if TARGET_SUPPORTS_WIDE_INT + && (CONST_WIDE_INT_P (x) +#else + && (CONST_DOUBLE_AS_INT_P (x) +#endif || (GET_MODE_PRECISION (mode) <= GET_MODE_PRECISION (oldmode) && ((MEM_P (x) && ! MEM_VOLATILE_P (x) && direct_load[(int) mode]) @@ -1743,6 +1747,7 @@ emit_group_load_1 (rtx *tmps, rtx dst, rtx orig_src, tree type, int ssize) { rtx first, second; + /* TODO: const_wide_int can have sizes other than this... 
*/ gcc_assert (2 * len == ssize); split_double (src, &first, &second); if (i) @@ -5239,10 +5244,10 @@ store_expr (tree exp, rtx target, int call_param_p, bool nontemporal) &alt_rtl); } - /* If TEMP is a VOIDmode constant and the mode of the type of EXP is not - the same as that of TARGET, adjust the constant. This is needed, for - example, in case it is a CONST_DOUBLE and we want only a word-sized - value. */ + /* If TEMP is a VOIDmode constant and the mode of the type of EXP is + not the same as that of TARGET, adjust the constant. This is + needed, for example, in case it is a CONST_DOUBLE or + CONST_WIDE_INT and we want only a word-sized value. */ if (CONSTANT_P (temp) && GET_MODE (temp) == VOIDmode && TREE_CODE (exp) != ERROR_MARK && GET_MODE (target) != TYPE_MODE (TREE_TYPE (exp))) @@ -7741,11 +7746,12 @@ expand_constructor (tree exp, rtx target, enum expand_modifier modifier, /* All elts simple constants => refer to a constant in memory. But if this is a non-BLKmode mode, let it store a field at a time - since that should make a CONST_INT or CONST_DOUBLE when we - fold. Likewise, if we have a target we can use, it is best to - store directly into the target unless the type is large enough - that memcpy will be used. If we are making an initializer and - all operands are constant, put it in memory as well. + since that should make a CONST_INT, CONST_WIDE_INT or + CONST_DOUBLE when we fold. Likewise, if we have a target we can + use, it is best to store directly into the target unless the type + is large enough that memcpy will be used. If we are making an + initializer and all operands are constant, put it in memory as + well. FIXME: Avoid trying to fill vector constructors piece-meal. 
Output them with output_constant_def below unless we're sure @@ -8214,17 +8220,18 @@ expand_expr_real_2 (sepops ops, rtx target, enum machine_mode tmode, && TREE_CONSTANT (treeop1)) { rtx constant_part; + HOST_WIDE_INT wc; + enum machine_mode wmode = TYPE_MODE (TREE_TYPE (treeop1)); op1 = expand_expr (treeop1, subtarget, VOIDmode, EXPAND_SUM); - /* Use immed_double_const to ensure that the constant is + /* Use wide_int::from_shwi to ensure that the constant is truncated according to the mode of OP1, then sign extended to a HOST_WIDE_INT. Using the constant directly can result in non-canonical RTL in a 64x32 cross compile. */ - constant_part - = immed_double_const (TREE_INT_CST_LOW (treeop0), - (HOST_WIDE_INT) 0, - TYPE_MODE (TREE_TYPE (treeop1))); + wc = TREE_INT_CST_LOW (treeop0); + constant_part + = immed_wide_int_const (wide_int::from_shwi (wc, wmode), wmode); op1 = plus_constant (mode, op1, INTVAL (constant_part)); if (modifier != EXPAND_SUM && modifier != EXPAND_INITIALIZER) op1 = force_operand (op1, target); @@ -8236,7 +8243,8 @@ expand_expr_real_2 (sepops ops, rtx target, enum machine_mode tmode, && TREE_CONSTANT (treeop0)) { rtx constant_part; - + HOST_WIDE_INT wc; + enum machine_mode wmode = TYPE_MODE (TREE_TYPE (treeop0)); op0 = expand_expr (treeop0, subtarget, VOIDmode, (modifier == EXPAND_INITIALIZER ? EXPAND_INITIALIZER : EXPAND_SUM)); @@ -8250,14 +8258,13 @@ expand_expr_real_2 (sepops ops, rtx target, enum machine_mode tmode, return simplify_gen_binary (PLUS, mode, op0, op1); goto binop2; } - /* Use immed_double_const to ensure that the constant is + /* Use wide_int::from_shwi to ensure that the constant is truncated according to the mode of OP1, then sign extended to a HOST_WIDE_INT. Using the constant directly can result in non-canonical RTL in a 64x32 cross compile. 
*/ - constant_part - = immed_double_const (TREE_INT_CST_LOW (treeop1), - (HOST_WIDE_INT) 0, - TYPE_MODE (TREE_TYPE (treeop0))); + wc = TREE_INT_CST_LOW (treeop1); + constant_part + = immed_wide_int_const (wide_int::from_shwi (wc, wmode), wmode); op0 = plus_constant (mode, op0, INTVAL (constant_part)); if (modifier != EXPAND_SUM && modifier != EXPAND_INITIALIZER) op0 = force_operand (op0, target); @@ -8759,10 +8766,13 @@ expand_expr_real_2 (sepops ops, rtx target, enum machine_mode tmode, for unsigned bitfield expand this as XOR with a proper constant instead. */ if (reduce_bit_field && TYPE_UNSIGNED (type)) - temp = expand_binop (mode, xor_optab, op0, - immed_double_int_const - (double_int::mask (TYPE_PRECISION (type)), mode), - target, 1, OPTAB_LIB_WIDEN); + { + wide_int mask = wide_int::mask (TYPE_PRECISION (type), false, mode); + + temp = expand_binop (mode, xor_optab, op0, + immed_wide_int_const (mask, mode), + target, 1, OPTAB_LIB_WIDEN); + } else temp = expand_unop (mode, one_cmpl_optab, op0, target, 1); gcc_assert (temp); @@ -9395,9 +9405,8 @@ expand_expr_real_1 (tree exp, rtx target, enum machine_mode tmode, return decl_rtl; case INTEGER_CST: - temp = immed_double_const (TREE_INT_CST_LOW (exp), - TREE_INT_CST_HIGH (exp), mode); - + temp = immed_wide_int_const (wide_int::from_tree (exp), + TYPE_MODE (TREE_TYPE (exp))); return temp; case VECTOR_CST: @@ -9628,8 +9637,9 @@ expand_expr_real_1 (tree exp, rtx target, enum machine_mode tmode, op0 = memory_address_addr_space (address_mode, op0, as); if (!integer_zerop (TREE_OPERAND (exp, 1))) { - rtx off - = immed_double_int_const (mem_ref_offset (exp), address_mode); + wide_int wi = wide_int::from_double_int + (mem_ref_offset (exp), address_mode); + rtx off = immed_wide_int_const (wi, address_mode); op0 = simplify_gen_binary (PLUS, address_mode, op0, off); } op0 = memory_address_addr_space (mode, op0, as); @@ -10507,9 +10517,10 @@ reduce_to_bit_field_precision (rtx exp, rtx target, tree type) } else if 
(TYPE_UNSIGNED (type)) { - rtx mask = immed_double_int_const (double_int::mask (prec), - GET_MODE (exp)); - return expand_and (GET_MODE (exp), exp, mask, target); + enum machine_mode mode = GET_MODE (exp); + rtx mask = immed_wide_int_const + (wide_int::mask (prec, false, mode), mode); + return expand_and (mode, exp, mask, target); } else { @@ -11081,8 +11092,9 @@ const_vector_from_tree (tree exp) RTVEC_ELT (v, i) = CONST_FIXED_FROM_FIXED_VALUE (TREE_FIXED_CST (elt), inner); else - RTVEC_ELT (v, i) = immed_double_int_const (tree_to_double_int (elt), - inner); + RTVEC_ELT (v, i) + = immed_wide_int_const (wide_int::from_tree (elt), + TYPE_MODE (TREE_TYPE (elt))); } return gen_rtx_CONST_VECTOR (mode, v); diff --git a/gcc/final.c b/gcc/final.c index d25b8e0..ae44b00 100644 --- a/gcc/final.c +++ b/gcc/final.c @@ -3799,8 +3799,16 @@ output_addr_const (FILE *file, rtx x) output_addr_const (file, XEXP (x, 0)); break; + case CONST_WIDE_INT: + /* This should be ok for a while. */ + gcc_assert (CONST_WIDE_INT_NUNITS (x) == 2); + fprintf (file, HOST_WIDE_INT_PRINT_DOUBLE_HEX, + (unsigned HOST_WIDE_INT) CONST_WIDE_INT_ELT (x, 1), + (unsigned HOST_WIDE_INT) CONST_WIDE_INT_ELT (x, 0)); + break; + case CONST_DOUBLE: - if (GET_MODE (x) == VOIDmode) + if (CONST_DOUBLE_AS_INT_P (x)) { /* We can use %d if the number is one word and positive. */ if (CONST_DOUBLE_HIGH (x)) diff --git a/gcc/genemit.c b/gcc/genemit.c index 692ef52..7b1e471 100644 --- a/gcc/genemit.c +++ b/gcc/genemit.c @@ -204,6 +204,7 @@ gen_exp (rtx x, enum rtx_code subroutine_type, char *used) case CONST_DOUBLE: case CONST_FIXED: + case CONST_WIDE_INT: /* These shouldn't be written in MD files. Instead, the appropriate routines in varasm.c should be called. 
*/ gcc_unreachable (); diff --git a/gcc/gengenrtl.c b/gcc/gengenrtl.c index 5b5a3ca..1f93dd5 100644 --- a/gcc/gengenrtl.c +++ b/gcc/gengenrtl.c @@ -142,6 +142,7 @@ static int excluded_rtx (int idx) { return ((strcmp (defs[idx].enumname, "CONST_DOUBLE") == 0) + || (strcmp (defs[idx].enumname, "CONST_WIDE_INT") == 0) || (strcmp (defs[idx].enumname, "CONST_FIXED") == 0)); } diff --git a/gcc/gengtype.c b/gcc/gengtype.c index a2eebf2..ff6b125 100644 --- a/gcc/gengtype.c +++ b/gcc/gengtype.c @@ -5440,6 +5440,7 @@ main (int argc, char **argv) POS_HERE (do_scalar_typedef ("REAL_VALUE_TYPE", &pos)); POS_HERE (do_scalar_typedef ("FIXED_VALUE_TYPE", &pos)); POS_HERE (do_scalar_typedef ("double_int", &pos)); + POS_HERE (do_scalar_typedef ("wide_int", &pos)); POS_HERE (do_scalar_typedef ("uint64_t", &pos)); POS_HERE (do_scalar_typedef ("uint8", &pos)); POS_HERE (do_scalar_typedef ("uintptr_t", &pos)); diff --git a/gcc/genpreds.c b/gcc/genpreds.c index 09fc87b..e8a25bc 100644 --- a/gcc/genpreds.c +++ b/gcc/genpreds.c @@ -612,7 +612,7 @@ write_one_predicate_function (struct pred_data *p) add_mode_tests (p); /* A normal predicate can legitimately not look at enum machine_mode - if it accepts only CONST_INTs and/or CONST_DOUBLEs. */ + if it accepts only CONST_INTs and/or CONST_WIDE_INT and/or CONST_DOUBLEs. */ printf ("int\n%s (rtx op, enum machine_mode mode ATTRIBUTE_UNUSED)\n{\n", p->name); write_predicate_stmts (p->exp); @@ -809,8 +809,11 @@ add_constraint (const char *name, const char *regclass, if (is_const_int || is_const_dbl) { enum rtx_code appropriate_code +#if TARGET_SUPPORTS_WIDE_INT + = is_const_int ? CONST_INT : CONST_WIDE_INT; +#else = is_const_int ? CONST_INT : CONST_DOUBLE; - +#endif /* Consider relaxing this requirement in the future. 
*/ if (regclass || GET_CODE (exp) != AND @@ -1074,12 +1077,17 @@ write_tm_constrs_h (void) if (needs_ival) puts (" if (CONST_INT_P (op))\n" " ival = INTVAL (op);"); +#if TARGET_SUPPORTS_WIDE_INT + if (needs_lval || needs_hval) + error ("you can't use lval or hval"); +#else if (needs_hval) puts (" if (GET_CODE (op) == CONST_DOUBLE && mode == VOIDmode)" " hval = CONST_DOUBLE_HIGH (op);"); if (needs_lval) puts (" if (GET_CODE (op) == CONST_DOUBLE && mode == VOIDmode)" " lval = CONST_DOUBLE_LOW (op);"); +#endif if (needs_rval) puts (" if (GET_CODE (op) == CONST_DOUBLE && mode != VOIDmode)" " rval = CONST_DOUBLE_REAL_VALUE (op);"); diff --git a/gcc/gensupport.c b/gcc/gensupport.c index 9b9a03e..638e051 100644 --- a/gcc/gensupport.c +++ b/gcc/gensupport.c @@ -2775,7 +2775,13 @@ static const struct std_pred_table std_preds[] = { {"scratch_operand", false, false, {SCRATCH, REG}}, {"immediate_operand", false, true, {UNKNOWN}}, {"const_int_operand", false, false, {CONST_INT}}, +#if TARGET_SUPPORTS_WIDE_INT + {"const_wide_int_operand", false, false, {CONST_WIDE_INT}}, + {"const_scalar_int_operand", false, false, {CONST_INT, CONST_WIDE_INT}}, + {"const_double_operand", false, false, {CONST_DOUBLE}}, +#else {"const_double_operand", false, false, {CONST_INT, CONST_DOUBLE}}, +#endif {"nonimmediate_operand", false, false, {SUBREG, REG, MEM}}, {"nonmemory_operand", false, true, {SUBREG, REG}}, {"push_operand", false, false, {MEM}}, diff --git a/gcc/optabs.c b/gcc/optabs.c index c1dacf4..c877800 100644 --- a/gcc/optabs.c +++ b/gcc/optabs.c @@ -850,7 +850,8 @@ expand_subword_shift (enum machine_mode op1_mode, optab binoptab, if (CONSTANT_P (op1) || shift_mask >= BITS_PER_WORD) { carries = outof_input; - tmp = immed_double_const (BITS_PER_WORD, 0, op1_mode); + tmp = immed_wide_int_const (wide_int::from_shwi (BITS_PER_WORD, + op1_mode), op1_mode); tmp = simplify_expand_binop (op1_mode, sub_optab, tmp, op1, 0, true, methods); } @@ -865,13 +866,14 @@ expand_subword_shift (enum 
machine_mode op1_mode, optab binoptab, outof_input, const1_rtx, 0, unsignedp, methods); if (shift_mask == BITS_PER_WORD - 1) { - tmp = immed_double_const (-1, -1, op1_mode); + tmp = immed_wide_int_const (wide_int::minus_one (op1_mode), op1_mode); tmp = simplify_expand_binop (op1_mode, xor_optab, op1, tmp, 0, true, methods); } else { - tmp = immed_double_const (BITS_PER_WORD - 1, 0, op1_mode); + tmp = immed_wide_int_const (wide_int::from_shwi (BITS_PER_WORD - 1, + op1_mode), op1_mode); tmp = simplify_expand_binop (op1_mode, sub_optab, tmp, op1, 0, true, methods); } @@ -1034,7 +1036,8 @@ expand_doubleword_shift (enum machine_mode op1_mode, optab binoptab, is true when the effective shift value is less than BITS_PER_WORD. Set SUPERWORD_OP1 to the shift count that should be used to shift OUTOF_INPUT into INTO_TARGET when the condition is false. */ - tmp = immed_double_const (BITS_PER_WORD, 0, op1_mode); + tmp = immed_wide_int_const (wide_int::from_shwi (BITS_PER_WORD, op1_mode), + op1_mode); if (!CONSTANT_P (op1) && shift_mask == BITS_PER_WORD - 1) { /* Set CMP1 to OP1 & BITS_PER_WORD. The result is zero iff OP1 @@ -2884,7 +2887,7 @@ expand_absneg_bit (enum rtx_code code, enum machine_mode mode, const struct real_format *fmt; int bitpos, word, nwords, i; enum machine_mode imode; - double_int mask; + wide_int mask; rtx temp, insns; /* The format has to have a simple sign bit. */ @@ -2920,7 +2923,7 @@ expand_absneg_bit (enum rtx_code code, enum machine_mode mode, nwords = (GET_MODE_BITSIZE (mode) + BITS_PER_WORD - 1) / BITS_PER_WORD; } - mask = double_int_zero.set_bit (bitpos); + mask = wide_int::set_bit_in_zero (bitpos, imode); if (code == ABS) mask = ~mask; @@ -2942,7 +2945,7 @@ expand_absneg_bit (enum rtx_code code, enum machine_mode mode, { temp = expand_binop (imode, code == ABS ? 
and_optab : xor_optab, op0_piece, - immed_double_int_const (mask, imode), + immed_wide_int_const (mask, imode), targ_piece, 1, OPTAB_LIB_WIDEN); if (temp != targ_piece) emit_move_insn (targ_piece, temp); @@ -2960,7 +2963,7 @@ expand_absneg_bit (enum rtx_code code, enum machine_mode mode, { temp = expand_binop (imode, code == ABS ? and_optab : xor_optab, gen_lowpart (imode, op0), - immed_double_int_const (mask, imode), + immed_wide_int_const (mask, imode), gen_lowpart (imode, target), 1, OPTAB_LIB_WIDEN); target = lowpart_subreg_maybe_copy (mode, temp, imode); @@ -3559,7 +3562,7 @@ expand_copysign_absneg (enum machine_mode mode, rtx op0, rtx op1, rtx target, } else { - double_int mask; + wide_int mask; if (GET_MODE_SIZE (mode) <= UNITS_PER_WORD) { @@ -3581,10 +3584,9 @@ expand_copysign_absneg (enum machine_mode mode, rtx op0, rtx op1, rtx target, op1 = operand_subword_force (op1, word, mode); } - mask = double_int_zero.set_bit (bitpos); - + mask = wide_int::set_bit_in_zero (bitpos, imode); sign = expand_binop (imode, and_optab, op1, - immed_double_int_const (mask, imode), + immed_wide_int_const (mask, imode), NULL_RTX, 1, OPTAB_LIB_WIDEN); } @@ -3628,7 +3630,7 @@ expand_copysign_bit (enum machine_mode mode, rtx op0, rtx op1, rtx target, int bitpos, bool op0_is_abs) { enum machine_mode imode; - double_int mask; + wide_int mask, nmask; int word, nwords, i; rtx temp, insns; @@ -3652,7 +3654,7 @@ expand_copysign_bit (enum machine_mode mode, rtx op0, rtx op1, rtx target, nwords = (GET_MODE_BITSIZE (mode) + BITS_PER_WORD - 1) / BITS_PER_WORD; } - mask = double_int_zero.set_bit (bitpos); + mask = wide_int::set_bit_in_zero (bitpos, imode); if (target == 0 || target == op0 @@ -3672,14 +3674,16 @@ expand_copysign_bit (enum machine_mode mode, rtx op0, rtx op1, rtx target, if (i == word) { if (!op0_is_abs) - op0_piece - = expand_binop (imode, and_optab, op0_piece, - immed_double_int_const (~mask, imode), - NULL_RTX, 1, OPTAB_LIB_WIDEN); - + { + nmask = ~mask; + op0_piece + = 
expand_binop (imode, and_optab, op0_piece, + immed_wide_int_const (nmask, imode), + NULL_RTX, 1, OPTAB_LIB_WIDEN); + } op1 = expand_binop (imode, and_optab, operand_subword_force (op1, i, mode), - immed_double_int_const (mask, imode), + immed_wide_int_const (mask, imode), NULL_RTX, 1, OPTAB_LIB_WIDEN); temp = expand_binop (imode, ior_optab, op0_piece, op1, @@ -3699,15 +3703,17 @@ expand_copysign_bit (enum machine_mode mode, rtx op0, rtx op1, rtx target, else { op1 = expand_binop (imode, and_optab, gen_lowpart (imode, op1), - immed_double_int_const (mask, imode), + immed_wide_int_const (mask, imode), NULL_RTX, 1, OPTAB_LIB_WIDEN); op0 = gen_lowpart (imode, op0); if (!op0_is_abs) - op0 = expand_binop (imode, and_optab, op0, - immed_double_int_const (~mask, imode), - NULL_RTX, 1, OPTAB_LIB_WIDEN); - + { + nmask = ~mask; + op0 = expand_binop (imode, and_optab, op0, + immed_wide_int_const (nmask, imode), + NULL_RTX, 1, OPTAB_LIB_WIDEN); + } temp = expand_binop (imode, ior_optab, op0, op1, gen_lowpart (imode, target), 1, OPTAB_LIB_WIDEN); target = lowpart_subreg_maybe_copy (mode, temp, imode); diff --git a/gcc/postreload.c b/gcc/postreload.c index daabaa1..34e8e61 100644 --- a/gcc/postreload.c +++ b/gcc/postreload.c @@ -295,27 +295,25 @@ reload_cse_simplify_set (rtx set, rtx insn) #ifdef LOAD_EXTEND_OP if (extend_op != UNKNOWN) { - HOST_WIDE_INT this_val; + wide_int result; - /* ??? I'm lazy and don't wish to handle CONST_DOUBLE. Other - constants, such as SYMBOL_REF, cannot be extended. */ - if (!CONST_INT_P (this_rtx)) + if (!CONST_SCALAR_INT_P (this_rtx)) continue; - this_val = INTVAL (this_rtx); switch (extend_op) { case ZERO_EXTEND: - this_val &= GET_MODE_MASK (GET_MODE (src)); + result = (wide_int::from_rtx (this_rtx, GET_MODE (src)) + .zext (word_mode)); break; case SIGN_EXTEND: - /* ??? In theory we're already extended. 
*/ - if (this_val == trunc_int_for_mode (this_val, GET_MODE (src))) - break; + result = (wide_int::from_rtx (this_rtx, GET_MODE (src)) + .sext (word_mode)); + break; default: gcc_unreachable (); } - this_rtx = GEN_INT (this_val); + this_rtx = immed_wide_int_const (result, GET_MODE (src)); } #endif this_cost = set_src_cost (this_rtx, speed); diff --git a/gcc/print-rtl.c b/gcc/print-rtl.c index 3793109..1f43de1 100644 --- a/gcc/print-rtl.c +++ b/gcc/print-rtl.c @@ -612,6 +612,12 @@ print_rtx (const_rtx in_rtx) fprintf (outfile, " [%s]", s); } break; + + case CONST_WIDE_INT: + if (! flag_simple) + fprintf (outfile, " "); + hwivec_output_hex (outfile, CONST_WIDE_INT_VEC (in_rtx)); + break; #endif case CODE_LABEL: diff --git a/gcc/read-rtl.c b/gcc/read-rtl.c index cd58b1f..a73a41b 100644 --- a/gcc/read-rtl.c +++ b/gcc/read-rtl.c @@ -806,6 +806,29 @@ validate_const_int (const char *string) fatal_with_file_and_line ("invalid decimal constant \"%s\"\n", string); } +static void +validate_const_wide_int (const char *string) +{ + const char *cp; + int valid = 1; + + cp = string; + while (*cp && ISSPACE (*cp)) + cp++; + /* Skip the leading 0x. */ + if (cp[0] == '0' && cp[1] == 'x') + cp += 2; + else + valid = 0; + if (*cp == 0) + valid = 0; + for (; *cp; cp++) + if (! ISXDIGIT (*cp)) + valid = 0; + if (!valid) + fatal_with_file_and_line ("invalid hex constant \"%s\"\n", string); +} + /* Record that PTR uses iterator ITERATOR. */ static void @@ -1319,6 +1342,56 @@ read_rtx_code (const char *code_name) gcc_unreachable (); } + if (CONST_WIDE_INT_P (return_rtx)) + { + read_name (&name); + validate_const_wide_int (name.string); + { + hwivec hwiv; + const char *s = name.string; + int len; + int index = 0; + int gs = HOST_BITS_PER_WIDE_INT/4; + int pos; + char * buf = XALLOCAVEC (char, gs + 1); + unsigned HOST_WIDE_INT wi; + int wlen; + + /* Skip the leading spaces. */ + while (*s && ISSPACE (*s)) + s++; + + /* Skip the leading 0x. 
*/ + gcc_assert (s[0] == '0'); + gcc_assert (s[1] == 'x'); + s += 2; + + len = strlen (s); + pos = len - gs; + wlen = (len + gs - 1) / gs; /* Number of words needed */ + + return_rtx = const_wide_int_alloc (wlen); + + hwiv = CONST_WIDE_INT_VEC (return_rtx); + while (pos > 0) + { +#if HOST_BITS_PER_WIDE_INT == 64 + sscanf (s + pos, "%16" HOST_WIDE_INT_PRINT "x", &wi); +#else + sscanf (s + pos, "%8" HOST_WIDE_INT_PRINT "x", &wi); +#endif + XHWIVEC_ELT (hwiv, index++) = wi; + pos -= gs; + } + strncpy (buf, s, gs - pos); + buf [gs - pos] = 0; + sscanf (buf, "%" HOST_WIDE_INT_PRINT "x", &wi); + XHWIVEC_ELT (hwiv, index++) = wi; + /* TODO: After reading, do we want to canonicalize with: + value = lookup_const_wide_int (value); ? */ + } + } + c = read_skip_spaces (); /* Syntactic sugar for AND and IOR, allowing Lisp-like arbitrary number of arguments for them. */ diff --git a/gcc/recog.c b/gcc/recog.c index ed359f6..05e08e9 100644 --- a/gcc/recog.c +++ b/gcc/recog.c @@ -1141,7 +1141,7 @@ immediate_operand (rtx op, enum machine_mode mode) : mode, op)); } -/* Returns 1 if OP is an operand that is a CONST_INT. */ +/* Returns 1 if OP is an operand that is a CONST_INT of mode MODE. */ int const_int_operand (rtx op, enum machine_mode mode) @@ -1156,8 +1156,64 @@ const_int_operand (rtx op, enum machine_mode mode) return 1; } +#if TARGET_SUPPORTS_WIDE_INT +/* Returns 1 if OP is an operand that is a CONST_INT or CONST_WIDE_INT + of mode MODE. */ +int +const_scalar_int_operand (rtx op, enum machine_mode mode) +{ + if (!CONST_SCALAR_INT_P (op)) + return 0; + + if (CONST_INT_P (op)) + return const_int_operand (op, mode); + + if (mode != VOIDmode) + { + int prec = GET_MODE_PRECISION (mode); + int bitsize = GET_MODE_BITSIZE (mode); + + if (CONST_WIDE_INT_NUNITS (op) * HOST_BITS_PER_WIDE_INT > bitsize) + return 0; + + if (prec == bitsize) + return 1; + else + { + /* Multiword partial int. 
*/ + HOST_WIDE_INT x + = CONST_WIDE_INT_ELT (op, CONST_WIDE_INT_NUNITS (op) - 1); + return (wide_int::sext (x, prec & (HOST_BITS_PER_WIDE_INT - 1)) + == x); + } + } + return 1; +} + +/* Returns 1 if OP is an operand that is a CONST_WIDE_INT of mode + MODE. This most likely is not as useful as + const_scalar_int_operand, but is here for consistency. */ +int +const_wide_int_operand (rtx op, enum machine_mode mode) +{ + if (!CONST_WIDE_INT_P (op)) + return 0; + + return const_scalar_int_operand (op, mode); +} + /* Returns 1 if OP is an operand that is a constant integer or constant - floating-point number. */ + floating-point number of MODE. */ + +int +const_double_operand (rtx op, enum machine_mode mode) +{ + return (GET_CODE (op) == CONST_DOUBLE) + && (GET_MODE (op) == mode || mode == VOIDmode); +} +#else +/* Returns 1 if OP is an operand that is a constant integer or constant + floating-point number of MODE. */ int const_double_operand (rtx op, enum machine_mode mode) @@ -1173,8 +1229,9 @@ const_double_operand (rtx op, enum machine_mode mode) && (mode == VOIDmode || GET_MODE (op) == mode || GET_MODE (op) == VOIDmode)); } - -/* Return 1 if OP is a general operand that is not an immediate operand. */ +#endif +/* Return 1 if OP is a general operand that is not an immediate + operand of mode MODE. */ int nonimmediate_operand (rtx op, enum machine_mode mode) @@ -1182,7 +1239,8 @@ nonimmediate_operand (rtx op, enum machine_mode mode) return (general_operand (op, mode) && ! CONSTANT_P (op)); } -/* Return 1 if OP is a register reference or immediate value of mode MODE. */ +/* Return 1 if OP is a register reference or immediate value of mode + MODE. 
*/ int nonmemory_operand (rtx op, enum machine_mode mode) diff --git a/gcc/rtl.c b/gcc/rtl.c index bc49fc8..137da07 100644 --- a/gcc/rtl.c +++ b/gcc/rtl.c @@ -109,7 +109,7 @@ const enum rtx_class rtx_class[NUM_RTX_CODE] = { const unsigned char rtx_code_size[NUM_RTX_CODE] = { #define DEF_RTL_EXPR(ENUM, NAME, FORMAT, CLASS) \ (((ENUM) == CONST_INT || (ENUM) == CONST_DOUBLE \ - || (ENUM) == CONST_FIXED) \ + || (ENUM) == CONST_FIXED || (ENUM) == CONST_WIDE_INT) \ ? RTX_HDR_SIZE + (sizeof FORMAT - 1) * sizeof (HOST_WIDE_INT) \ : RTX_HDR_SIZE + (sizeof FORMAT - 1) * sizeof (rtunion)), @@ -181,18 +181,24 @@ shallow_copy_rtvec (rtvec vec) unsigned int rtx_size (const_rtx x) { + if (CONST_WIDE_INT_P (x)) + return (RTX_HDR_SIZE + + sizeof (struct hwivec_def) + + ((CONST_WIDE_INT_NUNITS (x) - 1) + * sizeof (HOST_WIDE_INT))); if (GET_CODE (x) == SYMBOL_REF && SYMBOL_REF_HAS_BLOCK_INFO_P (x)) return RTX_HDR_SIZE + sizeof (struct block_symbol); return RTX_CODE_SIZE (GET_CODE (x)); } -/* Allocate an rtx of code CODE. The CODE is stored in the rtx; - all the rest is initialized to zero. */ +/* Allocate an rtx of code CODE with EXTRA bytes in it. The CODE is + stored in the rtx; all the rest is initialized to zero. */ rtx -rtx_alloc_stat (RTX_CODE code MEM_STAT_DECL) +rtx_alloc_stat_v (RTX_CODE code MEM_STAT_DECL, int extra) { - rtx rt = ggc_alloc_rtx_def_stat (RTX_CODE_SIZE (code) PASS_MEM_STAT); + rtx rt = ggc_alloc_rtx_def_stat (RTX_CODE_SIZE (code) + extra + PASS_MEM_STAT); /* We want to clear everything up to the FLD array. Normally, this is one int, but we don't want to assume that and it isn't very @@ -210,6 +216,29 @@ rtx_alloc_stat (RTX_CODE code MEM_STAT_DECL) return rt; } +/* Allocate an rtx of code CODE. The CODE is stored in the rtx; + all the rest is initialized to zero. */ + +rtx +rtx_alloc_stat (RTX_CODE code MEM_STAT_DECL) +{ + return rtx_alloc_stat_v (code PASS_MEM_STAT, 0); +} + +/* Write the wide constant OP0 to OUTFILE. 
*/ + +void +hwivec_output_hex (FILE *outfile, const_hwivec op0) +{ + int i = HWI_GET_NUM_ELEM (op0); + gcc_assert (i > 0); + if (XHWIVEC_ELT (op0, i-1) == 0) + fprintf (outfile, "0x"); + fprintf (outfile, HOST_WIDE_INT_PRINT_HEX, XHWIVEC_ELT (op0, --i)); + while (--i >= 0) + fprintf (outfile, HOST_WIDE_INT_PRINT_PADDED_HEX, XHWIVEC_ELT (op0, i)); +} + \f /* Return true if ORIG is a sharable CONST. */ @@ -424,7 +453,6 @@ rtx_equal_p_cb (const_rtx x, const_rtx y, rtx_equal_p_callback_function cb) if (XWINT (x, i) != XWINT (y, i)) return 0; break; - case 'n': case 'i': if (XINT (x, i) != XINT (y, i)) @@ -642,6 +670,10 @@ iterative_hash_rtx (const_rtx x, hashval_t hash) return iterative_hash_object (i, hash); case CONST_INT: return iterative_hash_object (INTVAL (x), hash); + case CONST_WIDE_INT: + for (i = 0; i < CONST_WIDE_INT_NUNITS (x); i++) + hash = iterative_hash_object (CONST_WIDE_INT_ELT (x, i), hash); + return hash; case SYMBOL_REF: if (XSTR (x, 0)) return iterative_hash (XSTR (x, 0), strlen (XSTR (x, 0)) + 1, @@ -807,6 +839,16 @@ rtl_check_failed_block_symbol (const char *file, int line, const char *func) /* XXX Maybe print the vector? */ void +hwivec_check_failed_bounds (const_hwivec r, int n, const char *file, int line, + const char *func) +{ + internal_error + ("RTL check: access of hwi elt %d of vector with last elt %d in %s, at %s:%d", + n, GET_NUM_ELEM (r) - 1, func, trim_filename (file), line); +} + +/* XXX Maybe print the vector? 
*/ +void rtvec_check_failed_bounds (const_rtvec r, int n, const char *file, int line, const char *func) { diff --git a/gcc/rtl.def b/gcc/rtl.def index d6c881f..8fae62f 100644 --- a/gcc/rtl.def +++ b/gcc/rtl.def @@ -317,6 +317,9 @@ DEF_RTL_EXPR(TRAP_IF, "trap_if", "ee", RTX_EXTRA) /* numeric integer constant */ DEF_RTL_EXPR(CONST_INT, "const_int", "w", RTX_CONST_OBJ) +/* numeric integer constant */ +DEF_RTL_EXPR(CONST_WIDE_INT, "const_wide_int", "", RTX_CONST_OBJ) + /* fixed-point constant */ DEF_RTL_EXPR(CONST_FIXED, "const_fixed", "www", RTX_CONST_OBJ) diff --git a/gcc/rtl.h b/gcc/rtl.h index 93a64f4..58c5902 100644 --- a/gcc/rtl.h +++ b/gcc/rtl.h @@ -28,6 +28,7 @@ along with GCC; see the file COPYING3. If not see #include "fixed-value.h" #include "alias.h" #include "hashtab.h" +#include "wide-int.h" #include "flags.h" /* Value used by some passes to "recognize" noop moves as valid @@ -249,6 +250,14 @@ struct GTY(()) object_block { vec<rtx, va_gc> *anchors; }; +struct GTY((variable_size)) hwivec_def { + int num_elem; /* number of elements */ + HOST_WIDE_INT elem[1]; +}; + +#define HWI_GET_NUM_ELEM(HWIVEC) ((HWIVEC)->num_elem) +#define HWI_PUT_NUM_ELEM(HWIVEC, NUM) ((HWIVEC)->num_elem = (NUM)) + /* RTL expression ("rtx"). */ struct GTY((chain_next ("RTX_NEXT (&%h)"), @@ -343,6 +352,7 @@ struct GTY((chain_next ("RTX_NEXT (&%h)"), struct block_symbol block_sym; struct real_value rv; struct fixed_value fv; + struct hwivec_def hwiv; } GTY ((special ("rtx_def"), desc ("GET_CODE (&%0)"))) u; }; @@ -381,13 +391,13 @@ struct GTY((chain_next ("RTX_NEXT (&%h)"), for a variable number of things. The principle use is inside PARALLEL expressions. 
*/ +#define NULL_RTVEC (rtvec) 0 + struct GTY((variable_size)) rtvec_def { int num_elem; /* number of elements */ rtx GTY ((length ("%h.num_elem"))) elem[1]; }; -#define NULL_RTVEC (rtvec) 0 - #define GET_NUM_ELEM(RTVEC) ((RTVEC)->num_elem) #define PUT_NUM_ELEM(RTVEC, NUM) ((RTVEC)->num_elem = (NUM)) @@ -397,12 +407,38 @@ struct GTY((variable_size)) rtvec_def { /* Predicate yielding nonzero iff X is an rtx for a memory location. */ #define MEM_P(X) (GET_CODE (X) == MEM) +#if TARGET_SUPPORTS_WIDE_INT + +/* Match CONST_*s that can represent compile-time constant integers. */ +#define CASE_CONST_SCALAR_INT \ + case CONST_INT: \ + case CONST_WIDE_INT + +/* Match CONST_*s for which pointer equality corresponds to value + equality. */ +#define CASE_CONST_UNIQUE \ + case CONST_INT: \ + case CONST_WIDE_INT: \ + case CONST_DOUBLE: \ + case CONST_FIXED + +/* Match all CONST_* rtxes. */ +#define CASE_CONST_ANY \ + case CONST_INT: \ + case CONST_WIDE_INT: \ + case CONST_DOUBLE: \ + case CONST_FIXED: \ + case CONST_VECTOR + +#else + /* Match CONST_*s that can represent compile-time constant integers. */ #define CASE_CONST_SCALAR_INT \ case CONST_INT: \ case CONST_DOUBLE -/* Match CONST_*s for which pointer equality corresponds to value equality. */ +/* Match CONST_*s for which pointer equality corresponds to value +equality. */ #define CASE_CONST_UNIQUE \ case CONST_INT: \ case CONST_DOUBLE: \ @@ -414,10 +450,17 @@ struct GTY((variable_size)) rtvec_def { case CONST_DOUBLE: \ case CONST_FIXED: \ case CONST_VECTOR +#endif + + + /* Predicate yielding nonzero iff X is an rtx for a constant integer. */ #define CONST_INT_P(X) (GET_CODE (X) == CONST_INT) +/* Predicate yielding nonzero iff X is an rtx for a constant integer. */ +#define CONST_WIDE_INT_P(X) (GET_CODE (X) == CONST_WIDE_INT) + /* Predicate yielding nonzero iff X is an rtx for a constant fixed-point. 
*/ #define CONST_FIXED_P(X) (GET_CODE (X) == CONST_FIXED) @@ -430,8 +473,13 @@ struct GTY((variable_size)) rtvec_def { (GET_CODE (X) == CONST_DOUBLE && GET_MODE (X) == VOIDmode) /* Predicate yielding true iff X is an rtx for a integer const. */ +#if TARGET_SUPPORTS_WIDE_INT +#define CONST_SCALAR_INT_P(X) \ + (CONST_INT_P (X) || CONST_WIDE_INT_P (X)) +#else #define CONST_SCALAR_INT_P(X) \ (CONST_INT_P (X) || CONST_DOUBLE_AS_INT_P (X)) +#endif /* Predicate yielding true iff X is an rtx for a double-int. */ #define CONST_DOUBLE_AS_FLOAT_P(X) \ @@ -594,6 +642,13 @@ struct GTY((variable_size)) rtvec_def { __FUNCTION__); \ &_rtx->u.hwint[_n]; })) +#define XHWIVEC_ELT(HWIVEC, I) __extension__ \ +(*({ __typeof (HWIVEC) const _hwivec = (HWIVEC); const int _i = (I); \ + if (_i < 0 || _i >= HWI_GET_NUM_ELEM (_hwivec)) \ + hwivec_check_failed_bounds (_hwivec, _i, __FILE__, __LINE__, \ + __FUNCTION__); \ + &_hwivec->elem[_i]; })) + #define XCWINT(RTX, N, C) __extension__ \ (*({ __typeof (RTX) const _rtx = (RTX); \ if (GET_CODE (_rtx) != (C)) \ @@ -630,6 +685,11 @@ struct GTY((variable_size)) rtvec_def { __FUNCTION__); \ &_symbol->u.block_sym; }) +#define HWIVEC_CHECK(RTX,C) __extension__ \ +({ __typeof (RTX) const _symbol = (RTX); \ + RTL_CHECKC1 (_symbol, 0, C); \ + &_symbol->u.hwiv; }) + extern void rtl_check_failed_bounds (const_rtx, int, const char *, int, const char *) ATTRIBUTE_NORETURN; @@ -650,6 +710,9 @@ extern void rtl_check_failed_code_mode (const_rtx, enum rtx_code, enum machine_m ATTRIBUTE_NORETURN; extern void rtl_check_failed_block_symbol (const char *, int, const char *) ATTRIBUTE_NORETURN; +extern void hwivec_check_failed_bounds (const_hwivec, int, const char *, int, + const char *) + ATTRIBUTE_NORETURN; extern void rtvec_check_failed_bounds (const_rtvec, int, const char *, int, const char *) ATTRIBUTE_NORETURN; @@ -662,12 +725,14 @@ extern void rtvec_check_failed_bounds (const_rtvec, int, const char *, int, #define RTL_CHECKC2(RTX, N, C1, C2) ((RTX)->u.fld[N]) 
#define RTVEC_ELT(RTVEC, I) ((RTVEC)->elem[I]) #define XWINT(RTX, N) ((RTX)->u.hwint[N]) +#define XHWIVEC_ELT(HWIVEC, I) ((HWIVEC)->elem[I]) #define XCWINT(RTX, N, C) ((RTX)->u.hwint[N]) #define XCMWINT(RTX, N, C, M) ((RTX)->u.hwint[N]) #define XCNMWINT(RTX, N, C, M) ((RTX)->u.hwint[N]) #define XCNMPRV(RTX, C, M) (&(RTX)->u.rv) #define XCNMPFV(RTX, C, M) (&(RTX)->u.fv) #define BLOCK_SYMBOL_CHECK(RTX) (&(RTX)->u.block_sym) +#define HWIVEC_CHECK(RTX,C) (&(RTX)->u.hwiv) #endif @@ -810,8 +875,8 @@ extern void rtl_check_failed_flag (const char *, const_rtx, const char *, #define XCCFI(RTX, N, C) (RTL_CHECKC1 (RTX, N, C).rt_cfi) #define XCCSELIB(RTX, N, C) (RTL_CHECKC1 (RTX, N, C).rt_cselib) -#define XCVECEXP(RTX, N, M, C) RTVEC_ELT (XCVEC (RTX, N, C), M) -#define XCVECLEN(RTX, N, C) GET_NUM_ELEM (XCVEC (RTX, N, C)) +#define XCVECEXP(RTX, N, M, C) RTVEC_ELT (XCVEC (RTX, N, C), M) +#define XCVECLEN(RTX, N, C) GET_NUM_ELEM (XCVEC (RTX, N, C)) #define XC2EXP(RTX, N, C1, C2) (RTL_CHECKC2 (RTX, N, C1, C2).rt_rtx) \f @@ -1153,9 +1218,19 @@ rhs_regno (const_rtx x) #define INTVAL(RTX) XCWINT(RTX, 0, CONST_INT) #define UINTVAL(RTX) ((unsigned HOST_WIDE_INT) INTVAL (RTX)) +/* For a CONST_WIDE_INT, CONST_WIDE_INT_NUNITS is the number of + elements actually needed to represent the constant. + CONST_WIDE_INT_ELT gets one of the elements. 0 is the least + significant HOST_WIDE_INT. */ +#define CONST_WIDE_INT_VEC(RTX) HWIVEC_CHECK (RTX, CONST_WIDE_INT) +#define CONST_WIDE_INT_NUNITS(RTX) HWI_GET_NUM_ELEM (CONST_WIDE_INT_VEC (RTX)) +#define CONST_WIDE_INT_ELT(RTX, N) XHWIVEC_ELT (CONST_WIDE_INT_VEC (RTX), N) + /* For a CONST_DOUBLE: +#if TARGET_SUPPORTS_WIDE_INT == 0 For a VOIDmode, there are two integers CONST_DOUBLE_LOW is the low-order word and ..._HIGH the high-order. +#endif For a float, there is a REAL_VALUE_TYPE structure, and CONST_DOUBLE_REAL_VALUE(r) is a pointer to it. 
*/ #define CONST_DOUBLE_LOW(r) XCMWINT (r, 0, CONST_DOUBLE, VOIDmode) @@ -1760,6 +1835,12 @@ extern rtx plus_constant (enum machine_mode, rtx, HOST_WIDE_INT); /* In rtl.c */ extern rtx rtx_alloc_stat (RTX_CODE MEM_STAT_DECL); #define rtx_alloc(c) rtx_alloc_stat (c MEM_STAT_INFO) +extern rtx rtx_alloc_stat_v (RTX_CODE MEM_STAT_DECL, int); +#define rtx_alloc_v(c, SZ) rtx_alloc_stat_v (c MEM_STAT_INFO, SZ) +#define const_wide_int_alloc(NWORDS) \ + rtx_alloc_v (CONST_WIDE_INT, \ + (sizeof (struct hwivec_def) \ + + ((NWORDS)-1) * sizeof (HOST_WIDE_INT))) \ extern rtvec rtvec_alloc (int); extern rtvec shallow_copy_rtvec (rtvec); @@ -1816,10 +1897,17 @@ extern void start_sequence (void); extern void push_to_sequence (rtx); extern void push_to_sequence2 (rtx, rtx); extern void end_sequence (void); +#if TARGET_SUPPORTS_WIDE_INT == 0 extern double_int rtx_to_double_int (const_rtx); -extern rtx immed_double_int_const (double_int, enum machine_mode); +#endif +extern void hwivec_output_hex (FILE *, const_hwivec); +#ifndef GENERATOR_FILE +extern rtx immed_wide_int_const (const wide_int &cst, enum machine_mode mode); +#endif +#if TARGET_SUPPORTS_WIDE_INT == 0 extern rtx immed_double_const (HOST_WIDE_INT, HOST_WIDE_INT, enum machine_mode); +#endif /* In loop-iv.c */ diff --git a/gcc/rtlanal.c b/gcc/rtlanal.c index b198685..0fe1d0e 100644 --- a/gcc/rtlanal.c +++ b/gcc/rtlanal.c @@ -3091,6 +3091,8 @@ commutative_operand_precedence (rtx op) /* Constants always come the second operand. Prefer "nice" constants. 
*/ if (code == CONST_INT) return -8; + if (code == CONST_WIDE_INT) + return -8; if (code == CONST_DOUBLE) return -7; if (code == CONST_FIXED) @@ -3103,6 +3105,8 @@ commutative_operand_precedence (rtx op) case RTX_CONST_OBJ: if (code == CONST_INT) return -6; + if (code == CONST_WIDE_INT) + return -6; if (code == CONST_DOUBLE) return -5; if (code == CONST_FIXED) @@ -5289,7 +5293,10 @@ get_address_mode (rtx mem) /* Split up a CONST_DOUBLE or integer constant rtx into two rtx's for single words, storing in *FIRST the word that comes first in memory in the target - and in *SECOND the other. */ + and in *SECOND the other. + + TODO: This function needs to be rewritten to work on any size + integer. */ void split_double (rtx value, rtx *first, rtx *second) @@ -5366,6 +5373,22 @@ split_double (rtx value, rtx *first, rtx *second) } } } + else if (GET_CODE (value) == CONST_WIDE_INT) + { + /* All of this is scary code and needs to be converted to + properly work with any size integer. */ + gcc_assert (CONST_WIDE_INT_NUNITS (value) == 2); + if (WORDS_BIG_ENDIAN) + { + *first = GEN_INT (CONST_WIDE_INT_ELT (value, 1)); + *second = GEN_INT (CONST_WIDE_INT_ELT (value, 0)); + } + else + { + *first = GEN_INT (CONST_WIDE_INT_ELT (value, 0)); + *second = GEN_INT (CONST_WIDE_INT_ELT (value, 1)); + } + } else if (!CONST_DOUBLE_P (value)) { if (WORDS_BIG_ENDIAN) diff --git a/gcc/sched-vis.c b/gcc/sched-vis.c index 98de37e..514b0d8 100644 --- a/gcc/sched-vis.c +++ b/gcc/sched-vis.c @@ -429,6 +429,23 @@ print_value (pretty_printer *pp, const_rtx x, int verbose) pp_scalar (pp, HOST_WIDE_INT_PRINT_HEX, (unsigned HOST_WIDE_INT) INTVAL (x)); break; + + case CONST_WIDE_INT: + { + const char *sep = "<"; + int i; + for (i = CONST_WIDE_INT_NUNITS (x) - 1; i >= 0; i--) + { + pp_string (pp, sep); + sep = ","; + sprintf (tmp, HOST_WIDE_INT_PRINT_HEX, + (unsigned HOST_WIDE_INT) CONST_WIDE_INT_ELT (x, i)); + pp_string (pp, tmp); + } + pp_greater (pp); + } + break; + case CONST_DOUBLE: if (FLOAT_MODE_P 
(GET_MODE (x))) { diff --git a/gcc/sel-sched-ir.c b/gcc/sel-sched-ir.c index 39dc52f..2499eaa 100644 --- a/gcc/sel-sched-ir.c +++ b/gcc/sel-sched-ir.c @@ -1138,10 +1138,10 @@ lhs_and_rhs_separable_p (rtx lhs, rtx rhs) if (lhs == NULL || rhs == NULL) return false; - /* Do not schedule CONST, CONST_INT and CONST_DOUBLE etc as rhs: no point - to use reg, if const can be used. Moreover, scheduling const as rhs may - lead to mode mismatch cause consts don't have modes but they could be - merged from branches where the same const used in different modes. */ + /* Do not schedule constants as rhs: no point to use reg, if const + can be used. Moreover, scheduling const as rhs may lead to mode + mismatch cause consts don't have modes but they could be merged + from branches where the same const used in different modes. */ if (CONSTANT_P (rhs)) return false; diff --git a/gcc/simplify-rtx.c b/gcc/simplify-rtx.c index 3f04b8b..4a03299 100644 --- a/gcc/simplify-rtx.c +++ b/gcc/simplify-rtx.c @@ -86,6 +86,22 @@ mode_signbit_p (enum machine_mode mode, const_rtx x) if (width <= HOST_BITS_PER_WIDE_INT && CONST_INT_P (x)) val = INTVAL (x); +#if TARGET_SUPPORTS_WIDE_INT + else if (CONST_WIDE_INT_P (x)) + { + unsigned int i; + unsigned int elts = CONST_WIDE_INT_NUNITS (x); + if (elts != (width + HOST_BITS_PER_WIDE_INT - 1) / HOST_BITS_PER_WIDE_INT) + return false; + for (i = 0; i < elts - 1; i++) + if (CONST_WIDE_INT_ELT (x, i) != 0) + return false; + val = CONST_WIDE_INT_ELT (x, elts - 1); + width %= HOST_BITS_PER_WIDE_INT; + if (width == 0) + width = HOST_BITS_PER_WIDE_INT; + } +#else else if (width <= HOST_BITS_PER_DOUBLE_INT && CONST_DOUBLE_AS_INT_P (x) && CONST_DOUBLE_LOW (x) == 0) @@ -93,8 +109,9 @@ mode_signbit_p (enum machine_mode mode, const_rtx x) val = CONST_DOUBLE_HIGH (x); width -= HOST_BITS_PER_WIDE_INT; } +#endif else - /* FIXME: We don't yet have a representation for wider modes. */ + /* X is not an integer constant. 
*/ return false; if (width < HOST_BITS_PER_WIDE_INT) @@ -1487,7 +1504,6 @@ simplify_const_unary_operation (enum rtx_code code, enum machine_mode mode, rtx op, enum machine_mode op_mode) { unsigned int width = GET_MODE_PRECISION (mode); - unsigned int op_width = GET_MODE_PRECISION (op_mode); if (code == VEC_DUPLICATE) { @@ -1561,8 +1577,19 @@ simplify_const_unary_operation (enum rtx_code code, enum machine_mode mode, if (CONST_INT_P (op)) lv = INTVAL (op), hv = HWI_SIGN_EXTEND (lv); else +#if TARGET_SUPPORTS_WIDE_INT + { + /* The conversion code to floats really want exactly 2 HWIs. + This needs to be fixed. For now, if the constant is + really big, just return 0 which is safe. */ + if (CONST_WIDE_INT_NUNITS (op) > 2) + return 0; + lv = CONST_WIDE_INT_ELT (op, 0); + hv = CONST_WIDE_INT_ELT (op, 1); + } +#else lv = CONST_DOUBLE_LOW (op), hv = CONST_DOUBLE_HIGH (op); - +#endif REAL_VALUE_FROM_INT (d, lv, hv, mode); d = real_value_truncate (mode, d); return CONST_DOUBLE_FROM_REAL_VALUE (d, mode); @@ -1575,8 +1602,19 @@ simplify_const_unary_operation (enum rtx_code code, enum machine_mode mode, if (CONST_INT_P (op)) lv = INTVAL (op), hv = HWI_SIGN_EXTEND (lv); else +#if TARGET_SUPPORTS_WIDE_INT + { + /* The conversion code to floats really want exactly 2 HWIs. + This needs to be fixed. For now, if the constant is + really big, just return 0 which is safe. */ + if (CONST_WIDE_INT_NUNITS (op) > 2) + return 0; + lv = CONST_WIDE_INT_ELT (op, 0); + hv = CONST_WIDE_INT_ELT (op, 1); + } +#else lv = CONST_DOUBLE_LOW (op), hv = CONST_DOUBLE_HIGH (op); - +#endif if (op_mode == VOIDmode || GET_MODE_PRECISION (op_mode) > HOST_BITS_PER_DOUBLE_INT) /* We should never get a negative number. 
*/ @@ -1589,302 +1627,87 @@ simplify_const_unary_operation (enum rtx_code code, enum machine_mode mode, return CONST_DOUBLE_FROM_REAL_VALUE (d, mode); } - if (CONST_INT_P (op) - && width <= HOST_BITS_PER_WIDE_INT && width > 0) + if (CONST_SCALAR_INT_P (op) && width > 0) { - HOST_WIDE_INT arg0 = INTVAL (op); - HOST_WIDE_INT val; + wide_int result; + enum machine_mode imode = op_mode == VOIDmode ? mode : op_mode; + wide_int op0 = wide_int::from_rtx (op, imode); + +#if TARGET_SUPPORTS_WIDE_INT == 0 + /* This assert keeps the simplification from producing a result + that cannot be represented in a CONST_DOUBLE but a lot of + upstream callers expect that this function never fails to + simplify something and so you if you added this to the test + above the code would die later anyway. If this assert + happens, you just need to make the port support wide int. */ + gcc_assert (width <= HOST_BITS_PER_DOUBLE_INT); +#endif switch (code) { case NOT: - val = ~ arg0; + result = ~op0; break; case NEG: - val = - arg0; + result = op0.neg (); break; case ABS: - val = (arg0 >= 0 ? arg0 : - arg0); + result = op0.abs (); break; case FFS: - arg0 &= GET_MODE_MASK (mode); - val = ffs_hwi (arg0); + result = op0.ffs (); break; case CLZ: - arg0 &= GET_MODE_MASK (mode); - if (arg0 == 0 && CLZ_DEFINED_VALUE_AT_ZERO (mode, val)) - ; - else - val = GET_MODE_PRECISION (mode) - floor_log2 (arg0) - 1; + result = op0.clz (GET_MODE_BITSIZE (mode), + GET_MODE_PRECISION (mode)); break; case CLRSB: - arg0 &= GET_MODE_MASK (mode); - if (arg0 == 0) - val = GET_MODE_PRECISION (mode) - 1; - else if (arg0 >= 0) - val = GET_MODE_PRECISION (mode) - floor_log2 (arg0) - 2; - else if (arg0 < 0) - val = GET_MODE_PRECISION (mode) - floor_log2 (~arg0) - 2; + result = op0.clrsb (GET_MODE_BITSIZE (mode), + GET_MODE_PRECISION (mode)); break; - + case CTZ: - arg0 &= GET_MODE_MASK (mode); - if (arg0 == 0) - { - /* Even if the value at zero is undefined, we have to come - up with some replacement. Seems good enough. 
*/ - if (! CTZ_DEFINED_VALUE_AT_ZERO (mode, val)) - val = GET_MODE_PRECISION (mode); - } - else - val = ctz_hwi (arg0); + result = op0.ctz (GET_MODE_BITSIZE (mode), + GET_MODE_PRECISION (mode)); break; case POPCOUNT: - arg0 &= GET_MODE_MASK (mode); - val = 0; - while (arg0) - val++, arg0 &= arg0 - 1; + result = op0.popcount (GET_MODE_BITSIZE (mode), + GET_MODE_PRECISION (mode)); break; case PARITY: - arg0 &= GET_MODE_MASK (mode); - val = 0; - while (arg0) - val++, arg0 &= arg0 - 1; - val &= 1; + result = op0.parity (GET_MODE_BITSIZE (mode), + GET_MODE_PRECISION (mode)); break; case BSWAP: - { - unsigned int s; - - val = 0; - for (s = 0; s < width; s += 8) - { - unsigned int d = width - s - 8; - unsigned HOST_WIDE_INT byte; - byte = (arg0 >> s) & 0xff; - val |= byte << d; - } - } + result = op0.bswap (); break; case TRUNCATE: - val = arg0; + result = op0.zforce_to_size (mode); break; case ZERO_EXTEND: - /* When zero-extending a CONST_INT, we need to know its - original mode. */ - gcc_assert (op_mode != VOIDmode); - if (op_width == HOST_BITS_PER_WIDE_INT) - { - /* If we were really extending the mode, - we would have to distinguish between zero-extension - and sign-extension. */ - gcc_assert (width == op_width); - val = arg0; - } - else if (GET_MODE_BITSIZE (op_mode) < HOST_BITS_PER_WIDE_INT) - val = arg0 & GET_MODE_MASK (op_mode); - else - return 0; + result = op0.zforce_to_size (mode); break; case SIGN_EXTEND: - if (op_mode == VOIDmode) - op_mode = mode; - op_width = GET_MODE_PRECISION (op_mode); - if (op_width == HOST_BITS_PER_WIDE_INT) - { - /* If we were really extending the mode, - we would have to distinguish between zero-extension - and sign-extension. 
*/ - gcc_assert (width == op_width); - val = arg0; - } - else if (op_width < HOST_BITS_PER_WIDE_INT) - { - val = arg0 & GET_MODE_MASK (op_mode); - if (val_signbit_known_set_p (op_mode, val)) - val |= ~GET_MODE_MASK (op_mode); - } - else - return 0; + result = op0.sforce_to_size (mode); break; case SQRT: - case FLOAT_EXTEND: - case FLOAT_TRUNCATE: - case SS_TRUNCATE: - case US_TRUNCATE: - case SS_NEG: - case US_NEG: - case SS_ABS: - return 0; - - default: - gcc_unreachable (); - } - - return gen_int_mode (val, mode); - } - - /* We can do some operations on integer CONST_DOUBLEs. Also allow - for a DImode operation on a CONST_INT. */ - else if (width <= HOST_BITS_PER_DOUBLE_INT - && (CONST_DOUBLE_AS_INT_P (op) || CONST_INT_P (op))) - { - double_int first, value; - - if (CONST_DOUBLE_AS_INT_P (op)) - first = double_int::from_pair (CONST_DOUBLE_HIGH (op), - CONST_DOUBLE_LOW (op)); - else - first = double_int::from_shwi (INTVAL (op)); - - switch (code) - { - case NOT: - value = ~first; - break; - - case NEG: - value = -first; - break; - - case ABS: - if (first.is_negative ()) - value = -first; - else - value = first; - break; - - case FFS: - value.high = 0; - if (first.low != 0) - value.low = ffs_hwi (first.low); - else if (first.high != 0) - value.low = HOST_BITS_PER_WIDE_INT + ffs_hwi (first.high); - else - value.low = 0; - break; - - case CLZ: - value.high = 0; - if (first.high != 0) - value.low = GET_MODE_PRECISION (mode) - floor_log2 (first.high) - 1 - - HOST_BITS_PER_WIDE_INT; - else if (first.low != 0) - value.low = GET_MODE_PRECISION (mode) - floor_log2 (first.low) - 1; - else if (! CLZ_DEFINED_VALUE_AT_ZERO (mode, value.low)) - value.low = GET_MODE_PRECISION (mode); - break; - - case CTZ: - value.high = 0; - if (first.low != 0) - value.low = ctz_hwi (first.low); - else if (first.high != 0) - value.low = HOST_BITS_PER_WIDE_INT + ctz_hwi (first.high); - else if (! 
CTZ_DEFINED_VALUE_AT_ZERO (mode, value.low)) - value.low = GET_MODE_PRECISION (mode); - break; - - case POPCOUNT: - value = double_int_zero; - while (first.low) - { - value.low++; - first.low &= first.low - 1; - } - while (first.high) - { - value.low++; - first.high &= first.high - 1; - } - break; - - case PARITY: - value = double_int_zero; - while (first.low) - { - value.low++; - first.low &= first.low - 1; - } - while (first.high) - { - value.low++; - first.high &= first.high - 1; - } - value.low &= 1; - break; - - case BSWAP: - { - unsigned int s; - - value = double_int_zero; - for (s = 0; s < width; s += 8) - { - unsigned int d = width - s - 8; - unsigned HOST_WIDE_INT byte; - - if (s < HOST_BITS_PER_WIDE_INT) - byte = (first.low >> s) & 0xff; - else - byte = (first.high >> (s - HOST_BITS_PER_WIDE_INT)) & 0xff; - - if (d < HOST_BITS_PER_WIDE_INT) - value.low |= byte << d; - else - value.high |= byte << (d - HOST_BITS_PER_WIDE_INT); - } - } - break; - - case TRUNCATE: - /* This is just a change-of-mode, so do nothing. 
*/ - value = first; - break; - - case ZERO_EXTEND: - gcc_assert (op_mode != VOIDmode); - - if (op_width > HOST_BITS_PER_WIDE_INT) - return 0; - - value = double_int::from_uhwi (first.low & GET_MODE_MASK (op_mode)); - break; - - case SIGN_EXTEND: - if (op_mode == VOIDmode - || op_width > HOST_BITS_PER_WIDE_INT) - return 0; - else - { - value.low = first.low & GET_MODE_MASK (op_mode); - if (val_signbit_known_set_p (op_mode, value.low)) - value.low |= ~GET_MODE_MASK (op_mode); - - value.high = HWI_SIGN_EXTEND (value.low); - } - break; - - case SQRT: - return 0; - default: return 0; } - return immed_double_int_const (value, mode); + return immed_wide_int_const (result, mode); } else if (CONST_DOUBLE_AS_FLOAT_P (op) @@ -1936,7 +1759,6 @@ simplify_const_unary_operation (enum rtx_code code, enum machine_mode mode, } return CONST_DOUBLE_FROM_REAL_VALUE (d, mode); } - else if (CONST_DOUBLE_AS_FLOAT_P (op) && SCALAR_FLOAT_MODE_P (GET_MODE (op)) && GET_MODE_CLASS (mode) == MODE_INT @@ -1949,9 +1771,12 @@ simplify_const_unary_operation (enum rtx_code code, enum machine_mode mode, /* This was formerly used only for non-IEEE float. eggert@twinsun.com says it is safe for IEEE also. 
*/ - HOST_WIDE_INT xh, xl, th, tl; + HOST_WIDE_INT th, tl; REAL_VALUE_TYPE x, t; + wide_int wc; REAL_VALUE_FROM_CONST_DOUBLE (x, op); + HOST_WIDE_INT tmp[2]; + switch (code) { case FIX: @@ -1973,8 +1798,8 @@ simplify_const_unary_operation (enum rtx_code code, enum machine_mode mode, real_from_integer (&t, VOIDmode, tl, th, 0); if (REAL_VALUES_LESS (t, x)) { - xh = th; - xl = tl; + tmp[1] = th; + tmp[0] = tl; break; } @@ -1993,11 +1818,11 @@ simplify_const_unary_operation (enum rtx_code code, enum machine_mode mode, real_from_integer (&t, VOIDmode, tl, th, 0); if (REAL_VALUES_LESS (x, t)) { - xh = th; - xl = tl; + tmp[1] = th; + tmp[0] = tl; break; } - REAL_VALUE_TO_INT (&xl, &xh, x); + REAL_VALUE_TO_INT (&tmp[0], &tmp[1], x); break; case UNSIGNED_FIX: @@ -2024,18 +1849,19 @@ simplify_const_unary_operation (enum rtx_code code, enum machine_mode mode, real_from_integer (&t, VOIDmode, tl, th, 1); if (REAL_VALUES_LESS (t, x)) { - xh = th; - xl = tl; + tmp[1] = th; + tmp[0] = tl; break; } - REAL_VALUE_TO_INT (&xl, &xh, x); + REAL_VALUE_TO_INT (&tmp[0], &tmp[1], x); break; default: gcc_unreachable (); } - return immed_double_const (xl, xh, mode); + wc = wide_int::from_array (tmp, 2, mode); + return immed_wide_int_const (wc, mode); } return NULL_RTX; @@ -2195,49 +2021,50 @@ simplify_binary_operation_1 (enum rtx_code code, enum machine_mode mode, if (SCALAR_INT_MODE_P (mode)) { - double_int coeff0, coeff1; + wide_int coeff0; + wide_int coeff1; rtx lhs = op0, rhs = op1; - coeff0 = double_int_one; - coeff1 = double_int_one; + coeff0 = wide_int::one (mode); + coeff1 = wide_int::one (mode); if (GET_CODE (lhs) == NEG) { - coeff0 = double_int_minus_one; + coeff0 = wide_int::minus_one (mode); lhs = XEXP (lhs, 0); } else if (GET_CODE (lhs) == MULT - && CONST_INT_P (XEXP (lhs, 1))) + && CONST_SCALAR_INT_P (XEXP (lhs, 1))) { - coeff0 = double_int::from_shwi (INTVAL (XEXP (lhs, 1))); + coeff0 = wide_int::from_rtx (XEXP (lhs, 1), mode); lhs = XEXP (lhs, 0); } else if (GET_CODE (lhs) 
== ASHIFT && CONST_INT_P (XEXP (lhs, 1)) && INTVAL (XEXP (lhs, 1)) >= 0 - && INTVAL (XEXP (lhs, 1)) < HOST_BITS_PER_WIDE_INT) + && INTVAL (XEXP (lhs, 1)) < GET_MODE_PRECISION (mode)) { - coeff0 = double_int_zero.set_bit (INTVAL (XEXP (lhs, 1))); + coeff0 = wide_int::set_bit_in_zero (INTVAL (XEXP (lhs, 1)), mode); lhs = XEXP (lhs, 0); } if (GET_CODE (rhs) == NEG) { - coeff1 = double_int_minus_one; + coeff1 = wide_int::minus_one (mode); rhs = XEXP (rhs, 0); } else if (GET_CODE (rhs) == MULT && CONST_INT_P (XEXP (rhs, 1))) { - coeff1 = double_int::from_shwi (INTVAL (XEXP (rhs, 1))); + coeff1 = wide_int::from_rtx (XEXP (rhs, 1), mode); rhs = XEXP (rhs, 0); } else if (GET_CODE (rhs) == ASHIFT && CONST_INT_P (XEXP (rhs, 1)) && INTVAL (XEXP (rhs, 1)) >= 0 - && INTVAL (XEXP (rhs, 1)) < HOST_BITS_PER_WIDE_INT) + && INTVAL (XEXP (rhs, 1)) < GET_MODE_PRECISION (mode)) { - coeff1 = double_int_zero.set_bit (INTVAL (XEXP (rhs, 1))); + coeff1 = wide_int::set_bit_in_zero (INTVAL (XEXP (rhs, 1)), mode); rhs = XEXP (rhs, 0); } @@ -2245,11 +2072,9 @@ simplify_binary_operation_1 (enum rtx_code code, enum machine_mode mode, { rtx orig = gen_rtx_PLUS (mode, op0, op1); rtx coeff; - double_int val; bool speed = optimize_function_for_speed_p (cfun); - val = coeff0 + coeff1; - coeff = immed_double_int_const (val, mode); + coeff = immed_wide_int_const (coeff0 + coeff1, mode); tem = simplify_gen_binary (MULT, mode, lhs, coeff); return set_src_cost (tem, speed) <= set_src_cost (orig, speed) @@ -2371,50 +2196,52 @@ simplify_binary_operation_1 (enum rtx_code code, enum machine_mode mode, if (SCALAR_INT_MODE_P (mode)) { - double_int coeff0, negcoeff1; + wide_int coeff0; + wide_int negcoeff1; rtx lhs = op0, rhs = op1; - coeff0 = double_int_one; - negcoeff1 = double_int_minus_one; + coeff0 = wide_int::one (mode); + negcoeff1 = wide_int::minus_one (mode); if (GET_CODE (lhs) == NEG) { - coeff0 = double_int_minus_one; + coeff0 = wide_int::minus_one (mode); lhs = XEXP (lhs, 0); } else if (GET_CODE 
(lhs) == MULT - && CONST_INT_P (XEXP (lhs, 1))) + && CONST_SCALAR_INT_P (XEXP (lhs, 1))) { - coeff0 = double_int::from_shwi (INTVAL (XEXP (lhs, 1))); + coeff0 = wide_int::from_rtx (XEXP (lhs, 1), mode); lhs = XEXP (lhs, 0); } else if (GET_CODE (lhs) == ASHIFT && CONST_INT_P (XEXP (lhs, 1)) && INTVAL (XEXP (lhs, 1)) >= 0 - && INTVAL (XEXP (lhs, 1)) < HOST_BITS_PER_WIDE_INT) + && INTVAL (XEXP (lhs, 1)) < GET_MODE_PRECISION (mode)) { - coeff0 = double_int_zero.set_bit (INTVAL (XEXP (lhs, 1))); + coeff0 = wide_int::set_bit_in_zero (INTVAL (XEXP (lhs, 1)), mode); lhs = XEXP (lhs, 0); } if (GET_CODE (rhs) == NEG) { - negcoeff1 = double_int_one; + negcoeff1 = wide_int::one (mode); rhs = XEXP (rhs, 0); } else if (GET_CODE (rhs) == MULT && CONST_INT_P (XEXP (rhs, 1))) { - negcoeff1 = double_int::from_shwi (-INTVAL (XEXP (rhs, 1))); + negcoeff1 = wide_int::from_rtx (XEXP (rhs, 1), mode).neg (); rhs = XEXP (rhs, 0); } else if (GET_CODE (rhs) == ASHIFT && CONST_INT_P (XEXP (rhs, 1)) && INTVAL (XEXP (rhs, 1)) >= 0 - && INTVAL (XEXP (rhs, 1)) < HOST_BITS_PER_WIDE_INT) + && INTVAL (XEXP (rhs, 1)) < GET_MODE_PRECISION (mode)) { - negcoeff1 = double_int_zero.set_bit (INTVAL (XEXP (rhs, 1))); - negcoeff1 = -negcoeff1; + negcoeff1 = wide_int::set_bit_in_zero (INTVAL (XEXP (rhs, 1)), + mode); + negcoeff1 = negcoeff1.neg (); rhs = XEXP (rhs, 0); } @@ -2422,11 +2249,9 @@ simplify_binary_operation_1 (enum rtx_code code, enum machine_mode mode, { rtx orig = gen_rtx_MINUS (mode, op0, op1); rtx coeff; - double_int val; bool speed = optimize_function_for_speed_p (cfun); - val = coeff0 + negcoeff1; - coeff = immed_double_int_const (val, mode); + coeff = immed_wide_int_const (coeff0 + negcoeff1, mode); tem = simplify_gen_binary (MULT, mode, lhs, coeff); return set_src_cost (tem, speed) <= set_src_cost (orig, speed) @@ -2578,26 +2403,13 @@ simplify_binary_operation_1 (enum rtx_code code, enum machine_mode mode, && trueop1 == CONST1_RTX (mode)) return op0; - /* Convert multiply by constant power 
of two into shift unless - we are still generating RTL. This test is a kludge. */ - if (CONST_INT_P (trueop1) - && (val = exact_log2 (UINTVAL (trueop1))) >= 0 - /* If the mode is larger than the host word size, and the - uppermost bit is set, then this isn't a power of two due - to implicit sign extension. */ - && (width <= HOST_BITS_PER_WIDE_INT - || val != HOST_BITS_PER_WIDE_INT - 1)) - return simplify_gen_binary (ASHIFT, mode, op0, GEN_INT (val)); - - /* Likewise for multipliers wider than a word. */ - if (CONST_DOUBLE_AS_INT_P (trueop1) - && GET_MODE (op0) == mode - && CONST_DOUBLE_LOW (trueop1) == 0 - && (val = exact_log2 (CONST_DOUBLE_HIGH (trueop1))) >= 0 - && (val < HOST_BITS_PER_DOUBLE_INT - 1 - || GET_MODE_BITSIZE (mode) <= HOST_BITS_PER_DOUBLE_INT)) - return simplify_gen_binary (ASHIFT, mode, op0, - GEN_INT (val + HOST_BITS_PER_WIDE_INT)); + /* Convert multiply by constant power of two into shift. */ + if (CONST_SCALAR_INT_P (trueop1)) + { + val = wide_int::from_rtx (trueop1, mode).exact_log2 (); + if (val >= 0 && val < GET_MODE_BITSIZE (mode)) + return simplify_gen_binary (ASHIFT, mode, op0, GEN_INT (val)); + } /* x*2 is x+x and x*(-1) is -x */ if (CONST_DOUBLE_AS_FLOAT_P (trueop1) @@ -3645,9 +3457,9 @@ rtx simplify_const_binary_operation (enum rtx_code code, enum machine_mode mode, rtx op0, rtx op1) { - HOST_WIDE_INT arg0, arg1, arg0s, arg1s; - HOST_WIDE_INT val; +#if TARGET_SUPPORTS_WIDE_INT == 0 unsigned int width = GET_MODE_PRECISION (mode); +#endif if (VECTOR_MODE_P (mode) && code != VEC_CONCAT @@ -3840,299 +3652,128 @@ simplify_const_binary_operation (enum rtx_code code, enum machine_mode mode, /* We can fold some multi-word operations. 
*/ if (GET_MODE_CLASS (mode) == MODE_INT - && width == HOST_BITS_PER_DOUBLE_INT - && (CONST_DOUBLE_AS_INT_P (op0) || CONST_INT_P (op0)) - && (CONST_DOUBLE_AS_INT_P (op1) || CONST_INT_P (op1))) + && CONST_SCALAR_INT_P (op0) + && CONST_SCALAR_INT_P (op1)) { - double_int o0, o1, res, tmp; - bool overflow; - - o0 = rtx_to_double_int (op0); - o1 = rtx_to_double_int (op1); - + wide_int result; + wide_int wop0 = wide_int::from_rtx (op0, mode); + wide_int wop1 = wide_int::from_rtx (op1, mode); + bool overflow = false; + +#if TARGET_SUPPORTS_WIDE_INT == 0 + /* This assert keeps the simplification from producing a result + that cannot be represented in a CONST_DOUBLE but a lot of + upstream callers expect that this function never fails to + simplify something and so you if you added this to the test + above the code would die later anyway. If this assert + happens, you just need to make the port support wide int. */ + gcc_assert (width <= HOST_BITS_PER_DOUBLE_INT); +#endif switch (code) { case MINUS: - /* A - B == A + (-B). */ - o1 = -o1; - - /* Fall through.... 
*/ + result = wop0 - wop1; + break; case PLUS: - res = o0 + o1; + result = wop0 + wop1; break; case MULT: - res = o0 * o1; + result = wop0 * wop1; break; case DIV: - res = o0.divmod_with_overflow (o1, false, TRUNC_DIV_EXPR, - &tmp, &overflow); + result = wop0.div_trunc (wop1, wide_int::SIGNED, &overflow); if (overflow) - return 0; + return NULL_RTX; break; - + case MOD: - tmp = o0.divmod_with_overflow (o1, false, TRUNC_DIV_EXPR, - &res, &overflow); + result = wop0.mod_trunc (wop1, wide_int::SIGNED, &overflow); if (overflow) - return 0; + return NULL_RTX; break; case UDIV: - res = o0.divmod_with_overflow (o1, true, TRUNC_DIV_EXPR, - &tmp, &overflow); + result = wop0.div_trunc (wop1, wide_int::UNSIGNED, &overflow); if (overflow) - return 0; + return NULL_RTX; break; case UMOD: - tmp = o0.divmod_with_overflow (o1, true, TRUNC_DIV_EXPR, - &res, &overflow); + result = wop0.mod_trunc (wop1, wide_int::UNSIGNED, &overflow); if (overflow) - return 0; + return NULL_RTX; break; case AND: - res = o0 & o1; + result = wop0 & wop1; break; case IOR: - res = o0 | o1; + result = wop0 | wop1; break; case XOR: - res = o0 ^ o1; + result = wop0 ^ wop1; break; case SMIN: - res = o0.smin (o1); + result = wop0.smin (wop1); break; case SMAX: - res = o0.smax (o1); + result = wop0.smax (wop1); break; case UMIN: - res = o0.umin (o1); + result = wop0.umin (wop1); break; case UMAX: - res = o0.umax (o1); - break; - - case LSHIFTRT: case ASHIFTRT: - case ASHIFT: - case ROTATE: case ROTATERT: - { - unsigned HOST_WIDE_INT cnt; - - if (SHIFT_COUNT_TRUNCATED) - { - o1.high = 0; - o1.low &= GET_MODE_PRECISION (mode) - 1; - } - - if (!o1.fits_uhwi () - || o1.to_uhwi () >= GET_MODE_PRECISION (mode)) - return 0; - - cnt = o1.to_uhwi (); - unsigned short prec = GET_MODE_PRECISION (mode); - - if (code == LSHIFTRT || code == ASHIFTRT) - res = o0.rshift (cnt, prec, code == ASHIFTRT); - else if (code == ASHIFT) - res = o0.alshift (cnt, prec); - else if (code == ROTATE) - res = o0.lrotate (cnt, prec); - else /* 
code == ROTATERT */ - res = o0.rrotate (cnt, prec); - } - break; - - default: - return 0; - } - - return immed_double_int_const (res, mode); - } - - if (CONST_INT_P (op0) && CONST_INT_P (op1) - && width <= HOST_BITS_PER_WIDE_INT && width != 0) - { - /* Get the integer argument values in two forms: - zero-extended in ARG0, ARG1 and sign-extended in ARG0S, ARG1S. */ - - arg0 = INTVAL (op0); - arg1 = INTVAL (op1); - - if (width < HOST_BITS_PER_WIDE_INT) - { - arg0 &= GET_MODE_MASK (mode); - arg1 &= GET_MODE_MASK (mode); - - arg0s = arg0; - if (val_signbit_known_set_p (mode, arg0s)) - arg0s |= ~GET_MODE_MASK (mode); - - arg1s = arg1; - if (val_signbit_known_set_p (mode, arg1s)) - arg1s |= ~GET_MODE_MASK (mode); - } - else - { - arg0s = arg0; - arg1s = arg1; - } - - /* Compute the value of the arithmetic. */ - - switch (code) - { - case PLUS: - val = arg0s + arg1s; - break; - - case MINUS: - val = arg0s - arg1s; - break; - - case MULT: - val = arg0s * arg1s; - break; - - case DIV: - if (arg1s == 0 - || ((unsigned HOST_WIDE_INT) arg0s - == (unsigned HOST_WIDE_INT) 1 << (HOST_BITS_PER_WIDE_INT - 1) - && arg1s == -1)) - return 0; - val = arg0s / arg1s; - break; - - case MOD: - if (arg1s == 0 - || ((unsigned HOST_WIDE_INT) arg0s - == (unsigned HOST_WIDE_INT) 1 << (HOST_BITS_PER_WIDE_INT - 1) - && arg1s == -1)) - return 0; - val = arg0s % arg1s; + result = wop0.umax (wop1); break; - case UDIV: - if (arg1 == 0 - || ((unsigned HOST_WIDE_INT) arg0s - == (unsigned HOST_WIDE_INT) 1 << (HOST_BITS_PER_WIDE_INT - 1) - && arg1s == -1)) - return 0; - val = (unsigned HOST_WIDE_INT) arg0 / arg1; - break; - - case UMOD: - if (arg1 == 0 - || ((unsigned HOST_WIDE_INT) arg0s - == (unsigned HOST_WIDE_INT) 1 << (HOST_BITS_PER_WIDE_INT - 1) - && arg1s == -1)) - return 0; - val = (unsigned HOST_WIDE_INT) arg0 % arg1; - break; - - case AND: - val = arg0 & arg1; - break; - - case IOR: - val = arg0 | arg1; - break; + case LSHIFTRT: + if (wop1.neg_p ()) + return NULL_RTX; - case XOR: - val = arg0 ^ 
arg1; + result = wop0.rshiftu (wop1, wide_int::TRUNC); break; - - case LSHIFTRT: - case ASHIFT: + case ASHIFTRT: - /* Truncate the shift if SHIFT_COUNT_TRUNCATED, otherwise make sure - the value is in range. We can't return any old value for - out-of-range arguments because either the middle-end (via - shift_truncation_mask) or the back-end might be relying on - target-specific knowledge. Nor can we rely on - shift_truncation_mask, since the shift might not be part of an - ashlM3, lshrM3 or ashrM3 instruction. */ - if (SHIFT_COUNT_TRUNCATED) - arg1 = (unsigned HOST_WIDE_INT) arg1 % width; - else if (arg1 < 0 || arg1 >= GET_MODE_BITSIZE (mode)) - return 0; - - val = (code == ASHIFT - ? ((unsigned HOST_WIDE_INT) arg0) << arg1 - : ((unsigned HOST_WIDE_INT) arg0) >> arg1); + if (wop1.neg_p ()) + return NULL_RTX; - /* Sign-extend the result for arithmetic right shifts. */ - if (code == ASHIFTRT && arg0s < 0 && arg1 > 0) - val |= ((unsigned HOST_WIDE_INT) (-1)) << (width - arg1); + result = wop0.rshifts (wop1, wide_int::TRUNC); break; + + case ASHIFT: + if (wop1.neg_p ()) + return NULL_RTX; - case ROTATERT: - if (arg1 < 0) - return 0; - - arg1 %= width; - val = ((((unsigned HOST_WIDE_INT) arg0) << (width - arg1)) - | (((unsigned HOST_WIDE_INT) arg0) >> arg1)); + result = wop0.lshift (wop1, wide_int::TRUNC); break; - + case ROTATE: - if (arg1 < 0) - return 0; - - arg1 %= width; - val = ((((unsigned HOST_WIDE_INT) arg0) << arg1) - | (((unsigned HOST_WIDE_INT) arg0) >> (width - arg1))); - break; - - case COMPARE: - /* Do nothing here. */ - return 0; - - case SMIN: - val = arg0s <= arg1s ? arg0s : arg1s; - break; - - case UMIN: - val = ((unsigned HOST_WIDE_INT) arg0 - <= (unsigned HOST_WIDE_INT) arg1 ? arg0 : arg1); - break; + if (wop1.neg_p ()) + return NULL_RTX; - case SMAX: - val = arg0s > arg1s ? 
arg0s : arg1s; + result = wop0.lrotate (wop1); break; + + case ROTATERT: + if (wop1.neg_p ()) + return NULL_RTX; - case UMAX: - val = ((unsigned HOST_WIDE_INT) arg0 - > (unsigned HOST_WIDE_INT) arg1 ? arg0 : arg1); + result = wop0.rrotate (wop1); break; - case SS_PLUS: - case US_PLUS: - case SS_MINUS: - case US_MINUS: - case SS_MULT: - case US_MULT: - case SS_DIV: - case US_DIV: - case SS_ASHIFT: - case US_ASHIFT: - /* ??? There are simplifications that can be done. */ - return 0; - default: - gcc_unreachable (); + return NULL_RTX; } - - return gen_int_mode (val, mode); + return immed_wide_int_const (result, mode); } return NULL_RTX; @@ -4800,10 +4441,11 @@ comparison_result (enum rtx_code code, int known_results) } } -/* Check if the given comparison (done in the given MODE) is actually a - tautology or a contradiction. - If no simplification is possible, this function returns zero. - Otherwise, it returns either const_true_rtx or const0_rtx. */ +/* Check if the given comparison (done in the given MODE) is actually + a tautology or a contradiction. If the mode is VOID_mode, the + comparison is done in "infinite precision". If no simplification + is possible, this function returns zero. Otherwise, it returns + either const_true_rtx or const0_rtx. */ rtx simplify_const_relational_operation (enum rtx_code code, @@ -4927,59 +4569,25 @@ simplify_const_relational_operation (enum rtx_code code, /* Otherwise, see if the operands are both integers. */ if ((GET_MODE_CLASS (mode) == MODE_INT || mode == VOIDmode) - && (CONST_DOUBLE_AS_INT_P (trueop0) || CONST_INT_P (trueop0)) - && (CONST_DOUBLE_AS_INT_P (trueop1) || CONST_INT_P (trueop1))) + && CONST_SCALAR_INT_P (trueop0) && CONST_SCALAR_INT_P (trueop1)) { - int width = GET_MODE_PRECISION (mode); - HOST_WIDE_INT l0s, h0s, l1s, h1s; - unsigned HOST_WIDE_INT l0u, h0u, l1u, h1u; - - /* Get the two words comprising each integer constant. 
*/ - if (CONST_DOUBLE_AS_INT_P (trueop0)) - { - l0u = l0s = CONST_DOUBLE_LOW (trueop0); - h0u = h0s = CONST_DOUBLE_HIGH (trueop0); - } - else - { - l0u = l0s = INTVAL (trueop0); - h0u = h0s = HWI_SIGN_EXTEND (l0s); - } - - if (CONST_DOUBLE_AS_INT_P (trueop1)) - { - l1u = l1s = CONST_DOUBLE_LOW (trueop1); - h1u = h1s = CONST_DOUBLE_HIGH (trueop1); - } - else - { - l1u = l1s = INTVAL (trueop1); - h1u = h1s = HWI_SIGN_EXTEND (l1s); - } - - /* If WIDTH is nonzero and smaller than HOST_BITS_PER_WIDE_INT, - we have to sign or zero-extend the values. */ - if (width != 0 && width < HOST_BITS_PER_WIDE_INT) - { - l0u &= GET_MODE_MASK (mode); - l1u &= GET_MODE_MASK (mode); - - if (val_signbit_known_set_p (mode, l0s)) - l0s |= ~GET_MODE_MASK (mode); - - if (val_signbit_known_set_p (mode, l1s)) - l1s |= ~GET_MODE_MASK (mode); - } - if (width != 0 && width <= HOST_BITS_PER_WIDE_INT) - h0u = h1u = 0, h0s = HWI_SIGN_EXTEND (l0s), h1s = HWI_SIGN_EXTEND (l1s); - - if (h0u == h1u && l0u == l1u) + enum machine_mode cmode = mode; + wide_int wo0; + wide_int wo1; + + /* It would be nice if we really had a mode here. However, the + largest int representable on the target is as good as + infinite. */ + if (mode == VOIDmode) + cmode = MAX_MODE_INT; + wo0 = wide_int::from_rtx (trueop0, cmode); + wo1 = wide_int::from_rtx (trueop1, cmode); + if (wo0 == wo1) return comparison_result (code, CMP_EQ); else { - int cr; - cr = (h0s < h1s || (h0s == h1s && l0u < l1u)) ? CMP_LT : CMP_GT; - cr |= (h0u < h1u || (h0u == h1u && l0u < l1u)) ? CMP_LTU : CMP_GTU; + int cr = wo0.lts_p (wo1) ? CMP_LT : CMP_GT; + cr |= wo0.ltu_p (wo1) ? CMP_LTU : CMP_GTU; return comparison_result (code, cr); } } @@ -5394,9 +5002,9 @@ simplify_ternary_operation (enum rtx_code code, enum machine_mode mode, return 0; } -/* Evaluate a SUBREG of a CONST_INT or CONST_DOUBLE or CONST_FIXED - or CONST_VECTOR, - returning another CONST_INT or CONST_DOUBLE or CONST_FIXED or CONST_VECTOR. 
+/* Evaluate a SUBREG of a CONST_INT or CONST_WIDE_INT or CONST_DOUBLE + or CONST_FIXED or CONST_VECTOR, returning another CONST_INT or + CONST_WIDE_INT or CONST_DOUBLE or CONST_FIXED or CONST_VECTOR. Works by unpacking OP into a collection of 8-bit values represented as a little-endian array of 'unsigned char', selecting by BYTE, @@ -5406,13 +5014,11 @@ static rtx simplify_immed_subreg (enum machine_mode outermode, rtx op, enum machine_mode innermode, unsigned int byte) { - /* We support up to 512-bit values (for V8DFmode). */ enum { - max_bitsize = 512, value_bit = 8, value_mask = (1 << value_bit) - 1 }; - unsigned char value[max_bitsize / value_bit]; + unsigned char value[MAX_BITSIZE_MODE_ANY_MODE/value_bit]; int value_start; int i; int elem; @@ -5424,6 +5030,7 @@ simplify_immed_subreg (enum machine_mode outermode, rtx op, rtvec result_v = NULL; enum mode_class outer_class; enum machine_mode outer_submode; + int max_bitsize; /* Some ports misuse CCmode. */ if (GET_MODE_CLASS (outermode) == MODE_CC && CONST_INT_P (op)) @@ -5433,6 +5040,10 @@ simplify_immed_subreg (enum machine_mode outermode, rtx op, if (COMPLEX_MODE_P (outermode)) return NULL_RTX; + /* We support any size mode. */ + max_bitsize = MAX (GET_MODE_BITSIZE (outermode), + GET_MODE_BITSIZE (innermode)); + /* Unpack the value. */ if (GET_CODE (op) == CONST_VECTOR) @@ -5482,8 +5093,20 @@ simplify_immed_subreg (enum machine_mode outermode, rtx op, *vp++ = INTVAL (el) < 0 ? 
-1 : 0; break; + case CONST_WIDE_INT: + { + wide_int val = wide_int::from_rtx (el, innermode); + unsigned char extend = val.sign_mask (); + + for (i = 0; i < elem_bitsize; i += value_bit) + *vp++ = val.extract_to_hwi (i, value_bit); + for (; i < elem_bitsize; i += value_bit) + *vp++ = extend; + } + break; + case CONST_DOUBLE: - if (GET_MODE (el) == VOIDmode) + if (TARGET_SUPPORTS_WIDE_INT == 0 && GET_MODE (el) == VOIDmode) { unsigned char extend = 0; /* If this triggers, someone should have generated a @@ -5506,7 +5129,8 @@ simplify_immed_subreg (enum machine_mode outermode, rtx op, } else { - long tmp[max_bitsize / 32]; + /* This is big enough for anything on the platform. */ + long tmp[MAX_BITSIZE_MODE_ANY_MODE / 32]; int bitsize = GET_MODE_BITSIZE (GET_MODE (el)); gcc_assert (SCALAR_FLOAT_MODE_P (GET_MODE (el))); @@ -5626,24 +5250,27 @@ simplify_immed_subreg (enum machine_mode outermode, rtx op, case MODE_INT: case MODE_PARTIAL_INT: { - unsigned HOST_WIDE_INT hi = 0, lo = 0; - - for (i = 0; - i < HOST_BITS_PER_WIDE_INT && i < elem_bitsize; - i += value_bit) - lo |= (unsigned HOST_WIDE_INT)(*vp++ & value_mask) << i; - for (; i < elem_bitsize; i += value_bit) - hi |= (unsigned HOST_WIDE_INT)(*vp++ & value_mask) - << (i - HOST_BITS_PER_WIDE_INT); - - /* immed_double_const doesn't call trunc_int_for_mode. I don't - know why. 
*/ - if (elem_bitsize <= HOST_BITS_PER_WIDE_INT) - elems[elem] = gen_int_mode (lo, outer_submode); - else if (elem_bitsize <= HOST_BITS_PER_DOUBLE_INT) - elems[elem] = immed_double_const (lo, hi, outer_submode); - else - return NULL_RTX; + int u; + int base = 0; + int units + = (GET_MODE_BITSIZE (outer_submode) + HOST_BITS_PER_WIDE_INT - 1) + / HOST_BITS_PER_WIDE_INT; + HOST_WIDE_INT tmp[MAX_BITSIZE_MODE_ANY_INT / HOST_BITS_PER_WIDE_INT]; + wide_int r; + + for (u = 0; u < units; u++) + { + unsigned HOST_WIDE_INT buf = 0; + for (i = 0; + i < HOST_BITS_PER_WIDE_INT && base + i < elem_bitsize; + i += value_bit) + buf |= (unsigned HOST_WIDE_INT)(*vp++ & value_mask) << i; + + tmp[u] = buf; + base += HOST_BITS_PER_WIDE_INT; + } + r = wide_int::from_array (tmp, units, outer_submode); + elems[elem] = immed_wide_int_const (r, outer_submode); } break; @@ -5651,7 +5278,7 @@ simplify_immed_subreg (enum machine_mode outermode, rtx op, case MODE_DECIMAL_FLOAT: { REAL_VALUE_TYPE r; - long tmp[max_bitsize / 32]; + long tmp[MAX_BITSIZE_MODE_ANY_INT / 32]; /* real_from_target wants its input in words affected by FLOAT_WORDS_BIG_ENDIAN. 
However, we ignore this, diff --git a/gcc/tree-ssa-address.c b/gcc/tree-ssa-address.c index cfd42ad..85b1552 100644 --- a/gcc/tree-ssa-address.c +++ b/gcc/tree-ssa-address.c @@ -189,15 +189,18 @@ addr_for_mem_ref (struct mem_address *addr, addr_space_t as, struct mem_addr_template *templ; if (addr->step && !integer_onep (addr->step)) - st = immed_double_int_const (tree_to_double_int (addr->step), pointer_mode); + st = immed_wide_int_const (wide_int::from_tree (addr->step), + TYPE_MODE (TREE_TYPE (addr->step))); else st = NULL_RTX; if (addr->offset && !integer_zerop (addr->offset)) - off = immed_double_int_const - (tree_to_double_int (addr->offset) - .sext (TYPE_PRECISION (TREE_TYPE (addr->offset))), - pointer_mode); + { + wide_int dc = wide_int::from_tree (addr->offset); + dc = dc.sforce_to_size (TREE_TYPE (addr->offset)); + off = immed_wide_int_const (dc, + TYPE_MODE (TREE_TYPE (addr->offset))); + } else off = NULL_RTX; diff --git a/gcc/tree.c b/gcc/tree.c index 98ad5d8..11075e3 100644 --- a/gcc/tree.c +++ b/gcc/tree.c @@ -59,6 +59,7 @@ along with GCC; see the file COPYING3. If not see #include "except.h" #include "debug.h" #include "intl.h" +#include "wide-int.h" /* Tree code classes. */ @@ -1064,6 +1065,23 @@ double_int_to_tree (tree type, double_int cst) return build_int_cst_wide (type, cst.low, cst.high); } +/* Constructs tree in type TYPE from with value given by CST. Signedness + of CST is assumed to be the same as the signedness of TYPE. */ + +tree +wide_int_to_tree (tree type, const wide_int &cst) +{ + wide_int v; + + gcc_assert (cst.get_len () <= 2); + if (TYPE_UNSIGNED (type)) + v = cst.zext (TYPE_PRECISION (type)); + else + v = cst.sext (TYPE_PRECISION (type)); + + return build_int_cst_wide (type, v.elt (0), v.elt (1)); +} + /* Returns true if CST fits into range of TYPE. Signedness of CST is assumed to be the same as the signedness of TYPE. 
*/ diff --git a/gcc/var-tracking.c b/gcc/var-tracking.c index 0db1562..7cb99ac 100644 --- a/gcc/var-tracking.c +++ b/gcc/var-tracking.c @@ -3513,6 +3513,23 @@ loc_cmp (rtx x, rtx y) default: gcc_unreachable (); } + if (CONST_WIDE_INT_P (x)) + { + /* Compare the vector length first. */ + if (CONST_WIDE_INT_NUNITS (x) >= CONST_WIDE_INT_NUNITS (y)) + return 1; + else if (CONST_WIDE_INT_NUNITS (x) < CONST_WIDE_INT_NUNITS (y)) + return -1; + + /* Compare the vectors elements. */; + for (j = CONST_WIDE_INT_NUNITS (x) - 1; j >= 0 ; j--) + { + if (CONST_WIDE_INT_ELT (x, j) < CONST_WIDE_INT_ELT (y, j)) + return -1; + if (CONST_WIDE_INT_ELT (x, j) > CONST_WIDE_INT_ELT (y, j)) + return 1; + } + } return 0; } diff --git a/gcc/varasm.c b/gcc/varasm.c index 6648103..c104d87 100644 --- a/gcc/varasm.c +++ b/gcc/varasm.c @@ -3406,6 +3406,7 @@ const_rtx_hash_1 (rtx *xp, void *data) enum rtx_code code; hashval_t h, *hp; rtx x; + int i; x = *xp; code = GET_CODE (x); @@ -3416,12 +3417,12 @@ const_rtx_hash_1 (rtx *xp, void *data) { case CONST_INT: hwi = INTVAL (x); + fold_hwi: { int shift = sizeof (hashval_t) * CHAR_BIT; const int n = sizeof (HOST_WIDE_INT) / sizeof (hashval_t); - int i; - + h ^= (hashval_t) hwi; for (i = 1; i < n; ++i) { @@ -3431,8 +3432,16 @@ const_rtx_hash_1 (rtx *xp, void *data) } break; + case CONST_WIDE_INT: + hwi = GET_MODE_PRECISION (mode); + { + for (i = 0; i < CONST_WIDE_INT_NUNITS (x); i++) + hwi ^= CONST_WIDE_INT_ELT (x, i); + goto fold_hwi; + } + case CONST_DOUBLE: - if (mode == VOIDmode) + if (TARGET_SUPPORTS_WIDE_INT == 0 && mode == VOIDmode) { hwi = CONST_DOUBLE_LOW (x) ^ CONST_DOUBLE_HIGH (x); goto fold_hwi; [-- Attachment #3: p5-3.clog --] [-- Type: text/plain, Size: 4478 bytes --] 2013-2-26 Kenneth Zadeck <zadeck@naturalbridge.com> * alias.c (rtx_equal_for_memref_p): Fixed comment. * builtins.c (c_getstr, c_readstr, expand_builtin_signbit): Make to work with any size int. * combine.c (try_combine, subst): Changed to support any size integer. 
	* coretypes.h (hwivec_def, hwivec, const_hwivec): New.
	* cse.c (hash_rtx_cb): Added CONST_WIDE_INT case and modified
	DOUBLE_INT case.
	* cselib.c (rtx_equal_for_cselib_1): Converted cases to
	CASE_CONST_UNIQUE.
	(cselib_hash_rtx): Added CONST_WIDE_INT case.
	* defaults.h (TARGET_SUPPORTS_WIDE_INT): New.
	* doc/rtl.texi (CONST_DOUBLE, CONST_WIDE_INT): Updated.
	* doc/tm.texi (TARGET_SUPPORTS_WIDE_INT): New.
	* doc/tm.texi.in (TARGET_SUPPORTS_WIDE_INT): New.
	* dojump.c (prefer_and_bit_test): Use wide int api.
	* dwarf2out.c (get_full_len): New function.
	(dw_val_equal_p, size_of_loc_descr, output_loc_operands, print_die,
	attr_checksum, same_dw_val_p, size_of_die, value_format, output_die,
	mem_loc_descriptor, loc_descriptor, extract_int,
	add_const_value_attribute, hash_loc_operands,
	compare_loc_operands): Add support for wide-ints.
	(add_AT_wide): New function.
	* dwarf2out.h (enum dw_val_class): Added dw_val_class_wide_int.
	* emit-rtl.c (const_wide_int_htab): Add marking.
	(const_wide_int_htab_hash, const_wide_int_htab_eq,
	lookup_const_wide_int, immed_wide_int_const): New functions.
	(const_double_htab_hash, const_double_htab_eq, rtx_to_double_int,
	immed_double_const): Conditionally changed CONST_DOUBLE behavior.
	(immed_double_const, init_emit_once): Changed to support wide-int.
	* explow.c (plus_constant): Now uses wide-int api.
	* expmed.c (mask_rtx, lshift_value): Now uses wide-int.
	(expand_mult, expand_smod_pow2): Made to work with any size int.
	(make_tree): Added CONST_WIDE_INT case.
	* expr.c (convert_modes): Added support for any size int.
	(emit_group_load_1): Added todo for place that still does not
	allow large ints.
	(store_expr, expand_constructor): Fixed comments.
	(expand_expr_real_2, expand_expr_real_1,
	reduce_to_bit_field_precision, const_vector_from_tree):
	Converted to use wide-int api.
	* final.c (output_addr_const): Added CONST_WIDE_INT case.
	* genemit.c (gen_exp): Added CONST_WIDE_INT case.
	* gengenrtl.c (excluded_rtx): Added CONST_WIDE_INT case.
	* gengtype.c (wide-int): New type.
	* genpreds.c (write_one_predicate_function): Fixed comment.
	(add_constraint): Added CONST_WIDE_INT test.
	(write_tm_constrs_h): Do not emit hval or lval if target supports
	wide integers.
	* gensupport.c (std_preds): Added const_wide_int_operand and
	const_scalar_int_operand.
	* optabs.c (expand_subword_shift, expand_doubleword_shift,
	expand_absneg_bit, expand_copysign_absneg, expand_copysign_bit):
	Made to work with any size int.
	* postreload.c (reload_cse_simplify_set): Now uses wide-int api.
	* print-rtl.c (print_rtx): Added CONST_WIDE_INT case.
	* read-rtl.c (validate_const_wide_int): New function.
	(read_rtx_code): Added CONST_WIDE_INT case.
	* recog.c (const_scalar_int_operand, const_double_operand): New
	versions if target supports wide integers.
	(const_wide_int_operand): New function.
	* rtl.c (DEF_RTL_EXPR): Added CONST_WIDE_INT case.
	(rtx_size): Ditto.
	(rtx_alloc_stat, hwivec_output_hex, hwivec_check_failed_bounds):
	New functions.
	(iterative_hash_rtx): Added CONST_WIDE_INT case.
	* rtl.def (CONST_WIDE_INT): New.
	* rtl.h (hwivec_def): New function.
	(HWI_GET_NUM_ELEM, HWI_PUT_NUM_ELEM, CONST_WIDE_INT_P,
	CONST_SCALAR_INT_P, XHWIVEC_ELT, HWIVEC_CHECK, CONST_WIDE_INT_VEC,
	CONST_WIDE_INT_NUNITS, CONST_WIDE_INT_ELT, rtx_alloc_v): New macros.
	(chain_next): Added hwiv case.
	(CASE_CONST_SCALAR_INT, CONST_INT, CONST_WIDE_INT): Added new defs
	if target supports wide ints.
	* rtlanal.c (commutative_operand_precedence, split_double): Added
	CONST_WIDE_INT case.
	* sched-vis.c (print_value): Added CONST_WIDE_INT case and modified
	DOUBLE_INT case.
	* sel-sched-ir.c (lhs_and_rhs_separable_p): Fixed comment.
	* simplify-rtx.c (mode_signbit_p, simplify_const_unary_operation,
	simplify_binary_operation_1, simplify_const_binary_operation,
	simplify_const_relational_operation, simplify_immed_subreg):
	Made to work with any size int.
	* tree-ssa-address.c (addr_for_mem_ref): Changed to use wide-int
	rather than double-int.
	* tree.c (wide_int_to_tree): New function.
	* var-tracking.c (loc_cmp): Added CONST_WIDE_INT case.
	* varasm.c (const_rtx_hash_1): Added CONST_WIDE_INT case.
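As an aside on the var-tracking.c (loc_cmp) entry: the ordering it adds for two CONST_WIDE_INT values (number of elements first, then the elements themselves from most significant down) can be sketched in standalone C++ roughly as follows. This is only an illustration under stated assumptions, not the patch's code: hwi_vec and compare_wide are hypothetical names, a std::vector stands in for the rtx element vector, and the first length test is written as a strict > so that values of equal length reach the element loop.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical stand-in for the element vector of a CONST_WIDE_INT.
// Element 0 is the least significant word.
using hwi_vec = std::vector<std::int64_t>;

// Returns -1, 0 or 1.  Vectors with fewer elements sort first; vectors
// of equal length are compared element by element, starting from the
// most significant element.
int compare_wide (const hwi_vec &x, const hwi_vec &y)
{
  // Assumption of this sketch: both values have at least one element.
  assert (!x.empty () && !y.empty ());

  // Compare the vector lengths first.
  if (x.size () > y.size ())
    return 1;
  if (x.size () < y.size ())
    return -1;

  // Same length: compare the elements, most significant first.
  for (std::size_t j = x.size (); j-- > 0; )
    {
      if (x[j] < y[j])
        return -1;
      if (x[j] > y[j])
        return 1;
    }
  return 0;
}
```

Any total order would presumably do for a function like loc_cmp; length-then-elements is simply cheap to compute and gives the same answer regardless of how the values were created.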
* Re: GCC 4.8.0 Status Report (2012-10-29), Stage 1 to end soon
From: Andrew Haley @ 2012-10-31 18:34 UTC
To: Richard Biener; +Cc: Kenneth Zadeck, Jakub Jelinek, gcc, gcc-patches

On 10/31/2012 09:49 AM, Richard Biener wrote:
> On Tue, Oct 30, 2012 at 10:05 PM, Kenneth Zadeck
> <zadeck@naturalbridge.com> wrote:
>> jakub,
>>
>> i am hoping to get the rest of my wide integer conversion posted by nov 5.
>> I am under some adverse conditions here: hurricane sandy hit here pretty
>> badly. my house is hooked up to a small generator, and no one has any power
>> for miles around.
>>
>> So far richi has promised to review them. he has sent some comments, but
>> so far no reviews. Some time after i get the first round of them posted,
>> i will do a second round that incorporates everyone's comments.
>>
>> But i would like a little slack here if possible. While this work is a
>> show stopper for my private port, the patches address serious problems for
>> many of the public ports, especially ones that have very flexible vector
>> units. I believe that there is a significant set of latent problems
>> currently with the existing ports that use ti mode that these patches will
>> fix.
>>
>> However, i will do everything in my power to get the first round of the
>> patches posted by nov 5 deadline.
>
> I suppose you are not going to merge your private port for 4.8 and thus
> the wide-int changes are not a show-stopper for you.
>
> That said, I considered the main conversion to be appropriate to be
> deferred for the next stage1. There is no advantage in disrupting the
> tree more at this stage.

We are still in Stage 1. If it were later in the release cycle this
argument would have some merit, but under the rules this sort of thing
is allowed at any point in Stage 1. If we aren't going to allow
something like this because "it's too late" we should have closed
Stage 1 earlier.

Andrew.
* Re: GCC 4.8.0 Status Report (2012-10-29), Stage 1 to end soon
From: Sriraman Tallam @ 2012-10-30 22:06 UTC
To: Jakub Jelinek; +Cc: gcc, GCC Patches

Hi Jakub,

My function multiversioning patch is being reviewed and I hope to get
this in by Nov. 5.

Thanks,
-Sri.

On Mon, Oct 29, 2012 at 10:56 AM, Jakub Jelinek <jakub@redhat.com> wrote:
> Status
> ======
>
> I'd like to close the stage 1 phase of GCC 4.8 development
> on Monday, November 5th. If you have still patches for new features you'd
> like to see in GCC 4.8, please post them for review soon. Patches
> posted before the freeze, but reviewed shortly after the freeze, may
> still go in, further changes should be just bugfixes and documentation
> fixes.
>
>
> Quality Data
> ============
>
> Priority          #   Change from Last Report
> --------        ---   -----------------------
> P1               23   + 23
> P2               77   +  8
> P3               85   + 84
> --------        ---   -----------------------
> Total           185   +115
>
>
> Previous Report
> ===============
>
> http://gcc.gnu.org/ml/gcc/2012-03/msg00011.html
>
> The next report will be sent by me again, announcing end of stage 1.
* RE: GCC 4.8.0 Status Report (2012-10-29), Stage 1 to end soon
From: Bin Cheng @ 2012-10-31 9:09 UTC
To: 'Jakub Jelinek', gcc; +Cc: gcc-patches

> -----Original Message-----
> From: gcc-patches-owner@gcc.gnu.org [mailto:gcc-patches-owner@gcc.gnu.org] On
> Behalf Of Jakub Jelinek
> Sent: Tuesday, October 30, 2012 1:57 AM
> To: gcc@gcc.gnu.org
> Cc: gcc-patches@gcc.gnu.org
> Subject: GCC 4.8.0 Status Report (2012-10-29), Stage 1 to end soon
>
> Status
> ======
>
> I'd like to close the stage 1 phase of GCC 4.8 development on Monday, November
> 5th. If you have still patches for new features you'd like to see in GCC 4.8,
> please post them for review soon. Patches posted before the freeze, but
> reviewed shortly after the freeze, may still go in, further changes should be
> just bugfixes and documentation fixes.
>
>
> Quality Data
> ============
>
> Priority          #   Change from Last Report
> --------        ---   -----------------------
> P1               23   + 23
> P2               77   +  8
> P3               85   + 84
> --------        ---   -----------------------
> Total           185   +115
>
>
> Previous Report
> ===============
>
> http://gcc.gnu.org/ml/gcc/2012-03/msg00011.html
>
> The next report will be sent by me again, announcing end of stage 1.

Hi,
I am working on the register-pressure-directed hoist pass and have
committed the main patch to trunk. I still have two patches in this
area that improve it. I will send these two patches shortly and hope
they can be included in 4.8 if OK.

Thanks.
* Re: GCC 4.8.0 Status Report (2012-10-29), Stage 1 to end soon
From: Richard Biener @ 2012-10-31 10:23 UTC
To: Jakub Jelinek; +Cc: gcc, gcc-patches, David Malcolm, Michael Matz

On Mon, Oct 29, 2012 at 6:56 PM, Jakub Jelinek <jakub@redhat.com> wrote:
> Status
> ======
>
> I'd like to close the stage 1 phase of GCC 4.8 development
> on Monday, November 5th. If you have still patches for new features you'd
> like to see in GCC 4.8, please post them for review soon.

Reminds me of the stable plugin API for introspection. David, Micha - what's
the status here? Adding this is certainly ok during stage3 and I think that
we should have something in 4.8 to kick off further development here.

Richard.
* Re: GCC 4.8.0 Status Report (2012-10-29), Stage 1 to end soon
2012-10-31 10:23 ` Richard Biener
@ 2012-11-05 16:32 ` David Malcolm
0 siblings, 0 replies; 59+ messages in thread
From: David Malcolm @ 2012-11-05 16:32 UTC (permalink / raw)
To: Richard Biener; +Cc: Jakub Jelinek, gcc, gcc-patches, Michael Matz
On Wed, 2012-10-31 at 11:13 +0100, Richard Biener wrote:
> On Mon, Oct 29, 2012 at 6:56 PM, Jakub Jelinek <jakub@redhat.com> wrote:
> > Status
> > ======
> >
> > I'd like to close the stage 1 phase of GCC 4.8 development
> > on Monday, November 5th. If you have still patches for new features you'd
> > like to see in GCC 4.8, please post them for review soon.
>
> Reminds me of the stable plugin API for introspection. David, Micha - what's
> the status here? Adding this is certainly ok during stage3 and I think that
> we should have something in 4.8 to kick off further development here.
(sorry for the belated response, I was on vacation).
I'm currently leaning towards having the API as a separate source tree
that can be compiled against 4.6 through 4.8 onwards (hiding all
necessary compatibility cruft within it [1]), generating a library that
plugins can link against, providing a consistent C API across all of
these GCC versions.
Keeping it out-of-tree allows plugins to be written that can work with
older versions of gcc, and allows the plugin API to change more rapidly
than the rest of gcc (especially important for these older gcc
releases). Distributions of gcc could build the plugin api at the same
time as gcc, albeit from a separate tarball.
When the API is more mature, we could merge it inside gcc proper, I
guess.
I'll try to post something later today.
Dave
[1] e.g. C vs C++ linkage
^ permalink raw reply [flat|nested] 59+ messages in thread
* Re: GCC 4.8.0 Status Report (2012-10-29), Stage 1 to end soon
2012-10-29 18:08 GCC 4.8.0 Status Report (2012-10-29), Stage 1 to end soon Jakub Jelinek
` (7 preceding siblings ...)
2012-10-31 10:23 ` Richard Biener
@ 2012-10-31 10:31 ` JonY
2012-10-31 10:44 ` Jakub Jelinek
2012-10-31 11:12 ` Jonathan Wakely
2012-11-02 22:51 ` [wwwdocs] PATCH for " Gerald Pfeifer
` (2 subsequent siblings)
11 siblings, 2 replies; 59+ messages in thread
From: JonY @ 2012-10-31 10:31 UTC (permalink / raw)
To: gcc; +Cc: gcc-patches
On 10/30/2012 01:56, Jakub Jelinek wrote:
> Status
> ======
>
> I'd like to close the stage 1 phase of GCC 4.8 development
> on Monday, November 5th. If you have still patches for new features you'd
> like to see in GCC 4.8, please post them for review soon. Patches
> posted before the freeze, but reviewed shortly after the freeze, may
> still go in, further changes should be just bugfixes and documentation
> fixes.
>
Somebody with commit rights please push "[Patch] Remove
_GLIBCXX_HAVE_BROKEN_VSWPRINTF from mingw32-w64/os_defines.h".
Kai has already approved, but is off for the week.
^ permalink raw reply [flat|nested] 59+ messages in thread
* Re: GCC 4.8.0 Status Report (2012-10-29), Stage 1 to end soon 2012-10-31 10:31 ` JonY @ 2012-10-31 10:44 ` Jakub Jelinek 2012-10-31 11:12 ` Jonathan Wakely 1 sibling, 0 replies; 59+ messages in thread From: Jakub Jelinek @ 2012-10-31 10:44 UTC (permalink / raw) To: JonY; +Cc: gcc, gcc-patches On Wed, Oct 31, 2012 at 06:25:45PM +0800, JonY wrote: > On 10/30/2012 01:56, Jakub Jelinek wrote: > > I'd like to close the stage 1 phase of GCC 4.8 development > > on Monday, November 5th. If you have still patches for new features you'd > > like to see in GCC 4.8, please post them for review soon. Patches > > posted before the freeze, but reviewed shortly after the freeze, may > > still go in, further changes should be just bugfixes and documentation > > fixes. > > > > Somebody with commit rights please push "[Patch] Remove > _GLIBCXX_HAVE_BROKEN_VSWPRINTF from mingw32-w64/os_defines.h". > > Kai has already approved, but is off for the week. That looks like a bugfix (or even regression bugfix). Bugfixes are fine through stage 3, regression bugfixes are fine even in stage 4. Jakub ^ permalink raw reply [flat|nested] 59+ messages in thread
* Re: GCC 4.8.0 Status Report (2012-10-29), Stage 1 to end soon 2012-10-31 10:31 ` JonY 2012-10-31 10:44 ` Jakub Jelinek @ 2012-10-31 11:12 ` Jonathan Wakely 1 sibling, 0 replies; 59+ messages in thread From: Jonathan Wakely @ 2012-10-31 11:12 UTC (permalink / raw) To: JonY; +Cc: gcc, gcc-patches, libstdc++ On 31 October 2012 10:25, JonY wrote: > On 10/30/2012 01:56, Jakub Jelinek wrote: >> Status >> ====== >> >> I'd like to close the stage 1 phase of GCC 4.8 development >> on Monday, November 5th. If you have still patches for new features you'd >> like to see in GCC 4.8, please post them for review soon. Patches >> posted before the freeze, but reviewed shortly after the freeze, may >> still go in, further changes should be just bugfixes and documentation >> fixes. >> > > Somebody with commit rights please push "[Patch] Remove > _GLIBCXX_HAVE_BROKEN_VSWPRINTF from mingw32-w64/os_defines.h". > > Kai has already approved, but is off for the week. I could have done that, if it had been sent to the right lists. All libstdc++ patches go to both gcc-patches and libstdc++@gcc.gnu.org please. Let's move this to the libstdc++ list, I have some questions about the patch. ^ permalink raw reply [flat|nested] 59+ messages in thread
* [wwwdocs] PATCH for Re: GCC 4.8.0 Status Report (2012-10-29), Stage 1 to end soon
2012-10-29 18:08 GCC 4.8.0 Status Report (2012-10-29), Stage 1 to end soon Jakub Jelinek
` (8 preceding siblings ...)
2012-10-31 10:31 ` JonY
@ 2012-11-02 22:51 ` Gerald Pfeifer
2012-11-05 12:42 ` Peter Bergner
2012-11-06 2:57 ` Easwaran Raman
11 siblings, 0 replies; 59+ messages in thread
From: Gerald Pfeifer @ 2012-11-02 22:51 UTC (permalink / raw)
To: gcc-patches
On Mon, 29 Oct 2012, Jakub Jelinek wrote:
> I'd like to close the stage 1 phase of GCC 4.8 development
Documented via the patch below.
I also changed "Active Development" to "Development" to reduce text
density and improve formatting on a wider range of window/text sizes.
Gerald
Index: index.html
===================================================================
RCS file: /cvs/gcc/wwwdocs/htdocs/index.html,v
retrieving revision 1.863
diff -u -3 -p -r1.863 index.html
--- index.html 20 Sep 2012 15:35:43 -0000 1.863
+++ index.html 2 Nov 2012 22:48:54 -0000
@@ -171,12 +171,12 @@ Any additions? Don't be shy, send them
   </span>
 </dd>

-<dt><span class="version">Active development:</span>
+<dt><span class="version">Development:</span>
   GCC 4.8.0 (<a href="gcc-4.8/changes.html">changes</a>, <a href="gcc-4.8/criteria.html">release criteria</a>)
 </dt><dd>
   Status:
   <!--GCC 4.8 status below-->
-  <a href="http://gcc.gnu.org/ml/gcc/2012-03/msg00011.html">2012-03-02</a>
+  <a href="http://gcc.gnu.org/ml/gcc/2012-10/msg00434.html">2012-10-29</a>
   <!--GCC 4.8 status above-->
   (general development, stage 1).
   <br />
^ permalink raw reply [flat|nested] 59+ messages in thread
* Re: GCC 4.8.0 Status Report (2012-10-29), Stage 1 to end soon
2012-10-29 18:08 GCC 4.8.0 Status Report (2012-10-29), Stage 1 to end soon Jakub Jelinek
` (9 preceding siblings ...)
2012-11-02 22:51 ` [wwwdocs] PATCH for " Gerald Pfeifer
@ 2012-11-05 12:42 ` Peter Bergner
2012-11-05 12:53 ` Jakub Jelinek
2012-11-06 2:57 ` Easwaran Raman
11 siblings, 1 reply; 59+ messages in thread
From: Peter Bergner @ 2012-11-05 12:42 UTC (permalink / raw)
To: Jakub Jelinek; +Cc: gcc, gcc-patches
On Mon, 2012-10-29 at 18:56 +0100, Jakub Jelinek wrote:
> Status
> ======
>
> I'd like to close the stage 1 phase of GCC 4.8 development
> on Monday, November 5th. If you have still patches for new features you'd
> like to see in GCC 4.8, please post them for review soon. Patches
> posted before the freeze, but reviewed shortly after the freeze, may
> still go in, further changes should be just bugfixes and documentation
> fixes.
I'd like to post later today (hopefully this morning) a very minimal
configure patch that adds the -mcpu=power8 and -mtune=power8 compiler
options to gcc. Currently, power8 will be an alias for power7, but
getting this patch in now allows us to add power8 support to the
compiler without having to touch the arch independent configure script.
The only hang up at the moment is we're still determining the
assembler mnemonic we'll be releasing that the gcc configure script
will use to test for power8 assembler support.
Peter
^ permalink raw reply [flat|nested] 59+ messages in thread
* Re: GCC 4.8.0 Status Report (2012-10-29), Stage 1 to end soon 2012-11-05 12:42 ` Peter Bergner @ 2012-11-05 12:53 ` Jakub Jelinek 2012-11-05 14:40 ` Peter Bergner 0 siblings, 1 reply; 59+ messages in thread From: Jakub Jelinek @ 2012-11-05 12:53 UTC (permalink / raw) To: Peter Bergner; +Cc: gcc, gcc-patches On Mon, Nov 05, 2012 at 06:41:47AM -0600, Peter Bergner wrote: > On Mon, 2012-10-29 at 18:56 +0100, Jakub Jelinek wrote: > > I'd like to close the stage 1 phase of GCC 4.8 development > > on Monday, November 5th. If you have still patches for new features you'd > > like to see in GCC 4.8, please post them for review soon. Patches > > posted before the freeze, but reviewed shortly after the freeze, may > > still go in, further changes should be just bugfixes and documentation > > fixes. > > I'd like to post later today (hopefully this morning) a very minimal > configure patch that adds the -mcpu=power8 and -mtune=power8 compiler > options to gcc. Currently, power8 will be an alias for power7, but > getting this path in now allows us to add power8 support to the > compiler without having to touch the arch independent configure script. config.gcc target specific hunks are part of the backend, the individual target maintainers can approve changes to that, I really don't see a reason to add a dummy alias now just for that. If the power8 enablement is approved and non-intrusive enough that it would be acceptable even during stage 3, then so would be corresponding config.gcc changes. Jakub ^ permalink raw reply [flat|nested] 59+ messages in thread
* Re: GCC 4.8.0 Status Report (2012-10-29), Stage 1 to end soon 2012-11-05 12:53 ` Jakub Jelinek @ 2012-11-05 14:40 ` Peter Bergner 2012-11-05 14:48 ` Jakub Jelinek 0 siblings, 1 reply; 59+ messages in thread From: Peter Bergner @ 2012-11-05 14:40 UTC (permalink / raw) To: Jakub Jelinek; +Cc: gcc, gcc-patches On Mon, 2012-11-05 at 13:53 +0100, Jakub Jelinek wrote: > On Mon, Nov 05, 2012 at 06:41:47AM -0600, Peter Bergner wrote: > > I'd like to post later today (hopefully this morning) a very minimal > > configure patch that adds the -mcpu=power8 and -mtune=power8 compiler > > options to gcc. Currently, power8 will be an alias for power7, but > > getting this path in now allows us to add power8 support to the > > compiler without having to touch the arch independent configure script. > > config.gcc target specific hunks are part of the backend, the individual > target maintainers can approve changes to that, I really don't see a reason > to add a dummy alias now just for that. If the power8 enablement is > approved and non-intrusive enough that it would be acceptable even during > stage 3, then so would be corresponding config.gcc changes. Well we also patch config.in and configure.ac/configure. If those are acceptable to be patched later too, then great. If not, the patch isn't really very large. We did do this for power7 initially too: http://gcc.gnu.org/ml/gcc-patches/2008-08/msg00162.html Peter ^ permalink raw reply [flat|nested] 59+ messages in thread
* Re: GCC 4.8.0 Status Report (2012-10-29), Stage 1 to end soon 2012-11-05 14:40 ` Peter Bergner @ 2012-11-05 14:48 ` Jakub Jelinek 2012-11-06 4:47 ` Peter Bergner 0 siblings, 1 reply; 59+ messages in thread From: Jakub Jelinek @ 2012-11-05 14:48 UTC (permalink / raw) To: Peter Bergner; +Cc: gcc, gcc-patches On Mon, Nov 05, 2012 at 08:40:00AM -0600, Peter Bergner wrote: > Well we also patch config.in and configure.ac/configure. If those are > acceptable to be patched later too, then great. If not, the patch That is the same thing as config.gcc bits. > isn't really very large. We did do this for power7 initially too: > > http://gcc.gnu.org/ml/gcc-patches/2008-08/msg00162.html But then power7 patch went in during stage1 of the n+1 release, and wasn't really backported to release branch (just to distro vendor branches), right? Jakub ^ permalink raw reply [flat|nested] 59+ messages in thread
* Re: GCC 4.8.0 Status Report (2012-10-29), Stage 1 to end soon 2012-11-05 14:48 ` Jakub Jelinek @ 2012-11-06 4:47 ` Peter Bergner 0 siblings, 0 replies; 59+ messages in thread From: Peter Bergner @ 2012-11-06 4:47 UTC (permalink / raw) To: Jakub Jelinek; +Cc: gcc, gcc-patches On Mon, 2012-11-05 at 15:47 +0100, Jakub Jelinek wrote: > On Mon, Nov 05, 2012 at 08:40:00AM -0600, Peter Bergner wrote: > > Well we also patch config.in and configure.ac/configure. If those are > > acceptable to be patched later too, then great. If not, the patch > > That is the same thing as config.gcc bits. > > > isn't really very large. We did do this for power7 initially too: > > > > http://gcc.gnu.org/ml/gcc-patches/2008-08/msg00162.html > > But then power7 patch went in during stage1 of the n+1 release, and > wasn't really backported to release branch (just to distro vendor branches), > right? I think we could have done better there, yes, but not all of our patches were appropriate for backporting, especially those parts that touched outside of the port. There will be portions of power8 we won't/don't want to backport either, but I would like to get the major backend portions like machine description files and the like backported to 4.8 when the time comes. Having the configurey changes in would help that, but if you say those are things we can get in after stage1, then that can ease things a bit. That said, I'll post our current patch as is and discuss within our team and with David on what our next course of action should be. Peter ^ permalink raw reply [flat|nested] 59+ messages in thread
* Re: GCC 4.8.0 Status Report (2012-10-29), Stage 1 to end soon 2012-10-29 18:08 GCC 4.8.0 Status Report (2012-10-29), Stage 1 to end soon Jakub Jelinek ` (10 preceding siblings ...) 2012-11-05 12:42 ` Peter Bergner @ 2012-11-06 2:57 ` Easwaran Raman 11 siblings, 0 replies; 59+ messages in thread From: Easwaran Raman @ 2012-11-06 2:57 UTC (permalink / raw) To: Jakub Jelinek; +Cc: GCC Mailing List, gcc-patches I'd like to get a small patch to tree reassociation ( http://gcc.gnu.org/ml/gcc-patches/2012-10/msg01761.html ) in. Thanks, Easwaran On Mon, Oct 29, 2012 at 10:56 AM, Jakub Jelinek <jakub@redhat.com> wrote: > Status > ====== > > I'd like to close the stage 1 phase of GCC 4.8 development > on Monday, November 5th. If you have still patches for new features you'd > like to see in GCC 4.8, please post them for review soon. Patches > posted before the freeze, but reviewed shortly after the freeze, may > still go in, further changes should be just bugfixes and documentation > fixes. > > > Quality Data > ============ > > Priority # Change from Last Report > -------- --- ----------------------- > P1 23 + 23 > P2 77 + 8 > P3 85 + 84 > -------- --- ----------------------- > Total 185 +115 > > > Previous Report > =============== > > http://gcc.gnu.org/ml/gcc/2012-03/msg00011.html > > The next report will be sent by me again, announcing end of stage 1. ^ permalink raw reply [flat|nested] 59+ messages in thread
* Re: GCC 4.8.0 Status Report (2012-10-29), Stage 1 to end soon
@ 2012-10-30 23:18 Sharad Singhai
2012-11-01 7:52 ` Sharad Singhai
0 siblings, 1 reply; 59+ messages in thread
From: Sharad Singhai @ 2012-10-30 23:18 UTC (permalink / raw)
To: Jakub Jelinek; +Cc: gcc-patches, gcc
Hi Jakub,
My -fopt-info pass filtering patch
(http://gcc.gnu.org/ml/gcc-patches/2012-10/msg02704.html) is being
reviewed and I hope to get this in by Nov. 5 for inclusion in gcc
4.8.0.
Thanks,
Sharad
On Mon, Oct 29, 2012 at 10:56 AM, Jakub Jelinek <jakub@redhat.com> wrote:
> Status
> ======
>
> I'd like to close the stage 1 phase of GCC 4.8 development
> on Monday, November 5th. If you have still patches for new features you'd
> like to see in GCC 4.8, please post them for review soon. Patches
> posted before the freeze, but reviewed shortly after the freeze, may
> still go in, further changes should be just bugfixes and documentation
> fixes.
>
>
> Quality Data
> ============
>
> Priority # Change from Last Report
> -------- --- -----------------------
> P1 23 + 23
> P2 77 + 8
> P3 85 + 84
> -------- --- -----------------------
> Total 185 +115
>
>
> Previous Report
> ===============
>
> http://gcc.gnu.org/ml/gcc/2012-03/msg00011.html
>
> The next report will be sent by me again, announcing end of stage 1.
^ permalink raw reply [flat|nested] 59+ messages in thread
* Re: GCC 4.8.0 Status Report (2012-10-29), Stage 1 to end soon 2012-10-30 23:18 Sharad Singhai @ 2012-11-01 7:52 ` Sharad Singhai 2012-11-01 12:28 ` Jakub Jelinek 0 siblings, 1 reply; 59+ messages in thread From: Sharad Singhai @ 2012-11-01 7:52 UTC (permalink / raw) To: Jakub Jelinek; +Cc: gcc-patches, gcc On Tue, Oct 30, 2012 at 4:04 PM, Sharad Singhai <singhai@google.com> wrote: > Hi Jakub, > > My -fopt-info pass filtering patch > (http://gcc.gnu.org/ml/gcc-patches/2012-10/msg02704.html) is being > reviewed and I hope to get this in by Nov. 5 for inclusion in gcc > 4.8.0. I just committed -fopt-info pass filtering patch as r193061. Thanks, Sharad > Thanks, > Sharad > > On Mon, Oct 29, 2012 at 10:56 AM, Jakub Jelinek <jakub@redhat.com> wrote: >> Status >> ====== >> >> I'd like to close the stage 1 phase of GCC 4.8 development >> on Monday, November 5th. If you have still patches for new features you'd >> like to see in GCC 4.8, please post them for review soon. Patches >> posted before the freeze, but reviewed shortly after the freeze, may >> still go in, further changes should be just bugfixes and documentation >> fixes. >> >> >> Quality Data >> ============ >> >> Priority # Change from Last Report >> -------- --- ----------------------- >> P1 23 + 23 >> P2 77 + 8 >> P3 85 + 84 >> -------- --- ----------------------- >> Total 185 +115 >> >> >> Previous Report >> =============== >> >> http://gcc.gnu.org/ml/gcc/2012-03/msg00011.html >> >> The next report will be sent by me again, announcing end of stage 1. ^ permalink raw reply [flat|nested] 59+ messages in thread
* Re: GCC 4.8.0 Status Report (2012-10-29), Stage 1 to end soon
2012-11-01 7:52 ` Sharad Singhai
@ 2012-11-01 12:28 ` Jakub Jelinek
2012-11-01 13:09 ` Diego Novillo
2012-11-01 13:54 ` Sharad Singhai
0 siblings, 2 replies; 59+ messages in thread
From: Jakub Jelinek @ 2012-11-01 12:28 UTC (permalink / raw)
To: Sharad Singhai; +Cc: gcc-patches, gcc
On Thu, Nov 01, 2012 at 12:52:04AM -0700, Sharad Singhai wrote:
> On Tue, Oct 30, 2012 at 4:04 PM, Sharad Singhai <singhai@google.com> wrote:
> > Hi Jakub,
> >
> > My -fopt-info pass filtering patch
> > (http://gcc.gnu.org/ml/gcc-patches/2012-10/msg02704.html) is being
> > reviewed and I hope to get this in by Nov. 5 for inclusion in gcc
> > 4.8.0.
>
> I just committed -fopt-info pass filtering patch as r193061.
How was that change tested? I'm seeing thousands of new UNRESOLVED
failures, of the form:
spawn -ignore SIGHUP /usr/src/gcc/obj415/gcc/xgcc -B/usr/src/gcc/obj415/gcc/ /usr/src/gcc/gcc/testsuite/gcc.target/i386/branch-cost1.c -fno-diagnostics-show-caret -O2 -fdump-tree-gimple -mbranch-cost=0 -S -o branch-cost1.s
PASS: gcc.target/i386/branch-cost1.c (test for excess errors)
gcc.target/i386/branch-cost1.c: dump file does not exist
UNRESOLVED: gcc.target/i386/branch-cost1.c scan-tree-dump-times gimple "if " 2
gcc.target/i386/branch-cost1.c: dump file does not exist
UNRESOLVED: gcc.target/i386/branch-cost1.c scan-tree-dump-not gimple " & "
See http://gcc.gnu.org/ml/gcc-testresults/2012-11/msg00033.html
or http://gcc.gnu.org/ml/gcc-testresults/2012-11/msg00034.html, compare that
to http://gcc.gnu.org/ml/gcc-testresults/2012-11/msg00025.html
or http://gcc.gnu.org/ml/gcc-testresults/2012-11/msg00026.html
The difference is just your patch and unrelated sh backend change.
Jakub
^ permalink raw reply [flat|nested] 59+ messages in thread
* Re: GCC 4.8.0 Status Report (2012-10-29), Stage 1 to end soon 2012-11-01 12:28 ` Jakub Jelinek @ 2012-11-01 13:09 ` Diego Novillo 2012-11-01 16:41 ` Sharad Singhai 2012-11-01 13:54 ` Sharad Singhai 1 sibling, 1 reply; 59+ messages in thread From: Diego Novillo @ 2012-11-01 13:09 UTC (permalink / raw) To: Jakub Jelinek; +Cc: Sharad Singhai, gcc-patches, gcc On Thu, Nov 1, 2012 at 8:28 AM, Jakub Jelinek <jakub@redhat.com> wrote: > How was that change tested? I'm seeing thousands of new UNRESOLVED > failures, of the form: > spawn -ignore SIGHUP /usr/src/gcc/obj415/gcc/xgcc -B/usr/src/gcc/obj415/gcc/ /usr/src/gcc/gcc/testsuite/gcc.target/i386/branch-cost1.c -fno-diagnostics-show-caret -O2 -fdump-tree-gimple -mbranch-cost=0 -S -o branch-cost1.s > PASS: gcc.target/i386/branch-cost1.c (test for excess errors) > gcc.target/i386/branch-cost1.c: dump file does not exist > UNRESOLVED: gcc.target/i386/branch-cost1.c scan-tree-dump-times gimple "if " 2 > gcc.target/i386/branch-cost1.c: dump file does not exist > UNRESOLVED: gcc.target/i386/branch-cost1.c scan-tree-dump-not gimple " & " > > See http://gcc.gnu.org/ml/gcc-testresults/2012-11/msg00033.html > or http://gcc.gnu.org/ml/gcc-testresults/2012-11/msg00034.html, compare that > to http://gcc.gnu.org/ml/gcc-testresults/2012-11/msg00025.html > or http://gcc.gnu.org/ml/gcc-testresults/2012-11/msg00026.html > > The difference is just your patch and unrelated sh backend change. I'm seeing the same failures. Sharad, could you fix them or revert your change? Thanks. Diego. ^ permalink raw reply [flat|nested] 59+ messages in thread
* Re: GCC 4.8.0 Status Report (2012-10-29), Stage 1 to end soon
2012-11-01 13:09 ` Diego Novillo
@ 2012-11-01 16:41 ` Sharad Singhai
2012-11-01 16:44 ` Diego Novillo
2012-11-01 18:02 ` Sterling Augustine
0 siblings, 2 replies; 59+ messages in thread
From: Sharad Singhai @ 2012-11-01 16:41 UTC (permalink / raw)
To: Diego Novillo; +Cc: Jakub Jelinek, gcc-patches, gcc
I found the problem and the following patch fixes it. The issue with
my testing was that I was only looking at 'FAIL' lines but forgot to
tally the 'UNRESOLVED' test cases, the real symptoms of my test
problems. In any case, I am rerunning the whole testsuite just to be
sure.
Assuming tests pass, is it okay to commit the following?
Thanks,
Sharad
2012-11-01 Sharad Singhai <singhai@google.com>
PR other/55164
* dumpfile.h (struct dump_file_info): Fix order of flags.
Index: dumpfile.h
===================================================================
--- dumpfile.h (revision 193061)
+++ dumpfile.h (working copy)
@@ -113,8 +113,8 @@ struct dump_file_info
   const char *alt_filename; /* filename for the -fopt-info stream */
   FILE *pstream; /* pass-specific dump stream */
   FILE *alt_stream; /* -fopt-info stream */
+  int pflags; /* dump flags */
   int optgroup_flags; /* optgroup flags for -fopt-info */
-  int pflags; /* dump flags */
   int alt_flags; /* flags for opt-info */
   int pstate; /* state of pass-specific stream */
   int alt_state; /* state of the -fopt-info stream */
^ permalink raw reply [flat|nested] 59+ messages in thread
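[Editorial illustration, not part of the original message.] A two-line reordering like the one in the patch above can matter when records of the struct are filled in with positional initializers, because each value is assigned by position rather than by name. The thread does not show how the `dump_file_info` records are initialized, so the following is only a minimal sketch under that assumption; the struct names, the flag value, and the helper functions are all hypothetical and are not GCC's definitions.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical flag value, invented for this sketch. */
#define DEMO_DUMP_FLAG 0x1

/* Field order that the positional initializers below were written for. */
struct info_expected {
  const char *suffix;
  int pflags;          /* dump flags */
  int optgroup_flags;  /* optimization group flags */
};

/* Same fields, but with the two int members swapped. */
struct info_swapped {
  const char *suffix;
  int optgroup_flags;
  int pflags;
};

/* Positional initializers: the second value goes to the second field,
   whatever that field happens to be called. */
static const struct info_expected rec_expected = { ".gimple", DEMO_DUMP_FLAG, 0 };
static const struct info_swapped rec_swapped = { ".gimple", DEMO_DUMP_FLAG, 0 };

/* C99 designated initializers name each field, so order no longer matters. */
static const struct info_swapped rec_designated = {
  .suffix = ".gimple", .pflags = DEMO_DUMP_FLAG, .optgroup_flags = 0
};

/* Each helper reports whether the dump flag ended up in pflags. */
int dump_requested_expected(void) {
  return (rec_expected.pflags & DEMO_DUMP_FLAG) != 0;
}

int dump_requested_swapped(void) {
  return (rec_swapped.pflags & DEMO_DUMP_FLAG) != 0;
}

int dump_requested_designated(void) {
  return (rec_designated.pflags & DEMO_DUMP_FLAG) != 0;
}
```

In this sketch the swapped struct silently receives the dump flag in `optgroup_flags`, so a check on `pflags` sees nothing requested. That would be consistent with the "dump file does not exist" symptom reported earlier in the thread, though the thread itself does not spell out the mechanism.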
* Re: GCC 4.8.0 Status Report (2012-10-29), Stage 1 to end soon 2012-11-01 16:41 ` Sharad Singhai @ 2012-11-01 16:44 ` Diego Novillo 2012-11-01 17:59 ` Sharad Singhai 2012-11-01 18:02 ` Sterling Augustine 1 sibling, 1 reply; 59+ messages in thread From: Diego Novillo @ 2012-11-01 16:44 UTC (permalink / raw) To: Sharad Singhai; +Cc: Jakub Jelinek, gcc-patches, gcc On Thu, Nov 1, 2012 at 12:40 PM, Sharad Singhai <singhai@google.com> wrote: > I found the problem and the following patch fixes it. The issue with > my testing was that I was only looking at 'FAIL' lines but forgot to > tally the 'UNRESOLVED' test cases, the real symptoms of my test > problems. In any case, I am rerunning the whole testsuite just to be > sure. > > Assuming tests pass, is it okay to commit the following? > > Thanks, > Sharad > > 2012-11-01 Sharad Singhai <singhai@google.com> > > PR other/55164 > * dumpfile.h (struct dump_file_info): Fix order of flags. OK (remember to insert a tab at the start of each ChangeLog line). Diego. ^ permalink raw reply [flat|nested] 59+ messages in thread
* Re: GCC 4.8.0 Status Report (2012-10-29), Stage 1 to end soon 2012-11-01 16:44 ` Diego Novillo @ 2012-11-01 17:59 ` Sharad Singhai 0 siblings, 0 replies; 59+ messages in thread From: Sharad Singhai @ 2012-11-01 17:59 UTC (permalink / raw) To: Diego Novillo; +Cc: Jakub Jelinek, gcc-patches, gcc On Thu, Nov 1, 2012 at 9:44 AM, Diego Novillo <dnovillo@google.com> wrote: > On Thu, Nov 1, 2012 at 12:40 PM, Sharad Singhai <singhai@google.com> wrote: >> I found the problem and the following patch fixes it. The issue with >> my testing was that I was only looking at 'FAIL' lines but forgot to >> tally the 'UNRESOLVED' test cases, the real symptoms of my test >> problems. In any case, I am rerunning the whole testsuite just to be >> sure. >> >> Assuming tests pass, is it okay to commit the following? >> >> Thanks, >> Sharad >> >> 2012-11-01 Sharad Singhai <singhai@google.com> >> >> PR other/55164 >> * dumpfile.h (struct dump_file_info): Fix order of flags. > > OK (remember to insert a tab at the start of each ChangeLog line). Fixed tab chars. (they were really there, but gmail ate them! :)) Retested and found all my 'UNRESOLVED' problems were gone. Hence committed the fix as r193064. Thanks, Sharad > > Diego. ^ permalink raw reply [flat|nested] 59+ messages in thread
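[Editorial illustration, not part of the original message.] The testing gap described in this subthread, where only 'FAIL' lines were checked, can be avoided by tallying every outcome that needs attention. The sketch below assumes the `STATUS: test name` line format visible in the log quoted earlier in the thread; treating FAIL, XPASS, and UNRESOLVED as the problem set is an assumption of this sketch, and the function names and sample array are hypothetical.

```c
#include <assert.h>
#include <string.h>

/* Returns 1 if LINE begins with "<status>: ", else 0. */
static int has_status(const char *line, const char *status) {
  size_t n = strlen(status);
  return strncmp(line, status, n) == 0 && line[n] == ':' && line[n + 1] == ' ';
}

/* Counts the lines that carry exactly the given status. */
int count_status(const char *const *lines, size_t nlines, const char *status) {
  int count = 0;
  for (size_t i = 0; i < nlines; i++)
    if (has_status(lines[i], status))
      count++;
  return count;
}

/* Counts lines whose outcome needs attention. The status set used here
   (FAIL, XPASS, UNRESOLVED) is an assumption of this sketch. */
int count_problem_lines(const char *const *lines, size_t nlines) {
  static const char *const problem[] = { "FAIL", "XPASS", "UNRESOLVED" };
  int count = 0;
  for (size_t j = 0; j < sizeof problem / sizeof problem[0]; j++)
    count += count_status(lines, nlines, problem[j]);
  return count;
}

/* Sample lines copied from the log quoted earlier in the thread. */
static const char *const demo_log[] = {
  "PASS: gcc.target/i386/branch-cost1.c (test for excess errors)",
  "gcc.target/i386/branch-cost1.c: dump file does not exist",
  "UNRESOLVED: gcc.target/i386/branch-cost1.c scan-tree-dump-times gimple \"if \" 2",
  "gcc.target/i386/branch-cost1.c: dump file does not exist",
  "UNRESOLVED: gcc.target/i386/branch-cost1.c scan-tree-dump-not gimple \" & \"",
};
```

On the sample lines, `count_status(demo_log, 5, "FAIL")` finds nothing while `count_problem_lines(demo_log, 5)` reports both UNRESOLVED entries, which is the difference between the two ways of reading the results.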
* Re: GCC 4.8.0 Status Report (2012-10-29), Stage 1 to end soon
2012-11-01 16:41 ` Sharad Singhai
2012-11-01 16:44 ` Diego Novillo
@ 2012-11-01 18:02 ` Sterling Augustine
1 sibling, 0 replies; 59+ messages in thread
From: Sterling Augustine @ 2012-11-01 18:02 UTC (permalink / raw)
To: gcc-patches, gcc; +Cc: Jakub Jelinek
Hi Jakub,
I would like to get the fission implementation in before stage 1 ends. It has
been under review for some time, and is awaiting another round of
review now.
More info here:
http://gcc.gnu.org/ml/gcc-patches/2012-10/msg02684.html
Sterling
^ permalink raw reply [flat|nested] 59+ messages in thread
* Re: GCC 4.8.0 Status Report (2012-10-29), Stage 1 to end soon 2012-11-01 12:28 ` Jakub Jelinek 2012-11-01 13:09 ` Diego Novillo @ 2012-11-01 13:54 ` Sharad Singhai 1 sibling, 0 replies; 59+ messages in thread From: Sharad Singhai @ 2012-11-01 13:54 UTC (permalink / raw) To: Jakub Jelinek; +Cc: gcc-patches, gcc I am really sorry about that. I am looking and will fix the breakage or revert the patch shortly. Thanks, Sharad On Thu, Nov 1, 2012 at 5:28 AM, Jakub Jelinek <jakub@redhat.com> wrote: > On Thu, Nov 01, 2012 at 12:52:04AM -0700, Sharad Singhai wrote: >> On Tue, Oct 30, 2012 at 4:04 PM, Sharad Singhai <singhai@google.com> wrote: >> > Hi Jakub, >> > >> > My -fopt-info pass filtering patch >> > (http://gcc.gnu.org/ml/gcc-patches/2012-10/msg02704.html) is being >> > reviewed and I hope to get this in by Nov. 5 for inclusion in gcc >> > 4.8.0. >> >> I just committed -fopt-info pass filtering patch as r193061. > > How was that change tested? I'm seeing thousands of new UNRESOLVED > failures, of the form: > spawn -ignore SIGHUP /usr/src/gcc/obj415/gcc/xgcc -B/usr/src/gcc/obj415/gcc/ /usr/src/gcc/gcc/testsuite/gcc.target/i386/branch-cost1.c -fno-diagnostics-show-caret -O2 -fdump-tree-gimple -mbranch-cost=0 -S -o branch-cost1.s > PASS: gcc.target/i386/branch-cost1.c (test for excess errors) > gcc.target/i386/branch-cost1.c: dump file does not exist > UNRESOLVED: gcc.target/i386/branch-cost1.c scan-tree-dump-times gimple "if " 2 > gcc.target/i386/branch-cost1.c: dump file does not exist > UNRESOLVED: gcc.target/i386/branch-cost1.c scan-tree-dump-not gimple " & " > > See http://gcc.gnu.org/ml/gcc-testresults/2012-11/msg00033.html > or http://gcc.gnu.org/ml/gcc-testresults/2012-11/msg00034.html, compare that > to http://gcc.gnu.org/ml/gcc-testresults/2012-11/msg00025.html > or http://gcc.gnu.org/ml/gcc-testresults/2012-11/msg00026.html > > The difference is just your patch and unrelated sh backend change. 
> > Jakub ^ permalink raw reply [flat|nested] 59+ messages in thread