GCC 4.8.0 Status Report (2012-10-29), Stage 1 to end soon

public inbox for gcc-patches@gcc.gnu.org

* GCC 4.8.0 Status Report (2012-10-29), Stage 1 to end soon
@ 2012-10-29 18:08 Jakub Jelinek
  2012-10-29 18:13 ` David Miller
                   ` (11 more replies)
  0 siblings, 12 replies; 59+ messages in thread
From: Jakub Jelinek @ 2012-10-29 18:08 UTC (permalink / raw)
  To: gcc; +Cc: gcc-patches

Status
======

I'd like to close the stage 1 phase of GCC 4.8 development
on Monday, November 5th.  If you still have patches for new features you'd
like to see in GCC 4.8, please post them for review soon.  Patches
posted before the freeze, but reviewed shortly after the freeze, may
still go in; further changes should be just bugfixes and documentation
fixes.

Quality Data
============

Priority          #   Change from Last Report
--------        ---   -----------------------
P1               23   + 23
P2               77   +  8
P3               85   + 84
--------        ---   -----------------------
Total           185   +115

Previous Report
===============

http://gcc.gnu.org/ml/gcc/2012-03/msg00011.html

The next report will be sent by me again, announcing end of stage 1.

* Re: GCC 4.8.0 Status Report (2012-10-29), Stage 1 to end soon
  2012-10-29 18:08 GCC 4.8.0 Status Report (2012-10-29), Stage 1 to end soon Jakub Jelinek
@ 2012-10-29 18:13 ` David Miller
  2012-10-29 18:32   ` Eric Botcazou
  2012-10-30  8:22   ` Jakub Jelinek
  2012-10-29 22:14 ` Magnus Granberg
                   ` (10 subsequent siblings)
  11 siblings, 2 replies; 59+ messages in thread
From: David Miller @ 2012-10-29 18:13 UTC (permalink / raw)
  To: jakub; +Cc: gcc, gcc-patches

From: Jakub Jelinek <jakub@redhat.com>
Date: Mon, 29 Oct 2012 18:56:42 +0100

> I'd like to close the stage 1 phase of GCC 4.8 development
> on Monday, November 5th.  If you still have patches for new features you'd
> like to see in GCC 4.8, please post them for review soon.  Patches
> posted before the freeze, but reviewed shortly after the freeze, may
> still go in; further changes should be just bugfixes and documentation
> fixes.

I'd like to get the Sparc cbcond stuff in (3 revisions posted) which
is waiting for Eric B. to do some Solaris specific work.

I'd also like to enable LRA for at least 32-bit sparc, even if I can't
find the time to work on auditing 64-bit completely.

* Re: GCC 4.8.0 Status Report (2012-10-29), Stage 1 to end soon
  2012-10-29 18:13 ` David Miller
@ 2012-10-29 18:32   ` Eric Botcazou
  2012-10-29 18:42     ` David Miller
  2012-10-30  8:22   ` Jakub Jelinek
  1 sibling, 1 reply; 59+ messages in thread
From: Eric Botcazou @ 2012-10-29 18:32 UTC (permalink / raw)
  To: David Miller; +Cc: gcc, jakub, gcc-patches

> I'd like to get the Sparc cbcond stuff in (3 revisions posted) which
> is waiting for Eric B. to do some Solaris specific work.
> 
> I'd also like to enable LRA for at least 32-bit sparc, even if I can't
> find the time to work on auditing 64-bit completely.

End of stage #1 isn't a hard limit for architecture-specific patches, so we 
need not make a decision about LRA immediately.  I don't think we want to half 
enable it though, so it's all or nothing.

-- 
Eric Botcazou

* Re: GCC 4.8.0 Status Report (2012-10-29), Stage 1 to end soon
  2012-10-29 18:32   ` Eric Botcazou
@ 2012-10-29 18:42     ` David Miller
  0 siblings, 0 replies; 59+ messages in thread
From: David Miller @ 2012-10-29 18:42 UTC (permalink / raw)
  To: ebotcazou; +Cc: gcc, jakub, gcc-patches

From: Eric Botcazou <ebotcazou@adacore.com>
Date: Mon, 29 Oct 2012 20:25:15 +0100

>> I'd like to get the Sparc cbcond stuff in (3 revisions posted) which
>> is waiting for Eric B. to do some Solaris specific work.
>> 
>> I'd also like to enable LRA for at least 32-bit sparc, even if I can't
>> find the time to work on auditing 64-bit completely.
> 
> End of stage #1 isn't a hard limit for architecture-specific patches, so we 
> need not make a decision about LRA immediately.  I don't think we want to half 
> enable it though, so it's all or nothing.

Upon further consideration, agreed.  I'll only turn this on if I can
get the whole backend working.

FWIW, I think we should consider delaying stage1 for another reason.
A large number of North American developers are about to be hit by a
major natural disaster, and may be without power for weeks.

* Re: GCC 4.8.0 Status Report (2012-10-29), Stage 1 to end soon
  2012-10-29 18:13 ` David Miller
  2012-10-29 18:32   ` Eric Botcazou
@ 2012-10-30  8:22   ` Jakub Jelinek
  1 sibling, 0 replies; 59+ messages in thread
From: Jakub Jelinek @ 2012-10-30  8:22 UTC (permalink / raw)
  To: David Miller; +Cc: gcc, gcc-patches

On Mon, Oct 29, 2012 at 02:07:55PM -0400, David Miller wrote:
> > I'd like to close the stage 1 phase of GCC 4.8 development
> > on Monday, November 5th.  If you still have patches for new features you'd
> > like to see in GCC 4.8, please post them for review soon.  Patches
> > posted before the freeze, but reviewed shortly after the freeze, may
> > still go in; further changes should be just bugfixes and documentation
> > fixes.
> 
> I'd like to get the Sparc cbcond stuff in (3 revisions posted) which
> is waiting for Eric B. to do some Solaris specific work.

That has been posted in stage 1, so it is certainly ok to commit it even
during early stage 3.  And, on a case by case basis exceptions are always
possible.  This hasn't changed in the last few years.  By the reviewed
shortly after the freeze I just want to say that e.g. having large intrusive
patches posted now, but reviewed late December is already too late.

As for postponing end of stage 1 by a few weeks because of the storm, I'm
afraid if we want to keep roughly timely releases we don't have that luxury.
If you look at http://gcc.gnu.org/develop.html, ending stage 1 around end of
October happened already for 4.6 and 4.7, for 4.5 it was a month earlier and
for 4.4 even two months earlier.  The 4.7 bugfixing went IMHO smoothly, but
we certainly have to expect lots of bugfixing.

> I'd also like to enable LRA for at least 32-bit sparc, even if I can't
> find the time to work on auditing 64-bit completely.

I agree with Eric that it is better to enable it for the whole target
together, rather than based on some options.  Enabling LRA in early stage 3
for some targets should be ok, if it doesn't require too large and intrusive
changes to the generic code that could destabilize other targets.

	Jakub

* Re: GCC 4.8.0 Status Report (2012-10-29), Stage 1 to end soon
  2012-10-29 18:08 GCC 4.8.0 Status Report (2012-10-29), Stage 1 to end soon Jakub Jelinek
  2012-10-29 18:13 ` David Miller
@ 2012-10-29 22:14 ` Magnus Granberg
  2012-10-30  7:01 ` Gopalasubramanian, Ganesh
                   ` (9 subsequent siblings)
  11 siblings, 0 replies; 59+ messages in thread
From: Magnus Granberg @ 2012-10-29 22:14 UTC (permalink / raw)
  To: gcc-patches

On Monday, 29 October 2012 at 18:56:42, Jakub Jelinek wrote:
> Status
> ======
> 
> I'd like to close the stage 1 phase of GCC 4.8 development
> on Monday, November 5th.  If you still have patches for new features you'd
> like to see in GCC 4.8, please post them for review soon.  Patches
> posted before the freeze, but reviewed shortly after the freeze, may
> still go in; further changes should be just bugfixes and documentation
> fixes.
> 

I want to get the new configure --enable-espf options included.
The patches have been posted some time ago.

Gentoo Hardened Project
Magnus Granberg

* RE: GCC 4.8.0 Status Report (2012-10-29), Stage 1 to end soon
  2012-10-29 18:08 GCC 4.8.0 Status Report (2012-10-29), Stage 1 to end soon Jakub Jelinek
  2012-10-29 18:13 ` David Miller
  2012-10-29 22:14 ` Magnus Granberg
@ 2012-10-30  7:01 ` Gopalasubramanian, Ganesh
  2012-10-30 13:47 ` Diego Novillo
                   ` (8 subsequent siblings)
  11 siblings, 0 replies; 59+ messages in thread
From: Gopalasubramanian, Ganesh @ 2012-10-30  7:01 UTC (permalink / raw)
  To: Jakub Jelinek; +Cc: gcc-patches, gcc, Uros Bizjak (ubizjak@gmail.com)

Hi Jakub,

We are working on the following.
1. bdver3 enablement. Review completed. Changes to be incorporated and checked in.
http://gcc.gnu.org/ml/gcc-patches/2012-10/msg01131.html

2. btver2 basic enablement is done (http://gcc.gnu.org/ml/gcc-patches/2012-07/msg01018.html).
Scheduler descriptions are being updated. This is architecture-specific and we consider it not to be stage-1 material.

Regards
Ganesh

* Re: GCC 4.8.0 Status Report (2012-10-29), Stage 1 to end soon
  2012-10-29 18:08 GCC 4.8.0 Status Report (2012-10-29), Stage 1 to end soon Jakub Jelinek
                   ` (2 preceding siblings ...)
  2012-10-30  7:01 ` Gopalasubramanian, Ganesh
@ 2012-10-30 13:47 ` Diego Novillo
  2012-10-30 21:31   ` Lawrence Crowl
  2012-10-30 21:07 ` Kenneth Zadeck
                   ` (7 subsequent siblings)
  11 siblings, 1 reply; 59+ messages in thread
From: Diego Novillo @ 2012-10-30 13:47 UTC (permalink / raw)
  To: Jakub Jelinek; +Cc: gcc, gcc-patches

On Mon, Oct 29, 2012 at 1:56 PM, Jakub Jelinek <jakub@redhat.com> wrote:
> Status
> ======
>
> I'd like to close the stage 1 phase of GCC 4.8 development
> on Monday, November 5th.  If you still have patches for new features you'd
> like to see in GCC 4.8, please post them for review soon.  Patches
> posted before the freeze, but reviewed shortly after the freeze, may
> still go in; further changes should be just bugfixes and documentation
> fixes.

I will be committing the VEC overhaul soon.  With any luck this week,
but PCH and gengtype are giving me a lot of grief.


Diego.

* Re: GCC 4.8.0 Status Report (2012-10-29), Stage 1 to end soon
  2012-10-30 13:47 ` Diego Novillo
@ 2012-10-30 21:31   ` Lawrence Crowl
  0 siblings, 0 replies; 59+ messages in thread
From: Lawrence Crowl @ 2012-10-30 21:31 UTC (permalink / raw)
  To: Diego Novillo; +Cc: Jakub Jelinek, gcc, gcc-patches

On 10/30/12, Diego Novillo <dnovillo@google.com> wrote:
> On Mon, Oct 29, 2012 at 1:56 PM, Jakub Jelinek <jakub@redhat.com> wrote:
>> Status
>> ======
>>
>> I'd like to close the stage 1 phase of GCC 4.8 development
>> on Monday, November 5th.  If you still have patches for new features
>> you'd
>> like to see in GCC 4.8, please post them for review soon.  Patches
>> posted before the freeze, but reviewed shortly after the freeze, may
>> still go in; further changes should be just bugfixes and documentation
>> fixes.
>
> I will be committing the VEC overhaul soon.  With any luck this week,
> but PCH and gengtype are giving me a lot of grief.

I have three remaining bitmap patches and the recently approved
is_a/symtab/cgraph patch.

However, Alexandre Oliva <aoliva@redhat.com> has a patch for
bootstrap failure that is biting me.  I can either incorporate it
into my patches or wait for his patch and then submit.  Comments?

-- 
Lawrence Crowl

* Re: GCC 4.8.0 Status Report (2012-10-29), Stage 1 to end soon
  2012-10-29 18:08 GCC 4.8.0 Status Report (2012-10-29), Stage 1 to end soon Jakub Jelinek
                   ` (3 preceding siblings ...)
  2012-10-30 13:47 ` Diego Novillo
@ 2012-10-30 21:07 ` Kenneth Zadeck
  2012-10-31 10:00   ` Richard Biener
  2012-10-30 22:06 ` Sriraman Tallam
                   ` (6 subsequent siblings)
  11 siblings, 1 reply; 59+ messages in thread
From: Kenneth Zadeck @ 2012-10-30 21:07 UTC (permalink / raw)
  To: Jakub Jelinek; +Cc: gcc, gcc-patches

jakub,

i am hoping to get the rest of my wide integer conversion posted by nov
5.   I am under some adverse conditions here: hurricane sandy hit here
pretty badly.  my house is hooked up to a small generator, and no one
has any power for miles around.

So far richi has promised to review them.   he has sent some comments,
but so far no reviews.    Some time after i get the first round of them
posted, i will do a second round that incorporates everyone's comments.

But i would like a little slack here if possible.    While this work is
a show stopper for my private port, the patches address serious problems
for many of the public ports, especially ones that have very flexible
vector units.    I believe that there is a significant set of latent
problems currently with the existing ports that use ti mode that these
patches will fix.

However, i will do everything in my power to get the first round of the 
patches posted by nov 5 deadline.

kenny

* Re: GCC 4.8.0 Status Report (2012-10-29), Stage 1 to end soon
  2012-10-30 21:07 ` Kenneth Zadeck
@ 2012-10-31 10:00   ` Richard Biener
  2012-10-31 10:02     ` Richard Sandiford
  2012-10-31 18:34     ` GCC 4.8.0 Status Report (2012-10-29), Stage 1 to end soon Andrew Haley
  0 siblings, 2 replies; 59+ messages in thread
From: Richard Biener @ 2012-10-31 10:00 UTC (permalink / raw)
  To: Kenneth Zadeck; +Cc: Jakub Jelinek, gcc, gcc-patches

On Tue, Oct 30, 2012 at 10:05 PM, Kenneth Zadeck
<zadeck@naturalbridge.com> wrote:
> jakub,
>
> i am hoping to get the rest of my wide integer conversion posted by nov 5.
> I am under some adverse conditions here: hurricane sandy hit here pretty
> badly.  my house is hooked up to a small generator, and no one has any power
> for miles around.
>
> So far richi has promised to review them.   he has sent some comments, but
> so far no reviews.    Some time after i get the first round of them posted,
> i will do a second round that incorporates everyone's comments.
>
> But i would like a little slack here if possible.    While this work is a
> show stopper for my private port, the patches address serious problems for
> many of the public ports, especially ones that have very flexible vector
> units.    I believe that there is a significant set of latent problems
> currently with the existing ports that use ti mode that these patches will
> fix.
>
> However, i will do everything in my power to get the first round of the
> patches posted by nov 5 deadline.

I suppose you are not going to merge your private port for 4.8 and thus
the wide-int changes are not a show-stopper for you.

That said, I considered the main conversion to be appropriate to be
deferred for the next stage1.  There is no advantage in disrupting the
tree more at this stage.

Thanks,
Richard.

* Re: GCC 4.8.0 Status Report (2012-10-29), Stage 1 to end soon
  2012-10-31 10:00   ` Richard Biener
@ 2012-10-31 10:02     ` Richard Sandiford
  2012-10-31 10:13       ` Richard Biener
                         ` (2 more replies)
  2012-10-31 18:34     ` GCC 4.8.0 Status Report (2012-10-29), Stage 1 to end soon Andrew Haley
  1 sibling, 3 replies; 59+ messages in thread
From: Richard Sandiford @ 2012-10-31 10:02 UTC (permalink / raw)
  To: Richard Biener; +Cc: Kenneth Zadeck, Jakub Jelinek, gcc, gcc-patches

Richard Biener <richard.guenther@gmail.com> writes:
> On Tue, Oct 30, 2012 at 10:05 PM, Kenneth Zadeck
> <zadeck@naturalbridge.com> wrote:
>> jakub,
>>
>> i am hoping to get the rest of my wide integer conversion posted by nov 5.
>> I am under some adverse conditions here: hurricane sandy hit here pretty
>> badly.  my house is hooked up to a small generator, and no one has any power
>> for miles around.
>>
>> So far richi has promised to review them.   he has sent some comments, but
>> so far no reviews.    Some time after i get the first round of them posted,
>> i will do a second round that incorporates everyone's comments.
>>
>> But i would like a little slack here if possible.    While this work is a
>> show stopper for my private port, the patches address serious problems for
>> many of the public ports, especially ones that have very flexible vector
>> units.    I believe that there is a significant set of latent problems
>> currently with the existing ports that use ti mode that these patches will
>> fix.
>>
>> However, i will do everything in my power to get the first round of the
>> patches posted by nov 5 deadline.
>
> I suppose you are not going to merge your private port for 4.8 and thus
> the wide-int changes are not a show-stopper for you.
>
> That said, I considered the main conversion to be appropriate to be
> deferred for the next stage1.  There is no advantage in disrupting the
> tree more at this stage.

I would like the wide_int class and rtl stuff to go in 4.8 though.
IMO it's a significant improvement in its own right, and Kenny
submitted it well before the deadline.

Richard

* Re: GCC 4.8.0 Status Report (2012-10-29), Stage 1 to end soon
  2012-10-31 10:02     ` Richard Sandiford
@ 2012-10-31 10:13       ` Richard Biener
  2012-10-31 13:54       ` Kenneth Zadeck
  2013-02-27 12:39       ` patch to fix constant math - 5th patch - the main rtl work Kenneth Zadeck
  2 siblings, 0 replies; 59+ messages in thread
From: Richard Biener @ 2012-10-31 10:13 UTC (permalink / raw)
  To: Richard Biener, Kenneth Zadeck, Jakub Jelinek, gcc, gcc-patches,
	rdsandiford

On Wed, Oct 31, 2012 at 10:59 AM, Richard Sandiford
<rdsandiford@googlemail.com> wrote:
> Richard Biener <richard.guenther@gmail.com> writes:
>> On Tue, Oct 30, 2012 at 10:05 PM, Kenneth Zadeck
>> <zadeck@naturalbridge.com> wrote:
>>> jakub,
>>>
>>> i am hoping to get the rest of my wide integer conversion posted by nov 5.
>>> I am under some adverse conditions here: hurricane sandy hit here pretty
>>> badly.  my house is hooked up to a small generator, and no one has any power
>>> for miles around.
>>>
>>> So far richi has promised to review them.   he has sent some comments, but
>>> so far no reviews.    Some time after i get the first round of them posted,
>>> i will do a second round that incorporates everyone's comments.
>>>
>>> But i would like a little slack here if possible.    While this work is a
>>> show stopper for my private port, the patches address serious problems for
>>> many of the public ports, especially ones that have very flexible vector
>>> units.    I believe that there is a significant set of latent problems
>>> currently with the existing ports that use ti mode that these patches will
>>> fix.
>>>
>>> However, i will do everything in my power to get the first round of the
>>> patches posted by nov 5 deadline.
>>
>> I suppose you are not going to merge your private port for 4.8 and thus
>> the wide-int changes are not a show-stopper for you.
>>
>> That said, I considered the main conversion to be appropriate to be
>> deferred for the next stage1.  There is no advantage in disrupting the
>> tree more at this stage.
>
> I would like the wide_int class and rtl stuff to go in 4.8 though.
> IMO it's a significant improvement in its own right, and Kenny
> submitted it well before the deadline.

If it gets in as-is then we'll have to live with the IMHO broken API
(yet another one besides the existing double-int).  So _please_
shrink the API down aggressively in favor of using non-member
helper functions with more descriptive names for things that
lump together multiple operations.  Look at double-int and
use the same API ideas as people are familiar with it
(like the unsigned flag stuff) - consistency always trumps.

I'm going to be on vacation for the next three weeks so somebody else
has to pick up the review work.  But I really think that the tree
has to recover from too many changes already.

Richard.

* Re: GCC 4.8.0 Status Report (2012-10-29), Stage 1 to end soon
  2012-10-31 10:02     ` Richard Sandiford
  2012-10-31 10:13       ` Richard Biener
@ 2012-10-31 13:54       ` Kenneth Zadeck
  2012-10-31 14:05         ` Jakub Jelinek
  2012-10-31 19:13         ` Marc Glisse
  2013-02-27 12:39       ` patch to fix constant math - 5th patch - the main rtl work Kenneth Zadeck
  2 siblings, 2 replies; 59+ messages in thread
From: Kenneth Zadeck @ 2012-10-31 13:54 UTC (permalink / raw)
  To: Richard Biener, Jakub Jelinek, gcc, gcc-patches, rdsandiford

Richi,

Let me explain to you what a broken api is.   I have spent the last week
screwing around with tree-vrp and as of last night i finally got it to
work.   In tree-vrp, it is clear that double-int is the precise
definition of a broken api.

The tree-vrp pass uses an infinite-precision view of arithmetic. However,
that infinite precision is implemented on top of a finite, CARVED IN
STONE, base that is, and without a patch like this always will be, 128
bits on an x86-64.    However, as was pointed out earlier, tree-vrp
needs 2 * the size of a type + 1 bit to work correctly.    Until
yesterday i did not fully understand the significance of that 1 bit.
What this means is that tree-vrp does not work on an x86-64 with __int128
variables.
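
A minimal sketch of where an extra bit beyond twice the type size can come from. It assumes, purely for illustration and not as a description of the tree-vrp sources, that products of both signed and unsigned bounds are held in one signed container. The 8-bit operand width and the names signed_bits_needed and show_products are hypothetical choices made for this example.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>

/* Bits needed to hold v as a two's-complement signed integer,
   including the sign bit. */
int signed_bits_needed(int64_t v)
{
    int bits = 1;               /* the sign bit */
    if (v < 0)
        v = ~v;                 /* maps -1 to 0, -128 to 127, ... */
    while (v != 0) {
        bits++;
        v >>= 1;
    }
    return bits;
}

/* Print the signed width needed for the extreme products of
   8-bit operands (signed and unsigned bounds mixed). */
void show_products(void)
{
    const int64_t smin = INT8_MIN, umax = UINT8_MAX;

    /* Sanity check on the helper itself: INT8_MIN fits in 8 bits. */
    assert(signed_bits_needed(smin) == 8);

    printf("smin * smin = %lld needs %d bits\n",
           (long long)(smin * smin), signed_bits_needed(smin * smin));
    printf("umax * smin = %lld needs %d bits\n",
           (long long)(umax * smin), signed_bits_needed(umax * smin));
    printf("umax * umax = %lld needs %d bits\n",
           (long long)(umax * umax), signed_bits_needed(umax * umax));
}
```

Calling show_products() from a main function shows that the two products involving the signed minimum fit in 16 bits, while 255 * 255 = 65025 needs 17 bits as a signed value, that is 2 * 8 + 1. Under the same assumption, 128-bit operands would need 257 bits, which a 128-bit base cannot provide.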

There are no checks in tree-vrp to back off when it sees something too
large; tree-vrp simply gets the wrong answer.   To me, this is a broken
api and is GCC at its very worst.   The patches that required this 
SHOULD HAVE NEVER GONE INTO GCC.   What you have with my patches is 
someone who is willing to fix a large and complex problem that should 
have been fixed years ago.

I understand that you do not like several aspects of the wide-int api
and i am willing to make some of those improvements.   However, what i
am worried about is that you are in some ways really attached to the
style of programming where everything is dependent on the size of a
HWI.    I will continue to push back on those comments but have been
working the rest in as i have been going along.

To answer your other question, it will be a significant problem if i
cannot get these patches in.   They are very prone to patch rot and my
customer wants a product without many patches to the base code.
Also, i fear that your real reason that you want to wait is because you
really do not like the fact these patches get rid of double-int and that
style of programming, and putting off that day serves no one well.

kenny

* Re: GCC 4.8.0 Status Report (2012-10-29), Stage 1 to end soon
  2012-10-31 13:54       ` Kenneth Zadeck
@ 2012-10-31 14:05         ` Jakub Jelinek
  2012-10-31 14:06           ` Kenneth Zadeck
  2012-10-31 19:13         ` Marc Glisse
  1 sibling, 1 reply; 59+ messages in thread
From: Jakub Jelinek @ 2012-10-31 14:05 UTC (permalink / raw)
  To: Kenneth Zadeck; +Cc: Richard Biener, gcc, gcc-patches, rdsandiford

On Wed, Oct 31, 2012 at 09:44:50AM -0400, Kenneth Zadeck wrote:
> The tree-vrp pass uses an infinite-precision view of arithmetic. However,
> that infinite precision is implemented on top of a finite, CARVED IN
> STONE, base that is, and without a patch like this always will be,
> 128 bits on an x86-64.    However, as was pointed out earlier,
> tree-vrp needs 2 * the size of a type + 1 bit to work correctly.
> Until yesterday i did not fully understand the significance of that
> 1 bit.  What this means is that tree-vrp does not work on an x86-64
> with __int128 variables.

If you see a VRP bug, please file a PR with a testcase, or point to an existing
PR.  I agree with richi that it would be better to add a clean wide_int
implementation for 4.9, rather than rushing something in, introducing
lots of bugs, just for a port that hasn't been submitted.  Nor do I understand
why integer types wider than int128_t are so crucial to your port; the vector
support doesn't generally need very large integers, even if your
vector unit is 256-bit, 512-bit or larger.

	Jakub

* Re: GCC 4.8.0 Status Report (2012-10-29), Stage 1 to end soon
  2012-10-31 14:05         ` Jakub Jelinek
@ 2012-10-31 14:06           ` Kenneth Zadeck
  2012-10-31 14:31             ` Jakub Jelinek
  0 siblings, 1 reply; 59+ messages in thread
From: Kenneth Zadeck @ 2012-10-31 14:06 UTC (permalink / raw)
  To: Jakub Jelinek; +Cc: Richard Biener, gcc, gcc-patches, rdsandiford

jakub

my port has 256 bit integers.   They are done by strapping together all 
of the elements of a vector unit.
if one looks at where intel is going, they are doing exactly the same 
thing.    The difference is that they like to add the operations one at 
a time rather than just do a clean implementation like we did.   Soon 
they will get there, it is just a matter of time.

i understand the tree-vrp code well enough to say that this operation
does not work if you have timode, but i do not know how to translate
that back into C to generate a test case.    My patch to tree-vrp is
adaptable in that it looks at the types in the program and adjusts its
definition of infinite precision based on the code that it sees.  I can
point people to that code in tree-vrp and am happy to do that, but that
is not my priority now.

also, richi pointed out that there are places in the tree level constant 
propagators that require infinite precision so he is really the person 
who both should know about this and generate proper tests.

kenny

* Re: GCC 4.8.0 Status Report (2012-10-29), Stage 1 to end soon
  2012-10-31 14:06           ` Kenneth Zadeck
@ 2012-10-31 14:31             ` Jakub Jelinek
  2012-10-31 14:56               ` Kenneth Zadeck
  2012-10-31 18:42               ` Kenneth Zadeck
  0 siblings, 2 replies; 59+ messages in thread
From: Jakub Jelinek @ 2012-10-31 14:31 UTC (permalink / raw)
  To: Kenneth Zadeck; +Cc: Richard Biener, gcc, gcc-patches, rdsandiford

On Wed, Oct 31, 2012 at 10:04:58AM -0400, Kenneth Zadeck wrote:
> if one looks at where intel is going, they are doing exactly the
> same thing.    The difference is that they like to add the
> operations one at a time rather than just do a clean implementation
> like we did.   Soon they will get there, it is just a matter of
> time.

All I see on Intel is whole vector register shifts (and, as on many other
ports, and/or/xor/andn could be considered whole-register operations too).
And, even if your port has 256-bit integer arithmetic, there is no mangling
for __int256_t or similar, so I don't see how you can offer such a data type
as supported in the 4.8 timeframe.

	Jakub

^ permalink raw reply	[flat|nested] 59+ messages in thread

* Re: GCC 4.8.0 Status Report (2012-10-29), Stage 1 to end soon
  2012-10-31 14:31             ` Jakub Jelinek
@ 2012-10-31 14:56               ` Kenneth Zadeck
  2012-10-31 18:42               ` Kenneth Zadeck
  1 sibling, 0 replies; 59+ messages in thread
From: Kenneth Zadeck @ 2012-10-31 14:56 UTC (permalink / raw)
  To: Jakub Jelinek; +Cc: Richard Biener, gcc, gcc-patches, rdsandiford

I was not planning to do that mangling for 4.8.    My primary 
justification for getting it in publicly now is that there are a large 
number of places where the current compiler (at both the tree and rtl
levels) does not optimize if the value is larger than a single
hwi.    My code generalizes all of these places so that they do the 
transformations independent of the size of the hwi.   (in some cases at 
the rtl level, the transformations were only done on 32 bit or smaller 
types, but i have seen nothing like that at the tree level.)   This 
provides benefits for cross compilers and for ports that support timode 
now.

The fact that i have chosen to do it in such a way that we will never 
have this problem again is the part of the patch that richi seems to 
object to.

We have patches that do the mangling for 256 for the front ends but we 
figured that we would post those for comments.   These are likely to be 
controversial because they require extensions to the syntax to accept
large constants.

But there is no reason why the patches that fix the existing problems in 
a general way should not be considered for this release.

Kenny

On 10/31/2012 10:27 AM, Jakub Jelinek wrote:
> On Wed, Oct 31, 2012 at 10:04:58AM -0400, Kenneth Zadeck wrote:
>> if one looks at where intel is going, they are doing exactly the
>> same thing.    The difference is that they like to add the
>> operations one at a time rather than just do a clean implementation
>> like we did.   Soon they will get there, it is just a matter of
>> time.
> All I see on Intel is whole vector register shifts (and, as on many other
> ports, and/or/xor/andn could be considered whole-register operations too).
> And, even if your port has 256-bit integer arithmetic, there is no mangling
> for __int256_t or similar, so I don't see how you can offer such a data type
> as supported in the 4.8 timeframe.
>
> 	Jakub

^ permalink raw reply	[flat|nested] 59+ messages in thread

* Re: GCC 4.8.0 Status Report (2012-10-29), Stage 1 to end soon
  2012-10-31 14:31             ` Jakub Jelinek
  2012-10-31 14:56               ` Kenneth Zadeck
@ 2012-10-31 18:42               ` Kenneth Zadeck
  2012-11-01 12:44                 ` Kenneth Zadeck
  1 sibling, 1 reply; 59+ messages in thread
From: Kenneth Zadeck @ 2012-10-31 18:42 UTC (permalink / raw)
  To: Jakub Jelinek; +Cc: Richard Biener, gcc, gcc-patches, rdsandiford

[-- Attachment #1: Type: text/plain, Size: 4203 bytes --]

Jakub,

it is hard from all of the threads to actually distill what the real 
issues are here.  So let me start from a clean slate and state them simply.

Richi has three primary objections:

1) that we can do all of this with a templated version of double-int.
2) that we should not be passing in a precision and bitsize into the 
interface.
3) that the interface is too large.

I have attached a fragment of my patch #5 to illustrate the main thrust 
of my patches and to illustrate the usefulness to gcc right now.

In the current trunk, we have code that does simplification when the 
mode fits in an HWI and we have code that does the simplification if the 
mode fits in two HWIs.   If the mode does not fit in two HWIs, the code
does not do the simplification.

Thus here and in a large number of other places we have two copies of 
the code.    Richi wants there to be multiple template instantiations of 
double-int.    This means that we are now going to have to have 3 copies 
of this code to support oi mode on a 64 bit host and 4 copies on a 32 
bit host.
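
To make the duplication point concrete, here is a minimal sketch of POPCOUNT
and PARITY written once over an array of host words, so the same body serves
a value held in one, two, or four words.  It is not code from the patch: the
names limb_t, popcount_limbs and parity_limbs are invented for this
illustration, and a 64-bit word is assumed as a stand-in for a HWI.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical stand-in for a HOST_WIDE_INT-sized word; not a GCC type.
typedef std::uint64_t limb_t;

// Count the set bits in a value stored as little-endian limbs.
// The loop is identical whether the value occupies 1, 2, or 4 limbs.
unsigned popcount_limbs(const std::vector<limb_t> &limbs) {
  unsigned count = 0;
  for (std::size_t i = 0; i < limbs.size(); ++i) {
    limb_t v = limbs[i];
    while (v) {
      v &= v - 1;  // clear the lowest set bit
      ++count;
    }
  }
  return count;
}

// Parity is the low bit of the population count.
unsigned parity_limbs(const std::vector<limb_t> &limbs) {
  return popcount_limbs(limbs) & 1u;
}
```

With this shape there is no separate "fits in one word" and "fits in two
words" branch to keep in sync; a wider mode only changes the number of limbs
passed in.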

Further note that there are not as many cases for the 2*hwi in the code 
as there are for the hwi case, and in general this is true throughout
the compiler.  (CLRSB is missing from the 2hwi case in the patch)  We
really did not write twice the code when we started supporting 2 hwi, we
added about 1.5 times the code (simplify-rtx is better than most of the 
rest of the compiler).  I am using the rtl level as an example here 
because i have posted all of those patches, but the tree level is no 
better.

I do not want to write this code a third time and certainly not a fourth 
time.   Just fixing all of this is quite useful now: it fills in a lot 
of gaps in our transformations and it removes many edge case crashes 
because ti mode really is lightly tested.  However, this patch becomes 
crucial as the world gets larger.

Richi's second point is that we should be doing everything at "infinite 
precision" and not passing in an explicit bitsize and precision.   That 
works ok (sans the issues i raised with it in tree-vrp earlier) when the
largest precision on the machine fits in a couple of hwis.    However, 
for targets that have large integers or cross compilers, this becomes 
expensive.    The idea behind my set of patches is that for the 
transformations that can work this way, we do the math in the precision 
of the type or mode.   In general this means that almost all of the math 
will be done quickly, even on targets that support really big 
integers.   For passes like tree-vrp, the math will be done at some 
multiple of the largest type seen in the actual program.    The amount 
of the multiple is a function of the optimization, not the target or the 
host. Currently (on my home computer) the wide-int interface allows the 
optimization to go 4x the largest mode on the target.
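
As a rough illustration of "doing the math in the precision of the type or
mode", the sketch below adds two values modulo 2^precision and touches only
as many host words as that precision needs.  It is an assumption-laden
example, not the wide-int implementation: limb_t and add_in_precision are
invented names, words are assumed to be 64 bits, and missing high words are
treated as zero.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical stand-in for a HOST_WIDE_INT-sized word; not a GCC type.
typedef std::uint64_t limb_t;

// Add two little-endian limb arrays modulo 2^precision.
// Only the limbs needed for `precision` bits are computed, so a 32-bit
// add costs one limb of work even if wider modes exist on the target.
std::vector<limb_t> add_in_precision(const std::vector<limb_t> &a,
                                     const std::vector<limb_t> &b,
                                     unsigned precision) {
  assert(precision > 0);
  const std::size_t len = (precision + 63) / 64;  // limbs required
  std::vector<limb_t> r(len, 0);
  limb_t carry = 0;
  for (std::size_t i = 0; i < len; ++i) {
    limb_t x = i < a.size() ? a[i] : 0;  // absent limbs read as zero
    limb_t y = i < b.size() ? b[i] : 0;
    limb_t s = x + y;
    limb_t c1 = s < x;      // carry out of x + y
    r[i] = s + carry;
    limb_t c2 = r[i] < s;   // carry out of adding the incoming carry
    carry = c1 | c2;
  }
  // Discard any bits above the precision in the top limb.
  const unsigned top_bits = precision % 64;
  if (top_bits != 0)
    r[len - 1] &= (limb_t(1) << top_bits) - 1;
  return r;
}
```

An "infinite precision" version would instead have to carry every result at
the widest size the host structure allows, whatever the type of the operands.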

I can get rid of this bound at the expense of doing an alloca rather 
than stack allocating a fixed sized structure.    However, given the 
extremely heavy use of this interface, that does not seem like the best 
of tradeoffs.
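
The fixed-size alternative can be pictured roughly as follows.  This is an
illustrative sketch only: fixed_wide, make_fixed_wide, MAX_MODE_BITS and the
128-bit figure are assumptions made up for the example, not names or values
from the patch.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical stand-in for a HOST_WIDE_INT-sized word; not a GCC type.
typedef std::uint64_t limb_t;

// Assumed for illustration: the widest integer mode on the target is 128 bits.
const unsigned MAX_MODE_BITS = 128;
// Room for 4x the widest mode, matching the headroom described above.
const unsigned MAX_LIMBS = 4 * MAX_MODE_BITS / 64;

// Fixed-capacity value: lives entirely on the stack, no alloca or malloc.
struct fixed_wide {
  limb_t val[MAX_LIMBS];  // little-endian limbs; val[0..len-1] are in use
  unsigned len;           // number of limbs in use
  unsigned precision;     // width in bits of the value represented
};

// Build a zero-extended value of the given precision from one limb.
fixed_wide make_fixed_wide(limb_t low, unsigned precision) {
  assert(precision > 0 && precision <= MAX_LIMBS * 64);
  fixed_wide w;
  w.precision = precision;
  w.len = (precision + 63) / 64;
  for (unsigned i = 0; i < MAX_LIMBS; ++i)
    w.val[i] = 0;
  // Keep only the bits that fit in the requested precision.
  w.val[0] = precision < 64 ? (low & ((limb_t(1) << precision) - 1)) : low;
  return w;
}
```

The cost of this layout is the hard upper bound (MAX_LIMBS here); the benefit
is that creating a temporary is just a stack adjustment.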

The truth is that the vast majority of the compiler actually wants to 
see the math done the way that it is going to be done on the machine.  
Tree-vrp and the gimple constant prop do not.  But i have made 
accommodations to handle both needs.    I believe that the reason that 
double-int was never used at the rtl level is that it does not actually 
do the math in a way that is useful to the target.

Richi's third objection is that the interface is too large.   I 
disagree.   It was designed based on the actual usage of the 
interface.   When i found places where i was writing the same code over 
and over again, i put it in a function as part of the interface.   I 
later went back and optimized many of these because this is a very 
heavily used interface.  Richi has many other objections, but i have 
agreed to fix almost all of them, so i am not going to address them here.

It really will be a huge burden to have to carry these patches until the
next revision.  We are currently in stage 1 and i believe that the minor 
issues that richi raises can be easily addressed.

kenny

[-- Attachment #2: small.diff --]
[-- Type: text/x-patch, Size: 8180 bytes --]

@@ -1373,302 +1411,87 @@ simplify_const_unary_operation (enum rtx_code code, enum machine_mode mode,
       return CONST_DOUBLE_FROM_REAL_VALUE (d, mode);
     }

-  if (CONST_INT_P (op)
-      && width <= HOST_BITS_PER_WIDE_INT && width > 0)
+  if (CONST_SCALAR_INT_P (op) && width > 0)
     {
-      HOST_WIDE_INT arg0 = INTVAL (op);
-      HOST_WIDE_INT val;
+      wide_int result;
+      enum machine_mode imode = op_mode == VOIDmode ? mode : op_mode;
+      wide_int op0 = wide_int::from_rtx (op, imode);
+
+#if TARGET_SUPPORTS_WIDE_INT == 0
+      /* This assert keeps the simplification from producing a result
+	 that cannot be represented in a CONST_DOUBLE but a lot of
+	 upstream callers expect that this function never fails to
+	 simplify something and so if you added this to the test
+	 above the code would die later anyway.  If this assert
+	 happens, you just need to make the port support wide int.  */
+      gcc_assert (width <= HOST_BITS_PER_DOUBLE_INT); 
+#endif

       switch (code)
 	{
 	case NOT:
-	  val = ~ arg0;
+	  result = ~op0;
 	  break;

 	case NEG:
-	  val = - arg0;
+	  result = op0.neg ();
 	  break;

 	case ABS:
-	  val = (arg0 >= 0 ? arg0 : - arg0);
+	  result = op0.abs ();
 	  break;

 	case FFS:
-	  arg0 &= GET_MODE_MASK (mode);
-	  val = ffs_hwi (arg0);
+	  result = op0.ffs ();
 	  break;

 	case CLZ:
-	  arg0 &= GET_MODE_MASK (mode);
-	  if (arg0 == 0 && CLZ_DEFINED_VALUE_AT_ZERO (mode, val))
-	    ;
-	  else
-	    val = GET_MODE_PRECISION (mode) - floor_log2 (arg0) - 1;
+	  result = op0.clz (GET_MODE_BITSIZE (mode), 
+			    GET_MODE_PRECISION (mode));
 	  break;

 	case CLRSB:
-	  arg0 &= GET_MODE_MASK (mode);
-	  if (arg0 == 0)
-	    val = GET_MODE_PRECISION (mode) - 1;
-	  else if (arg0 >= 0)
-	    val = GET_MODE_PRECISION (mode) - floor_log2 (arg0) - 2;
-	  else if (arg0 < 0)
-	    val = GET_MODE_PRECISION (mode) - floor_log2 (~arg0) - 2;
+	  result = op0.clrsb (GET_MODE_BITSIZE (mode), 
+			      GET_MODE_PRECISION (mode));
 	  break;
-
+	  
 	case CTZ:
-	  arg0 &= GET_MODE_MASK (mode);
-	  if (arg0 == 0)
-	    {
-	      /* Even if the value at zero is undefined, we have to come
-		 up with some replacement.  Seems good enough.  */
-	      if (! CTZ_DEFINED_VALUE_AT_ZERO (mode, val))
-		val = GET_MODE_PRECISION (mode);
-	    }
-	  else
-	    val = ctz_hwi (arg0);
+	  result = op0.ctz (GET_MODE_BITSIZE (mode), 
+			    GET_MODE_PRECISION (mode));
 	  break;

 	case POPCOUNT:
-	  arg0 &= GET_MODE_MASK (mode);
-	  val = 0;
-	  while (arg0)
-	    val++, arg0 &= arg0 - 1;
+	  result = op0.popcount (GET_MODE_BITSIZE (mode), 
+				 GET_MODE_PRECISION (mode));
 	  break;

 	case PARITY:
-	  arg0 &= GET_MODE_MASK (mode);
-	  val = 0;
-	  while (arg0)
-	    val++, arg0 &= arg0 - 1;
-	  val &= 1;
+	  result = op0.parity (GET_MODE_BITSIZE (mode), 
+			       GET_MODE_PRECISION (mode));
 	  break;

 	case BSWAP:
-	  {
-	    unsigned int s;
-
-	    val = 0;
-	    for (s = 0; s < width; s += 8)
-	      {
-		unsigned int d = width - s - 8;
-		unsigned HOST_WIDE_INT byte;
-		byte = (arg0 >> s) & 0xff;
-		val |= byte << d;
-	      }
-	  }
+	  result = op0.bswap ();
 	  break;

 	case TRUNCATE:
-	  val = arg0;
+	  result = op0.sext (mode);
 	  break;

 	case ZERO_EXTEND:
-	  /* When zero-extending a CONST_INT, we need to know its
-             original mode.  */
-	  gcc_assert (op_mode != VOIDmode);
-	  if (op_width == HOST_BITS_PER_WIDE_INT)
-	    {
-	      /* If we were really extending the mode,
-		 we would have to distinguish between zero-extension
-		 and sign-extension.  */
-	      gcc_assert (width == op_width);
-	      val = arg0;
-	    }
-	  else if (GET_MODE_BITSIZE (op_mode) < HOST_BITS_PER_WIDE_INT)
-	    val = arg0 & GET_MODE_MASK (op_mode);
-	  else
-	    return 0;
+	  result = op0.zext (mode);
 	  break;

 	case SIGN_EXTEND:
-	  if (op_mode == VOIDmode)
-	    op_mode = mode;
-	  op_width = GET_MODE_PRECISION (op_mode);
-	  if (op_width == HOST_BITS_PER_WIDE_INT)
-	    {
-	      /* If we were really extending the mode,
-		 we would have to distinguish between zero-extension
-		 and sign-extension.  */
-	      gcc_assert (width == op_width);
-	      val = arg0;
-	    }
-	  else if (op_width < HOST_BITS_PER_WIDE_INT)
-	    {
-	      val = arg0 & GET_MODE_MASK (op_mode);
-	      if (val_signbit_known_set_p (op_mode, val))
-		val |= ~GET_MODE_MASK (op_mode);
-	    }
-	  else
-	    return 0;
+	  result = op0.sext (mode);
 	  break;

 	case SQRT:
-	case FLOAT_EXTEND:
-	case FLOAT_TRUNCATE:
-	case SS_TRUNCATE:
-	case US_TRUNCATE:
-	case SS_NEG:
-	case US_NEG:
-	case SS_ABS:
-	  return 0;
-
-	default:
-	  gcc_unreachable ();
-	}
-
-      return gen_int_mode (val, mode);
-    }
-
-  /* We can do some operations on integer CONST_DOUBLEs.  Also allow
-     for a DImode operation on a CONST_INT.  */
-  else if (width <= HOST_BITS_PER_DOUBLE_INT
-	   && (CONST_DOUBLE_AS_INT_P (op) || CONST_INT_P (op)))
-    {
-      double_int first, value;
-
-      if (CONST_DOUBLE_AS_INT_P (op))
-	first = double_int::from_pair (CONST_DOUBLE_HIGH (op),
-				       CONST_DOUBLE_LOW (op));
-      else
-	first = double_int::from_shwi (INTVAL (op));
-
-      switch (code)
-	{
-	case NOT:
-	  value = ~first;
-	  break;
-
-	case NEG:
-	  value = -first;
-	  break;
-
-	case ABS:
-	  if (first.is_negative ())
-	    value = -first;
-	  else
-	    value = first;
-	  break;
-
-	case FFS:
-	  value.high = 0;
-	  if (first.low != 0)
-	    value.low = ffs_hwi (first.low);
-	  else if (first.high != 0)
-	    value.low = HOST_BITS_PER_WIDE_INT + ffs_hwi (first.high);
-	  else
-	    value.low = 0;
-	  break;
-
-	case CLZ:
-	  value.high = 0;
-	  if (first.high != 0)
-	    value.low = GET_MODE_PRECISION (mode) - floor_log2 (first.high) - 1
-	              - HOST_BITS_PER_WIDE_INT;
-	  else if (first.low != 0)
-	    value.low = GET_MODE_PRECISION (mode) - floor_log2 (first.low) - 1;
-	  else if (! CLZ_DEFINED_VALUE_AT_ZERO (mode, value.low))
-	    value.low = GET_MODE_PRECISION (mode);
-	  break;
-
-	case CTZ:
-	  value.high = 0;
-	  if (first.low != 0)
-	    value.low = ctz_hwi (first.low);
-	  else if (first.high != 0)
-	    value.low = HOST_BITS_PER_WIDE_INT + ctz_hwi (first.high);
-	  else if (! CTZ_DEFINED_VALUE_AT_ZERO (mode, value.low))
-	    value.low = GET_MODE_PRECISION (mode);
-	  break;
-
-	case POPCOUNT:
-	  value = double_int_zero;
-	  while (first.low)
-	    {
-	      value.low++;
-	      first.low &= first.low - 1;
-	    }
-	  while (first.high)
-	    {
-	      value.low++;
-	      first.high &= first.high - 1;
-	    }
-	  break;
-
-	case PARITY:
-	  value = double_int_zero;
-	  while (first.low)
-	    {
-	      value.low++;
-	      first.low &= first.low - 1;
-	    }
-	  while (first.high)
-	    {
-	      value.low++;
-	      first.high &= first.high - 1;
-	    }
-	  value.low &= 1;
-	  break;
-
-	case BSWAP:
-	  {
-	    unsigned int s;
-
-	    value = double_int_zero;
-	    for (s = 0; s < width; s += 8)
-	      {
-		unsigned int d = width - s - 8;
-		unsigned HOST_WIDE_INT byte;
-
-		if (s < HOST_BITS_PER_WIDE_INT)
-		  byte = (first.low >> s) & 0xff;
-		else
-		  byte = (first.high >> (s - HOST_BITS_PER_WIDE_INT)) & 0xff;
-
-		if (d < HOST_BITS_PER_WIDE_INT)
-		  value.low |= byte << d;
-		else
-		  value.high |= byte << (d - HOST_BITS_PER_WIDE_INT);
-	      }
-	  }
-	  break;
-
-	case TRUNCATE:
-	  /* This is just a change-of-mode, so do nothing.  */
-	  value = first;
-	  break;
-
-	case ZERO_EXTEND:
-	  gcc_assert (op_mode != VOIDmode);
-
-	  if (op_width > HOST_BITS_PER_WIDE_INT)
-	    return 0;
-
-	  value = double_int::from_uhwi (first.low & GET_MODE_MASK (op_mode));
-	  break;
-
-	case SIGN_EXTEND:
-	  if (op_mode == VOIDmode
-	      || op_width > HOST_BITS_PER_WIDE_INT)
-	    return 0;
-	  else
-	    {
-	      value.low = first.low & GET_MODE_MASK (op_mode);
-	      if (val_signbit_known_set_p (op_mode, value.low))
-		value.low |= ~GET_MODE_MASK (op_mode);
-
-	      value.high = HWI_SIGN_EXTEND (value.low);
-	    }
-	  break;
-
-	case SQRT:
-	  return 0;
-
 	default:
 	  return 0;
 	}

-      return immed_double_int_const (value, mode);
+      return immed_wide_int_const (result, mode);
     }

^ permalink raw reply	[flat|nested] 59+ messages in thread

* Re: GCC 4.8.0 Status Report (2012-10-29), Stage 1 to end soon
  2012-10-31 18:42               ` Kenneth Zadeck
@ 2012-11-01 12:44                 ` Kenneth Zadeck
  2012-11-01 13:10                   ` Richard Sandiford
  0 siblings, 1 reply; 59+ messages in thread
From: Kenneth Zadeck @ 2012-11-01 12:44 UTC (permalink / raw)
  To: Jakub Jelinek; +Cc: Richard Biener, gcc, gcc-patches, rdsandiford

[-- Attachment #1: Type: text/plain, Size: 5655 bytes --]

richi,

I would like you to respond to at least point 1 of this email.   In it 
there is code from the rtl level that was written twice, once for the 
case when the size of the mode is less than the size of a HWI and once 
for the case where the size of the mode is less than 2 HWIs.

my patch changes this to one instance of the code that works no matter 
how large the data passed to it is.

you have made a specific requirement for wide int to be a template that 
can be instantiated in several sizes, one for 1 HWI, one for 2 HWI.   I 
would like to know how this particular fragment is to be rewritten in 
this model?   It seems that I would have to retain the structure where 
there is one version of the code for each size that the template is 
instantiated.

I would like to point out that there are about 125 places where we have 
two copies of the code for some operation.   Many of these places are 
smaller than this, but some are larger.   There are also at least 
several hundred places where the code was only written for the 1 hwi
case.   These are harder to find with simple greps.

I am very concerned about this particular aspect of your comments 
because it seems to doom us to write the same code over and over again.

kenny




On 10/31/2012 02:19 PM, Kenneth Zadeck wrote:
> Jakub,
>
> it is hard from all of the threads to actually distill what the real 
> issues are here.  So let me start from a clean slate and state them 
> simply.
>
> Richi has three primary objections:
>
> 1) that we can do all of this with a templated version of double-int.
> 2) that we should not be passing in a precision and bitsize into the 
> interface.
> 3) that the interface is too large.
>
> I have attached a fragment of my patch #5 to illustrate the main 
> thrust of my patches and to illustrate the usefulness to gcc right now.
>
> In the current trunk, we have code that does simplification when the 
> mode fits in an HWI and we have code that does the simplification if 
> the mode fits in two HWIs.   If the mode does not fit in two HWIs, the
> code does not do the simplification.
>
> Thus here and in a large number of other places we have two copies of 
> the code.    Richi wants there to be multiple template instantiations 
> of double-int.    This means that we are now going to have to have 3 
> copies of this code to support oi mode on a 64 bit host and 4 copies 
> on a 32 bit host.
>
> Further note that there are not as many cases for the 2*hwi in the 
> code as there are for the hwi case, and in general this is true
> throughout the compiler.  (CLRSB is missing from the 2hwi case in the patch)
> We really did not write twice the code when we started supporting 2
> hwi, we added about 1.5 times the code (simplify-rtx is better than 
> most of the rest of the compiler).  I am using the rtl level as an 
> example here because i have posted all of those patches, but the tree 
> level is no better.
>
> I do not want to write this code a third time and certainly not a 
> fourth time.   Just fixing all of this is quite useful now: it fills 
> in a lot of gaps in our transformations and it removes many edge case 
> crashes because ti mode really is lightly tested. However, this patch 
> becomes crucial as the world gets larger.
>
> Richi's second point is that we should be doing everything at 
> "infinite precision" and not passing in an explicit bitsize and 
> precision.   That works ok (sans the issues i raised with it in 
> tree-vrp earlier) when the largest precision on the machine fits in a
> couple of hwis.    However, for targets that have large integers or 
> cross compilers, this becomes expensive.    The idea behind my set of 
> patches is that for the transformations that can work this way, we do 
> the math in the precision of the type or mode.   In general this means 
> that almost all of the math will be done quickly, even on targets that 
> support really big integers. For passes like tree-vrp, the math will 
> be done at some multiple of the largest type seen in the actual 
> program.    The amount of the multiple is a function of the 
> optimization, not the target or the host. Currently (on my home 
> computer) the wide-int interface allows the optimization to go 4x the 
> largest mode on the target.
>
> I can get rid of this bound at the expense of doing an alloca rather 
> than stack allocating a fixed sized structure.    However, given the 
> extremely heavy use of this interface, that does not seem like the 
> best of tradeoffs.
>
> The truth is that the vast majority of the compiler actually wants to 
> see the math done the way that it is going to be done on the machine.  
> Tree-vrp and the gimple constant prop do not.  But i have made 
> accommodations to handle both needs.    I believe that the reason that 
> double-int was never used at the rtl level is that it does not 
> actually do the math in a way that is useful to the target.
>
> Richi's third objection is that the interface is too large.   I 
> disagree.   It was designed based on the actual usage of the 
> interface.   When i found places where i was writing the same code 
> over and over again, i put it in a function as part of the 
> interface.   I later went back and optimized many of these because 
> this is a very heavily used interface.  Richi has many other 
> objections, but i have agreed to fix almost all of them, so i am not 
> going to address them here.
>
> It really will be a huge burden to have to carry these patches until
> the next revision.  We are currently in stage 1 and i believe that the 
> minor issues that richi raises can be easily addressed.
>
> kenny



^ permalink raw reply	[flat|nested] 59+ messages in thread

* Re: GCC 4.8.0 Status Report (2012-10-29), Stage 1 to end soon
  2012-11-01 12:44                 ` Kenneth Zadeck
@ 2012-11-01 13:10                   ` Richard Sandiford
  2012-11-01 13:18                     ` Kenneth Zadeck
                                       ` (3 more replies)
  0 siblings, 4 replies; 59+ messages in thread
From: Richard Sandiford @ 2012-11-01 13:10 UTC (permalink / raw)
  To: Kenneth Zadeck; +Cc: Jakub Jelinek, Richard Biener, gcc, gcc-patches

Kenneth Zadeck <zadeck@naturalbridge.com> writes:
> I would like you to respond to at least point 1 of this email.   In it 
> there is code from the rtl level that was written twice, once for the 
> case when the size of the mode is less than the size of a HWI and once 
> for the case where the size of the mode is less than 2 HWIs.
>
> my patch changes this to one instance of the code that works no matter 
> how large the data passed to it is.
>
> you have made a specific requirement for wide int to be a template that 
> can be instantiated in several sizes, one for 1 HWI, one for 2 HWI.   I 
> would like to know how this particular fragment is to be rewritten in 
> this model?   It seems that I would have to retain the structure where 
> there is one version of the code for each size that the template is 
> instantiated.

I think richi's argument was that wide_int should be split into two.
There should be a "bare-metal" class that just has a length and HWIs,
and the main wide_int class should be an extension on top of that
that does things to a bit precision instead.  Presumably with some
template magic so that the length (number of HWIs) is a constant for:

  typedef foo<2> double_int;

and a variable for wide_int (because in wide_int the length would be
the number of significant HWIs rather than the size of the underlying
array).  wide_int would also record the precision and apply it after
the full HWI operation.

So the wide_int class would still provide "as wide as we need" arithmetic,
as in your rtl patch.  I don't think he was objecting to that.
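
As a rough illustration of the split described above (a sketch only,
assuming a 64-bit HWI; the names hwi, foo, wide_int_sketch, sext_hwi and
add_one_hwi are hypothetical and are not taken from any posted patch):

```cpp
#include <cassert>
#include <cstdint>

typedef int64_t hwi;   // stand-in for HOST_WIDE_INT, assumed 64 bits wide

// "Bare-metal" layer: a block of HWIs whose length is a compile-time
// constant.  Hypothetical name, mirroring the foo<2> typedef above.
template <unsigned N>
struct foo
{
  hwi val[N];
  unsigned length () const { return N; }
};

typedef foo<2> double_int_sketch;   // always exactly two HWIs

// Precision-aware layer: the length is a run-time count of significant
// HWIs (at most max_len) and the precision is recorded alongside it.
struct wide_int_sketch
{
  static const unsigned max_len = 4;   // arbitrary bound for this sketch
  hwi val[max_len];
  unsigned len;
  unsigned precision;
  unsigned length () const { return len; }
};

// Apply a precision after a full-HWI operation: keep the low PRECISION
// bits of X and sign-extend them back to a full HWI.
inline hwi
sext_hwi (hwi x, unsigned precision)
{
  assert (precision >= 1 && precision <= 64);
  if (precision == 64)
    return x;
  uint64_t mask = (uint64_t (1) << precision) - 1;
  uint64_t sign = uint64_t (1) << (precision - 1);
  uint64_t low = uint64_t (x) & mask;
  return hwi ((low ^ sign) - sign);
}

// Add two single-HWI values at full HWI width, then apply the precision.
inline wide_int_sketch
add_one_hwi (const wide_int_sketch &a, const wide_int_sketch &b)
{
  assert (a.len == 1 && b.len == 1 && a.precision == b.precision);
  wide_int_sketch r = wide_int_sketch ();
  r.len = 1;
  r.precision = a.precision;
  r.val[0] = sext_hwi (hwi (uint64_t (a.val[0]) + uint64_t (b.val[0])),
                       a.precision);
  return r;
}
```

With an 8-bit precision, adding 127 and 1 this way wraps to -128: the
addition itself is done on whole HWIs and the precision is only applied
afterwards.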

As is probably obvious, I don't agree FWIW.  It seems like an unnecessary
complication without any clear use.  Especially since the number of
significant HWIs in a wide_int isn't always going to be the same for
both operands to a binary operation, and it's not clear to me whether
that should be handled in the base class or wide_int.

Richard

^ permalink raw reply	[flat|nested] 59+ messages in thread

* Re: GCC 4.8.0 Status Report (2012-10-29), Stage 1 to end soon
  2012-11-01 13:10                   ` Richard Sandiford
@ 2012-11-01 13:18                     ` Kenneth Zadeck
  2012-11-01 13:24                     ` Kenneth Zadeck
                                       ` (2 subsequent siblings)
  3 siblings, 0 replies; 59+ messages in thread
From: Kenneth Zadeck @ 2012-11-01 13:18 UTC (permalink / raw)
  To: Jakub Jelinek, Richard Biener, gcc, gcc-patches, rdsandiford


On 11/01/2012 09:10 AM, Richard Sandiford wrote:
> Kenneth Zadeck <zadeck@naturalbridge.com> writes:
>> I would like you to respond to at least point 1 of this email.   In it
>> there is code from the rtl level that was written twice, once for the
>> case when the size of the mode is less than the size of a HWI and once
>> for the case where the size of the mode is less than 2 HWIs.
>>
>> my patch changes this to one instance of the code that works no matter
>> how large the data passed to it is.
>>
>> you have made a specific requirement for wide int to be a template that
>> can be instantiated in several sizes, one for 1 HWI, one for 2 HWI.   I
>> would like to know how this particular fragment is to be rewritten in
>> this model?   It seems that I would have to retain the structure where
>> there is one version of the code for each size that the template is
>> instantiated.
> I think richi's argument was that wide_int should be split into two.
> There should be a "bare-metal" class that just has a length and HWIs,
> and the main wide_int class should be an extension on top of that
> that does things to a bit precision instead.  Presumably with some
> template magic so that the length (number of HWIs) is a constant for:
>
>    typedef foo<2> double_int;
>
> and a variable for wide_int (because in wide_int the length would be
> the number of significant HWIs rather than the size of the underlying
> array).  wide_int would also record the precision and apply it after
> the full HWI operation.
>
> So the wide_int class would still provide "as wide as we need" arithmetic,
> as in your rtl patch.  I don't think he was objecting to that.
>
> As is probably obvious, I don't agree FWIW.  It seems like an unnecessary
> complication without any clear use.  Especially since the number of
> significant HWIs in a wide_int isn't always going to be the same for
> both operands to a binary operation, and it's not clear to me whether
> that should be handled in the base class or wide_int.
>
> Richard
All of this surprises me somewhat.  I thought that I was doing a great
thing by looking at the specific port being built to determine how to
size these data structures.  You would think from the response I am
getting that I had murdered someone.

Do you think that when he gets around to reading the patch for
simplify-rtx.c, he is going to object to this fragment?
@@ -5179,13 +4815,11 @@ static rtx
  simplify_immed_subreg (enum machine_mode outermode, rtx op,
                 enum machine_mode innermode, unsigned int byte)
  {
-  /* We support up to 512-bit values (for V8DFmode).  */
    enum {
-    max_bitsize = 512,
      value_bit = 8,
      value_mask = (1 << value_bit) - 1
    };
-  unsigned char value[max_bitsize / value_bit];
+  unsigned char value [MAX_BITSIZE_MODE_ANY_MODE/value_bit];
    int value_start;
    int i;
    int elem;




^ permalink raw reply	[flat|nested] 59+ messages in thread

* Re: GCC 4.8.0 Status Report (2012-10-29), Stage 1 to end soon
  2012-11-01 13:10                   ` Richard Sandiford
  2012-11-01 13:18                     ` Kenneth Zadeck
@ 2012-11-01 13:24                     ` Kenneth Zadeck
  2012-11-01 15:16                     ` Richard Sandiford
  2012-11-04 16:54                     ` Richard Biener
  3 siblings, 0 replies; 59+ messages in thread
From: Kenneth Zadeck @ 2012-11-01 13:24 UTC (permalink / raw)
  To: Jakub Jelinek, Richard Biener, gcc, gcc-patches, rdsandiford

Anyway, Richard, that does not answer the question of what you are going
to do with a typedef foo<2>.

The point of all of this work was to leave no traces of the host in the
way the compiler works.  Instantiating a specific size of double-int is
not going to get you there.

kenny

On 11/01/2012 09:10 AM, Richard Sandiford wrote:
> Kenneth Zadeck <zadeck@naturalbridge.com> writes:
>> I would like you to respond to at least point 1 of this email.   In it
>> there is code from the rtl level that was written twice, once for the
>> case when the size of the mode is less than the size of a HWI and once
>> for the case where the size of the mode is less than 2 HWIs.
>>
>> my patch changes this to one instance of the code that works no matter
>> how large the data passed to it is.
>>
>> you have made a specific requirement for wide int to be a template that
>> can be instantiated in several sizes, one for 1 HWI, one for 2 HWI.   I
>> would like to know how this particular fragment is to be rewritten in
>> this model?   It seems that I would have to retain the structure where
>> there is one version of the code for each size that the template is
>> instantiated.
> I think richi's argument was that wide_int should be split into two.
> There should be a "bare-metal" class that just has a length and HWIs,
> and the main wide_int class should be an extension on top of that
> that does things to a bit precision instead.  Presumably with some
> template magic so that the length (number of HWIs) is a constant for:
>
>    typedef foo<2> double_int;
>
> and a variable for wide_int (because in wide_int the length would be
> the number of significant HWIs rather than the size of the underlying
> array).  wide_int would also record the precision and apply it after
> the full HWI operation.
>
> So the wide_int class would still provide "as wide as we need" arithmetic,
> as in your rtl patch.  I don't think he was objecting to that.
>
> As is probably obvious, I don't agree FWIW.  It seems like an unnecessary
> complication without any clear use.  Especially since the number of
> significant HWIs in a wide_int isn't always going to be the same for
> both operands to a binary operation, and it's not clear to me whether
> that should be handled in the base class or wide_int.
>
> Richard

^ permalink raw reply	[flat|nested] 59+ messages in thread

* Re: GCC 4.8.0 Status Report (2012-10-29), Stage 1 to end soon
  2012-11-01 13:10                   ` Richard Sandiford
  2012-11-01 13:18                     ` Kenneth Zadeck
  2012-11-01 13:24                     ` Kenneth Zadeck
@ 2012-11-01 15:16                     ` Richard Sandiford
  2012-11-04 16:54                     ` Richard Biener
  3 siblings, 0 replies; 59+ messages in thread
From: Richard Sandiford @ 2012-11-01 15:16 UTC (permalink / raw)
  To: Kenneth Zadeck; +Cc: Jakub Jelinek, Richard Biener, gcc, gcc-patches

Richard Sandiford <rdsandiford@googlemail.com> writes:
> As is probably obvious, I don't agree FWIW.  It seems like an unnecessary
> complication without any clear use.  Especially since the number of
> significant HWIs in a wide_int isn't always going to be the same for
> both operands to a binary operation, and it's not clear to me whether
> that should be handled in the base class or wide_int.

...and the number of HWIs in the result might be different again.
Whether that's true depends on the value as well as the (HWI) size
of the operands.
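
A small sketch of this point, ignoring precision and assuming 64-bit
HWIs stored least-significant first in a sign-compressed form.  The
names hwi, elt, canonize and add_compressed are hypothetical and the
representation is an assumption, not the one from the patch:

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

typedef int64_t hwi;   // stand-in for HOST_WIDE_INT, assumed 64 bits wide

// Element I of a sign-compressed number; elements past the end are the
// sign extension of the top stored element.
hwi
elt (const std::vector<hwi> &v, std::size_t i)
{
  assert (!v.empty ());
  if (i < v.size ())
    return v[i];
  return v.back () < 0 ? -1 : 0;
}

// Drop top elements that merely repeat the sign of the element below.
void
canonize (std::vector<hwi> &v)
{
  while (v.size () > 1)
    {
      hwi ext = v[v.size () - 2] < 0 ? -1 : 0;
      if (v.back () != ext)
        break;
      v.pop_back ();
    }
}

// Add two sign-compressed numbers.  One extra element is computed so
// that the sum always fits; canonize then removes it when it is not
// needed, so the result length depends on the values involved.
std::vector<hwi>
add_compressed (const std::vector<hwi> &a, const std::vector<hwi> &b)
{
  std::size_t n = std::max (a.size (), b.size ()) + 1;
  std::vector<hwi> r (n);
  uint64_t carry = 0;
  for (std::size_t i = 0; i < n; ++i)
    {
      uint64_t x = uint64_t (elt (a, i));
      uint64_t y = uint64_t (elt (b, i));
      uint64_t s = x + y;
      uint64_t s2 = s + carry;
      r[i] = hwi (s2);
      carry = (s < x || s2 < s) ? 1 : 0;
    }
  canonize (r);
  return r;
}
```

Here 1 + 1 stays at one HWI, while INT64_MAX + 1 needs two, although
both additions take single-HWI operands.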

Richard

^ permalink raw reply	[flat|nested] 59+ messages in thread

* Re: GCC 4.8.0 Status Report (2012-10-29), Stage 1 to end soon
  2012-11-01 13:10                   ` Richard Sandiford
                                       ` (2 preceding siblings ...)
  2012-11-01 15:16                     ` Richard Sandiford
@ 2012-11-04 16:54                     ` Richard Biener
  2012-11-05 13:59                       ` Kenneth Zadeck
  3 siblings, 1 reply; 59+ messages in thread
From: Richard Biener @ 2012-11-04 16:54 UTC (permalink / raw)
  To: Kenneth Zadeck, Jakub Jelinek, Richard Biener, gcc, gcc-patches,
	rdsandiford

On Thu, Nov 1, 2012 at 2:10 PM, Richard Sandiford
<rdsandiford@googlemail.com> wrote:
> Kenneth Zadeck <zadeck@naturalbridge.com> writes:
>> I would like you to respond to at least point 1 of this email.   In it
>> there is code from the rtl level that was written twice, once for the
>> case when the size of the mode is less than the size of a HWI and once
>> for the case where the size of the mode is less than 2 HWIs.
>>
>> my patch changes this to one instance of the code that works no matter
>> how large the data passed to it is.
>>
>> you have made a specific requirement for wide int to be a template that
>> can be instantiated in several sizes, one for 1 HWI, one for 2 HWI.   I
>> would like to know how this particular fragment is to be rewritten in
>> this model?   It seems that I would have to retain the structure where
>> there is one version of the code for each size that the template is
>> instantiated.
>
> I think richi's argument was that wide_int should be split into two.
> There should be a "bare-metal" class that just has a length and HWIs,
> and the main wide_int class should be an extension on top of that
> that does things to a bit precision instead.  Presumably with some
> template magic so that the length (number of HWIs) is a constant for:
>
>   typedef foo<2> double_int;
>
> and a variable for wide_int (because in wide_int the length would be
> the number of significant HWIs rather than the size of the underlying
> array).  wide_int would also record the precision and apply it after
> the full HWI operation.
>
> So the wide_int class would still provide "as wide as we need" arithmetic,
> as in your rtl patch.  I don't think he was objecting to that.

That summarizes one part of my complaints / suggestions correctly.  In other
mails I suggested to not make it a template but a constant over object lifetime
'bitsize' (or maxlen) field.  Both suggestions likely require more thought than
I put into them.  The main reason is that with C++ you can abstract from where
wide-int information pieces are stored and thus use the arithmetic / operation
workers without copying the (source) "wide-int" objects.  Thus you should
be able to write adaptors for double-int storage, tree or RTX storage.

> As is probably obvious, I don't agree FWIW.  It seems like an unnecessary
> complication without any clear use.  Especially since the number of

Maybe the double_int typedef is without any clear use.  Properly
abstracting from the storage / information providers will save
compile-time, memory and code though.  I don't see that any thought
was spent on how to avoid excessive copying or dealing with
long(er)-lived objects and their storage needs.

> significant HWIs in a wide_int isn't always going to be the same for
> both operands to a binary operation, and it's not clear to me whether
> that should be handled in the base class or wide_int.

It certainly depends.

Richard.

> Richard

^ permalink raw reply	[flat|nested] 59+ messages in thread

* Re: GCC 4.8.0 Status Report (2012-10-29), Stage 1 to end soon
  2012-11-04 16:54                     ` Richard Biener
@ 2012-11-05 13:59                       ` Kenneth Zadeck
  2012-11-05 17:00                         ` Kenneth Zadeck
  2012-11-26 15:03                         ` Richard Biener
  0 siblings, 2 replies; 59+ messages in thread
From: Kenneth Zadeck @ 2012-11-05 13:59 UTC (permalink / raw)
  To: Richard Biener; +Cc: Jakub Jelinek, gcc, gcc-patches, rdsandiford

On 11/04/2012 11:54 AM, Richard Biener wrote:
> On Thu, Nov 1, 2012 at 2:10 PM, Richard Sandiford
> <rdsandiford@googlemail.com> wrote:
>> Kenneth Zadeck <zadeck@naturalbridge.com> writes:
>>> I would like you to respond to at least point 1 of this email.   In it
>>> there is code from the rtl level that was written twice, once for the
>>> case when the size of the mode is less than the size of a HWI and once
>>> for the case where the size of the mode is less than 2 HWIs.
>>>
>>> my patch changes this to one instance of the code that works no matter
>>> how large the data passed to it is.
>>>
>>> you have made a specific requirement for wide int to be a template that
>>> can be instantiated in several sizes, one for 1 HWI, one for 2 HWI.   I
>>> would like to know how this particular fragment is to be rewritten in
>>> this model?   It seems that I would have to retain the structure where
>>> there is one version of the code for each size that the template is
>>> instantiated.
>> I think richi's argument was that wide_int should be split into two.
>> There should be a "bare-metal" class that just has a length and HWIs,
>> and the main wide_int class should be an extension on top of that
>> that does things to a bit precision instead.  Presumably with some
>> template magic so that the length (number of HWIs) is a constant for:
>>
>>    typedef foo<2> double_int;
>>
>> and a variable for wide_int (because in wide_int the length would be
>> the number of significant HWIs rather than the size of the underlying
>> array).  wide_int would also record the precision and apply it after
>> the full HWI operation.
>>
>> So the wide_int class would still provide "as wide as we need" arithmetic,
>> as in your rtl patch.  I don't think he was objecting to that.
> That summarizes one part of my complaints / suggestions correctly.  In other
> mails I suggested to not make it a template but a constant over object lifetime
> 'bitsize' (or maxlen) field.  Both suggestions likely require more thought than
> I put into them.  The main reason is that with C++ you can abstract from where
> wide-int information pieces are stored and thus use the arithmetic / operation
> workers without copying the (source) "wide-int" objects.  Thus you should
> be able to write adaptors for double-int storage, tree or RTX storage.
We considered something along these lines and rejected it.  I am not
really opposed to doing something like this, but it is not an obviously
winning idea and is likely not a good one.  Here was our thought
process:

If you abstract away the storage inside a wide int, then you should be
able to copy a pointer to the block of data from either the rtl level
integer constant or the tree level one into the wide int.  It is
certainly true that making a wide_int from one of these is an extremely
common operation and doing this would avoid those copies.

However, this causes two problems:
1)  Mike's first cut at the CONST_WIDE_INT did two ggc allocations to
make the object.  It created the base object and then it allocated the
array.  Richard S noticed that we could just allocate one CONST_WIDE_INT
that had the array in it.  Doing it this way saves one ggc allocation
and one indirection when accessing the data within the CONST_WIDE_INT.
Our plan is to use the same trick at the tree level.  So to avoid the
copying, you seem to need a more expensive representation for
CONST_WIDE_INT and INT_CST.

2) You are now stuck either ggcing the storage inside a wide_int when
one is created as part of an expression, or you have to play some game
to represent the two different storage plans inside of wide_int.
Clearly this is where you think that we should be going by suggesting
that we abstract away the internal storage.  However, this comes at a
price: what is currently an array access in my patches would (I
believe) become a function call.  From a performance point of view, I
believe that this is a non-starter.  If you can figure out how to design
this so that it is not a function call, I would consider this a viable
option.
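
Point 1 above, the difference between the two layouts, can be sketched
roughly as follows.  This is only an illustration: malloc stands in for
ggc allocation, and hwi, counted_alloc, two_alloc_const and
one_alloc_const are hypothetical names, not GCC's actual data
structures.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstdlib>

typedef int64_t hwi;              // stand-in for HOST_WIDE_INT

static unsigned n_allocs = 0;     // counts allocations, for illustration

// malloc wrapper standing in for a ggc allocation.
static void *
counted_alloc (std::size_t size)
{
  void *p = std::malloc (size);
  assert (p != 0);
  ++n_allocs;
  return p;
}

// Layout 1: header plus a separately allocated array.  Two allocations,
// and one extra pointer dereference to reach the data.
struct two_alloc_const
{
  unsigned len;
  hwi *elem;
};

two_alloc_const *
make_two_alloc (const hwi *src, unsigned len)
{
  two_alloc_const *c
    = static_cast<two_alloc_const *> (counted_alloc (sizeof (two_alloc_const)));
  c->len = len;
  c->elem = static_cast<hwi *> (counted_alloc (len * sizeof (hwi)));
  for (unsigned i = 0; i < len; ++i)
    c->elem[i] = src[i];
  return c;
}

// Layout 2: the array lives directly behind the header, inside the same
// allocation.  alignas keeps the trailing HWIs suitably aligned.
struct alignas (hwi) one_alloc_const
{
  unsigned len;
  hwi *elems () { return reinterpret_cast<hwi *> (this + 1); }
};

one_alloc_const *
make_one_alloc (const hwi *src, unsigned len)
{
  std::size_t size = sizeof (one_alloc_const) + len * sizeof (hwi);
  one_alloc_const *c = static_cast<one_alloc_const *> (counted_alloc (size));
  c->len = len;
  for (unsigned i = 0; i < len; ++i)
    c->elems ()[i] = src[i];
  return c;
}
```

The second layout reaches its data by address arithmetic on the object
itself, which is why it cannot also serve as "a pointer to storage that
lives somewhere else" without becoming the first layout again.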

On the other hand, you are clearly correct that we copy the data when we
make wide ints from INT_CSTs or CONST_WIDE_INTs.  But this is why we
represent the data inside the wide_ints, the INT_CSTs and the
CONST_WIDE_INTs in a compressed form.  Even with very big types, which
are generally rare, the constants themselves are very small.  So the
copy operation is a loop that almost always copies one element, even
with tree-vrp, which doubles the size of every type.
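
A minimal sketch of why such a copy is usually a single element,
assuming the compression simply drops high HWIs that only repeat the
sign of the element below them.  That scheme, and the names hwi,
compressed_len and copy_compressed, are assumptions made for
illustration; the exact form used by the patches is not shown here.

```cpp
#include <cassert>
#include <cstdint>

typedef int64_t hwi;   // stand-in for HOST_WIDE_INT, assumed 64 bits wide

// Number of HWIs that remain once redundant sign-extension elements at
// the top of VAL (least-significant element first) are dropped.
unsigned
compressed_len (const hwi *val, unsigned len)
{
  assert (len >= 1);
  while (len > 1)
    {
      hwi ext = val[len - 2] < 0 ? -1 : 0;
      if (val[len - 1] != ext)
        break;
      --len;
    }
  return len;
}

// Copy only the significant elements of SRC into DST and return how
// many were copied.
unsigned
copy_compressed (hwi *dst, const hwi *src, unsigned len)
{
  unsigned n = compressed_len (src, len);
  for (unsigned i = 0; i < n; ++i)
    dst[i] = src[i];
  return n;
}
```

For a 256-bit type holding the value 5 or -1, the loop copies one
element; only a value such as 2^63, whose top bit would otherwise be
read as a sign, needs a second one.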

There is a third option, which is that the storage inside the wide int
is just ggced storage.  We rejected this because of the functional
nature of wide-ints: zillions are created, they can be stack allocated,
and they last for very short periods of time.

>> As is probably obvious, I don't agree FWIW.  It seems like an unnecessary
>> complication without any clear use.  Especially since the number of
> Maybe the double_int typedef is without any clear use.  Properly
> abstracting from the storage / information providers will save
> compile-time, memory and code though.  I don't see that any thought
> was spent on how to avoid excessive copying or dealing with
> long(er)-lived objects and their storage needs.
I actually disagree.  Wide ints can use a bloated amount of storage
because they are designed to be very short-lived and very low-cost
objects that are stack allocated.  For long term storage, there is
INT_CST at the tree level and CONST_WIDE_INT at the rtl level.  Those
use a very compact storage model.  The copying entailed is only a small
part of the overall performance.

Everything that you are suggesting along these lines adds to the weight
of a wide-int object.  You have to understand that there will be many
more wide-ints created in a normal compilation than were ever created
with double-int.  This is because the rtl level had no object like this
at all, and at the tree level, many of the places that should have used
double int took a shortcut and only did the transformations if the types
fit in a HWI.

This is why we are extremely defensive about this issue.   We really did 
think a lot about it.

Kenny

>> significant HWIs in a wide_int isn't always going to be the same for
>> both operands to a binary operation, and it's not clear to me whether
>> that should be handled in the base class or wide_int.
> It certainly depends.
>
> Richard.
>
>> Richard

^ permalink raw reply	[flat|nested] 59+ messages in thread

* Re: GCC 4.8.0 Status Report (2012-10-29), Stage 1 to end soon
  2012-11-05 13:59                       ` Kenneth Zadeck
@ 2012-11-05 17:00                         ` Kenneth Zadeck
  2012-11-26 15:03                         ` Richard Biener
  1 sibling, 0 replies; 59+ messages in thread
From: Kenneth Zadeck @ 2012-11-05 17:00 UTC (permalink / raw)
  To: Richard Biener; +Cc: Jakub Jelinek, gcc, gcc-patches, rdsandiford

Jakub and Richi,

At this point I have decided that I am not going to get the rest of
the wide-int patches into a stable enough form for this round.  The
combination of still living without power at my house and some issues
that I hit with the front ends has made it impossible to get this
finished by today's deadline.

I do want patches 1-7 to go in (after proper review) but I am going to
withdraw patch 8 for this round.

Patches 1-5 deal with the rtl level.  These have been extensively
tested and, with the exception of patch 4, "examined" by Richard
Sandiford.  They clean up a lot of things at the rtl level that affect
every port, as well as fixing some outstanding regressions.

Patches 6 and 7 are general cleanups at the tree level and can be
justified on their own, without any regard to wide-int.  They have
also been extensively tested.

I am withdrawing patch 8 because it converted tree-vrp to use wide-ints,
but the benefit of this patch really cannot be seen without the rest of
the tree level wide-int patches.

In the next couple of days I will resubmit patches 1-7 with the patch
rot removed and the public comments folded into them.

Kenny

^ permalink raw reply	[flat|nested] 59+ messages in thread

* Re: GCC 4.8.0 Status Report (2012-10-29), Stage 1 to end soon
  2012-11-05 13:59                       ` Kenneth Zadeck
  2012-11-05 17:00                         ` Kenneth Zadeck
@ 2012-11-26 15:03                         ` Richard Biener
  2012-11-26 16:03                           ` Kenneth Zadeck
  1 sibling, 1 reply; 59+ messages in thread
From: Richard Biener @ 2012-11-26 15:03 UTC (permalink / raw)
  To: Kenneth Zadeck; +Cc: Jakub Jelinek, gcc, gcc-patches, rdsandiford

On Mon, Nov 5, 2012 at 2:59 PM, Kenneth Zadeck <zadeck@naturalbridge.com> wrote:
>
> On 11/04/2012 11:54 AM, Richard Biener wrote:
>>
>> On Thu, Nov 1, 2012 at 2:10 PM, Richard Sandiford
>> <rdsandiford@googlemail.com> wrote:
>>>
>>> Kenneth Zadeck <zadeck@naturalbridge.com> writes:
>>>>
>>>> I would like you to respond to at least point 1 of this email.   In it
>>>> there is code from the rtl level that was written twice, once for the
>>>> case when the size of the mode is less than the size of a HWI and once
>>>> for the case where the size of the mode is less than 2 HWIs.
>>>>
>>>> my patch changes this to one instance of the code that works no matter
>>>> how large the data passed to it is.
>>>>
>>>> you have made a specific requirement for wide int to be a template that
>>>> can be instantiated in several sizes, one for 1 HWI, one for 2 HWI.   I
>>>> would like to know how this particular fragment is to be rewritten in
>>>> this model?   It seems that I would have to retain the structure where
>>>> there is one version of the code for each size that the template is
>>>> instantiated.
>>>
>>> I think richi's argument was that wide_int should be split into two.
>>> There should be a "bare-metal" class that just has a length and HWIs,
>>> and the main wide_int class should be an extension on top of that
>>> that does things to a bit precision instead.  Presumably with some
>>> template magic so that the length (number of HWIs) is a constant for:
>>>
>>>    typedef foo<2> double_int;
>>>
>>> and a variable for wide_int (because in wide_int the length would be
>>> the number of significant HWIs rather than the size of the underlying
>>> array).  wide_int would also record the precision and apply it after
>>> the full HWI operation.
>>>
>>> So the wide_int class would still provide "as wide as we need"
>>> arithmetic,
>>> as in your rtl patch.  I don't think he was objecting to that.
>>
>> That summarizes one part of my complaints / suggestions correctly.  In
>> other
>> mails I suggested to not make it a template but a constant over object
>> lifetime
>> 'bitsize' (or maxlen) field.  Both suggestions likely require more thought
>> than
>> I put into them.  The main reason is that with C++ you can abstract from
>> where
>> wide-int information pieces are stored and thus use the arithmetic /
>> operation
>> workers without copying the (source) "wide-int" objects.  Thus you should
>> be able to write adaptors for double-int storage, tree or RTX storage.
>
> We had considered something along these lines and rejected it.   I am not
> really opposed to doing something like this, but it is not an obvious
> winning idea and is likely not to be a good idea.   Here was our thought
> process:
>
> if you abstract away the storage inside a wide int, then you should be able
> to copy a pointer to the block of data from either the rtl level integer
> constant or the tree level one into the wide int.   It is certainly true
> that making a wide_int from one of these is an extremely common operation
> and doing this would avoid those copies.
>
> However, this causes two problems:
> 1)  Mike's first cut at the CONST_WIDE_INT did two ggc allocations to make
> the object.   it created the base object and then it allocated the array.
> Richard S noticed that we could just allocate one CONST_WIDE_INT that had
> the array in it.   Doing it this way saves one ggc allocation and one
> indirection when accessing the data within the CONST_WIDE_INT.   Our plan is
> to use the same trick at the tree level.   So to avoid the copying, you seem
> to have to have a more expensive rep for CONST_WIDE_INT and INT_CST.

I did not propose having a pointer to the data in the RTX or tree int.  Just
the short-lived wide-ints (which are on the stack) would have a pointer to
the data - which can then obviously point into the RTX and tree data.

> 2) You are now stuck either ggcing the storage inside a wide_int when they
> are created as part of an expression or you have to play some game to
> represent the two different storage plans inside of wide_int.

Hm?  wide-ints are short-lived and thus never live across a garbage collection
point.  We create non-GCed objects pointing to GCed objects all the time
and everywhere this way.

>   Clearly this
> is where you think that we should be going by suggesting that we abstract
> away the internal storage.   However, this comes at a price:   what is
> currently an array access in my patches would (i believe) become a function
> call.

No, the workers (that perform the array accesses) will simply get
a pointer to the first data element.  Then whether it's embedded or
external is of no interest to them.
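
A sketch of what such a worker interface could look like.  The names
hwi, long_lived_const, wide_int_ref, ref_from_const, ref_from_array and
eq_p are hypothetical, and long_lived_const merely stands in for the
data held by an INT_CST or CONST_WIDE_INT; nothing here is taken from
the patches.

```cpp
#include <cassert>
#include <cstdint>

typedef int64_t hwi;   // stand-in for HOST_WIDE_INT

// Stand-in for a long-lived constant with its HWIs embedded.
struct long_lived_const
{
  unsigned len;
  hwi elem[2];
};

// Short-lived view: a pointer to the first element plus a length.
// It owns no storage and is cheap to pass by value.
struct wide_int_ref
{
  const hwi *val;
  unsigned len;
};

// View onto the storage of a long-lived constant; nothing is copied.
inline wide_int_ref
ref_from_const (const long_lived_const &c)
{
  wide_int_ref r = { c.elem, c.len };
  return r;
}

// View onto a temporary array, for example one on the stack.
inline wide_int_ref
ref_from_array (const hwi *val, unsigned len)
{
  wide_int_ref r = { val, len };
  return r;
}

// Worker: sees only pointers and lengths, so it neither knows nor cares
// whether the elements are embedded or external.  Assumes both inputs
// use the same canonical length.
inline bool
eq_p (wide_int_ref a, wide_int_ref b)
{
  if (a.len != b.len)
    return false;
  for (unsigned i = 0; i < a.len; ++i)
    if (a.val[i] != b.val[i])
      return false;
  return true;
}
```

The element accesses inside eq_p remain plain array accesses through a
pointer; the only per-object cost in this sketch is setting up the
two-word view.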

>  From a performance point of view, i believe that this is a non
> starter. If you can figure out how to design this so that it is not a
> function call, i would consider this a viable option.
>
> On the other side of this you are clearly correct that we are copying the
> data when we are making wide ints from INT_CSTs or CONST_WIDE_INTs.    But
> this is why we represent data inside of the wide_ints, the INT_CSTs and the
> CONST_WIDE_INTs in a compressed form.   Even with very big types, which are
> generally rare, the constants themselves are very small.   So the copy
> operation is a loop that almost always copies one element, even with
> tree-vrp which doubles the sizes of every type.
>
> There is the third option which is that the storage inside the wide int is
> just ggced storage.  We rejected this because of the functional nature of
> wide-ints.    There are zillions created, they can be stack allocated, and
> they last for very short periods of time.

Of course - GCing wide-ints is a non-starter.

>
>>> As is probably obvious, I don't agree FWIW.  It seems like an unnecessary
>>> complication without any clear use.  Especially since the number of
>>
>> Maybe the double_int typedef is without any clear use.  Properly
>> abstracting from the storage / information providers will save
>> compile-time, memory and code though.  I don't see that any thought
>> was spent on how to avoid excessive copying or dealing with
>> long(er)-lived objects and their storage needs.
>
> I actually disagree.    Wide ints can use a bloated amount of storage
> because they are designed to be very short lived and very low cost objects
> that are stack allocated.   For long term storage, there is INT_CST at the
> tree level and CONST_WIDE_INT at the rtl level.  Those use a very compact
> storage model.   The copying entailed is only a small part of the overall
> performance.

Well, but both trees and RTXen are not viable for short-lived things because
they are GCed!  double-ints were suitable for this kind of stuff because
they also have a moderate size.  With wide-ints size becomes a problem
(or GC, if you instead use trees or RTXen).

> Everything that you are suggesting along these lines is adding to the weight
> of a wide-int object.

On the contrary - it lessens their weight (with external, already existing
storage) or does not do anything to it (with the embedded storage).

>  You have to understand there will be many more
> wide-ints created in a normal compilation than were ever created with
> double-int.    This is because the rtl level had no object like this at all
> and at the tree level, many of the places that should have used double int,
> shortcut the code and only did the transformations if the types fit in a
> HWI.

Your argument shows that the copy-in/out from tree/RTX to/from wide-int
will become a very frequent operation and thus it is worth optimizing it.

> This is why we are extremely defensive about this issue.   We really did
> think a lot about it.

I'm sure you did.

Richard.

^ permalink raw reply	[flat|nested] 59+ messages in thread

* Re: GCC 4.8.0 Status Report (2012-10-29), Stage 1 to end soon
  2012-11-26 15:03                         ` Richard Biener
@ 2012-11-26 16:03                           ` Kenneth Zadeck
  2012-11-26 16:30                             ` Richard Biener
  0 siblings, 1 reply; 59+ messages in thread
From: Kenneth Zadeck @ 2012-11-26 16:03 UTC (permalink / raw)
  To: Richard Biener; +Cc: Jakub Jelinek, gcc, gcc-patches, rdsandiford

On 11/26/2012 10:03 AM, Richard Biener wrote:
> On Mon, Nov 5, 2012 at 2:59 PM, Kenneth Zadeck <zadeck@naturalbridge.com> wrote:
>> On 11/04/2012 11:54 AM, Richard Biener wrote:
>>> On Thu, Nov 1, 2012 at 2:10 PM, Richard Sandiford
>>> <rdsandiford@googlemail.com> wrote:
>>>> Kenneth Zadeck <zadeck@naturalbridge.com> writes:
>>>>> I would like you to respond to at least point 1 of this email.   In it
>>>>> there is code from the rtl level that was written twice, once for the
>>>>> case when the size of the mode is less than the size of a HWI and once
>>>>> for the case where the size of the mode is less than 2 HWIs.
>>>>>
>>>>> my patch changes this to one instance of the code that works no matter
>>>>> how large the data passed to it is.
>>>>>
>>>>> you have made a specific requirement for wide int to be a template that
>>>>> can be instantiated in several sizes, one for 1 HWI, one for 2 HWI.   I
>>>>> would like to know how this particular fragment is to be rewritten in
>>>>> this model?   It seems that I would have to retain the structure where
>>>>> there is one version of the code for each size that the template is
>>>>> instantiated.
>>>> I think richi's argument was that wide_int should be split into two.
>>>> There should be a "bare-metal" class that just has a length and HWIs,
>>>> and the main wide_int class should be an extension on top of that
>>>> that does things to a bit precision instead.  Presumably with some
>>>> template magic so that the length (number of HWIs) is a constant for:
>>>>
>>>>     typedef foo<2> double_int;
>>>>
>>>> and a variable for wide_int (because in wide_int the length would be
>>>> the number of significant HWIs rather than the size of the underlying
>>>> array).  wide_int would also record the precision and apply it after
>>>> the full HWI operation.
>>>>
>>>> So the wide_int class would still provide "as wide as we need"
>>>> arithmetic,
>>>> as in your rtl patch.  I don't think he was objecting to that.
>>> That summarizes one part of my complaints / suggestions correctly.  In
>>> other
>>> mails I suggested to not make it a template but a constant over object
>>> lifetime
>>> 'bitsize' (or maxlen) field.  Both suggestions likely require more thought
>>> than
>>> I put into them.  The main reason is that with C++ you can abstract from
>>> where
>>> wide-int information pieces are stored and thus use the arithmetic /
>>> operation
>>> workers without copying the (source) "wide-int" objects.  Thus you should
>>> be able to write adaptors for double-int storage, tree or RTX storage.
>> We had considered something along these lines and rejected it.   I am not
>> really opposed to doing something like this, but it is not an obvious
>> winning idea and is likely not to be a good idea.   Here was our thought
>> process:
>>
>> if you abstract away the storage inside a wide int, then you should be able
>> to copy a pointer to the block of data from either the rtl level integer
>> constant or the tree level one into the wide int.   It is certainly true
>> that making a wide_int from one of these is an extremely common operation
>> and doing this would avoid those copies.
>>
>> However, this causes two problems:
>> 1)  Mike's first cut at the CONST_WIDE_INT did two ggc allocations to make
>> the object.   it created the base object and then it allocated the array.
>> Richard S noticed that we could just allocate one CONST_WIDE_INT that had
>> the array in it.   Doing it this way saves one ggc allocation and one
>> indirection when accessing the data within the CONST_WIDE_INT.   Our plan is
>> to use the same trick at the tree level.   So to avoid the copying, you seem
>> to have to have a more expensive rep for CONST_WIDE_INT and INT_CST.
> I did not propose having a pointer to the data in the RTX or tree int.  Just
> the short-lived wide-ints (which are on the stack) would have a pointer to
> the data - which can then obviously point into the RTX and tree data.
There is then the issue of what happens if some wide-ints are not short
lived.  It makes me nervous to create internal pointers to GCed memory.
>> 2) You are now stuck either ggcing the storage inside a wide_int when they
>> are created as part of an expression or you have to play some game to
>> represent the two different storage plans inside of wide_int.
> Hm?  wide-ints are short-lived and thus never live across a garbage collection
> point.  We create non-GCed objects pointing to GCed objects all the time
> and everywhere this way.
Again, this makes me nervous but it could be done.  However, it does 
mean that now the wide ints that are not created from rtxes or trees 
will be more expensive because they are not going to get their storage 
"for free", they are going to alloca it.

However, it still is not clear, given that 99% of the wide ints are 
going to fit in a single HWI, that this would be a noticeable win.
>
>>    Clearly this
>> is where you think that we should be going by suggesting that we abstract
>> away the internal storage.   However, this comes at a price:   what is
>> currently an array access in my patches would (i believe) become a function
>> call.
> No, the workers (that perform the array accesses) will simply get
> a pointer to the first data element.  Then whether it's embedded or
> external is of no interest to them.
So is your plan that the wide int constructors from rtx or tree would 
just copy the pointer to the array on top of the array that is otherwise 
allocated on the stack?    I can easily do this.   But as I said, the 
gain seems quite small.

And of course, going the other way still does need the copy.
>>   From a performance point of view, i believe that this is a non
>> starter. If you can figure out how to design this so that it is not a
>> function call, i would consider this a viable option.
>>
>> On the other side of this you are clearly correct that we are copying the
>> data when we are making wide ints from INT_CSTs or CONST_WIDE_INTs.    But
>> this is why we represent data inside of the wide_ints, the INT_CSTs and the
>> CONST_WIDE_INTs in a compressed form.   Even with very big types, which are
>> generally rare, the constants themselves are very small.   So the copy
>> operation is a loop that almost always copies one element, even with
>> tree-vrp which doubles the sizes of every type.
>>
>> There is the third option which is that the storage inside the wide int is
>> just ggced storage.  We rejected this because of the functional nature of
>> wide-ints.    There are zillions created, they can be stack allocated, and
>> they last for very short periods of time.
> Of course - GCing wide-ints is a non-starter.
>
>>>> As is probably obvious, I don't agree FWIW.  It seems like an unnecessary
>>>> complication without any clear use.  Especially since the number of
>>> Maybe the double_int typedef is without any clear use.  Properly
>>> abstracting from the storage / information providers will save
>>> compile-time, memory and code though.  I don't see that any thought
>>> was spent on how to avoid excessive copying or dealing with
>>> long(er)-lived objects and their storage needs.
>> I actually disagree.    Wide ints can use a bloated amount of storage
>> because they are designed to be very short lived and very low cost objects
>> that are stack allocated.   For long term storage, there is INT_CST at the
>> tree level and CONST_WIDE_INT at the rtl level.  Those use a very compact
>> storage model.   The copying entailed is only a small part of the overall
>> performance.
> Well, but both trees and RTXen are not viable for short-lived things because
> they are GCed!  double-ints were suitable for this kind of stuff because
> they also have a moderate size.  With wide-ints size becomes a problem
> (or GC, if you instead use trees or RTXen).
>
>> Everything that you are suggesting along these lines is adding to the weight
>> of a wide-int object.
> On the contrary - it lessens their weight (with external already
> existing storage)
> or does not do anything to it (with the embedded storage).
>
>>   You have to understand there will be many more
>> wide-ints created in a normal compilation than were ever created with
>> double-int.    This is because the rtl level had no object like this at all
>> and at the tree level, many of the places that should have used double int,
>> short cut the code and only did the transformations if the types fit in a
>> HWI.
> Your argument shows that the copy-in/out from tree/RTX to/from wide-int
> will become a very frequent operation and thus it is worth optimizing it.
>
>> This is why we are extremely defensive about this issue.   We really did
>> think a lot about it.
> I'm sure you did.
>
> Richard.

^ permalink raw reply	[flat|nested] 59+ messages in thread

* Re: GCC 4.8.0 Status Report (2012-10-29), Stage 1 to end soon
  2012-11-26 16:03                           ` Kenneth Zadeck
@ 2012-11-26 16:30                             ` Richard Biener
  2012-11-27  0:06                               ` Kenneth Zadeck
  0 siblings, 1 reply; 59+ messages in thread
From: Richard Biener @ 2012-11-26 16:30 UTC (permalink / raw)
  To: Kenneth Zadeck; +Cc: Jakub Jelinek, gcc, gcc-patches, rdsandiford

On Mon, Nov 26, 2012 at 5:03 PM, Kenneth Zadeck
<zadeck@naturalbridge.com> wrote:
> On 11/26/2012 10:03 AM, Richard Biener wrote:
>>
>> On Mon, Nov 5, 2012 at 2:59 PM, Kenneth Zadeck <zadeck@naturalbridge.com>
>> wrote:
>>>
>>> On 11/04/2012 11:54 AM, Richard Biener wrote:
>>>>
>>>> On Thu, Nov 1, 2012 at 2:10 PM, Richard Sandiford
>>>> <rdsandiford@googlemail.com> wrote:
>>>>>
>>>>> Kenneth Zadeck <zadeck@naturalbridge.com> writes:
>>>>>>
>>>>>> I would like you to respond to at least point 1 of this email.   In it
>>>>>> there is code from the rtl level that was written twice, once for the
>>>>>> case when the size of the mode is less than the size of a HWI and once
>>>>>> for the case where the size of the mode is less than 2 HWIs.
>>>>>>
>>>>>> my patch changes this to one instance of the code that works no matter
>>>>>> how large the data passed to it is.
>>>>>>
>>>>>> you have made a specific requirement for wide int to be a template
>>>>>> that
>>>>>> can be instantiated in several sizes, one for 1 HWI, one for 2 HWI.
>>>>>> I
>>>>>> would like to know how this particular fragment is to be rewritten in
>>>>>> this model?   It seems that I would have to retain the structure where
>>>>>> there is one version of the code for each size that the template is
>>>>>> instantiated.
>>>>>
>>>>> I think richi's argument was that wide_int should be split into two.
>>>>> There should be a "bare-metal" class that just has a length and HWIs,
>>>>> and the main wide_int class should be an extension on top of that
>>>>> that does things to a bit precision instead.  Presumably with some
>>>>> template magic so that the length (number of HWIs) is a constant for:
>>>>>
>>>>>     typedef foo<2> double_int;
>>>>>
>>>>> and a variable for wide_int (because in wide_int the length would be
>>>>> the number of significant HWIs rather than the size of the underlying
>>>>> array).  wide_int would also record the precision and apply it after
>>>>> the full HWI operation.
>>>>>
>>>>> So the wide_int class would still provide "as wide as we need"
>>>>> arithmetic,
>>>>> as in your rtl patch.  I don't think he was objecting to that.
>>>>
>>>> That summarizes one part of my complaints / suggestions correctly.  In
>>>> other
>>>> mails I suggested to not make it a template but a constant over object
>>>> lifetime
>>>> 'bitsize' (or maxlen) field.  Both suggestions likely require more
>>>> thought
>>>> than
>>>> I put into them.  The main reason is that with C++ you can abstract from
>>>> where
>>>> wide-int information pieces are stored and thus use the arithmetic /
>>>> operation
>>>> workers without copying the (source) "wide-int" objects.  Thus you
>>>> should
>>>> be able to write adaptors for double-int storage, tree or RTX storage.
>>>
>>> We had considered something along these lines and rejected it.   I am not
>>> really opposed to doing something like this, but it is not an obvious
>>> winning idea and is likely not to be a good idea.   Here was our thought
>>> process:
>>>
>>> if you abstract away the storage inside a wide int, then you should be
>>> able
>>> to copy a pointer to the block of data from either the rtl level integer
>>> constant or the tree level one into the wide int.   It is certainly true
>>> that making a wide_int from one of these is an extremely common operation
>>> and doing this would avoid those copies.
>>>
>>> However, this causes two problems:
>>> 1)  Mike's first cut at the CONST_WIDE_INT did two ggc allocations to
>>> make
>>> the object.   it created the base object and then it allocated the array.
>>> Richard S noticed that we could just allocate one CONST_WIDE_INT that had
>>> the array in it.   Doing it this way saves one ggc allocation and one
>>> indirection when accessing the data within the CONST_WIDE_INT.   Our plan
>>> is
>>> to use the same trick at the tree level.   So to avoid the copying, you
>>> seem
>>> to have to have a more expensive rep for CONST_WIDE_INT and INT_CST.
>>
>> I did not propose having a pointer to the data in the RTX or tree int.
>> Just
>> the short-lived wide-ints (which are on the stack) would have a pointer to
>> the data - which can then obviously point into the RTX and tree data.
>
> There is the issue then what if some wide-ints are not short lived. It makes
> me nervous to create internal pointers to gc ed memory.

I thought they were all short-lived.

>>> 2) You are now stuck either ggcing the storage inside a wide_int when
>>> they
>>> are created as part of an expression or you have to play some game to
>>> represent the two different storage plans inside of wide_int.
>>
>> Hm?  wide-ints are short-lived and thus never live across a garbage
>> collection
>> point.  We create non-GCed objects pointing to GCed objects all the time
>> and everywhere this way.
>
> Again, this makes me nervous but it could be done.  However, it does mean
> that now the wide ints that are not created from rtxes or trees will be more
> expensive because they are not going to get their storage "for free", they
> are going to alloca it.

No, those would simply use the embedded storage model.

> however, it still is not clear, given that 99% of the wide ints are going to
> fit in a single hwi, that this would be a noticeable win.

Currently even if they fit into a HWI you will still allocate 4 times the
largest integer mode size.  You say that doesn't matter because they
are short-lived, but I say it does matter because not all of them are
short-lived enough.  If 99% fit in a HWI why allocate 4 times the
largest integer mode size in 99% of the cases?
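
To put rough numbers on that, here is a small sketch.  The figures are
assumptions for illustration only (a 64-bit HWI and a 128-bit largest
integer mode), and embedded_rep, external_rep and wasted_bytes are
hypothetical names, not anything from the patch.  With these numbers the
embedded array holds 8 HWIs, so a value that fits in one HWI leaves 7 of
them (56 bytes) unused.

```cpp
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef int64_t HWI;                        // stand-in for HOST_WIDE_INT

// Assumed numbers, for illustration only.
const unsigned HWI_BITS = 64;
const unsigned LARGEST_INT_MODE_BITS = 128; // assumption: a TImode-like mode
const unsigned EMBEDDED_LEN = 4 * LARGEST_INT_MODE_BITS / HWI_BITS;

// Embedded representation: always carries the full array.
struct embedded_rep { HWI val[EMBEDDED_LEN]; unsigned len; };

// External representation: only a pointer and a length.
struct external_rep { const HWI *ptr; unsigned len; };

// Bytes of embedded array storage left unused when only `used' HWIs
// are significant.
inline size_t
wasted_bytes (unsigned used)
{
  assert (used <= EMBEDDED_LEN);
  return (EMBEDDED_LEN - used) * sizeof (HWI);
}
```

The exact sizes depend on the host and on the real array bound; the
sketch only shows how the unused part scales with that bound.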

>>
>>>    Clearly this
>>> is where you think that we should be going by suggesting that we abstract
>>> away the internal storage.   However, this comes at a price:   what is
>>> currently an array access in my patches would (i believe) become a
>>> function
>>> call.
>>
>> No, the workers (that perform the array accesses) will simply get
>> a pointer to the first data element.  Then whether it's embedded or
>> external is of no interest to them.
>
> so is your plan that the wide int constructors from rtx or tree would just
> copy the pointer to the array on top of the array that is otherwise
> allocated on the stack?    I can easily do this.   But as i said, the gain
> seems quite small.
>
> And of course, going the other way still does need the copy.

The proposal was to template wide_int on a storage model.  The embedded
one would work as-is (embedding 4 times the largest integer mode); the
external one would have a pointer to the data.  All functions that return a
wide_int produce a wide_int with the embedded model.  To avoid
the function call penalty you described, the storage model provides
a way to get a pointer to the first element, and the templated operations
simply dispatch to a worker that takes this pointer to the first element
(as the storage model is designed as a template, its abstraction is going
to be optimized away by means of inlining).
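
A minimal sketch of that dispatch, under stated assumptions: HWI,
MAX_LEN, embedded_storage, external_storage, eq_worker and wi_sketch are
hypothetical names, and the worker assumes both operands are in the same
canonical compressed form and precision.  It is not code from the patch.

```cpp
#include <assert.h>
#include <stdint.h>

typedef int64_t HWI;             // stand-in for HOST_WIDE_INT
const unsigned MAX_LEN = 4;      // stand-in for the embedded array bound

// Embedded model: the HWIs live inside the (stack allocated) object.
struct embedded_storage
{
  HWI val[MAX_LEN];
  unsigned len;
  const HWI *data () const { return val; }
};

// External model: the object only points at HWIs owned elsewhere,
// e.g. the array inside a CONST_WIDE_INT or an INT_CST.
struct external_storage
{
  const HWI *ptr;
  unsigned len;
  const HWI *data () const { return ptr; }
};

// Non-templated worker.  It sees only pointers and lengths, so it does
// not care which storage model the operands came from.
inline bool
eq_worker (const HWI *a, unsigned alen, const HWI *b, unsigned blen)
{
  assert (alen <= MAX_LEN && blen <= MAX_LEN);
  if (alen != blen)
    return false;
  for (unsigned i = 0; i < alen; i++)
    if (a[i] != b[i])
      return false;
  return true;
}

// The templated front end only forwards data () and len to the worker.
// data () is a trivial inline accessor, so no call should survive
// inlining.
template <class storage>
class wi_sketch : public storage
{
public:
  template <class other>
  bool eq_p (const wi_sketch<other> &x) const
  {
    return eq_worker (this->data (), this->len, x.data (), x.len);
  }
};
```

A wi_sketch<external_storage> pointing into an rtx can then be compared
against a wi_sketch<embedded_storage> temporary without copying the
rtx data first.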

Richard.

>>>   From a performance point of view, i believe that this is a non
>>> starter. If you can figure out how to design this so that it is not a
>>> function call, i would consider this a viable option.
>>>
>>> On the other side of this you are clearly correct that we are copying the
>>> data when we are making wide ints from INT_CSTs or CONST_WIDE_INTs.
>>> But
>>> this is why we represent data inside of the wide_ints, the INT_CSTs and
>>> the
>>> CONST_WIDE_INTs in a compressed form.   Even with very big types, which
>>> are
>>> generally rare, the constants themselves are very small.   So the copy
>>> operation is a loop that almost always copies one element, even with
>>> tree-vrp which doubles the sizes of every type.
>>>
>>> There is the third option which is that the storage inside the wide int
>>> is
>>> just ggced storage.  We rejected this because of the functional nature of
>>> wide-ints.    There are zillions created, they can be stack allocated,
>>> and
>>> they last for very short periods of time.
>>
>> Of course - GCing wide-ints is a non-starter.
>>
>>>>> As is probably obvious, I don't agree FWIW.  It seems like an
>>>>> unnecessary
>>>>> complication without any clear use.  Especially since the number of
>>>>
>>>> Maybe the double_int typedef is without any clear use.  Properly
>>>> abstracting from the storage / information providers will save
>>>> compile-time, memory and code though.  I don't see that any thought
>>>> was spent on how to avoid excessive copying or dealing with
>>>> long(er)-lived objects and their storage needs.
>>>
>>> I actually disagree.    Wide ints can use a bloated amount of storage
>>> because they are designed to be very short lived and very low cost
>>> objects
>>> that are stack allocated.   For long term storage, there is INT_CST at
>>> the
>>> tree level and CONST_WIDE_INT at the rtl level.  Those use a very compact
>>> storage model.   The copying entailed is only a small part of the overall
>>> performance.
>>
>> Well, but both trees and RTXen are not viable for short-lived things
>> because
>> they are GCed!  double-ints were suitable for this kind of stuff because
>> they also have a moderate size.  With wide-ints size becomes a problem
>> (or GC, if you instead use trees or RTXen).
>>
>>> Everything that you are suggesting along these lines is adding to the
>>> weight
>>> of a wide-int object.
>>
>> On the contrary - it lessens their weight (with external already
>> existing storage)
>> or does not do anything to it (with the embedded storage).
>>
>>>   You have to understand there will be many more
>>> wide-ints created in a normal compilation than were ever created with
>>> double-int.    This is because the rtl level had no object like this at
>>> all
>>> and at the tree level, many of the places that should have used double
>>> int,
>>> short cut the code and only did the transformations if the types fit in a
>>> HWI.
>>
>> Your argument shows that the copy-in/out from tree/RTX to/from wide-int
>> will become a very frequent operation and thus it is worth optimizing it.
>>
>>> This is why we are extremely defensive about this issue.   We really did
>>> think a lot about it.
>>
>> I'm sure you did.
>>
>> Richard.
>
>

^ permalink raw reply	[flat|nested] 59+ messages in thread

* Re: GCC 4.8.0 Status Report (2012-10-29), Stage 1 to end soon
  2012-11-26 16:30                             ` Richard Biener
@ 2012-11-27  0:06                               ` Kenneth Zadeck
  2012-11-27 10:03                                 ` Richard Biener
  0 siblings, 1 reply; 59+ messages in thread
From: Kenneth Zadeck @ 2012-11-27  0:06 UTC (permalink / raw)
  To: Richard Biener; +Cc: Jakub Jelinek, gcc, gcc-patches, rdsandiford, Mike Stump

Richard,

I spent a good part of the afternoon talking to Mike about this.  He is 
on the c++ standards committee and is a much more seasoned c++ 
programmer than I am.

He convinced me that, with a large amount of engineering and c++ 
"foolishness", it was indeed possible to get your proposal to 
POSSIBLY work as well as what we did.

But now the question is why would anyone want to do this?

At the very least you are talking about instantiating two instances of 
wide-ints, one for the stack allocated uses and one for the places where 
we just move a pointer from the tree or the rtx. Then you are talking 
about creating connectors so that the stack allocated functions can take 
parameters of the pointer version and vice versa.

Then there is the issue that rather than just saying that something is a 
wide int, the programmer is going to have to track its origin.   
In particular, where in the code right now i say

wide_int foo = wide_int::from_rtx (r1);
wide_int bar = wide_int::from_rtx (r2) + foo;

now i would have to say

wide_int_ptr foo = wide_int_ptr::from_rtx (r1);
wide_int_stack bar = wide_int_ptr::from_rtx (r2) + foo;

then when i want to call some function using a wide_int ref that 
function now must be either overloaded to take both or i have to choose 
one of the two instantiations (presumably based on which is going to be 
more common) and just have the compiler fix up everything (which it is 
likely to do).

And so what is the payoff:
1) No one except the c++ elite is going to understand the code. The rest 
of the community will hate me and curse the ground that i walk on.
2) I will end up with a version of wide-int that can be used as a medium 
life container (where i define medium life as not allowed to survive a 
gc since they will contain pointers into rtxes and trees.)
3) And no clients that actually wanted to do this!!    I could use as an 
example one of your favorite passes, tree-vrp.   The current double-int 
could have been a medium lifetime container since it has a smaller 
footprint, but in fact tree-vrp converts those double-ints back into 
trees for medium storage.   Why?  Because it needs the other fields of a 
tree-cst to store the entire state.  Wide-ints also "suffer" this 
problem.  Their only state is the data and the three length fields.   
They have no type and none of the other tree info, so the most obvious 
client for a medium lifetime object is really not going to be a good 
match even if you "solve the storage problem".

The fact is that wide-ints are an excellent short term storage class 
that can be very quickly converted into our two long term storage 
classes.  Your proposal requires a lot of work, will not be easy to 
use and, as far as i can see, has no payoff on the horizon.   It could be 
that there will be future clients for a medium lifetime value, but 
asking for this with no clients in hand is really beyond the scope of a 
reasonable review.

I remind you that the purpose of these patches is to solve problems that 
exist in the current compiler that we have papered over for years.   If 
someone needs wide-ints in some way that is not foreseen then they can 
change it.

kenny

On 11/26/2012 11:30 AM, Richard Biener wrote:
> On Mon, Nov 26, 2012 at 5:03 PM, Kenneth Zadeck
> <zadeck@naturalbridge.com> wrote:
>> On 11/26/2012 10:03 AM, Richard Biener wrote:
>>> On Mon, Nov 5, 2012 at 2:59 PM, Kenneth Zadeck <zadeck@naturalbridge.com>
>>> wrote:
>>>> On 11/04/2012 11:54 AM, Richard Biener wrote:
>>>>> On Thu, Nov 1, 2012 at 2:10 PM, Richard Sandiford
>>>>> <rdsandiford@googlemail.com> wrote:
>>>>>> Kenneth Zadeck <zadeck@naturalbridge.com> writes:
>>>>>>> I would like you to respond to at least point 1 of this email.   In it
>>>>>>> there is code from the rtl level that was written twice, once for the
>>>>>>> case when the size of the mode is less than the size of a HWI and once
>>>>>>> for the case where the size of the mode is less than 2 HWIs.
>>>>>>>
>>>>>>> my patch changes this to one instance of the code that works no matter
>>>>>>> how large the data passed to it is.
>>>>>>>
>>>>>>> you have made a specific requirement for wide int to be a template
>>>>>>> that
>>>>>>> can be instantiated in several sizes, one for 1 HWI, one for 2 HWI.
>>>>>>> I
>>>>>>> would like to know how this particular fragment is to be rewritten in
>>>>>>> this model?   It seems that I would have to retain the structure where
>>>>>>> there is one version of the code for each size that the template is
>>>>>>> instantiated.
>>>>>> I think richi's argument was that wide_int should be split into two.
>>>>>> There should be a "bare-metal" class that just has a length and HWIs,
>>>>>> and the main wide_int class should be an extension on top of that
>>>>>> that does things to a bit precision instead.  Presumably with some
>>>>>> template magic so that the length (number of HWIs) is a constant for:
>>>>>>
>>>>>>      typedef foo<2> double_int;
>>>>>>
>>>>>> and a variable for wide_int (because in wide_int the length would be
>>>>>> the number of significant HWIs rather than the size of the underlying
>>>>>> array).  wide_int would also record the precision and apply it after
>>>>>> the full HWI operation.
>>>>>>
>>>>>> So the wide_int class would still provide "as wide as we need"
>>>>>> arithmetic,
>>>>>> as in your rtl patch.  I don't think he was objecting to that.
>>>>> That summarizes one part of my complaints / suggestions correctly.  In
>>>>> other
>>>>> mails I suggested to not make it a template but a constant over object
>>>>> lifetime
>>>>> 'bitsize' (or maxlen) field.  Both suggestions likely require more
>>>>> thought
>>>>> than
>>>>> I put into them.  The main reason is that with C++ you can abstract from
>>>>> where
>>>>> wide-int information pieces are stored and thus use the arithmetic /
>>>>> operation
>>>>> workers without copying the (source) "wide-int" objects.  Thus you
>>>>> should
>>>>> be able to write adaptors for double-int storage, tree or RTX storage.
>>>> We had considered something along these lines and rejected it.   I am not
>>>> really opposed to doing something like this, but it is not an obvious
>>>> winning idea and is likely not to be a good idea.   Here was our thought
>>>> process:
>>>>
>>>> if you abstract away the storage inside a wide int, then you should be
>>>> able
>>>> to copy a pointer to the block of data from either the rtl level integer
>>>> constant or the tree level one into the wide int.   It is certainly true
>>>> that making a wide_int from one of these is an extremely common operation
>>>> and doing this would avoid those copies.
>>>>
>>>> However, this causes two problems:
>>>> 1)  Mike's first cut at the CONST_WIDE_INT did two ggc allocations to
>>>> make
>>>> the object.   it created the base object and then it allocated the array.
>>>> Richard S noticed that we could just allocate one CONST_WIDE_INT that had
>>>> the array in it.   Doing it this way saves one ggc allocation and one
>>>> indirection when accessing the data within the CONST_WIDE_INT.   Our plan
>>>> is
>>>> to use the same trick at the tree level.   So to avoid the copying, you
>>>> seem
>>>> to have to have a more expensive rep for CONST_WIDE_INT and INT_CST.
>>> I did not propose having a pointer to the data in the RTX or tree int.
>>> Just
>>> the short-lived wide-ints (which are on the stack) would have a pointer to
>>> the data - which can then obviously point into the RTX and tree data.
>> There is the issue then what if some wide-ints are not short lived. It makes
>> me nervous to create internal pointers to gc ed memory.
> I thought they were all short-lived.
>
>>>> 2) You are now stuck either ggcing the storage inside a wide_int when
>>>> they
>>>> are created as part of an expression or you have to play some game to
>>>> represent the two different storage plans inside of wide_int.
>>> Hm?  wide-ints are short-lived and thus never live across a garbage
>>> collection
>>> point.  We create non-GCed objects pointing to GCed objects all the time
>>> and everywhere this way.
>> Again, this makes me nervous but it could be done.  However, it does mean
>> that now the wide ints that are not created from rtxes or trees will be more
>> expensive because they are not going to get their storage "for free", they
>> are going to alloca it.
> No, those would simply use the embedded storage model.
>
>> however, it still is not clear, given that 99% of the wide ints are going to
>> fit in a single hwi, that this would be a noticeable win.
> Currently even if they fit into a HWI you will still allocate 4 times the
> largest integer mode size.  You say that doesn't matter because they
> are short-lived, but I say it does matter because not all of them are
> short-lived enough.  If 99% fit in a HWI why allocate 4 times the
> largest integer mode size in 99% of the cases?
>
>>>>     Clearly this
>>>> is where you think that we should be going by suggesting that we abstract
>>>> away the internal storage.   However, this comes at a price:   what is
>>>> currently an array access in my patches would (i believe) become a
>>>> function
>>>> call.
>>> No, the workers (that perform the array accesses) will simply get
>>> a pointer to the first data element.  Then whether it's embedded or
>>> external is of no interest to them.
>> so is your plan that the wide int constructors from rtx or tree would just
>> copy the pointer to the array on top of the array that is otherwise
>> allocated on the stack?    I can easily do this.   But as i said, the gain
>> seems quite small.
>>
>> And of course, going the other way still does need the copy.
> The proposal was to template wide_int on a storage model, the embedded
> one would work as-is (embedding 4 times largest integer mode), the
> external one would have a pointer to data.  All functions that return a
> wide_int produce a wide_int with the embedded model.  To avoid
> the function call penalty you described the storage model provides
> a way to get a pointer to the first element and the templated operations
> simply dispatch to a worker that takes this pointer to the first element
> (as the storage model is designed as a template its abstraction is going
> to be optimized away by means of inlining).
>
> Richard.
>
>>>>    From a performance point of view, i believe that this is a non
>>>> starter. If you can figure out how to design this so that it is not a
>>>> function call, i would consider this a viable option.
>>>>
>>>> On the other side of this you are clearly correct that we are copying the
>>>> data when we are making wide ints from INT_CSTs or CONST_WIDE_INTs.
>>>> But
>>>> this is why we represent data inside of the wide_ints, the INT_CSTs and
>>>> the
>>>> CONST_WIDE_INTs in a compressed form.   Even with very big types, which
>>>> are
>>>> generally rare, the constants themselves are very small.   So the copy
>>>> operation is a loop that almost always copies one element, even with
>>>> tree-vrp which doubles the sizes of every type.
>>>>
>>>> There is the third option which is that the storage inside the wide int
>>>> is
>>>> just ggced storage.  We rejected this because of the functional nature of
>>>> wide-ints.    There are zillions created, they can be stack allocated,
>>>> and
>>>> they last for very short periods of time.
>>> Of course - GCing wide-ints is a non-starter.
>>>
>>>>>> As is probably obvious, I don't agree FWIW.  It seems like an
>>>>>> unnecessary
>>>>>> complication without any clear use.  Especially since the number of
>>>>> Maybe the double_int typedef is without any clear use.  Properly
>>>>> abstracting from the storage / information providers will save
>>>>> compile-time, memory and code though.  I don't see that any thought
>>>>> was spent on how to avoid excessive copying or dealing with
>>>>> long(er)-lived objects and their storage needs.
>>>> I actually disagree.    Wide ints can use a bloated amount of storage
>>>> because they are designed to be very short lived and very low cost
>>>> objects
>>>> that are stack allocated.   For long term storage, there is INT_CST at
>>>> the
>>>> tree level and CONST_WIDE_INT at the rtl level.  Those use a very compact
>>>> storage model.   The copying entailed is only a small part of the overall
>>>> performance.
>>> Well, but both trees and RTXen are not viable for short-lived things
>>> because
>>> they are GCed!  double-ints were suitable for this kind of stuff because
>>> they also have a moderate size.  With wide-ints size becomes a problem
>>> (or GC, if you instead use trees or RTXen).
>>>
>>>> Everything that you are suggesting along these lines is adding to the
>>>> weight
>>>> of a wide-int object.
>>> On the contrary - it lessens their weight (with external already
>>> existing storage)
>>> or does not do anything to it (with the embedded storage).
>>>
>>>>    You have to understand there will be many more
>>>> wide-ints created in a normal compilation than were ever created with
>>>> double-int.    This is because the rtl level had no object like this at
>>>> all
>>>> and at the tree level, many of the places that should have used double
>>>> int,
>>>> short cut the code and only did the transformations if the types fit in a
>>>> HWI.
>>> Your argument shows that the copy-in/out from tree/RTX to/from wide-int
>>> will become a very frequent operation and thus it is worth optimizing it.
>>>
>>>> This is why we are extremely defensive about this issue.   We really did
>>>> think a lot about it.
>>> I'm sure you did.
>>>
>>> Richard.
>>

^ permalink raw reply	[flat|nested] 59+ messages in thread

* Re: GCC 4.8.0 Status Report (2012-10-29), Stage 1 to end soon
  2012-11-27  0:06                               ` Kenneth Zadeck
@ 2012-11-27 10:03                                 ` Richard Biener
  2012-11-27 13:03                                   ` Kenneth Zadeck
  0 siblings, 1 reply; 59+ messages in thread
From: Richard Biener @ 2012-11-27 10:03 UTC (permalink / raw)
  To: Kenneth Zadeck; +Cc: Jakub Jelinek, gcc, gcc-patches, rdsandiford, Mike Stump

On Tue, Nov 27, 2012 at 1:06 AM, Kenneth Zadeck
<zadeck@naturalbridge.com> wrote:
> Richard,
>
> I spent a good part of the afternoon talking to Mike about this.  He is on
> the c++ standards committee and is a much more seasoned c++ programmer than
> I am.
>
> He convinced me that with a large amount of engineering and c++
> "foolishness" that it was indeed possible to get your proposal to POSSIBLY
> work as well as what we did.
>
> But now the question is why would anyone want to do this?
>
> At the very least you are talking about instantiating two instances of
> wide-ints, one for the stack allocated uses and one for the places where we
> just move a pointer from the tree or the rtx. Then you are talking about
> creating connectors so that the stack allocated functions can take
> parameters of the pointer version and vice versa.
>
> Then there is the issue that rather than just saying that something is a
> wide int, the programmer is going to have to track its origin.   In
> particular, where in the code right now i say
>
> wide_int foo = wide_int::from_rtx (r1);
> wide_int bar = wide_int::from_rtx (r2) + foo;
>
> now i would have to say
>
> wide_int_ptr foo = wide_int_ptr::from_rtx (r1);
> wide_int_stack bar = wide_int_ptr::from_rtx (r2) + foo;

No, you'd say

wide_int foo = wide_int::from_rtx (r1);

and the static, non-templated from_rtx method would automagically
return (always!) a "wide_int_ptr" kind.  The initialization then would
use the assignment operator that mediates between wide_int and
"wide_int_ptr", doing the copying.

The user should get a 'stack' kind by default when specifying wide_int,
like implemented with

struct wide_int_storage_stack;
struct wide_int_storage_ptr;

template <class storage = wide_int_storage_stack>
class wide_int : public storage
{
...
   static wide_int <wide_int_storage_ptr> from_rtx (rtx);
};

the whole point of the exercise is to make from_rtx and from_tree avoid
the copying (and excessive stack space allocation) for the rvalue case
like in

 wide_int res = wide_int::from_rtx (x) + 1;

if you save the result into a wide_int temporary first then you are lost
of course (modulo some magic GCC optimization being able to elide
the copy somehow).

The storage template would also let code like VRP, which keeps a lattice
of wide_ints, reduce its footprint by using ptr storage and explicit
allocations (that's a secondary concern, of course), and let VRP specify
that it needs more than the otherwise needed MAX_INT_MODE_SIZE.
ptr storage would not have this arbitrary limitation; only embedded
storage (should) have it.
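
For readers following along, the storage-model idea described above can be sketched in a self-contained form.  This is only an illustration under stated assumptions, not code from the patch: fake_const stands in for an rtx or tree constant, hwi for HOST_WIDE_INT, MAX_ELTS for the embedded array size, from_const for from_rtx, and the wide_int_t template with its two typedefs are made-up names chosen so that plain wide_int names the stack kind.

```cpp
#include <cassert>
#include <cstdint>

// Every name below is a hypothetical stand-in; none of this is the actual
// wide-int patch or the GCC rtx/tree API.
typedef int64_t hwi;            // stand-in for HOST_WIDE_INT
const unsigned MAX_ELTS = 4;    // stand-in for the embedded array size

// Stand-in for an rtx or tree constant that owns its element array.
struct fake_const
{
  unsigned precision;
  unsigned len;                 // number of live elements
  hwi elts[MAX_ELTS];
};

// Embedded ("stack") storage: the elements live inside the object.
struct wide_int_storage_stack
{
  unsigned precision;
  unsigned len;
  hwi val[MAX_ELTS];

  // Copy-in: only the live elements are copied, the rest stay untouched.
  wide_int_storage_stack (unsigned prec, unsigned l, const hwi *src)
    : precision (prec), len (l)
  {
    assert (l <= MAX_ELTS);
    for (unsigned i = 0; i < l; i++)
      val[i] = src[i];
  }

  // Copying a stack-kind value also copies only the live elements.
  wide_int_storage_stack (const wide_int_storage_stack &o)
    : precision (o.precision), len (o.len)
  {
    for (unsigned i = 0; i < o.len; i++)
      val[i] = o.val[i];
  }

  const hwi *get_storage_ptr () const { return val; }
};

// Pointer storage: the object only refers to elements owned elsewhere,
// so it must not outlive them (or a GC of them).
struct wide_int_storage_ptr
{
  unsigned precision;
  unsigned len;
  const hwi *val;

  wide_int_storage_ptr (unsigned prec, unsigned l, const hwi *src)
    : precision (prec), len (l), val (src) {}

  const hwi *get_storage_ptr () const { return val; }
};

template <class storage>
class wide_int_t : public storage
{
public:
  wide_int_t (unsigned prec, unsigned l, const hwi *src)
    : storage (prec, l, src) {}

  // Mediates between storage kinds; for the stack kind this is the copy.
  template <class other>
  wide_int_t (const wide_int_t<other> &o)
    : storage (o.precision, o.len, o.get_storage_ptr ()) {}

  // Always returns the pointer kind, so building the value copies nothing.
  static wide_int_t<wide_int_storage_ptr> from_const (const fake_const &c)
  {
    return wide_int_t<wide_int_storage_ptr> (c.precision, c.len, c.elts);
  }
};

typedef wide_int_t<wide_int_storage_stack> wide_int;
typedef wide_int_t<wide_int_storage_ptr> wide_int_ptr;
```

With these definitions, "wide_int w = wide_int::from_const (c);" copies the live elements into w, while "wide_int_ptr p = wide_int::from_const (c);" only aliases the data in c.  Whether the saved copy is measurable is exactly what the thread disputes; the sketch takes no position on that.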

> then when i want to call some function using a wide_int ref that function
> now must be either overloaded to take both or i have to choose one of the
> two instantiations (presumably based on which is going to be more common)
> and just have the compiler fix up everything (which it is likely to do).

Nope, they'd be

class wide_int ...
{
   template <class storage1, class storage2>
   wide_int operator+(wide_int <storage1> a, wide_int<storage2> b)
   {
      return wide_int::plus_worker (a.precision, a. ...., a.get_storage_ptr (),
                                    b.precision, ..., b.get_storage_ptr ());
   }
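
The dispatch described here (a templated operator that forwards raw element pointers to a single non-templated worker) might look roughly like the following.  It is a sketch under loud assumptions: the element type is unsigned so the carry logic is well defined, missing high elements are treated as zero, precision is left out entirely, and storage_stack, storage_ptr, wide_int_t and this particular plus_worker signature are invented for illustration.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-ins only; unsigned elements keep the carry logic simple,
// whereas the real representation is signed and compressed.
typedef uint64_t elt;
const unsigned MAX_ELTS = 4;

struct storage_stack
{
  unsigned len;
  elt val[MAX_ELTS];
  storage_stack () : len (0), val () {}
  storage_stack (unsigned l, const elt *src) : len (l), val ()
  {
    assert (l <= MAX_ELTS);
    for (unsigned i = 0; i < l; i++)
      val[i] = src[i];
  }
  const elt *get_storage_ptr () const { return val; }
};

struct storage_ptr
{
  unsigned len;
  const elt *val;
  storage_ptr (unsigned l, const elt *src) : len (l), val (src) {}
  const elt *get_storage_ptr () const { return val; }
};

template <class storage>
struct wide_int_t : public storage
{
  wide_int_t () : storage () {}
  wide_int_t (unsigned l, const elt *src) : storage (l, src) {}
};

// The one non-templated worker: it sees lengths and raw element pointers.
// Missing high elements count as zero; a carry out of the top is dropped.
inline void
plus_worker (elt *res, unsigned reslen,
             const elt *a, unsigned alen, const elt *b, unsigned blen)
{
  elt carry = 0;
  for (unsigned i = 0; i < reslen; i++)
    {
      elt x = i < alen ? a[i] : 0;
      elt y = i < blen ? b[i] : 0;
      elt sum = x + y;
      elt carry_out = sum < x;      // did x + y wrap?
      sum += carry;
      carry_out |= sum < carry;     // did adding the carry wrap?
      res[i] = sum;
      carry = carry_out;
    }
}

// Works for any mix of storage kinds; the result is always the stack kind.
template <class storage1, class storage2>
wide_int_t<storage_stack>
operator+ (const wide_int_t<storage1> &a, const wide_int_t<storage2> &b)
{
  wide_int_t<storage_stack> res;
  res.len = a.len > b.len ? a.len : b.len;
  assert (res.len <= MAX_ELTS);
  plus_worker (res.val, res.len,
               a.get_storage_ptr (), a.len, b.get_storage_ptr (), b.len);
  return res;
}
```

Because the operator is a thin inline wrapper, the storage abstraction is expected to disappear after inlining, leaving only the call to the worker; that expectation is the argument made in the thread, not something this sketch measures.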


> And so what is the payoff:
> 1) No one except the c++ elite is going to understand the code. The rest of
> the community will hate me and curse the ground that i walk on.

Maybe for the implementation - but look at hash-table and vec ... not for
usage certainly.

> 2) I will end up with a version of wide-int that can be used as a medium
> life container (where i define medium life as not allowed to survive a gc
> since they will contain pointers into rtxes and trees.)
> 3) And no clients that actually wanted to do this!!    I could use as an
> example one of your favorite passes, tree-vrp.   The current double-int
> could have been a medium lifetime container since it has a smaller
> footprint, but in fact tree-vrp converts those double-ints back into trees
> for medium storage.   Why?  Because it needs the other fields of a tree-cst
> to store the entire state.  Wide-ints also "suffer" this problem.  Their
> only state is the data and the three length fields.   They have no type
> and none of the other tree info so the most obvious client for a medium
> lifetime object is really not going to be a good match even if you "solve
> the storage problem".
>
> The fact is that wide-ints are an excellent short term storage class that
> can be very quickly converted into our two long term storage classes.  Your
> proposal requires a lot of work, will not be easy to use and as far as i
> can see has no payoff on the horizon.   It could be that there could be
> future clients for a medium lifetime value, but asking for this with no
> clients in hand is really beyond the scope of a reasonable review.
>
> I remind you that the purpose of these patches is to solve problems that
> exist in the current compiler that we have papered over for years.   If
> someone needs wide-ints in some way that is not foreseen then they can
> change it.

The patches introduce a lot more temporary wide-ints (your words) and
at the same time make construction of them from tree / rtx very expensive,
both stack space and compile-time wise.  Look at how we for example
compute TREE_INT_CST + 1: int_cst_binop internally uses double_ints
for the computation and then instantiates a new tree for holding the result.
Now we'd use wide_ints for this, requiring totally unnecessary copying.
Why not try to avoid that in the first place?  And try to avoid making
wide_ints 4 times as large as really necessary just for the sake of VRP!
(VRP should have a way to say "_I_ want larger wide_ints", without putting
this burden on all other users).
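
One way a pass could ask for larger embedded storage without imposing it on everyone is to make the element count a template parameter.  The following is a minimal sketch of that idea only; embedded_wide_int, DEFAULT_ELTS, plain_wide_int and vrp_wide_int are hypothetical names, and the concrete sizes are arbitrary.

```cpp
#include <cassert>
#include <cstdint>

typedef int64_t hwi;              // hypothetical stand-in for HOST_WIDE_INT
const unsigned DEFAULT_ELTS = 2;  // assumed: enough for the widest integer mode

// Embedded storage whose capacity is chosen by the user of the type.
template <unsigned N = DEFAULT_ELTS>
struct embedded_wide_int
{
  unsigned precision;
  unsigned len;                   // number of live elements, at most N
  hwi val[N];
  static unsigned max_len () { return N; }
};

// Ordinary users get the small default ...
typedef embedded_wide_int<> plain_wide_int;
// ... and only a pass that needs the headroom pays for it.
typedef embedded_wide_int<4 * DEFAULT_ELTS> vrp_wide_int;
```

Here only code that names vrp_wide_int carries the larger array on the stack; everything else keeps the default footprint.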

Richard.

> kenny
>
>
> On 11/26/2012 11:30 AM, Richard Biener wrote:
>>
>> On Mon, Nov 26, 2012 at 5:03 PM, Kenneth Zadeck
>> <zadeck@naturalbridge.com> wrote:
>>>
>>> On 11/26/2012 10:03 AM, Richard Biener wrote:
>>>>
>>>> On Mon, Nov 5, 2012 at 2:59 PM, Kenneth Zadeck
>>>> <zadeck@naturalbridge.com>
>>>> wrote:
>>>>>
>>>>> On 11/04/2012 11:54 AM, Richard Biener wrote:
>>>>>>
>>>>>> On Thu, Nov 1, 2012 at 2:10 PM, Richard Sandiford
>>>>>> <rdsandiford@googlemail.com> wrote:
>>>>>>>
>>>>>>> Kenneth Zadeck <zadeck@naturalbridge.com> writes:
>>>>>>>>
>>>>>>>> I would like you to respond to at least point 1 of this email.   In
>>>>>>>> it
>>>>>>>> there is code from the rtl level that was written twice, once for
>>>>>>>> the
>>>>>>>> case when the size of the mode is less than the size of a HWI and
>>>>>>>> once
>>>>>>>> for the case where the size of the mode is less than 2 HWIs.
>>>>>>>>
>>>>>>>> my patch changes this to one instance of the code that works no
>>>>>>>> matter
>>>>>>>> how large the data passed to it is.
>>>>>>>>
>>>>>>>> you have made a specific requirement for wide int to be a template
>>>>>>>> that
>>>>>>>> can be instantiated in several sizes, one for 1 HWI, one for 2 HWI.
>>>>>>>> I
>>>>>>>> would like to know how this particular fragment is to be rewritten
>>>>>>>> in
>>>>>>>> this model?   It seems that I would have to retain the structure
>>>>>>>> where
>>>>>>>> there is one version of the code for each size that the template is
>>>>>>>> instantiated.
>>>>>>>
>>>>>>> I think richi's argument was that wide_int should be split into two.
>>>>>>> There should be a "bare-metal" class that just has a length and HWIs,
>>>>>>> and the main wide_int class should be an extension on top of that
>>>>>>> that does things to a bit precision instead.  Presumably with some
>>>>>>> template magic so that the length (number of HWIs) is a constant for:
>>>>>>>
>>>>>>>      typedef foo<2> double_int;
>>>>>>>
>>>>>>> and a variable for wide_int (because in wide_int the length would be
>>>>>>> the number of significant HWIs rather than the size of the underlying
>>>>>>> array).  wide_int would also record the precision and apply it after
>>>>>>> the full HWI operation.
>>>>>>>
>>>>>>> So the wide_int class would still provide "as wide as we need"
>>>>>>> arithmetic,
>>>>>>> as in your rtl patch.  I don't think he was objecting to that.
>>>>>>
>>>>>> That summarizes one part of my complaints / suggestions correctly.  In
>>>>>> other
>>>>>> mails I suggested to not make it a template but a constant over object
>>>>>> lifetime
>>>>>> 'bitsize' (or maxlen) field.  Both suggestions likely require more
>>>>>> thought
>>>>>> than
>>>>>> I put into them.  The main reason is that with C++ you can abstract
>>>>>> from
>>>>>> where
>>>>>> wide-int information pieces are stored and thus use the arithmetic /
>>>>>> operation
>>>>>> workers without copying the (source) "wide-int" objects.  Thus you
>>>>>> should
>>>>>> be able to write adaptors for double-int storage, tree or RTX storage.
>>>>>
>>>>> We had considered something along these lines and rejected it.   I am
>>>>> not
>>>>> really opposed to doing something like this, but it is not an obvious
>>>>> winning idea and is likely not to be a good idea.   Here was our
>>>>> thought
>>>>> process:
>>>>>
>>>>> if you abstract away the storage inside a wide int, then you should be
>>>>> able
>>>>> to copy a pointer to the block of data from either the rtl level
>>>>> integer
>>>>> constant or the tree level one into the wide int.   It is certainly
>>>>> true
>>>>> that making a wide_int from one of these is an extremely common
>>>>> operation
>>>>> and doing this would avoid those copies.
>>>>>
>>>>> However, this causes two problems:
>>>>> 1)  Mike's first cut at the CONST_WIDE_INT did two ggc allocations to
>>>>> make
>>>>> the object.   it created the base object and then it allocated the
>>>>> array.
>>>>> Richard S noticed that we could just allocate one CONST_WIDE_INT that
>>>>> had
>>>>> the array in it.   Doing it this way saves one ggc allocation and one
>>>>> indirection when accessing the data within the CONST_WIDE_INT.   Our
>>>>> plan
>>>>> is
>>>>> to use the same trick at the tree level.   So to avoid the copying, you
>>>>> seem
>>>>> to have to have a more expensive rep for CONST_WIDE_INT and INT_CST.
>>>>
>>>> I did not propose having a pointer to the data in the RTX or tree int.
>>>> Just
>>>> the short-lived wide-ints (which are on the stack) would have a pointer
>>>> to
>>>> the data - which can then obviously point into the RTX and tree data.
>>>
>>> There is the issue then what if some wide-ints are not short lived. It
>>> makes
>>> me nervous to create internal pointers to gc ed memory.
>>
>> I thought they were all short-lived.
>>
>>>>> 2) You are now stuck either ggcing the storage inside a wide_int when
>>>>> they
>>>>> are created as part of an expression or you have to play some game to
>>>>> represent the two different storage plans inside of wide_int.
>>>>
>>>> Hm?  wide-ints are short-lived and thus never live across a garbage
>>>> collection
>>>> point.  We create non-GCed objects pointing to GCed objects all the time
>>>> and everywhere this way.
>>>
>>> Again, this makes me nervous but it could be done.  However, it does mean
>>> that now the wide ints that are not created from rtxes or trees will be
>>> more
>>> expensive because they are not going to get their storage "for free",
>>> they
>>> are going to alloca it.
>>
>> No, those would simply use the embedded storage model.
>>
>>> however, it still is not clear, given that 99% of the wide ints are going
>>> to
>>> fit in a single hwi, that this would be a noticeable win.
>>
>> Currently even if they fit into a HWI you will still allocate 4 times the
>> largest integer mode size.  You say that doesn't matter because they
>> are short-lived, but I say it does matter because not all of them are
>> short-lived enough.  If 99% fit in a HWI why allocate 4 times the
>> largest integer mode size in 99% of the cases?
>>
>>>>>     Clearly this
>>>>> is where you think that we should be going by suggesting that we
>>>>> abstract
>>>>> away the internal storage.   However, this comes at a price:   what is
>>>>> currently an array access in my patches would (i believe) become a
>>>>> function
>>>>> call.
>>>>
>>>> No, the workers (that perform the array accesses) will simply get
>>>> a pointer to the first data element.  Then whether it's embedded or
>>>> external is of no interest to them.
>>>
>>> so is your plan that the wide int constructors from rtx or tree would
>>> just
>>> copy the pointer to the array on top of the array that is otherwise
>>> allocated on the stack?    I can easily do this.   But as i said, the
>>> gain
>>> seems quite small.
>>>
>>> And of course, going the other way still does need the copy.
>>
>> The proposal was to template wide_int on a storage model, the embedded
>> one would work as-is (embedding 4 times largest integer mode), the
>> external one would have a pointer to data.  All functions that return a
>> wide_int produce a wide_int with the embedded model.  To avoid
>> the function call penalty you described the storage model provides
>> a way to get a pointer to the first element and the templated operations
>> simply dispatch to a worker that takes this pointer to the first element
>> (as the storage model is designed as a template its abstraction is going
>> to be optimized away by means of inlining).
>>
>> Richard.
>>
>>>>>    From a performance point of view, i believe that this is a non
>>>>> starter. If you can figure out how to design this so that it is not a
>>>>> function call, i would consider this a viable option.
>>>>>
>>>>> On the other side of this you are clearly correct that we are copying
>>>>> the
>>>>> data when we are making wide ints from INT_CSTs or CONST_WIDE_INTs.
>>>>> But
>>>>> this is why we represent data inside of the wide_ints, the INT_CSTs and
>>>>> the
>>>>> CONST_WIDE_INTs in a compressed form.   Even with very big types, which
>>>>> are
>>>>> generally rare, the constants themselves are very small.   So the copy
>>>>> operation is a loop that almost always copies one element, even with
>>>>> tree-vrp which doubles the sizes of every type.
>>>>>
>>>>> There is the third option which is that the storage inside the wide int
>>>>> is
>>>>> just ggced storage.  We rejected this because of the functional nature
>>>>> of
>>>>> wide-ints.    There are zillions created, they can be stack allocated,
>>>>> and
>>>>> they last for very short periods of time.
>>>>
>>>> Of course - GCing wide-ints is a non-starter.
>>>>
>>>>>>> As is probably obvious, I don't agree FWIW.  It seems like an
>>>>>>> unnecessary
>>>>>>> complication without any clear use.  Especially since the number of
>>>>>>
>>>>>> Maybe the double_int typedef is without any clear use.  Properly
>>>>>> abstracting from the storage / information providers will save
>>>>>> compile-time, memory and code though.  I don't see that any thought
>>>>>> was spent on how to avoid excessive copying or dealing with
>>>>>> long(er)-lived objects and their storage needs.
>>>>>
>>>>> I actually disagree.    Wide ints can use a bloated amount of storage
>>>>> because they are designed to be very short lived and very low cost
>>>>> objects
>>>>> that are stack allocated.   For long term storage, there is INT_CST at
>>>>> the
>>>>> tree level and CONST_WIDE_INT at the rtl level.  Those use a very
>>>>> compact
>>>>> storage model.   The copying entailed is only a small part of the
>>>>> overall
>>>>> performance.
>>>>
>>>> Well, but both trees and RTXen are not viable for short-lived things
>>>> because
>>>> they are GCed!  double-ints were suitable for this kind of stuff because
>>>> they also have a moderate size.  With wide-ints size becomes a problem
>>>> (or GC, if you instead use trees or RTXen).
>>>>
>>>>> Everything that you are suggesting along these lines is adding to the
>>>>> weight
>>>>> of a wide-int object.
>>>>
>>>> On the contrary - it lessens their weight (with external already
>>>> existing storage)
>>>> or does not do anything to it (with the embedded storage).
>>>>
>>>>>    You have to understand there will be many more
>>>>> wide-ints created in a normal compilation than were ever created with
>>>>> double-int.    This is because the rtl level had no object like this at
>>>>> all
>>>>> and at the tree level, many of the places that should have used double
>>>>> int,
>>>>> short cut the code and only did the transformations if the types fit in
>>>>> a
>>>>> HWI.
>>>>
>>>> Your argument shows that the copy-in/out from tree/RTX to/from wide-int
>>>> will become a very frequent operation and thus it is worth optimizing
>>>> it.
>>>>
>>>>> This is why we are extremely defensive about this issue.   We really
>>>>> did
>>>>> think a lot about it.
>>>>
>>>> I'm sure you did.
>>>>
>>>> Richard.
>>>
>>>
>


* Re: GCC 4.8.0 Status Report (2012-10-29), Stage 1 to end soon
  2012-11-27 10:03                                 ` Richard Biener
@ 2012-11-27 13:03                                   ` Kenneth Zadeck
  0 siblings, 0 replies; 59+ messages in thread
From: Kenneth Zadeck @ 2012-11-27 13:03 UTC (permalink / raw)
  To: Richard Biener; +Cc: Jakub Jelinek, gcc, gcc-patches, rdsandiford, Mike Stump

i will discuss this with mike when he wakes up.    he lives on the west 
pole so that will not be until after you go to bed.

the one point that i will take exception to is the claim that the copying
operation is, in practice, any more expensive in time than the pointer
copy.   I never bother to initialize the storage in the array, i only
copy the elements that are live.    This is almost always 1 hwi
because either most types are small or most constants of large types
compress to 1 hwi.    So even if a compilation does a zillion
::from_trees, you will most likely never see the difference in time.
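
To make the "copy only the live elements" point concrete, here is a small sketch.  It assumes, purely for illustration, that compression means dropping high elements that are just the sign extension of the element below them; canonical_len and copy_live are hypothetical helpers, not functions from the patch.

```cpp
#include <cassert>
#include <cstdint>

typedef int64_t hwi;   // hypothetical stand-in for HOST_WIDE_INT

// Number of elements still needed once redundant sign-extension elements
// at the top are dropped (assumed compression rule, see the text above).
inline unsigned
canonical_len (const hwi *val, unsigned len)
{
  while (len > 1)
    {
      hwi ext = val[len - 2] < 0 ? -1 : 0;   // what sign extension would give
      if (val[len - 1] != ext)
        break;
      len--;
    }
  return len;
}

// Copy only the live elements; the rest of DST is never written.
inline unsigned
copy_live (hwi *dst, const hwi *src, unsigned len)
{
  for (unsigned i = 0; i < len; i++)
    dst[i] = src[i];
  return len;
}
```

Under that rule, a 256-bit constant such as 42 or -7 has one live element, so the copy loop runs once no matter how wide the type is.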

kenny


On 11/27/2012 05:03 AM, Richard Biener wrote:
> On Tue, Nov 27, 2012 at 1:06 AM, Kenneth Zadeck
> <zadeck@naturalbridge.com> wrote:
>> Richard,
>>
>> I spent a good part of the afternoon talking to Mike about this.  He is on
>> the c++ standards committee and is a much more seasoned c++ programmer than
>> I am.
>>
>> He convinced me that with a large amount of engineering and c++
>> "foolishness" that it was indeed possible to get your proposal to POSSIBLY
>> work as well as what we did.
>>
>> But now the question is why would anyone want to do this?
>>
>> At the very least you are talking about instantiating two instances of
>> wide-ints, one for the stack allocated uses and one for the places where we
>> just move a pointer from the tree or the rtx. Then you are talking about
>> creating connectors so that the stack allocated functions can take
>> parameters of the pointer version and vice versa.
>>
>> Then there is the issue that rather than just saying that something is a
>> wide int, the programmer is going to have to track its origin.   In
>> particular, where in the code right now i say
>>
>> wide_int foo = wide_int::from_rtx (r1);
>> wide_int bar = wide_int::from_rtx (r2) + foo;
>>
>> now i would have to say
>>
>> wide_int_ptr foo = wide_int_ptr::from_rtx (r1);
>> wide_int_stack bar = wide_int_ptr::from_rtx (r2) + foo;
> No, you'd say
>
> wide_int foo = wide_int::from_rtx (r1);
>
> and the static, non-templated from_rtx method would automagically
> return (always!) a "wide_int_ptr" kind.  The initialization then would
> use the assignment operator that mediates between wide_int and
> "wide_int_ptr", doing the copying.
>
> The user should get a 'stack' kind by default when specifying wide_int,
> like implemented with
>
> struct wide_int_storage_stack;
> struct wide_int_storage_ptr;
>
> template <class storage = wide_int_storage_stack>
> class wide_int : public storage
> {
> ...
>     static wide_int <wide_int_storage_ptr> from_rtx (rtx);
> };
>
> the whole point of the exercise is to make from_rtx and from_tree avoid
> the copying (and excessive stack space allocation) for the rvalue case
> like in
>
>   wide_int res = wide_int::from_rtx (x) + 1;
>
> if you save the result into a wide_int temporary first then you are lost
> of course (modulo some magic GCC optimization being able to elide
> the copy somehow).
>
> The storage template would also let code like VRP, which keeps a lattice
> of wide_ints, reduce its footprint by using ptr storage and explicit
> allocations (that's a secondary concern, of course), and let VRP specify
> that it needs more than the otherwise needed MAX_INT_MODE_SIZE.
> ptr storage would not have this arbitrary limitation; only embedded
> storage (should) have it.
>
>> then when i want to call some function using a wide_int ref that function
>> now must be either overloaded to take both or i have to choose one of the
>> two instantiations (presumably based on which is going to be more common)
>> and just have the compiler fix up everything (which it is likely to do).
> Nope, they'd be
>
> class wide_int ...
> {
>     template <class storage1, class storage2>
>     wide_int operator+(wide_int <storage1> a, wide_int<storage2> b)
>     {
>        return wide_int::plus_worker (a.precision, a. ...., a.get_storage_ptr (),
>                                      b.precision, ..., b.get_storage_ptr ());
>     }
>
>
>> And so what is the payoff:
>> 1) No one except the c++ elite is going to understand the code. The rest of
>> the community will hate me and curse the ground that i walk on.
> Maybe for the implementation - but look at hash-table and vec ... not for
> usage certainly.
>
>> 2) I will end up with a version of wide-int that can be used as a medium
>> life container (where i define medium life as not allowed to survive a gc
>> since they will contain pointers into rtxes and trees.)
>> 3) And no clients that actually wanted to do this!!    I could use as an
>> example one of your favorite passes, tree-vrp.   The current double-int
>> could have been a medium lifetime container since it has a smaller
>> footprint, but in fact tree-vrp converts those double-ints back into trees
>> for medium storage.   Why?  Because it needs the other fields of a tree-cst
>> to store the entire state.  Wide-ints also "suffer" this problem.  Their
>> only state is the data and the three length fields.   They have no type
>> and none of the other tree info so the most obvious client for a medium
>> lifetime object is really not going to be a good match even if you "solve
>> the storage problem".
>>
>> The fact is that wide-ints are an excellent short term storage class that
>> can be very quickly converted into our two long term storage classes.  Your
>> proposal requires a lot of work, will not be easy to use and as far as i
>> can see has no payoff on the horizon.   It could be that there could be
>> future clients for a medium lifetime value, but asking for this with no
>> clients in hand is really beyond the scope of a reasonable review.
>>
>> I remind you that the purpose of these patches is to solve problems that
>> exist in the current compiler that we have papered over for years.   If
>> someone needs wide-ints in some way that is not foreseen then they can
>> change it.
> The patches introduce a lot more temporary wide-ints (your words) and
> at the same time make construction of them from tree / rtx very expensive,
> both stack space and compile-time wise.  Look at how we for example
> compute TREE_INT_CST + 1: int_cst_binop internally uses double_ints
> for the computation and then instantiates a new tree for holding the result.
> Now we'd use wide_ints for this, requiring totally unnecessary copying.
> Why not try to avoid that in the first place?  And try to avoid making
> wide_ints 4 times as large as really necessary just for the sake of VRP!
> (VRP should have a way to say "_I_ want larger wide_ints", without putting
> this burden on all other users).
>
> Richard.
>
>> kenny
>>
>>
>> On 11/26/2012 11:30 AM, Richard Biener wrote:
>>> On Mon, Nov 26, 2012 at 5:03 PM, Kenneth Zadeck
>>> <zadeck@naturalbridge.com> wrote:
>>>> On 11/26/2012 10:03 AM, Richard Biener wrote:
>>>>> On Mon, Nov 5, 2012 at 2:59 PM, Kenneth Zadeck
>>>>> <zadeck@naturalbridge.com>
>>>>> wrote:
>>>>>> On 11/04/2012 11:54 AM, Richard Biener wrote:
>>>>>>> On Thu, Nov 1, 2012 at 2:10 PM, Richard Sandiford
>>>>>>> <rdsandiford@googlemail.com> wrote:
>>>>>>>> Kenneth Zadeck <zadeck@naturalbridge.com> writes:
>>>>>>>>> I would like you to respond to at least point 1 of this email.   In
>>>>>>>>> it
>>>>>>>>> there is code from the rtl level that was written twice, once for
>>>>>>>>> the
>>>>>>>>> case when the size of the mode is less than the size of a HWI and
>>>>>>>>> once
>>>>>>>>> for the case where the size of the mode is less than 2 HWIs.
>>>>>>>>>
>>>>>>>>> my patch changes this to one instance of the code that works no
>>>>>>>>> matter
>>>>>>>>> how large the data passed to it is.
>>>>>>>>>
>>>>>>>>> you have made a specific requirement for wide int to be a template
>>>>>>>>> that
>>>>>>>>> can be instantiated in several sizes, one for 1 HWI, one for 2 HWI.
>>>>>>>>> I
>>>>>>>>> would like to know how this particular fragment is to be rewritten
>>>>>>>>> in
>>>>>>>>> this model?   It seems that I would have to retain the structure
>>>>>>>>> where
>>>>>>>>> there is one version of the code for each size that the template is
>>>>>>>>> instantiated.
>>>>>>>> I think richi's argument was that wide_int should be split into two.
>>>>>>>> There should be a "bare-metal" class that just has a length and HWIs,
>>>>>>>> and the main wide_int class should be an extension on top of that
>>>>>>>> that does things to a bit precision instead.  Presumably with some
>>>>>>>> template magic so that the length (number of HWIs) is a constant for:
>>>>>>>>
>>>>>>>>       typedef foo<2> double_int;
>>>>>>>>
>>>>>>>> and a variable for wide_int (because in wide_int the length would be
>>>>>>>> the number of significant HWIs rather than the size of the underlying
>>>>>>>> array).  wide_int would also record the precision and apply it after
>>>>>>>> the full HWI operation.
>>>>>>>>
>>>>>>>> So the wide_int class would still provide "as wide as we need"
>>>>>>>> arithmetic,
>>>>>>>> as in your rtl patch.  I don't think he was objecting to that.
>>>>>>> That summarizes one part of my complaints / suggestions correctly.  In
>>>>>>> other
>>>>>>> mails I suggested to not make it a template but a constant over object
>>>>>>> lifetime
>>>>>>> 'bitsize' (or maxlen) field.  Both suggestions likely require more
>>>>>>> thought
>>>>>>> than
>>>>>>> I put into them.  The main reason is that with C++ you can abstract
>>>>>>> from
>>>>>>> where
>>>>>>> wide-int information pieces are stored and thus use the arithmetic /
>>>>>>> operation
>>>>>>> workers without copying the (source) "wide-int" objects.  Thus you
>>>>>>> should
>>>>>>> be able to write adaptors for double-int storage, tree or RTX storage.
>>>>>> We had considered something along these lines and rejected it.   I am
>>>>>> not
>>>>>> really opposed to doing something like this, but it is not an obvious
>>>>>> winning idea and is likely not to be a good idea.   Here was our
>>>>>> thought
>>>>>> process:
>>>>>>
>>>>>> if you abstract away the storage inside a wide int, then you should be
>>>>>> able
>>>>>> to copy a pointer to the block of data from either the rtl level
>>>>>> integer
>>>>>> constant or the tree level one into the wide int.   It is certainly
>>>>>> true
>>>>>> that making a wide_int from one of these is an extremely common
>>>>>> operation
>>>>>> and doing this would avoid those copies.
>>>>>>
>>>>>> However, this causes two problems:
>>>>>> 1)  Mike's first cut at the CONST_WIDE_INT did two ggc allocations to
>>>>>> make
>>>>>> the object.   it created the base object and then it allocated the
>>>>>> array.
>>>>>> Richard S noticed that we could just allocate one CONST_WIDE_INT that
>>>>>> had
>>>>>> the array in it.   Doing it this way saves one ggc allocation and one
>>>>>> indirection when accessing the data within the CONST_WIDE_INT.   Our
>>>>>> plan
>>>>>> is
>>>>>> to use the same trick at the tree level.   So to avoid the copying, you
>>>>>> seem
>>>>>> to have to have a more expensive rep for CONST_WIDE_INT and INT_CST.
>>>>> I did not propose having a pointer to the data in the RTX or tree int.
>>>>> Just
>>>>> the short-lived wide-ints (which are on the stack) would have a pointer
>>>>> to
>>>>> the data - which can then obviously point into the RTX and tree data.
>>>> There is the issue then what if some wide-ints are not short lived. It
>>>> makes
>>>> me nervous to create internal pointers to gc ed memory.
>>> I thought they were all short-lived.
>>>
>>>>>> 2) You are now stuck either ggcing the storage inside a wide_int when
>>>>>> they
>>>>>> are created as part of an expression or you have to play some game to
>>>>>> represent the two different storage plans inside of wide_int.
>>>>> Hm?  wide-ints are short-lived and thus never live across a garbage
>>>>> collection
>>>>> point.  We create non-GCed objects pointing to GCed objects all the time
>>>>> and everywhere this way.
>>>> Again, this makes me nervous but it could be done.  However, it does mean
>>>> that now the wide ints that are not created from rtxes or trees will be
>>>> more
>>>> expensive because they are not going to get their storage "for free",
>>>> they
>>>> are going to alloca it.
>>> No, those would simply use the embedded storage model.
>>>
>>>> however, it still is not clear, given that 99% of the wide ints are going
>>>> to
>>>> fit in a single hwi, that this would be a noticeable win.
>>> Currently even if they fit into a HWI you will still allocate 4 times the
>>> largest integer mode size.  You say that doesn't matter because they
>>> are short-lived, but I say it does matter because not all of them are
>>> short-lived enough.  If 99% fit in a HWI why allocate 4 times the
>>> largest integer mode size in 99% of the cases?
>>>
>>>>>>      Clearly this
>>>>>> is where you think that we should be going by suggesting that we
>>>>>> abstract
>>>>>> away the internal storage.   However, this comes at a price:   what is
>>>>>> currently an array access in my patches would (i believe) become a
>>>>>> function
>>>>>> call.
>>>>> No, the workers (that perform the array accesses) will simply get
>>>>> a pointer to the first data element.  Then whether it's embedded or
>>>>> external is of no interest to them.
>>>> so is your plan that the wide int constructors from rtx or tree would
>>>> just
>>>> copy the pointer to the array on top of the array that is otherwise
>>>> allocated on the stack?    I can easily do this.   But as i said, the
>>>> gain
>>>> seems quite small.
>>>>
>>>> And of course, going the other way still does need the copy.
>>> The proposal was to template wide_int on a storage model, the embedded
>>> one would work as-is (embedding 4 times largest integer mode), the
>>> external one would have a pointer to data.  All functions that return a
>>> wide_int produce a wide_int with the embedded model.  To avoid
>>> the function call penalty you described the storage model provides
>>> a way to get a pointer to the first element and the templated operations
>>> simply dispatch to a worker that takes this pointer to the first element
>>> (as the storage model is designed as a template its abstraction is going
>>> to be optimized away by means of inlining).
>>>
>>> Richard.
>>>
>>>>>>     From a performance point of view, i believe that this is a non
>>>>>> starter. If you can figure out how to design this so that it is not a
>>>>>> function call, i would consider this a viable option.
>>>>>>
>>>>>> On the other side of this you are clearly correct that we are copying
>>>>>> the
>>>>>> data when we are making wide ints from INT_CSTs or CONST_WIDE_INTs.
>>>>>> But
>>>>>> this is why we represent data inside of the wide_ints, the INT_CSTs and
>>>>>> the
>>>>>> CONST_WIDE_INTs in a compressed form.   Even with very big types, which
>>>>>> are
>>>>>> generally rare, the constants themselves are very small.   So the copy
>>>>>> operation is a loop that almost always copies one element, even with
>>>>>> tree-vrp which doubles the sizes of every type.
>>>>>>
>>>>>> There is the third option which is that the storage inside the wide int
>>>>>> is
>>>>>> just ggced storage.  We rejected this because of the functional nature
>>>>>> of
>>>>>> wide-ints.    There are zillions created, they can be stack allocated,
>>>>>> and
>>>>>> they last for very short periods of time.
>>>>> Of course - GCing wide-ints is a non-starter.
>>>>>
>>>>>>>> As is probably obvious, I don't agree FWIW.  It seems like an
>>>>>>>> unnecessary
>>>>>>>> complication without any clear use.  Especially since the number of
>>>>>>> Maybe the double_int typedef is without any clear use.  Properly
>>>>>>> abstracting from the storage / information providers will save
>>>>>>> compile-time, memory and code though.  I don't see that any thought
>>>>>>> was spent on how to avoid excessive copying or dealing with
>>>>>>> long(er)-lived objects and their storage needs.
>>>>>> I actually disagree.    Wide ints can use a bloated amount of storage
>>>>>> because they are designed to be very short lived and very low cost
>>>>>> objects
>>>>>> that are stack allocated.   For long term storage, there is INT_CST at
>>>>>> the
>>>>>> tree level and CONST_WIDE_INT at the rtl level.  Those use a very
>>>>>> compact
>>>>>> storage model.   The copying entailed is only a small part of the
>>>>>> overall
>>>>>> performance.
>>>>> Well, but both trees and RTXen are not viable for short-lived things
>>>>> because
>>>>> they are GCed!  double-ints were suitable for this kind of stuff because
>>>>> they also have a moderate size.  With wide-ints size becomes a problem
>>>>> (or GC, if you instead use trees or RTXen).
>>>>>
>>>>>> Everything that you are suggesting along these lines is adding to the
>>>>>> weight
>>>>>> of a wide-int object.
>>>>> On the contrary - it lessens their weight (with external already
>>>>> existing storage)
>>>>> or does not do anything to it (with the embedded storage).
>>>>>
>>>>>>     You have to understand there will be many more
>>>>>> wide-ints created in a normal compilation than were ever created with
>>>>>> double-int.    This is because the rtl level had no object like this at
>>>>>> all
>>>>>> and at the tree level, many of the places that should have used double
>>>>>> int,
>>>>>> short cut the code and only did the transformations if the types fit in
>>>>>> a
>>>>>> HWI.
>>>>> Your argument shows that the copy-in/out from tree/RTX to/from wide-int
>>>>> will become a very frequent operation and thus it is worth optimizing
>>>>> it.
>>>>>
>>>>>> This is why we are extremely defensive about this issue.   We really
>>>>>> did
>>>>>> think a lot about it.
>>>>> I'm sure you did.
>>>>>
>>>>> Richard.
>>>>


* Re: GCC 4.8.0 Status Report (2012-10-29), Stage 1 to end soon
  2012-10-31 13:54       ` Kenneth Zadeck
  2012-10-31 14:05         ` Jakub Jelinek
@ 2012-10-31 19:13         ` Marc Glisse
  1 sibling, 0 replies; 59+ messages in thread
From: Marc Glisse @ 2012-10-31 19:13 UTC (permalink / raw)
  To: Kenneth Zadeck
  Cc: Richard Biener, Jakub Jelinek, gcc, gcc-patches, rdsandiford

On Wed, 31 Oct 2012, Kenneth Zadeck wrote:

> Richi,
>
> Let me explain to you what a broken api is.   I have spent the last week 
> screwing around with tree-vrp and as of last night i finally got it to work. 
> In tree-vrp, it is clear that double-int is the precise definition of a 
> broken api.
>
> The tree-vrp uses an infinite-precision view of arithmetic. However, that 
> infinite precision is implemented on top of a finite, CARVED IN STONE, base 
> that is and will always be without a patch like this, 128 bits on an x86-64. 
> However, as was pointed out earlier, tree-vrp needs 2 * the size of a type 
> + 1 bit to work correctly.    Until yesterday i did not fully understand the 
> significance of that 1 bit.  what this means is that tree-vrp does not work 
> on an x86-64 with __int128 variables.

I am a bit surprised by that. AFAIK, the wrapping multiplication case is 
the only place that uses quad-sized arithmetic, so that must be what you 
are talking about. But when I wrote that code, I was well aware of the 
need for that extra bit and worked around it using signed / unsigned as an 
extra bit of information. So if you found a bug there, I'd like to know 
(although it becomes moot once the code is replaced with wide_int).
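
To see where the extra bit comes from, consider a scaled-down sketch with
8-bit operands in place of 128-bit ones.  The helper fits_signed below is a
made-up name for illustration and is not taken from tree-vrp.  The largest
unsigned product, 255 * 255 = 65025, does not fit in a signed 16-bit value,
so a single signed representation that covers both the signed and the
unsigned case needs 2 * 8 + 1 = 17 bits.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper (not GCC code): does VALUE fit in a signed
// two's-complement integer that is BITS wide?  Assumes 1 <= BITS <= 63.
static bool fits_signed (int64_t value, unsigned bits)
{
  int64_t lo = -(int64_t (1) << (bits - 1));      // most negative value
  int64_t hi = (int64_t (1) << (bits - 1)) - 1;   // most positive value
  return lo <= value && value <= hi;
}
```

Under these assumptions fits_signed (255 * 255, 16) is false while
fits_signed (255 * 255, 17) is true.  Carrying signedness as separate
information, as described above, is one way to supply that seventeenth bit
without widening the type.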

Note that my original patch for VRP used the GMP library for computations 
(it was rejected as likely too slow), so I think simplifying the thing 
with a multi-precision type is great. And if as you explained you have one 
(large) fixed size used for all temporaries on the stack but never used 
for malloc'ed objects, that sounds good too.

Good luck with the useful wide_int work,

-- 
Marc Glisse


* patch to fix constant math - 5th patch - the main rtl work
  2012-10-31 10:02     ` Richard Sandiford
  2012-10-31 10:13       ` Richard Biener
  2012-10-31 13:54       ` Kenneth Zadeck
@ 2013-02-27 12:39       ` Kenneth Zadeck
  2 siblings, 0 replies; 59+ messages in thread
From: Kenneth Zadeck @ 2013-02-27 12:39 UTC (permalink / raw)
  To: Richard Biener, Jakub Jelinek, gcc, gcc-patches, rdsandiford,
	Ian Lance Taylor

[-- Attachment #1: Type: text/plain, Size: 3491 bytes --]

This patch fixes the rtl level so that the constant math performed is 
independent of the host compiler.
This patch improves the rtl level in two ways:

1) This patch unifies the way that constant math is performed. Without 
this patch, there are a large number of checks to see if a constant fits 
in one or two HOST_WIDE_INTs.   In many cases, transformations were 
not done or done differently depending on the results of the test.   
Now, virtually all constant math at the rtl level uses the wide-int class 
and so there are no host dependent differences in how the math is 
done.    This means that TImode is now better supported on 64 bit hosts 
compiling to 64 bit targets.

2) This patch conditionally introduces a new rtl class, the CONST_WIDE_INT, 
that holds integer constants that do not fit into a CONST_INT.   For 
those targets that define TARGET_SUPPORTS_WIDE_INT, this removes the 
punning of using CONST_DOUBLE to hold both floats and ints that are 
larger than one HOST_WIDE_INT.   If the target defines this, then (at 
least at the rtl level) TImode can be used without ICEing or getting the 
wrong answer on a 32 bit host and it makes it possible for the target to 
use modes larger than TImode.
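
The rtl.texi change in the attached patch says that these constants are
stored in a compressed format, where high order elements that are only the
sign extension of the value are not represented.  A minimal sketch of that
rule follows.  It assumes a 64 bit HOST_WIDE_INT, and canonical_len is a
hypothetical name for illustration, not a function from the patch.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical helper (not a GCC API).  ELTS holds the elements of a
// constant, least significant first.  Return how many elements must be
// kept once the high-order ones that are pure sign extension are dropped.
static std::size_t canonical_len (const std::vector<int64_t> &elts)
{
  std::size_t len = elts.size ();
  while (len > 1)
    {
      int64_t top = elts[len - 1];
      // What the top element would be if it were only sign extension
      // of the element below it.
      int64_t extension = elts[len - 2] < 0 ? -1 : 0;
      if (top != extension)
        break;
      --len;
    }
  return len;
}
```

With these assumptions {5, 0, 0, 0} compresses to one element, while
{-1, 0, 0, 0} (the value 2^64 - 1 in a 256 bit mode) needs two, because the
zero above the all-ones element is not sign extension.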

Note that we already have 2 public platforms that are beginning to make 
use of modes larger than 128 bits.  For instance, the x86-64 can now do 
vector wide shifts which require 256 bit data types.   It would be 
unsurprising to see more vector wide operations in the future.   This 
patch fixes the rtl level so that GCC can support these operations.
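
As a rough illustration of why two HOST_WIDE_INTs are not enough here, the
sketch below builds a 256 bit constant with a single bit set.  It assumes a
64 bit HOST_WIDE_INT.  The function single_bit_constant and its
std::vector representation are stand-ins invented for this example; they
are not the wide_int class from the patch.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical stand-in for a multi-word constant (not GCC's wide_int).
// Returns PRECISION bits, rounded up to whole 64-bit words, with only
// bit BITPOS set.  Word 0 holds the low-order bits.
static std::vector<uint64_t>
single_bit_constant (unsigned bitpos, unsigned precision)
{
  assert (bitpos < precision);
  std::vector<uint64_t> words ((precision + 63) / 64, 0);
  words[bitpos / 64] |= uint64_t (1) << (bitpos % 64);
  return words;
}
```

single_bit_constant (200, 256) yields four words with bit 8 of word 3 set,
a value that a two element representation cannot hold.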

This patch was heavily reviewed by Richard Sandiford before he resigned 
as a reviewer.  It was mostly just waiting on patch 4, on which it 
depends very heavily, to be accepted.

Ok to commit when stage1 opens?

kenny


On 10/31/2012 05:59 AM, Richard Sandiford wrote:
> Richard Biener <richard.guenther@gmail.com> writes:
>> On Tue, Oct 30, 2012 at 10:05 PM, Kenneth Zadeck
>> <zadeck@naturalbridge.com> wrote:
>>> jakub,
>>>
>>> i am hoping to get the rest of my wide integer conversion posted by nov 5.
>>> I am under some adverse conditions here: hurricane sandy hit here pretty
>>> badly.  my house is hooked up to a small generator, and no one has any power
>>> for miles around.
>>>
>>> So far richi has promised to review them.   he has sent some comments, but
>>> so far no reviews.    Some time after i get the first round of them posted,
>>> i will do a second round that incorporates everyone's comments.
>>>
>>> But i would like a little slack here if possible.    While this work is a
>>> show stopper for my private port, the patches address serious problems for
>>> many of the public ports, especially ones that have very flexible vector
>>> units.    I believe that there is a significant set of latent problems
>>> currently with the existing ports that use ti mode that these patches will
>>> fix.
>>>
>>> However, i will do everything in my power to get the first round of the
>>> patches posted by nov 5 deadline.
>> I suppose you are not going to merge your private port for 4.8 and thus
>> the wide-int changes are not a show-stopper for you.
>>
>> That said, I considered the main conversion to be appropriate to be
>> deferred for the next stage1.  There is no advantage in disrupting the
>> tree more at this stage.
> I would like the wide_int class and rtl stuff to go in 4.8 though.
> IMO it's a significant improvement in its own right, and Kenny
> submitted it well before the deadline.
>
> Richard


[-- Attachment #2: p5-3.diff --]
[-- Type: text/x-patch, Size: 135195 bytes --]

diff --git a/gcc/alias.c b/gcc/alias.c
index e18dd34..58e4eac 100644
--- a/gcc/alias.c
+++ b/gcc/alias.c
@@ -1471,9 +1471,7 @@ rtx_equal_for_memref_p (const_rtx x, const_rtx y)
 
     case VALUE:
     CASE_CONST_UNIQUE:
-      /* There's no need to compare the contents of CONST_DOUBLEs or
-	 CONST_INTs because pointer equality is a good enough
-	 comparison for these nodes.  */
+      /* Pointer equality guarantees equality for these nodes.  */
       return 0;
 
     default:
diff --git a/gcc/builtins.c b/gcc/builtins.c
index 68b6a2c..f076cee 100644
--- a/gcc/builtins.c
+++ b/gcc/builtins.c
@@ -669,20 +669,24 @@ c_getstr (tree src)
   return TREE_STRING_POINTER (src) + tree_low_cst (offset_node, 1);
 }
 
-/* Return a CONST_INT or CONST_DOUBLE corresponding to target reading
+/* Return a constant integer corresponding to target reading
    GET_MODE_BITSIZE (MODE) bits from string constant STR.  */
 
 static rtx
 c_readstr (const char *str, enum machine_mode mode)
 {
-  HOST_WIDE_INT c[2];
+  wide_int c;
   HOST_WIDE_INT ch;
   unsigned int i, j;
+  HOST_WIDE_INT tmp[MAX_BITSIZE_MODE_ANY_INT / HOST_BITS_PER_WIDE_INT];
+  unsigned int len = (GET_MODE_PRECISION (mode) + HOST_BITS_PER_WIDE_INT - 1)
+    / HOST_BITS_PER_WIDE_INT;
+
+  for (i = 0; i < len; i++)
+    tmp[i] = 0;
 
   gcc_assert (GET_MODE_CLASS (mode) == MODE_INT);
 
-  c[0] = 0;
-  c[1] = 0;
   ch = 1;
   for (i = 0; i < GET_MODE_SIZE (mode); i++)
     {
@@ -693,13 +697,14 @@ c_readstr (const char *str, enum machine_mode mode)
 	  && GET_MODE_SIZE (mode) >= UNITS_PER_WORD)
 	j = j + UNITS_PER_WORD - 2 * (j % UNITS_PER_WORD) - 1;
       j *= BITS_PER_UNIT;
-      gcc_assert (j < HOST_BITS_PER_DOUBLE_INT);
 
       if (ch)
 	ch = (unsigned char) str[i];
-      c[j / HOST_BITS_PER_WIDE_INT] |= ch << (j % HOST_BITS_PER_WIDE_INT);
+      tmp[j / HOST_BITS_PER_WIDE_INT] |= ch << (j % HOST_BITS_PER_WIDE_INT);
     }
-  return immed_double_const (c[0], c[1], mode);
+  
+  c = wide_int::from_array (tmp, len, mode);
+  return immed_wide_int_const (c, mode);
 }
 
 /* Cast a target constant CST to target CHAR and if that value fits into
@@ -4991,12 +4996,12 @@ expand_builtin_signbit (tree exp, rtx target)
 
   if (bitpos < GET_MODE_BITSIZE (rmode))
     {
-      double_int mask = double_int_zero.set_bit (bitpos);
+      wide_int mask = wide_int::set_bit_in_zero (bitpos, rmode);
 
       if (GET_MODE_SIZE (imode) > GET_MODE_SIZE (rmode))
 	temp = gen_lowpart (rmode, temp);
       temp = expand_binop (rmode, and_optab, temp,
-			   immed_double_int_const (mask, rmode),
+			   immed_wide_int_const (mask, rmode),
 			   NULL_RTX, 1, OPTAB_LIB_WIDEN);
     }
   else
diff --git a/gcc/combine.c b/gcc/combine.c
index 98ca4a8..7dd29b8 100644
--- a/gcc/combine.c
+++ b/gcc/combine.c
@@ -2669,23 +2669,15 @@ try_combine (rtx i3, rtx i2, rtx i1, rtx i0, int *new_direct_jump_p,
 	    offset = -1;
 	}
 
-      if (offset >= 0
-	  && (GET_MODE_PRECISION (GET_MODE (SET_DEST (temp)))
-	      <= HOST_BITS_PER_DOUBLE_INT))
+      if (offset >= 0)
 	{
-	  double_int m, o, i;
+	  wide_int o;
 	  rtx inner = SET_SRC (PATTERN (i3));
 	  rtx outer = SET_SRC (temp);
-
-	  o = rtx_to_double_int (outer);
-	  i = rtx_to_double_int (inner);
-
-	  m = double_int::mask (width);
-	  i &= m;
-	  m = m.llshift (offset, HOST_BITS_PER_DOUBLE_INT);
-	  i = i.llshift (offset, HOST_BITS_PER_DOUBLE_INT);
-	  o = o.and_not (m) | i;
-
+	  
+	  o = (wide_int::from_rtx (outer, GET_MODE (SET_DEST (temp)))
+	       .insert (wide_int::from_rtx (inner, GET_MODE (dest)),
+			offset, width));
 	  combine_merges++;
 	  subst_insn = i3;
 	  subst_low_luid = DF_INSN_LUID (i2);
@@ -2696,8 +2688,8 @@ try_combine (rtx i3, rtx i2, rtx i1, rtx i0, int *new_direct_jump_p,
 	  /* Replace the source in I2 with the new constant and make the
 	     resulting insn the new pattern for I3.  Then skip to where we
 	     validate the pattern.  Everything was set up above.  */
-	  SUBST (SET_SRC (temp),
-		 immed_double_int_const (o, GET_MODE (SET_DEST (temp))));
+	  SUBST (SET_SRC (temp), 
+		 immed_wide_int_const (o, GET_MODE (SET_DEST (temp))));
 
 	  newpat = PATTERN (i2);
 
@@ -5113,7 +5105,7 @@ subst (rtx x, rtx from, rtx to, int in_dest, int in_cond, int unique_copy)
 		  if (! x)
 		    x = gen_rtx_CLOBBER (mode, const0_rtx);
 		}
-	      else if (CONST_INT_P (new_rtx)
+	      else if (CONST_SCALAR_INT_P (new_rtx)
 		       && GET_CODE (x) == ZERO_EXTEND)
 		{
 		  x = simplify_unary_operation (ZERO_EXTEND, GET_MODE (x),
diff --git a/gcc/coretypes.h b/gcc/coretypes.h
index 320b4dd..3ea8920 100644
--- a/gcc/coretypes.h
+++ b/gcc/coretypes.h
@@ -55,6 +55,9 @@ typedef const struct rtx_def *const_rtx;
 struct rtvec_def;
 typedef struct rtvec_def *rtvec;
 typedef const struct rtvec_def *const_rtvec;
+struct hwivec_def;
+typedef struct hwivec_def *hwivec;
+typedef const struct hwivec_def *const_hwivec;
 union tree_node;
 typedef union tree_node *tree;
 typedef const union tree_node *const_tree;
diff --git a/gcc/cse.c b/gcc/cse.c
index b200fef..db57f33 100644
--- a/gcc/cse.c
+++ b/gcc/cse.c
@@ -2331,15 +2331,23 @@ hash_rtx_cb (const_rtx x, enum machine_mode mode,
                + (unsigned int) INTVAL (x));
       return hash;
 
+    case CONST_WIDE_INT:
+      {
+	int i;
+	for (i = 0; i < CONST_WIDE_INT_NUNITS (x); i++)
+	  hash += CONST_WIDE_INT_ELT (x, i);
+      }
+      return hash;
+
     case CONST_DOUBLE:
       /* This is like the general case, except that it only counts
 	 the integers representing the constant.  */
       hash += (unsigned int) code + (unsigned int) GET_MODE (x);
-      if (GET_MODE (x) != VOIDmode)
-	hash += real_hash (CONST_DOUBLE_REAL_VALUE (x));
-      else
+      if (TARGET_SUPPORTS_WIDE_INT == 0 && GET_MODE (x) == VOIDmode)
 	hash += ((unsigned int) CONST_DOUBLE_LOW (x)
 		 + (unsigned int) CONST_DOUBLE_HIGH (x));
+      else
+	hash += real_hash (CONST_DOUBLE_REAL_VALUE (x));
       return hash;
 
     case CONST_FIXED:
@@ -3756,6 +3764,7 @@ equiv_constant (rtx x)
 
       /* See if we previously assigned a constant value to this SUBREG.  */
       if ((new_rtx = lookup_as_function (x, CONST_INT)) != 0
+	  || (new_rtx = lookup_as_function (x, CONST_WIDE_INT)) != 0
           || (new_rtx = lookup_as_function (x, CONST_DOUBLE)) != 0
           || (new_rtx = lookup_as_function (x, CONST_FIXED)) != 0)
         return new_rtx;
diff --git a/gcc/cselib.c b/gcc/cselib.c
index dcad9741..3f7c156 100644
--- a/gcc/cselib.c
+++ b/gcc/cselib.c
@@ -923,8 +923,7 @@ rtx_equal_for_cselib_1 (rtx x, rtx y, enum machine_mode memmode)
   /* These won't be handled correctly by the code below.  */
   switch (GET_CODE (x))
     {
-    case CONST_DOUBLE:
-    case CONST_FIXED:
+    CASE_CONST_UNIQUE:
     case DEBUG_EXPR:
       return 0;
 
@@ -1118,15 +1117,23 @@ cselib_hash_rtx (rtx x, int create, enum machine_mode memmode)
       hash += ((unsigned) CONST_INT << 7) + INTVAL (x);
       return hash ? hash : (unsigned int) CONST_INT;
 
+    case CONST_WIDE_INT:
+      {
+	int i;
+	for (i = 0; i < CONST_WIDE_INT_NUNITS (x); i++)
+	  hash += CONST_WIDE_INT_ELT (x, i);
+      }
+      return hash;
+
     case CONST_DOUBLE:
       /* This is like the general case, except that it only counts
 	 the integers representing the constant.  */
       hash += (unsigned) code + (unsigned) GET_MODE (x);
-      if (GET_MODE (x) != VOIDmode)
-	hash += real_hash (CONST_DOUBLE_REAL_VALUE (x));
-      else
+      if (TARGET_SUPPORTS_WIDE_INT == 0 && GET_MODE (x) == VOIDmode)
 	hash += ((unsigned) CONST_DOUBLE_LOW (x)
 		 + (unsigned) CONST_DOUBLE_HIGH (x));
+      else
+	hash += real_hash (CONST_DOUBLE_REAL_VALUE (x));
       return hash ? hash : (unsigned int) CONST_DOUBLE;
 
     case CONST_FIXED:
diff --git a/gcc/defaults.h b/gcc/defaults.h
index 4f43f6f0..0801073 100644
--- a/gcc/defaults.h
+++ b/gcc/defaults.h
@@ -1404,6 +1404,14 @@ see the files COPYING3 and COPYING.RUNTIME respectively.  If not, see
 #define SWITCHABLE_TARGET 0
 #endif
 
+/* If the target supports integers that are wider than two
+   HOST_WIDE_INTs on the host compiler, then the target should define
+   TARGET_SUPPORTS_WIDE_INT and make the appropriate fixups.
+   Otherwise the compiler really is not robust.  */
+#ifndef TARGET_SUPPORTS_WIDE_INT
+#define TARGET_SUPPORTS_WIDE_INT 0
+#endif
+
 #endif /* GCC_INSN_FLAGS_H  */
 
 #endif  /* ! GCC_DEFAULTS_H */
diff --git a/gcc/doc/rtl.texi b/gcc/doc/rtl.texi
index 095a642..a4e4381 100644
--- a/gcc/doc/rtl.texi
+++ b/gcc/doc/rtl.texi
@@ -1531,17 +1531,22 @@ Similarly, there is only one object for the integer whose value is
 
 @findex const_double
 @item (const_double:@var{m} @var{i0} @var{i1} @dots{})
-Represents either a floating-point constant of mode @var{m} or an
-integer constant too large to fit into @code{HOST_BITS_PER_WIDE_INT}
-bits but small enough to fit within twice that number of bits (GCC
-does not provide a mechanism to represent even larger constants).  In
-the latter case, @var{m} will be @code{VOIDmode}.  For integral values
-constants for modes with more bits than twice the number in
-@code{HOST_WIDE_INT} the implied high order bits of that constant are
-copies of the top bit of @code{CONST_DOUBLE_HIGH}.  Note however that
-integral values are neither inherently signed nor inherently unsigned;
-where necessary, signedness is determined by the rtl operation
-instead.
+This represents either a floating-point constant of mode @var{m} or
+(on older ports that do not define
+@code{TARGET_SUPPORTS_WIDE_INT}) an integer constant too large to fit
+into @code{HOST_BITS_PER_WIDE_INT} bits but small enough to fit within
+twice that number of bits (GCC does not provide a mechanism to
+represent even larger constants).  In the latter case, @var{m} will be
+@code{VOIDmode}.  For integral values constants for modes with more
+bits than twice the number in @code{HOST_WIDE_INT} the implied high
+order bits of that constant are copies of the top bit of
+@code{CONST_DOUBLE_HIGH}.  Note however that integral values are
+neither inherently signed nor inherently unsigned; where necessary,
+signedness is determined by the rtl operation instead.
+
+On more modern ports, @code{CONST_DOUBLE} only represents floating
+point values.  New ports define @code{TARGET_SUPPORTS_WIDE_INT} to
+make this designation.
 
 @findex CONST_DOUBLE_LOW
 If @var{m} is @code{VOIDmode}, the bits of the value are stored in
@@ -1556,6 +1561,37 @@ machine's or host machine's floating point format.  To convert them to
 the precise bit pattern used by the target machine, use the macro
 @code{REAL_VALUE_TO_TARGET_DOUBLE} and friends (@pxref{Data Output}).
 
+@findex CONST_WIDE_INT
+@item (const_wide_int:@var{m} @var{nunits} @var{elt0} @dots{})
+This contains an array of @code{HOST_WIDE_INT}s that is large enough
+to hold any constant that can be represented on the target.  This form
+of rtl is only used on targets that define
+@code{TARGET_SUPPORTS_WIDE_INT} to be non zero and then
+@code{CONST_DOUBLE}s are only used to hold floating point values.  If
+the target leaves @code{TARGET_SUPPORTS_WIDE_INT} defined as 0,
+@code{CONST_WIDE_INT}s are not used and @code{CONST_DOUBLE}s are as
+they were before.
+
+The values are stored in a compressed format.   The higher order
+0s or -1s are not represented if they are just the logical sign
+extension of the number that is represented.   
+
+@findex CONST_WIDE_INT_VEC
+@item CONST_WIDE_INT_VEC (@var{code})
+Returns the entire array of @code{HOST_WIDE_INT}s that are used to
+store the value.   This macro should be rarely used.
+
+@findex CONST_WIDE_INT_NUNITS
+@item CONST_WIDE_INT_NUNITS (@var{code})
+The number of @code{HOST_WIDE_INT}s used to represent the number.
+Note that this will generally be smaller than the number of
+@code{HOST_WIDE_INT}s implied by the mode size.
+
+@findex CONST_WIDE_INT_ELT
+@item CONST_WIDE_INT_ELT (@var{code},@var{i})
+Returns the @code{i}th element of the array.   Element 0 contains
+the low order bits of the constant.
+
 @findex const_fixed
 @item (const_fixed:@var{m} @dots{})
 Represents a fixed-point constant of mode @var{m}.
diff --git a/gcc/doc/tm.texi b/gcc/doc/tm.texi
index ce2b44d..dc123c9 100644
--- a/gcc/doc/tm.texi
+++ b/gcc/doc/tm.texi
@@ -11341,3 +11341,48 @@ memory model bits are allowed.
 @deftypevr {Target Hook} {unsigned char} TARGET_ATOMIC_TEST_AND_SET_TRUEVAL
 This value should be set if the result written by @code{atomic_test_and_set} is not exactly 1, i.e. the @code{bool} @code{true}.
 @end deftypevr
+@defmac TARGET_SUPPORTS_WIDE_INT
+
+On older ports, large integers are stored in @code{CONST_DOUBLE} rtl
+objects.  Newer ports define @code{TARGET_SUPPORTS_WIDE_INT} to be non
+zero to indicate that large integers are stored in
+@code{CONST_WIDE_INT} rtl objects.  The @code{CONST_WIDE_INT} allows
+very large integer constants to be represented.  @code{CONST_DOUBLE}
+are limited to twice the size of host's @code{HOST_WIDE_INT}
+representation.
+
+Converting a port mostly requires looking for the places where
+@code{CONST_DOUBLES} are used with @code{VOIDmode} and replacing that
+code with code that accesses @code{CONST_WIDE_INT}s.  @samp{"grep -i
+const_double"} at the port level gets you to 95% of the changes that
+need to be made.  There are a few places that require a deeper look.
+
+@itemize @bullet
+@item
+There is no equivalent to @code{hval} and @code{lval} for
+@code{CONST_WIDE_INT}s.  This would be difficult to express in the md
+language since there are a variable number of elements.
+
+Most ports only check that @code{hval} is either 0 or -1 to see if the
+value is small.  As mentioned above, this will no longer be necessary
+since small constants are always @code{CONST_INT}.  Of course there
+are still a few exceptions; the alpha's constraint used by the zap
+instruction certainly requires careful examination by C code.
+However, all the current code does is pass the hval and lval to C
+code, so evolving the C code to look at the @code{CONST_WIDE_INT} is
+not really a large change.
+
+@item
+Because there is no standard template that ports use to materialize
+constants, there is likely to be some futzing that is unique to each
+port in this code.
+
+@item
+The rtx costs may have to be adjusted to properly account for larger
+constants that are represented as @code{CONST_WIDE_INT}.
+@end itemize
+
+All in all it does not take long to convert ports that the
+maintainer is familiar with.
+
+@end defmac
diff --git a/gcc/doc/tm.texi.in b/gcc/doc/tm.texi.in
index d6e7ce7..6345fcb 100644
--- a/gcc/doc/tm.texi.in
+++ b/gcc/doc/tm.texi.in
@@ -11177,3 +11177,48 @@ memory model bits are allowed.
 @end deftypefn
 
 @hook TARGET_ATOMIC_TEST_AND_SET_TRUEVAL
+@defmac TARGET_SUPPORTS_WIDE_INT
+
+On older ports, large integers are stored in @code{CONST_DOUBLE} rtl
+objects.  Newer ports define @code{TARGET_SUPPORTS_WIDE_INT} to be non
+zero to indicate that large integers are stored in
+@code{CONST_WIDE_INT} rtl objects.  The @code{CONST_WIDE_INT} allows
+very large integer constants to be represented.  @code{CONST_DOUBLE}
+are limited to twice the size of host's @code{HOST_WIDE_INT}
+representation.
+
+Converting a port mostly requires looking for the places where
+@code{CONST_DOUBLES} are used with @code{VOIDmode} and replacing that
+code with code that accesses @code{CONST_WIDE_INT}s.  @samp{"grep -i
+const_double"} at the port level gets you to 95% of the changes that
+need to be made.  There are a few places that require a deeper look.
+
+@itemize @bullet
+@item
+There is no equivalent to @code{hval} and @code{lval} for
+@code{CONST_WIDE_INT}s.  This would be difficult to express in the md
+language since there are a variable number of elements.
+
+Most ports only check that @code{hval} is either 0 or -1 to see if the
+value is small.  As mentioned above, this will no longer be necessary
+since small constants are always @code{CONST_INT}.  Of course there
+are still a few exceptions; the alpha's constraint used by the zap
+instruction certainly requires careful examination by C code.
+However, all the current code does is pass the hval and lval to C
+code, so evolving the C code to look at the @code{CONST_WIDE_INT} is
+not really a large change.
+
+@item
+Because there is no standard template that ports use to materialize
+constants, there is likely to be some futzing that is unique to each
+port in this code.
+
+@item
+The rtx costs may have to be adjusted to properly account for larger
+constants that are represented as @code{CONST_WIDE_INT}.
+@end itemize
+
+All in all it does not take long to convert ports that the
+maintainer is familiar with.
+
+@end defmac
diff --git a/gcc/dojump.c b/gcc/dojump.c
index 3f04eac..ecbec40 100644
--- a/gcc/dojump.c
+++ b/gcc/dojump.c
@@ -142,6 +142,7 @@ static bool
 prefer_and_bit_test (enum machine_mode mode, int bitnum)
 {
   bool speed_p;
+  wide_int mask = wide_int::set_bit_in_zero (bitnum, mode);
 
   if (and_test == 0)
     {
@@ -162,8 +163,7 @@ prefer_and_bit_test (enum machine_mode mode, int bitnum)
     }
 
   /* Fill in the integers.  */
-  XEXP (and_test, 1)
-    = immed_double_int_const (double_int_zero.set_bit (bitnum), mode);
+  XEXP (and_test, 1) = immed_wide_int_const (mask, mode);
   XEXP (XEXP (shift_test, 0), 1) = GEN_INT (bitnum);
 
   speed_p = optimize_insn_for_speed_p ();
diff --git a/gcc/dwarf2out.c b/gcc/dwarf2out.c
index 4e75407..40836ce 100644
--- a/gcc/dwarf2out.c
+++ b/gcc/dwarf2out.c
@@ -323,6 +323,17 @@ dump_struct_debug (tree type, enum debug_info_usage usage,
 
 #endif
 
+
+/* Get the number of host wide ints needed to represent the precision
+   of the number.  */
+
+static unsigned int
+get_full_len (const wide_int &op)
+{
+  return ((op.get_precision () + HOST_BITS_PER_WIDE_INT - 1)
+	  / HOST_BITS_PER_WIDE_INT);
+}
+
 static bool
 should_emit_struct_debug (tree type, enum debug_info_usage usage)
 {
@@ -1354,6 +1365,9 @@ dw_val_equal_p (dw_val_node *a, dw_val_node *b)
       return (a->v.val_double.high == b->v.val_double.high
 	      && a->v.val_double.low == b->v.val_double.low);
 
+    case dw_val_class_wide_int:
+      return a->v.val_wide == b->v.val_wide;
+
     case dw_val_class_vec:
       {
 	size_t a_len = a->v.val_vec.elt_size * a->v.val_vec.length;
@@ -1610,6 +1624,10 @@ size_of_loc_descr (dw_loc_descr_ref loc)
 	  case dw_val_class_const_double:
 	    size += HOST_BITS_PER_DOUBLE_INT / BITS_PER_UNIT;
 	    break;
+	  case dw_val_class_wide_int:
+	    size += (get_full_len (loc->dw_loc_oprnd2.v.val_wide)
+		     * HOST_BITS_PER_WIDE_INT / BITS_PER_UNIT);
+	    break;
 	  default:
 	    gcc_unreachable ();
 	  }
@@ -1787,6 +1805,20 @@ output_loc_operands (dw_loc_descr_ref loc, int for_eh_or_skip)
 				 second, NULL);
 	  }
 	  break;
+	case dw_val_class_wide_int:
+	  {
+	    int i;
+	    int len = get_full_len (val2->v.val_wide);
+	    if (WORDS_BIG_ENDIAN)
+	      for (i = len - 1; i >= 0; --i)
+		dw2_asm_output_data (HOST_BITS_PER_WIDE_INT / HOST_BITS_PER_CHAR,
+				     val2->v.val_wide.elt (i), NULL);
+	    else
+	      for (i = 0; i < len; ++i)
+		dw2_asm_output_data (HOST_BITS_PER_WIDE_INT / HOST_BITS_PER_CHAR,
+				     val2->v.val_wide.elt (i), NULL);
+	  }
+	  break;
 	case dw_val_class_addr:
 	  gcc_assert (val1->v.val_unsigned == DWARF2_ADDR_SIZE);
 	  dw2_asm_output_addr_rtx (DWARF2_ADDR_SIZE, val2->v.val_addr, NULL);
@@ -1996,6 +2028,21 @@ output_loc_operands (dw_loc_descr_ref loc, int for_eh_or_skip)
 	      dw2_asm_output_data (l, second, NULL);
 	    }
 	    break;
+	  case dw_val_class_wide_int:
+	    {
+	      int i;
+	      int len = get_full_len (val2->v.val_wide);
+	      l = HOST_BITS_PER_WIDE_INT / HOST_BITS_PER_CHAR;
+
+	      dw2_asm_output_data (1, len * l, NULL);
+	      if (WORDS_BIG_ENDIAN)
+		for (i = len - 1; i >= 0; --i)
+		  dw2_asm_output_data (l, val2->v.val_wide.elt (i), NULL);
+	      else
+		for (i = 0; i < len; ++i)
+		  dw2_asm_output_data (l, val2->v.val_wide.elt (i), NULL);
+	    }
+	    break;
 	  default:
 	    gcc_unreachable ();
 	  }
@@ -3095,7 +3142,7 @@ static void add_AT_location_description	(dw_die_ref, enum dwarf_attribute,
 static void add_data_member_location_attribute (dw_die_ref, tree);
 static bool add_const_value_attribute (dw_die_ref, rtx);
 static void insert_int (HOST_WIDE_INT, unsigned, unsigned char *);
-static void insert_double (double_int, unsigned char *);
+static void insert_wide_int (const wide_int &, unsigned char *);
 static void insert_float (const_rtx, unsigned char *);
 static rtx rtl_for_decl_location (tree);
 static bool add_location_or_const_value_attribute (dw_die_ref, tree, bool,
@@ -3720,6 +3767,20 @@ AT_unsigned (dw_attr_ref a)
 /* Add an unsigned double integer attribute value to a DIE.  */
 
 static inline void
+add_AT_wide (dw_die_ref die, enum dwarf_attribute attr_kind,
+	     wide_int w)
+{
+  dw_attr_node attr;
+
+  attr.dw_attr = attr_kind;
+  attr.dw_attr_val.val_class = dw_val_class_wide_int;
+  attr.dw_attr_val.v.val_wide = w;
+  add_dwarf_attr (die, &attr);
+}
+
+/* Add an unsigned double integer attribute value to a DIE.  */
+
+static inline void
 add_AT_double (dw_die_ref die, enum dwarf_attribute attr_kind,
 	       HOST_WIDE_INT high, unsigned HOST_WIDE_INT low)
 {
@@ -5273,6 +5334,19 @@ print_die (dw_die_ref die, FILE *outfile)
 		   a->dw_attr_val.v.val_double.high,
 		   a->dw_attr_val.v.val_double.low);
 	  break;
+	case dw_val_class_wide_int:
+	  {
+	    int i = a->dw_attr_val.v.val_wide.get_len ();
+	    fprintf (outfile, "constant (");
+	    gcc_assert (i > 0);
+	    if (a->dw_attr_val.v.val_wide.elt (i - 1) == 0)
+	      fprintf (outfile, "0x");
+	    fprintf (outfile, HOST_WIDE_INT_PRINT_HEX, a->dw_attr_val.v.val_wide.elt (--i));
+	    while (-- i >= 0)
+	      fprintf (outfile, HOST_WIDE_INT_PRINT_PADDED_HEX, a->dw_attr_val.v.val_wide.elt (i));
+	    fprintf (outfile, ")");
+	    break;
+	  }
 	case dw_val_class_vec:
 	  fprintf (outfile, "floating-point or vector constant");
 	  break;
@@ -5428,6 +5502,9 @@ attr_checksum (dw_attr_ref at, struct md5_ctx *ctx, int *mark)
     case dw_val_class_const_double:
       CHECKSUM (at->dw_attr_val.v.val_double);
       break;
+    case dw_val_class_wide_int:
+      CHECKSUM (at->dw_attr_val.v.val_wide);
+      break;
     case dw_val_class_vec:
       CHECKSUM (at->dw_attr_val.v.val_vec);
       break;
@@ -5698,6 +5775,12 @@ attr_checksum_ordered (enum dwarf_tag tag, dw_attr_ref at,
       CHECKSUM (at->dw_attr_val.v.val_double);
       break;
 
+    case dw_val_class_wide_int:
+      CHECKSUM_ULEB128 (DW_FORM_block);
+      CHECKSUM_ULEB128 (sizeof (at->dw_attr_val.v.val_wide));
+      CHECKSUM (at->dw_attr_val.v.val_wide);
+      break;
+
     case dw_val_class_vec:
       CHECKSUM_ULEB128 (DW_FORM_block);
       CHECKSUM_ULEB128 (sizeof (at->dw_attr_val.v.val_vec));
@@ -6162,6 +6245,8 @@ same_dw_val_p (const dw_val_node *v1, const dw_val_node *v2, int *mark)
     case dw_val_class_const_double:
       return v1->v.val_double.high == v2->v.val_double.high
 	     && v1->v.val_double.low == v2->v.val_double.low;
+    case dw_val_class_wide_int:
+      return v1->v.val_wide == v2->v.val_wide;
     case dw_val_class_vec:
       if (v1->v.val_vec.length != v2->v.val_vec.length
 	  || v1->v.val_vec.elt_size != v2->v.val_vec.elt_size)
@@ -7624,6 +7709,13 @@ size_of_die (dw_die_ref die)
 	  if (HOST_BITS_PER_WIDE_INT >= 64)
 	    size++; /* block */
 	  break;
+	case dw_val_class_wide_int:
+	  size += (get_full_len (a->dw_attr_val.v.val_wide)
+		   * HOST_BITS_PER_WIDE_INT / HOST_BITS_PER_CHAR);
+	  if (get_full_len (a->dw_attr_val.v.val_wide) * HOST_BITS_PER_WIDE_INT
+	      > 64)
+	    size++; /* block */
+	  break;
 	case dw_val_class_vec:
 	  size += constant_size (a->dw_attr_val.v.val_vec.length
 				 * a->dw_attr_val.v.val_vec.elt_size)
@@ -7960,6 +8052,20 @@ value_format (dw_attr_ref a)
 	default:
 	  return DW_FORM_block1;
 	}
+    case dw_val_class_wide_int:
+      switch (get_full_len (a->dw_attr_val.v.val_wide) * HOST_BITS_PER_WIDE_INT)
+	{
+	case 8:
+	  return DW_FORM_data1;
+	case 16:
+	  return DW_FORM_data2;
+	case 32:
+	  return DW_FORM_data4;
+	case 64:
+	  return DW_FORM_data8;
+	default:
+	  return DW_FORM_block1;
+	}
     case dw_val_class_vec:
       switch (constant_size (a->dw_attr_val.v.val_vec.length
 			     * a->dw_attr_val.v.val_vec.elt_size))
@@ -8399,6 +8505,32 @@ output_die (dw_die_ref die)
 	  }
 	  break;
 
+	case dw_val_class_wide_int:
+	  {
+	    int i;
+	    int len = get_full_len (a->dw_attr_val.v.val_wide);
+	    int l = HOST_BITS_PER_WIDE_INT / HOST_BITS_PER_CHAR;
+	    if (len * HOST_BITS_PER_WIDE_INT > 64)
+	      dw2_asm_output_data (1, get_full_len (a->dw_attr_val.v.val_wide) * l,
+				   NULL);
+
+	    if (WORDS_BIG_ENDIAN)
+	      for (i = len - 1; i >= 0; --i)
+		{
+		  dw2_asm_output_data (l, a->dw_attr_val.v.val_wide.elt (i),
+				       name);
+		  name = NULL;
+		}
+	    else
+	      for (i = 0; i < len; ++i)
+		{
+		  dw2_asm_output_data (l, a->dw_attr_val.v.val_wide.elt (i),
+				       name);
+		  name = NULL;
+		}
+	  }
+	  break;
+
 	case dw_val_class_vec:
 	  {
 	    unsigned int elt_size = a->dw_attr_val.v.val_vec.elt_size;
@@ -11524,9 +11656,8 @@ clz_loc_descriptor (rtx rtl, enum machine_mode mode,
     msb = GEN_INT ((unsigned HOST_WIDE_INT) 1
 		   << (GET_MODE_BITSIZE (mode) - 1));
   else
-    msb = immed_double_const (0, (unsigned HOST_WIDE_INT) 1
-				  << (GET_MODE_BITSIZE (mode)
-				      - HOST_BITS_PER_WIDE_INT - 1), mode);
+    msb = immed_wide_int_const
+      (wide_int::set_bit_in_zero (GET_MODE_PRECISION (mode) - 1, mode), mode);
   if (GET_CODE (msb) == CONST_INT && INTVAL (msb) < 0)
     tmp = new_loc_descr (HOST_BITS_PER_WIDE_INT == 32
 			 ? DW_OP_const4u : HOST_BITS_PER_WIDE_INT == 64
@@ -12467,7 +12598,16 @@ mem_loc_descriptor (rtx rtl, enum machine_mode mode,
 	  mem_loc_result->dw_loc_oprnd1.val_class = dw_val_class_die_ref;
 	  mem_loc_result->dw_loc_oprnd1.v.val_die_ref.die = type_die;
 	  mem_loc_result->dw_loc_oprnd1.v.val_die_ref.external = 0;
-	  if (SCALAR_FLOAT_MODE_P (mode))
+#if TARGET_SUPPORTS_WIDE_INT == 0
+	  if (!SCALAR_FLOAT_MODE_P (mode))
+	    {
+	      mem_loc_result->dw_loc_oprnd2.val_class
+		= dw_val_class_const_double;
+	      mem_loc_result->dw_loc_oprnd2.v.val_double
+		= rtx_to_double_int (rtl);
+	    }
+	  else
+#endif
 	    {
 	      unsigned int length = GET_MODE_SIZE (mode);
 	      unsigned char *array
@@ -12479,13 +12619,26 @@ mem_loc_descriptor (rtx rtl, enum machine_mode mode,
 	      mem_loc_result->dw_loc_oprnd2.v.val_vec.elt_size = 4;
 	      mem_loc_result->dw_loc_oprnd2.v.val_vec.array = array;
 	    }
-	  else
-	    {
-	      mem_loc_result->dw_loc_oprnd2.val_class
-		= dw_val_class_const_double;
-	      mem_loc_result->dw_loc_oprnd2.v.val_double
-		= rtx_to_double_int (rtl);
-	    }
+	}
+      break;
+
+    case CONST_WIDE_INT:
+      if (!dwarf_strict)
+	{
+	  dw_die_ref type_die;
+
+	  type_die = base_type_for_mode (mode,
+					 GET_MODE_CLASS (mode) == MODE_INT);
+	  if (type_die == NULL)
+	    return NULL;
+	  mem_loc_result = new_loc_descr (DW_OP_GNU_const_type, 0, 0);
+	  mem_loc_result->dw_loc_oprnd1.val_class = dw_val_class_die_ref;
+	  mem_loc_result->dw_loc_oprnd1.v.val_die_ref.die = type_die;
+	  mem_loc_result->dw_loc_oprnd1.v.val_die_ref.external = 0;
+	  mem_loc_result->dw_loc_oprnd2.val_class
+	    = dw_val_class_wide_int;
+	  mem_loc_result->dw_loc_oprnd2.v.val_wide
+	    = wide_int::from_rtx (rtl, mode);
 	}
       break;
 
@@ -12956,7 +13109,15 @@ loc_descriptor (rtx rtl, enum machine_mode mode,
 	     adequately represented.  We output CONST_DOUBLEs as blocks.  */
 	  loc_result = new_loc_descr (DW_OP_implicit_value,
 				      GET_MODE_SIZE (mode), 0);
-	  if (SCALAR_FLOAT_MODE_P (mode))
+#if TARGET_SUPPORTS_WIDE_INT == 0
+	  if (!SCALAR_FLOAT_MODE_P (mode))
+	    {
+	      loc_result->dw_loc_oprnd2.val_class = dw_val_class_const_double;
+	      loc_result->dw_loc_oprnd2.v.val_double
+	        = rtx_to_double_int (rtl);
+	    }
+	  else
+#endif
 	    {
 	      unsigned int length = GET_MODE_SIZE (mode);
 	      unsigned char *array
@@ -12968,12 +13129,26 @@ loc_descriptor (rtx rtl, enum machine_mode mode,
 	      loc_result->dw_loc_oprnd2.v.val_vec.elt_size = 4;
 	      loc_result->dw_loc_oprnd2.v.val_vec.array = array;
 	    }
-	  else
-	    {
-	      loc_result->dw_loc_oprnd2.val_class = dw_val_class_const_double;
-	      loc_result->dw_loc_oprnd2.v.val_double
-	        = rtx_to_double_int (rtl);
-	    }
+	}
+      break;
+
+    case CONST_WIDE_INT:
+      if (mode == VOIDmode)
+	mode = GET_MODE (rtl);
+
+      if (mode != VOIDmode && (dwarf_version >= 4 || !dwarf_strict))
+	{
+	  gcc_assert (mode == GET_MODE (rtl) || VOIDmode == GET_MODE (rtl));
+
+	  /* A CONST_WIDE_INT rtx always represents an integer constant
+	     that requires more than one HOST_WIDE_INT in order to be
+	     adequately represented.  We output CONST_WIDE_INTs as
+	     blocks.  */
+	  loc_result = new_loc_descr (DW_OP_implicit_value,
+				      GET_MODE_SIZE (mode), 0);
+	  loc_result->dw_loc_oprnd2.val_class = dw_val_class_wide_int;
+	  loc_result->dw_loc_oprnd2.v.val_wide
+	    = wide_int::from_rtx (rtl, mode);
 	}
       break;
 
@@ -12989,6 +13164,7 @@ loc_descriptor (rtx rtl, enum machine_mode mode,
 	    ggc_alloc_atomic (length * elt_size);
 	  unsigned int i;
 	  unsigned char *p;
+	  enum machine_mode imode = GET_MODE_INNER (mode);
 
 	  gcc_assert (mode == GET_MODE (rtl) || VOIDmode == GET_MODE (rtl));
 	  switch (GET_MODE_CLASS (mode))
@@ -12997,15 +13173,8 @@ loc_descriptor (rtx rtl, enum machine_mode mode,
 	      for (i = 0, p = array; i < length; i++, p += elt_size)
 		{
 		  rtx elt = CONST_VECTOR_ELT (rtl, i);
-		  double_int val = rtx_to_double_int (elt);
-
-		  if (elt_size <= sizeof (HOST_WIDE_INT))
-		    insert_int (val.to_shwi (), elt_size, p);
-		  else
-		    {
-		      gcc_assert (elt_size == 2 * sizeof (HOST_WIDE_INT));
-		      insert_double (val, p);
-		    }
+		  wide_int val = wide_int::from_rtx (elt, imode);
+		  insert_wide_int (val, p);
 		}
 	      break;
 
@@ -14630,22 +14799,27 @@ extract_int (const unsigned char *src, unsigned int size)
   return val;
 }
 
-/* Writes double_int values to dw_vec_const array.  */
+/* Writes wide_int values to dw_vec_const array.  */
 
 static void
-insert_double (double_int val, unsigned char *dest)
+insert_wide_int (const wide_int &val, unsigned char *dest)
 {
-  unsigned char *p0 = dest;
-  unsigned char *p1 = dest + sizeof (HOST_WIDE_INT);
+  int i;
 
   if (WORDS_BIG_ENDIAN)
-    {
-      p0 = p1;
-      p1 = dest;
-    }
-
-  insert_int ((HOST_WIDE_INT) val.low, sizeof (HOST_WIDE_INT), p0);
-  insert_int ((HOST_WIDE_INT) val.high, sizeof (HOST_WIDE_INT), p1);
+    for (i = (int) get_full_len (val) - 1; i >= 0; i--)
+      {
+	insert_int ((HOST_WIDE_INT) val.elt (i),
+		    sizeof (HOST_WIDE_INT), dest);
+	dest += sizeof (HOST_WIDE_INT);
+      }
+  else
+    for (i = 0; i < (int) get_full_len (val); i++)
+      {
+	insert_int ((HOST_WIDE_INT) val.elt (i),
+		    sizeof (HOST_WIDE_INT), dest);
+	dest += sizeof (HOST_WIDE_INT);
+      }
 }
 
 /* Writes floating point values to dw_vec_const array.  */
@@ -14690,6 +14864,11 @@ add_const_value_attribute (dw_die_ref die, rtx rtl)
       }
       return true;
 
+    case CONST_WIDE_INT:
+      add_AT_wide (die, DW_AT_const_value,
+		   wide_int::from_rtx (rtl, GET_MODE (rtl)));
+      return true;
+
     case CONST_DOUBLE:
       /* Note that a CONST_DOUBLE rtx could represent either an integer or a
 	 floating-point constant.  A CONST_DOUBLE is used whenever the
@@ -14698,7 +14877,10 @@ add_const_value_attribute (dw_die_ref die, rtx rtl)
       {
 	enum machine_mode mode = GET_MODE (rtl);
 
-	if (SCALAR_FLOAT_MODE_P (mode))
+	if (TARGET_SUPPORTS_WIDE_INT == 0 && !SCALAR_FLOAT_MODE_P (mode))
+	  add_AT_double (die, DW_AT_const_value,
+			 CONST_DOUBLE_HIGH (rtl), CONST_DOUBLE_LOW (rtl));
+	else
 	  {
 	    unsigned int length = GET_MODE_SIZE (mode);
 	    unsigned char *array = (unsigned char *) ggc_alloc_atomic (length);
@@ -14706,9 +14888,6 @@ add_const_value_attribute (dw_die_ref die, rtx rtl)
 	    insert_float (rtl, array);
 	    add_AT_vec (die, DW_AT_const_value, length / 4, 4, array);
 	  }
-	else
-	  add_AT_double (die, DW_AT_const_value,
-			 CONST_DOUBLE_HIGH (rtl), CONST_DOUBLE_LOW (rtl));
       }
       return true;
 
@@ -14721,6 +14900,7 @@ add_const_value_attribute (dw_die_ref die, rtx rtl)
 	  (length * elt_size);
 	unsigned int i;
 	unsigned char *p;
+	enum machine_mode imode = GET_MODE_INNER (mode);
 
 	switch (GET_MODE_CLASS (mode))
 	  {
@@ -14728,15 +14908,8 @@ add_const_value_attribute (dw_die_ref die, rtx rtl)
 	    for (i = 0, p = array; i < length; i++, p += elt_size)
 	      {
 		rtx elt = CONST_VECTOR_ELT (rtl, i);
-		double_int val = rtx_to_double_int (elt);
-
-		if (elt_size <= sizeof (HOST_WIDE_INT))
-		  insert_int (val.to_shwi (), elt_size, p);
-		else
-		  {
-		    gcc_assert (elt_size == 2 * sizeof (HOST_WIDE_INT));
-		    insert_double (val, p);
-		  }
+		wide_int val = wide_int::from_rtx (elt, imode);
+		insert_wide_int (val, p);
 	      }
 	    break;
 
@@ -22869,6 +23042,9 @@ hash_loc_operands (dw_loc_descr_ref loc, hashval_t hash)
 	  hash = iterative_hash_object (val2->v.val_double.low, hash);
 	  hash = iterative_hash_object (val2->v.val_double.high, hash);
 	  break;
+	case dw_val_class_wide_int:
+	  hash = iterative_hash_object (val2->v.val_wide, hash);
+	  break;
 	case dw_val_class_addr:
 	  hash = iterative_hash_rtx (val2->v.val_addr, hash);
 	  break;
@@ -22958,6 +23134,9 @@ hash_loc_operands (dw_loc_descr_ref loc, hashval_t hash)
 	    hash = iterative_hash_object (val2->v.val_double.low, hash);
 	    hash = iterative_hash_object (val2->v.val_double.high, hash);
 	    break;
+	  case dw_val_class_wide_int:
+	    hash = iterative_hash_object (val2->v.val_wide, hash);
+	    break;
 	  default:
 	    gcc_unreachable ();
 	  }
@@ -23106,6 +23285,8 @@ compare_loc_operands (dw_loc_descr_ref x, dw_loc_descr_ref y)
 	case dw_val_class_const_double:
 	  return valx2->v.val_double.low == valy2->v.val_double.low
 		 && valx2->v.val_double.high == valy2->v.val_double.high;
+	case dw_val_class_wide_int:
+	  return valx2->v.val_wide == valy2->v.val_wide;
 	case dw_val_class_addr:
 	  return rtx_equal_p (valx2->v.val_addr, valy2->v.val_addr);
 	default:
@@ -23149,6 +23330,8 @@ compare_loc_operands (dw_loc_descr_ref x, dw_loc_descr_ref y)
 	case dw_val_class_const_double:
 	  return valx2->v.val_double.low == valy2->v.val_double.low
 		 && valx2->v.val_double.high == valy2->v.val_double.high;
+	case dw_val_class_wide_int:
+	  return valx2->v.val_wide == valy2->v.val_wide;
 	default:
 	  gcc_unreachable ();
 	}
diff --git a/gcc/dwarf2out.h b/gcc/dwarf2out.h
index f68d0e4..7c5f142 100644
--- a/gcc/dwarf2out.h
+++ b/gcc/dwarf2out.h
@@ -21,6 +21,7 @@ along with GCC; see the file COPYING3.  If not see
 #define GCC_DWARF2OUT_H 1
 
 #include "dwarf2.h"	/* ??? Remove this once only used by dwarf2foo.c.  */
+#include "wide-int.h"
 
 typedef struct die_struct *dw_die_ref;
 typedef const struct die_struct *const_dw_die_ref;
@@ -139,6 +140,7 @@ enum dw_val_class
   dw_val_class_const,
   dw_val_class_unsigned_const,
   dw_val_class_const_double,
+  dw_val_class_wide_int,
   dw_val_class_vec,
   dw_val_class_flag,
   dw_val_class_die_ref,
@@ -180,6 +182,7 @@ typedef struct GTY(()) dw_val_struct {
       HOST_WIDE_INT GTY ((default)) val_int;
       unsigned HOST_WIDE_INT GTY ((tag ("dw_val_class_unsigned_const"))) val_unsigned;
       double_int GTY ((tag ("dw_val_class_const_double"))) val_double;
+      wide_int GTY ((tag ("dw_val_class_wide_int"))) val_wide;
       dw_vec_const GTY ((tag ("dw_val_class_vec"))) val_vec;
       struct dw_val_die_union
 	{
diff --git a/gcc/emit-rtl.c b/gcc/emit-rtl.c
index 2c70fb1..a234e39 100644
--- a/gcc/emit-rtl.c
+++ b/gcc/emit-rtl.c
@@ -124,6 +124,9 @@ rtx cc0_rtx;
 static GTY ((if_marked ("ggc_marked_p"), param_is (struct rtx_def)))
      htab_t const_int_htab;
 
+static GTY ((if_marked ("ggc_marked_p"), param_is (struct rtx_def)))
+     htab_t const_wide_int_htab;
+
 /* A hash table storing memory attribute structures.  */
 static GTY ((if_marked ("ggc_marked_p"), param_is (struct mem_attrs)))
      htab_t mem_attrs_htab;
@@ -149,6 +152,11 @@ static void set_used_decls (tree);
 static void mark_label_nuses (rtx);
 static hashval_t const_int_htab_hash (const void *);
 static int const_int_htab_eq (const void *, const void *);
+#if TARGET_SUPPORTS_WIDE_INT
+static hashval_t const_wide_int_htab_hash (const void *);
+static int const_wide_int_htab_eq (const void *, const void *);
+static rtx lookup_const_wide_int (rtx);
+#endif
 static hashval_t const_double_htab_hash (const void *);
 static int const_double_htab_eq (const void *, const void *);
 static rtx lookup_const_double (rtx);
@@ -185,6 +193,43 @@ const_int_htab_eq (const void *x, const void *y)
   return (INTVAL ((const_rtx) x) == *((const HOST_WIDE_INT *) y));
 }
 
+#if TARGET_SUPPORTS_WIDE_INT
+/* Returns a hash code for X (which is really a CONST_WIDE_INT).  */
+
+static hashval_t
+const_wide_int_htab_hash (const void *x)
+{
+  int i;
+  HOST_WIDE_INT hash = 0;
+  const_rtx xr = (const_rtx) x;
+
+  for (i = 0; i < CONST_WIDE_INT_NUNITS (xr); i++)
+    hash += CONST_WIDE_INT_ELT (xr, i);
+
+  return (hashval_t) hash;
+}
+
+/* Returns nonzero if the value represented by X (which is really a
+   CONST_WIDE_INT) is the same as that given by Y (which is really a
+   CONST_WIDE_INT).  */
+
+static int
+const_wide_int_htab_eq (const void *x, const void *y)
+{
+  int i;
+  const_rtx xr = (const_rtx) x;
+  const_rtx yr = (const_rtx) y;
+  if (CONST_WIDE_INT_NUNITS (xr) != CONST_WIDE_INT_NUNITS (yr))
+    return 0;
+
+  for (i = 0; i < CONST_WIDE_INT_NUNITS (xr); i++)
+    if (CONST_WIDE_INT_ELT (xr, i) != CONST_WIDE_INT_ELT (yr, i))
+      return 0;
+
+  return 1;
+}
+#endif
+
 /* Returns a hash code for X (which is really a CONST_DOUBLE).  */
 static hashval_t
 const_double_htab_hash (const void *x)
@@ -192,7 +237,7 @@ const_double_htab_hash (const void *x)
   const_rtx const value = (const_rtx) x;
   hashval_t h;
 
-  if (GET_MODE (value) == VOIDmode)
+  if (TARGET_SUPPORTS_WIDE_INT == 0 && GET_MODE (value) == VOIDmode)
     h = CONST_DOUBLE_LOW (value) ^ CONST_DOUBLE_HIGH (value);
   else
     {
@@ -212,7 +257,7 @@ const_double_htab_eq (const void *x, const void *y)
 
   if (GET_MODE (a) != GET_MODE (b))
     return 0;
-  if (GET_MODE (a) == VOIDmode)
+  if (TARGET_SUPPORTS_WIDE_INT == 0 && GET_MODE (a) == VOIDmode)
     return (CONST_DOUBLE_LOW (a) == CONST_DOUBLE_LOW (b)
 	    && CONST_DOUBLE_HIGH (a) == CONST_DOUBLE_HIGH (b));
   else
@@ -478,6 +523,7 @@ const_fixed_from_fixed_value (FIXED_VALUE_TYPE value, enum machine_mode mode)
   return lookup_const_fixed (fixed);
 }
 
+#if TARGET_SUPPORTS_WIDE_INT == 0
 /* Constructs double_int from rtx CST.  */
 
 double_int
@@ -497,17 +543,61 @@ rtx_to_double_int (const_rtx cst)
   
   return r;
 }
+#endif
 
+#if TARGET_SUPPORTS_WIDE_INT
+/* Determine whether WINT already exists in the hash table.  If
+   so, return its counterpart; otherwise add it to the hash table and
+   return it.  */
+
+static rtx
+lookup_const_wide_int (rtx wint)
+{
+  void **slot = htab_find_slot (const_wide_int_htab, wint, INSERT);
+  if (*slot == 0)
+    *slot = wint;
 
-/* Return a CONST_DOUBLE or CONST_INT for a value specified as
-   a double_int.  */
+  return (rtx) *slot;
+}
+#endif
 
+/* V contains a wide_int.  Produce a CONST_INT, a CONST_WIDE_INT (if
+   TARGET_SUPPORTS_WIDE_INT is nonzero), or a CONST_DOUBLE (if
+   TARGET_SUPPORTS_WIDE_INT is zero), based on the number of
+   HOST_WIDE_INTs that are necessary to represent the value in
+   compact form.  */
 rtx
-immed_double_int_const (double_int i, enum machine_mode mode)
+immed_wide_int_const (const wide_int &v, enum machine_mode mode)
 {
-  return immed_double_const (i.low, i.high, mode);
+  unsigned int len = v.get_len ();
+
+  if (len < 2)
+    return gen_int_mode (v.elt (0), mode);
+
+  gcc_assert (GET_MODE_PRECISION (mode) == v.get_precision ());
+  gcc_assert (GET_MODE_BITSIZE (mode) == v.get_bitsize ());
+
+#if TARGET_SUPPORTS_WIDE_INT
+  {
+    rtx value = const_wide_int_alloc (len);
+    unsigned int i;
+
+    /* It is so tempting to just put the mode in here.  Must control
+       myself ... */
+    PUT_MODE (value, VOIDmode);
+    HWI_PUT_NUM_ELEM (CONST_WIDE_INT_VEC (value), len);
+
+    for (i = 0; i < len; i++)
+      CONST_WIDE_INT_ELT (value, i) = v.elt (i);
+
+    return lookup_const_wide_int (value);
+  }
+#else
+  return immed_double_const (v.elt (0), v.elt (1), mode);
+#endif
 }
 
+#if TARGET_SUPPORTS_WIDE_INT == 0
 /* Return a CONST_DOUBLE or CONST_INT for a value specified as a pair
    of ints: I0 is the low-order word and I1 is the high-order word.
    For values that are larger than HOST_BITS_PER_DOUBLE_INT, the
@@ -559,6 +649,7 @@ immed_double_const (HOST_WIDE_INT i0, HOST_WIDE_INT i1, enum machine_mode mode)
 
   return lookup_const_double (value);
 }
+#endif
 
 rtx
 gen_rtx_REG (enum machine_mode mode, unsigned int regno)
@@ -5626,11 +5717,15 @@ init_emit_once (void)
   enum machine_mode mode;
   enum machine_mode double_mode;
 
-  /* Initialize the CONST_INT, CONST_DOUBLE, CONST_FIXED, and memory attribute
-     hash tables.  */
+  /* Initialize the CONST_INT, CONST_WIDE_INT, CONST_DOUBLE,
+     CONST_FIXED, and memory attribute hash tables.  */
   const_int_htab = htab_create_ggc (37, const_int_htab_hash,
 				    const_int_htab_eq, NULL);
 
+#if TARGET_SUPPORTS_WIDE_INT
+  const_wide_int_htab = htab_create_ggc (37, const_wide_int_htab_hash,
+					 const_wide_int_htab_eq, NULL);
+#endif
   const_double_htab = htab_create_ggc (37, const_double_htab_hash,
 				       const_double_htab_eq, NULL);
 
diff --git a/gcc/explow.c b/gcc/explow.c
index 08a6653..c154472 100644
--- a/gcc/explow.c
+++ b/gcc/explow.c
@@ -95,38 +95,9 @@ plus_constant (enum machine_mode mode, rtx x, HOST_WIDE_INT c)
 
   switch (code)
     {
-    case CONST_INT:
-      if (GET_MODE_BITSIZE (mode) > HOST_BITS_PER_WIDE_INT)
-	{
-	  double_int di_x = double_int::from_shwi (INTVAL (x));
-	  double_int di_c = double_int::from_shwi (c);
-
-	  bool overflow;
-	  double_int v = di_x.add_with_sign (di_c, false, &overflow);
-	  if (overflow)
-	    gcc_unreachable ();
-
-	  return immed_double_int_const (v, VOIDmode);
-	}
-
-      return GEN_INT (INTVAL (x) + c);
-
-    case CONST_DOUBLE:
-      {
-	double_int di_x = double_int::from_pair (CONST_DOUBLE_HIGH (x),
-						 CONST_DOUBLE_LOW (x));
-	double_int di_c = double_int::from_shwi (c);
-
-	bool overflow;
-	double_int v = di_x.add_with_sign (di_c, false, &overflow);
-	if (overflow)
-	  /* Sorry, we have no way to represent overflows this wide.
-	     To fix, add constant support wider than CONST_DOUBLE.  */
-	  gcc_assert (GET_MODE_BITSIZE (mode) <= HOST_BITS_PER_DOUBLE_INT);
-
-	return immed_double_int_const (v, VOIDmode);
-      }
-
+    CASE_CONST_SCALAR_INT:
+      return immed_wide_int_const (wide_int::from_rtx (x, mode)
+				   + wide_int::from_shwi (c, mode), mode);
     case MEM:
       /* If this is a reference to the constant pool, try replacing it with
 	 a reference to a new constant.  If the resulting address isn't
diff --git a/gcc/expmed.c b/gcc/expmed.c
index 954a360..a1b7fb4 100644
--- a/gcc/expmed.c
+++ b/gcc/expmed.c
@@ -55,7 +55,6 @@ static void store_split_bit_field (rtx, unsigned HOST_WIDE_INT,
 static rtx extract_fixed_bit_field (enum machine_mode, rtx,
 				    unsigned HOST_WIDE_INT,
 				    unsigned HOST_WIDE_INT, rtx, int, bool);
-static rtx mask_rtx (enum machine_mode, int, int, int);
 static rtx lshift_value (enum machine_mode, rtx, int, int);
 static rtx extract_split_bit_field (rtx, unsigned HOST_WIDE_INT,
 				    unsigned HOST_WIDE_INT, int);
@@ -63,6 +62,18 @@ static void do_cmp_and_jump (rtx, rtx, enum rtx_code, enum machine_mode, rtx);
 static rtx expand_smod_pow2 (enum machine_mode, rtx, HOST_WIDE_INT);
 static rtx expand_sdiv_pow2 (enum machine_mode, rtx, HOST_WIDE_INT);
 
+/* Return a constant integer mask value of mode MODE with BITSIZE ones
+   followed by BITPOS zeros, or the complement of that if COMPLEMENT.
+   The mask is truncated if necessary to the width of mode MODE.  The
+   mask is zero-extended if BITSIZE+BITPOS is too small for MODE.  */
+
+static inline rtx
+mask_rtx (enum machine_mode mode, int bitpos, int bitsize, bool complement)
+{
+  return immed_wide_int_const
+    (wide_int::shifted_mask (bitpos, bitsize, complement, mode), mode);
+}
+
 /* Test whether a value is zero of a power of two.  */
 #define EXACT_POWER_OF_2_OR_ZERO_P(x) (((x) & ((x) - 1)) == 0)
 
@@ -1831,39 +1842,16 @@ extract_fixed_bit_field (enum machine_mode tmode, rtx op0,
   return expand_shift (RSHIFT_EXPR, mode, op0,
 		       GET_MODE_BITSIZE (mode) - bitsize, target, 0);
 }
-\f
-/* Return a constant integer (CONST_INT or CONST_DOUBLE) mask value
-   of mode MODE with BITSIZE ones followed by BITPOS zeros, or the
-   complement of that if COMPLEMENT.  The mask is truncated if
-   necessary to the width of mode MODE.  The mask is zero-extended if
-   BITSIZE+BITPOS is too small for MODE.  */
-
-static rtx
-mask_rtx (enum machine_mode mode, int bitpos, int bitsize, int complement)
-{
-  double_int mask;
-
-  mask = double_int::mask (bitsize);
-  mask = mask.llshift (bitpos, HOST_BITS_PER_DOUBLE_INT);
-
-  if (complement)
-    mask = ~mask;
-
-  return immed_double_int_const (mask, mode);
-}
-
-/* Return a constant integer (CONST_INT or CONST_DOUBLE) rtx with the value
-   VALUE truncated to BITSIZE bits and then shifted left BITPOS bits.  */
+/* Return a constant integer rtx with the value VALUE truncated to
+   BITSIZE bits and then shifted left BITPOS bits.  */
 
 static rtx
 lshift_value (enum machine_mode mode, rtx value, int bitpos, int bitsize)
 {
-  double_int val;
-  
-  val = double_int::from_uhwi (INTVAL (value)).zext (bitsize);
-  val = val.llshift (bitpos, HOST_BITS_PER_DOUBLE_INT);
-
-  return immed_double_int_const (val, mode);
+  return
+    immed_wide_int_const (wide_int::from_rtx (value, mode)
+			  .zext (bitsize)
+			  .lshift (bitpos, wide_int::NONE), mode);
 }
 \f
 /* Extract a bit field that is split across two words
@@ -3068,34 +3056,41 @@ expand_mult (enum machine_mode mode, rtx op0, rtx op1, rtx target,
 	 only if the constant value exactly fits in an `unsigned int' without
 	 any truncation.  This means that multiplying by negative values does
 	 not work; results are off by 2^32 on a 32 bit machine.  */
-
       if (CONST_INT_P (scalar_op1))
 	{
 	  coeff = INTVAL (scalar_op1);
 	  is_neg = coeff < 0;
 	}
+#if TARGET_SUPPORTS_WIDE_INT
+      else if (CONST_WIDE_INT_P (scalar_op1))
+#else
       else if (CONST_DOUBLE_AS_INT_P (scalar_op1))
+#endif
 	{
-	  /* If we are multiplying in DImode, it may still be a win
-	     to try to work with shifts and adds.  */
-	  if (CONST_DOUBLE_HIGH (scalar_op1) == 0
-	      && CONST_DOUBLE_LOW (scalar_op1) > 0)
+	  int p = GET_MODE_PRECISION (mode);
+	  wide_int val = wide_int::from_rtx (scalar_op1, mode);
+	  int shift = val.exact_log2 ();
+	  /* Perfect power of 2.  */
+	  is_neg = false;
+	  if (shift > 0)
 	    {
-	      coeff = CONST_DOUBLE_LOW (scalar_op1);
-	      is_neg = false;
+	      /* Do the shift count truncation against the bitsize, not
+		 the precision.  See the comment above
+		 wide-int.c:trunc_shift for details.  */
+	      if (SHIFT_COUNT_TRUNCATED)
+		shift &= GET_MODE_BITSIZE (mode) - 1;
+	      /* We could consider adding just a move of 0 to target
+		 if the shift >= p.  */
+	      if (shift < p)
+		return expand_shift (LSHIFT_EXPR, mode, op0,
+				     shift, target, unsignedp);
+	      /* Any positive number that fits in a word.  */
+	      coeff = CONST_WIDE_INT_ELT (scalar_op1, 0);
 	    }
-	  else if (CONST_DOUBLE_LOW (scalar_op1) == 0)
+	  else if (val.sign_mask () == 0)
 	    {
-	      coeff = CONST_DOUBLE_HIGH (scalar_op1);
-	      if (EXACT_POWER_OF_2_OR_ZERO_P (coeff))
-		{
-		  int shift = floor_log2 (coeff) + HOST_BITS_PER_WIDE_INT;
-		  if (shift < HOST_BITS_PER_DOUBLE_INT - 1
-		      || mode_bitsize <= HOST_BITS_PER_DOUBLE_INT)
-		    return expand_shift (LSHIFT_EXPR, mode, op0,
-					 shift, target, unsignedp);
-		}
-	      goto skip_synth;
+	      /* Any positive number that fits in a word.  */
+	      coeff = CONST_WIDE_INT_ELT (scalar_op1, 0);
 	    }
 	  else
 	    goto skip_synth;
@@ -3585,9 +3580,10 @@ expmed_mult_highpart (enum machine_mode mode, rtx op0, rtx op1,
 static rtx
 expand_smod_pow2 (enum machine_mode mode, rtx op0, HOST_WIDE_INT d)
 {
-  unsigned HOST_WIDE_INT masklow, maskhigh;
   rtx result, temp, shift, label;
   int logd;
+  wide_int mask;
+  int prec = GET_MODE_PRECISION (mode);
 
   logd = floor_log2 (d);
   result = gen_reg_rtx (mode);
@@ -3600,8 +3596,8 @@ expand_smod_pow2 (enum machine_mode mode, rtx op0, HOST_WIDE_INT d)
 				      mode, 0, -1);
       if (signmask)
 	{
+	  HOST_WIDE_INT masklow = ((HOST_WIDE_INT) 1 << logd) - 1;
 	  signmask = force_reg (mode, signmask);
-	  masklow = ((HOST_WIDE_INT) 1 << logd) - 1;
 	  shift = GEN_INT (GET_MODE_BITSIZE (mode) - logd);
 
 	  /* Use the rtx_cost of a LSHIFTRT instruction to determine
@@ -3646,19 +3642,11 @@ expand_smod_pow2 (enum machine_mode mode, rtx op0, HOST_WIDE_INT d)
      modulus.  By including the signbit in the operation, many targets
      can avoid an explicit compare operation in the following comparison
      against zero.  */
-
-  masklow = ((HOST_WIDE_INT) 1 << logd) - 1;
-  if (GET_MODE_BITSIZE (mode) <= HOST_BITS_PER_WIDE_INT)
-    {
-      masklow |= (HOST_WIDE_INT) -1 << (GET_MODE_BITSIZE (mode) - 1);
-      maskhigh = -1;
-    }
-  else
-    maskhigh = (HOST_WIDE_INT) -1
-		 << (GET_MODE_BITSIZE (mode) - HOST_BITS_PER_WIDE_INT - 1);
+  mask = wide_int::mask (logd, false, mode);
+  mask = mask.set_bit (prec - 1);
 
   temp = expand_binop (mode, and_optab, op0,
-		       immed_double_const (masklow, maskhigh, mode),
+		       immed_wide_int_const (mask, mode),
 		       result, 1, OPTAB_LIB_WIDEN);
   if (temp != result)
     emit_move_insn (result, temp);
@@ -3668,10 +3656,10 @@ expand_smod_pow2 (enum machine_mode mode, rtx op0, HOST_WIDE_INT d)
 
   temp = expand_binop (mode, sub_optab, result, const1_rtx, result,
 		       0, OPTAB_LIB_WIDEN);
-  masklow = (HOST_WIDE_INT) -1 << logd;
-  maskhigh = -1;
+
+  mask = wide_int::mask (logd, true, mode);
   temp = expand_binop (mode, ior_optab, temp,
-		       immed_double_const (masklow, maskhigh, mode),
+		       immed_wide_int_const (mask, mode),
 		       result, 1, OPTAB_LIB_WIDEN);
   temp = expand_binop (mode, add_optab, temp, const1_rtx, result,
 		       0, OPTAB_LIB_WIDEN);
@@ -4925,8 +4913,12 @@ make_tree (tree type, rtx x)
 	return t;
       }
 
+    case CONST_WIDE_INT:
+      t = wide_int_to_tree (type, wide_int::from_rtx (x, TYPE_MODE (type)));
+      return t;
+
     case CONST_DOUBLE:
-      if (GET_MODE (x) == VOIDmode)
+      if (TARGET_SUPPORTS_WIDE_INT == 0 && GET_MODE (x) == VOIDmode)
 	t = build_int_cst_wide (type,
 				CONST_DOUBLE_LOW (x), CONST_DOUBLE_HIGH (x));
       else
diff --git a/gcc/expr.c b/gcc/expr.c
index 08c5c9d..5478b83 100644
--- a/gcc/expr.c
+++ b/gcc/expr.c
@@ -710,23 +710,23 @@ convert_modes (enum machine_mode mode, enum machine_mode oldmode, rtx x, int uns
   if (mode == oldmode)
     return x;
 
-  /* There is one case that we must handle specially: If we are converting
-     a CONST_INT into a mode whose size is twice HOST_BITS_PER_WIDE_INT and
-     we are to interpret the constant as unsigned, gen_lowpart will do
-     the wrong if the constant appears negative.  What we want to do is
-     make the high-order word of the constant zero, not all ones.  */
+  /* There is one case that we must handle specially: If we are
+     converting a CONST_INT into a mode whose size is larger than
+     HOST_BITS_PER_WIDE_INT and we are to interpret the constant as
+     unsigned, gen_lowpart will do the wrong thing if the constant
+     appears negative.  What we want to do is make the high-order
+     word of the constant zero, not all ones.  */
 
   if (unsignedp && GET_MODE_CLASS (mode) == MODE_INT
-      && GET_MODE_BITSIZE (mode) == HOST_BITS_PER_DOUBLE_INT
+      && GET_MODE_BITSIZE (mode) > HOST_BITS_PER_WIDE_INT
       && CONST_INT_P (x) && INTVAL (x) < 0)
     {
-      double_int val = double_int::from_uhwi (INTVAL (x));
-
+      HOST_WIDE_INT val = INTVAL (x);
       /* We need to zero extend VAL.  */
       if (oldmode != VOIDmode)
-	val = val.zext (GET_MODE_BITSIZE (oldmode));
+	val &= GET_MODE_MASK (oldmode);
 
-      return immed_double_int_const (val, mode);
+      return immed_wide_int_const (wide_int::from_uhwi (val, mode), mode);
     }
 
   /* We can do this with a gen_lowpart if both desired and current modes
@@ -738,7 +738,11 @@ convert_modes (enum machine_mode mode, enum machine_mode oldmode, rtx x, int uns
        && GET_MODE_PRECISION (mode) <= HOST_BITS_PER_WIDE_INT)
       || (GET_MODE_CLASS (mode) == MODE_INT
 	  && GET_MODE_CLASS (oldmode) == MODE_INT
-	  && (CONST_DOUBLE_AS_INT_P (x) 
+#if TARGET_SUPPORTS_WIDE_INT
+	  && (CONST_WIDE_INT_P (x)
+#else
+ 	  && (CONST_DOUBLE_AS_INT_P (x)
+#endif
 	      || (GET_MODE_PRECISION (mode) <= GET_MODE_PRECISION (oldmode)
 		  && ((MEM_P (x) && ! MEM_VOLATILE_P (x)
 		       && direct_load[(int) mode])
@@ -1743,6 +1747,7 @@ emit_group_load_1 (rtx *tmps, rtx dst, rtx orig_src, tree type, int ssize)
 	    {
 	      rtx first, second;
 
+	      /* TODO: const_wide_int can have sizes other than this...  */
 	      gcc_assert (2 * len == ssize);
 	      split_double (src, &first, &second);
 	      if (i)
@@ -5239,10 +5244,10 @@ store_expr (tree exp, rtx target, int call_param_p, bool nontemporal)
 			       &alt_rtl);
     }
 
-  /* If TEMP is a VOIDmode constant and the mode of the type of EXP is not
-     the same as that of TARGET, adjust the constant.  This is needed, for
-     example, in case it is a CONST_DOUBLE and we want only a word-sized
-     value.  */
+  /* If TEMP is a VOIDmode constant and the mode of the type of EXP is
+     not the same as that of TARGET, adjust the constant.  This is
+     needed, for example, in case it is a CONST_DOUBLE or
+     CONST_WIDE_INT and we want only a word-sized value.  */
   if (CONSTANT_P (temp) && GET_MODE (temp) == VOIDmode
       && TREE_CODE (exp) != ERROR_MARK
       && GET_MODE (target) != TYPE_MODE (TREE_TYPE (exp)))
@@ -7741,11 +7746,12 @@ expand_constructor (tree exp, rtx target, enum expand_modifier modifier,
 
   /* All elts simple constants => refer to a constant in memory.  But
      if this is a non-BLKmode mode, let it store a field at a time
-     since that should make a CONST_INT or CONST_DOUBLE when we
-     fold.  Likewise, if we have a target we can use, it is best to
-     store directly into the target unless the type is large enough
-     that memcpy will be used.  If we are making an initializer and
-     all operands are constant, put it in memory as well.
+     since that should make a CONST_INT, CONST_WIDE_INT or
+     CONST_DOUBLE when we fold.  Likewise, if we have a target we can
+     use, it is best to store directly into the target unless the type
+     is large enough that memcpy will be used.  If we are making an
+     initializer and all operands are constant, put it in memory as
+     well.
 
      FIXME: Avoid trying to fill vector constructors piece-meal.
      Output them with output_constant_def below unless we're sure
@@ -8214,17 +8220,18 @@ expand_expr_real_2 (sepops ops, rtx target, enum machine_mode tmode,
 	      && TREE_CONSTANT (treeop1))
 	    {
 	      rtx constant_part;
+	      HOST_WIDE_INT wc;
+	      enum machine_mode wmode = TYPE_MODE (TREE_TYPE (treeop1));
 
 	      op1 = expand_expr (treeop1, subtarget, VOIDmode,
 				 EXPAND_SUM);
-	      /* Use immed_double_const to ensure that the constant is
+	      /* Use wide_int::from_shwi to ensure that the constant is
 		 truncated according to the mode of OP1, then sign extended
 		 to a HOST_WIDE_INT.  Using the constant directly can result
 		 in non-canonical RTL in a 64x32 cross compile.  */
-	      constant_part
-		= immed_double_const (TREE_INT_CST_LOW (treeop0),
-				      (HOST_WIDE_INT) 0,
-				      TYPE_MODE (TREE_TYPE (treeop1)));
+	      wc = TREE_INT_CST_LOW (treeop0);
+	      constant_part 
+		= immed_wide_int_const (wide_int::from_shwi (wc, wmode), wmode);
 	      op1 = plus_constant (mode, op1, INTVAL (constant_part));
 	      if (modifier != EXPAND_SUM && modifier != EXPAND_INITIALIZER)
 		op1 = force_operand (op1, target);
@@ -8236,7 +8243,8 @@ expand_expr_real_2 (sepops ops, rtx target, enum machine_mode tmode,
 		   && TREE_CONSTANT (treeop0))
 	    {
 	      rtx constant_part;
-
+	      HOST_WIDE_INT wc;
+	      enum machine_mode wmode = TYPE_MODE (TREE_TYPE (treeop0));
 	      op0 = expand_expr (treeop0, subtarget, VOIDmode,
 				 (modifier == EXPAND_INITIALIZER
 				 ? EXPAND_INITIALIZER : EXPAND_SUM));
@@ -8250,14 +8258,13 @@ expand_expr_real_2 (sepops ops, rtx target, enum machine_mode tmode,
 		    return simplify_gen_binary (PLUS, mode, op0, op1);
 		  goto binop2;
 		}
-	      /* Use immed_double_const to ensure that the constant is
+	      /* Use wide_int::from_shwi to ensure that the constant is
 		 truncated according to the mode of OP1, then sign extended
 		 to a HOST_WIDE_INT.  Using the constant directly can result
 		 in non-canonical RTL in a 64x32 cross compile.  */
-	      constant_part
-		= immed_double_const (TREE_INT_CST_LOW (treeop1),
-				      (HOST_WIDE_INT) 0,
-				      TYPE_MODE (TREE_TYPE (treeop0)));
+	      wc = TREE_INT_CST_LOW (treeop1);
+	      constant_part 
+		= immed_wide_int_const (wide_int::from_shwi (wc, wmode), wmode);
 	      op0 = plus_constant (mode, op0, INTVAL (constant_part));
 	      if (modifier != EXPAND_SUM && modifier != EXPAND_INITIALIZER)
 		op0 = force_operand (op0, target);
@@ -8759,10 +8766,13 @@ expand_expr_real_2 (sepops ops, rtx target, enum machine_mode tmode,
 	 for unsigned bitfield expand this as XOR with a proper constant
 	 instead.  */
       if (reduce_bit_field && TYPE_UNSIGNED (type))
-	temp = expand_binop (mode, xor_optab, op0,
-			     immed_double_int_const
-			       (double_int::mask (TYPE_PRECISION (type)), mode),
-			     target, 1, OPTAB_LIB_WIDEN);
+	{
+	  wide_int mask = wide_int::mask (TYPE_PRECISION (type), false, mode);
+
+	  temp = expand_binop (mode, xor_optab, op0,
+			       immed_wide_int_const (mask, mode),
+			       target, 1, OPTAB_LIB_WIDEN);
+	}
       else
 	temp = expand_unop (mode, one_cmpl_optab, op0, target, 1);
       gcc_assert (temp);
@@ -9395,9 +9405,8 @@ expand_expr_real_1 (tree exp, rtx target, enum machine_mode tmode,
       return decl_rtl;
 
     case INTEGER_CST:
-      temp = immed_double_const (TREE_INT_CST_LOW (exp),
-				 TREE_INT_CST_HIGH (exp), mode);
-
+      temp = immed_wide_int_const (wide_int::from_tree (exp), 
+				   TYPE_MODE (TREE_TYPE (exp)));
       return temp;
 
     case VECTOR_CST:
@@ -9628,8 +9637,9 @@ expand_expr_real_1 (tree exp, rtx target, enum machine_mode tmode,
 	op0 = memory_address_addr_space (address_mode, op0, as);
 	if (!integer_zerop (TREE_OPERAND (exp, 1)))
 	  {
-	    rtx off
-	      = immed_double_int_const (mem_ref_offset (exp), address_mode);
+	    wide_int wi = wide_int::from_double_int
+	      (mem_ref_offset (exp), address_mode);
+	    rtx off = immed_wide_int_const (wi, address_mode);
 	    op0 = simplify_gen_binary (PLUS, address_mode, op0, off);
 	  }
 	op0 = memory_address_addr_space (mode, op0, as);
@@ -10507,9 +10517,10 @@ reduce_to_bit_field_precision (rtx exp, rtx target, tree type)
     }
   else if (TYPE_UNSIGNED (type))
     {
-      rtx mask = immed_double_int_const (double_int::mask (prec),
-					 GET_MODE (exp));
-      return expand_and (GET_MODE (exp), exp, mask, target);
+      enum machine_mode mode = GET_MODE (exp);
+      rtx mask = immed_wide_int_const 
+	(wide_int::mask (prec, false, mode), mode);
+      return expand_and (mode, exp, mask, target);
     }
   else
     {
@@ -11081,8 +11092,9 @@ const_vector_from_tree (tree exp)
 	RTVEC_ELT (v, i) = CONST_FIXED_FROM_FIXED_VALUE (TREE_FIXED_CST (elt),
 							 inner);
       else
-	RTVEC_ELT (v, i) = immed_double_int_const (tree_to_double_int (elt),
-						   inner);
+	RTVEC_ELT (v, i) 
+	  = immed_wide_int_const (wide_int::from_tree (elt),
+				  TYPE_MODE (TREE_TYPE (elt)));
     }
 
   return gen_rtx_CONST_VECTOR (mode, v);
diff --git a/gcc/final.c b/gcc/final.c
index d25b8e0..ae44b00 100644
--- a/gcc/final.c
+++ b/gcc/final.c
@@ -3799,8 +3799,16 @@ output_addr_const (FILE *file, rtx x)
       output_addr_const (file, XEXP (x, 0));
       break;
 
+    case CONST_WIDE_INT:
+      /* Only two-element constants are handled for now.  */
+      gcc_assert (CONST_WIDE_INT_NUNITS (x) == 2);
+      fprintf (file, HOST_WIDE_INT_PRINT_DOUBLE_HEX,
+	       (unsigned HOST_WIDE_INT) CONST_WIDE_INT_ELT (x, 1),
+	       (unsigned HOST_WIDE_INT) CONST_WIDE_INT_ELT (x, 0));
+      break;
+
     case CONST_DOUBLE:
-      if (GET_MODE (x) == VOIDmode)
+      if (CONST_DOUBLE_AS_INT_P (x))
 	{
 	  /* We can use %d if the number is one word and positive.  */
 	  if (CONST_DOUBLE_HIGH (x))
diff --git a/gcc/genemit.c b/gcc/genemit.c
index 692ef52..7b1e471 100644
--- a/gcc/genemit.c
+++ b/gcc/genemit.c
@@ -204,6 +204,7 @@ gen_exp (rtx x, enum rtx_code subroutine_type, char *used)
 
     case CONST_DOUBLE:
     case CONST_FIXED:
+    case CONST_WIDE_INT:
       /* These shouldn't be written in MD files.  Instead, the appropriate
 	 routines in varasm.c should be called.  */
       gcc_unreachable ();
diff --git a/gcc/gengenrtl.c b/gcc/gengenrtl.c
index 5b5a3ca..1f93dd5 100644
--- a/gcc/gengenrtl.c
+++ b/gcc/gengenrtl.c
@@ -142,6 +142,7 @@ static int
 excluded_rtx (int idx)
 {
   return ((strcmp (defs[idx].enumname, "CONST_DOUBLE") == 0)
+	  || (strcmp (defs[idx].enumname, "CONST_WIDE_INT") == 0)
 	  || (strcmp (defs[idx].enumname, "CONST_FIXED") == 0));
 }
 
diff --git a/gcc/gengtype.c b/gcc/gengtype.c
index a2eebf2..ff6b125 100644
--- a/gcc/gengtype.c
+++ b/gcc/gengtype.c
@@ -5440,6 +5440,7 @@ main (int argc, char **argv)
       POS_HERE (do_scalar_typedef ("REAL_VALUE_TYPE", &pos));
       POS_HERE (do_scalar_typedef ("FIXED_VALUE_TYPE", &pos));
       POS_HERE (do_scalar_typedef ("double_int", &pos));
+      POS_HERE (do_scalar_typedef ("wide_int", &pos));
       POS_HERE (do_scalar_typedef ("uint64_t", &pos));
       POS_HERE (do_scalar_typedef ("uint8", &pos));
       POS_HERE (do_scalar_typedef ("uintptr_t", &pos));
diff --git a/gcc/genpreds.c b/gcc/genpreds.c
index 09fc87b..e8a25bc 100644
--- a/gcc/genpreds.c
+++ b/gcc/genpreds.c
@@ -612,7 +612,7 @@ write_one_predicate_function (struct pred_data *p)
   add_mode_tests (p);
 
   /* A normal predicate can legitimately not look at enum machine_mode
-     if it accepts only CONST_INTs and/or CONST_DOUBLEs.  */
+     if it accepts only CONST_INTs, CONST_WIDE_INTs and/or CONST_DOUBLEs.  */
   printf ("int\n%s (rtx op, enum machine_mode mode ATTRIBUTE_UNUSED)\n{\n",
 	  p->name);
   write_predicate_stmts (p->exp);
@@ -809,8 +809,11 @@ add_constraint (const char *name, const char *regclass,
   if (is_const_int || is_const_dbl)
     {
       enum rtx_code appropriate_code
+#if TARGET_SUPPORTS_WIDE_INT
+	= is_const_int ? CONST_INT : CONST_WIDE_INT;
+#else
 	= is_const_int ? CONST_INT : CONST_DOUBLE;
-
+#endif
       /* Consider relaxing this requirement in the future.  */
       if (regclass
 	  || GET_CODE (exp) != AND
@@ -1074,12 +1077,17 @@ write_tm_constrs_h (void)
 	if (needs_ival)
 	  puts ("  if (CONST_INT_P (op))\n"
 		"    ival = INTVAL (op);");
+#if TARGET_SUPPORTS_WIDE_INT
+	if (needs_lval || needs_hval)
+	  error ("lval and hval require TARGET_SUPPORTS_WIDE_INT == 0");
+#else
 	if (needs_hval)
 	  puts ("  if (GET_CODE (op) == CONST_DOUBLE && mode == VOIDmode)"
 		"    hval = CONST_DOUBLE_HIGH (op);");
 	if (needs_lval)
 	  puts ("  if (GET_CODE (op) == CONST_DOUBLE && mode == VOIDmode)"
 		"    lval = CONST_DOUBLE_LOW (op);");
+#endif
 	if (needs_rval)
 	  puts ("  if (GET_CODE (op) == CONST_DOUBLE && mode != VOIDmode)"
 		"    rval = CONST_DOUBLE_REAL_VALUE (op);");
diff --git a/gcc/gensupport.c b/gcc/gensupport.c
index 9b9a03e..638e051 100644
--- a/gcc/gensupport.c
+++ b/gcc/gensupport.c
@@ -2775,7 +2775,13 @@ static const struct std_pred_table std_preds[] = {
   {"scratch_operand", false, false, {SCRATCH, REG}},
   {"immediate_operand", false, true, {UNKNOWN}},
   {"const_int_operand", false, false, {CONST_INT}},
+#if TARGET_SUPPORTS_WIDE_INT
+  {"const_wide_int_operand", false, false, {CONST_WIDE_INT}},
+  {"const_scalar_int_operand", false, false, {CONST_INT, CONST_WIDE_INT}},
+  {"const_double_operand", false, false, {CONST_DOUBLE}},
+#else
   {"const_double_operand", false, false, {CONST_INT, CONST_DOUBLE}},
+#endif
   {"nonimmediate_operand", false, false, {SUBREG, REG, MEM}},
   {"nonmemory_operand", false, true, {SUBREG, REG}},
   {"push_operand", false, false, {MEM}},
diff --git a/gcc/optabs.c b/gcc/optabs.c
index c1dacf4..c877800 100644
--- a/gcc/optabs.c
+++ b/gcc/optabs.c
@@ -850,7 +850,8 @@ expand_subword_shift (enum machine_mode op1_mode, optab binoptab,
   if (CONSTANT_P (op1) || shift_mask >= BITS_PER_WORD)
     {
       carries = outof_input;
-      tmp = immed_double_const (BITS_PER_WORD, 0, op1_mode);
+      tmp = immed_wide_int_const (wide_int::from_shwi (BITS_PER_WORD,
+						       op1_mode), op1_mode);
       tmp = simplify_expand_binop (op1_mode, sub_optab, tmp, op1,
 				   0, true, methods);
     }
@@ -865,13 +866,14 @@ expand_subword_shift (enum machine_mode op1_mode, optab binoptab,
 			      outof_input, const1_rtx, 0, unsignedp, methods);
       if (shift_mask == BITS_PER_WORD - 1)
 	{
-	  tmp = immed_double_const (-1, -1, op1_mode);
+	  tmp = immed_wide_int_const (wide_int::minus_one (op1_mode), op1_mode);
 	  tmp = simplify_expand_binop (op1_mode, xor_optab, op1, tmp,
 				       0, true, methods);
 	}
       else
 	{
-	  tmp = immed_double_const (BITS_PER_WORD - 1, 0, op1_mode);
+	  tmp = immed_wide_int_const (wide_int::from_shwi (BITS_PER_WORD - 1,
+							   op1_mode), op1_mode);
 	  tmp = simplify_expand_binop (op1_mode, sub_optab, tmp, op1,
 				       0, true, methods);
 	}
@@ -1034,7 +1036,8 @@ expand_doubleword_shift (enum machine_mode op1_mode, optab binoptab,
      is true when the effective shift value is less than BITS_PER_WORD.
      Set SUPERWORD_OP1 to the shift count that should be used to shift
      OUTOF_INPUT into INTO_TARGET when the condition is false.  */
-  tmp = immed_double_const (BITS_PER_WORD, 0, op1_mode);
+  tmp = immed_wide_int_const (wide_int::from_shwi (BITS_PER_WORD, op1_mode),
+			      op1_mode);
   if (!CONSTANT_P (op1) && shift_mask == BITS_PER_WORD - 1)
     {
       /* Set CMP1 to OP1 & BITS_PER_WORD.  The result is zero iff OP1
@@ -2884,7 +2887,7 @@ expand_absneg_bit (enum rtx_code code, enum machine_mode mode,
   const struct real_format *fmt;
   int bitpos, word, nwords, i;
   enum machine_mode imode;
-  double_int mask;
+  wide_int mask;
   rtx temp, insns;
 
   /* The format has to have a simple sign bit.  */
@@ -2920,7 +2923,7 @@ expand_absneg_bit (enum rtx_code code, enum machine_mode mode,
       nwords = (GET_MODE_BITSIZE (mode) + BITS_PER_WORD - 1) / BITS_PER_WORD;
     }
 
-  mask = double_int_zero.set_bit (bitpos);
+  mask = wide_int::set_bit_in_zero (bitpos, imode);
   if (code == ABS)
     mask = ~mask;
 
@@ -2942,7 +2945,7 @@ expand_absneg_bit (enum rtx_code code, enum machine_mode mode,
 	    {
 	      temp = expand_binop (imode, code == ABS ? and_optab : xor_optab,
 				   op0_piece,
-				   immed_double_int_const (mask, imode),
+				   immed_wide_int_const (mask, imode),
 				   targ_piece, 1, OPTAB_LIB_WIDEN);
 	      if (temp != targ_piece)
 		emit_move_insn (targ_piece, temp);
@@ -2960,7 +2963,7 @@ expand_absneg_bit (enum rtx_code code, enum machine_mode mode,
     {
       temp = expand_binop (imode, code == ABS ? and_optab : xor_optab,
 			   gen_lowpart (imode, op0),
-			   immed_double_int_const (mask, imode),
+			   immed_wide_int_const (mask, imode),
 		           gen_lowpart (imode, target), 1, OPTAB_LIB_WIDEN);
       target = lowpart_subreg_maybe_copy (mode, temp, imode);
 
@@ -3559,7 +3562,7 @@ expand_copysign_absneg (enum machine_mode mode, rtx op0, rtx op1, rtx target,
     }
   else
     {
-      double_int mask;
+      wide_int mask;
 
       if (GET_MODE_SIZE (mode) <= UNITS_PER_WORD)
 	{
@@ -3581,10 +3584,9 @@ expand_copysign_absneg (enum machine_mode mode, rtx op0, rtx op1, rtx target,
 	  op1 = operand_subword_force (op1, word, mode);
 	}
 
-      mask = double_int_zero.set_bit (bitpos);
-
+      mask = wide_int::set_bit_in_zero (bitpos, imode);
       sign = expand_binop (imode, and_optab, op1,
-			   immed_double_int_const (mask, imode),
+			   immed_wide_int_const (mask, imode),
 			   NULL_RTX, 1, OPTAB_LIB_WIDEN);
     }
 
@@ -3628,7 +3630,7 @@ expand_copysign_bit (enum machine_mode mode, rtx op0, rtx op1, rtx target,
 		     int bitpos, bool op0_is_abs)
 {
   enum machine_mode imode;
-  double_int mask;
+  wide_int mask, nmask;
   int word, nwords, i;
   rtx temp, insns;
 
@@ -3652,7 +3654,7 @@ expand_copysign_bit (enum machine_mode mode, rtx op0, rtx op1, rtx target,
       nwords = (GET_MODE_BITSIZE (mode) + BITS_PER_WORD - 1) / BITS_PER_WORD;
     }
 
-  mask = double_int_zero.set_bit (bitpos);
+  mask = wide_int::set_bit_in_zero (bitpos, imode);
 
   if (target == 0
       || target == op0
@@ -3672,14 +3674,16 @@ expand_copysign_bit (enum machine_mode mode, rtx op0, rtx op1, rtx target,
 	  if (i == word)
 	    {
 	      if (!op0_is_abs)
-		op0_piece
-		  = expand_binop (imode, and_optab, op0_piece,
-				  immed_double_int_const (~mask, imode),
-				  NULL_RTX, 1, OPTAB_LIB_WIDEN);
-
+		{
+		  nmask = ~mask;
+  		  op0_piece
+		    = expand_binop (imode, and_optab, op0_piece,
+				    immed_wide_int_const (nmask, imode),
+				    NULL_RTX, 1, OPTAB_LIB_WIDEN);
+		}
 	      op1 = expand_binop (imode, and_optab,
 				  operand_subword_force (op1, i, mode),
-				  immed_double_int_const (mask, imode),
+				  immed_wide_int_const (mask, imode),
 				  NULL_RTX, 1, OPTAB_LIB_WIDEN);
 
 	      temp = expand_binop (imode, ior_optab, op0_piece, op1,
@@ -3699,15 +3703,17 @@ expand_copysign_bit (enum machine_mode mode, rtx op0, rtx op1, rtx target,
   else
     {
       op1 = expand_binop (imode, and_optab, gen_lowpart (imode, op1),
-		          immed_double_int_const (mask, imode),
+		          immed_wide_int_const (mask, imode),
 		          NULL_RTX, 1, OPTAB_LIB_WIDEN);
 
       op0 = gen_lowpart (imode, op0);
       if (!op0_is_abs)
-	op0 = expand_binop (imode, and_optab, op0,
-			    immed_double_int_const (~mask, imode),
-			    NULL_RTX, 1, OPTAB_LIB_WIDEN);
-
+	{
+	  nmask = ~mask;
+	  op0 = expand_binop (imode, and_optab, op0,
+			      immed_wide_int_const (nmask, imode),
+			      NULL_RTX, 1, OPTAB_LIB_WIDEN);
+	}
       temp = expand_binop (imode, ior_optab, op0, op1,
 			   gen_lowpart (imode, target), 1, OPTAB_LIB_WIDEN);
       target = lowpart_subreg_maybe_copy (mode, temp, imode);
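A standalone illustration of what the sign-bit mask in expand_absneg_bit
amounts to, outside of GCC.  This is only a sketch: it assumes an IEEE 754
binary64 double whose sign is bit 63, and apply_sign_mask is a hypothetical
helper name, not something in the patch.  ABS clears the bit by ANDing with
~mask; NEG flips it by XORing with mask.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Return |X| if IS_ABS is nonzero, otherwise -X, by operating on the
   sign bit only.  Assumption: double is IEEE 754 binary64, so the
   sign lives in bit 63 of the 64-bit image.  */
double
apply_sign_mask (double x, int is_abs)
{
  uint64_t bits;
  uint64_t mask = UINT64_C (1) << 63;	/* Only the sign bit set.  */

  assert (sizeof (double) == sizeof (uint64_t));
  memcpy (&bits, &x, sizeof bits);
  if (is_abs)
    bits &= ~mask;			/* ABS: AND with the inverted mask.  */
  else
    bits ^= mask;			/* NEG: XOR with the mask.  */
  memcpy (&x, &bits, sizeof x);
  return x;
}
```

The patch builds the same mask with wide_int::set_bit_in_zero so that it
also works when imode is wider than two HOST_WIDE_INTs.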
diff --git a/gcc/postreload.c b/gcc/postreload.c
index daabaa1..34e8e61 100644
--- a/gcc/postreload.c
+++ b/gcc/postreload.c
@@ -295,27 +295,25 @@ reload_cse_simplify_set (rtx set, rtx insn)
 #ifdef LOAD_EXTEND_OP
 	  if (extend_op != UNKNOWN)
 	    {
-	      HOST_WIDE_INT this_val;
+	      wide_int result;
 
-	      /* ??? I'm lazy and don't wish to handle CONST_DOUBLE.  Other
-		 constants, such as SYMBOL_REF, cannot be extended.  */
-	      if (!CONST_INT_P (this_rtx))
+	      if (!CONST_SCALAR_INT_P (this_rtx))
 		continue;
 
-	      this_val = INTVAL (this_rtx);
 	      switch (extend_op)
 		{
 		case ZERO_EXTEND:
-		  this_val &= GET_MODE_MASK (GET_MODE (src));
+		  result = (wide_int::from_rtx (this_rtx, GET_MODE (src))
+			    .zext (word_mode));
 		  break;
 		case SIGN_EXTEND:
-		  /* ??? In theory we're already extended.  */
-		  if (this_val == trunc_int_for_mode (this_val, GET_MODE (src)))
-		    break;
+		  result = (wide_int::from_rtx (this_rtx, GET_MODE (src))
+			    .sext (word_mode));
+		  break;
 		default:
 		  gcc_unreachable ();
 		}
-	      this_rtx = GEN_INT (this_val);
+	      this_rtx = immed_wide_int_const (result, word_mode);
 	    }
 #endif
 	  this_cost = set_src_cost (this_rtx, speed);
diff --git a/gcc/print-rtl.c b/gcc/print-rtl.c
index 3793109..1f43de1 100644
--- a/gcc/print-rtl.c
+++ b/gcc/print-rtl.c
@@ -612,6 +612,12 @@ print_rtx (const_rtx in_rtx)
 	  fprintf (outfile, " [%s]", s);
 	}
       break;
+
+    case CONST_WIDE_INT:
+      if (! flag_simple)
+	fprintf (outfile, " ");
+      hwivec_output_hex (outfile, CONST_WIDE_INT_VEC (in_rtx));
+      break;
 #endif
 
     case CODE_LABEL:
diff --git a/gcc/read-rtl.c b/gcc/read-rtl.c
index cd58b1f..a73a41b 100644
--- a/gcc/read-rtl.c
+++ b/gcc/read-rtl.c
@@ -806,6 +806,29 @@ validate_const_int (const char *string)
     fatal_with_file_and_line ("invalid decimal constant \"%s\"\n", string);
 }
 
+static void
+validate_const_wide_int (const char *string)
+{
+  const char *cp;
+  int valid = 1;
+
+  cp = string;
+  while (*cp && ISSPACE (*cp))
+    cp++;
+  /* Skip the leading 0x.  */
+  if (cp[0] == '0' && cp[1] == 'x')
+    cp += 2;
+  else
+    valid = 0;
+  if (*cp == 0)
+    valid = 0;
+  for (; *cp; cp++)
+    if (! ISXDIGIT (*cp))
+      valid = 0;
+  if (!valid)
+    fatal_with_file_and_line ("invalid hex constant \"%s\"\n", string);
+}
+
 /* Record that PTR uses iterator ITERATOR.  */
 
 static void
@@ -1319,6 +1342,56 @@ read_rtx_code (const char *code_name)
 	gcc_unreachable ();
       }
 
+  if (CONST_WIDE_INT_P (return_rtx))
+    {
+      read_name (&name);
+      validate_const_wide_int (name.string);
+      {
+	hwivec hwiv;
+	const char *s = name.string;
+	int len;
+	int index = 0;
+	int gs = HOST_BITS_PER_WIDE_INT/4;
+	int pos;
+	char * buf = XALLOCAVEC (char, gs + 1);
+	unsigned HOST_WIDE_INT wi;
+	int wlen;
+
+	/* Skip the leading spaces.  */
+	while (*s && ISSPACE (*s))
+	  s++;
+
+	/* Skip the leading 0x.  */
+	gcc_assert (s[0] == '0');
+	gcc_assert (s[1] == 'x');
+	s += 2;
+
+	len = strlen (s);
+	pos = len - gs;
+	wlen = (len + gs - 1) / gs;	/* Number of words needed */
+
+	return_rtx = const_wide_int_alloc (wlen);
+
+	hwiv = CONST_WIDE_INT_VEC (return_rtx);
+	while (pos > 0)
+	  {
+#if HOST_BITS_PER_WIDE_INT == 64
+	    sscanf (s + pos, "%16" HOST_WIDE_INT_PRINT "x", &wi);
+#else
+	    sscanf (s + pos, "%8" HOST_WIDE_INT_PRINT "x", &wi);
+#endif
+	    XHWIVEC_ELT (hwiv, index++) = wi;
+	    pos -= gs;
+	  }
+	strncpy (buf, s, gs + pos);
+	buf [gs + pos] = 0;
+	sscanf (buf, "%" HOST_WIDE_INT_PRINT "x", &wi);
+	XHWIVEC_ELT (hwiv, index++) = wi;
+	/* TODO: After reading, do we want to canonicalize with:
+	   value = lookup_const_wide_int (value); ? */
+      }
+    }
+
   c = read_skip_spaces ();
   /* Syntactic sugar for AND and IOR, allowing Lisp-like
      arbitrary number of arguments for them.  */
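The chunking done by the read_rtx_code hunk above can be sketched outside
of GCC as follows.  This is a minimal illustration, not the patch's code:
it assumes 64-bit words, takes the digits after the "0x" prefix, and uses
strtoull instead of sscanf.  parse_hex_words and HEX_DIGITS_PER_WORD are
hypothetical names.  Words are filled from the right end of the string, so
element 0 is the least significant, and the final (most significant) word
may cover fewer than 16 digits.

```c
#include <assert.h>
#include <ctype.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

#define HEX_DIGITS_PER_WORD 16u	/* 64-bit words, 4 bits per digit.  */

/* Parse the hex digits in S (without a "0x" prefix) into WORDS, least
   significant word first.  Return the number of words written, or -1
   if S is empty, contains a non-hex character, or needs more than
   MAX_WORDS words.  */
int
parse_hex_words (const char *s, uint64_t *words, int max_words)
{
  char buf[HEX_DIGITS_PER_WORD + 1];
  size_t len, end, nwords, i;

  assert (s != NULL && words != NULL);
  len = strlen (s);
  end = len;
  nwords = (len + HEX_DIGITS_PER_WORD - 1) / HEX_DIGITS_PER_WORD;

  if (len == 0 || max_words < 0 || nwords > (size_t) max_words)
    return -1;
  for (i = 0; i < len; i++)
    if (!isxdigit ((unsigned char) s[i]))
      return -1;

  for (i = 0; i < nwords; i++)
    {
      /* Take up to 16 digits ending at END; the last chunk may be short.  */
      size_t n = end < HEX_DIGITS_PER_WORD ? end : HEX_DIGITS_PER_WORD;

      memcpy (buf, s + end - n, n);
      buf[n] = '\0';
      words[i] = strtoull (buf, NULL, 16);
      end -= n;
    }
  return (int) nwords;
}
```

Copying exactly the remaining digit count into the scratch buffer is what
keeps the last chunk from overrunning it.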
diff --git a/gcc/recog.c b/gcc/recog.c
index ed359f6..05e08e9 100644
--- a/gcc/recog.c
+++ b/gcc/recog.c
@@ -1141,7 +1141,7 @@ immediate_operand (rtx op, enum machine_mode mode)
 					    : mode, op));
 }
 
-/* Returns 1 if OP is an operand that is a CONST_INT.  */
+/* Returns 1 if OP is an operand that is a CONST_INT of mode MODE.  */
 
 int
 const_int_operand (rtx op, enum machine_mode mode)
@@ -1156,8 +1156,64 @@ const_int_operand (rtx op, enum machine_mode mode)
   return 1;
 }
 
+#if TARGET_SUPPORTS_WIDE_INT
+/* Returns 1 if OP is an operand that is a CONST_INT or CONST_WIDE_INT
+   of mode MODE.  */
+int
+const_scalar_int_operand (rtx op, enum machine_mode mode)
+{
+  if (!CONST_SCALAR_INT_P (op))
+    return 0;
+
+  if (CONST_INT_P (op))
+    return const_int_operand (op, mode);
+
+  if (mode != VOIDmode)
+    {
+      int prec = GET_MODE_PRECISION (mode);
+      int bitsize = GET_MODE_BITSIZE (mode);
+      
+      if (CONST_WIDE_INT_NUNITS (op) * HOST_BITS_PER_WIDE_INT > bitsize)
+	return 0;
+      
+      if (prec == bitsize)
+	return 1;
+      else
+	{
+	  /* Multiword partial int.  */
+	  HOST_WIDE_INT x 
+	    = CONST_WIDE_INT_ELT (op, CONST_WIDE_INT_NUNITS (op) - 1);
+	  return (wide_int::sext (x, prec & (HOST_BITS_PER_WIDE_INT - 1))
+		  == x);
+	}
+    }
+  return 1;
+}
+
+/* Returns 1 if OP is an operand that is a CONST_WIDE_INT of mode
+   MODE.  This most likely is not as useful as
+   const_scalar_int_operand, but is here for consistency.  */
+int
+const_wide_int_operand (rtx op, enum machine_mode mode)
+{
+  if (!CONST_WIDE_INT_P (op))
+    return 0;
+
+  return const_scalar_int_operand (op, mode);
+}
+
 /* Returns 1 if OP is an operand that is a constant integer or constant
-   floating-point number.  */
+   floating-point number of mode MODE.  */
+
+int
+const_double_operand (rtx op, enum machine_mode mode)
+{
+  return (GET_CODE (op) == CONST_DOUBLE)
+	  && (GET_MODE (op) == mode || mode == VOIDmode);
+}
+#else
+/* Returns 1 if OP is an operand that is a constant integer or constant
+   floating-point number of mode MODE.  */
 
 int
 const_double_operand (rtx op, enum machine_mode mode)
@@ -1173,8 +1229,9 @@ const_double_operand (rtx op, enum machine_mode mode)
 	  && (mode == VOIDmode || GET_MODE (op) == mode
 	      || GET_MODE (op) == VOIDmode));
 }
-
-/* Return 1 if OP is a general operand that is not an immediate operand.  */
+#endif
+/* Return 1 if OP is a general operand that is not an immediate
+   operand of mode MODE.  */
 
 int
 nonimmediate_operand (rtx op, enum machine_mode mode)
@@ -1182,7 +1239,8 @@ nonimmediate_operand (rtx op, enum machine_mode mode)
   return (general_operand (op, mode) && ! CONSTANT_P (op));
 }
 
-/* Return 1 if OP is a register reference or immediate value of mode MODE.  */
+/* Return 1 if OP is a register reference or immediate value of mode
+   MODE.  */
 
 int
 nonmemory_operand (rtx op, enum machine_mode mode)
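The "multiword partial int" test in const_scalar_int_operand above checks
that the most significant element is already sign-extended from the bits
of the mode's precision that fall in that element.  A standalone sketch of
that check, assuming 64-bit elements and two's complement conversion;
sext_hwi and top_word_fits_precision are hypothetical names, and treating
a remainder of zero bits as "whole word used" is this sketch's own choice:

```c
#include <assert.h>
#include <stdint.h>

/* Sign-extend the low BITS bits of X to 64 bits.  BITS must be in
   the range 1..64.  */
int64_t
sext_hwi (int64_t x, unsigned bits)
{
  uint64_t u, sign;

  assert (bits >= 1 && bits <= 64);
  if (bits == 64)
    return x;
  u = (uint64_t) x & ((UINT64_C (1) << bits) - 1);	/* Keep low bits.  */
  sign = UINT64_C (1) << (bits - 1);
  /* Subtracting the sign bit after flipping it extends the sign.  */
  return (int64_t) ((u ^ sign) - sign);
}

/* Return 1 if TOP, the most significant word of a multiword constant,
   is a valid top word for a mode of precision PREC bits.  */
int
top_word_fits_precision (int64_t top, unsigned prec)
{
  unsigned bits = prec % 64;

  if (bits == 0)
    return 1;			/* The whole top word is significant.  */
  return sext_hwi (top, bits) == top;
}
```

For a 72-bit precision, only the low 8 bits of the second word carry
value, so 0x7f and -1 are accepted while 0xff and 0x100 are rejected.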
diff --git a/gcc/rtl.c b/gcc/rtl.c
index bc49fc8..137da07 100644
--- a/gcc/rtl.c
+++ b/gcc/rtl.c
@@ -109,7 +109,7 @@ const enum rtx_class rtx_class[NUM_RTX_CODE] = {
 const unsigned char rtx_code_size[NUM_RTX_CODE] = {
 #define DEF_RTL_EXPR(ENUM, NAME, FORMAT, CLASS)				\
   (((ENUM) == CONST_INT || (ENUM) == CONST_DOUBLE			\
-    || (ENUM) == CONST_FIXED)						\
+    || (ENUM) == CONST_FIXED || (ENUM) == CONST_WIDE_INT)		\
    ? RTX_HDR_SIZE + (sizeof FORMAT - 1) * sizeof (HOST_WIDE_INT)	\
    : RTX_HDR_SIZE + (sizeof FORMAT - 1) * sizeof (rtunion)),
 
@@ -181,18 +181,24 @@ shallow_copy_rtvec (rtvec vec)
 unsigned int
 rtx_size (const_rtx x)
 {
+  if (CONST_WIDE_INT_P (x))
+    return (RTX_HDR_SIZE
+	    + sizeof (struct hwivec_def)
+	    + ((CONST_WIDE_INT_NUNITS (x) - 1)
+	       * sizeof (HOST_WIDE_INT)));
   if (GET_CODE (x) == SYMBOL_REF && SYMBOL_REF_HAS_BLOCK_INFO_P (x))
     return RTX_HDR_SIZE + sizeof (struct block_symbol);
   return RTX_CODE_SIZE (GET_CODE (x));
 }
 
-/* Allocate an rtx of code CODE.  The CODE is stored in the rtx;
-   all the rest is initialized to zero.  */
+/* Allocate an rtx of code CODE with EXTRA bytes in it.  The CODE is
+   stored in the rtx; all the rest is initialized to zero.  */
 
 rtx
-rtx_alloc_stat (RTX_CODE code MEM_STAT_DECL)
+rtx_alloc_stat_v (RTX_CODE code MEM_STAT_DECL, int extra)
 {
-  rtx rt = ggc_alloc_rtx_def_stat (RTX_CODE_SIZE (code) PASS_MEM_STAT);
+  rtx rt = ggc_alloc_rtx_def_stat (RTX_CODE_SIZE (code) + extra
+				   PASS_MEM_STAT);
 
   /* We want to clear everything up to the FLD array.  Normally, this
      is one int, but we don't want to assume that and it isn't very
@@ -210,6 +216,29 @@ rtx_alloc_stat (RTX_CODE code MEM_STAT_DECL)
   return rt;
 }
 
+/* Allocate an rtx of code CODE.  The CODE is stored in the rtx;
+   all the rest is initialized to zero.  */
+
+rtx
+rtx_alloc_stat (RTX_CODE code MEM_STAT_DECL)
+{
+  return rtx_alloc_stat_v (code PASS_MEM_STAT, 0);
+}
+
+/* Write the wide constant OP0 to OUTFILE.  */
+
+void
+hwivec_output_hex (FILE *outfile, const_hwivec op0)
+{
+  int i = HWI_GET_NUM_ELEM (op0);
+  gcc_assert (i > 0);
+  if (XHWIVEC_ELT (op0, i-1) == 0)
+    fprintf (outfile, "0x");
+  fprintf (outfile, HOST_WIDE_INT_PRINT_HEX, XHWIVEC_ELT (op0, --i));
+  while (--i >= 0)
+    fprintf (outfile, HOST_WIDE_INT_PRINT_PADDED_HEX, XHWIVEC_ELT (op0, i));
+}
+
 \f
 /* Return true if ORIG is a sharable CONST.  */
 
@@ -424,7 +453,6 @@ rtx_equal_p_cb (const_rtx x, const_rtx y, rtx_equal_p_callback_function cb)
 	  if (XWINT (x, i) != XWINT (y, i))
 	    return 0;
 	  break;
-
 	case 'n':
 	case 'i':
 	  if (XINT (x, i) != XINT (y, i))
@@ -642,6 +670,10 @@ iterative_hash_rtx (const_rtx x, hashval_t hash)
       return iterative_hash_object (i, hash);
     case CONST_INT:
       return iterative_hash_object (INTVAL (x), hash);
+    case CONST_WIDE_INT:
+      for (i = 0; i < CONST_WIDE_INT_NUNITS (x); i++)
+	hash = iterative_hash_object (CONST_WIDE_INT_ELT (x, i), hash);
+      return hash;
     case SYMBOL_REF:
       if (XSTR (x, 0))
 	return iterative_hash (XSTR (x, 0), strlen (XSTR (x, 0)) + 1,
@@ -807,6 +839,16 @@ rtl_check_failed_block_symbol (const char *file, int line, const char *func)
 
 /* XXX Maybe print the vector?  */
 void
+hwivec_check_failed_bounds (const_hwivec r, int n, const char *file, int line,
+			    const char *func)
+{
+  internal_error
+    ("RTL check: access of hwi elt %d of vector with last elt %d in %s, at %s:%d",
+     n, HWI_GET_NUM_ELEM (r) - 1, func, trim_filename (file), line);
+}
+
+/* XXX Maybe print the vector?  */
+void
 rtvec_check_failed_bounds (const_rtvec r, int n, const char *file, int line,
 			   const char *func)
 {
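The output format produced by hwivec_output_hex in the rtl.c changes above
can be sketched outside of GCC as below.  This is an illustration under
assumptions: 64-bit elements, output into a caller-supplied buffer rather
than a FILE, and an explicit "0x" prefix instead of relying on the '#'
printf flag.  format_hex_words is a hypothetical name.  The most
significant element is printed unpadded, and every lower element is
zero-padded to 16 digits so that the pieces concatenate into one number.

```c
#include <assert.h>
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>

/* Format WORDS[0..NWORDS-1] (least significant first) as a single hex
   number in BUF, which has room for SIZE bytes.  Return the number of
   characters written, or -1 if BUF is too small.  */
int
format_hex_words (char *buf, size_t size, const uint64_t *words, int nwords)
{
  int i = nwords - 1;
  int n, used;

  assert (nwords > 0);
  /* Most significant word first, without leading zeros.  */
  n = snprintf (buf, size, "0x%" PRIx64, words[i]);
  if (n < 0 || (size_t) n >= size)
    return -1;
  used = n;

  /* Remaining words, each padded to a full 16 hex digits.  */
  while (--i >= 0)
    {
      n = snprintf (buf + used, size - used, "%016" PRIx64, words[i]);
      if (n < 0 || (size_t) n >= size - used)
	return -1;
      used += n;
    }
  return used;
}
```

Without the padding on the lower words, { 0x1, 0xab } would print as
0xab1 and could not be read back unambiguously.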
diff --git a/gcc/rtl.def b/gcc/rtl.def
index d6c881f..8fae62f 100644
--- a/gcc/rtl.def
+++ b/gcc/rtl.def
@@ -317,6 +317,9 @@ DEF_RTL_EXPR(TRAP_IF, "trap_if", "ee", RTX_EXTRA)
 /* numeric integer constant */
 DEF_RTL_EXPR(CONST_INT, "const_int", "w", RTX_CONST_OBJ)
 
+/* numeric integer constant */
+DEF_RTL_EXPR(CONST_WIDE_INT, "const_wide_int", "", RTX_CONST_OBJ)
+
 /* fixed-point constant */
 DEF_RTL_EXPR(CONST_FIXED, "const_fixed", "www", RTX_CONST_OBJ)
 
diff --git a/gcc/rtl.h b/gcc/rtl.h
index 93a64f4..58c5902 100644
--- a/gcc/rtl.h
+++ b/gcc/rtl.h
@@ -28,6 +28,7 @@ along with GCC; see the file COPYING3.  If not see
 #include "fixed-value.h"
 #include "alias.h"
 #include "hashtab.h"
+#include "wide-int.h"
 #include "flags.h"
 
 /* Value used by some passes to "recognize" noop moves as valid
@@ -249,6 +250,14 @@ struct GTY(()) object_block {
   vec<rtx, va_gc> *anchors;
 };
 
+struct GTY((variable_size)) hwivec_def {
+  int num_elem;		/* number of elements */
+  HOST_WIDE_INT elem[1];
+};
+
+#define HWI_GET_NUM_ELEM(HWIVEC)	((HWIVEC)->num_elem)
+#define HWI_PUT_NUM_ELEM(HWIVEC, NUM)	((HWIVEC)->num_elem = (NUM))
+
 /* RTL expression ("rtx").  */
 
 struct GTY((chain_next ("RTX_NEXT (&%h)"),
@@ -343,6 +352,7 @@ struct GTY((chain_next ("RTX_NEXT (&%h)"),
     struct block_symbol block_sym;
     struct real_value rv;
     struct fixed_value fv;
+    struct hwivec_def hwiv;
   } GTY ((special ("rtx_def"), desc ("GET_CODE (&%0)"))) u;
 };
 
@@ -381,13 +391,13 @@ struct GTY((chain_next ("RTX_NEXT (&%h)"),
    for a variable number of things.  The principle use is inside
    PARALLEL expressions.  */
 
+#define NULL_RTVEC (rtvec) 0
+
 struct GTY((variable_size)) rtvec_def {
   int num_elem;		/* number of elements */
   rtx GTY ((length ("%h.num_elem"))) elem[1];
 };
 
-#define NULL_RTVEC (rtvec) 0
-
 #define GET_NUM_ELEM(RTVEC)		((RTVEC)->num_elem)
 #define PUT_NUM_ELEM(RTVEC, NUM)	((RTVEC)->num_elem = (NUM))
 
@@ -397,12 +407,38 @@ struct GTY((variable_size)) rtvec_def {
 /* Predicate yielding nonzero iff X is an rtx for a memory location.  */
 #define MEM_P(X) (GET_CODE (X) == MEM)
 
+#if TARGET_SUPPORTS_WIDE_INT
+
+/* Match CONST_*s that can represent compile-time constant integers.  */
+#define CASE_CONST_SCALAR_INT \
+   case CONST_INT: \
+   case CONST_WIDE_INT
+
+/* Match CONST_*s for which pointer equality corresponds to value 
+   equality.  */
+#define CASE_CONST_UNIQUE \
+   case CONST_INT: \
+   case CONST_WIDE_INT: \
+   case CONST_DOUBLE: \
+   case CONST_FIXED
+
+/* Match all CONST_* rtxes.  */
+#define CASE_CONST_ANY \
+   case CONST_INT: \
+   case CONST_WIDE_INT: \
+   case CONST_DOUBLE: \
+   case CONST_FIXED: \
+   case CONST_VECTOR
+
+#else
+
 /* Match CONST_*s that can represent compile-time constant integers.  */
 #define CASE_CONST_SCALAR_INT \
    case CONST_INT: \
    case CONST_DOUBLE
 
-/* Match CONST_*s for which pointer equality corresponds to value equality.  */
+/* Match CONST_*s for which pointer equality corresponds to value
+   equality.  */
 #define CASE_CONST_UNIQUE \
    case CONST_INT: \
    case CONST_DOUBLE: \
@@ -414,10 +450,17 @@ struct GTY((variable_size)) rtvec_def {
    case CONST_DOUBLE: \
    case CONST_FIXED: \
    case CONST_VECTOR
+#endif
+
+
+
 
 /* Predicate yielding nonzero iff X is an rtx for a constant integer.  */
 #define CONST_INT_P(X) (GET_CODE (X) == CONST_INT)
 
+/* Predicate yielding nonzero iff X is an rtx for a CONST_WIDE_INT.  */
+#define CONST_WIDE_INT_P(X) (GET_CODE (X) == CONST_WIDE_INT)
+
 /* Predicate yielding nonzero iff X is an rtx for a constant fixed-point.  */
 #define CONST_FIXED_P(X) (GET_CODE (X) == CONST_FIXED)
 
@@ -430,8 +473,13 @@ struct GTY((variable_size)) rtvec_def {
   (GET_CODE (X) == CONST_DOUBLE && GET_MODE (X) == VOIDmode)
 
 /* Predicate yielding true iff X is an rtx for a integer const.  */
+#if TARGET_SUPPORTS_WIDE_INT
+#define CONST_SCALAR_INT_P(X) \
+  (CONST_INT_P (X) || CONST_WIDE_INT_P (X))
+#else
 #define CONST_SCALAR_INT_P(X) \
   (CONST_INT_P (X) || CONST_DOUBLE_AS_INT_P (X))
+#endif
 
 /* Predicate yielding true iff X is an rtx for a double-int.  */
 #define CONST_DOUBLE_AS_FLOAT_P(X) \
@@ -594,6 +642,13 @@ struct GTY((variable_size)) rtvec_def {
 			       __FUNCTION__);				\
      &_rtx->u.hwint[_n]; }))
 
+#define XHWIVEC_ELT(HWIVEC, I) __extension__				\
+(*({ __typeof (HWIVEC) const _hwivec = (HWIVEC); const int _i = (I);	\
+     if (_i < 0 || _i >= HWI_GET_NUM_ELEM (_hwivec))			\
+       hwivec_check_failed_bounds (_hwivec, _i, __FILE__, __LINE__,	\
+				  __FUNCTION__);			\
+     &_hwivec->elem[_i]; }))
+
 #define XCWINT(RTX, N, C) __extension__					\
 (*({ __typeof (RTX) const _rtx = (RTX);					\
      if (GET_CODE (_rtx) != (C))					\
@@ -630,6 +685,11 @@ struct GTY((variable_size)) rtvec_def {
 				    __FUNCTION__);			\
    &_symbol->u.block_sym; })
 
+#define HWIVEC_CHECK(RTX,C) __extension__				\
+({ __typeof (RTX) const _symbol = (RTX);				\
+   RTL_CHECKC1 (_symbol, 0, C);						\
+   &_symbol->u.hwiv; })
+
 extern void rtl_check_failed_bounds (const_rtx, int, const char *, int,
 				     const char *)
     ATTRIBUTE_NORETURN;
@@ -650,6 +710,9 @@ extern void rtl_check_failed_code_mode (const_rtx, enum rtx_code, enum machine_m
     ATTRIBUTE_NORETURN;
 extern void rtl_check_failed_block_symbol (const char *, int, const char *)
     ATTRIBUTE_NORETURN;
+extern void hwivec_check_failed_bounds (const_hwivec, int, const char *, int,
+					const char *)
+    ATTRIBUTE_NORETURN;
 extern void rtvec_check_failed_bounds (const_rtvec, int, const char *, int,
 				       const char *)
     ATTRIBUTE_NORETURN;
@@ -662,12 +725,14 @@ extern void rtvec_check_failed_bounds (const_rtvec, int, const char *, int,
 #define RTL_CHECKC2(RTX, N, C1, C2) ((RTX)->u.fld[N])
 #define RTVEC_ELT(RTVEC, I)	    ((RTVEC)->elem[I])
 #define XWINT(RTX, N)		    ((RTX)->u.hwint[N])
+#define XHWIVEC_ELT(HWIVEC, I)	    ((HWIVEC)->elem[I])
 #define XCWINT(RTX, N, C)	    ((RTX)->u.hwint[N])
 #define XCMWINT(RTX, N, C, M)	    ((RTX)->u.hwint[N])
 #define XCNMWINT(RTX, N, C, M)	    ((RTX)->u.hwint[N])
 #define XCNMPRV(RTX, C, M)	    (&(RTX)->u.rv)
 #define XCNMPFV(RTX, C, M)	    (&(RTX)->u.fv)
 #define BLOCK_SYMBOL_CHECK(RTX)	    (&(RTX)->u.block_sym)
+#define HWIVEC_CHECK(RTX,C)	    (&(RTX)->u.hwiv)
 
 #endif
 
@@ -810,8 +875,8 @@ extern void rtl_check_failed_flag (const char *, const_rtx, const char *,
 #define XCCFI(RTX, N, C)      (RTL_CHECKC1 (RTX, N, C).rt_cfi)
 #define XCCSELIB(RTX, N, C)   (RTL_CHECKC1 (RTX, N, C).rt_cselib)
 
-#define XCVECEXP(RTX, N, M, C)	RTVEC_ELT (XCVEC (RTX, N, C), M)
-#define XCVECLEN(RTX, N, C)	GET_NUM_ELEM (XCVEC (RTX, N, C))
+#define XCVECEXP(RTX, N, M, C) RTVEC_ELT (XCVEC (RTX, N, C), M)
+#define XCVECLEN(RTX, N, C)    GET_NUM_ELEM (XCVEC (RTX, N, C))
 
 #define XC2EXP(RTX, N, C1, C2)      (RTL_CHECKC2 (RTX, N, C1, C2).rt_rtx)
 \f
@@ -1153,9 +1218,19 @@ rhs_regno (const_rtx x)
 #define INTVAL(RTX) XCWINT(RTX, 0, CONST_INT)
 #define UINTVAL(RTX) ((unsigned HOST_WIDE_INT) INTVAL (RTX))
 
+/* For a CONST_WIDE_INT, CONST_WIDE_INT_NUNITS is the number of
+   elements actually needed to represent the constant.
+   CONST_WIDE_INT_ELT gets one of the elements.  0 is the least
+   significant HOST_WIDE_INT.  */
+#define CONST_WIDE_INT_VEC(RTX) HWIVEC_CHECK (RTX, CONST_WIDE_INT)
+#define CONST_WIDE_INT_NUNITS(RTX) HWI_GET_NUM_ELEM (CONST_WIDE_INT_VEC (RTX))
+#define CONST_WIDE_INT_ELT(RTX, N) XHWIVEC_ELT (CONST_WIDE_INT_VEC (RTX), N)
+
 /* For a CONST_DOUBLE:
+   When TARGET_SUPPORTS_WIDE_INT == 0 (and only then):
    For a VOIDmode, there are two integers CONST_DOUBLE_LOW is the
      low-order word and ..._HIGH the high-order.
+   In all configurations:
    For a float, there is a REAL_VALUE_TYPE structure, and
      CONST_DOUBLE_REAL_VALUE(r) is a pointer to it.  */
 #define CONST_DOUBLE_LOW(r) XCMWINT (r, 0, CONST_DOUBLE, VOIDmode)
@@ -1760,6 +1835,12 @@ extern rtx plus_constant (enum machine_mode, rtx, HOST_WIDE_INT);
 /* In rtl.c */
 extern rtx rtx_alloc_stat (RTX_CODE MEM_STAT_DECL);
 #define rtx_alloc(c) rtx_alloc_stat (c MEM_STAT_INFO)
+extern rtx rtx_alloc_stat_v (RTX_CODE MEM_STAT_DECL, int);
+#define rtx_alloc_v(c, SZ) rtx_alloc_stat_v (c MEM_STAT_INFO, SZ)
+#define const_wide_int_alloc(NWORDS)				\
+  rtx_alloc_v (CONST_WIDE_INT,					\
+	       (sizeof (struct hwivec_def)			\
+		+ ((NWORDS) - 1) * sizeof (HOST_WIDE_INT)))
 
 extern rtvec rtvec_alloc (int);
 extern rtvec shallow_copy_rtvec (rtvec);
@@ -1816,10 +1897,17 @@ extern void start_sequence (void);
 extern void push_to_sequence (rtx);
 extern void push_to_sequence2 (rtx, rtx);
 extern void end_sequence (void);
+#if TARGET_SUPPORTS_WIDE_INT == 0
 extern double_int rtx_to_double_int (const_rtx);
-extern rtx immed_double_int_const (double_int, enum machine_mode);
+#endif
+extern void hwivec_output_hex (FILE *, const_hwivec);
+#ifndef GENERATOR_FILE
+extern rtx immed_wide_int_const (const wide_int &cst, enum machine_mode mode);
+#endif
+#if TARGET_SUPPORTS_WIDE_INT == 0
 extern rtx immed_double_const (HOST_WIDE_INT, HOST_WIDE_INT,
 			       enum machine_mode);
+#endif
 
 /* In loop-iv.c  */
 
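A note on the rtl.h changes above: const_wide_int_alloc sizes the rtx as one hwivec_def plus NWORDS - 1 extra HOST_WIDE_INTs, and the checking variant of XHWIVEC_ELT tests the index against the stored element count. The standalone sketch below illustrates that pattern in plain C. It is not GCC code: hwi_t, demo_hwivec and the demo_* helpers are hypothetical names, and the struct layout (a count followed by a one-element trailing array) is only an assumption based on how the macros above use HWI_GET_NUM_ELEM and elem.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-in for HOST_WIDE_INT.  */
typedef long long hwi_t;

/* Assumed layout: an element count followed by a one-element trailing
   array that is over-allocated to hold the remaining elements.  */
struct demo_hwivec
{
  int num_elem;
  hwi_t elem[1];
};

/* Bytes needed for NWORDS elements.  This mirrors the size expression
   in const_wide_int_alloc: one struct plus NWORDS - 1 extra elements.  */
static size_t
demo_hwivec_size (int nwords)
{
  assert (nwords >= 1);
  return sizeof (struct demo_hwivec) + (size_t) (nwords - 1) * sizeof (hwi_t);
}

/* Allocate a zeroed vector of NWORDS elements, or return NULL.  */
static struct demo_hwivec *
demo_hwivec_alloc (int nwords)
{
  struct demo_hwivec *v = calloc (1, demo_hwivec_size (nwords));
  if (v != NULL)
    v->num_elem = nwords;
  return v;
}

/* Bounds-checked element access, in the spirit of the checking
   XHWIVEC_ELT; an out-of-range index aborts via assert.  */
static hwi_t *
demo_hwivec_elt (struct demo_hwivec *v, int i)
{
  assert (i >= 0 && i < v->num_elem);
  return &v->elem[i];
}
```

With this layout a one-element vector needs exactly sizeof (struct demo_hwivec) bytes, which is why the macro multiplies by NWORDS - 1 rather than NWORDS.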
diff --git a/gcc/rtlanal.c b/gcc/rtlanal.c
index b198685..0fe1d0e 100644
--- a/gcc/rtlanal.c
+++ b/gcc/rtlanal.c
@@ -3091,6 +3091,8 @@ commutative_operand_precedence (rtx op)
   /* Constants always come the second operand.  Prefer "nice" constants.  */
   if (code == CONST_INT)
     return -8;
+  if (code == CONST_WIDE_INT)
+    return -8;
   if (code == CONST_DOUBLE)
     return -7;
   if (code == CONST_FIXED)
@@ -3103,6 +3105,8 @@ commutative_operand_precedence (rtx op)
     case RTX_CONST_OBJ:
       if (code == CONST_INT)
         return -6;
+      if (code == CONST_WIDE_INT)
+        return -6;
       if (code == CONST_DOUBLE)
         return -5;
       if (code == CONST_FIXED)
@@ -5289,7 +5293,10 @@ get_address_mode (rtx mem)
 /* Split up a CONST_DOUBLE or integer constant rtx
    into two rtx's for single words,
    storing in *FIRST the word that comes first in memory in the target
-   and in *SECOND the other.  */
+   and in *SECOND the other.
+
+   TODO: This function needs to be rewritten to work on any size
+   integer.  */
 
 void
 split_double (rtx value, rtx *first, rtx *second)
@@ -5366,6 +5373,22 @@ split_double (rtx value, rtx *first, rtx *second)
 	    }
 	}
     }
+  else if (GET_CODE (value) == CONST_WIDE_INT)
+    {
+      /* This only handles constants of exactly two HOST_WIDE_INTs and
+	 needs to be rewritten to work with integers of any size.  */
+      gcc_assert (CONST_WIDE_INT_NUNITS (value) == 2);
+      if (WORDS_BIG_ENDIAN)
+	{
+	  *first = GEN_INT (CONST_WIDE_INT_ELT (value, 1));
+	  *second = GEN_INT (CONST_WIDE_INT_ELT (value, 0));
+	}
+      else
+	{
+	  *first = GEN_INT (CONST_WIDE_INT_ELT (value, 0));
+	  *second = GEN_INT (CONST_WIDE_INT_ELT (value, 1));
+	}
+    }
   else if (!CONST_DOUBLE_P (value))
     {
       if (WORDS_BIG_ENDIAN)
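A note on the split_double change above: the new CONST_WIDE_INT branch only accepts two-element constants and picks the element order from WORDS_BIG_ENDIAN, with element 0 being the least significant. The standalone sketch below shows the same selection logic on a plain array. It is an illustration only: hwi_t and demo_split_words are hypothetical names, and the endianness is passed as a parameter here instead of coming from a target macro.

```c
#include <assert.h>

/* Hypothetical stand-in for HOST_WIDE_INT.  */
typedef long long hwi_t;

/* Split a two-element constant into the word that comes first in
   target memory (*FIRST) and the other word (*SECOND).  ELTS[0] is
   the least significant element, as in CONST_WIDE_INT_ELT.  */
static void
demo_split_words (const hwi_t *elts, int nunits, int words_big_endian,
                  hwi_t *first, hwi_t *second)
{
  /* Like the gcc_assert in the patch: only two elements are handled.  */
  assert (nunits == 2);

  if (words_big_endian)
    {
      *first = elts[1];   /* Most significant word comes first.  */
      *second = elts[0];
    }
  else
    {
      *first = elts[0];   /* Least significant word comes first.  */
      *second = elts[1];
    }
}
```

A version that works for any size would presumably have to return more than two words, which is what the TODO in the function comment is about.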
diff --git a/gcc/sched-vis.c b/gcc/sched-vis.c
index 98de37e..514b0d8 100644
--- a/gcc/sched-vis.c
+++ b/gcc/sched-vis.c
@@ -429,6 +429,23 @@ print_value (pretty_printer *pp, const_rtx x, int verbose)
       pp_scalar (pp, HOST_WIDE_INT_PRINT_HEX,
 		 (unsigned HOST_WIDE_INT) INTVAL (x));
       break;
+
+    case CONST_WIDE_INT:
+      {
+	const char *sep = "<";
+	int i;
+	for (i = CONST_WIDE_INT_NUNITS (x) - 1; i >= 0; i--)
+	  {
+	    pp_string (pp, sep);
+	    sep = ",";
+	    sprintf (tmp, HOST_WIDE_INT_PRINT_HEX,
+		     (unsigned HOST_WIDE_INT) CONST_WIDE_INT_ELT (x, i));
+	    pp_string (pp, tmp);
+	  }
+        pp_greater (pp);
+      }
+      break;
+
     case CONST_DOUBLE:
       if (FLOAT_MODE_P (GET_MODE (x)))
 	{
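A note on the print_value change above: the new case prints the elements from most significant to least significant, using "<" before the first element, "," between elements, and a closing ">". The standalone sketch below reproduces that separator idiom with snprintf. It is an illustration only: demo_format_wide is a hypothetical helper, unsigned long long stands in for unsigned HOST_WIDE_INT, and "%#llx" is assumed to be close to what HOST_WIDE_INT_PRINT_HEX expands to.

```c
#include <assert.h>
#include <stdio.h>

/* Format NUNITS elements, most significant first, as "<hi,...,lo>"
   into BUF.  ELTS[0] is the least significant element.  */
static void
demo_format_wide (char *buf, size_t buflen,
                  const unsigned long long *elts, int nunits)
{
  const char *sep = "<";
  size_t used = 0;

  assert (buflen > 0);
  buf[0] = '\0';
  for (int i = nunits - 1; i >= 0; i--)
    {
      int n = snprintf (buf + used, buflen - used, "%s%#llx", sep, elts[i]);
      if (n < 0 || (size_t) n >= buflen - used)
        return;  /* Out of room: keep what fits.  */
      used += (size_t) n;
      sep = ",";  /* Every element after the first gets a comma.  */
    }
  snprintf (buf + used, buflen - used, ">");
}
```

Swapping the separator after the first iteration avoids a special case for the first or last element inside the loop.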
diff --git a/gcc/sel-sched-ir.c b/gcc/sel-sched-ir.c
index 39dc52f..2499eaa 100644
--- a/gcc/sel-sched-ir.c
+++ b/gcc/sel-sched-ir.c
@@ -1138,10 +1138,10 @@ lhs_and_rhs_separable_p (rtx lhs, rtx rhs)
   if (lhs == NULL || rhs == NULL)
     return false;
 
-  /* Do not schedule CONST, CONST_INT and CONST_DOUBLE etc as rhs: no point
-     to use reg, if const can be used.  Moreover, scheduling const as rhs may
-     lead to mode mismatch cause consts don't have modes but they could be
-     merged from branches where the same const used in different modes.  */
+  /* Do not schedule constants as rhs: there is no point in using a
+     reg if a const can be used.  Moreover, scheduling a const as rhs
+     may lead to a mode mismatch, because consts don't have modes but
+     could be merged from branches that use them in different modes.  */
   if (CONSTANT_P (rhs))
     return false;
 
diff --git a/gcc/simplify-rtx.c b/gcc/simplify-rtx.c
index 3f04b8b..4a03299 100644
--- a/gcc/simplify-rtx.c
+++ b/gcc/simplify-rtx.c
@@ -86,6 +86,22 @@ mode_signbit_p (enum machine_mode mode, const_rtx x)
   if (width <= HOST_BITS_PER_WIDE_INT
       && CONST_INT_P (x))
     val = INTVAL (x);
+#if TARGET_SUPPORTS_WIDE_INT
+  else if (CONST_WIDE_INT_P (x))
+    {
+      unsigned int i;
+      unsigned int elts = CONST_WIDE_INT_NUNITS (x);
+      if (elts != (width + HOST_BITS_PER_WIDE_INT - 1) / HOST_BITS_PER_WIDE_INT)
+	return false;
+      for (i = 0; i < elts - 1; i++)
+	if (CONST_WIDE_INT_ELT (x, i) != 0)
+	  return false;
+      val = CONST_WIDE_INT_ELT (x, elts - 1);
+      width %= HOST_BITS_PER_WIDE_INT;
+      if (width == 0)
+	width = HOST_BITS_PER_WIDE_INT;
+    }
+#else
   else if (width <= HOST_BITS_PER_DOUBLE_INT
 	   && CONST_DOUBLE_AS_INT_P (x)
 	   && CONST_DOUBLE_LOW (x) == 0)
@@ -93,8 +109,9 @@ mode_signbit_p (enum machine_mode mode, const_rtx x)
       val = CONST_DOUBLE_HIGH (x);
       width -= HOST_BITS_PER_WIDE_INT;
     }
+#endif
   else
-    /* FIXME: We don't yet have a representation for wider modes.  */
+    /* X is not an integer constant.  */
     return false;
 
   if (width < HOST_BITS_PER_WIDE_INT)
@@ -1487,7 +1504,6 @@ simplify_const_unary_operation (enum rtx_code code, enum machine_mode mode,
 				rtx op, enum machine_mode op_mode)
 {
   unsigned int width = GET_MODE_PRECISION (mode);
-  unsigned int op_width = GET_MODE_PRECISION (op_mode);
 
   if (code == VEC_DUPLICATE)
     {
@@ -1561,8 +1577,19 @@ simplify_const_unary_operation (enum rtx_code code, enum machine_mode mode,
       if (CONST_INT_P (op))
 	lv = INTVAL (op), hv = HWI_SIGN_EXTEND (lv);
       else
+#if TARGET_SUPPORTS_WIDE_INT
+	{
+	  /* The conversion code to floats really wants exactly 2 HWIs.
+	     This needs to be fixed.  For now, if the constant is
+	     really big, just return 0, which is safe.  */
+	  if (CONST_WIDE_INT_NUNITS (op) > 2)
+	    return 0;
+	  lv = CONST_WIDE_INT_ELT (op, 0);
+	  hv = CONST_WIDE_INT_ELT (op, 1);
+	}
+#else
 	lv = CONST_DOUBLE_LOW (op),  hv = CONST_DOUBLE_HIGH (op);
-
+#endif
       REAL_VALUE_FROM_INT (d, lv, hv, mode);
       d = real_value_truncate (mode, d);
       return CONST_DOUBLE_FROM_REAL_VALUE (d, mode);
@@ -1575,8 +1602,19 @@ simplify_const_unary_operation (enum rtx_code code, enum machine_mode mode,
       if (CONST_INT_P (op))
 	lv = INTVAL (op), hv = HWI_SIGN_EXTEND (lv);
       else
+#if TARGET_SUPPORTS_WIDE_INT
+	{
+	  /* The conversion code to floats really wants exactly 2 HWIs.
+	     This needs to be fixed.  For now, if the constant is
+	     really big, just return 0, which is safe.  */
+	  if (CONST_WIDE_INT_NUNITS (op) > 2)
+	    return 0;
+	  lv = CONST_WIDE_INT_ELT (op, 0);
+	  hv = CONST_WIDE_INT_ELT (op, 1);
+	}
+#else
 	lv = CONST_DOUBLE_LOW (op),  hv = CONST_DOUBLE_HIGH (op);
-
+#endif
       if (op_mode == VOIDmode
 	  || GET_MODE_PRECISION (op_mode) > HOST_BITS_PER_DOUBLE_INT)
 	/* We should never get a negative number.  */
@@ -1589,302 +1627,87 @@ simplify_const_unary_operation (enum rtx_code code, enum machine_mode mode,
       return CONST_DOUBLE_FROM_REAL_VALUE (d, mode);
     }
 
-  if (CONST_INT_P (op)
-      && width <= HOST_BITS_PER_WIDE_INT && width > 0)
+  if (CONST_SCALAR_INT_P (op) && width > 0)
     {
-      HOST_WIDE_INT arg0 = INTVAL (op);
-      HOST_WIDE_INT val;
+      wide_int result;
+      enum machine_mode imode = op_mode == VOIDmode ? mode : op_mode;
+      wide_int op0 = wide_int::from_rtx (op, imode);
+
+#if TARGET_SUPPORTS_WIDE_INT == 0
+      /* This assert keeps the simplification from producing a result
+	 that cannot be represented in a CONST_DOUBLE.  Many upstream
+	 callers expect this function never to fail to simplify, so
+	 adding this condition to the test above would only make the
+	 code die later.  If this assert triggers, the port needs to
+	 be converted to support wide int.  */
+      gcc_assert (width <= HOST_BITS_PER_DOUBLE_INT);
+#endif
 
       switch (code)
 	{
 	case NOT:
-	  val = ~ arg0;
+	  result = ~op0;
 	  break;
 
 	case NEG:
-	  val = - arg0;
+	  result = op0.neg ();
 	  break;
 
 	case ABS:
-	  val = (arg0 >= 0 ? arg0 : - arg0);
+	  result = op0.abs ();
 	  break;
 
 	case FFS:
-	  arg0 &= GET_MODE_MASK (mode);
-	  val = ffs_hwi (arg0);
+	  result = op0.ffs ();
 	  break;
 
 	case CLZ:
-	  arg0 &= GET_MODE_MASK (mode);
-	  if (arg0 == 0 && CLZ_DEFINED_VALUE_AT_ZERO (mode, val))
-	    ;
-	  else
-	    val = GET_MODE_PRECISION (mode) - floor_log2 (arg0) - 1;
+	  result = op0.clz (GET_MODE_BITSIZE (mode),
+			    GET_MODE_PRECISION (mode));
 	  break;
 
 	case CLRSB:
-	  arg0 &= GET_MODE_MASK (mode);
-	  if (arg0 == 0)
-	    val = GET_MODE_PRECISION (mode) - 1;
-	  else if (arg0 >= 0)
-	    val = GET_MODE_PRECISION (mode) - floor_log2 (arg0) - 2;
-	  else if (arg0 < 0)
-	    val = GET_MODE_PRECISION (mode) - floor_log2 (~arg0) - 2;
+	  result = op0.clrsb (GET_MODE_BITSIZE (mode),
+			      GET_MODE_PRECISION (mode));
 	  break;
-
+
 	case CTZ:
-	  arg0 &= GET_MODE_MASK (mode);
-	  if (arg0 == 0)
-	    {
-	      /* Even if the value at zero is undefined, we have to come
-		 up with some replacement.  Seems good enough.  */
-	      if (! CTZ_DEFINED_VALUE_AT_ZERO (mode, val))
-		val = GET_MODE_PRECISION (mode);
-	    }
-	  else
-	    val = ctz_hwi (arg0);
+	  result = op0.ctz (GET_MODE_BITSIZE (mode),
+			    GET_MODE_PRECISION (mode));
 	  break;
 
 	case POPCOUNT:
-	  arg0 &= GET_MODE_MASK (mode);
-	  val = 0;
-	  while (arg0)
-	    val++, arg0 &= arg0 - 1;
+	  result = op0.popcount (GET_MODE_BITSIZE (mode),
+				 GET_MODE_PRECISION (mode));
 	  break;
 
 	case PARITY:
-	  arg0 &= GET_MODE_MASK (mode);
-	  val = 0;
-	  while (arg0)
-	    val++, arg0 &= arg0 - 1;
-	  val &= 1;
+	  result = op0.parity (GET_MODE_BITSIZE (mode),
+			       GET_MODE_PRECISION (mode));
 	  break;
 
 	case BSWAP:
-	  {
-	    unsigned int s;
-
-	    val = 0;
-	    for (s = 0; s < width; s += 8)
-	      {
-		unsigned int d = width - s - 8;
-		unsigned HOST_WIDE_INT byte;
-		byte = (arg0 >> s) & 0xff;
-		val |= byte << d;
-	      }
-	  }
+	  result = op0.bswap ();
 	  break;
 
 	case TRUNCATE:
-	  val = arg0;
+	  result = op0.zforce_to_size (mode);
 	  break;
 
 	case ZERO_EXTEND:
-	  /* When zero-extending a CONST_INT, we need to know its
-             original mode.  */
-	  gcc_assert (op_mode != VOIDmode);
-	  if (op_width == HOST_BITS_PER_WIDE_INT)
-	    {
-	      /* If we were really extending the mode,
-		 we would have to distinguish between zero-extension
-		 and sign-extension.  */
-	      gcc_assert (width == op_width);
-	      val = arg0;
-	    }
-	  else if (GET_MODE_BITSIZE (op_mode) < HOST_BITS_PER_WIDE_INT)
-	    val = arg0 & GET_MODE_MASK (op_mode);
-	  else
-	    return 0;
+	  result = op0.zforce_to_size (mode);
 	  break;
 
 	case SIGN_EXTEND:
-	  if (op_mode == VOIDmode)
-	    op_mode = mode;
-	  op_width = GET_MODE_PRECISION (op_mode);
-	  if (op_width == HOST_BITS_PER_WIDE_INT)
-	    {
-	      /* If we were really extending the mode,
-		 we would have to distinguish between zero-extension
-		 and sign-extension.  */
-	      gcc_assert (width == op_width);
-	      val = arg0;
-	    }
-	  else if (op_width < HOST_BITS_PER_WIDE_INT)
-	    {
-	      val = arg0 & GET_MODE_MASK (op_mode);
-	      if (val_signbit_known_set_p (op_mode, val))
-		val |= ~GET_MODE_MASK (op_mode);
-	    }
-	  else
-	    return 0;
+	  result = op0.sforce_to_size (mode);
 	  break;
 
 	case SQRT:
-	case FLOAT_EXTEND:
-	case FLOAT_TRUNCATE:
-	case SS_TRUNCATE:
-	case US_TRUNCATE:
-	case SS_NEG:
-	case US_NEG:
-	case SS_ABS:
-	  return 0;
-
-	default:
-	  gcc_unreachable ();
-	}
-
-      return gen_int_mode (val, mode);
-    }
-
-  /* We can do some operations on integer CONST_DOUBLEs.  Also allow
-     for a DImode operation on a CONST_INT.  */
-  else if (width <= HOST_BITS_PER_DOUBLE_INT
-	   && (CONST_DOUBLE_AS_INT_P (op) || CONST_INT_P (op)))
-    {
-      double_int first, value;
-
-      if (CONST_DOUBLE_AS_INT_P (op))
-	first = double_int::from_pair (CONST_DOUBLE_HIGH (op),
-				       CONST_DOUBLE_LOW (op));
-      else
-	first = double_int::from_shwi (INTVAL (op));
-
-      switch (code)
-	{
-	case NOT:
-	  value = ~first;
-	  break;
-
-	case NEG:
-	  value = -first;
-	  break;
-
-	case ABS:
-	  if (first.is_negative ())
-	    value = -first;
-	  else
-	    value = first;
-	  break;
-
-	case FFS:
-	  value.high = 0;
-	  if (first.low != 0)
-	    value.low = ffs_hwi (first.low);
-	  else if (first.high != 0)
-	    value.low = HOST_BITS_PER_WIDE_INT + ffs_hwi (first.high);
-	  else
-	    value.low = 0;
-	  break;
-
-	case CLZ:
-	  value.high = 0;
-	  if (first.high != 0)
-	    value.low = GET_MODE_PRECISION (mode) - floor_log2 (first.high) - 1
-	              - HOST_BITS_PER_WIDE_INT;
-	  else if (first.low != 0)
-	    value.low = GET_MODE_PRECISION (mode) - floor_log2 (first.low) - 1;
-	  else if (! CLZ_DEFINED_VALUE_AT_ZERO (mode, value.low))
-	    value.low = GET_MODE_PRECISION (mode);
-	  break;
-
-	case CTZ:
-	  value.high = 0;
-	  if (first.low != 0)
-	    value.low = ctz_hwi (first.low);
-	  else if (first.high != 0)
-	    value.low = HOST_BITS_PER_WIDE_INT + ctz_hwi (first.high);
-	  else if (! CTZ_DEFINED_VALUE_AT_ZERO (mode, value.low))
-	    value.low = GET_MODE_PRECISION (mode);
-	  break;
-
-	case POPCOUNT:
-	  value = double_int_zero;
-	  while (first.low)
-	    {
-	      value.low++;
-	      first.low &= first.low - 1;
-	    }
-	  while (first.high)
-	    {
-	      value.low++;
-	      first.high &= first.high - 1;
-	    }
-	  break;
-
-	case PARITY:
-	  value = double_int_zero;
-	  while (first.low)
-	    {
-	      value.low++;
-	      first.low &= first.low - 1;
-	    }
-	  while (first.high)
-	    {
-	      value.low++;
-	      first.high &= first.high - 1;
-	    }
-	  value.low &= 1;
-	  break;
-
-	case BSWAP:
-	  {
-	    unsigned int s;
-
-	    value = double_int_zero;
-	    for (s = 0; s < width; s += 8)
-	      {
-		unsigned int d = width - s - 8;
-		unsigned HOST_WIDE_INT byte;
-
-		if (s < HOST_BITS_PER_WIDE_INT)
-		  byte = (first.low >> s) & 0xff;
-		else
-		  byte = (first.high >> (s - HOST_BITS_PER_WIDE_INT)) & 0xff;
-
-		if (d < HOST_BITS_PER_WIDE_INT)
-		  value.low |= byte << d;
-		else
-		  value.high |= byte << (d - HOST_BITS_PER_WIDE_INT);
-	      }
-	  }
-	  break;
-
-	case TRUNCATE:
-	  /* This is just a change-of-mode, so do nothing.  */
-	  value = first;
-	  break;
-
-	case ZERO_EXTEND:
-	  gcc_assert (op_mode != VOIDmode);
-
-	  if (op_width > HOST_BITS_PER_WIDE_INT)
-	    return 0;
-
-	  value = double_int::from_uhwi (first.low & GET_MODE_MASK (op_mode));
-	  break;
-
-	case SIGN_EXTEND:
-	  if (op_mode == VOIDmode
-	      || op_width > HOST_BITS_PER_WIDE_INT)
-	    return 0;
-	  else
-	    {
-	      value.low = first.low & GET_MODE_MASK (op_mode);
-	      if (val_signbit_known_set_p (op_mode, value.low))
-		value.low |= ~GET_MODE_MASK (op_mode);
-
-	      value.high = HWI_SIGN_EXTEND (value.low);
-	    }
-	  break;
-
-	case SQRT:
-	  return 0;
-
 	default:
 	  return 0;
 	}
 
-      return immed_double_int_const (value, mode);
+      return immed_wide_int_const (result, mode);
     }
 
   else if (CONST_DOUBLE_AS_FLOAT_P (op) 
@@ -1936,7 +1759,6 @@ simplify_const_unary_operation (enum rtx_code code, enum machine_mode mode,
 	}
       return CONST_DOUBLE_FROM_REAL_VALUE (d, mode);
     }
-
   else if (CONST_DOUBLE_AS_FLOAT_P (op)
 	   && SCALAR_FLOAT_MODE_P (GET_MODE (op))
 	   && GET_MODE_CLASS (mode) == MODE_INT
@@ -1949,9 +1771,12 @@ simplify_const_unary_operation (enum rtx_code code, enum machine_mode mode,
 
       /* This was formerly used only for non-IEEE float.
 	 eggert@twinsun.com says it is safe for IEEE also.  */
-      HOST_WIDE_INT xh, xl, th, tl;
+      HOST_WIDE_INT th, tl;
       REAL_VALUE_TYPE x, t;
+      wide_int wc;
       REAL_VALUE_FROM_CONST_DOUBLE (x, op);
+      HOST_WIDE_INT tmp[2];
+
       switch (code)
 	{
 	case FIX:
@@ -1973,8 +1798,8 @@ simplify_const_unary_operation (enum rtx_code code, enum machine_mode mode,
 	  real_from_integer (&t, VOIDmode, tl, th, 0);
 	  if (REAL_VALUES_LESS (t, x))
 	    {
-	      xh = th;
-	      xl = tl;
+	      tmp[1] = th;
+	      tmp[0] = tl;
 	      break;
 	    }
 
@@ -1993,11 +1818,11 @@ simplify_const_unary_operation (enum rtx_code code, enum machine_mode mode,
 	  real_from_integer (&t, VOIDmode, tl, th, 0);
 	  if (REAL_VALUES_LESS (x, t))
 	    {
-	      xh = th;
-	      xl = tl;
+	      tmp[1] = th;
+	      tmp[0] = tl;
 	      break;
 	    }
-	  REAL_VALUE_TO_INT (&xl, &xh, x);
+	  REAL_VALUE_TO_INT (&tmp[0], &tmp[1], x);
 	  break;
 
 	case UNSIGNED_FIX:
@@ -2024,18 +1849,19 @@ simplify_const_unary_operation (enum rtx_code code, enum machine_mode mode,
 	  real_from_integer (&t, VOIDmode, tl, th, 1);
 	  if (REAL_VALUES_LESS (t, x))
 	    {
-	      xh = th;
-	      xl = tl;
+	      tmp[1] = th;
+	      tmp[0] = tl;
 	      break;
 	    }
 
-	  REAL_VALUE_TO_INT (&xl, &xh, x);
+	  REAL_VALUE_TO_INT (&tmp[0], &tmp[1], x);
 	  break;
 
 	default:
 	  gcc_unreachable ();
 	}
-      return immed_double_const (xl, xh, mode);
+      wc = wide_int::from_array (tmp, 2, mode);
+      return immed_wide_int_const (wc, mode);
     }
 
   return NULL_RTX;
@@ -2195,49 +2021,50 @@ simplify_binary_operation_1 (enum rtx_code code, enum machine_mode mode,
 
       if (SCALAR_INT_MODE_P (mode))
 	{
-	  double_int coeff0, coeff1;
+	  wide_int coeff0;
+	  wide_int coeff1;
 	  rtx lhs = op0, rhs = op1;
 
-	  coeff0 = double_int_one;
-	  coeff1 = double_int_one;
+	  coeff0 = wide_int::one (mode);
+	  coeff1 = wide_int::one (mode);
 
 	  if (GET_CODE (lhs) == NEG)
 	    {
-	      coeff0 = double_int_minus_one;
+	      coeff0 = wide_int::minus_one (mode);
 	      lhs = XEXP (lhs, 0);
 	    }
 	  else if (GET_CODE (lhs) == MULT
-		   && CONST_INT_P (XEXP (lhs, 1)))
+		   && CONST_SCALAR_INT_P (XEXP (lhs, 1)))
 	    {
-	      coeff0 = double_int::from_shwi (INTVAL (XEXP (lhs, 1)));
+	      coeff0 = wide_int::from_rtx (XEXP (lhs, 1), mode);
 	      lhs = XEXP (lhs, 0);
 	    }
 	  else if (GET_CODE (lhs) == ASHIFT
 		   && CONST_INT_P (XEXP (lhs, 1))
                    && INTVAL (XEXP (lhs, 1)) >= 0
-		   && INTVAL (XEXP (lhs, 1)) < HOST_BITS_PER_WIDE_INT)
+		   && INTVAL (XEXP (lhs, 1)) < GET_MODE_PRECISION (mode))
 	    {
-	      coeff0 = double_int_zero.set_bit (INTVAL (XEXP (lhs, 1)));
+	      coeff0 = wide_int::set_bit_in_zero (INTVAL (XEXP (lhs, 1)), mode);
 	      lhs = XEXP (lhs, 0);
 	    }
 
 	  if (GET_CODE (rhs) == NEG)
 	    {
-	      coeff1 = double_int_minus_one;
+	      coeff1 = wide_int::minus_one (mode);
 	      rhs = XEXP (rhs, 0);
 	    }
 	  else if (GET_CODE (rhs) == MULT
 		   && CONST_INT_P (XEXP (rhs, 1)))
 	    {
-	      coeff1 = double_int::from_shwi (INTVAL (XEXP (rhs, 1)));
+	      coeff1 = wide_int::from_rtx (XEXP (rhs, 1), mode);
 	      rhs = XEXP (rhs, 0);
 	    }
 	  else if (GET_CODE (rhs) == ASHIFT
 		   && CONST_INT_P (XEXP (rhs, 1))
 		   && INTVAL (XEXP (rhs, 1)) >= 0
-		   && INTVAL (XEXP (rhs, 1)) < HOST_BITS_PER_WIDE_INT)
+		   && INTVAL (XEXP (rhs, 1)) < GET_MODE_PRECISION (mode))
 	    {
-	      coeff1 = double_int_zero.set_bit (INTVAL (XEXP (rhs, 1)));
+	      coeff1 = wide_int::set_bit_in_zero (INTVAL (XEXP (rhs, 1)), mode);
 	      rhs = XEXP (rhs, 0);
 	    }
 
@@ -2245,11 +2072,9 @@ simplify_binary_operation_1 (enum rtx_code code, enum machine_mode mode,
 	    {
 	      rtx orig = gen_rtx_PLUS (mode, op0, op1);
 	      rtx coeff;
-	      double_int val;
 	      bool speed = optimize_function_for_speed_p (cfun);
 
-	      val = coeff0 + coeff1;
-	      coeff = immed_double_int_const (val, mode);
+	      coeff = immed_wide_int_const (coeff0 + coeff1, mode);
 
 	      tem = simplify_gen_binary (MULT, mode, lhs, coeff);
 	      return set_src_cost (tem, speed) <= set_src_cost (orig, speed)
@@ -2371,50 +2196,52 @@ simplify_binary_operation_1 (enum rtx_code code, enum machine_mode mode,
 
       if (SCALAR_INT_MODE_P (mode))
 	{
-	  double_int coeff0, negcoeff1;
+	  wide_int coeff0;
+	  wide_int negcoeff1;
 	  rtx lhs = op0, rhs = op1;
 
-	  coeff0 = double_int_one;
-	  negcoeff1 = double_int_minus_one;
+	  coeff0 = wide_int::one (mode);
+	  negcoeff1 = wide_int::minus_one (mode);
 
 	  if (GET_CODE (lhs) == NEG)
 	    {
-	      coeff0 = double_int_minus_one;
+	      coeff0 = wide_int::minus_one (mode);
 	      lhs = XEXP (lhs, 0);
 	    }
 	  else if (GET_CODE (lhs) == MULT
-		   && CONST_INT_P (XEXP (lhs, 1)))
+		   && CONST_SCALAR_INT_P (XEXP (lhs, 1)))
 	    {
-	      coeff0 = double_int::from_shwi (INTVAL (XEXP (lhs, 1)));
+	      coeff0 = wide_int::from_rtx (XEXP (lhs, 1), mode);
 	      lhs = XEXP (lhs, 0);
 	    }
 	  else if (GET_CODE (lhs) == ASHIFT
 		   && CONST_INT_P (XEXP (lhs, 1))
 		   && INTVAL (XEXP (lhs, 1)) >= 0
-		   && INTVAL (XEXP (lhs, 1)) < HOST_BITS_PER_WIDE_INT)
+		   && INTVAL (XEXP (lhs, 1)) < GET_MODE_PRECISION (mode))
 	    {
-	      coeff0 = double_int_zero.set_bit (INTVAL (XEXP (lhs, 1)));
+	      coeff0 = wide_int::set_bit_in_zero (INTVAL (XEXP (lhs, 1)), mode);
 	      lhs = XEXP (lhs, 0);
 	    }
 
 	  if (GET_CODE (rhs) == NEG)
 	    {
-	      negcoeff1 = double_int_one;
+	      negcoeff1 = wide_int::one (mode);
 	      rhs = XEXP (rhs, 0);
 	    }
 	  else if (GET_CODE (rhs) == MULT
 		   && CONST_INT_P (XEXP (rhs, 1)))
 	    {
-	      negcoeff1 = double_int::from_shwi (-INTVAL (XEXP (rhs, 1)));
+	      negcoeff1 = wide_int::from_rtx (XEXP (rhs, 1), mode).neg ();
 	      rhs = XEXP (rhs, 0);
 	    }
 	  else if (GET_CODE (rhs) == ASHIFT
 		   && CONST_INT_P (XEXP (rhs, 1))
 		   && INTVAL (XEXP (rhs, 1)) >= 0
-		   && INTVAL (XEXP (rhs, 1)) < HOST_BITS_PER_WIDE_INT)
+		   && INTVAL (XEXP (rhs, 1)) < GET_MODE_PRECISION (mode))
 	    {
-	      negcoeff1 = double_int_zero.set_bit (INTVAL (XEXP (rhs, 1)));
-	      negcoeff1 = -negcoeff1;
+	      negcoeff1 = wide_int::set_bit_in_zero (INTVAL (XEXP (rhs, 1)),
+						    mode);
+	      negcoeff1 = negcoeff1.neg ();
 	      rhs = XEXP (rhs, 0);
 	    }
 
@@ -2422,11 +2249,9 @@ simplify_binary_operation_1 (enum rtx_code code, enum machine_mode mode,
 	    {
 	      rtx orig = gen_rtx_MINUS (mode, op0, op1);
 	      rtx coeff;
-	      double_int val;
 	      bool speed = optimize_function_for_speed_p (cfun);
 
-	      val = coeff0 + negcoeff1;
-	      coeff = immed_double_int_const (val, mode);
+	      coeff = immed_wide_int_const (coeff0 + negcoeff1, mode);
 
 	      tem = simplify_gen_binary (MULT, mode, lhs, coeff);
 	      return set_src_cost (tem, speed) <= set_src_cost (orig, speed)
@@ -2578,26 +2403,13 @@ simplify_binary_operation_1 (enum rtx_code code, enum machine_mode mode,
 	  && trueop1 == CONST1_RTX (mode))
 	return op0;
 
-      /* Convert multiply by constant power of two into shift unless
-	 we are still generating RTL.  This test is a kludge.  */
-      if (CONST_INT_P (trueop1)
-	  && (val = exact_log2 (UINTVAL (trueop1))) >= 0
-	  /* If the mode is larger than the host word size, and the
-	     uppermost bit is set, then this isn't a power of two due
-	     to implicit sign extension.  */
-	  && (width <= HOST_BITS_PER_WIDE_INT
-	      || val != HOST_BITS_PER_WIDE_INT - 1))
-	return simplify_gen_binary (ASHIFT, mode, op0, GEN_INT (val));
-
-      /* Likewise for multipliers wider than a word.  */
-      if (CONST_DOUBLE_AS_INT_P (trueop1)
-	  && GET_MODE (op0) == mode
-	  && CONST_DOUBLE_LOW (trueop1) == 0
-	  && (val = exact_log2 (CONST_DOUBLE_HIGH (trueop1))) >= 0
-	  && (val < HOST_BITS_PER_DOUBLE_INT - 1
-	      || GET_MODE_BITSIZE (mode) <= HOST_BITS_PER_DOUBLE_INT))
-	return simplify_gen_binary (ASHIFT, mode, op0,
-				    GEN_INT (val + HOST_BITS_PER_WIDE_INT));
+      /* Convert multiply by constant power of two into shift.  */
+      if (CONST_SCALAR_INT_P (trueop1))
+	{
+	  val = wide_int::from_rtx (trueop1, mode).exact_log2 ();
+	  if (val >= 0 && val < GET_MODE_BITSIZE (mode))
+	    return simplify_gen_binary (ASHIFT, mode, op0, GEN_INT (val));
+	}
 
       /* x*2 is x+x and x*(-1) is -x */
       if (CONST_DOUBLE_AS_FLOAT_P (trueop1)
@@ -3645,9 +3457,9 @@ rtx
 simplify_const_binary_operation (enum rtx_code code, enum machine_mode mode,
 				 rtx op0, rtx op1)
 {
-  HOST_WIDE_INT arg0, arg1, arg0s, arg1s;
-  HOST_WIDE_INT val;
+#if TARGET_SUPPORTS_WIDE_INT == 0
   unsigned int width = GET_MODE_PRECISION (mode);
+#endif
 
   if (VECTOR_MODE_P (mode)
       && code != VEC_CONCAT
@@ -3840,299 +3652,128 @@ simplify_const_binary_operation (enum rtx_code code, enum machine_mode mode,
 
   /* We can fold some multi-word operations.  */
   if (GET_MODE_CLASS (mode) == MODE_INT
-      && width == HOST_BITS_PER_DOUBLE_INT
-      && (CONST_DOUBLE_AS_INT_P (op0) || CONST_INT_P (op0))
-      && (CONST_DOUBLE_AS_INT_P (op1) || CONST_INT_P (op1)))
+      && CONST_SCALAR_INT_P (op0)
+      && CONST_SCALAR_INT_P (op1))
     {
-      double_int o0, o1, res, tmp;
-      bool overflow;
-
-      o0 = rtx_to_double_int (op0);
-      o1 = rtx_to_double_int (op1);
-
+      wide_int result;
+      wide_int wop0 = wide_int::from_rtx (op0, mode);
+      wide_int wop1 = wide_int::from_rtx (op1, mode);
+      bool overflow = false;
+
+#if TARGET_SUPPORTS_WIDE_INT == 0
+      /* This assert keeps the simplification from producing a result
+	 that cannot be represented in a CONST_DOUBLE.  Many upstream
+	 callers expect this function never to fail to simplify, so
+	 adding this condition to the test above would only make the
+	 code die later.  If this assert triggers, the port needs to
+	 be converted to support wide int.  */
+      gcc_assert (width <= HOST_BITS_PER_DOUBLE_INT);
+#endif
       switch (code)
 	{
 	case MINUS:
-	  /* A - B == A + (-B).  */
-	  o1 = -o1;
-
-	  /* Fall through....  */
+	  result = wop0 - wop1;
+	  break;
 
 	case PLUS:
-	  res = o0 + o1;
+	  result = wop0 + wop1;
 	  break;
 
 	case MULT:
-	  res = o0 * o1;
+	  result = wop0 * wop1;
 	  break;
 
 	case DIV:
-          res = o0.divmod_with_overflow (o1, false, TRUNC_DIV_EXPR,
-					 &tmp, &overflow);
+	  result = wop0.div_trunc (wop1, wide_int::SIGNED, &overflow);
 	  if (overflow)
-	    return 0;
+	    return NULL_RTX;
 	  break;
-
+
 	case MOD:
-          tmp = o0.divmod_with_overflow (o1, false, TRUNC_DIV_EXPR,
-					 &res, &overflow);
+	  result = wop0.mod_trunc (wop1, wide_int::SIGNED, &overflow);
 	  if (overflow)
-	    return 0;
+	    return NULL_RTX;
 	  break;
 
 	case UDIV:
-          res = o0.divmod_with_overflow (o1, true, TRUNC_DIV_EXPR,
-					 &tmp, &overflow);
+	  result = wop0.div_trunc (wop1, wide_int::UNSIGNED, &overflow);
 	  if (overflow)
-	    return 0;
+	    return NULL_RTX;
 	  break;
 
 	case UMOD:
-          tmp = o0.divmod_with_overflow (o1, true, TRUNC_DIV_EXPR,
-					 &res, &overflow);
+	  result = wop0.mod_trunc (wop1, wide_int::UNSIGNED, &overflow);
 	  if (overflow)
-	    return 0;
+	    return NULL_RTX;
 	  break;
 
 	case AND:
-	  res = o0 & o1;
+	  result = wop0 & wop1;
 	  break;
 
 	case IOR:
-	  res = o0 | o1;
+	  result = wop0 | wop1;
 	  break;
 
 	case XOR:
-	  res = o0 ^ o1;
+	  result = wop0 ^ wop1;
 	  break;
 
 	case SMIN:
-	  res = o0.smin (o1);
+	  result = wop0.smin (wop1);
 	  break;
 
 	case SMAX:
-	  res = o0.smax (o1);
+	  result = wop0.smax (wop1);
 	  break;
 
 	case UMIN:
-	  res = o0.umin (o1);
+	  result = wop0.umin (wop1);
 	  break;
 
 	case UMAX:
-	  res = o0.umax (o1);
-	  break;
-
-	case LSHIFTRT:   case ASHIFTRT:
-	case ASHIFT:
-	case ROTATE:     case ROTATERT:
-	  {
-	    unsigned HOST_WIDE_INT cnt;
-
-	    if (SHIFT_COUNT_TRUNCATED)
-	      {
-		o1.high = 0; 
-		o1.low &= GET_MODE_PRECISION (mode) - 1;
-	      }
-
-	    if (!o1.fits_uhwi ()
-	        || o1.to_uhwi () >= GET_MODE_PRECISION (mode))
-	      return 0;
-
-	    cnt = o1.to_uhwi ();
-	    unsigned short prec = GET_MODE_PRECISION (mode);
-
-	    if (code == LSHIFTRT || code == ASHIFTRT)
-	      res = o0.rshift (cnt, prec, code == ASHIFTRT);
-	    else if (code == ASHIFT)
-	      res = o0.alshift (cnt, prec);
-	    else if (code == ROTATE)
-	      res = o0.lrotate (cnt, prec);
-	    else /* code == ROTATERT */
-	      res = o0.rrotate (cnt, prec);
-	  }
-	  break;
-
-	default:
-	  return 0;
-	}
-
-      return immed_double_int_const (res, mode);
-    }
-
-  if (CONST_INT_P (op0) && CONST_INT_P (op1)
-      && width <= HOST_BITS_PER_WIDE_INT && width != 0)
-    {
-      /* Get the integer argument values in two forms:
-         zero-extended in ARG0, ARG1 and sign-extended in ARG0S, ARG1S.  */
-
-      arg0 = INTVAL (op0);
-      arg1 = INTVAL (op1);
-
-      if (width < HOST_BITS_PER_WIDE_INT)
-        {
-          arg0 &= GET_MODE_MASK (mode);
-          arg1 &= GET_MODE_MASK (mode);
-
-          arg0s = arg0;
-	  if (val_signbit_known_set_p (mode, arg0s))
-	    arg0s |= ~GET_MODE_MASK (mode);
-
-          arg1s = arg1;
-	  if (val_signbit_known_set_p (mode, arg1s))
-	    arg1s |= ~GET_MODE_MASK (mode);
-	}
-      else
-	{
-	  arg0s = arg0;
-	  arg1s = arg1;
-	}
-
-      /* Compute the value of the arithmetic.  */
-
-      switch (code)
-	{
-	case PLUS:
-	  val = arg0s + arg1s;
-	  break;
-
-	case MINUS:
-	  val = arg0s - arg1s;
-	  break;
-
-	case MULT:
-	  val = arg0s * arg1s;
-	  break;
-
-	case DIV:
-	  if (arg1s == 0
-	      || ((unsigned HOST_WIDE_INT) arg0s
-		  == (unsigned HOST_WIDE_INT) 1 << (HOST_BITS_PER_WIDE_INT - 1)
-		  && arg1s == -1))
-	    return 0;
-	  val = arg0s / arg1s;
-	  break;
-
-	case MOD:
-	  if (arg1s == 0
-	      || ((unsigned HOST_WIDE_INT) arg0s
-		  == (unsigned HOST_WIDE_INT) 1 << (HOST_BITS_PER_WIDE_INT - 1)
-		  && arg1s == -1))
-	    return 0;
-	  val = arg0s % arg1s;
+	  result = wop0.umax (wop1);
 	  break;
 
-	case UDIV:
-	  if (arg1 == 0
-	      || ((unsigned HOST_WIDE_INT) arg0s
-		  == (unsigned HOST_WIDE_INT) 1 << (HOST_BITS_PER_WIDE_INT - 1)
-		  && arg1s == -1))
-	    return 0;
-	  val = (unsigned HOST_WIDE_INT) arg0 / arg1;
-	  break;
-
-	case UMOD:
-	  if (arg1 == 0
-	      || ((unsigned HOST_WIDE_INT) arg0s
-		  == (unsigned HOST_WIDE_INT) 1 << (HOST_BITS_PER_WIDE_INT - 1)
-		  && arg1s == -1))
-	    return 0;
-	  val = (unsigned HOST_WIDE_INT) arg0 % arg1;
-	  break;
-
-	case AND:
-	  val = arg0 & arg1;
-	  break;
-
-	case IOR:
-	  val = arg0 | arg1;
-	  break;
+	case LSHIFTRT:
+	  if (wop1.neg_p ())
+	    return NULL_RTX;
 
-	case XOR:
-	  val = arg0 ^ arg1;
+	  result = wop0.rshiftu (wop1, wide_int::TRUNC);
 	  break;
-
-	case LSHIFTRT:
-	case ASHIFT:
+	  
 	case ASHIFTRT:
-	  /* Truncate the shift if SHIFT_COUNT_TRUNCATED, otherwise make sure
-	     the value is in range.  We can't return any old value for
-	     out-of-range arguments because either the middle-end (via
-	     shift_truncation_mask) or the back-end might be relying on
-	     target-specific knowledge.  Nor can we rely on
-	     shift_truncation_mask, since the shift might not be part of an
-	     ashlM3, lshrM3 or ashrM3 instruction.  */
-	  if (SHIFT_COUNT_TRUNCATED)
-	    arg1 = (unsigned HOST_WIDE_INT) arg1 % width;
-	  else if (arg1 < 0 || arg1 >= GET_MODE_BITSIZE (mode))
-	    return 0;
-
-	  val = (code == ASHIFT
-		 ? ((unsigned HOST_WIDE_INT) arg0) << arg1
-		 : ((unsigned HOST_WIDE_INT) arg0) >> arg1);
+	  if (wop1.neg_p ())
+	    return NULL_RTX;
 
-	  /* Sign-extend the result for arithmetic right shifts.  */
-	  if (code == ASHIFTRT && arg0s < 0 && arg1 > 0)
-	    val |= ((unsigned HOST_WIDE_INT) (-1)) << (width - arg1);
+	  result = wop0.rshifts (wop1, wide_int::TRUNC);
 	  break;
+	  
+	case ASHIFT:
+	  if (wop1.neg_p ())
+	    return NULL_RTX;
 
-	case ROTATERT:
-	  if (arg1 < 0)
-	    return 0;
-
-	  arg1 %= width;
-	  val = ((((unsigned HOST_WIDE_INT) arg0) << (width - arg1))
-		 | (((unsigned HOST_WIDE_INT) arg0) >> arg1));
+	  result = wop0.lshift (wop1, wide_int::TRUNC);
 	  break;
-
+	  
 	case ROTATE:
-	  if (arg1 < 0)
-	    return 0;
-
-	  arg1 %= width;
-	  val = ((((unsigned HOST_WIDE_INT) arg0) << arg1)
-		 | (((unsigned HOST_WIDE_INT) arg0) >> (width - arg1)));
-	  break;
-
-	case COMPARE:
-	  /* Do nothing here.  */
-	  return 0;
-
-	case SMIN:
-	  val = arg0s <= arg1s ? arg0s : arg1s;
-	  break;
-
-	case UMIN:
-	  val = ((unsigned HOST_WIDE_INT) arg0
-		 <= (unsigned HOST_WIDE_INT) arg1 ? arg0 : arg1);
-	  break;
+	  if (wop1.neg_p ())
+	    return NULL_RTX;
 
-	case SMAX:
-	  val = arg0s > arg1s ? arg0s : arg1s;
+	  result = wop0.lrotate (wop1);
 	  break;
+	  
+	case ROTATERT:
+	  if (wop1.neg_p ())
+	    return NULL_RTX;
 
-	case UMAX:
-	  val = ((unsigned HOST_WIDE_INT) arg0
-		 > (unsigned HOST_WIDE_INT) arg1 ? arg0 : arg1);
+	  result = wop0.rrotate (wop1);
 	  break;
 
-	case SS_PLUS:
-	case US_PLUS:
-	case SS_MINUS:
-	case US_MINUS:
-	case SS_MULT:
-	case US_MULT:
-	case SS_DIV:
-	case US_DIV:
-	case SS_ASHIFT:
-	case US_ASHIFT:
-	  /* ??? There are simplifications that can be done.  */
-	  return 0;
-
 	default:
-	  gcc_unreachable ();
+	  return NULL_RTX;
 	}
-
-      return gen_int_mode (val, mode);
+      return immed_wide_int_const (result, mode);
     }
 
   return NULL_RTX;
@@ -4800,10 +4441,11 @@ comparison_result (enum rtx_code code, int known_results)
     }
 }
 
-/* Check if the given comparison (done in the given MODE) is actually a
-   tautology or a contradiction.
-   If no simplification is possible, this function returns zero.
-   Otherwise, it returns either const_true_rtx or const0_rtx.  */
+/* Check if the given comparison (done in the given MODE) is actually
+   a tautology or a contradiction.  If the mode is VOIDmode, the
+   comparison is done in "infinite precision".  If no simplification
+   is possible, this function returns zero.  Otherwise, it returns
+   either const_true_rtx or const0_rtx.  */
 
 rtx
 simplify_const_relational_operation (enum rtx_code code,
@@ -4927,59 +4569,25 @@ simplify_const_relational_operation (enum rtx_code code,
 
   /* Otherwise, see if the operands are both integers.  */
   if ((GET_MODE_CLASS (mode) == MODE_INT || mode == VOIDmode)
-       && (CONST_DOUBLE_AS_INT_P (trueop0) || CONST_INT_P (trueop0))
-       && (CONST_DOUBLE_AS_INT_P (trueop1) || CONST_INT_P (trueop1)))
+      && CONST_SCALAR_INT_P (trueop0) && CONST_SCALAR_INT_P (trueop1))
     {
-      int width = GET_MODE_PRECISION (mode);
-      HOST_WIDE_INT l0s, h0s, l1s, h1s;
-      unsigned HOST_WIDE_INT l0u, h0u, l1u, h1u;
-
-      /* Get the two words comprising each integer constant.  */
-      if (CONST_DOUBLE_AS_INT_P (trueop0))
-	{
-	  l0u = l0s = CONST_DOUBLE_LOW (trueop0);
-	  h0u = h0s = CONST_DOUBLE_HIGH (trueop0);
-	}
-      else
-	{
-	  l0u = l0s = INTVAL (trueop0);
-	  h0u = h0s = HWI_SIGN_EXTEND (l0s);
-	}
-
-      if (CONST_DOUBLE_AS_INT_P (trueop1))
-	{
-	  l1u = l1s = CONST_DOUBLE_LOW (trueop1);
-	  h1u = h1s = CONST_DOUBLE_HIGH (trueop1);
-	}
-      else
-	{
-	  l1u = l1s = INTVAL (trueop1);
-	  h1u = h1s = HWI_SIGN_EXTEND (l1s);
-	}
-
-      /* If WIDTH is nonzero and smaller than HOST_BITS_PER_WIDE_INT,
-	 we have to sign or zero-extend the values.  */
-      if (width != 0 && width < HOST_BITS_PER_WIDE_INT)
-	{
-	  l0u &= GET_MODE_MASK (mode);
-	  l1u &= GET_MODE_MASK (mode);
-
-	  if (val_signbit_known_set_p (mode, l0s))
-	    l0s |= ~GET_MODE_MASK (mode);
-
-	  if (val_signbit_known_set_p (mode, l1s))
-	    l1s |= ~GET_MODE_MASK (mode);
-	}
-      if (width != 0 && width <= HOST_BITS_PER_WIDE_INT)
-	h0u = h1u = 0, h0s = HWI_SIGN_EXTEND (l0s), h1s = HWI_SIGN_EXTEND (l1s);
-
-      if (h0u == h1u && l0u == l1u)
+      enum machine_mode cmode = mode;
+      wide_int wo0;
+      wide_int wo1;
+
+      /* It would be nice if we really had a mode here.  However, the
+	 largest int representable on the target is as good as
+	 infinite.  */
+      if (mode == VOIDmode)
+	cmode = MAX_MODE_INT;
+      wo0 = wide_int::from_rtx (trueop0, cmode);
+      wo1 = wide_int::from_rtx (trueop1, cmode);
+      if (wo0 == wo1)
 	return comparison_result (code, CMP_EQ);
       else
 	{
-	  int cr;
-	  cr = (h0s < h1s || (h0s == h1s && l0u < l1u)) ? CMP_LT : CMP_GT;
-	  cr |= (h0u < h1u || (h0u == h1u && l0u < l1u)) ? CMP_LTU : CMP_GTU;
+	  int cr = wo0.lts_p (wo1) ? CMP_LT : CMP_GT;
+	  cr |= wo0.ltu_p (wo1) ? CMP_LTU : CMP_GTU;
 	  return comparison_result (code, cr);
 	}
     }
@@ -5394,9 +5002,9 @@ simplify_ternary_operation (enum rtx_code code, enum machine_mode mode,
   return 0;
 }
 
-/* Evaluate a SUBREG of a CONST_INT or CONST_DOUBLE or CONST_FIXED
-   or CONST_VECTOR,
-   returning another CONST_INT or CONST_DOUBLE or CONST_FIXED or CONST_VECTOR.
+/* Evaluate a SUBREG of a CONST_INT or CONST_WIDE_INT or CONST_DOUBLE
+   or CONST_FIXED or CONST_VECTOR, returning another CONST_INT or
+   CONST_WIDE_INT or CONST_DOUBLE or CONST_FIXED or CONST_VECTOR.
 
    Works by unpacking OP into a collection of 8-bit values
    represented as a little-endian array of 'unsigned char', selecting by BYTE,
@@ -5406,13 +5014,11 @@ static rtx
 simplify_immed_subreg (enum machine_mode outermode, rtx op,
 		       enum machine_mode innermode, unsigned int byte)
 {
-  /* We support up to 512-bit values (for V8DFmode).  */
   enum {
-    max_bitsize = 512,
     value_bit = 8,
     value_mask = (1 << value_bit) - 1
   };
-  unsigned char value[max_bitsize / value_bit];
+  unsigned char value[MAX_BITSIZE_MODE_ANY_MODE / value_bit];
   int value_start;
   int i;
   int elem;
@@ -5424,6 +5030,7 @@ simplify_immed_subreg (enum machine_mode outermode, rtx op,
   rtvec result_v = NULL;
   enum mode_class outer_class;
   enum machine_mode outer_submode;
+  int max_bitsize;
 
   /* Some ports misuse CCmode.  */
   if (GET_MODE_CLASS (outermode) == MODE_CC && CONST_INT_P (op))
@@ -5433,6 +5040,10 @@ simplify_immed_subreg (enum machine_mode outermode, rtx op,
   if (COMPLEX_MODE_P (outermode))
     return NULL_RTX;
 
+  /* We support any size mode.  */
+  max_bitsize = MAX (GET_MODE_BITSIZE (outermode), 
+		     GET_MODE_BITSIZE (innermode));
+
   /* Unpack the value.  */
 
   if (GET_CODE (op) == CONST_VECTOR)
@@ -5482,8 +5093,20 @@ simplify_immed_subreg (enum machine_mode outermode, rtx op,
 	    *vp++ = INTVAL (el) < 0 ? -1 : 0;
 	  break;
 
+	case CONST_WIDE_INT:
+	  {
+	    wide_int val = wide_int::from_rtx (el, innermode);
+	    unsigned char extend = val.sign_mask ();
+
+	    for (i = 0; i < elem_bitsize; i += value_bit) 
+	      *vp++ = val.extract_to_hwi (i, value_bit);
+	    for (; i < elem_bitsize; i += value_bit)
+	      *vp++ = extend;
+	  }
+	  break;
+
 	case CONST_DOUBLE:
-	  if (GET_MODE (el) == VOIDmode)
+	  if (TARGET_SUPPORTS_WIDE_INT == 0 && GET_MODE (el) == VOIDmode)
 	    {
 	      unsigned char extend = 0;
 	      /* If this triggers, someone should have generated a
@@ -5506,7 +5129,8 @@ simplify_immed_subreg (enum machine_mode outermode, rtx op,
 	    }
 	  else
 	    {
-	      long tmp[max_bitsize / 32];
+	      /* This is big enough for anything on the platform.  */
+	      long tmp[MAX_BITSIZE_MODE_ANY_MODE / 32];
 	      int bitsize = GET_MODE_BITSIZE (GET_MODE (el));
 
 	      gcc_assert (SCALAR_FLOAT_MODE_P (GET_MODE (el)));
@@ -5626,24 +5250,27 @@ simplify_immed_subreg (enum machine_mode outermode, rtx op,
 	case MODE_INT:
 	case MODE_PARTIAL_INT:
 	  {
-	    unsigned HOST_WIDE_INT hi = 0, lo = 0;
-
-	    for (i = 0;
-		 i < HOST_BITS_PER_WIDE_INT && i < elem_bitsize;
-		 i += value_bit)
-	      lo |= (unsigned HOST_WIDE_INT)(*vp++ & value_mask) << i;
-	    for (; i < elem_bitsize; i += value_bit)
-	      hi |= (unsigned HOST_WIDE_INT)(*vp++ & value_mask)
-		     << (i - HOST_BITS_PER_WIDE_INT);
-
-	    /* immed_double_const doesn't call trunc_int_for_mode.  I don't
-	       know why.  */
-	    if (elem_bitsize <= HOST_BITS_PER_WIDE_INT)
-	      elems[elem] = gen_int_mode (lo, outer_submode);
-	    else if (elem_bitsize <= HOST_BITS_PER_DOUBLE_INT)
-	      elems[elem] = immed_double_const (lo, hi, outer_submode);
-	    else
-	      return NULL_RTX;
+	    int u;
+	    int base = 0;
+	    int units 
+	      = (GET_MODE_BITSIZE (outer_submode) + HOST_BITS_PER_WIDE_INT - 1) 
+	      / HOST_BITS_PER_WIDE_INT;
+	    HOST_WIDE_INT tmp[MAX_BITSIZE_MODE_ANY_INT / HOST_BITS_PER_WIDE_INT];
+	    wide_int r;
+
+	    for (u = 0; u < units; u++) 
+	      {
+		unsigned HOST_WIDE_INT buf = 0;
+		for (i = 0; 
+		     i < HOST_BITS_PER_WIDE_INT && base + i < elem_bitsize; 
+		     i += value_bit)
+		  buf |= (unsigned HOST_WIDE_INT)(*vp++ & value_mask) << i;
+
+		tmp[u] = buf;
+		base += HOST_BITS_PER_WIDE_INT;
+	      }
+	    r = wide_int::from_array (tmp, units, outer_submode);
+	    elems[elem] = immed_wide_int_const (r, outer_submode);
 	  }
 	  break;
 
@@ -5651,7 +5278,7 @@ simplify_immed_subreg (enum machine_mode outermode, rtx op,
 	case MODE_DECIMAL_FLOAT:
 	  {
 	    REAL_VALUE_TYPE r;
-	    long tmp[max_bitsize / 32];
+	    long tmp[MAX_BITSIZE_MODE_ANY_INT / 32];
 
 	    /* real_from_target wants its input in words affected by
 	       FLOAT_WORDS_BIG_ENDIAN.  However, we ignore this,
diff --git a/gcc/tree-ssa-address.c b/gcc/tree-ssa-address.c
index cfd42ad..85b1552 100644
--- a/gcc/tree-ssa-address.c
+++ b/gcc/tree-ssa-address.c
@@ -189,15 +189,18 @@ addr_for_mem_ref (struct mem_address *addr, addr_space_t as,
   struct mem_addr_template *templ;
 
   if (addr->step && !integer_onep (addr->step))
-    st = immed_double_int_const (tree_to_double_int (addr->step), pointer_mode);
+    st = immed_wide_int_const (wide_int::from_tree (addr->step),
+			       TYPE_MODE (TREE_TYPE (addr->step)));
   else
     st = NULL_RTX;
 
   if (addr->offset && !integer_zerop (addr->offset))
-    off = immed_double_int_const
-	    (tree_to_double_int (addr->offset)
-	     .sext (TYPE_PRECISION (TREE_TYPE (addr->offset))),
-	     pointer_mode);
+    {
+      wide_int dc = wide_int::from_tree (addr->offset);
+      dc = dc.sforce_to_size (TREE_TYPE (addr->offset));
+      off = immed_wide_int_const (dc,
+			       TYPE_MODE (TREE_TYPE (addr->offset)));
+    }
   else
     off = NULL_RTX;
 
diff --git a/gcc/tree.c b/gcc/tree.c
index 98ad5d8..11075e3 100644
--- a/gcc/tree.c
+++ b/gcc/tree.c
@@ -59,6 +59,7 @@ along with GCC; see the file COPYING3.  If not see
 #include "except.h"
 #include "debug.h"
 #include "intl.h"
+#include "wide-int.h"
 
 /* Tree code classes.  */
 
@@ -1064,6 +1065,23 @@ double_int_to_tree (tree type, double_int cst)
   return build_int_cst_wide (type, cst.low, cst.high);
 }
 
+/* Constructs tree in type TYPE with value given by CST.  Signedness
+   of CST is assumed to be the same as the signedness of TYPE.  */
+
+tree
+wide_int_to_tree (tree type, const wide_int &cst)
+{
+  wide_int v;
+
+  gcc_assert (cst.get_len () <= 2);
+  if (TYPE_UNSIGNED (type))
+    v = cst.zext (TYPE_PRECISION (type));
+  else
+    v = cst.sext (TYPE_PRECISION (type));
+
+  return build_int_cst_wide (type, v.elt (0), v.elt (1));
+}
+
 /* Returns true if CST fits into range of TYPE.  Signedness of CST is assumed
    to be the same as the signedness of TYPE.  */
 
diff --git a/gcc/var-tracking.c b/gcc/var-tracking.c
index 0db1562..7cb99ac 100644
--- a/gcc/var-tracking.c
+++ b/gcc/var-tracking.c
@@ -3513,6 +3513,23 @@ loc_cmp (rtx x, rtx y)
       default:
 	gcc_unreachable ();
       }
+  if (CONST_WIDE_INT_P (x))
+    {
+      /* Compare the vector length first.  */
+      if (CONST_WIDE_INT_NUNITS (x) > CONST_WIDE_INT_NUNITS (y))
+	return 1;
+      else if (CONST_WIDE_INT_NUNITS (x) < CONST_WIDE_INT_NUNITS (y))
+	return -1;
+
+      /* Compare the vector elements.  */
+      for (j = CONST_WIDE_INT_NUNITS (x) - 1; j >= 0; j--)
+	{
+	  if (CONST_WIDE_INT_ELT (x, j) < CONST_WIDE_INT_ELT (y, j))
+	    return -1;
+	  if (CONST_WIDE_INT_ELT (x, j) > CONST_WIDE_INT_ELT (y, j))
+	    return 1;
+	}
+    }
 
   return 0;
 }
diff --git a/gcc/varasm.c b/gcc/varasm.c
index 6648103..c104d87 100644
--- a/gcc/varasm.c
+++ b/gcc/varasm.c
@@ -3406,6 +3406,7 @@ const_rtx_hash_1 (rtx *xp, void *data)
   enum rtx_code code;
   hashval_t h, *hp;
   rtx x;
+  int i;
 
   x = *xp;
   code = GET_CODE (x);
@@ -3416,12 +3417,12 @@ const_rtx_hash_1 (rtx *xp, void *data)
     {
     case CONST_INT:
       hwi = INTVAL (x);
+
     fold_hwi:
       {
 	int shift = sizeof (hashval_t) * CHAR_BIT;
 	const int n = sizeof (HOST_WIDE_INT) / sizeof (hashval_t);
-	int i;
-
+	
 	h ^= (hashval_t) hwi;
 	for (i = 1; i < n; ++i)
 	  {
@@ -3431,8 +3432,16 @@ const_rtx_hash_1 (rtx *xp, void *data)
       }
       break;
 
+    case CONST_WIDE_INT:
+      hwi = GET_MODE_PRECISION (mode);
+      {
+	for (i = 0; i < CONST_WIDE_INT_NUNITS (x); i++)
+	  hwi ^= CONST_WIDE_INT_ELT (x, i);
+	goto fold_hwi;
+      }
+
     case CONST_DOUBLE:
-      if (mode == VOIDmode)
+      if (TARGET_SUPPORTS_WIDE_INT == 0 && mode == VOIDmode)
 	{
 	  hwi = CONST_DOUBLE_LOW (x) ^ CONST_DOUBLE_HIGH (x);
 	  goto fold_hwi;
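
For readers following the MODE_INT case of simplify_immed_subreg in the
patch above, here is a minimal standalone sketch of the repacking loop:
little-endian bytes are folded into host-sized units, least significant
unit first.  It is an illustration only, not code from the patch.  The
names pack_units, VALUE_BIT and UNIT_BITS are hypothetical, uint64_t
stands in for HOST_WIDE_INT (assumed to be 64 bits wide), and the unit
count is derived from elem_bitsize rather than from
GET_MODE_BITSIZE (outer_submode).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-ins for value_bit and HOST_BITS_PER_WIDE_INT.  */
enum { VALUE_BIT = 8, UNIT_BITS = 64 };

/* Pack the first ELEM_BITSIZE bits of the little-endian byte array BYTES
   into 64-bit units, least significant unit first.  Returns the number
   of units written; OUT must have room for that many.  */
static size_t
pack_units (const unsigned char *bytes, int elem_bitsize, uint64_t *out)
{
  size_t units = ((size_t) elem_bitsize + UNIT_BITS - 1) / UNIT_BITS;
  int base = 0;

  for (size_t u = 0; u < units; u++)
    {
      uint64_t buf = 0;

      /* Consume one byte per step until this unit is full or the
         element runs out of bits.  */
      for (int i = 0; i < UNIT_BITS && base + i < elem_bitsize;
           i += VALUE_BIT)
        buf |= (uint64_t) *bytes++ << i;

      out[u] = buf;
      base += UNIT_BITS;
    }
  return units;
}
```

With nine input bytes 01 through 09 and an elem_bitsize of 72, this fills
two units: the first holds bytes 01 through 08 and the second holds 09.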

[-- Attachment #3: p5-3.clog --]
[-- Type: text/plain, Size: 4478 bytes --]

2013-02-26  Kenneth Zadeck <zadeck@naturalbridge.com>

	* alias.c (rtx_equal_for_memref_p): Fixed comment.
	* builtins.c (c_getstr, c_readstr, expand_builtin_signbit):
	Made to work with any size int.
	* combine.c (try_combine, subst): Changed to support any
	size integer.
	* coretypes.h (hwivec_def, hwivec, const_hwivec): New.
	* cse.c (hash_rtx_cb): Added CONST_WIDE_INT case and
	modified DOUBLE_INT case.
	* cselib.c (rtx_equal_for_cselib_1): Converted cases to
	CASE_CONST_UNIQUE.
	(cselib_hash_rtx): Added CONST_WIDE_INT case.
	* defaults.h (TARGET_SUPPORTS_WIDE_INT): New.
	* doc/rtl.texi (CONST_DOUBLE, CONST_WIDE_INT): Updated.
	* doc/tm.texi (TARGET_SUPPORTS_WIDE_INT): New.	
	* doc/tm.texi.in (TARGET_SUPPORTS_WIDE_INT): New.
	* dojump.c (prefer_and_bit_test): Use wide int api.
	* dwarf2out.c (get_full_len): New function.
	(dw_val_equal_p, size_of_loc_descr,
	output_loc_operands, print_die, attr_checksum, same_dw_val_p,
	size_of_die, value_format, output_die, mem_loc_descriptor,
	loc_descriptor, extract_int, add_const_value_attribute,
	hash_loc_operands, compare_loc_operands): Add support for wide-ints.
	(add_AT_wide): New function.
	* dwarf2out.h (enum dw_val_class): Added dw_val_class_wide_int.
	* emit-rtl.c (const_wide_int_htab): Add marking.
	(const_wide_int_htab_hash, const_wide_int_htab_eq,
	lookup_const_wide_int, immed_wide_int_const): New functions.
	(const_double_htab_hash, const_double_htab_eq,
	rtx_to_double_int, immed_double_const): Conditionally 
	changed CONST_DOUBLE behavior.
	(immed_double_const, init_emit_once): Changed to support wide-int.
	* explow.c (plus_constant): Now uses wide-int api.
	* expmed.c (mask_rtx, lshift_value): Now uses wide-int.
	(expand_mult, expand_smod_pow2): Made to work with any size int.
	(make_tree): Added CONST_WIDE_INT case.
	* expr.c (convert_modes): Added support for any size int.
	(emit_group_load_1): Added todo for place that still does not
	allow large ints.
	(store_expr, expand_constructor): Fixed comments.
	(expand_expr_real_2, expand_expr_real_1,
	reduce_to_bit_field_precision, const_vector_from_tree):
	Converted to use wide-int api.
	* final.c (output_addr_const): Added CONST_WIDE_INT case.
	* genemit.c (gen_exp): Added CONST_WIDE_INT case.
	* gengenrtl.c (excluded_rtx): Added CONST_WIDE_INT case.
	* gengtype.c (wide-int): New type.
	* genpreds.c (write_one_predicate_function): Fixed comment.
	(add_constraint): Added CONST_WIDE_INT test.
	(write_tm_constrs_h): Do not emit hval or lval if target
	supports wide integers.
	* gensupport.c (std_preds): Added const_wide_int_operand and
	const_scalar_int_operand.
	* optabs.c (expand_subword_shift, expand_doubleword_shift,
	expand_absneg_bit, expand_copysign_absneg,
	expand_copysign_bit): Made to work with any size int.
	* postreload.c (reload_cse_simplify_set): Now uses wide-int api.
	* print-rtl.c (print_rtx): Added CONST_WIDE_INT case.
	* read-rtl.c (validate_const_wide_int): New function.
	(read_rtx_code): Added CONST_WIDE_INT case.
	* recog.c (const_scalar_int_operand, const_double_operand):
	New versions if target supports wide integers.
	(const_wide_int_operand): New function.
	* rtl.c (DEF_RTL_EXPR): Added CONST_WIDE_INT case.
	(rtx_size): Ditto.
	(rtx_alloc_stat, hwivec_output_hex, hwivec_check_failed_bounds):
	New functions.
	(iterative_hash_rtx): Added CONST_WIDE_INT case.
	* rtl.def (CONST_WIDE_INT): New.
	* rtl.h (hwivec_def): New function.
	(HWI_GET_NUM_ELEM, HWI_PUT_NUM_ELEM, CONST_WIDE_INT_P,
	CONST_SCALAR_INT_P, XHWIVEC_ELT, HWIVEC_CHECK, CONST_WIDE_INT_VEC,
	CONST_WIDE_INT_NUNITS, CONST_WIDE_INT_ELT, rtx_alloc_v): New macros.
	(chain_next): Added hwiv case.
	(CASE_CONST_SCALAR_INT, CONST_INT, CONST_WIDE_INT):  Added new
	defs if target supports wide ints.
	* rtlanal.c (commutative_operand_precedence, split_double):
	Added CONST_WIDE_INT case.
	* sched-vis.c (print_value): Added CONST_WIDE_INT case and
	modified DOUBLE_INT case.
	* sel-sched-ir.c (lhs_and_rhs_separable_p): Fixed comment.
	* simplify-rtx.c (mode_signbit_p,
	simplify_const_unary_operation, simplify_binary_operation_1,
	simplify_const_binary_operation,
	simplify_const_relational_operation, simplify_immed_subreg):
	Made to work with any size int.
	* tree-ssa-address.c (addr_for_mem_ref): Changes to use
	wide-int rather than double-int.
	* tree.c (wide_int_to_tree): New function.
	* var-tracking.c (loc_cmp): Added CONST_WIDE_INT case.
	* varasm.c (const_rtx_hash_1): Added CONST_WIDE_INT case.
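
As a rough illustration of the ordering that the CONST_WIDE_INT case in
loc_cmp is after, the sketch below compares two multi-word constants by
unit count first and then element by element from the most significant
unit down.  It is a standalone approximation, not the patch's code:
cmp_units is a hypothetical name, int64_t stands in for HOST_WIDE_INT,
and plain arrays replace the CONST_WIDE_INT_NUNITS and CONST_WIDE_INT_ELT
accessors.  Both length tests are strict, so constants of equal length
fall through to the element loop.

```c
#include <assert.h>
#include <stdint.h>

/* Order two multi-word constants X (NX units) and Y (NY units).
   Shorter constants sort first; equal-length constants are compared
   from the most significant unit down.  Returns -1, 0 or 1.  */
static int
cmp_units (const int64_t *x, int nx, const int64_t *y, int ny)
{
  /* Compare the lengths first.  */
  if (nx > ny)
    return 1;
  if (nx < ny)
    return -1;

  /* Same length: the first differing unit decides.  */
  for (int j = nx - 1; j >= 0; j--)
    {
      if (x[j] < y[j])
        return -1;
      if (x[j] > y[j])
        return 1;
    }
  return 0;
}
```

Any consistent total order would do for loc_cmp's purposes; comparing
lengths first simply avoids touching the elements when the sizes differ.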

^ permalink raw reply	[flat|nested] 59+ messages in thread

* Re: GCC 4.8.0 Status Report (2012-10-29), Stage 1 to end soon
  2012-10-31 10:00   ` Richard Biener
  2012-10-31 10:02     ` Richard Sandiford
@ 2012-10-31 18:34     ` Andrew Haley
  1 sibling, 0 replies; 59+ messages in thread
From: Andrew Haley @ 2012-10-31 18:34 UTC (permalink / raw)
  To: Richard Biener; +Cc: Kenneth Zadeck, Jakub Jelinek, gcc, gcc-patches

On 10/31/2012 09:49 AM, Richard Biener wrote:
> On Tue, Oct 30, 2012 at 10:05 PM, Kenneth Zadeck
> <zadeck@naturalbridge.com> wrote:
>> jakub,
>>
>> i am hoping to get the rest of my wide integer conversion posted by nov 5.
>> I am under some adverse conditions here: hurricane sandy hit here pretty
>> badly.  my house is hooked up to a small generator, and no one has any power
>> for miles around.
>>
>> So far richi has promised to review them.   he has sent some comments, but
>> so far no reviews.    Some time after i get the first round of them posted,
>> i will do a second round that incorporates everyone's comments.
>>
>> But i would like a little slack here if possible.    While this work is a
>> show stopper for my private port, the patches address serious problems for
>> many of the public ports, especially ones that have very flexible vector
>> units.    I believe that there is a significant set of latent problems
>> currently with the existing ports that use ti mode that these patches will
>> fix.
>>
>> However, i will do everything in my power to get the first round of the
>> patches posted by the nov 5 deadline.
> 
> I suppose you are not going to merge your private port for 4.8 and thus
> the wide-int changes are not a show-stopper for you.
> 
> That said, I considered the main conversion to be appropriate to be
> deferred for the next stage1.  There is no advantage in disrupting the
> tree more at this stage.

We are still in Stage 1.  If it were later in the release cycle this
argument would have some merit, but under the rules this sort of thing
is allowed at any point in Stage 1.  If we aren't going to allow
something like this because "it's too late" we should have closed
Stage 1 earlier.

Andrew.

^ permalink raw reply	[flat|nested] 59+ messages in thread

* Re: GCC 4.8.0 Status Report (2012-10-29), Stage 1 to end soon
  2012-10-29 18:08 GCC 4.8.0 Status Report (2012-10-29), Stage 1 to end soon Jakub Jelinek
                   ` (4 preceding siblings ...)
  2012-10-30 21:07 ` Kenneth Zadeck
@ 2012-10-30 22:06 ` Sriraman Tallam
  2012-10-31  9:09 ` Bin Cheng
                   ` (5 subsequent siblings)
  11 siblings, 0 replies; 59+ messages in thread
From: Sriraman Tallam @ 2012-10-30 22:06 UTC (permalink / raw)
  To: Jakub Jelinek; +Cc: gcc, GCC Patches

Hi Jakub,

   My function multiversioning patch is being reviewed and I hope to
get this in by Nov. 5.

Thanks,
-Sri.

On Mon, Oct 29, 2012 at 10:56 AM, Jakub Jelinek <jakub@redhat.com> wrote:
> Status
> ======
>
> I'd like to close the stage 1 phase of GCC 4.8 development
> on Monday, November 5th.  If you have still patches for new features you'd
> like to see in GCC 4.8, please post them for review soon.  Patches
> posted before the freeze, but reviewed shortly after the freeze, may
> still go in, further changes should be just bugfixes and documentation
> fixes.
>
>
> Quality Data
> ============
>
> Priority          #   Change from Last Report
> --------        ---   -----------------------
> P1               23   + 23
> P2               77   +  8
> P3               85   + 84
> --------        ---   -----------------------
> Total           185   +115
>
>
> Previous Report
> ===============
>
> http://gcc.gnu.org/ml/gcc/2012-03/msg00011.html
>
> The next report will be sent by me again, announcing end of stage 1.

^ permalink raw reply	[flat|nested] 59+ messages in thread

* RE: GCC 4.8.0 Status Report (2012-10-29), Stage 1 to end soon
  2012-10-29 18:08 GCC 4.8.0 Status Report (2012-10-29), Stage 1 to end soon Jakub Jelinek
                   ` (5 preceding siblings ...)
  2012-10-30 22:06 ` Sriraman Tallam
@ 2012-10-31  9:09 ` Bin Cheng
  2012-10-31 10:23 ` Richard Biener
                   ` (4 subsequent siblings)
  11 siblings, 0 replies; 59+ messages in thread
From: Bin Cheng @ 2012-10-31  9:09 UTC (permalink / raw)
  To: 'Jakub Jelinek', gcc; +Cc: gcc-patches



> -----Original Message-----
> From: gcc-patches-owner@gcc.gnu.org [mailto:gcc-patches-owner@gcc.gnu.org] On
> Behalf Of Jakub Jelinek
> Sent: Tuesday, October 30, 2012 1:57 AM
> To: gcc@gcc.gnu.org
> Cc: gcc-patches@gcc.gnu.org
> Subject: GCC 4.8.0 Status Report (2012-10-29), Stage 1 to end soon
> 
> Status
> ======
> 
> I'd like to close the stage 1 phase of GCC 4.8 development on Monday, November
> 5th.  If you have still patches for new features you'd like to see in GCC 4.8,
> please post them for review soon.  Patches posted before the freeze, but
> reviewed shortly after the freeze, may still go in, further changes should be
> just bugfixes and documentation fixes.
> 
> 
> Quality Data
> ============
> 
> Priority          #   Change from Last Report
> --------        ---   -----------------------
> P1               23   + 23
> P2               77   +  8
> P3               85   + 84
> --------        ---   -----------------------
> Total           185   +115
> 
> 
> Previous Report
> ===============
> 
> http://gcc.gnu.org/ml/gcc/2012-03/msg00011.html
> 
> The next report will be sent by me again, announcing end of stage 1.

Hi,
I am working on the register-pressure-directed hoisting pass and have
committed the main patch to trunk. I still have two patches in this area
that improve it. I will send them shortly and hope they can be included in
4.8 if OK.

Thanks.



^ permalink raw reply	[flat|nested] 59+ messages in thread

* Re: GCC 4.8.0 Status Report (2012-10-29), Stage 1 to end soon
  2012-10-29 18:08 GCC 4.8.0 Status Report (2012-10-29), Stage 1 to end soon Jakub Jelinek
                   ` (6 preceding siblings ...)
  2012-10-31  9:09 ` Bin Cheng
@ 2012-10-31 10:23 ` Richard Biener
  2012-11-05 16:32   ` David Malcolm
  2012-10-31 10:31 ` JonY
                   ` (3 subsequent siblings)
  11 siblings, 1 reply; 59+ messages in thread
From: Richard Biener @ 2012-10-31 10:23 UTC (permalink / raw)
  To: Jakub Jelinek; +Cc: gcc, gcc-patches, David Malcolm, Michael Matz

On Mon, Oct 29, 2012 at 6:56 PM, Jakub Jelinek <jakub@redhat.com> wrote:
> Status
> ======
>
> I'd like to close the stage 1 phase of GCC 4.8 development
> on Monday, November 5th.  If you have still patches for new features you'd
> like to see in GCC 4.8, please post them for review soon.

Reminds me of the stable plugin API for introspection.  David, Micha - what's
the status here?  Adding this is certainly ok during stage3 and I think that
we should have something in 4.8 to kick off further development here.

Richard.

^ permalink raw reply	[flat|nested] 59+ messages in thread

* Re: GCC 4.8.0 Status Report (2012-10-29), Stage 1 to end soon
  2012-10-31 10:23 ` Richard Biener
@ 2012-11-05 16:32   ` David Malcolm
  0 siblings, 0 replies; 59+ messages in thread
From: David Malcolm @ 2012-11-05 16:32 UTC (permalink / raw)
  To: Richard Biener; +Cc: Jakub Jelinek, gcc, gcc-patches, Michael Matz

On Wed, 2012-10-31 at 11:13 +0100, Richard Biener wrote:
> On Mon, Oct 29, 2012 at 6:56 PM, Jakub Jelinek <jakub@redhat.com> wrote:
> > Status
> > ======
> >
> > I'd like to close the stage 1 phase of GCC 4.8 development
> > on Monday, November 5th.  If you have still patches for new features you'd
> > like to see in GCC 4.8, please post them for review soon.
> 
> Reminds me of the stable plugin API for introspection.  David, Micha - what's
> the status here?  Adding this is certainly ok during stage3 and I think that
> we should have something in 4.8 to kick off further development here.

(sorry for the belated response, I was on vacation).

I'm currently leaning towards having the API as a separate source tree
that can be compiled against 4.6 through 4.8 onwards (hiding all
necessary compatibility cruft within it [1]), generating a library that
plugins can link against, providing a consistent C API across all of
these GCC versions.  Keeping it out-of-tree allows plugins to be written
that can work with older versions of gcc, and allows the plugin API to
change more rapidly than the rest of gcc (especially important for these
older gcc releases).  Distributions of gcc could build the plugin api at
the same time as gcc, albeit from a separate tarball.

When the API is more mature, we could merge it inside gcc proper, I
guess.

I'll try to post something later today.

Dave
[1] e.g. C vs C++ linkage

^ permalink raw reply	[flat|nested] 59+ messages in thread

* Re: GCC 4.8.0 Status Report (2012-10-29), Stage 1 to end soon
  2012-10-29 18:08 GCC 4.8.0 Status Report (2012-10-29), Stage 1 to end soon Jakub Jelinek
                   ` (7 preceding siblings ...)
  2012-10-31 10:23 ` Richard Biener
@ 2012-10-31 10:31 ` JonY
  2012-10-31 10:44   ` Jakub Jelinek
  2012-10-31 11:12   ` Jonathan Wakely
  2012-11-02 22:51 ` [wwwdocs] PATCH for " Gerald Pfeifer
                   ` (2 subsequent siblings)
  11 siblings, 2 replies; 59+ messages in thread
From: JonY @ 2012-10-31 10:31 UTC (permalink / raw)
  To: gcc; +Cc: gcc-patches

[-- Attachment #1: Type: text/plain, Size: 620 bytes --]

On 10/30/2012 01:56, Jakub Jelinek wrote:
> Status
> ======
> 
> I'd like to close the stage 1 phase of GCC 4.8 development
> on Monday, November 5th.  If you have still patches for new features you'd
> like to see in GCC 4.8, please post them for review soon.  Patches
> posted before the freeze, but reviewed shortly after the freeze, may
> still go in, further changes should be just bugfixes and documentation
> fixes.
> 

Somebody with commit rights please push "[Patch] Remove
_GLIBCXX_HAVE_BROKEN_VSWPRINTF from mingw32-w64/os_defines.h".

Kai has already approved, but is off for the week.




[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 196 bytes --]

^ permalink raw reply	[flat|nested] 59+ messages in thread

* Re: GCC 4.8.0 Status Report (2012-10-29), Stage 1 to end soon
  2012-10-31 10:31 ` JonY
@ 2012-10-31 10:44   ` Jakub Jelinek
  2012-10-31 11:12   ` Jonathan Wakely
  1 sibling, 0 replies; 59+ messages in thread
From: Jakub Jelinek @ 2012-10-31 10:44 UTC (permalink / raw)
  To: JonY; +Cc: gcc, gcc-patches

On Wed, Oct 31, 2012 at 06:25:45PM +0800, JonY wrote:
> On 10/30/2012 01:56, Jakub Jelinek wrote:
> > I'd like to close the stage 1 phase of GCC 4.8 development
> > on Monday, November 5th.  If you have still patches for new features you'd
> > like to see in GCC 4.8, please post them for review soon.  Patches
> > posted before the freeze, but reviewed shortly after the freeze, may
> > still go in, further changes should be just bugfixes and documentation
> > fixes.
> > 
> 
> Somebody with commit rights please push "[Patch] Remove
> _GLIBCXX_HAVE_BROKEN_VSWPRINTF from mingw32-w64/os_defines.h".
> 
> Kai has already approved, but is off for the week.

That looks like a bugfix (or even regression bugfix).  Bugfixes are
fine through stage 3, regression bugfixes are fine even in stage 4.

	Jakub

^ permalink raw reply	[flat|nested] 59+ messages in thread

* Re: GCC 4.8.0 Status Report (2012-10-29), Stage 1 to end soon
  2012-10-31 10:31 ` JonY
  2012-10-31 10:44   ` Jakub Jelinek
@ 2012-10-31 11:12   ` Jonathan Wakely
  1 sibling, 0 replies; 59+ messages in thread
From: Jonathan Wakely @ 2012-10-31 11:12 UTC (permalink / raw)
  To: JonY; +Cc: gcc, gcc-patches, libstdc++

On 31 October 2012 10:25, JonY wrote:
> On 10/30/2012 01:56, Jakub Jelinek wrote:
>> Status
>> ======
>>
>> I'd like to close the stage 1 phase of GCC 4.8 development
>> on Monday, November 5th.  If you have still patches for new features you'd
>> like to see in GCC 4.8, please post them for review soon.  Patches
>> posted before the freeze, but reviewed shortly after the freeze, may
>> still go in, further changes should be just bugfixes and documentation
>> fixes.
>>
>
> Somebody with commit rights please push "[Patch] Remove
> _GLIBCXX_HAVE_BROKEN_VSWPRINTF from mingw32-w64/os_defines.h".
>
> Kai has already approved, but is off for the week.

I could have done that, if it had been sent to the right lists. All
libstdc++ patches go to both gcc-patches and libstdc++@gcc.gnu.org
please.

Let's move this to the libstdc++ list; I have some questions about the patch.


* [wwwdocs] PATCH for Re: GCC 4.8.0 Status Report (2012-10-29), Stage 1 to end soon
  2012-10-29 18:08 GCC 4.8.0 Status Report (2012-10-29), Stage 1 to end soon Jakub Jelinek
                   ` (8 preceding siblings ...)
  2012-10-31 10:31 ` JonY
@ 2012-11-02 22:51 ` Gerald Pfeifer
  2012-11-05 12:42 ` Peter Bergner
  2012-11-06  2:57 ` Easwaran Raman
  11 siblings, 0 replies; 59+ messages in thread
From: Gerald Pfeifer @ 2012-11-02 22:51 UTC (permalink / raw)
  To: gcc-patches

On Mon, 29 Oct 2012, Jakub Jelinek wrote:
> I'd like to close the stage 1 phase of GCC 4.8 development

Documented via the patch below.  I also changed "Active Development"
to "Development" to reduce text density and improve formatting on a
wider range of window/text sizes.

Gerald

Index: index.html
===================================================================
RCS file: /cvs/gcc/wwwdocs/htdocs/index.html,v
retrieving revision 1.863
diff -u -3 -p -r1.863 index.html
--- index.html	20 Sep 2012 15:35:43 -0000	1.863
+++ index.html	2 Nov 2012 22:48:54 -0000
@@ -171,12 +171,12 @@ Any additions?  Don't be shy, send them 
   </span>
 </dd>
 
-<dt><span class="version">Active development:</span>
+<dt><span class="version">Development:</span>
   GCC 4.8.0 (<a href="gcc-4.8/changes.html">changes</a>, <a href="gcc-4.8/criteria.html">release criteria</a>)
 </dt><dd>
   Status:
   <!--GCC 4.8 status below-->
-  <a href="http://gcc.gnu.org/ml/gcc/2012-03/msg00011.html">2012-03-02</a>
+  <a href="http://gcc.gnu.org/ml/gcc/2012-10/msg00434.html">2012-10-29</a>
   <!--GCC 4.8 status above-->
   (general development, stage 1).
   <br />


* Re: GCC 4.8.0 Status Report (2012-10-29), Stage 1 to end soon
  2012-10-29 18:08 GCC 4.8.0 Status Report (2012-10-29), Stage 1 to end soon Jakub Jelinek
                   ` (9 preceding siblings ...)
  2012-11-02 22:51 ` [wwwdocs] PATCH for " Gerald Pfeifer
@ 2012-11-05 12:42 ` Peter Bergner
  2012-11-05 12:53   ` Jakub Jelinek
  2012-11-06  2:57 ` Easwaran Raman
  11 siblings, 1 reply; 59+ messages in thread
From: Peter Bergner @ 2012-11-05 12:42 UTC (permalink / raw)
  To: Jakub Jelinek; +Cc: gcc, gcc-patches

On Mon, 2012-10-29 at 18:56 +0100, Jakub Jelinek wrote:
> Status
> ======
> 
> I'd like to close the stage 1 phase of GCC 4.8 development
> on Monday, November 5th.  If you have still patches for new features you'd
> like to see in GCC 4.8, please post them for review soon.  Patches
> posted before the freeze, but reviewed shortly after the freeze, may
> still go in, further changes should be just bugfixes and documentation
> fixes.

I'd like to post later today (hopefully this morning) a very minimal
configure patch that adds the -mcpu=power8 and -mtune=power8 compiler
options to gcc.  Currently, power8 will be an alias for power7, but
getting this patch in now allows us to add power8 support to the
compiler without having to touch the arch independent configure script.
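
For illustration only (this is not the patch being discussed): a minimal C sketch of what "power8 as an alias for power7" could look like in a name lookup table. Every identifier here (`demo_cpu`, `demo_cpu_names`, `demo_lookup_cpu`) is hypothetical and does not correspond to GCC's rs6000 backend or its configure machinery.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical processor identifiers; not GCC's real enum. */
enum demo_cpu { DEMO_CPU_POWER7, DEMO_CPU_UNKNOWN };

/* Maps a -mcpu=/-mtune= style name to a processor identifier. */
struct demo_cpu_name {
    const char *name;
    enum demo_cpu cpu;
};

static const struct demo_cpu_name demo_cpu_names[] = {
    { "power7", DEMO_CPU_POWER7 },
    { "power8", DEMO_CPU_POWER7 },  /* alias: accepted, but treated as power7 */
};

/* Return the processor for NAME, or DEMO_CPU_UNKNOWN if it is not listed. */
static enum demo_cpu demo_lookup_cpu(const char *name)
{
    assert(name != NULL);
    for (size_t i = 0; i < sizeof demo_cpu_names / sizeof demo_cpu_names[0]; i++) {
        if (strcmp(name, demo_cpu_names[i].name) == 0)
            return demo_cpu_names[i].cpu;
    }
    return DEMO_CPU_UNKNOWN;
}
```

With a table like this, accepting the new name early means later work only has to change what the "power8" entry maps to, not where names are recognized.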

The only hang-up at the moment is that we're still determining the
assembler mnemonic we'll be releasing that the gcc configure script
will use to test for power8 assembler support.

Peter


* Re: GCC 4.8.0 Status Report (2012-10-29), Stage 1 to end soon
  2012-11-05 12:42 ` Peter Bergner
@ 2012-11-05 12:53   ` Jakub Jelinek
  2012-11-05 14:40     ` Peter Bergner
  0 siblings, 1 reply; 59+ messages in thread
From: Jakub Jelinek @ 2012-11-05 12:53 UTC (permalink / raw)
  To: Peter Bergner; +Cc: gcc, gcc-patches

On Mon, Nov 05, 2012 at 06:41:47AM -0600, Peter Bergner wrote:
> On Mon, 2012-10-29 at 18:56 +0100, Jakub Jelinek wrote:
> > I'd like to close the stage 1 phase of GCC 4.8 development
> > on Monday, November 5th.  If you have still patches for new features you'd
> > like to see in GCC 4.8, please post them for review soon.  Patches
> > posted before the freeze, but reviewed shortly after the freeze, may
> > still go in, further changes should be just bugfixes and documentation
> > fixes.
> 
> I'd like to post later today (hopefully this morning) a very minimal
> configure patch that adds the -mcpu=power8 and -mtune=power8 compiler
> options to gcc.  Currently, power8 will be an alias for power7, but
> getting this patch in now allows us to add power8 support to the
> compiler without having to touch the arch independent configure script.

config.gcc target-specific hunks are part of the backend, and the individual
target maintainers can approve changes to them.  I really don't see a reason
to add a dummy alias now just for that.  If the power8 enablement is
approved and non-intrusive enough that it would be acceptable even during
stage 3, then so would be the corresponding config.gcc changes.

	Jakub


* Re: GCC 4.8.0 Status Report (2012-10-29), Stage 1 to end soon
  2012-11-05 12:53   ` Jakub Jelinek
@ 2012-11-05 14:40     ` Peter Bergner
  2012-11-05 14:48       ` Jakub Jelinek
  0 siblings, 1 reply; 59+ messages in thread
From: Peter Bergner @ 2012-11-05 14:40 UTC (permalink / raw)
  To: Jakub Jelinek; +Cc: gcc, gcc-patches

On Mon, 2012-11-05 at 13:53 +0100, Jakub Jelinek wrote:
> On Mon, Nov 05, 2012 at 06:41:47AM -0600, Peter Bergner wrote:
> > I'd like to post later today (hopefully this morning) a very minimal
> > configure patch that adds the -mcpu=power8 and -mtune=power8 compiler
> > options to gcc.  Currently, power8 will be an alias for power7, but
> > getting this patch in now allows us to add power8 support to the
> > compiler without having to touch the arch independent configure script.
> 
> config.gcc target specific hunks are part of the backend, the individual
> target maintainers can approve changes to that, I really don't see a reason
> to add a dummy alias now just for that.  If the power8 enablement is
> approved and non-intrusive enough that it would be acceptable even during
> stage 3, then so would be corresponding config.gcc changes.

Well we also patch config.in and configure.ac/configure.  If those are
acceptable to be patched later too, then great.  If not, the patch
isn't really very large.  We did do this for power7 initially too:

  http://gcc.gnu.org/ml/gcc-patches/2008-08/msg00162.html

Peter



* Re: GCC 4.8.0 Status Report (2012-10-29), Stage 1 to end soon
  2012-11-05 14:40     ` Peter Bergner
@ 2012-11-05 14:48       ` Jakub Jelinek
  2012-11-06  4:47         ` Peter Bergner
  0 siblings, 1 reply; 59+ messages in thread
From: Jakub Jelinek @ 2012-11-05 14:48 UTC (permalink / raw)
  To: Peter Bergner; +Cc: gcc, gcc-patches

On Mon, Nov 05, 2012 at 08:40:00AM -0600, Peter Bergner wrote:
> Well we also patch config.in and configure.ac/configure.  If those are
> acceptable to be patched later too, then great.  If not, the patch

That is the same thing as config.gcc bits.

> isn't really very large.  We did do this for power7 initially too:
> 
>   http://gcc.gnu.org/ml/gcc-patches/2008-08/msg00162.html

But then the power7 patch went in during stage 1 of the n+1 release, and
wasn't really backported to the release branch (just to distro vendor
branches), right?

	Jakub


* Re: GCC 4.8.0 Status Report (2012-10-29), Stage 1 to end soon
  2012-11-05 14:48       ` Jakub Jelinek
@ 2012-11-06  4:47         ` Peter Bergner
  0 siblings, 0 replies; 59+ messages in thread
From: Peter Bergner @ 2012-11-06  4:47 UTC (permalink / raw)
  To: Jakub Jelinek; +Cc: gcc, gcc-patches

On Mon, 2012-11-05 at 15:47 +0100, Jakub Jelinek wrote:
> On Mon, Nov 05, 2012 at 08:40:00AM -0600, Peter Bergner wrote:
> > Well we also patch config.in and configure.ac/configure.  If those are
> > acceptable to be patched later too, then great.  If not, the patch
> 
> That is the same thing as config.gcc bits.
> 
> > isn't really very large.  We did do this for power7 initially too:
> > 
> >   http://gcc.gnu.org/ml/gcc-patches/2008-08/msg00162.html
> 
> But then power7 patch went in during stage1 of the n+1 release, and
> wasn't really backported to release branch (just to distro vendor branches),
> right?

I think we could have done better there, yes, but not all of our patches
were appropriate for backporting, especially those parts that touched
outside of the port.  There will be portions of power8 we won't/don't
want to backport either, but I would like to get the major backend
portions like machine description files and the like backported to
4.8 when the time comes.  Having the configurey changes in would help
that, but if you say those are things we can get in after stage1,
then that can ease things a bit.  That said, I'll post our current
patch as is and discuss within our team and with David on what our
next course of action should be.

Peter


* Re: GCC 4.8.0 Status Report (2012-10-29), Stage 1 to end soon
  2012-10-29 18:08 GCC 4.8.0 Status Report (2012-10-29), Stage 1 to end soon Jakub Jelinek
                   ` (10 preceding siblings ...)
  2012-11-05 12:42 ` Peter Bergner
@ 2012-11-06  2:57 ` Easwaran Raman
  11 siblings, 0 replies; 59+ messages in thread
From: Easwaran Raman @ 2012-11-06  2:57 UTC (permalink / raw)
  To: Jakub Jelinek; +Cc: GCC Mailing List, gcc-patches

I'd like to get a small patch to tree reassociation (
http://gcc.gnu.org/ml/gcc-patches/2012-10/msg01761.html ) in.

Thanks,
Easwaran

On Mon, Oct 29, 2012 at 10:56 AM, Jakub Jelinek <jakub@redhat.com> wrote:
> Status
> ======
>
> I'd like to close the stage 1 phase of GCC 4.8 development
> on Monday, November 5th.  If you have still patches for new features you'd
> like to see in GCC 4.8, please post them for review soon.  Patches
> posted before the freeze, but reviewed shortly after the freeze, may
> still go in, further changes should be just bugfixes and documentation
> fixes.
>
>
> Quality Data
> ============
>
> Priority          #   Change from Last Report
> --------        ---   -----------------------
> P1               23   + 23
> P2               77   +  8
> P3               85   + 84
> --------        ---   -----------------------
> Total           185   +115
>
>
> Previous Report
> ===============
>
> http://gcc.gnu.org/ml/gcc/2012-03/msg00011.html
>
> The next report will be sent by me again, announcing end of stage 1.


* Re: GCC 4.8.0 Status Report (2012-10-29), Stage 1 to end soon
@ 2012-10-30 23:18 Sharad Singhai
  2012-11-01  7:52 ` Sharad Singhai
  0 siblings, 1 reply; 59+ messages in thread
From: Sharad Singhai @ 2012-10-30 23:18 UTC (permalink / raw)
  To: Jakub Jelinek; +Cc: gcc-patches, gcc

Hi Jakub,

My -fopt-info pass filtering patch
(http://gcc.gnu.org/ml/gcc-patches/2012-10/msg02704.html) is being
reviewed and I hope to get this in by Nov. 5 for inclusion in gcc
4.8.0.

Thanks,
Sharad

On Mon, Oct 29, 2012 at 10:56 AM, Jakub Jelinek <jakub@redhat.com> wrote:
> Status
> ======
>
> I'd like to close the stage 1 phase of GCC 4.8 development
> on Monday, November 5th.  If you have still patches for new features you'd
> like to see in GCC 4.8, please post them for review soon.  Patches
> posted before the freeze, but reviewed shortly after the freeze, may
> still go in, further changes should be just bugfixes and documentation
> fixes.
>
>
> Quality Data
> ============
>
> Priority          #   Change from Last Report
> --------        ---   -----------------------
> P1               23   + 23
> P2               77   +  8
> P3               85   + 84
> --------        ---   -----------------------
> Total           185   +115
>
>
> Previous Report
> ===============
>
> http://gcc.gnu.org/ml/gcc/2012-03/msg00011.html
>
> The next report will be sent by me again, announcing end of stage 1.


* Re: GCC 4.8.0 Status Report (2012-10-29), Stage 1 to end soon
  2012-10-30 23:18 Sharad Singhai
@ 2012-11-01  7:52 ` Sharad Singhai
  2012-11-01 12:28   ` Jakub Jelinek
  0 siblings, 1 reply; 59+ messages in thread
From: Sharad Singhai @ 2012-11-01  7:52 UTC (permalink / raw)
  To: Jakub Jelinek; +Cc: gcc-patches, gcc

On Tue, Oct 30, 2012 at 4:04 PM, Sharad Singhai <singhai@google.com> wrote:
> Hi Jakub,
>
> My -fopt-info pass filtering patch
> (http://gcc.gnu.org/ml/gcc-patches/2012-10/msg02704.html) is being
> reviewed and I hope to get this in by Nov. 5 for inclusion in gcc
> 4.8.0.

I just committed the -fopt-info pass filtering patch as r193061.

Thanks,
Sharad

> Thanks,
> Sharad
>
> On Mon, Oct 29, 2012 at 10:56 AM, Jakub Jelinek <jakub@redhat.com> wrote:
>> Status
>> ======
>>
>> I'd like to close the stage 1 phase of GCC 4.8 development
>> on Monday, November 5th.  If you have still patches for new features you'd
>> like to see in GCC 4.8, please post them for review soon.  Patches
>> posted before the freeze, but reviewed shortly after the freeze, may
>> still go in, further changes should be just bugfixes and documentation
>> fixes.
>>
>>
>> Quality Data
>> ============
>>
>> Priority          #   Change from Last Report
>> --------        ---   -----------------------
>> P1               23   + 23
>> P2               77   +  8
>> P3               85   + 84
>> --------        ---   -----------------------
>> Total           185   +115
>>
>>
>> Previous Report
>> ===============
>>
>> http://gcc.gnu.org/ml/gcc/2012-03/msg00011.html
>>
>> The next report will be sent by me again, announcing end of stage 1.


* Re: GCC 4.8.0 Status Report (2012-10-29), Stage 1 to end soon
  2012-11-01  7:52 ` Sharad Singhai
@ 2012-11-01 12:28   ` Jakub Jelinek
  2012-11-01 13:09     ` Diego Novillo
  2012-11-01 13:54     ` Sharad Singhai
  0 siblings, 2 replies; 59+ messages in thread
From: Jakub Jelinek @ 2012-11-01 12:28 UTC (permalink / raw)
  To: Sharad Singhai; +Cc: gcc-patches, gcc

On Thu, Nov 01, 2012 at 12:52:04AM -0700, Sharad Singhai wrote:
> On Tue, Oct 30, 2012 at 4:04 PM, Sharad Singhai <singhai@google.com> wrote:
> > Hi Jakub,
> >
> > My -fopt-info pass filtering patch
> > (http://gcc.gnu.org/ml/gcc-patches/2012-10/msg02704.html) is being
> > reviewed and I hope to get this in by Nov. 5 for inclusion in gcc
> > 4.8.0.
> 
> I just committed -fopt-info pass filtering patch as r193061.

How was that change tested?  I'm seeing thousands of new UNRESOLVED
failures, of the form:
spawn -ignore SIGHUP /usr/src/gcc/obj415/gcc/xgcc -B/usr/src/gcc/obj415/gcc/ /usr/src/gcc/gcc/testsuite/gcc.target/i386/branch-cost1.c -fno-diagnostics-show-caret -O2 -fdump-tree-gimple -mbranch-cost=0 -S -o branch-cost1.s
PASS: gcc.target/i386/branch-cost1.c (test for excess errors)
gcc.target/i386/branch-cost1.c: dump file does not exist
UNRESOLVED: gcc.target/i386/branch-cost1.c scan-tree-dump-times gimple "if " 2
gcc.target/i386/branch-cost1.c: dump file does not exist
UNRESOLVED: gcc.target/i386/branch-cost1.c scan-tree-dump-not gimple " & "

See http://gcc.gnu.org/ml/gcc-testresults/2012-11/msg00033.html
or http://gcc.gnu.org/ml/gcc-testresults/2012-11/msg00034.html, compare that
to http://gcc.gnu.org/ml/gcc-testresults/2012-11/msg00025.html
or http://gcc.gnu.org/ml/gcc-testresults/2012-11/msg00026.html

The difference is just your patch and an unrelated sh backend change.

	Jakub


* Re: GCC 4.8.0 Status Report (2012-10-29), Stage 1 to end soon
  2012-11-01 12:28   ` Jakub Jelinek
@ 2012-11-01 13:09     ` Diego Novillo
  2012-11-01 16:41       ` Sharad Singhai
  2012-11-01 13:54     ` Sharad Singhai
  1 sibling, 1 reply; 59+ messages in thread
From: Diego Novillo @ 2012-11-01 13:09 UTC (permalink / raw)
  To: Jakub Jelinek; +Cc: Sharad Singhai, gcc-patches, gcc

On Thu, Nov 1, 2012 at 8:28 AM, Jakub Jelinek <jakub@redhat.com> wrote:

> How was that change tested?  I'm seeing thousands of new UNRESOLVED
> failures, of the form:
> spawn -ignore SIGHUP /usr/src/gcc/obj415/gcc/xgcc -B/usr/src/gcc/obj415/gcc/ /usr/src/gcc/gcc/testsuite/gcc.target/i386/branch-cost1.c -fno-diagnostics-show-caret -O2 -fdump-tree-gimple -mbranch-cost=0 -S -o branch-cost1.s
> PASS: gcc.target/i386/branch-cost1.c (test for excess errors)
> gcc.target/i386/branch-cost1.c: dump file does not exist
> UNRESOLVED: gcc.target/i386/branch-cost1.c scan-tree-dump-times gimple "if " 2
> gcc.target/i386/branch-cost1.c: dump file does not exist
> UNRESOLVED: gcc.target/i386/branch-cost1.c scan-tree-dump-not gimple " & "
>
> See http://gcc.gnu.org/ml/gcc-testresults/2012-11/msg00033.html
> or http://gcc.gnu.org/ml/gcc-testresults/2012-11/msg00034.html, compare that
> to http://gcc.gnu.org/ml/gcc-testresults/2012-11/msg00025.html
> or http://gcc.gnu.org/ml/gcc-testresults/2012-11/msg00026.html
>
> The difference is just your patch and unrelated sh backend change.

I'm seeing the same failures.  Sharad, could you fix them or revert your change?


Thanks.  Diego.


* Re: GCC 4.8.0 Status Report (2012-10-29), Stage 1 to end soon
  2012-11-01 13:09     ` Diego Novillo
@ 2012-11-01 16:41       ` Sharad Singhai
  2012-11-01 16:44         ` Diego Novillo
  2012-11-01 18:02         ` Sterling Augustine
  0 siblings, 2 replies; 59+ messages in thread
From: Sharad Singhai @ 2012-11-01 16:41 UTC (permalink / raw)
  To: Diego Novillo; +Cc: Jakub Jelinek, gcc-patches, gcc

I found the problem and the following patch fixes it. The issue with
my testing was that I was only looking at 'FAIL' lines but forgot to
tally the 'UNRESOLVED' test cases, the real symptoms of my test
problems.  In any case,  I am rerunning the whole testsuite just to be
sure.
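
As a rough sketch of the check described above, the following C helper counts both 'FAIL' and 'UNRESOLVED' result lines in testsuite summary text. It assumes each result line starts with its status followed by a colon and a space, as in the log excerpts quoted in this thread; the names `result_tally` and `tally_results` are made up for this example.

```c
#include <assert.h>
#include <string.h>

/* Counts of the result kinds that signal a problem. */
struct result_tally {
    int fail;
    int unresolved;
};

/* Return nonzero if LINE starts with PREFIX. */
static int starts_with(const char *line, const char *prefix)
{
    return strncmp(line, prefix, strlen(prefix)) == 0;
}

/* Scan newline-separated TEXT and count "FAIL: " and "UNRESOLVED: " lines. */
static struct result_tally tally_results(const char *text)
{
    struct result_tally t = { 0, 0 };
    const char *line = text;

    assert(text != NULL);
    while (*line != '\0') {
        if (starts_with(line, "FAIL: "))
            t.fail++;
        else if (starts_with(line, "UNRESOLVED: "))
            t.unresolved++;

        /* Advance to the start of the next line, if there is one. */
        const char *nl = strchr(line, '\n');
        if (nl == NULL)
            break;
        line = nl + 1;
    }
    return t;
}
```

A run is only clean when both counters match the baseline; a log made up solely of PASS and UNRESOLVED lines would look fine to a check that greps for FAIL alone.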

Assuming tests pass, is it okay to commit the following?

Thanks,
Sharad

2012-11-01  Sharad Singhai  <singhai@google.com>

PR other/55164
* dumpfile.h (struct dump_file_info): Fix order of flags.

Index: dumpfile.h
===================================================================
--- dumpfile.h (revision 193061)
+++ dumpfile.h (working copy)
@@ -113,8 +113,8 @@ struct dump_file_info
   const char *alt_filename;     /* filename for the -fopt-info stream  */
   FILE *pstream;                /* pass-specific dump stream  */
   FILE *alt_stream;             /* -fopt-info stream */
+  int pflags;                   /* dump flags */
   int optgroup_flags;           /* optgroup flags for -fopt-info */
-  int pflags;                   /* dump flags */
   int alt_flags;                /* flags for opt-info */
   int pstate;                   /* state of pass-specific stream */
   int alt_state;                /* state of the -fopt-info stream */
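
Illustration only, not part of the patch: one plausible reason the order of these fields matters is that a table of such structs may be filled in with positional initializers, so swapping two `int` members silently moves values into the wrong field without any compiler diagnostic. That mechanism is an assumption here, not something stated in this message. The structs below are cut-down, hypothetical stand-ins (`demo_info_swapped`, `demo_info_fixed`) and `DEMO_DUMP_FLAG` is a made-up value.

```c
#include <assert.h>

#define DEMO_DUMP_FLAG 0x4  /* made-up flag value, for illustration only */

/* Hypothetical layout with optgroup_flags ahead of pflags. */
struct demo_info_swapped {
    int optgroup_flags;
    int pflags;
    int alt_flags;
};

/* Hypothetical layout with pflags first, matching the initializers below. */
struct demo_info_fixed {
    int pflags;
    int optgroup_flags;
    int alt_flags;
};

/* Both entries use the same positional initializer, written for the
   order { pflags, optgroup_flags, alt_flags }. */
static const struct demo_info_swapped demo_swapped = { DEMO_DUMP_FLAG, 0, 0 };
static const struct demo_info_fixed demo_fixed = { DEMO_DUMP_FLAG, 0, 0 };

/* Return 1 if the dump flag is visible in pflags of the swapped layout. */
static int demo_swapped_dump_enabled(void)
{
    return (demo_swapped.pflags & DEMO_DUMP_FLAG) != 0;
}

/* Return 1 if the dump flag is visible in pflags of the fixed layout. */
static int demo_fixed_dump_enabled(void)
{
    return (demo_fixed.pflags & DEMO_DUMP_FLAG) != 0;
}

/* Check the claim above; meant to be called from a test driver. */
static void demo_check_layouts(void)
{
    assert(!demo_swapped_dump_enabled());                   /* flag went missing from pflags */
    assert(demo_swapped.optgroup_flags == DEMO_DUMP_FLAG);  /* it landed here instead */
    assert(demo_fixed_dump_enabled());                      /* flag is where readers expect it */
}
```

Designated initializers (`.pflags = ...`) would make such a table independent of member order, at the cost of more verbose entries.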


* Re: GCC 4.8.0 Status Report (2012-10-29), Stage 1 to end soon
  2012-11-01 16:41       ` Sharad Singhai
@ 2012-11-01 16:44         ` Diego Novillo
  2012-11-01 17:59           ` Sharad Singhai
  2012-11-01 18:02         ` Sterling Augustine
  1 sibling, 1 reply; 59+ messages in thread
From: Diego Novillo @ 2012-11-01 16:44 UTC (permalink / raw)
  To: Sharad Singhai; +Cc: Jakub Jelinek, gcc-patches, gcc

On Thu, Nov 1, 2012 at 12:40 PM, Sharad Singhai <singhai@google.com> wrote:
> I found the problem and the following patch fixes it. The issue with
> my testing was that I was only looking at 'FAIL' lines but forgot to
> tally the 'UNRESOLVED' test cases, the real symptoms of my test
> problems.  In any case,  I am rerunning the whole testsuite just to be
> sure.
>
> Assuming tests pass, is it okay to commit the following?
>
> Thanks,
> Sharad
>
> 2012-11-01  Sharad Singhai  <singhai@google.com>
>
> PR other/55164
> * dumpfile.h (struct dump_file_info): Fix order of flags.

OK (remember to insert a tab at the start of each ChangeLog line).


Diego.


* Re: GCC 4.8.0 Status Report (2012-10-29), Stage 1 to end soon
  2012-11-01 16:44         ` Diego Novillo
@ 2012-11-01 17:59           ` Sharad Singhai
  0 siblings, 0 replies; 59+ messages in thread
From: Sharad Singhai @ 2012-11-01 17:59 UTC (permalink / raw)
  To: Diego Novillo; +Cc: Jakub Jelinek, gcc-patches, gcc

On Thu, Nov 1, 2012 at 9:44 AM, Diego Novillo <dnovillo@google.com> wrote:
> On Thu, Nov 1, 2012 at 12:40 PM, Sharad Singhai <singhai@google.com> wrote:
>> I found the problem and the following patch fixes it. The issue with
>> my testing was that I was only looking at 'FAIL' lines but forgot to
>> tally the 'UNRESOLVED' test cases, the real symptoms of my test
>> problems.  In any case,  I am rerunning the whole testsuite just to be
>> sure.
>>
>> Assuming tests pass, is it okay to commit the following?
>>
>> Thanks,
>> Sharad
>>
>> 2012-11-01  Sharad Singhai  <singhai@google.com>
>>
>> PR other/55164
>> * dumpfile.h (struct dump_file_info): Fix order of flags.
>
> OK (remember to insert a tab at the start of each ChangeLog line).

Fixed tab chars. (they were really there, but gmail ate them! :))

Retested and found all my 'UNRESOLVED' problems were gone. Hence
committed the fix as r193064.

Thanks,
Sharad

>
> Diego.


* Re: GCC 4.8.0 Status Report (2012-10-29), Stage 1 to end soon
  2012-11-01 16:41       ` Sharad Singhai
  2012-11-01 16:44         ` Diego Novillo
@ 2012-11-01 18:02         ` Sterling Augustine
  1 sibling, 0 replies; 59+ messages in thread
From: Sterling Augustine @ 2012-11-01 18:02 UTC (permalink / raw)
  To: gcc-patches, gcc; +Cc: Jakub Jelinek

Hi Jakub,

I would like to get the fission implementation in before stage 1 ends.
It has been under review for some time, and is awaiting another round
of review now.

More info here:

http://gcc.gnu.org/ml/gcc-patches/2012-10/msg02684.html

Sterling


* Re: GCC 4.8.0 Status Report (2012-10-29), Stage 1 to end soon
  2012-11-01 12:28   ` Jakub Jelinek
  2012-11-01 13:09     ` Diego Novillo
@ 2012-11-01 13:54     ` Sharad Singhai
  1 sibling, 0 replies; 59+ messages in thread
From: Sharad Singhai @ 2012-11-01 13:54 UTC (permalink / raw)
  To: Jakub Jelinek; +Cc: gcc-patches, gcc

I am really sorry about that. I am looking into it and will fix the
breakage or revert the patch shortly.

Thanks,
Sharad

On Thu, Nov 1, 2012 at 5:28 AM, Jakub Jelinek <jakub@redhat.com> wrote:
> On Thu, Nov 01, 2012 at 12:52:04AM -0700, Sharad Singhai wrote:
>> On Tue, Oct 30, 2012 at 4:04 PM, Sharad Singhai <singhai@google.com> wrote:
>> > Hi Jakub,
>> >
>> > My -fopt-info pass filtering patch
>> > (http://gcc.gnu.org/ml/gcc-patches/2012-10/msg02704.html) is being
>> > reviewed and I hope to get this in by Nov. 5 for inclusion in gcc
>> > 4.8.0.
>>
>> I just committed -fopt-info pass filtering patch as r193061.
>
> How was that change tested?  I'm seeing thousands of new UNRESOLVED
> failures, of the form:
> spawn -ignore SIGHUP /usr/src/gcc/obj415/gcc/xgcc -B/usr/src/gcc/obj415/gcc/ /usr/src/gcc/gcc/testsuite/gcc.target/i386/branch-cost1.c -fno-diagnostics-show-caret -O2 -fdump-tree-gimple -mbranch-cost=0 -S -o branch-cost1.s
> PASS: gcc.target/i386/branch-cost1.c (test for excess errors)
> gcc.target/i386/branch-cost1.c: dump file does not exist
> UNRESOLVED: gcc.target/i386/branch-cost1.c scan-tree-dump-times gimple "if " 2
> gcc.target/i386/branch-cost1.c: dump file does not exist
> UNRESOLVED: gcc.target/i386/branch-cost1.c scan-tree-dump-not gimple " & "
>
> See http://gcc.gnu.org/ml/gcc-testresults/2012-11/msg00033.html
> or http://gcc.gnu.org/ml/gcc-testresults/2012-11/msg00034.html, compare that
> to http://gcc.gnu.org/ml/gcc-testresults/2012-11/msg00025.html
> or http://gcc.gnu.org/ml/gcc-testresults/2012-11/msg00026.html
>
> The difference is just your patch and unrelated sh backend change.
>
>         Jakub
