public inbox for gcc@gcc.gnu.org
* Design a microcontroller for gcc
@ 2006-02-14 23:22 Sylvain Munaut
  2006-02-14 23:40 ` DJ Delorie
  2006-02-15 22:10 ` Hans-Peter Nilsson
  0 siblings, 2 replies; 26+ messages in thread
From: Sylvain Munaut @ 2006-02-14 23:22 UTC (permalink / raw)
  To: gcc

Hello,


I'm currently considering designing my own microcontroller to use on
an FPGA. Since I'd like to be able to use C to write the
"non-timing-critical" parts of my code, I thought I'd include a gcc port
as part of the design considerations.


I've read the online manual about gcc backends and googled for
comments about gcc on microcontrollers, and I found this thread
particularly interesting:

http://gcc.gnu.org/ml/gcc/2003-03/msg01402.html


Here is a small description of the type of microcontroller I'm thinking of :
	* 8 bit RISC microcontroller
	* 16 general purpose registers
	* 16 level deep hardware call stack
	* no special instruction or regs for the stack
	* load/store to a flat "memory" (minimum 1k, up to 16k)
	  (addressing in that memory uses a register for the 8 lsb and
           a special register for the msbs)
	  that memory can be loaded along with the code, e.g. to have known
	  content at startup.
	* classic add/sub/... have 3 registers operand (dest,src1,src2)
	* no hardware mul, nor shift-by-n (have to be done in software)
	* pipelined, with some restrictions on instruction scheduling
	  (see below)
	* a special "I/O" space (gcc shouldn't care, we should only
	  access it in asm)
	* relative short/long call/branch, absolute short/long
	  call/branch. (The long branch/jump are achieved by preloading
	  a special GPR with the MSBs to use for the next jump)
	* 2 flags Carry & Zero for testing.
	
	

(note these are not random choices, I know how I can implement that in a
FPGA quite efficiently ...)
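
As a rough illustration of the split addressing described in the list above, here is a minimal C model. The function name, the 6-bit width of the "special" register, and the masking behaviour are assumptions chosen to cover the 16k maximum; the actual hardware may differ.

```c
#include <stdint.h>

/* Hypothetical model: a special page register supplies the upper address
   bits, an ordinary 8-bit register supplies the low 8 bits. */
static uint16_t effective_address(uint8_t page_reg, uint8_t low_reg)
{
    /* 16k of memory needs 14 address bits: 6 from the page register
       (assumed width) and 8 from the general purpose register. */
    return (uint16_t)(((uint16_t)(page_reg & 0x3Fu) << 8) | low_reg);
}
```

Every access that crosses a 256-byte boundary needs the page register rewritten first, which is the kind of split address a compiler has trouble modelling as a single pointer.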

My first question would be : "Do you see anything that's missing that
would be a great plus ?"

I mentioned earlier that there are some scheduling restrictions on the
instructions due to internal pipelining. For example, the result of a
fetch from memory may not be used in the instruction directly following
the fetch. When there is a conditional branch, the instruction just
following the branch will always be executed, no matter what the result
is (branch/call/... are not immediate but have a 1 instruction latency
before they occur). Are these kinds of limitations 'easily' supported by
gcc ?

I saw several times that gcc works better with a lot of GPRs. I could
increase them to 32 but then arithmetic instructions would have to use
the same register for the destination as for src1.

eg. for the add instruction,  I'm planning :

add rD, rA, rB		; rD = rA + rB
add rD, rD, imm8	; rD = rD + immediate_8bits

but if I allow 32 registers, my opcode is too short so I have to limit myself to :

add rD, rD, rA		; rD = rD + rA
add rD, rD, imm8	; rD = rD + immediate_8bits
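
The encoding pressure behind this choice can be checked with a little arithmetic. The sketch below assumes the 18-bit instruction word mentioned later in the thread; the helper name and its parameters are illustrative only.

```c
/* Bits left for the operation field once register fields and an optional
   immediate have been placed in the instruction word. */
static int opcode_bits_left(int word_bits, int num_regs,
                            int reg_fields, int imm_bits)
{
    int reg_bits = 0;
    while ((1 << reg_bits) < num_regs)
        reg_bits++;                 /* ceil(log2(num_regs)) */
    return word_bits - reg_fields * reg_bits - imm_bits;
}
```

With 18-bit words, three register fields leave 6 bits for 16 registers but only 3 bits for 32 registers, while the two-field form with 32 registers leaves 8 bits.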

I'm more in favor of only 16 registers because :
 - Some microcontrollers have even fewer and have a working gcc port
 - 16 is already a good number
 - 16 uses less space in the FPGA ;p
 - I'd rather be able to store the result elsewhere than have
   32 registers.

But maybe I'm wrong ...



Any comments, thoughts, ... are welcome !



	Sylvain

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: Design a microcontroller for gcc
  2006-02-14 23:22 Design a microcontroller for gcc Sylvain Munaut
@ 2006-02-14 23:40 ` DJ Delorie
  2006-02-15  0:28   ` Sylvain Munaut
  2006-02-15 22:10 ` Hans-Peter Nilsson
  1 sibling, 1 reply; 26+ messages in thread
From: DJ Delorie @ 2006-02-14 23:40 UTC (permalink / raw)
  To: tnt; +Cc: gcc


> 	* 8 bit RISC microcontroller

Not 16?

> 	* 16 general purpose registers
> 	* 16 level deep hardware call stack

If you have RAM, why not use it?  Model calls like the PPC - put
current $pc in a register and jump.  The caller saves the old $pc in
the regular stack.  GCC is going to want a "normal" frame.  This is
easy to do in hardware, and more flexible than a hardware call stack.

> 	* load/store to a flat "memory" (minimum 1k, up to 16k)
> 	  (addressing in that memory uses a register for the 8 lsb and
>            a special register for the msbs)
> 	  that memory can be loaded along with the code to have known
> 	  content at startup for eg.

GCC is going to want register pairs to work as larger registers.
Like, if you have $r2 and $r3 as 8 bit registers, gcc wants [$r2$r3]
to be usable as a 16 bit register.  Another reason to go with 16 bit
registers ;-)
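
To make the register-pair point concrete, here is a small C model of a 16-bit add carried out on two 8-bit registers. The function names are hypothetical, and which register of the pair holds the high byte is an assumption.

```c
#include <stdint.h>

/* 8-bit add that also reports the carry out, as an 8-bit ALU would. */
static uint8_t add8(uint8_t a, uint8_t b, unsigned carry_in,
                    unsigned *carry_out)
{
    unsigned sum = (unsigned)a + b + (carry_in & 1u);
    *carry_out = sum >> 8;
    return (uint8_t)sum;
}

/* 16-bit add on a register pair: low bytes first, then the high bytes
   plus the carry from the low half (an add-with-carry instruction). */
static void add16_pair(uint8_t *hi, uint8_t *lo, uint8_t b_hi, uint8_t b_lo)
{
    unsigned carry;
    *lo = add8(*lo, b_lo, 0, &carry);
    *hi = add8(*hi, b_hi, carry, &carry);
}
```

Every 16-bit operation, including pointer arithmetic, turns into a sequence like this on an 8-bit datapath.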

GCC won't like having an address split across "special" registers.

But it's OK to limit index registers to evenly numbered ones.

> 	* pipelined with some restriction on instruction scheduling
> 	  (cfr later)

GCC works better if the hardware enforces the locks; it's good at
scheduling pipelines but it doesn't *always* do the right thing; it's
easier if your hardware allows this, if suboptimally.

Of course, I don't know *that* much about the current scheduler.
There may be a way to deal with this cleanly now.

> 	* 2 flags Carry & Zero for testing.

GCC will want 4 (add sign and overflow) to support signed comparisons.
Sign should be easy; overflow is the carry out of bit 6 XORed with the
carry out of bit 7.
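
A sketch of why the extra flags matter, written as a C model of an 8-bit compare. The struct and function names are illustrative. The overflow flag is computed here from the operand and result signs, which is equivalent to XORing the carry into the top bit with the carry out of it.

```c
#include <stdint.h>

struct flags { unsigned z, c, n, v; };

/* Flags produced by the 8-bit subtraction a - b (a compare). */
static struct flags cmp8(uint8_t a, uint8_t b)
{
    struct flags f;
    uint8_t r = (uint8_t)(a - b);
    f.z = (r == 0);
    f.c = (a < b);                           /* borrow out */
    f.n = (r >> 7) & 1u;                     /* sign = msb of the result */
    f.v = (((a ^ b) & (a ^ r)) >> 7) & 1u;   /* signed overflow */
    return f;
}

/* Signed "less than" needs both N and V; unsigned only needs the borrow. */
static int signed_less(struct flags f)   { return f.n != f.v; }
static int unsigned_less(struct flags f) { return f.c != 0; }
```

With only Carry and Zero, the unsigned test works, but a signed compare such as -128 < 1 cannot be decided from the flags alone.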

> I mentioned earlier that there are some scheduling restrictions on the
> instructions due to internal pipelining. For example, the result of a
> fetch from memory may not be used in the instruction directly following
> the fetch. When there is a conditional branch, the instruction just
> following the branch will always be executed, no matter what the result
> is (branch/call/... are not immediate but have a 1 instruction latency
> before they occur). Are these kinds of limitations 'easily' supported by
> gcc ?

Delay slots are common; gcc handles them well.  You might need to add
custom code to enforce the pipeline rules if your pipeline won't
automatically stall.

> I saw several times that gcc works better with a lot of GPRs. I could
> increase them to 32 but then arithmetic instructions would have to use
> the same register for the destination as for src1.

16 is sufficient.


* Re: Design a microcontroller for gcc
  2006-02-14 23:40 ` DJ Delorie
@ 2006-02-15  0:28   ` Sylvain Munaut
  2006-02-15  0:41     ` DJ Delorie
  0 siblings, 1 reply; 26+ messages in thread
From: Sylvain Munaut @ 2006-02-15  0:28 UTC (permalink / raw)
  To: DJ Delorie, gcc

DJ Delorie wrote:
>>	* 8 bit RISC microcontroller
> 
> Not 16?

Well, at first it was to save space in the FPGA (basically, the regs and
ALU take twice the space) and because many 16-bit ops can be done with
8-bit regs, and a lot of the code I write can live with that.

But I had a quick glance at the diagrams I drew, and with the
modifications I would need to support 16-bit pointers with 8-bit regs,
the hardware "tricks" might end up costing me more than the doubled reg
banks and ALU.

Another reason is speed ... I'd like to run that thing at 133 MHz in a
low-cost Spartan-3 ... (2 cycles per instruction, so about 66 MIPS in the
end) and a 16-bit carry chain is ... well ... longer ;) I'll do some timing
tests to see if it's viable.

The final reason is immediates. I can't have 16-bit immediates in my
opcode, there is just not the room ... (opcodes are 18 bits, the width
of a classical RAM block in my FPGA technology). Do you have a 16-bit
microcontroller that fits gcc really well that I could use as a guide
for the instruction set ?

Using 16 bits regs, do I have to support 8 bits operations on them ?



>>	* 16 level deep hardware call stack
> 
> If you have RAM, why not use it?  Model calls like the PPC - put
> current $pc in a register and jump.  The caller saves the old $pc in
> the regular stack.  GCC is going to want a "normal" frame.  This is
> easy to do in hardware, and more flexible than a hardware call stack.

Well, that solution isn't that easy with 8-bit regs, and the 16-deep
stack is the same as a single register in the FPGAs I use. But now, with
16-bit regs, it might simplify this and that may become a better solution.

>>	* load/store to a flat "memory" (minimum 1k, up to 16k)
>>	  (addressing in that memory uses a register for the 8 lsb and
>>           a special register for the msbs)
>>	  that memory can be loaded along with the code to have known
>>	  content at startup for eg.
> 
> GCC is going to want register pairs to work as larger registers.
> Like, if you have $r2 and $r3 as 8 bit registers, gcc wants [$r2$r3]
> to be usable as a 16 bit register.  Another reason to go with 16 bit
> registers ;-)
> 
> GCC won't like having an address split across "special" registers.
> 
> But it's OK to limit index registers to evenly numbered ones.

So if I use 16 bits registers, do I have to handle pairs of them to form
32 bits ?

16 bits regs keeps sounding better and better ... I know HDL better than
gcc so if it can ease the gcc port at the cost of a slightly bigger/more
complex HDL design, I'm willing to make the trade-off.


>>	* 2 flags Carry & Zero for testing.
> 
> GCC will want 4 (add sign and overflow) to support signed comparisons.
> Sign should be easy; overflow is the carry out of bit 6 XORed with the
> carry out of bit 7.

Shouldn't be much of a problem to add that.

>>I mentioned earlier that there are some scheduling restrictions on the
>>instructions due to internal pipelining. For example, the result of a
>>fetch from memory may not be used in the instruction directly following
>>the fetch. When there is a conditional branch, the instruction just
>>following the branch will always be executed, no matter what the result
>>is (branch/call/... are not immediate but have a 1 instruction latency
>>before they occur). Are these kinds of limitations 'easily' supported by
>>gcc ?
> 
> Delay slots are common; gcc handles them well.  You might need to add
> custom code to enforce the pipeline rules if your pipeline won't
> automatically stall.

Yes, making the pipeline stall isn't easy in my case, and detecting those
cases will cost me logic and decrease my timing margin. The hardware
detects when the preceding instruction writes a register read by the
current instruction and forwards the result. But for memory fetches (and
I/O accesses), there is just no way, the result appears too late ...


Sylvain


* Re: Design a microcontroller for gcc
  2006-02-15  0:28   ` Sylvain Munaut
@ 2006-02-15  0:41     ` DJ Delorie
  2006-02-15 19:59       ` Sylvain Munaut
  0 siblings, 1 reply; 26+ messages in thread
From: DJ Delorie @ 2006-02-15  0:41 UTC (permalink / raw)
  To: tnt; +Cc: gcc


> The final reason is immediates. I can't have 16-bit immediates in my
> opcode,

Not a problem; gcc knows how to use two insns to load immediates like
that.  MIPS does it.  Actually, most RISC processors end up doing
that.  What you *would* want, though is a "load sign-extended
immediate" for loading small values in one instruction.
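
A minimal C model of that idea follows. The three instruction-like helpers and the 8-bit immediate width are assumptions for illustration; they are not taken from MIPS or from any existing port.

```c
#include <stdint.h>

/* Hypothetical "load sign-extended immediate": one insn for small values. */
static uint16_t load_simm8(int8_t imm) { return (uint16_t)(int16_t)imm; }

/* Hypothetical two-insn sequence: load the upper byte, then OR in the
   lower byte. */
static uint16_t load_upper(uint8_t imm) { return (uint16_t)((unsigned)imm << 8); }
static uint16_t or_lower(uint16_t reg, uint8_t imm) { return (uint16_t)(reg | imm); }

/* How many instructions a 16-bit constant needs under this scheme. */
static int insns_for_constant(uint16_t value)
{
    if (value <= 0x007Fu || value >= 0xFF80u)
        return 1;                   /* fits the sign-extended form */
    if ((value & 0x00FFu) == 0)
        return 1;                   /* load_upper alone is enough */
    return 2;                       /* load_upper followed by or_lower */
}
```

Small positive and negative constants, which tend to dominate compiled code, then cost a single instruction.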

> Using 16 bits regs, do I have to support 8 bits operations on them ?

At least load/save to memory.  And sign extension.

> > But it's OK to limit index registers to evenly numbered ones.
> 
> So If I use 16 bits registers, do I have to handle pairs of them to form
> 32 bits ?

Well, you don't *have* to if your word size is only 16 bits.  GCC will
still pair them, but you'll need to tell gcc how to split them back up
for the opcodes you have available.

Note that there are some operations that gcc assumes you have 32-bit
opcodes for, though.  Or at least insns that emulate it.

You really want to have native support for sizeof(int) values and
sizeof(void *) values, bigger things can be emulated or broken up.

> Yes, making the pipeline stall isn't easy in my case, and detecting
> those cases will cost me logic and decrease my timing margin. The
> hardware detects when the preceding instruction writes a register
> read by the current instruction and forwards the result. But for
> memory fetches (and I/O accesses), there is just no way, the result
> appears too late ...

Let gcc do it then.  You might end up adding a machine-specific
"reorg" pass that adds a few no-ops here and there to satisfy the
design rules.
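
The effect of such a pass can be sketched outside of GCC as a plain C function over a toy instruction list. The `struct insn` layout and all names below are hypothetical; a real port would walk GCC's own insn chain instead.

```c
#include <stddef.h>

enum kind { OP_ALU, OP_LOAD, OP_NOP };

struct insn {
    enum kind kind;
    int dest;        /* register written, or -1 */
    int src1, src2;  /* registers read, or -1 */
};

/* Copy `in` to `out`, inserting a NOP wherever an instruction reads the
   register written by the load immediately before it.  Returns the new
   length; `out` must have room for 2 * n entries. */
static size_t insert_load_nops(const struct insn *in, size_t n,
                               struct insn *out)
{
    size_t m = 0;
    for (size_t i = 0; i < n; i++) {
        if (m > 0 && out[m - 1].kind == OP_LOAD && out[m - 1].dest >= 0 &&
            (in[i].src1 == out[m - 1].dest || in[i].src2 == out[m - 1].dest)) {
            struct insn nop = { OP_NOP, -1, -1, -1 };
            out[m++] = nop;          /* fill the load delay slot */
        }
        out[m++] = in[i];
    }
    return m;
}
```

A scheduler would first try to move an independent instruction into the slot; the NOP is only the fallback when nothing useful fits.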


* Re: Design a microcontroller for gcc
  2006-02-15  0:41     ` DJ Delorie
@ 2006-02-15 19:59       ` Sylvain Munaut
  2006-02-15 20:06         ` DJ Delorie
  0 siblings, 1 reply; 26+ messages in thread
From: Sylvain Munaut @ 2006-02-15 19:59 UTC (permalink / raw)
  To: DJ Delorie; +Cc: gcc

DJ Delorie wrote:
>>So If I use 16 bits registers, do I have to handle pairs of them to form
>>32 bits ?
> 
> 
> Well, you don't *have* to if your word size is only 16 bits.  GCC will
> still pair them, but you'll need to tell gcc how to split them back up
> for the opcodes you have available.
> 
> Note that there are some operations that gcc assumes you have 32-bit
> opcodes for, though.  Or at least insns that emulate it.

Like what ?


> You really want to have native support for sizeof(int) values and
> sizeof(void *) values, bigger things can be emulated or broken up.

Ok, then so it will be ;)


Anyway, thanks for all the advice, I'll try to come up with a
good instruction set (both in regard to efficient implementation and
efficient running code)



Sylvain


* Re: Design a microcontroller for gcc
  2006-02-15 19:59       ` Sylvain Munaut
@ 2006-02-15 20:06         ` DJ Delorie
  2006-02-15 20:23           ` Paul Brook
  0 siblings, 1 reply; 26+ messages in thread
From: DJ Delorie @ 2006-02-15 20:06 UTC (permalink / raw)
  To: tnt; +Cc: gcc


> > Note that there are some operations that gcc assumes you have 32-bit
> > opcodes for, though.  Or at least insns that emulate it.
> 
> Like what ?

Well, cmpsi2 for example.  and divsi2.


* Re: Design a microcontroller for gcc
  2006-02-15 20:06         ` DJ Delorie
@ 2006-02-15 20:23           ` Paul Brook
  2006-02-15 21:21             ` DJ Delorie
  0 siblings, 1 reply; 26+ messages in thread
From: Paul Brook @ 2006-02-15 20:23 UTC (permalink / raw)
  To: gcc; +Cc: DJ Delorie, tnt

On Wednesday 15 February 2006 20:06, DJ Delorie wrote:
> > > Note that there are some operations that gcc assumes you have 32-bit
> > > opcodes for, though.  Or at least insns that emulate it.
> >
> > Like what ?
>
> Well, cmpsi2 for example.  and divsi2.

You mean divsi3? Many targets don't have div at all.

Paul


* Re: Design a microcontroller for gcc
  2006-02-15 20:23           ` Paul Brook
@ 2006-02-15 21:21             ` DJ Delorie
  0 siblings, 0 replies; 26+ messages in thread
From: DJ Delorie @ 2006-02-15 21:21 UTC (permalink / raw)
  To: paul; +Cc: gcc, tnt


> > Well, cmpsi2 for example.  and divsi2.
> 
> You mean divsi3? Many targets don't have div at all.

Er, right.  divsi3.


* Re: Design a microcontroller for gcc
  2006-02-14 23:22 Design a microcontroller for gcc Sylvain Munaut
  2006-02-14 23:40 ` DJ Delorie
@ 2006-02-15 22:10 ` Hans-Peter Nilsson
  2006-02-15 23:10   ` David Daney
  2006-02-16  0:26   ` Sylvain Munaut
  1 sibling, 2 replies; 26+ messages in thread
From: Hans-Peter Nilsson @ 2006-02-15 22:10 UTC (permalink / raw)
  To: Sylvain Munaut; +Cc: gcc

On Wed, 15 Feb 2006, Sylvain Munaut wrote:
> 	* 2 flags Carry & Zero for testing.

I think most of your questions have been answered, so let me
just add that if nothing else, the port will be much simplified
if you make sure that only specific compare instructions set
condition codes, i.e. not as a nice side-effect of move, add and
sub - or at least make such condition-code side-effects
optional.  It depends on too many undisclosed details like
pipeline restrictions to say whether performance is generally
better or worse, but I can tell for sure that the GCC port will
be simpler with a specific set of condition-code setting insns.

BTW, it depends on the compare (and branch) instructions whether
just two flags are sufficient.

brgds, H-P


* Re: Design a microcontroller for gcc
  2006-02-15 22:10 ` Hans-Peter Nilsson
@ 2006-02-15 23:10   ` David Daney
  2006-02-16  0:26   ` Sylvain Munaut
  1 sibling, 0 replies; 26+ messages in thread
From: David Daney @ 2006-02-15 23:10 UTC (permalink / raw)
  To: Hans-Peter Nilsson; +Cc: Sylvain Munaut, gcc

Hans-Peter Nilsson wrote:
> On Wed, 15 Feb 2006, Sylvain Munaut wrote:
> 
>>	* 2 flags Carry & Zero for testing.

> BTW, it depends on the compare (and branch) instructions whether
> just two flags are sufficient.
> 

That's true.  MIPS for example seems to get by with no flags, although 
it makes multi-word addition/subtraction sequences a bit large.
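
For illustration, a flag-free multi-word add can recover the carry with an unsigned compare, in the style of the MIPS `sltu` instruction. The sketch uses 16-bit words to match the design under discussion; the function name is made up.

```c
#include <stdint.h>

/* 32-bit add built from 16-bit words with no carry flag: the carry out
   of the low word is recovered with an unsigned "set on less than". */
static void add32_no_flags(uint16_t a_hi, uint16_t a_lo,
                           uint16_t b_hi, uint16_t b_lo,
                           uint16_t *r_hi, uint16_t *r_lo)
{
    uint16_t lo = (uint16_t)(a_lo + b_lo);
    uint16_t carry = (uint16_t)(lo < a_lo);   /* 1 if the low add wrapped */
    *r_lo = lo;
    *r_hi = (uint16_t)(a_hi + b_hi + carry);
}
```

That is four operations instead of the two (add, add-with-carry) a carry flag allows, which is the size cost being referred to.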

David Daney


* Re: Design a microcontroller for gcc
  2006-02-15 22:10 ` Hans-Peter Nilsson
  2006-02-15 23:10   ` David Daney
@ 2006-02-16  0:26   ` Sylvain Munaut
  2006-02-16  0:46     ` DJ Delorie
  2006-02-16  2:49     ` Hans-Peter Nilsson
  1 sibling, 2 replies; 26+ messages in thread
From: Sylvain Munaut @ 2006-02-16  0:26 UTC (permalink / raw)
  To: Hans-Peter Nilsson, gcc

Hans-Peter Nilsson wrote:
> On Wed, 15 Feb 2006, Sylvain Munaut wrote:
> 
>>	* 2 flags Carry & Zero for testing.
> 
> 
> I think most of your questions have been answered, so let me
> just add that if nothing else, the port will be much simplified
> if you make sure that only specific compare instructions set
> condition codes, i.e. not as a nice side-effect of move, add and
> sub - or at least make such condition-code side-effects
> optional.  It depends on too many undisclosed details like
> pipeline restrictions to say whether performance is generally
> better or worse, but I can tell for sure that the GCC port will
> be simpler with a specific set of condition-code setting insns.

Making it optional is neither hard nor expensive in hardware; the problem
is that my opcodes need to be 18 bits and I won't have space to stuff
in another option bit ...

What I was thinking for the moment was to have :
 - sign is always the msb of the last ALU output
 - add/sub to modify all flags
 - move/xor/and/not/or only affect zero (and sign)
 - shift operations always affect carry and zero
 - Have some specific instructions like compare and test, but these
   would only operate on registers (and not on immediates)

What's so bad about having the flags as side-effects ?

Here it's a simple MCU, it doesn't have a very long pipeline and that
pipeline is 'almost' invisible to the end-user, except for memory
fetch and I/O access ...


> BTW, it depends on the compare (and branch) instructions whether
> just two flags are sufficient.

Adding Sign and overflow is pretty easy. And the compare
instruction/logic path shouldn't be a problem either.

MIPS has no flag ??? how does branching work ?



Finally, about immediates, I'm thinking that instructions like add
could have 4 different forms :
add rD, rA, rB
add rA, rA, imm
add rA, rA, imm<<8
add rA, rA, signextend(imm)

Is that kind of manipulation on the immediate well understood by gcc
internals ?
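
What the compiler would need to decide for each constant can be sketched as a small classifier. This assumes an 8-bit `imm` field and 16-bit registers; the enum and function names are hypothetical. In a GCC port a test of this kind would normally live in an operand constraint or predicate.

```c
#include <stdint.h>

enum imm_form { IMM_NONE, IMM_LOW, IMM_HIGH, IMM_SEXT };

/* Which single-instruction immediate form (if any) can encode `value`? */
static enum imm_form classify_imm(uint16_t value)
{
    if (value <= 0x00FFu)
        return IMM_LOW;              /* add rA, rA, imm              */
    if ((value & 0x00FFu) == 0)
        return IMM_HIGH;             /* add rA, rA, imm<<8           */
    if (value >= 0xFF80u)
        return IMM_SEXT;             /* add rA, rA, signextend(imm)  */
    return IMM_NONE;                 /* needs more than one insn     */
}
```

A constant that matches none of the forms, such as 0x1234, would be split into two adds or loaded into a register first.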

Or maybe just allow immediates in the mov but that seems like a big
penalty ...


Sylvain


* Re: Design a microcontroller for gcc
  2006-02-16  0:26   ` Sylvain Munaut
@ 2006-02-16  0:46     ` DJ Delorie
  2006-02-16  2:49     ` Hans-Peter Nilsson
  1 sibling, 0 replies; 26+ messages in thread
From: DJ Delorie @ 2006-02-16  0:46 UTC (permalink / raw)
  To: tnt; +Cc: gcc


> What's so bad about having the flags as side-effects ?

You can't put any other insn between the compare and the jump.  Like,
if you wanted to move an address into a register to do the jump, you'd
lose the condition bits.

The advantage of having most insns set flags, is you can sometimes
avoid the compare completely.

> MIPS has no flag ??? how does branching work ?

Some chips combine the compare and jump into one insn,
like "jeq $r0,4,label".

> Or maybe just allow immediates in the mov but that seems like a big
> penalty ...

Most risc chips have more move insns than other opcodes.  So, you'd
have two adds (register and sign- or zero-extended immediate), and a
variety of moves (lower, upper, extended, etc).

You have to think about what kind of constants are going to be common
in your software, and plan accordingly.


* Re: Design a microcontroller for gcc
  2006-02-16  0:26   ` Sylvain Munaut
  2006-02-16  0:46     ` DJ Delorie
@ 2006-02-16  2:49     ` Hans-Peter Nilsson
  2006-02-16  3:21       ` DJ Delorie
  1 sibling, 1 reply; 26+ messages in thread
From: Hans-Peter Nilsson @ 2006-02-16  2:49 UTC (permalink / raw)
  To: Sylvain Munaut; +Cc: gcc

On Thu, 16 Feb 2006, Sylvain Munaut wrote:
> What I was thinking for the moment was to have :
>  - sign is always the msb of the last ALU output
>  - add/sub to modify all flags
>  - move/xor/and/not/or only affect zero (and sign)
>  - shift operations always affect carry and zero
>  - Have some specific instructions like compare and test, but these
>    would only operate on registers (and not on immediates)

No, really.  Just use compare insns.  (And perhaps some way for
carry propagation for multi-word add/sub, if that mechanism
interferes.  BTW, carry-out from shifts is very rarely used in
compiled code.)

> What's so bad about having the flags as side-effects ?

Besides what DJ said about performance (both pros and cons
there), the problem is as I said with port complexity, because
of the way you have to handle condition codes in gcc.
(_Should_ now, _have_to_ in the future -- or actually now, as
you say you need scheduling.)  Flag setting really should be
explicit, so with your way, you have to show that add and sub
etc. also set condition codes.  And that's where you notice the
complexity in the port, because (partly because of gcc
peculiarities) unless you want to lose performance-wise, you need
to show that most of the time, the flag register result is just
clobbered by those operations and not used.  Anyway, at least
keep a way to add reg+reg and reg+integer, load and store of
memory and load of integer and address without condition code
effects and your port has a chance to avoid the related bloat.

Sorry, I won't spend the time to spell out the details.
Whatever: if you're determined on your way to do it and won't
take advice you asked for, by all means feel free.  You _have_
been warned, though.

brgds, H-P


* Re: Design a microcontroller for gcc
  2006-02-16  2:49     ` Hans-Peter Nilsson
@ 2006-02-16  3:21       ` DJ Delorie
  2006-02-16  3:34         ` Hans-Peter Nilsson
  2006-02-16 11:42         ` Hans-Peter Nilsson
  0 siblings, 2 replies; 26+ messages in thread
From: DJ Delorie @ 2006-02-16  3:21 UTC (permalink / raw)
  To: hp; +Cc: tnt, gcc


> BTW, carry-out from shifts is very rarely used in compiled code.)

Unless you've expanded SI shifts into a pair of HI shifts.
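
A C model of that case, with a 32-bit (SI) value held in two 16-bit (HI) halves. The function name is illustrative.

```c
#include <stdint.h>

/* Shift a 32-bit value, held as two 16-bit halves, left by one bit. */
static void shl32_by_halves(uint16_t *hi, uint16_t *lo)
{
    unsigned carry = (*lo >> 15) & 1u;              /* bit leaving the low half */
    *lo = (uint16_t)((unsigned)*lo << 1);
    *hi = (uint16_t)(((unsigned)*hi << 1) | carry); /* shifted in via carry */
}
```

On hardware this is a shift of the low half followed by a rotate-through-carry of the high half, so the carry out of the first shift is live between the two instructions.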

> Besides what DJ said about performance (both pros and cons
> there), the problem is as I said with port complexity, because
> of the way you have to handle condition codes in gcc.

Unless you tell gcc that the condition codes are a hard register?
That's what m32c does; it has separate cmp/jmp and most opcodes set
flags, so I just set an attribute that says which flags are set by
each insn.  Then, I can add a reorg pass to delete the cmps if the
previous insn that set the flags happened to set them right.
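
The compare-deletion idea can be modelled on a toy instruction list as below. This is not the m32c code; the `struct insn` layout, the `sets_flags` attribute, and the restriction to compares against zero are all simplifying assumptions.

```c
#include <stddef.h>

enum op { OP_ADD, OP_MOVE, OP_CMP_ZERO, OP_BRANCH };

struct insn {
    enum op op;
    int reg;         /* register written (ADD, MOVE) or tested (CMP_ZERO);
                        -1 if none */
    int sets_flags;  /* per-insn attribute: nonzero if the zero and sign
                        flags describe the value left in `reg` */
};

/* Drop "compare reg with zero" when the insn just before it already left
   the flags describing that same register.  Compacts the array in place
   and returns the new length. */
static size_t delete_redundant_cmps(struct insn *code, size_t n)
{
    size_t m = 0;
    for (size_t i = 0; i < n; i++) {
        if (code[i].op == OP_CMP_ZERO && m > 0 &&
            code[m - 1].op != OP_CMP_ZERO && code[m - 1].sets_flags &&
            code[m - 1].reg == code[i].reg)
            continue;                /* flags are already right */
        code[m++] = code[i];
    }
    return m;
}
```

Only the zero and sign flags can safely be reused this way; the carry and overflow left by an add do not match those of a compare against zero, so branches that depend on them still need the compare.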

> Anyway, at least keep a way to add reg+reg and reg+integer, load and
> store of memory and load of integer and address without condition
> code effects and your port has a chance to avoid the related bloat.

At least, move/load/store shouldn't touch flags.


* Re: Design a microcontroller for gcc
  2006-02-16  3:21       ` DJ Delorie
@ 2006-02-16  3:34         ` Hans-Peter Nilsson
  2006-02-16  3:37           ` Hans-Peter Nilsson
  2006-02-16  3:40           ` DJ Delorie
  2006-02-16 11:42         ` Hans-Peter Nilsson
  1 sibling, 2 replies; 26+ messages in thread
From: Hans-Peter Nilsson @ 2006-02-16  3:34 UTC (permalink / raw)
  To: DJ Delorie; +Cc: tnt, gcc

On Wed, 15 Feb 2006, DJ Delorie wrote:
> > BTW, carry-out from shifts is very rarely used in compiled code.)
> Unless you've expanded SI shifts into a pair of HI shifts.
>
> > Besides what DJ said about performance (both pros and cons
> > there), the problem is as I said with port complexity, because
> > of the way you have to handle condition codes in gcc.
>
> Unless you tell gcc that the condition codes are a hard register?

No "unless" here.  You either have a clobber or a set in a
parallel with the main feature, and you lose out on all the
single_set-directed optimizations if you put in a "set" early.

> That's what m32c does; it has separate cmp/jmp and most opcodes set
> flags, so I just set an attribute that says which flags are set by
> each insn.  Then, I can add a reorg pass to delete the cmps if the
> previous insn that set the flags happened to set them right.

A machine dependent reorg pass isn't something I'd recommend
given that there are other possibilities.  FWIW, I use
peephole2s and condition code modes in CRIS w.i.p.  Works ok,
except for all the things that don't like insns with parallels
that I have to weed out to get performance on par with the cc0
representation.

brgds, H-P


* Re: Design a microcontroller for gcc
  2006-02-16  3:34         ` Hans-Peter Nilsson
@ 2006-02-16  3:37           ` Hans-Peter Nilsson
  2006-02-16  3:44             ` DJ Delorie
  2006-02-16  3:40           ` DJ Delorie
  1 sibling, 1 reply; 26+ messages in thread
From: Hans-Peter Nilsson @ 2006-02-16  3:37 UTC (permalink / raw)
  To: DJ Delorie; +Cc: tnt, gcc

On Wed, 15 Feb 2006, Hans-Peter Nilsson wrote:

> FWIW, I use
> peephole2s and condition code modes in CRIS w.i.p.

...and cbranch (cc setter + user in one combined insn)  which
are split after reload.

brgds, H-P


* Re: Design a microcontroller for gcc
  2006-02-16  3:34         ` Hans-Peter Nilsson
  2006-02-16  3:37           ` Hans-Peter Nilsson
@ 2006-02-16  3:40           ` DJ Delorie
  2006-02-16  3:56             ` Hans-Peter Nilsson
  1 sibling, 1 reply; 26+ messages in thread
From: DJ Delorie @ 2006-02-16  3:40 UTC (permalink / raw)
  To: hp; +Cc: tnt, gcc


> No "unless" here.  You either have a clobber or a set in a parallel
> with the main feature, and you lose out on all the
> single_set-directed optimizations if you put in a "set" early.

"Oh, crap"

I hope I can stick with my cmp/jmp model and manage them myself still,
though, because there's a LOT of patterns in m32c where the set of
flags affected depends on which alternative you select, and most
patterns affect the flags in some (usually nonorthogonal) way.

Or is gcc going to start putting things between the cmp and jmp?


* Re: Design a microcontroller for gcc
  2006-02-16  3:37           ` Hans-Peter Nilsson
@ 2006-02-16  3:44             ` DJ Delorie
  0 siblings, 0 replies; 26+ messages in thread
From: DJ Delorie @ 2006-02-16  3:44 UTC (permalink / raw)
  To: hp; +Cc: tnt, gcc


> ...and cbranch (cc setter + user in one combined insn) which are
> split after reload.

I have the cbranch and split, but allow it before reload.  So far that
hasn't been a problem, although I split it only to delete the cmp if I
can.


* Re: Design a microcontroller for gcc
  2006-02-16  3:40           ` DJ Delorie
@ 2006-02-16  3:56             ` Hans-Peter Nilsson
  2006-02-16  4:10               ` Hans-Peter Nilsson
  2006-02-16  4:13               ` DJ Delorie
  0 siblings, 2 replies; 26+ messages in thread
From: Hans-Peter Nilsson @ 2006-02-16  3:56 UTC (permalink / raw)
  To: DJ Delorie; +Cc: tnt, gcc

On Wed, 15 Feb 2006, DJ Delorie wrote:
> I hope I can stick with my cmp/jmp model and manage them myself still,
> though, because there's a LOT of patterns in m32c where the set of
> flags affected depends on which alternative you select, and most
> patterns affect the flags in some (usually nonorthogonal) way.

Unless I'm delirious (it's way past bedtime) I see a m32c port
and it's cc0-free.  Is there a problem?

> Or is gcc going to start putting things between the cmp and jmp?

Yes.  At least reload wants to do that.  The choice a port has
is to either have cc-free reload insns (like i386) or keep the
cc setter and user combined at least until after reload
(cbranch, but you don't have to use the cbranchM4 name; you can
do the combination to a cbranch-type insn in the CC user).
Not my idea, so it's probably sane. :-)

brgds, H-P
PS. There may be other choices, but none that caught my attention.


* Re: Design a microcontroller for gcc
  2006-02-16  3:56             ` Hans-Peter Nilsson
@ 2006-02-16  4:10               ` Hans-Peter Nilsson
  2006-02-16  4:17                 ` DJ Delorie
  2006-02-16  4:13               ` DJ Delorie
  1 sibling, 1 reply; 26+ messages in thread
From: Hans-Peter Nilsson @ 2006-02-16  4:10 UTC (permalink / raw)
  To: DJ Delorie; +Cc: tnt, gcc

On Wed, 15 Feb 2006, Hans-Peter Nilsson wrote:
> Unless I'm delirious (it's way past bedtime) I see a m32c port
> and it's cc0-free.  Is there a problem?

I see, in the code in svn trunk the compares aren't optimized
away yet.  You must be having a lot of fun right now. ;-)

brgds, H-P


* Re: Design a microcontroller for gcc
  2006-02-16  3:56             ` Hans-Peter Nilsson
  2006-02-16  4:10               ` Hans-Peter Nilsson
@ 2006-02-16  4:13               ` DJ Delorie
  1 sibling, 0 replies; 26+ messages in thread
From: DJ Delorie @ 2006-02-16  4:13 UTC (permalink / raw)
  To: hp; +Cc: tnt, gcc


> Unless I'm delirious (it's way past bedtime) I see a m32c port
> and it's cc0-free.  Is there a problem?

m32c has a separate $flg register defined, not a cc0 port.
Hence, this pattern:

(define_insn_and_split "cbranch<mode>4"
  [(set (pc) (if_then_else
	      (match_operator 0 "m32c_cmp_operator"
			      [(match_operand:QHPSI 1 "mra_operand" "RraSd")
			       (match_operand:QHPSI 2 "mrai_operand" "iRraSd")])
              (label_ref (match_operand 3 "" ""))
	      (pc)))]
  ""
  "#"
  ""
  [(set (reg:CC FLG_REGNO)
	(compare (match_dup 1)
		 (match_dup 2)))
   (set (pc) (if_then_else (match_dup 4)
			   (label_ref (match_dup 3))
			   (pc)))]
  "operands[4] = m32c_cmp_flg_0 (operands[0]);"
  )


* Re: Design a microcontroller for gcc
  2006-02-16  4:10               ` Hans-Peter Nilsson
@ 2006-02-16  4:17                 ` DJ Delorie
  0 siblings, 0 replies; 26+ messages in thread
From: DJ Delorie @ 2006-02-16  4:17 UTC (permalink / raw)
  To: hp; +Cc: tnt, gcc


> I see, in the code in svn trunk the compares aren't optimized away
> yet.  You must be having a lot of fun right now. ;-)

*That* is an understatement.  Unfortunately, reload hates me (see
archives for that thread) so I can't commit anything yet.


* Re: Design a microcontroller for gcc
  2006-02-16  3:21       ` DJ Delorie
  2006-02-16  3:34         ` Hans-Peter Nilsson
@ 2006-02-16 11:42         ` Hans-Peter Nilsson
  2006-02-16 20:49           ` Sylvain Munaut
  1 sibling, 1 reply; 26+ messages in thread
From: Hans-Peter Nilsson @ 2006-02-16 11:42 UTC (permalink / raw)
  To: DJ Delorie; +Cc: tnt, gcc

On Wed, 15 Feb 2006, DJ Delorie wrote:
  I wrote:
> > Anyway, at least keep a way to add reg+reg and reg+integer, load and
> > store of memory and load of integer and address without condition
> > code effects and your port has a chance to avoid the related bloat.
>
> At least, move/load/store shouldn't touch flags.

I may have listed more operations than necessary, but the
important bit is that you need to form arbitrary addresses in
the stack frame without touching flags.  If for any const_int N,
(plus reg N) is a valid address for moves to and from memory
that doesn't touch flags, then I suppose you don't *need* an
"add" that doesn't touch flags.

brgds, H-P


* Re: Design a microcontroller for gcc
  2006-02-16 11:42         ` Hans-Peter Nilsson
@ 2006-02-16 20:49           ` Sylvain Munaut
  2006-02-17  0:28             ` Hans-Peter Nilsson
  0 siblings, 1 reply; 26+ messages in thread
From: Sylvain Munaut @ 2006-02-16 20:49 UTC (permalink / raw)
  To: Hans-Peter Nilsson; +Cc: DJ Delorie, gcc

Hans-Peter Nilsson wrote:
> On Wed, 15 Feb 2006, DJ Delorie wrote:
>   I wrote:
> 
>>>Anyway, at least keep a way to add reg+reg and reg+integer, load and
>>>store of memory and load of integer and address without condition
>>>code effects and your port has a chance to avoid the related bloat.
>>
>>At least, move/load/store shouldn't touch flags.
> 
> 
> I may have listed more operations than necessary, but the
> important bit is that you need to form arbitrary addresses in
> the stack frame without touching flags.  If for any const_int N,
> (plus reg N) is a valid address for moves to and from memory
> that doesn't touch flags, then I suppose you don't *need* an
> "add" that doesn't touch flags.

Move/load/store without touching flags is no problem. But add needs the
carry to allow multiword add, and I can't make it optional.

But as you said, I could make the load/store take 3 args, either
load rD, rB(rA)
or
load rD, imm4(rA)

with imm4 being between -16 and 15.


Another thing for memory: I can't do 8-bit accesses. The memory is 16
bits wide and I can't change that, so 8-bit accesses will have to be done
in sw.
Also, I could make the address either a word address or a byte address
(but then I would just drop the LSB since I don't support unaligned
accesses ... and the immediate in load/store would be any even value
between -32 and 30).



DJ Delorie wrote:
> You have to think about what kind of constants are going to be common
> in your software, and plan accordingly.

I can see several types of immediates:
1) Completely arbitrary constants like filter coefficients, stuff like that.
2) Small positive/negative integers: e.g. to increment or to walk through an array
3) Single bits or grouped bits anywhere in the word (to set/test bits)
4) 2^N - 1: to do modulo / masking.

For class 1 there is not much to do: those will have to be loaded with
several operations.
To handle classes 2/3/4 in the operations that take an immediate (other
than mov), I was thinking of allowing a 4-bit immediate that could be
placed in any nibble; the nibbles to its left could be filled with all
1s or all 0s, and the nibbles to its right could also be filled with all
1s or all 0s (independently).

So, for example, 0x003f would be possible (3 in the second nibble, 0-filled
on the left and 1-filled on the right). 0xfff1 would be too, but not 0x0370.
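
The rule described above can be sketched in C as a simple encodability
check. This is only an illustration, assuming a 16-bit word made of four
nibbles; the names imm_encodable and nibbles_are are hypothetical and are
not part of gcc or of any existing port.

```c
#include <stdbool.h>
#include <stdint.h>

/* True if every nibble of v at positions [lo, hi) equals fill.
   An empty range is trivially true. */
static bool nibbles_are(uint16_t v, int lo, int hi, unsigned fill)
{
    for (int i = lo; i < hi; i++)
        if (((v >> (4 * i)) & 0xfu) != fill)
            return false;
    return true;
}

/* Hypothetical check: can v be written as one free 4-bit value in some
   nibble, with the nibbles to its left all 0x0 or all 0xf, and the
   nibbles to its right all 0x0 or all 0xf (chosen independently)? */
bool imm_encodable(uint16_t v)
{
    for (int pos = 0; pos < 4; pos++) {
        bool left_ok  = nibbles_are(v, pos + 1, 4, 0x0u)
                     || nibbles_are(v, pos + 1, 4, 0xfu);
        bool right_ok = nibbles_are(v, 0, pos, 0x0u)
                     || nibbles_are(v, 0, pos, 0xfu);
        if (left_ok && right_ok)
            return true;
    }
    return false;
}
```

With this sketch, imm_encodable(0x003f) and imm_encodable(0xfff1) hold,
while imm_encodable(0x0370) does not, matching the examples above.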

Now the problem is how to describe properly to gcc what can be taken as an
immediate and what can't ...



Anyway, thanks for your advice! I may not be able to follow all of
it because I also have hw constraints, but I do appreciate it. It may
be some time before I actually start the work on gcc (first I
have to actually do the hw and port binutils ;)



	Sylvain


* Re: Design a microcontroller for gcc
  2006-02-16 20:49           ` Sylvain Munaut
@ 2006-02-17  0:28             ` Hans-Peter Nilsson
  2006-02-20  8:54               ` Sylvain Munaut
  0 siblings, 1 reply; 26+ messages in thread
From: Hans-Peter Nilsson @ 2006-02-17  0:28 UTC (permalink / raw)
  To: Sylvain Munaut; +Cc: DJ Delorie, gcc

On Thu, 16 Feb 2006, Sylvain Munaut wrote:
> Move/Load/Store without flag is no problem. But for add, to allow
> multiword add, carry is needed and I can't make it optional.

As I hinted, perhaps you can have the multiword carry a separate
one from the flags carry, perhaps moved over with a separate
instruction?

Perhaps have a "load" variant that doesn't load; a "lea"?
Perhaps it only does that when run just after a prefix
instruction (that has another meaning before some other
instruction)?  (Look, there's your separate add and move
instruction in one! :-)

If it comes to that, I will go as far as suggesting that flags
handling is more important than multiword add support.  Really.
(The latter will happen less frequently and can be performed
with a few more instructions.)
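
As a rough illustration of what "a few more instructions" could mean,
here is a C sketch of a 32-bit add built from 16-bit adds without any
carry flag: the carry out of the low word is recovered with an unsigned
compare. The 16-bit word size and the name add32_via_16 are assumptions
made for the example only.

```c
#include <stdint.h>

/* Add two 32-bit values given as 16-bit halves, using no carry flag. */
uint32_t add32_via_16(uint16_t a_lo, uint16_t a_hi,
                      uint16_t b_lo, uint16_t b_hi)
{
    /* Low word add; the result wraps modulo 2^16. */
    uint16_t lo = (uint16_t)(a_lo + b_lo);

    /* If the wrapped sum is smaller than an operand, a carry occurred. */
    uint16_t carry = (uint16_t)(lo < a_lo);

    /* High word add, with the recovered carry added in. */
    uint16_t hi = (uint16_t)(a_hi + b_hi + carry);

    return ((uint32_t)hi << 16) | lo;
}
```

On a machine with compare-and-branch (or a set-on-less-than style
operation) this costs one compare plus one extra add per word.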

> But as you said, I could make the load/store take 3 args, either
> load rD, rB(rA)
> or
> load rD, imm4(rA)
>
> with imm4 being between -16 and 15.

That's not enough to cover a full stack frame, unfortunately.
I suggest you find out a way to load an arbitrary integer into a
register without touching flags (there's no point in having that
touch flags) and then a way to add two registers without
touching flags.  Maybe it's sufficient with the first one, but
I'm not willing to bet on it.

> Another thing for memory: I can't do 8-bit accesses. The memory is 16
> bits wide and I can't change that, so 8-bit accesses will have to be done
> in sw.

That's ok.  8-bit accesses are desirable, but not a must and not
as important as anything else I can think of.
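
For illustration, 8-bit accesses done in software on a 16-bit-wide
memory might look like the C sketch below. It assumes byte addresses and
a little-endian layout (even byte address = low half of the word); the
thread does not specify either, and load_byte/store_byte are
hypothetical names.

```c
#include <stdint.h>

/* Read one byte from a memory that only supports 16-bit word accesses.
   ASSUMPTION: little-endian, so an even byte address is the low half. */
uint8_t load_byte(const uint16_t *mem, uint16_t byte_addr)
{
    uint16_t word = mem[byte_addr >> 1];          /* drop the LSB */
    return (byte_addr & 1) ? (uint8_t)(word >> 8)
                           : (uint8_t)(word & 0xffu);
}

/* Write one byte with a read-modify-write of the containing word. */
void store_byte(uint16_t *mem, uint16_t byte_addr, uint8_t value)
{
    uint16_t *w = &mem[byte_addr >> 1];
    if (byte_addr & 1)
        *w = (uint16_t)((*w & 0x00ffu) | ((uint16_t)value << 8));
    else
        *w = (uint16_t)((*w & 0xff00u) | value);
}
```

The store is a read-modify-write, so it takes one load, some masking and
one store where a byte-capable memory would need a single store.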

> Also, I could make the address either a word address or a byte address
> (but then I would just drop the LSB since I don't support unaligned
> accesses ... and the immediate in load/store would be any even value
> between -32 and 30).

Stick with byte addresses.  Really, really really.  Word
addresses used to be somehow supported, but there are many bugs
and no other working port does it.  Having the imm4 be bits 5..1
and bit 0 constant 0 is certainly the right thing to do for
16-bit-wide accesses.
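
A small C sketch of that offset encoding, under stated assumptions: the
instruction field (called imm4 in this thread) is taken to be a 5-bit
two's-complement value, since a range of -16..15 needs 5 bits, and it
supplies bits 5..1 of the byte offset with bit 0 fixed at 0. All
function names are hypothetical.

```c
#include <stdbool.h>

/* A byte offset fits if it is even and lies in -32..30. */
bool mem_offset_encodable(int byte_offset)
{
    return byte_offset % 2 == 0 && byte_offset >= -32 && byte_offset <= 30;
}

/* Pack an encodable byte offset into the 5-bit field (bits 5..1). */
unsigned encode_mem_offset(int byte_offset)
{
    return (unsigned)(byte_offset / 2) & 0x1fu;
}

/* Recover the byte offset: sign-extend 5 bits, then shift left by one. */
int decode_mem_offset(unsigned field)
{
    int v = (int)(field & 0x1fu);
    if (v & 0x10)
        v -= 32;
    return v * 2;
}
```

For example, a byte offset of -32 encodes as 0x10 and 30 encodes as
0x0f; odd offsets are rejected because unaligned accesses are not
supported.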

> Now the problem is to well describe to gcc what can be taken as an
> immediate and what can't ...

That's really not a problem, it's quite simple.

brgds, H-P


* Re: Design a microcontroller for gcc
  2006-02-17  0:28             ` Hans-Peter Nilsson
@ 2006-02-20  8:54               ` Sylvain Munaut
  0 siblings, 0 replies; 26+ messages in thread
From: Sylvain Munaut @ 2006-02-20  8:54 UTC (permalink / raw)
  To: Hans-Peter Nilsson, dj, gcc

Hans-Peter Nilsson wrote:
> On Thu, 16 Feb 2006, Sylvain Munaut wrote:
> 
>>Move/Load/Store without flag is no problem. But for add, to allow
>>multiword add, carry is needed and I can't make it optional.
> 
> 
> As I hinted, perhaps you can have the multiword carry a separate
> one from the flags carry, perhaps moved over with a separate
> instruction?

Yes, two sets of flags looks good. One for arith operations (shift through
carry, multiword add, ...) and the other only for compare operations.

With either a bit to select which set of flags to use during conditional
jumps or a way to copy one set into the other (I'll see which is
easier to implement, but I'd like the first possibility better).
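
To make the idea concrete, here is a toy C model of the first
possibility (a select bit on the conditional jump). It is a sketch only:
the struct layout, the function names, and the choice that a compare
sets carry on unsigned borrow are all assumptions, not a description of
the actual hardware.

```c
#include <stdbool.h>
#include <stdint.h>

struct flags { bool carry, zero; };

/* Two independent flag sets: one for arithmetic, one for compares. */
struct cpu { struct flags arith, cmp; };

/* Add updates only the arithmetic flags. */
uint16_t cpu_add(struct cpu *c, uint16_t a, uint16_t b)
{
    uint32_t wide = (uint32_t)a + b;
    c->arith.carry = ((wide >> 16) & 1u) != 0;
    c->arith.zero  = (uint16_t)wide == 0;
    return (uint16_t)wide;
}

/* Compare updates only the compare flags.
   ASSUMPTION: carry means unsigned borrow (a < b). */
void cpu_cmp(struct cpu *c, uint16_t a, uint16_t b)
{
    c->cmp.carry = a < b;
    c->cmp.zero  = a == b;
}

/* Branch-if-zero with a select bit choosing which flag set to test. */
bool branch_if_zero(const struct cpu *c, bool use_arith)
{
    return use_arith ? c->arith.zero : c->cmp.zero;
}
```

In this model an add placed between a compare and its branch (for
example, address arithmetic) leaves the compare result intact.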


>>Also, I could make the address either a word address or a byte address
>>(but then I would just drop the LSB since I don't support unaligned
>>accesses ... and the immediate in load/store would be any even value
>>between -32 and 30).
>
> Stick with byte addresses.  Really, really really.  Word
> addresses used to be somehow supported, but there are many bugs
> and no other working port does it.  Having the imm4 be bits 5..1
> and bit 0 constant 0 is certainly the right thing to do for
> 16-bit-wide accesses.

Ok, so be it then.


	Sylvain


Thread overview: 26+ messages
2006-02-14 23:22 Design a microcontroller for gcc Sylvain Munaut
2006-02-14 23:40 ` DJ Delorie
2006-02-15  0:28   ` Sylvain Munaut
2006-02-15  0:41     ` DJ Delorie
2006-02-15 19:59       ` Sylvain Munaut
2006-02-15 20:06         ` DJ Delorie
2006-02-15 20:23           ` Paul Brook
2006-02-15 21:21             ` DJ Delorie
2006-02-15 22:10 ` Hans-Peter Nilsson
2006-02-15 23:10   ` David Daney
2006-02-16  0:26   ` Sylvain Munaut
2006-02-16  0:46     ` DJ Delorie
2006-02-16  2:49     ` Hans-Peter Nilsson
2006-02-16  3:21       ` DJ Delorie
2006-02-16  3:34         ` Hans-Peter Nilsson
2006-02-16  3:37           ` Hans-Peter Nilsson
2006-02-16  3:44             ` DJ Delorie
2006-02-16  3:40           ` DJ Delorie
2006-02-16  3:56             ` Hans-Peter Nilsson
2006-02-16  4:10               ` Hans-Peter Nilsson
2006-02-16  4:17                 ` DJ Delorie
2006-02-16  4:13               ` DJ Delorie
2006-02-16 11:42         ` Hans-Peter Nilsson
2006-02-16 20:49           ` Sylvain Munaut
2006-02-17  0:28             ` Hans-Peter Nilsson
2006-02-20  8:54               ` Sylvain Munaut
