public inbox for gcc@gcc.gnu.org
From: Steven Roos <S.Roos@its.tudelft.nl>
To: Marko Mlinar <Marko.Mlinar@campus.fri.uni-lj.si>
Cc: Damjan Lampret <lampret@opencores.org>, gcc@gcc.gnu.org
Subject: Re: Need advice on interface to non-gcc scheduler
Date: Mon, 03 Apr 2000 05:59:00 -0000
Message-ID: <38E8957E.3CD72D4D@its.tudelft.nl>
In-Reply-To: <001001bf9bc3$13624ac0$c34902c1@Javor>

Marko Mlinar wrote:
> Steven,
> 
> I am Marko Mlinar, co-developer of OR2k, mentioned before.
> Let me say that I am surprised too that our designs have so many
> similarities. Besides, I don't think the three points you mentioned
> are really differences, except the second one.
> You even used 4 as the number of neighbours, like we did.

Hi Marko,

The number 4 (2 left, 2 right) was only an example. Real
implementations may use any convenient number. We think that 4 may
be rather small, and that 6 or 8 may give a better balance between
hardware cost and communication latency (at least for a 1D
topology).
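
To make the trade-off concrete, here is a minimal sketch. It assumes
the units form a linear 1D array with no wraparound and that a value
moves at most k/2 positions per transfer; `neighbors`, `hops`,
`n_units` and `k` are hypothetical names for illustration, not taken
from the paper.

```python
def neighbors(unit, n_units, k=4):
    """Return the units directly reachable from `unit`: k/2 on each side.

    Assumption: a linear array with no wraparound, so units near the
    ends of the array have fewer neighbors.
    """
    if k % 2 != 0:
        raise ValueError("k must be even (k/2 left, k/2 right)")
    half = k // 2
    left = range(max(0, unit - half), unit)
    right = range(unit + 1, min(n_units, unit + half + 1))
    return list(left) + list(right)


def hops(src, dst, k=4):
    """Minimum number of neighbor-to-neighbor transfers from src to dst."""
    half = k // 2
    distance = abs(src - dst)
    return -(-distance // half)  # ceiling division
```

Under these assumptions, moving a value from unit 0 to unit 12 takes
hops(0, 12, k=4) = 6 transfers but only hops(0, 12, k=8) = 3, at the
price of twice as many connections per unit.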

> And the optimization algorithm is really similar to the one we used
> (even the heuristics proposed in the article; well, later we used
> more complex ones that produced better results).
> Have you made any simulations? We have already made several.

We changed the scheduling algorithm recently. The algorithm described
in the paper has some serious problems when choosing a local register
file for long-living variables. It has to choose a file long before it 
has any idea of where the variable is likely going to be used.

Our new algorithm tries to postpone that decision until it has to
schedule a use, so it can better decide where to move the variable to.
It is also better at deciding when to split a variable, for example
when it has to be used in remote parts of the processor.
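
The idea of deferring the choice can be sketched as follows. This is
a toy illustration only, not the actual scheduler: the class
`LazyPlacement`, the parameter `reach` (how far away a unit can read
an operand from) and the splitting rule are all assumptions made for
the example.

```python
class LazyPlacement:
    """Toy model: pick a register file for a variable only when a use
    is scheduled, and split the variable when a use is out of reach."""

    def __init__(self, reach=2):
        self.reach = reach  # assumed max distance for a direct operand read
        self.copies = {}    # variable name -> units whose file holds a copy

    def schedule_use(self, var, unit):
        """Return the unit whose register file supplies `var` to `unit`."""
        homes = self.copies.setdefault(var, [])
        for home in homes:
            if abs(home - unit) <= self.reach:
                return home  # an existing copy is close enough
        # First use, or a use in a remote part of the processor:
        # place (or split off) a copy at the using unit.
        homes.append(unit)
        return unit
```

With reach=2, uses of "x" on units 10, 11 and 2 end up with copies on
units 10 and 2: the use on unit 11 reads the copy on unit 10, while
the remote use on unit 2 triggers a split.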

> I wonder - do you intend to use such an architecture for general
> purpose computing or just for applications that need high computation
> power? How do you intend to apply jumps/calls to your architecture?

We think that the architecture is best suited to embedded
applications with lots of parallelism and a high demand for computing
power. If we wanted to use it as a general purpose processor we would
have to include support for floating point, exceptions, memory
management, etc. While that may be possible, these features would
greatly increase the complexity of the architecture and would conflict
with the idea of fast, simple and cheap hardware.

The implementation of jumps and calls is really nothing special. One
of the units gets some extra opcodes for (conditional) jumps and
calls. They will have some delay slots; two is probably enough.

> Major differences I see are:
> 1. your design uses a (more) centralised register file, which may
>     cause problems, especially with jumps and calls; maybe a localised
>     register file (as you mentioned), as we use it, may put us on the
>     same ground

We abandoned the idea of register file units shortly after we
submitted the paper. The only two remaining differences in the
register architecture are:
1: We allow only Ri as the destination.
2: Both our operands may come from a neighbor unit (the predicate
   operand shown in the paper has been removed).

> 2. you use more 'localised' connections; this may cause problems with
>     a larger number of units, but on the other hand the dataflow graph
>     may be much more easily extracted.
> 
> I suggest that you use a more specialised algorithm based on dataflow
> tree extraction (place small parts of the df tree into 2D (unit,time)
> space), which could have much lower complexity and could yield better
> results.
> (I am just speculating based on my experiences - I have nothing concrete
> in mind yet).

I would really like to have an algorithm that could map a dataflow
graph into 2D, since that would solve a large part of the problem of
deciding on which unit an operation must be scheduled. Once I have
that information, the rest of the schedule (transports, register
allocation) is fairly straightforward. The problem is: what is a good
mapping, and how do I find it?
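
For reference, this is what a naive baseline for such a mapping could
look like. It is plain greedy list scheduling, not the algorithm from
the paper and not the dataflow-tree approach suggested above. The
latency model is an assumption: every operation takes one cycle, an
operand within `reach` units can be read directly, and each further
transfer costs one extra cycle. All names are hypothetical.

```python
def greedy_map(ops, deps, n_units, reach=2):
    """Map operations to (unit, cycle) slots, earliest slot first.

    ops  : operation names in topological order
    deps : dict mapping an operation to the operations it reads from
    """
    placed = {}   # op -> (unit, cycle)
    busy = set()  # occupied (unit, cycle) slots
    for op in ops:
        best = None
        for unit in range(n_units):
            # Earliest cycle at which every operand can be read here.
            ready = 0
            for pred in deps.get(op, []):
                pred_unit, pred_cycle = placed[pred]
                transfers = -(-abs(unit - pred_unit) // reach)  # ceiling
                extra = max(0, transfers - 1)  # direct reads are free
                ready = max(ready, pred_cycle + 1 + extra)
            cycle = ready
            while (unit, cycle) in busy:
                cycle += 1
            if best is None or cycle < best[1]:
                best = (unit, cycle)  # ties keep the lowest unit number
        placed[op] = best
        busy.add(best)
    return placed
```

For example, with ops = ["a", "b", "c", "d"], deps = {"c": ["a", "b"],
"d": ["c"]} and four units, the independent operations a and b land
in cycle 0 on different units, followed by c and then d. A greedy
choice like this only looks at one operation at a time, which is
exactly why it says little about what a good global mapping is.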

> I believe (based on my simulations) that the algorithms don't work so
> well for a large number of units (like >10).
> 
> 
> best regards,
>     Marko

Greetings,
Steven
