* GSoC(run-time argument checking project)
@ 2022-03-29 20:26 Γιωργος Μελλιος
  2022-03-29 21:47 ` Tobias Burnus
  0 siblings, 1 reply; 2+ messages in thread
From: Γιωργος Μελλιος @ 2022-03-29 20:26 UTC (permalink / raw)
  To: fortran

Greetings,

I am planning to apply to GCC for GSoC, so I was checking the project
ideas list. I got interested in the Fortran run-time argument checking
project and would like to learn more about it, so that I can start
researching the field and be more productive if I get selected.

Thanks in advance.
Georgios Panagiotis Mellios


* Re: GSoC(run-time argument checking project)
  2022-03-29 20:26 GSoC(run-time argument checking project) Γιωργος Μελλιος
@ 2022-03-29 21:47 ` Tobias Burnus
  0 siblings, 0 replies; 2+ messages in thread
From: Tobias Burnus @ 2022-03-29 21:47 UTC (permalink / raw)
  To: Γιωργος
	Μελλιος,
	fortran

Hi Γιωργος,

On 29.03.22 22:26, Γιωργος Μελλιος via Fortran wrote:
> I am planning to apply to GCC for GSoC, so I was checking the project
> ideas list. I got interested in the Fortran run-time argument checking
> project and would like to learn more about it, so that I can start
> researching the field and be more productive if I get selected.

This feature mostly concerns older Fortran code, since modern Fortran code
tends to use modules. With modules, one writes procedures (subroutines or functions)
like:

! MODERN CODE - USING MODULES

module myMod
   implicit none
contains
   subroutine mySub(n,y,z)
     integer :: n
     real :: y(10)
     character(len=n) :: z(:,:)
   end subroutine
end module

And then, to use it, one simply writes:

   use myMod
   ...
   call mySub(m, var, array)

By 'use'ing the module, the compiler knows the interface of 'mySub', including
the data types, and can use the proper ABI. Here, all variables are passed by
reference: 'y' is a contiguous stream of the actual data, whereas 'z' uses a
wrapper ("array descriptor", "dope vector") that contains additional data
such as the array bounds.
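
To make the difference between the two passing conventions concrete, here is
a rough sketch in C. The struct 'toy_descriptor' and both functions are made
up for illustration; gfortran's real array descriptor has a different and more
detailed layout, and the sketch uses a rank-1 real array instead of the
rank-2 character array 'z' above to keep it short.

```c
#include <stddef.h>

/* Hypothetical, simplified array descriptor for a rank-1 real array.
   This is NOT gfortran's actual layout, only the general idea. */
struct toy_descriptor {
    float *base;       /* address of the first element */
    ptrdiff_t lbound;  /* lower bound */
    ptrdiff_t ubound;  /* upper bound */
    ptrdiff_t stride;  /* distance between elements, counted in elements */
};

/* Like 'real :: y(10)': the callee only receives an address and has to
   trust that ten elements are really there. */
float sum_explicit_shape(const float *y)
{
    float s = 0.0f;
    for (int i = 0; i < 10; i++)
        s += y[i];
    return s;
}

/* Like 'real :: z(:)': the bounds travel together with the data. */
float sum_assumed_shape(const struct toy_descriptor *z)
{
    float s = 0.0f;
    ptrdiff_t n = z->ubound - z->lbound + 1;
    for (ptrdiff_t i = 0; i < n; i++)
        s += z->base[i * z->stride];
    return s;
}
```

If a caller passes a bare address where the callee expects a descriptor (or
the other way round), the callee misinterprets the memory, which is exactly
the kind of mismatch a run-time check could report.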

  * * *

OLD WAY:

subroutine mySub2(n, x)
   integer :: n
   real :: x(n)
end

subroutine another_sub()
   real :: x(2)
   x = [1., 2.]
   call mySub2(size(x), x)
end

Even if you put this into the same file, in terms of the Fortran language,
the compiler does not know anything about 'mySub2' inside 'another_sub'
except that it is a subroutine (because of the 'call mySub2'). It does not
know the number of arguments, the data types, or how to pass the data.
From the usage, it can deduce that there are two arguments, and it uses the
standard argument passing known from Fortran 66 (i.e. pass by reference, pass
arrays as a stream of data).

If the two subroutines are in different files, the Fortran semantics and
what the compiler knows are the same. But of course, if both are in the
same file, the compiler _can_ see the other subroutine and compare what is
known locally with how the subroutine really looks.
(GCC/gfortran does such checks if possible. There is room for improvement,
but it already detects a lot.)

With -fcheck=interface or some option like that, the compiler should add
checks that there are indeed two arguments, that the called procedure is
indeed a subroutine (and not a function), and that the first argument is a
scalar and the second one an array.
Going beyond that, it could also check whether the array size is >= the first
argument. (But the size might not always be known to the caller.)

  * * *

If certain features are used, the compiler must know the interface of
the procedure. One way is by 'use'ing a module as above, but, alternatively,
an INTERFACE block can be used.

An explicit interface, e.g. an INTERFACE block, is required if the arguments
are passed in a non-standard way, e.g. by VALUE instead of by reference, or
(as above) not as a byte stream but wrapped in an array descriptor. For
instance, 'var(:)' is an assumed-shape array, which is passed with an array
descriptor; by contrast, 'var(n)' is an explicit-shape array, which is passed
as a pointer to the first element, such that there is just the stream of
bytes with the values.

'mySub2' above is an example where the interface is not needed; there it would
only help to find argument mismatches.
In the example below, the assumed-shape array and the VALUE attribute mean
that an interface is required:

subroutine mySub3(n, x)
   integer, value :: n
   real :: x(:)
end

subroutine a_third_sub()
   real :: x(2)
   interface
     subroutine mySub3(n, x)
       integer :: n   ! VALUE is missing here (deliberate mistake)
       real :: x(:)
     end
   end interface
   call mySub3(123, x)
end

When writing an INTERFACE block, it can easily happen that one misses
some property, like above where VALUE is missing in the INTERFACE block.

Or one forgets to write the INTERFACE block although it is required due to,
e.g., the VALUE attribute.

  * * *

Regarding the implementation: The idea is to have one or two global variables
that are pointers, one to the called procedure and one to the data that
describes the actual arguments.

Then, for a call such as
   call mySub3(123, x)

the following is done right before it (pseudo code in Fortran syntax):

   var_callee => mySub3  ! called procedure

   data%version = 1
   data%filename = "...."
   data%line_num = ...
   data%num_args = 2  ! property from the interface block (if available), otherwise from usage.
   data%arg(1)%type = integer  ! likewise
   data%arg(1)%by_value = .false.
   data%arg(2)%type = real
   data%arg(2)%array_type = assumed_shape
   data%arg(2)%array_size = size(x)
   var_args => data

   call mySub3(123, x)


And inside mySub3:

subroutine mySub3 (...)
   if (var_callee == mySub3) then  ! is the stored data meant for this procedure?
     data2%version = 1
     data2%num_args = 2
     data2%arg(1)%type = integer
     data2%arg(1)%by_value = .true.
     data2%arg(2)%type = real
     data2%arg(2)%array_type = assumed_shape
     call gfortran_argcheck (data2, var_args, "mySub3")  ! callee data vs. caller data
   endif
   ...
end


Thus: One stores a bunch of information about the actual arguments
in a variable and saves a pointer to it. In the callee, there is a check
that the data is indeed meant for that procedure (to permit compiling only
a subset of the files with this instrumentation), and if it is,
the arguments are compared.
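
The handshake through the two pointers could look roughly as follows in C.
This is only a sketch under assumptions: the names 'call_info', 'callee',
'instrumented_caller' and the file name "demo.f90" are invented, real
procedures would of course take arguments, and resetting the pointers at the
end of the callee is my own addition to avoid acting on stale data. The
pointers are thread-local so that concurrent calls do not clobber each other.

```c
#include <stddef.h>

/* Hypothetical record describing one call site. */
struct call_info {
    int version;
    const char *filename;
    int line_num;
    int num_args;
};

/* One pair of pointers per thread (C11 thread-local storage). */
static _Thread_local void (*var_callee)(void);
static _Thread_local const struct call_info *var_args;

/* Only for this demonstration: remembers what the callee has seen. */
static int last_checked_num_args = -1;

void callee(void)
{
    /* Look at var_args only if the caller filled it in for *this* procedure. */
    if (var_callee == callee) {
        last_checked_num_args = var_args->num_args;
        /* A real implementation would call the comparison routine here. */
    }
    /* Assumption: reset, so that a later call from uninstrumented code is
       not checked against data left over from an earlier call. */
    var_callee = NULL;
    var_args = NULL;
}

/* Caller compiled with the instrumentation: fills in the data first. */
void instrumented_caller(void)
{
    static const struct call_info info = { 1, "demo.f90", 42, 2 };
    var_callee = callee;
    var_args = &info;
    callee();
}

/* Caller compiled without the instrumentation: nothing is checked. */
void uninstrumented_caller(void)
{
    callee();
}
```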

My impression is that it then makes sense to outsource this checking
into a library function. In this example, that could be:

   if (caller.arg[i].by_value != callee.arg[i].by_value)
     error ("%s:%d: Mismatch in VALUE attribute for argument %d in call to %s",
            caller.filename, caller.line_num, i, proc_name);

I think you get the idea. Thus, the work in the compiler itself is to generate
the code that sets up the argument data before the call and at the beginning
of a procedure, plus the call to a comparison function.

The diagnostic then lives in the library, which does the actual checking and
writes some nice words about it.
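
As a rough illustration of the library side, such a comparison function could
be sketched in C as below. All names ('argcheck', 'proc_info', 'arg_info',
'MAX_ARGS') and the struct layout are assumptions made for this example, not
an existing libgfortran interface; only the argument count, the type and the
VALUE attribute are compared.

```c
#include <stdio.h>
#include <stdbool.h>

#define MAX_ARGS 8  /* arbitrary limit for this sketch */

enum arg_type { ARG_INTEGER, ARG_REAL };

struct arg_info {
    enum arg_type type;
    bool by_value;
};

struct proc_info {
    int version;
    const char *filename;  /* only meaningful on the caller side */
    int line_num;          /* likewise */
    int num_args;
    struct arg_info arg[MAX_ARGS];
};

/* Compare what the caller assumed with what the callee declares.
   Reports each mismatch on stderr and returns the number of mismatches. */
int argcheck(const struct proc_info *caller, const struct proc_info *callee,
             const char *proc_name)
{
    int errors = 0;

    if (caller->num_args != callee->num_args) {
        fprintf(stderr, "%s:%d: Call to %s passes %d argument(s), but %d expected\n",
                caller->filename, caller->line_num, proc_name,
                caller->num_args, callee->num_args);
        return 1;  /* per-argument checks make no sense in this case */
    }

    for (int i = 0; i < caller->num_args && i < MAX_ARGS; i++) {
        if (caller->arg[i].type != callee->arg[i].type) {
            fprintf(stderr, "%s:%d: Type mismatch for argument %d in call to %s\n",
                    caller->filename, caller->line_num, i + 1, proc_name);
            errors++;
        }
        if (caller->arg[i].by_value != callee->arg[i].by_value) {
            fprintf(stderr, "%s:%d: Mismatch in VALUE attribute for argument %d"
                    " in call to %s\n",
                    caller->filename, caller->line_num, i + 1, proc_name);
            errors++;
        }
    }
    return errors;
}
```

For the 'mySub3' example, the caller side would record by_value = false for
the first argument and the callee side by_value = true, so the function would
report one mismatch.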

In terms of the compiler, the data structure has to be created on the fly. You
have two choices: build it in the Fortran AST (abstract syntax tree; gfc_expr,
gfc_symbol) or in the representation used by C/C++ and the middle end ("tree").

Side remark: of course a thread-private variable is needed to support concurrency.

  * * *

I think the first step is to get some basic checking done (e.g. function vs.
subroutine + number of arguments), and then to extend it to check for more
complicated things. (Hence also the version field: it permits adding more
checks in later releases.)
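
One way the version field could be used is sketched below in C. The version
numbers attached to the individual checks and the name 'check_enabled' are
invented for illustration; the point is only that a check may run when both
sides were compiled by a compiler that already writes the fields it needs.

```c
/* Hypothetical: the record version that introduced each check. */
enum {
    CHECK_NUM_ARGS   = 1,
    CHECK_BY_VALUE   = 2,
    CHECK_ARRAY_SIZE = 3
};

/* A check is only safe if both the caller's and the callee's record are
   at least as new as the version that introduced the check. */
int check_enabled(int caller_version, int callee_version, int check_since)
{
    int common = caller_version < callee_version ? caller_version
                                                 : callee_version;
    return common >= check_since;
}
```

With this, object files compiled by an older release can still be linked with
newer ones; the newer checks are then simply skipped for those calls.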

Fortran standards: https://gcc.gnu.org/wiki/GFortranStandards
You will surely need these as a reference when working on the project, but if
you do not know much Fortran yet, a Fortran tutorial will help more at first.

  * * *

I hope it helps to give you a rough idea – if you need more, ask.
(In particular, without knowing your background, it is difficult to
link to the best-suited references.)

Cheers,

Tobias

