From: Tobias Burnus
To: Γιωργος Μελλιος, fortran
Subject: Re: GSoC(run-time argument checking project)
Date: Tue, 29 Mar 2022 23:47:07 +0200

Hi Γιωργος,

On 29.03.22 22:26, Γιωργος Μελλιος via Fortran wrote:
> I am looking forward to applying for GCC so I was checking the project
> ideas list. I got interested in the Fortran - run-time argument checking
> project and I would like to learn some more information about it in order
> to start doing some research on the specific field so that I will be more
> productive if I get selected.

This feature relates more to older Fortran code, as modern Fortran code
tends to use modules. With modules, one writes procedures (subroutines
or functions) like:

  ! MODERN CODE - USING MODULES
  module myMod
    implicit none
  contains
    subroutine mySub(n,y,z)
      integer :: n
      real :: y(10)
      character(len=n) :: z(:,:)
    end subroutine
  end module

And then when using it, just doing:

  use myMod
  ...
  call mySub(m, var, array)

By 'use'ing the module, the compiler knows the data types and can use
the proper ABI. Here, all variables are passed by reference; 'y' is a
contiguous stream of the actual data, whereas 'z' uses a wrapper
("array descriptor", "dope vector") which contains additional data
(like the array bounds).

* * *

OLD WAY:

  subroutine mySub2(n, x)
    integer :: n
    real :: x(n)
  end

  subroutine another_sub()
    real :: x(2)
    x = [1., 2.]
    call mySub2(size(x), x)
  end

Even if you put this into the same file, in terms of the Fortran
language, the compiler does not know anything about 'mySub2' inside
'another_sub' except that it is a subroutine (because of the
'call mySub2'). It does not know the number of arguments, the data
types, or how to pass the data.

By usage, it can deduce that there are two arguments, and it uses the
standard argument passing known from Fortran 66 (i.e. pass by reference,
pass arrays as a stream of data).

If the two subroutines are in different files, the Fortran semantics and
what the compiler knows are the same. But of course, if both are in the
same file, the compiler _can_ see the other subroutine and check what is
known locally against how the subroutine looks in reality.
(GCC/gfortran does such checks if possible. There is room for
improvement, but it already detects a lot.)

With -fcheck=interface or some option like that, the compiler should add
checks that there are indeed 2 arguments, that the called procedure is
indeed a subroutine (and not a function), that the first argument is a
scalar and the second one an array. Going beyond that, it could also
check whether the array size is >= the first argument. (But the size
might not always be known to the caller.)

* * *

If certain features are used, the compiler must know the interface of
the procedure. One way is by 'use'ing a module as above but,
alternatively, an INTERFACE block can be used.

The INTERFACE block is required if the arguments are passed in a
non-standard way, e.g. by VALUE instead of by reference, or (as above)
not as a byte stream but wrapped in an array descriptor. 'var(:)' is an
assumed-shape array (passed with an array descriptor); by contrast,
'var(n)' is an explicit-shape array (a pointer to the first element is
passed, such that there is just the stream of bytes with the values).

'mySub2' above is an example where the interface is not needed and would
only be helpful to find argument mismatches. In the example below, the
assumed-shape array and the VALUE attribute mean that an interface is
required:

  subroutine mySub3(n, x)
    integer, value :: n
    real :: x(:)
  end

  subroutine a_third_sub()
    real :: x(2)
    interface
      subroutine mySub3(n, x)
        integer :: n
        real :: x(:)
      end subroutine
    end interface
    call mySub3(123, x)
  end

When writing an INTERFACE block, it can easily happen that one misses
some property, like above where VALUE is missing in the INTERFACE block.
Or one forgets to write the INTERFACE block although it is required due
to, e.g., the VALUE attribute.

* * *

Regarding the implementation: The idea is to have one/two global
variable(s) which is/are pointers.

Then, when doing

  call mySub3(123, x)

the following is done beforehand (pseudo code in Fortran syntax):

  var_callee => mySub3         ! called function
  data%version = 1
  data%filename = "...."
  data%line_num = ...
  data%num_args = 2            ! property from the interface block (if available), otherwise from usage
  data%arg[1]%type = integer   ! likewise
  data%arg[1]%by_value = .false.
  data%arg[2]%type = real
  data%arg[2]%array_type = assumed_shape
  data%arg[2]%array_size = size(x)
  var_args => data
  call mySub3(123, x)

And inside mySub3:

  subroutine mySub3 (...)
    if (var_callee == mySub3) then
      data2%version = 1
      data2%num_args = 2
      data2%arg[1]%type = integer
      data2%arg[1]%by_value = .true.
      data2%arg[2]%type = real
      data2%arg[2]%array_type = assumed_shape
      call gfortran_argcheck (data2, var_args, "mySub3")
    endif
    ...
  end

Thus: The caller stores a bunch of information about the actual
arguments in a variable and saves a pointer to it. In the callee, there
is a check that the data is indeed for that procedure (to permit
compiling only a subset of the files with this instrumentation), and if
it is, the arguments are compared.

My impression is that it then makes sense to outsource this checking
into a library function. In this example, that could be:

  if (caller.arg[i].by_value != callee.arg[i].by_value)
    error ("%s:%d: Mismatch in VALUE attribute for argument %d in call to %s",
           caller.filename, caller.line_num, i, proc_name);

I think you get the idea. Thus, the work is to generate the code for the
arguments before the call + at the beginning of a procedure + call a
comparison function. That is all work done in the compiler itself.

And then there is the diagnostic in the library, which does the actual
checking and writes some nice words about it.
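
To make the library side a bit more concrete, here is a rough sketch in C
of what such a comparison function could look like. Everything in it is an
assumption made up for illustration: the names 'call_info', 'arg_info',
'argcheck_compare' and 'MAX_ARGS', the enum encodings, the fixed-size
argument array, and the choice to skip the check when the version fields
differ. It is not existing libgfortran code, only one possible shape:

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical encodings; a real implementation would define its own. */
enum arg_type { ARG_INTEGER, ARG_REAL };
enum array_kind { ARR_SCALAR, ARR_EXPLICIT_SHAPE, ARR_ASSUMED_SHAPE };

#define MAX_ARGS 8   /* fixed size only to keep the sketch short */

struct arg_info {
  enum arg_type type;
  enum array_kind array_type;
  bool by_value;
};

struct call_info {
  int version;
  const char *filename;   /* filled in by the caller side */
  int line_num;           /* filled in by the caller side */
  int num_args;
  struct arg_info arg[MAX_ARGS];
};

/* Compare what the caller assumed with what the callee declares.
   Writes one diagnostic per mismatch to 'out' and returns the number
   of mismatches found.  */
int argcheck_compare(const struct call_info *caller,
                     const struct call_info *callee,
                     const char *proc_name, FILE *out)
{
  assert(caller != NULL && callee != NULL && proc_name != NULL);

  /* Assumption: records with different layout versions are not compared. */
  if (caller->version != callee->version)
    return 0;

  if (caller->num_args != callee->num_args) {
    fprintf(out, "%s:%d: Mismatch in number of arguments in call to %s: "
                 "%d passed, %d expected\n",
            caller->filename, caller->line_num, proc_name,
            caller->num_args, callee->num_args);
    return 1;   /* comparing argument by argument makes no sense now */
  }

  int errors = 0;
  for (int i = 0; i < caller->num_args && i < MAX_ARGS; i++) {
    const struct arg_info *a = &caller->arg[i];
    const struct arg_info *b = &callee->arg[i];
    if (a->type != b->type) {
      fprintf(out, "%s:%d: Mismatch in type for argument %d in call to %s\n",
              caller->filename, caller->line_num, i + 1, proc_name);
      errors++;
    }
    if (a->array_type != b->array_type) {
      fprintf(out, "%s:%d: Mismatch in array kind for argument %d in call to %s\n",
              caller->filename, caller->line_num, i + 1, proc_name);
      errors++;
    }
    if (a->by_value != b->by_value) {
      fprintf(out, "%s:%d: Mismatch in VALUE attribute for argument %d in call to %s\n",
              caller->filename, caller->line_num, i + 1, proc_name);
      errors++;
    }
  }
  return errors;
}
```

For the mySub3 example, the caller record would have by_value = false for
the first argument and the callee record by_value = true, so this sketch
would report one mismatch. Returning a count instead of aborting is just
one option; the real function might as well stop the program at the first
error.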

In terms of the compiler, the data structure has to be created on the
fly. You have two choices: build it in the Fortran AST (abstract syntax
tree; gfc_expr, gfc_symbol) or in the one used by C/C++ and the middle
end ("tree").

Side remark: of course, a thread-private variable is needed to support
concurrency.

* * *

I think the first step is to get some basic checking done (e.g. function
vs. subroutine + number of arguments) and then to extend it to check for
more complicated things. (Hence also the version field: it permits
adding more changes in later releases.)

Fortran standards: https://gcc.gnu.org/wiki/GFortranStandards
This is something you will surely need as a reference when working on
it, but if you do not know much Fortran, some Fortran tutorial will help
more.

* * *

I hope this helps to give you a rough idea. If you need more, ask.
(In particular, without knowing your background, it is difficult to link
to the best-suited references.)

Cheers,

Tobias