Hi Jakub, hi all,

new version attached. It now checks during lto1 whether the values are consistent and fails with a hard error if they are not.

The value actually used (by libgomp) is stored as a scalar weak symbol, while for checking, each translation unit stores the integer value for LTO (alongside the offload table). That per-TU value is used both for checking and in lto1 (device + host lto1) to restore the value of 'omp_requires_mask' for further use. Currently, it is only used on the host to make the value available to libgomp. However, a device lto1 could also use it. (Usage: cf. Andrew's USM gcn patch.)

Unchanged from the previous version: libgomp outputs a warning/note if a device could be found but the requires prevented libgomp from using it. This message is also shown with -foffload=disable, but it is not shown for OMP_TARGET_OFFLOAD=disable.

The other change is that API calls no longer count as relevant for 'omp requires', such that compilation units which only contain those will not output anything (independent of whether there is an 'omp requires' or not).

On 09.06.22 16:19, Jakub Jelinek wrote:
> On Thu, Jun 09, 2022 at 02:46:34PM +0200, Tobias Burnus wrote:
>> On 09.06.22 13:40, Jakub Jelinek via Gcc-patches wrote:
> If it is from me, bet it was because of that (mis)understanding that
> device routines are device related runtime API calls.
> I'd suggest to only mark in the patch what is clear (which is device
> constructs) and defer the rest until it is clarified.

Done so.

>>> Shouldn't the vars in that section be const, so that it is a read-only
>>> section?
>>>
>>> Is unsigned_type_node what we want (say wouldn't be just unsigned_char_node
>>> be enough, currently we just need 3 bits).
>> Probably - that would be 8 bits, leaving 5 spare. I have not checked what
>> Andrew et al. do with the pinned-memory support by -f, but
>> that will likely use only 1 to 3 bits, if any.
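To illustrate the consistency check and the bit-width question, here is a minimal standalone sketch. It is not the code from the patch: the names (sketch_requires, sketch_check_requires), the choice of three flag bits and the array-based interface are assumptions made for illustration only. The real check runs in lto1 on the streamed per-TU values. The array is assumed to hold only the masks of TUs that actually emitted one.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical flag bits, named after OpenMP 'requires' clauses.  The
   names and values are illustrative and not taken from the patch.  Three
   bits fit easily into an unsigned char.  */
enum sketch_requires
{
  SKETCH_REQ_UNIFIED_ADDRESS = 1 << 0,
  SKETCH_REQ_UNIFIED_SHARED_MEMORY = 1 << 1,
  SKETCH_REQ_REVERSE_OFFLOAD = 1 << 2
};

/* Compare the N per-TU masks in MASKS.  If all agree (or N is 0), store
   the common mask in *RESULT and return 0.  Otherwise store the index of
   the first TU that differs from TU 0 in *BAD_TU and return -1; a caller
   would then emit the hard error.  */
static int
sketch_check_requires (const unsigned char *masks, size_t n,
                       unsigned char *result, size_t *bad_tu)
{
  assert (masks != NULL || n == 0);

  if (n == 0)
    {
      *result = 0;
      return 0;
    }

  for (size_t i = 1; i < n; i++)
    if (masks[i] != masks[0])
      {
        *bad_tu = i;
        return -1;
      }

  *result = masks[0];
  return 0;
}
```

In this sketch, the restored mask is simply the value shared by all TUs; a TU that emitted nothing would not appear in the array at all.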
> If it is SHF_MERGE, even 16-bit or 32-bit wouldn't be the end of the world,
> or if it is in LTO streamed out stuff, we can use a bitpack for it...

As the final binary will only contain a single variable, the size should not matter much. I currently use 'unsigned', but it could surely be shorter. For the .o files, it also outputs the unsigned value for each TU, but that's also small.

I was thinking about adding more data (like location data, be it location_t or __FILENAME__). However, it uses a stripped-down stream writer, and location/string writing requires a different object (and reading it back requires data_in). I did not regard this as worthwhile and, thus, I only output the used requires clauses and not where they were used.

> I think best would be a fatal error if people try to configure
> offloading targets for a compiler that doesn't support named sections,
> or perhaps that and presence of anything that should be offloaded.

I do not use any named section, but I could if it makes sense.

In any case, the question is whether the current weak symbol makes sense or not, and whether there are problems in using weak symbols (in libgomp's target.c + for non-ACCEL_COMPILER, but only when the symbol needs to be written).

I am also not sure about the best naming. Thoughts?

Otherwise, tested with no offloading configured + with offloading to nvptx (fully tested) and gcn (partially) [all x86_64-gnu-linux].

Tobias

PS: At some point, we need to think about how to handle calling, from a program's target region, a declare-target device function which is inside a shared library. I am sure we currently do not handle it. For that, we then also have to think about how to handle the requires clauses.

-----------------
Siemens Electronic Design Automation GmbH; Address: Arnulfstraße 201, 80634 München; limited liability company (Gesellschaft mit beschränkter Haftung); Managing Directors: Thomas Heurung, Frank Thürauf; Registered office: München; Register court: München, HRB 106955