From: Masoud Gholami
Subject: GCC Plugin to insert new expressions/statements in the code
Date: Wed, 15 Jul 2020 09:09:11 +0200
To: gcc-help@gcc.gnu.org

Hi,

I am using GCC 9.3 and
writing a plugin that uses the PLUGIN_PRAGMAS event to register a
custom pragma that is expected to appear directly before a function
call, as follows:

int main() {
    char *filename = "path/to/file";
    #pragma inject_before_call
    FILE *f = fopen(filename, …);      // marked fopen (by the pragma)
    …
    fclose(f);

    char *filename2 = "path/to/file2";
    FILE *f2 = fopen(filename2, …);    // non-marked fopen
    …
    fclose(f2);
    return 0;
}

I use the inject_before_call pragma to mark some fopen calls in the
code (in this example, the first fopen call is marked). Then, for each
marked fopen call, some extra expressions/statements/declarations are
injected into the code before the marked call. For example, the main
function above would be transformed as follows:

int main() {
    char *filename = "/path/to/file";
    FILE *tmp_f = fopen("/path/to/another/file", "w+");
    fclose(tmp_f);
    FILE *f = fopen(filename, …);
    …
    fclose(f);

    char *filename2 = "path/to/file2";
    // code not injected for the non-marked fopen
    FILE *f2 = fopen(filename2, …);
    …
    fclose(f2);
    return 0;
}

Here, because of the inject_before_call pragma, the two tmp_f
statements are injected into main before the call to the marked fopen.
They simply open a new file ("/path/to/another/file") and close it.

The injected code should be inserted only if an fopen call is marked by
an inject_before_call pragma. If no fopen call follows an
inject_before_call pragma, the user gets an error (the pragma may only
appear before an fopen call).

I implemented this in 3 steps as follows:

1. Detection of the marked fopen calls: I created a pragma_handler
which remembers the location_t of all inject_before_call pragmas.
Then, using a pass (before ssa), I look for the statements/expressions
on the line following each remembered location. If it is an fopen call,
it is considered a marked call and the code should be inserted before
it. If it is anything other than an fopen call, an error is generated.
However, I do not know whether there is a better way to detect the
marked calls.

Here is the simplified pass that finds the marked fopen calls (error
generation not shown):

unsigned int execute(function *func) {
    basic_block bb;
    FOR_EACH_BB_FN (bb, func) {
        for (gimple_stmt_iterator gsi = gsi_start_bb (bb);
             !gsi_end_p (gsi); gsi_next (&gsi)) {
            gimple *stmt = gsi_stmt (gsi);
            if (gimple_is_fopen(stmt)) {
                if (marked_fopen(stmt)) {
                    handle_marked_fopen(stmt);
                }
            }
        }
    }
    return 0;
}

2. Create the GIMPLE representation of the code to be injected: after
finding the marked fopen calls, I construct the declarations and
expressions to be injected into the code as follows:

// create the strings "/path/to/another/file" and "w+"
tree another_path = build_string (20, "/path/to/another/file");
fix_string_type (another_path);
tree mode = build_string (3, "w+\0");
fix_string_type (mode);

// create a call to the fopen function with the created strings
tree fopen_decl = lookup_qualified_name (global_namespace,
                                         get_identifier("fopen"), 0, true, false);
gimple *new_open_call = gimple_build_call(fopen_decl, 2, another_path, mode);

// create the tmp_f declaration
tree f_decl = build_decl(UNKNOWN_LOCATION, VAR_DECL,
                         get_identifier("tmp_f"), fileptr_type_node);
pushdecl (f_decl);
rest_of_decl_compilation (f_decl, 0, 0);

// set the lhs of the fopen call to be f_decl
gimple_call_set_lhs(new_open_call, f_decl);

// create a call to the fclose function with the tmp_f variable
tree fclose_decl = lookup_qualified_name (global_namespace,
                                          get_identifier("fclose"), 0, true, false);
gimple *new_close_call = gimple_build_call(fclose_decl, 1, f_decl);

3. Add the created GIMPLE statements to the code (basic blocks):

basic_block bb = gimple_bb(stmt);
for (gimple_stmt_iterator gsi = gsi_start_bb (bb);
     !gsi_end_p (gsi); gsi_next (&gsi)) {
    gimple *st = gsi_stmt (gsi);
    if (st == stmt) {    // the marked fopen call
        gsi_insert_before(&gsi, new_open_call, GSI_NEW_STMT);
        gsi_insert_after(&gsi, new_close_call, GSI_NEW_STMT);
        gimple_set_bb(new_open_call, bb);
        gimple_set_bb(new_close_call, bb);
        break;
    }
}

This is how I implemented the plugin. However, when I compile a sample
program (like the main function above) with it, I get a segmentation
fault. If I define another pass that prints the statements of the code
and run it after the pass that injects the code, the output looks
correct (i.e., the injected code is generated and inserted at the
location I intended). But when I debug the sample program, I see that
only the last injected statement (fclose) is executed, with NULL in the
f_decl variable, which causes the segmentation fault.

I also tried inserting the pass after the "lower" pass, which runs much
earlier. There I used

gimple_seq body = gimple_body (current_function_decl);

to get the gimple sequence of the current function and injected the new
statements into that sequence in the same way as above. But that did
not work either.

I have searched everywhere, read all the documentation I could find,
and dug into the GCC code for other pragmas (e.g. omp parallel). But I
still have not managed to do this correctly. Could you please point me
to where the problem is?

Thanks,
M. Gholami
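
P.S. To make the marking rule from step 1 concrete, here is a minimal
sketch of it outside GCC. It is only a model, not the plugin code: Call,
Mark and classify_calls are hypothetical names invented for this
illustration, and plain integers stand in for the line numbers that the
plugin would obtain from location_t values. It assumes that a pragma on
line N marks an fopen call on line N + 1 and that any pragma left
without such a call is an error.

```cpp
#include <cstddef>
#include <set>
#include <string>
#include <vector>

// Hypothetical stand-in for a call statement: callee name and source line.
struct Call {
    std::string callee;
    int line;
};

enum class Mark { Unmarked, Marked };

// Marks every fopen call that sits on the line right after a remembered
// pragma line. Returns the pragma lines that matched no fopen call; in the
// model these are the pragmas that should be reported as errors.
std::vector<int> classify_calls(const std::set<int>& pragma_lines,
                                const std::vector<Call>& calls,
                                std::vector<Mark>& marks)
{
    std::set<int> unmatched = pragma_lines;
    marks.assign(calls.size(), Mark::Unmarked);

    for (std::size_t i = 0; i < calls.size(); ++i) {
        const int pragma_line = calls[i].line - 1;  // assumed "next line" rule
        if (calls[i].callee == "fopen" && pragma_lines.count(pragma_line)) {
            marks[i] = Mark::Marked;
            unmatched.erase(pragma_line);
        }
    }
    return std::vector<int>(unmatched.begin(), unmatched.end());
}
```

With pragmas remembered on lines 3 and 20 and fopen calls on lines 4 and
9, the model marks only the call on line 4 and reports line 20 as a
pragma without a following fopen call.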