public inbox for fortran@gcc.gnu.org
 help / color / mirror / Atom feed
* [PATCH] Fortran: Avoid accessing gfc_charlen when not looking at BT_CHARACTER (PR 110677)
@ 2023-08-14 17:39 Martin Jambor
  2023-08-14 19:28 ` Harald Anlauf
  0 siblings, 1 reply; 3+ messages in thread
From: Martin Jambor @ 2023-08-14 17:39 UTC (permalink / raw)
  To: GCC Patches, fortran

Hello,

this patch addresses an issue uncovered by the undefined behavior
sanitizer.  In function resolve_structure_cons in resolve.cc there is
a test starting with:

      if (cons->expr->ts.type == BT_CHARACTER && comp->ts.u.cl
	  && comp->ts.u.cl->length
	  && comp->ts.u.cl->length->expr_type == EXPR_CONSTANT

and UBSAN complained of loads from comp->ts.u.cl->length->expr_type of
integer value 1818451807 which is outside of the value range expr_t
enum.  If I understand the code correctly it the entire load was
unwanted because comp->ts.type in those cases is BT_CLASS and not
BT_CHARACTER.  This patch simply adds a check to make sure it is only
accessed in those cases.

I have verified that the UPBSAN failure goes away with this patch, it
also passes bootstrap and testing on x86_64-linux.  OK for master?

Thanks,

Martin



gcc/fortran/ChangeLog:

2023-08-14  Martin Jambor  <mjambor@suse.cz>

	PR fortran/110677
	* resolve.cc (resolve_structure_cons): Check comp->ts is character
	type before accessing stuff through comp->ts.u.cl.
---
 gcc/fortran/resolve.cc | 5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)

diff --git a/gcc/fortran/resolve.cc b/gcc/fortran/resolve.cc
index e7c8d919bef..5b4dfc5fcd2 100644
--- a/gcc/fortran/resolve.cc
+++ b/gcc/fortran/resolve.cc
@@ -1396,8 +1396,9 @@ resolve_structure_cons (gfc_expr *expr, int init)
 	 the one of the structure, ensure this if the lengths are known at
  	 compile time and when we are dealing with PARAMETER or structure
 	 constructors.  */
-      if (cons->expr->ts.type == BT_CHARACTER && comp->ts.u.cl
-	  && comp->ts.u.cl->length
+      if (cons->expr->ts.type == BT_CHARACTER
+	  && comp->ts.type == BT_CHARACTER
+	  && comp->ts.u.cl && comp->ts.u.cl->length
 	  && comp->ts.u.cl->length->expr_type == EXPR_CONSTANT
 	  && cons->expr->ts.u.cl && cons->expr->ts.u.cl->length
 	  && cons->expr->ts.u.cl->length->expr_type == EXPR_CONSTANT
-- 
2.41.0


^ permalink raw reply	[flat|nested] 3+ messages in thread

* Re: [PATCH] Fortran: Avoid accessing gfc_charlen when not looking at BT_CHARACTER (PR 110677)
  2023-08-14 17:39 [PATCH] Fortran: Avoid accessing gfc_charlen when not looking at BT_CHARACTER (PR 110677) Martin Jambor
@ 2023-08-14 19:28 ` Harald Anlauf
  2023-08-15 15:18   ` Martin Jambor
  0 siblings, 1 reply; 3+ messages in thread
From: Harald Anlauf @ 2023-08-14 19:28 UTC (permalink / raw)
  To: Martin Jambor, GCC Patches, fortran

Hi Martin,

Am 14.08.23 um 19:39 schrieb Martin Jambor:
> Hello,
>
> this patch addresses an issue uncovered by the undefined behavior
> sanitizer.  In function resolve_structure_cons in resolve.cc there is
> a test starting with:
>
>        if (cons->expr->ts.type == BT_CHARACTER && comp->ts.u.cl
> 	  && comp->ts.u.cl->length
> 	  && comp->ts.u.cl->length->expr_type == EXPR_CONSTANT
>
> and UBSAN complained of loads from comp->ts.u.cl->length->expr_type of
> integer value 1818451807 which is outside of the value range expr_t
> enum.  If I understand the code correctly it the entire load was
> unwanted because comp->ts.type in those cases is BT_CLASS and not
> BT_CHARACTER.  This patch simply adds a check to make sure it is only
> accessed in those cases.
>
> I have verified that the UPBSAN failure goes away with this patch, it
> also passes bootstrap and testing on x86_64-linux.  OK for master?

this looks good to me.

Looking at that code block, there is a potential other UB a few lines
below, where (hopefully integer) string lengths are to be passed to
mpz_cmp.

If the string length is ill-defined (e.g. non-integer), value.integer
is undefined.  We've seen this elsewhere, where on BE platforms that
undefined value was interpreted as some large integer and giving
failures on those platforms.  One could similarly add the following
checks here (on top of your patch):

diff --git a/gcc/fortran/resolve.cc b/gcc/fortran/resolve.cc
index e7c8d919bef..43095406c16 100644
--- a/gcc/fortran/resolve.cc
+++ b/gcc/fortran/resolve.cc
@@ -1401,6 +1401,8 @@ resolve_structure_cons (gfc_expr *expr, int init)
  	  && comp->ts.u.cl->length->expr_type == EXPR_CONSTANT
  	  && cons->expr->ts.u.cl && cons->expr->ts.u.cl->length
  	  && cons->expr->ts.u.cl->length->expr_type == EXPR_CONSTANT
+	  && cons->expr->ts.u.cl->length->ts.type == BT_INTEGER
+	  && comp->ts.u.cl->length->ts.type == BT_INTEGER
  	  && mpz_cmp (cons->expr->ts.u.cl->length->value.integer,
  		      comp->ts.u.cl->length->value.integer) != 0)
  	{

It is up to you whether you want to add this.

Thanks for the patch!

Harald


> Thanks,
>
> Martin
>
>
>
> gcc/fortran/ChangeLog:
>
> 2023-08-14  Martin Jambor  <mjambor@suse.cz>
>
> 	PR fortran/110677
> 	* resolve.cc (resolve_structure_cons): Check comp->ts is character
> 	type before accessing stuff through comp->ts.u.cl.
> ---
>   gcc/fortran/resolve.cc | 5 +++--
>   1 file changed, 3 insertions(+), 2 deletions(-)
>
> diff --git a/gcc/fortran/resolve.cc b/gcc/fortran/resolve.cc
> index e7c8d919bef..5b4dfc5fcd2 100644
> --- a/gcc/fortran/resolve.cc
> +++ b/gcc/fortran/resolve.cc
> @@ -1396,8 +1396,9 @@ resolve_structure_cons (gfc_expr *expr, int init)
>   	 the one of the structure, ensure this if the lengths are known at
>    	 compile time and when we are dealing with PARAMETER or structure
>   	 constructors.  */
> -      if (cons->expr->ts.type == BT_CHARACTER && comp->ts.u.cl
> -	  && comp->ts.u.cl->length
> +      if (cons->expr->ts.type == BT_CHARACTER
> +	  && comp->ts.type == BT_CHARACTER
> +	  && comp->ts.u.cl && comp->ts.u.cl->length
>   	  && comp->ts.u.cl->length->expr_type == EXPR_CONSTANT
>   	  && cons->expr->ts.u.cl && cons->expr->ts.u.cl->length
>   	  && cons->expr->ts.u.cl->length->expr_type == EXPR_CONSTANT


^ permalink raw reply	[flat|nested] 3+ messages in thread

* Re: [PATCH] Fortran: Avoid accessing gfc_charlen when not looking at BT_CHARACTER (PR 110677)
  2023-08-14 19:28 ` Harald Anlauf
@ 2023-08-15 15:18   ` Martin Jambor
  0 siblings, 0 replies; 3+ messages in thread
From: Martin Jambor @ 2023-08-15 15:18 UTC (permalink / raw)
  To: Harald Anlauf; +Cc: fortran, gcc-patches

Hello,

On Mon, Aug 14 2023, Harald Anlauf via Gcc-patches wrote:
> Hi Martin,
>
> Am 14.08.23 um 19:39 schrieb Martin Jambor:
>> Hello,
>> 
>> this patch addresses an issue uncovered by the undefined behavior
>> sanitizer.  In function resolve_structure_cons in resolve.cc there is
>> a test starting with:
>> 
>>        if (cons->expr->ts.type == BT_CHARACTER && comp->ts.u.cl
>> 	  && comp->ts.u.cl->length
>> 	  && comp->ts.u.cl->length->expr_type == EXPR_CONSTANT
>> 
>> and UBSAN complained of loads from comp->ts.u.cl->length->expr_type of
>> integer value 1818451807 which is outside of the value range expr_t
>> enum.  If I understand the code correctly it the entire load was
>> unwanted because comp->ts.type in those cases is BT_CLASS and not
>> BT_CHARACTER.  This patch simply adds a check to make sure it is only
>> accessed in those cases.
>> 
>> I have verified that the UPBSAN failure goes away with this patch, it
>> also passes bootstrap and testing on x86_64-linux.  OK for master?
>
> this looks good to me.
>
> Looking at that code block, there is a potential other UB a few lines
> below, where (hopefully integer) string lengths are to be passed to
> mpz_cmp.
>
> If the string length is ill-defined (e.g. non-integer), value.integer
> is undefined.  We've seen this elsewhere, where on BE platforms that
> undefined value was interpreted as some large integer and giving
> failures on those platforms.  One could similarly add the following
> checks here (on top of your patch):

Thank you very much for the approval and the improvement.  I have
committed the following (after another round of testing).

Martin



Fortran: Avoid accessing gfc_charlen when not looking at BT_CHARACTER (PR 110677)

This patch addresses an issue uncovered by the undefined behavior
sanitizer.  In function resolve_structure_cons in resolve.cc there is
a test starting with:

      if (cons->expr->ts.type == BT_CHARACTER && comp->ts.u.cl
	  && comp->ts.u.cl->length
	  && comp->ts.u.cl->length->expr_type == EXPR_CONSTANT

and UBSAN complained of loads from comp->ts.u.cl->length->expr_type of
integer value 1818451807 which is outside of the value range expr_t
enum.  If I understand the code correctly it the entire load was
unwanted because comp->ts.type in those cases is BT_CLASS and not
BT_CHARACTER.  This patch simply adds a check to make sure it is only
accessed in those cases.

During review, Harald Anlauf noticed that length types also need to be
checked and so I added also checks that he suggested to the condition.

Co-authored-by: Harald Anlauf <anlauf@gmx.de>

gcc/fortran/ChangeLog:

2023-08-14  Martin Jambor  <mjambor@suse.cz>

	PR fortran/110677
	* resolve.cc (resolve_structure_cons): Check comp->ts is character
	type before accessing stuff through comp->ts.u.cl.
---
 gcc/fortran/resolve.cc | 7 +++++--
 1 file changed, 5 insertions(+), 2 deletions(-)

diff --git a/gcc/fortran/resolve.cc b/gcc/fortran/resolve.cc
index e7c8d919bef..f51674f7faa 100644
--- a/gcc/fortran/resolve.cc
+++ b/gcc/fortran/resolve.cc
@@ -1396,11 +1396,14 @@ resolve_structure_cons (gfc_expr *expr, int init)
 	 the one of the structure, ensure this if the lengths are known at
  	 compile time and when we are dealing with PARAMETER or structure
 	 constructors.  */
-      if (cons->expr->ts.type == BT_CHARACTER && comp->ts.u.cl
-	  && comp->ts.u.cl->length
+      if (cons->expr->ts.type == BT_CHARACTER
+	  && comp->ts.type == BT_CHARACTER
+	  && comp->ts.u.cl && comp->ts.u.cl->length
 	  && comp->ts.u.cl->length->expr_type == EXPR_CONSTANT
 	  && cons->expr->ts.u.cl && cons->expr->ts.u.cl->length
 	  && cons->expr->ts.u.cl->length->expr_type == EXPR_CONSTANT
+	  && cons->expr->ts.u.cl->length->ts.type == BT_INTEGER
+	  && comp->ts.u.cl->length->ts.type == BT_INTEGER
 	  && mpz_cmp (cons->expr->ts.u.cl->length->value.integer,
 		      comp->ts.u.cl->length->value.integer) != 0)
 	{
-- 
2.41.0


^ permalink raw reply	[flat|nested] 3+ messages in thread

end of thread, other threads:[~2023-08-15 15:18 UTC | newest]

Thread overview: 3+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2023-08-14 17:39 [PATCH] Fortran: Avoid accessing gfc_charlen when not looking at BT_CHARACTER (PR 110677) Martin Jambor
2023-08-14 19:28 ` Harald Anlauf
2023-08-15 15:18   ` Martin Jambor

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).