From mboxrd@z Thu Jan  1 00:00:00 1970
Return-Path: <jason@redhat.com>
Received: from us-smtp-delivery-124.mimecast.com
 (us-smtp-delivery-124.mimecast.com [170.10.133.124])
 by sourceware.org (Postfix) with ESMTP id BD1AE395C401
 for <gcc-patches@gcc.gnu.org>; Wed, 28 Jul 2021 20:42:47 +0000 (GMT)
DMARC-Filter: OpenDMARC Filter v1.4.1 sourceware.org BD1AE395C401
Received: from mail-qk1-f200.google.com (mail-qk1-f200.google.com
 [209.85.222.200]) (Using TLS) by relay.mimecast.com with ESMTP id
 us-mta-48-odvyJc_uP4G8xL_A3RiJpQ-1; Wed, 28 Jul 2021 16:42:46 -0400
X-MC-Unique: odvyJc_uP4G8xL_A3RiJpQ-1
Received: by mail-qk1-f200.google.com with SMTP id
 w26-20020a05620a129ab02903b9eeb8b45dso2388714qki.8
 for <gcc-patches@gcc.gnu.org>; Wed, 28 Jul 2021 13:42:46 -0700 (PDT)
X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed;
 d=1e100.net; s=20161025;
 h=x-gm-message-state:subject:to:references:from:message-id:date
 :user-agent:mime-version:in-reply-to:content-language
 :content-transfer-encoding;
 bh=XhjHFJ8aF/2A04OZ5YbC3H6TUCIEv+oyL2c8+plbf3g=;
 b=PXMHql4ygW9hTJFfy/qljsyWDfmKgLtLo1mAu5mj0Ir8pX+SxyWgmeO6hxsa4Oqs6M
 h841R4MrTkiRVNnp8YDIA1QkQZ4nstI40OmL23LC47TmMNVFKaUhWTuWD9CXboj9+7bM
 HcWrBW9l1H/Fu+6/2w0Xo+T5KpfnZPaPYitmfPElohR/iP8kVPlnE0Saqxd+k36yQpPX
 oOr+aBek5Kpw4eRjssIXbsFqIrBXYyDcmkAvyoNSGUsH/YNNc5x60vFUaCleWlR8Z0nK
 Bev8GFtq4SzVt54s6x48EAYtGrmoqudsHE8Knm0McvSV8/W+eG+gpD9VqyTH0PsAlYe9
 KOrA==
X-Gm-Message-State: AOAM5323jdhy0tWBKPcGY4Bx15TvgPGLB7mx0CImvaVXeY2T8Vrfnnnq
 tPvWLnnxR6O0dLXvlhxwkCflclaNdqzX+7B9EpqH2FtvvvyMm7UqA3E+Yo/KR92aF6koY3Nld38
 4ATKWnZ9RjPUcOT9qZfEyzb+KYGU/trWZ8mUI740oZsp9GSB5lNgggqpUbAsg4IIrOQ==
X-Received: by 2002:a05:622a:283:: with SMTP id
 z3mr1344136qtw.312.1627504965477; 
 Wed, 28 Jul 2021 13:42:45 -0700 (PDT)
X-Google-Smtp-Source: ABdhPJyTe0x7ytBdovMoK+1EX13yeUOA74FV9b2QGR1WrYZm/xzcsYKosmXhZbL8MC13dJHWK+4VDg==
X-Received: by 2002:a05:622a:283:: with SMTP id
 z3mr1344116qtw.312.1627504965165; 
 Wed, 28 Jul 2021 13:42:45 -0700 (PDT)
Received: from [192.168.1.148]
 (130-44-159-43.s11817.c3-0.arl-cbr1.sbo-arl.ma.cable.rcncustomer.com.
 [130.44.159.43])
 by smtp.gmail.com with ESMTPSA id t64sm574061qkd.71.2021.07.28.13.42.43
 (version=TLS1_3 cipher=TLS_AES_128_GCM_SHA256 bits=128/128);
 Wed, 28 Jul 2021 13:42:44 -0700 (PDT)
Subject: Re: [PATCH] c++: Improve memory usage of subsumption [PR100828]
To: Patrick Palka <ppalka@redhat.com>, gcc-patches@gcc.gnu.org
References: <20210719220529.2446563-1-ppalka@redhat.com>
From: Jason Merrill <jason@redhat.com>
Message-ID: <81efcd70-e1ab-9c14-be96-3b1c70a2c64b@redhat.com>
Date: Wed, 28 Jul 2021 16:42:43 -0400
User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:78.0) Gecko/20100101
 Thunderbird/78.11.0
MIME-Version: 1.0
In-Reply-To: <20210719220529.2446563-1-ppalka@redhat.com>
X-Mimecast-Spam-Score: 0
X-Mimecast-Originator: redhat.com
Content-Type: text/plain; charset=utf-8; format=flowed
Content-Language: en-US
Content-Transfer-Encoding: 7bit
X-Spam-Status: No, score=-13.6 required=5.0 tests=BAYES_00, DKIMWL_WL_HIGH,
 DKIM_SIGNED, DKIM_VALID, DKIM_VALID_AU, DKIM_VALID_EF, GIT_PATCH_0,
 NICE_REPLY_A, RCVD_IN_DNSWL_LOW, RCVD_IN_MSPIKE_H4, RCVD_IN_MSPIKE_WL,
 SPF_HELO_NONE, SPF_NONE, TXREP autolearn=ham autolearn_force=no version=3.4.4
X-Spam-Checker-Version: SpamAssassin 3.4.4 (2020-01-24) on
 server2.sourceware.org
X-BeenThere: gcc-patches@gcc.gnu.org
X-Mailman-Version: 2.1.29
Precedence: list
List-Id: Gcc-patches mailing list <gcc-patches.gcc.gnu.org>
List-Unsubscribe: <https://gcc.gnu.org/mailman/options/gcc-patches>,
 <mailto:gcc-patches-request@gcc.gnu.org?subject=unsubscribe>
List-Archive: <https://gcc.gnu.org/pipermail/gcc-patches/>
List-Post: <mailto:gcc-patches@gcc.gnu.org>
List-Help: <mailto:gcc-patches-request@gcc.gnu.org?subject=help>
List-Subscribe: <https://gcc.gnu.org/mailman/listinfo/gcc-patches>,
 <mailto:gcc-patches-request@gcc.gnu.org?subject=subscribe>
X-List-Received-Date: Wed, 28 Jul 2021 20:42:49 -0000

On 7/19/21 6:05 PM, Patrick Palka wrote:
> Constraint subsumption is implemented in two steps.  The first step
> computes the disjunctive (or conjunctive) normal form of one of the
> constraints, and the second step verifies that each clause in the
> decomposed form implies the other constraint.   Performing these two
> steps separately is problematic because in the first step the
> disjunctive normal form can be exponentially larger than the original
> constraint, and by computing it ahead of time we'd have to keep all of
> it in memory.
> 
> This patch fixes this exponential blowup in memory usage by interleaving
> these two steps, so that as soon as we decompose one clause we check
> implication for it.  In turn, memory usage during subsumption is now
> worst case linear in the size of the constraints rather than
> exponential, and so we can safely remove the hard limit of 16 clauses
> without introducing runaway memory usage on some inputs.  (Note the
> _time_ complexity of subsumption is still exponential in the worst case.)
> 
> In order for this to work we need formula::branch to prepend the copy
> of the current clause directly after the current clause rather than
> at the end of the list, so that we fully decompose a clause shortly
> after creating it.  Otherwise we'd end up accumulating exponentially
> many (partially decomposed) clauses in memory anyway.
> 
> Bootstrapped and regtested on x86_64-pc-linux-gnu, and also tested on
> range-v3 and cmcstl2.  Does this look OK for trunk and perhaps 11?

OK for trunk.

> 	PR c++/100828
> 
> gcc/cp/ChangeLog:
> 
> 	* logic.cc (formula::formula): Use emplace_back.
> 	(formula::branch): Insert a copy of m_current in front of
> 	m_current instead of at the end of the list.
> 	(formula::erase): Define.
> 	(decompose_formula): Remove.
> 	(decompose_antecedents): Remove.
> 	(decompose_consequents): Remove.
> 	(derive_proofs): Remove.
> 	(max_problem_size): Remove.
> 	(diagnose_constraint_size): Remove.
> 	(subsumes_constraints_nonnull): Rewrite directly in terms of
> 	decompose_clause and derive_proof, interleaving decomposition
> 	with implication checking.  Use formula::erase to free the
> 	current clause before moving on to the next one.
> ---
>   gcc/cp/logic.cc | 118 ++++++++++++++----------------------------------
>   1 file changed, 35 insertions(+), 83 deletions(-)
> 
> diff --git a/gcc/cp/logic.cc b/gcc/cp/logic.cc
> index 142457e408a..3f872c11fe2 100644
> --- a/gcc/cp/logic.cc
> +++ b/gcc/cp/logic.cc
> @@ -223,9 +223,7 @@ struct formula
>   
>     formula (tree t)
>     {
> -    /* This should call emplace_back(). There's an extra copy being
> -       invoked by using push_back().  */
> -    m_clauses.push_back (t);
> +    m_clauses.emplace_back (t);
>       m_current = m_clauses.begin ();
>     }
>   
> @@ -248,8 +246,7 @@ struct formula
>     clause& branch ()
>     {
>       gcc_assert (!done ());
> -    m_clauses.push_back (*m_current);
> -    return m_clauses.back ();
> +    return *m_clauses.insert (std::next (m_current), *m_current);
>     }
>   
>     /* Returns the position of the current clause.  */
> @@ -287,6 +284,14 @@ struct formula
>       return m_clauses.end ();
>     }
>   
> +  /* Remove the specified clause.  */
> +
> +  void erase (iterator i)
> +  {
> +    gcc_assert (i != m_current);
> +    m_clauses.erase (i);
> +  }
> +
>     std::list<clause> m_clauses; /* The list of clauses.  */
>     iterator m_current; /* The current clause.  */
>   };
> @@ -659,39 +664,6 @@ decompose_clause (formula& f, clause& c, rules r)
>     f.advance ();
>   }
>   
> -/* Decompose the logical formula F according to the logical
> -   rules determined by R.  The result is a formula containing
> -   clauses that contain only atomic terms.  */
> -
> -void
> -decompose_formula (formula& f, rules r)
> -{
> -  while (!f.done ())
> -    decompose_clause (f, *f.current (), r);
> -}
> -
> -/* Fully decomposing T into a list of sequents, each comprised of
> -   a list of atomic constraints, as if T were an antecedent.  */
> -
> -static formula
> -decompose_antecedents (tree t)
> -{
> -  formula f (t);
> -  decompose_formula (f, left);
> -  return f;
> -}
> -
> -/* Fully decomposing T into a list of sequents, each comprised of
> -   a list of atomic constraints, as if T were a consequent.  */
> -
> -static formula
> -decompose_consequents (tree t)
> -{
> -  formula f (t);
> -  decompose_formula (f, right);
> -  return f;
> -}
> -
>   static bool derive_proof (clause&, tree, rules);
>   
>   /* Derive a proof of both operands of T.  */
> @@ -744,28 +716,6 @@ derive_proof (clause& c, tree t, rules r)
>     }
>   }
>   
> -/* Derive a proof of T from disjunctive clauses in F.  */
> -
> -static bool
> -derive_proofs (formula& f, tree t, rules r)
> -{
> -  for (formula::iterator i = f.begin(); i != f.end(); ++i)
> -    if (!derive_proof (*i, t, r))
> -      return false;
> -  return true;
> -}
> -
> -/* The largest number of clauses in CNF or DNF we accept as input
> -   for subsumption. This an upper bound of 2^16 expressions.  */
> -static int max_problem_size = 16;
> -
> -static inline bool
> -diagnose_constraint_size (tree t)
> -{
> -  error_at (input_location, "%qE exceeds the maximum constraint complexity", t);
> -  return false;
> -}
> -
>   /* Key/value pair for caching subsumption results. This associates a pair of
>      constraints with a boolean value indicating the result.  */
>   
> @@ -845,31 +795,33 @@ subsumes_constraints_nonnull (tree lhs, tree rhs)
>     if (bool *b = lookup_subsumption(lhs, rhs))
>       return *b;
>   
> -  int n1 = dnf_size (lhs);
> -  int n2 = cnf_size (rhs);
> -
> -  /* Make sure we haven't exceeded the largest acceptable problem.  */
> -  if (std::min (n1, n2) >= max_problem_size)
> -    {
> -      if (n1 < n2)
> -        diagnose_constraint_size (lhs);
> -      else
> -	diagnose_constraint_size (rhs);
> -      return false;
> -    }
> -
> -  /* Decompose the smaller of the two formulas, and recursively
> -     check for implication of the larger.  */
> -  bool result;
> -  if (n1 <= n2)
> -    {
> -      formula dnf = decompose_antecedents (lhs);
> -      result = derive_proofs (dnf, rhs, left);
> -    }
> +  tree x, y;
> +  rules r;
> +  if (dnf_size (lhs) <= cnf_size (rhs))
> +    /* When LHS looks simpler than RHS, we'll determine subsumption by
> +       decomposing LHS into its disjunctive normal form and checking that
> +       each (conjunctive) clause implies RHS.  */
> +    x = lhs, y = rhs, r = left;
>     else
> +    /* Otherwise, we'll determine subsumption by decomposing RHS into its
> +       conjunctive normal form and checking that each (disjunctive) clause
> +       implies LHS.  */
> +    x = rhs, y = lhs, r = right;
> +
> +  /* Decompose X into a list of sequents according to R, and recursively
> +     check for implication of Y.  */
> +  bool result = true;
> +  formula f (x);
> +  while (!f.done ())
>       {
> -      formula cnf = decompose_consequents (rhs);
> -      result = derive_proofs (cnf, lhs, right);
> +      auto i = f.current ();
> +      decompose_clause (f, *i, r);
> +      if (!derive_proof (*i, y, r))
> +	{
> +	  result = false;
> +	  break;
> +	}
> +      f.erase (i);
>       }
>   
>     return save_subsumption (lhs, rhs, result);
>