From mboxrd@z Thu Jan  1 00:00:00 1970
Return-Path: <ppalka@redhat.com>
Received: from us-smtp-delivery-124.mimecast.com
 (us-smtp-delivery-124.mimecast.com [170.10.133.124])
 by sourceware.org (Postfix) with ESMTP id 5375F3857C59
 for <gcc-patches@gcc.gnu.org>; Mon,  9 Aug 2021 21:07:33 +0000 (GMT)
DMARC-Filter: OpenDMARC Filter v1.4.1 sourceware.org 5375F3857C59
Received: from mail-ed1-f70.google.com (mail-ed1-f70.google.com
 [209.85.208.70]) (Using TLS) by relay.mimecast.com with ESMTP id
 us-mta-466-Csc5csRSNt6wn6NC9a-PIg-1; Mon, 09 Aug 2021 17:07:31 -0400
X-MC-Unique: Csc5csRSNt6wn6NC9a-PIg-1
Received: by mail-ed1-f70.google.com with SMTP id
 dh21-20020a0564021d35b02903be0aa37025so7314337edb.7
 for <gcc-patches@gcc.gnu.org>; Mon, 09 Aug 2021 14:07:31 -0700 (PDT)
X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed;
 d=1e100.net; s=20161025;
 h=x-gm-message-state:mime-version:references:in-reply-to:from:date
 :message-id:subject:to:cc;
 bh=/QfGx8v1Yqnp43NhxucA99gRS+ujfOGWFVQu59XOjn8=;
 b=Bg9M3XkoPsIq8lkAT8X2MiHzevUXhLu5GhzBsdNgRRolKXecp6tNu1XYjrd5HlJsNg
 2KF2FkcQHMhuSKW3y2Bbc87K1oCEuxc4KZgVGH1scqC4NZbQVCZySKcrdfovaMkg5j94
 PEMG0Dtpdvl/gTRDYw3vVMf6kZCKiV+Y4J33W/nQUzpCT8MU3jHMAceOt4yerDlwASKI
 Ra7Z7TDXNVa5FoV/wnoqMY70/CG4+bQEHxQrz7VUmiocF1jGV6PnrYftlcINXbBNwRc8
 f/0xAJdi8PW3dfi/67PmjvC9McCoO69S5jFmhMLZc0ExUXoEn1ASFSFA/gZvCwHIWq00
 13vQ==
X-Gm-Message-State: AOAM533a9dgig6Wd4Scnm3O0rQkJq6VDg0DdsbdpXW4BPK2TPyO1/CfC
 JQYi/dURaQzf3yES1o9TRCwR+PyFh7SWQx5NZHcpDbECcaV1Gfeew3P+63GHpWyUtuZWH/zRK+i
 Qj5iI1s3BcgdwZgR6FYlhJXotUv/1XwNchA==
X-Received: by 2002:a50:da0e:: with SMTP id z14mr313266edj.73.1628543250127;
 Mon, 09 Aug 2021 14:07:30 -0700 (PDT)
X-Google-Smtp-Source: ABdhPJxtd+OPL5OEXQQ/EhVRLOyPH2IMrJjTccIiKx5P8UnwlCqoO0EnRyWDs1kxBDvxVHae0Bno0ZCGDHSp/0BV2yw=
X-Received: by 2002:a50:da0e:: with SMTP id z14mr313242edj.73.1628543249879;
 Mon, 09 Aug 2021 14:07:29 -0700 (PDT)
MIME-Version: 1.0
References: <20210719220529.2446563-1-ppalka@redhat.com>
 <81efcd70-e1ab-9c14-be96-3b1c70a2c64b@redhat.com>
In-Reply-To: <81efcd70-e1ab-9c14-be96-3b1c70a2c64b@redhat.com>
From: Patrick Palka <ppalka@redhat.com>
Date: Mon, 9 Aug 2021 17:07:19 -0400
Message-ID: <CAMOnLZZMQGjfF3ynHzCX98=N2kS_Ui1nD8rnRegRGiu_C2dEVg@mail.gmail.com>
Subject: Re: [PATCH] c++: Improve memory usage of subsumption [PR100828]
To: Jason Merrill <jason@redhat.com>
Cc: GCC Patches <gcc-patches@gcc.gnu.org>
X-Mimecast-Spam-Score: 0
X-Mimecast-Originator: redhat.com
Content-Type: text/plain; charset="UTF-8"
X-Spam-Status: No, score=-14.0 required=5.0 tests=BAYES_00, DKIMWL_WL_HIGH,
 DKIM_SIGNED, DKIM_VALID, DKIM_VALID_AU, DKIM_VALID_EF, GIT_PATCH_0,
 RCVD_IN_DNSWL_LOW, RCVD_IN_MSPIKE_H2, SPF_HELO_NONE, SPF_NONE,
 TXREP autolearn=ham autolearn_force=no version=3.4.4
X-Spam-Checker-Version: SpamAssassin 3.4.4 (2020-01-24) on
 server2.sourceware.org
X-BeenThere: gcc-patches@gcc.gnu.org
X-Mailman-Version: 2.1.29
Precedence: list
List-Id: Gcc-patches mailing list <gcc-patches.gcc.gnu.org>
List-Unsubscribe: <https://gcc.gnu.org/mailman/options/gcc-patches>,
 <mailto:gcc-patches-request@gcc.gnu.org?subject=unsubscribe>
List-Archive: <https://gcc.gnu.org/pipermail/gcc-patches/>
List-Post: <mailto:gcc-patches@gcc.gnu.org>
List-Help: <mailto:gcc-patches-request@gcc.gnu.org?subject=help>
List-Subscribe: <https://gcc.gnu.org/mailman/listinfo/gcc-patches>,
 <mailto:gcc-patches-request@gcc.gnu.org?subject=subscribe>
X-List-Received-Date: Mon, 09 Aug 2021 21:07:43 -0000

On Wed, Jul 28, 2021 at 4:42 PM Jason Merrill <jason@redhat.com> wrote:
>
> On 7/19/21 6:05 PM, Patrick Palka wrote:
> > Constraint subsumption is implemented in two steps.  The first step
> > computes the disjunctive (or conjunctive) normal form of one of the
> > constraints, and the second step verifies that each clause in the
> > decomposed form implies the other constraint.   Performing these two
> > steps separately is problematic because in the first step the
> > disjunctive normal form can be exponentially larger than the original
> > constraint, and by computing it ahead of time we'd have to keep all of
> > it in memory.
> >
> > This patch fixes this exponential blowup in memory usage by interleaving
> > these two steps, so that as soon as we decompose one clause we check
> > implication for it.  In turn, memory usage during subsumption is now
> > worst case linear in the size of the constraints rather than
> > exponential, and so we can safely remove the hard limit of 16 clauses
> > without introducing runaway memory usage on some inputs.  (Note the
> > _time_ complexity of subsumption is still exponential in the worst case.)
> >
> > In order for this to work we need formula::branch to prepend the copy
> > of the current clause directly after the current clause rather than
> > at the end of the list, so that we fully decompose a clause shortly
> > after creating it.  Otherwise we'd end up accumulating exponentially
> > many (partially decomposed) clauses in memory anyway.
> >
> > Bootstrapped and regtested on x86_64-pc-linux-gnu, and also tested on
> > range-v3 and cmcstl2.  Does this look OK for trunk and perhaps 11?
>
> OK for trunk.

Thanks a lot, patch committed to trunk as r12-2658.  Since this low
complexity limit was introduced in GCC 10, what do you think about
increasing the limit from 16 to say 128 in the 10/11 release branches
as a relatively safe stopgap?

>
> >       PR c++/100828
> >
> > gcc/cp/ChangeLog:
> >
> >       * logic.cc (formula::formula): Use emplace_back.
> >       (formula::branch): Insert a copy of m_current in front of
> >       m_current instead of at the end of the list.
> >       (formula::erase): Define.
> >       (decompose_formula): Remove.
> >       (decompose_antecedents): Remove.
> >       (decompose_consequents): Remove.
> >       (derive_proofs): Remove.
> >       (max_problem_size): Remove.
> >       (diagnose_constraint_size): Remove.
> >       (subsumes_constraints_nonnull): Rewrite directly in terms of
> >       decompose_clause and derive_proof, interleaving decomposition
> >       with implication checking.  Use formula::erase to free the
> >       current clause before moving on to the next one.
> > ---
> >   gcc/cp/logic.cc | 118 ++++++++++++++----------------------------------
> >   1 file changed, 35 insertions(+), 83 deletions(-)
> >
> > diff --git a/gcc/cp/logic.cc b/gcc/cp/logic.cc
> > index 142457e408a..3f872c11fe2 100644
> > --- a/gcc/cp/logic.cc
> > +++ b/gcc/cp/logic.cc
> > @@ -223,9 +223,7 @@ struct formula
> >
> >     formula (tree t)
> >     {
> > -    /* This should call emplace_back(). There's an extra copy being
> > -       invoked by using push_back().  */
> > -    m_clauses.push_back (t);
> > +    m_clauses.emplace_back (t);
> >       m_current = m_clauses.begin ();
> >     }
> >
> > @@ -248,8 +246,7 @@ struct formula
> >     clause& branch ()
> >     {
> >       gcc_assert (!done ());
> > -    m_clauses.push_back (*m_current);
> > -    return m_clauses.back ();
> > +    return *m_clauses.insert (std::next (m_current), *m_current);
> >     }
> >
> >     /* Returns the position of the current clause.  */
> > @@ -287,6 +284,14 @@ struct formula
> >       return m_clauses.end ();
> >     }
> >
> > +  /* Remove the specified clause.  */
> > +
> > +  void erase (iterator i)
> > +  {
> > +    gcc_assert (i != m_current);
> > +    m_clauses.erase (i);
> > +  }
> > +
> >     std::list<clause> m_clauses; /* The list of clauses.  */
> >     iterator m_current; /* The current clause.  */
> >   };
> > @@ -659,39 +664,6 @@ decompose_clause (formula& f, clause& c, rules r)
> >     f.advance ();
> >   }
> >
> > -/* Decompose the logical formula F according to the logical
> > -   rules determined by R.  The result is a formula containing
> > -   clauses that contain only atomic terms.  */
> > -
> > -void
> > -decompose_formula (formula& f, rules r)
> > -{
> > -  while (!f.done ())
> > -    decompose_clause (f, *f.current (), r);
> > -}
> > -
> > -/* Fully decomposing T into a list of sequents, each comprised of
> > -   a list of atomic constraints, as if T were an antecedent.  */
> > -
> > -static formula
> > -decompose_antecedents (tree t)
> > -{
> > -  formula f (t);
> > -  decompose_formula (f, left);
> > -  return f;
> > -}
> > -
> > -/* Fully decomposing T into a list of sequents, each comprised of
> > -   a list of atomic constraints, as if T were a consequent.  */
> > -
> > -static formula
> > -decompose_consequents (tree t)
> > -{
> > -  formula f (t);
> > -  decompose_formula (f, right);
> > -  return f;
> > -}
> > -
> >   static bool derive_proof (clause&, tree, rules);
> >
> >   /* Derive a proof of both operands of T.  */
> > @@ -744,28 +716,6 @@ derive_proof (clause& c, tree t, rules r)
> >     }
> >   }
> >
> > -/* Derive a proof of T from disjunctive clauses in F.  */
> > -
> > -static bool
> > -derive_proofs (formula& f, tree t, rules r)
> > -{
> > -  for (formula::iterator i = f.begin(); i != f.end(); ++i)
> > -    if (!derive_proof (*i, t, r))
> > -      return false;
> > -  return true;
> > -}
> > -
> > -/* The largest number of clauses in CNF or DNF we accept as input
> > -   for subsumption. This an upper bound of 2^16 expressions.  */
> > -static int max_problem_size = 16;
> > -
> > -static inline bool
> > -diagnose_constraint_size (tree t)
> > -{
> > -  error_at (input_location, "%qE exceeds the maximum constraint complexity", t);
> > -  return false;
> > -}
> > -
> >   /* Key/value pair for caching subsumption results. This associates a pair of
> >      constraints with a boolean value indicating the result.  */
> >
> > @@ -845,31 +795,33 @@ subsumes_constraints_nonnull (tree lhs, tree rhs)
> >     if (bool *b = lookup_subsumption(lhs, rhs))
> >       return *b;
> >
> > -  int n1 = dnf_size (lhs);
> > -  int n2 = cnf_size (rhs);
> > -
> > -  /* Make sure we haven't exceeded the largest acceptable problem.  */
> > -  if (std::min (n1, n2) >= max_problem_size)
> > -    {
> > -      if (n1 < n2)
> > -        diagnose_constraint_size (lhs);
> > -      else
> > -     diagnose_constraint_size (rhs);
> > -      return false;
> > -    }
> > -
> > -  /* Decompose the smaller of the two formulas, and recursively
> > -     check for implication of the larger.  */
> > -  bool result;
> > -  if (n1 <= n2)
> > -    {
> > -      formula dnf = decompose_antecedents (lhs);
> > -      result = derive_proofs (dnf, rhs, left);
> > -    }
> > +  tree x, y;
> > +  rules r;
> > +  if (dnf_size (lhs) <= cnf_size (rhs))
> > +    /* When LHS looks simpler than RHS, we'll determine subsumption by
> > +       decomposing LHS into its disjunctive normal form and checking that
> > +       each (conjunctive) clause implies RHS.  */
> > +    x = lhs, y = rhs, r = left;
> >     else
> > +    /* Otherwise, we'll determine subsumption by decomposing RHS into its
> > +       conjunctive normal form and checking that each (disjunctive) clause
> > +       implies LHS.  */
> > +    x = rhs, y = lhs, r = right;
> > +
> > +  /* Decompose X into a list of sequents according to R, and recursively
> > +     check for implication of Y.  */
> > +  bool result = true;
> > +  formula f (x);
> > +  while (!f.done ())
> >       {
> > -      formula cnf = decompose_consequents (rhs);
> > -      result = derive_proofs (cnf, lhs, right);
> > +      auto i = f.current ();
> > +      decompose_clause (f, *i, r);
> > +      if (!derive_proof (*i, y, r))
> > +     {
> > +       result = false;
> > +       break;
> > +     }
> > +      f.erase (i);
> >       }
> >
> >     return save_subsumption (lhs, rhs, result);
> >
>