From: "Serhei Makarov"
To: "Keith Seitz"
Cc: Bunsen
Subject: Re: Initial findings of bunsen performance and questions
Date: Mon, 21 Sep 2020 16:08:16 -0400
Message-Id: <8e645045-04e6-412d-97d2-46a173d2fefa@www.fastmail.com>
In-Reply-To: <9709c97e-bb12-48dd-2c5c-a9efb35e55d1@redhat.com>
References: <30950cc2-5d7f-eb93-42b1-d1c7a9138e81@redhat.com> <9709c97e-bb12-48dd-2c5c-a9efb35e55d1@redhat.com>

On Fri, Sep 18, 2020, at 12:16 PM, Keith Seitz wrote:
> Yes, I will send you (privately) what I've been benchmarking.
> This is nearly unaltered bunsen repo, but I've added a few simple
> patches to fix some existing problems.

The repo looked ok, so I compared the time to load 10,000 testcases
while skipping different parts of the testrun deserialization process.

I found that the main performance culprit is the deserialization code
in Cursor. It performs a regexp match for every one of the 60,000
testcase results, which neatly explains the slowdown.

For the record: as I understand it, you wanted to store individual
testcases (consolidate_pass=False) so that there is a Cursor giving the
line range for every passing testcase in the .log. So fixing the Cursor
deserialization performance really is necessary.

Thankfully, the regexp parsing can be turned into an on-demand
operation that only runs when a particular Cursor's
line_start/line_end/testlog fields are accessed.

Commit 2ed3d13f76c3 should solve the issue, reducing the observed 58s
parsing time to 2s.

https://sourceware.org/git/?p=bunsen.git;a=commitdiff;h=2ed3d13f76c37418cfd4180b149bca0e533ec92f
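
To illustrate the on-demand idea, here is a minimal sketch. It is not the code from the commit: the class name LazyCursor, the "path:start-end" serialized form, and the use of a plain path string for testlog are all assumptions made for the example, and the real bunsen Cursor may differ on each point.

```python
import re

# Hypothetical serialized form "path:start-end"; bunsen's real format may differ.
_CURSOR_RE = re.compile(r"^(?P<path>.*):(?P<start>\d+)-(?P<end>\d+)$")


class LazyCursor:
    """Illustrative stand-in for a Cursor that defers its regexp parsing."""

    def __init__(self, serialized):
        self._serialized = serialized  # keep the raw string; no regexp work yet
        self._parsed = None            # filled in on first field access

    def _parse(self):
        # Run the regexp at most once, and only when a field is requested.
        if self._parsed is None:
            m = _CURSOR_RE.match(self._serialized)
            if m is None:
                raise ValueError("malformed cursor: %r" % self._serialized)
            self._parsed = (m.group("path"),
                            int(m.group("start")),
                            int(m.group("end")))
        return self._parsed

    @property
    def testlog(self):
        # A plain path string here; an assumption for the sketch.
        return self._parse()[0]

    @property
    def line_start(self):
        return self._parse()[1]

    @property
    def line_end(self):
        return self._parse()[2]

    def to_str(self):
        # Re-serializing an untouched cursor needs no parsing at all.
        return self._serialized
```

With this shape, constructing 60,000 cursors while loading a testrun only stores 60,000 strings. The regexp cost is paid per cursor, and only for those whose fields are actually read.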