From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: (qmail 9921 invoked by alias); 14 Oct 2013 10:46:00 -0000 Mailing-List: contact gcc-bugs-help@gcc.gnu.org; run by ezmlm Precedence: bulk List-Id: List-Archive: List-Post: List-Help: Sender: gcc-bugs-owner@gcc.gnu.org Received: (qmail 9882 invoked by uid 48); 14 Oct 2013 10:45:57 -0000 From: "rguenth at gcc dot gnu.org" To: gcc-bugs@gcc.gnu.org Subject: [Bug tree-optimization/57742] memset(malloc(n),0,n) -> calloc(n,1) Date: Mon, 14 Oct 2013 10:46:00 -0000 X-Bugzilla-Reason: CC X-Bugzilla-Type: changed X-Bugzilla-Watch-Reason: None X-Bugzilla-Product: gcc X-Bugzilla-Component: tree-optimization X-Bugzilla-Version: 4.9.0 X-Bugzilla-Keywords: X-Bugzilla-Severity: enhancement X-Bugzilla-Who: rguenth at gcc dot gnu.org X-Bugzilla-Status: NEW X-Bugzilla-Priority: P3 X-Bugzilla-Assigned-To: unassigned at gcc dot gnu.org X-Bugzilla-Target-Milestone: --- X-Bugzilla-Flags: X-Bugzilla-Changed-Fields: Message-ID: In-Reply-To: References: Content-Type: text/plain; charset="UTF-8" Content-Transfer-Encoding: 7bit X-Bugzilla-URL: http://gcc.gnu.org/bugzilla/ Auto-Submitted: auto-generated MIME-Version: 1.0 X-SW-Source: 2013-10/txt/msg00790.txt.bz2 http://gcc.gnu.org/bugzilla/show_bug.cgi?id=57742 --- Comment #5 from Richard Biener --- (In reply to Marc Glisse from comment #4) > (In reply to Richard Biener from comment #2) > > (In reply to Marc Glisse from comment #1) > > > This is a very limited version of this optimization. It is in > > > simplify_builtin_call, so only triggers if malloc/calloc is > > > SSA_NAME_DEF_STMT(gimple_vuse(memset_stmt)). However, generalizing it means > > > we would need plenty of tests protecting against cases where the > > > transformation would be wrong. Note that this transforms: > > > p=malloc(n); > > > if(cond)memset(p,0,n); > > > into: > > > p=calloc(n,1); > > > cond; > > > which is good if cond is p!=0 but may not always be so great otherwise. > > > > ;) post-dominator tests (or simply tests whether both calls are in the > > same basic-block ...). > > Same basic block is quite limited, and for the condition below we don't > directly have post-domination, we would need post-domination between the bbs > with gimple_cond and malloc, and the bb of memset with the landing block of > the gimple_cond. But even finding the gimple_cond in: malloc; loop; cond; > loop; memset; can be hard. I guess I'll have to limit my expectations a > bit... > > > Also you can transform > > > > p = malloc (n); > > if (p) > > memset (p, 0, n); > > > > which might be a common-enough case to optimize for. > > Yes, that's the goal. > > > dereferencing a double wouldn't have a VDEF (unless you store a double). > > I do want to be able to store in between, so I think I have to walk the vdef > chain. But as soon as I do that, I need to make sure that the writes are to > places that can't alias, which complicates things a lot (and it can get a > bit expensive in a function with many memset). Consider this program: > > #include > void f(void*p,int n){ new(p)std::vector(n,0); } > > With -O3, we end up with: > > _27 = operator new (_26); > MEM[(struct _Vector_base *)p_4(D)]._M_impl._M_start = _27; > MEM[(struct _Vector_base *)p_4(D)]._M_impl._M_finish = _27; > _16 = _27 + _26; > MEM[(struct _Vector_base *)p_4(D)]._M_impl._M_end_of_storage = _16; > __builtin_memset (_27, 0, _26); > > which has memory stores between the allocation and memset. That's exactly > the type of code where I'd want the optimization to apply. Joost's example > has the same pattern: malloc, test for 0, several unrelated memory stores, > memset. We have walk_aliased_vdefs for this. Basically the first callback you receive has to be the malloc, otherwise there is an aliasing stmt inbetween. Initialize the ao_ref with ao_ref_init_from_ptr_and_size. > (how to handle the fact that we have operator new and not malloc is a > different issue, I am thinking of having a mode/flag where we promise not to > replace operator new so it can be inlined, which will include an if(p!=0) > test) > > It would be great (in particular for application-specific plugins) to have > an easy way to say things like: this is the next read/write use of this > memory region (but other memory regions may be used in between), and it > isn't post-dominated only because of this gimple_cond, etc. It's almost > noon, too late to be dreaming ;-) See above ;)