From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: (qmail 7638 invoked by alias); 29 Apr 2014 08:40:10 -0000 Mailing-List: contact gsl-discuss-help@sourceware.org; run by ezmlm Precedence: bulk List-Id: List-Subscribe: List-Archive: List-Post: List-Help: , Sender: gsl-discuss-owner@sourceware.org Received: (qmail 7628 invoked by uid 89); 29 Apr 2014 08:40:09 -0000 Authentication-Results: sourceware.org; auth=none X-Virus-Found: No X-Spam-SWARE-Status: No, score=-2.9 required=5.0 tests=AWL,BAYES_00,FREEMAIL_FROM,RCVD_IN_DNSWL_LOW,SPF_PASS autolearn=ham version=3.3.2 X-HELO: mail-pb0-f51.google.com Received: from mail-pb0-f51.google.com (HELO mail-pb0-f51.google.com) (209.85.160.51) by sourceware.org (qpsmtpd/0.93/v0.84-503-g423c35a) with (AES128-SHA encrypted) ESMTPS; Tue, 29 Apr 2014 08:40:00 +0000 Received: by mail-pb0-f51.google.com with SMTP id rq2so1936085pbb.10 for ; Tue, 29 Apr 2014 01:39:59 -0700 (PDT) MIME-Version: 1.0 X-Received: by 10.66.162.74 with SMTP id xy10mr31303692pab.4.1398760799198; Tue, 29 Apr 2014 01:39:59 -0700 (PDT) Received: by 10.70.37.42 with HTTP; Tue, 29 Apr 2014 01:39:59 -0700 (PDT) In-Reply-To: <535EFCA5.8050807@colorado.edu> References: <535DAFC0.8090402@colorado.edu> <535ED7A5.7020309@lanl.gov> <535EFCA5.8050807@colorado.edu> Date: Tue, 29 Apr 2014 08:40:00 -0000 Message-ID: Subject: Re: switched sparse format to binary trees From: Frank Reininghaus To: gsl-discuss@sourceware.org Content-Type: text/plain; charset=UTF-8 X-SW-Source: 2014-q2/txt/msg00016.txt.bz2 Hi, 2014-04-29 3:13 GMT+02:00 Patrick Alken: > On 04/28/2014 04:35 PM, Gerard Jungman wrote: >> >> On 04/27/2014 07:32 PM, Patrick Alken wrote: >> >>> This actually makes a huge speed improvement in duplicate detection and >>> element retrieval for large sparse matrices. There may however be a >>> performance hit since a new tree node must be allocated each time an >>> element is added to the matrix. But I believe the tradeoff is worth it. >> >> >> For what it's worth, the right thing to do is probably >> to expose the avl allocation strategy to the GSL client >> in some way. It could be as simple as an extra argument >> to gsl_spmatrix_alloc(), which could be a pointer to a >> generic allocator strategy struct (similar or identical >> to 'struct libavl_allocator'). This is the most general >> solution, which requires that clients who care be able >> to write their own allocator. > > Interesting idea...I'll think a bit more on it. Unfortunately I don't > think there is any easy way to preallocate a binary tree. Even if the > user knows ahead of time how many elements they want to add to the > sparse matrix, its not so easy to preallocate the whole tree, since > after each insertion, the tree rebalances itself to minimize its height. you can create an array of many tree nodes in a single allocation though. The allocator struct could then store a pointer to this array, the total size of the array, and the index of the next node that is not in use yet. When a new node is inserted into the tree, one would just take a pointer to the next unused node instead of calling malloc, increase the index in the allocator struct, and perform a new allocation only if all nodes are in use. The advantage is that the amortized cost of this approach is much lower than doing a malloc for each node, and it will also save memory because malloc adds a certain amount of overhead to each individual allocation for its internal bookkeeping. And saving memory for each individual node also means that the entire tree will become more cache-friendly and thus possibly more performant. If individual nodes could be deleted at some point, then the allocator needed some bookkeeping of reusable nodes of its own, but the only node deletion that can take place is that all nodes are deleted together, then this would probably be quite straightforward to implement. Best regards, Frank