Date: Thu, 12 Apr 2012 10:30:00 -0000
Subject: Re: Switching to C++ by default in 4.8
From: Richard Guenther
To: Chiheng Xu
Cc: Lawrence Crowl, Jakub Jelinek, Xinliang David Li, Bernd Schmidt, Gabriel Dos Reis, David Edelsohn, Diego Novillo, gcc

On Thu, Apr 12, 2012 at 11:28 AM, Chiheng Xu wrote:
>
> The reason why GCC's code is very hard to hack is not simple. In part,
> this is because GCC uses a very old, extremely hard to understand build
> system. In part, this is because GCC developers are more focused on
> fixing bugs or adding new features, rather than re-factoring GCC's
> code itself.
> For example, for a .c file that is 15 years old,
> people tend to fix its bugs, making it more and more ugly, rather than
> rewrite it.
>
> But I think the big reason is that GCC tends to have extremely large
> .c files, typically > 6000 LOC. If you look at LLVM, there are
> rarely source code files that are > 2000 LOC. Typical LLVM source code
> files have 1000~2000 LOC. Just separating a source code file of 6000
> LOC into several small files or file sections of 1000 LOC can improve
> the code significantly. Why has this not been done before? GCC
> developers being reluctant to re-factor their code may be the reason.
> And, as a .c file grows, it becomes even harder to re-factor.
> Thinking in C++ can help you write smaller, easier to understand,
> easier to maintain code (C or C++), which has high cohesion and low
> coupling.
>
> And I think the file names of GCC's source can also be changed to be more
> friendly to newbies; using some notion of FQN (fully qualified name)
> may be good.

I think one of the reasons is a tools deficiency - at least subversion (which
we use) is not able to track code motion, so if you dig in the revision history
you will need more intermediate steps, but more importantly, rely on 2nd level
information (like the ChangeLog entry) to tell where a function was moved from.

Still some refactoring happens (I think mostly trying to remove APIs is
important). But yes, I think we never renamed files ... I suppose when
we start moving things into sub-directories that would be a good time
to re-think names. At least subversion can handle file-renames just fine ;)

Yes, files are too big - but splitting them is not easy unless you can figure
out a hierarchy that you can expose. The largest file is dwarf2out.c with
22825 lines, but the average is more like 2000 (just looking at gcc/*.c files).
There are only 23 files bigger than 6000 lines (out of 356), so the situation
is not as bad as you paint it.
But yes, looking at filenames hardly tells you about their contents anymore.

Richard.
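
The line counts quoted above are easy to recompute. Below is a minimal Python
sketch, assuming a local checkout with a gcc/ directory. The function names
(line_count, summarize) and the 6000-line default threshold are illustrative
choices, with the threshold mirroring the figure used in the discussion. It is
not a record of how the numbers in this message were actually obtained.

```python
from pathlib import Path


def line_count(path):
    """Count newline characters in a file, similar to `wc -l`."""
    with open(path, "rb") as f:
        # Read in fixed-size chunks so very large files are not loaded whole.
        return sum(chunk.count(b"\n") for chunk in iter(lambda: f.read(1 << 16), b""))


def summarize(directory, pattern="*.c", threshold=6000):
    """Return size statistics for files directly inside `directory`.

    `pattern` and `threshold` are assumptions for illustration; the search
    is non-recursive, matching the "just looking at gcc/*.c" scope.
    Returns None when no file matches.
    """
    counts = {p.name: line_count(p) for p in Path(directory).glob(pattern) if p.is_file()}
    if not counts:
        return None
    largest = max(counts, key=counts.get)
    return {
        "files": len(counts),
        "average": sum(counts.values()) / len(counts),
        "largest": (largest, counts[largest]),
        "over_threshold": sum(1 for n in counts.values() if n > threshold),
    }
```

Calling summarize("gcc") from the top of a source tree would report the number
of matching files, their average length, the largest file, and how many exceed
the threshold. The results depend on the revision checked out.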