From: Xinliang David Li <davidxl@google.com>
To: Rong Xu <xur@google.com>
Cc: GCC Patches <gcc-patches@gcc.gnu.org>
Subject: Re: [google][gcc-4_9] encode and compress cc1 option strings in gcov_module_info
Date: Tue, 06 Oct 2015 17:21:00 -0000 [thread overview]
Message-ID: <CAAkRFZ+m3YGOX=q7AZNtntN1NEX4NpnnC3er=Ds5myqSV1EYpA@mail.gmail.com> (raw)
In-Reply-To: <CAF1bQ=S4D8WGv-6VcfoZr7OPfVKy9Kz+hjsfHbsnxR-0V_X+eg@mail.gmail.com>
On Tue, Oct 6, 2015 at 9:26 AM, Rong Xu <xur@google.com> wrote:
> On Mon, Oct 5, 2015 at 5:33 PM, Xinliang David Li <davidxl@google.com> wrote:
>> unsigned ggc_memory = gcov_read_unsigned ();
>> + unsigned marker = 0, len = 0, k;
>> + char **string_array, *saved_cc1_strings;
>> +
>> for (unsigned j = 0; j < 7; j++)
>>
>>
>> Do not use hard coded number. Use the enum defined in coverage.c.
>
> OK.
>
>>
>>
>> + string_array[j] = xstrdup (gcov_read_string ());
>> +
>> + k = 0;
>> + for (unsigned j = 1; j < 7; j++)
>>
>> Do not use hard coded number.
>
> OK.
>
>>
>>
>> + {
>> + if (num_array[j] == 0)
>> + continue;
>> + marker += num_array[j];
>>
>> It is better to read if the name of variable 'marker' is changed to
>> 'j_end' or something similar
>>
>> For all the substrings of 'j' kind, there should be just one marker,
>> right? It looks like here you introduce one marker per string, not one
>> marker per string kind.
>
> I don't understand what you meant here. "marker" is fixed for each j
> substring (one option kind) -- it the end index of the sub-string
> array. k-loop is for each string.
>
That was a wrong comment from me. Discard it.
>>
>> + len += 3; /* [[<FLAG> */
>>
>> Same here for hard coded value.
>>
>> + for (; k < marker; k++)
>> + len += strlen (string_array[k]) + 1; /* 1 for delimter of ']' */
>>
>> Why do we need one ']' per string?
>
> This is because the options strings can contain space ' '. I cannot
> use space as the delimiter, neither is \0 as it is the end of the
> string of the encoded string.
Ok -- this allows you to avoid string copy during parsing.
>
>>
>>
>> + }
>> + saved_cc1_strings = (char *) xmalloc (len + 1);
>> + saved_cc1_strings[0] = 0;
>> +
>> + marker = 0;
>> + k = 0;
>> + for (unsigned j = 1; j < 7; j++)
>>
>> Same here for 7.
>
> will fix in the new patch.
>
>>
>> + {
>> + static const char lipo_string_flags[6] = {'Q', 'B', 'S',
>> 'D','I', 'C'};
>> + if (num_array[j] == 0)
>> + continue;
>> + marker += num_array[j];
>>
>> Suggest changing marker to j_end
> OK.
>
>>
>> + sprintf (saved_cc1_strings, "%s[[%c", saved_cc1_strings,
>> + lipo_string_flags[j - 1]);
>> + for (; k < marker; k++)
>> + {
>> + sprintf (saved_cc1_strings, "%s%s]", saved_cc1_strings,
>> + string_array[k]);
>>
>> +#define DELIMTER "[["
>>
>> Why double '[' ?
> I will change to single '['.
>
>>
>> +#define DELIMTER2 "]"
>> +#define QUOTE_PATH_FLAG 'Q'
>> +#define BRACKET_PATH_FLAG 'B'
>> +#define SYSTEM_PATH_FLAG 'S'
>> +#define D_U_OPTION_FLAG 'D'
>> +#define INCLUDE_OPTION_FLAG 'I'
>> +#define COMMAND_ARG_FLAG 'C'
>> +
>> +enum lipo_cc1_string_kind {
>> + k_quote_paths = 0,
>> + k_bracket_paths,
>> + k_system_paths,
>> + k_cpp_defines,
>> + k_cpp_includes,
>> + k_lipo_cl_args,
>> + num_lipo_cc1_string_kind
>> +};
>> +
>> +struct lipo_parsed_cc1_string {
>> + const char* source_filename;
>> + unsigned num[num_lipo_cc1_string_kind];
>> + char **strings[num_lipo_cc1_string_kind];
>> +};
>> +
>> +struct lipo_parsed_cc1_string *
>> +lipo_parse_saved_cc1_string (const char *src, char *str,
>> + bool parse_cl_args_only);
>> +void free_parsed_string (struct lipo_parsed_cc1_string *string);
>> +
>>
>> Declare above in a header file.
>
> OK.
>
>>
>>
>> /* Returns true if the command-line arguments stored in the given module-infos
>> are incompatible. */
>> bool
>> -incompatible_cl_args (struct gcov_module_info* mod_info1,
>> - struct gcov_module_info* mod_info2)
>> +incompatible_cl_args (struct lipo_parsed_cc1_string* mod_info1,
>> + struct lipo_parsed_cc1_string* mod_info2)
>>
>> Fix formating.
> OK.
>>
>> {
>> {
>> @@ -1647,7 +1679,7 @@ build_var (tree fn_decl, tree type, int counter)
>> /* Creates the gcov_fn_info RECORD_TYPE. */
>>
>> NULL_TREE, get_gcov_unsigned_t ());
>> DECL_CHAIN (field) = fields;
>> fields = field;
>>
>> - /* Num bracket paths */
>> + /* cc1_uncompressed_strlen field */
>> field = build_decl (BUILTINS_LOCATION, FIELD_DECL,
>> NULL_TREE, get_gcov_unsigned_t ());
>> DECL_CHAIN (field) = fields;
>> fields = field;
>>
>>
>> Why do we need to store uncompressed string length? If there is need
>> to do that, I suggest combine uncompressed length and compressed
>> length into one 32bit integer. (16/16, or 17/15 split)
>
> In theory, I don't need the uncompressed length, But I would need to
> guess the uncompressed length to allocate the buffer. If the
> decompressing fails with insufficient space, I need to have another
> guess and do it again. I think it's easier just save the uncompressed
> size to the struct. I think combine the two sizes into one
> gcov_unsigned_t is a good idea. We don't expect the string size are
> very big.
What is the usual compression ratio? If you guess it right most of the
time, just get rid of the uncompressed length.
David
>>
>>
>> David
>>
>> On Mon, Oct 5, 2015 at 3:51 PM, Rong Xu <xur@google.com> wrote:
>>> Hi,
>>>
>>> This patch is for google branch only.
>>>
>>> It encodes and compresses various cc1 option strings in
>>> gcov_module_info to reduce the lipo instrumented object size. The
>>> savings are from string compression and the reduced number of
>>> relocations.
>>>
>>> More specifically, we replace the following fields in gcov_module_info
>>> gcov_unsigned_t num_quote_paths;
>>> gcov_unsigned_t num_bracket_paths;
>>> gcov_unsigned_t num_system_paths;
>>> gcov_unsigned_t num_cpp_defines;
>>> gcov_unsigned_t num_cpp_includes;
>>> gcov_unsigned_t num_cl_args;
>>> char *string_array[1];
>>> with
>>> gcov_unsigned_t cc1_strlen;
>>> gcov_unsigned_t cc1_uncompressed_strlen;
>>> char *saved_cc1_strings;
>>>
>>> The new saved_cc1_strings are zlib compressed string.
>>>
>>> Tested with google internal benchmarks.
>>>
>>> Thanks,
>>>
>>> -Rong
next prev parent reply other threads:[~2015-10-06 17:21 UTC|newest]
Thread overview: 7+ messages / expand[flat|nested] mbox.gz Atom feed top
2015-10-05 22:51 Rong Xu
2015-10-06 0:33 ` Xinliang David Li
2015-10-06 16:26 ` Rong Xu
2015-10-06 17:21 ` Xinliang David Li [this message]
2015-10-06 18:30 ` Rong Xu
2015-10-06 18:40 ` Rong Xu
2015-10-06 23:37 ` Xinliang David Li
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to='CAAkRFZ+m3YGOX=q7AZNtntN1NEX4NpnnC3er=Ds5myqSV1EYpA@mail.gmail.com' \
--to=davidxl@google.com \
--cc=gcc-patches@gcc.gnu.org \
--cc=xur@google.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).