* Linemap and pph
@ 2011-07-22 20:40 Gabriel Charette
  2011-07-22 21:15 ` Tom Tromey
  0 siblings, 1 reply; 15+ messages in thread
From: Gabriel Charette @ 2011-07-22 20:40 UTC (permalink / raw)
To: Diego Novillo, tromey; +Cc: Lawrence Crowl, gcc, Collin Winter

Hi,

@tromey: We have a question for you: the problem is detailed here and our question to you is at the bottom. Thanks!

So the line problem we are having in pph is much bigger than we originally thought. The problem is:

As we've had issues with before, the chain of name bindings in any cpp_bindings is inserted at the head, such that when we stream it back in, the last one put on the chain by the parser is read first. This is a problem in the linemap case because when we read the last name, we also read its input_location (line and column) and try to replay that location by calling linemap_line_start (so far so good....). The issue is that linemap_line_start EXPECTS to be called with lines in increasing order (but since the bindings are backwards in the chain we actually call it with lines in decreasing order). The logic in linemap_line_start is such that if the line# is less than the previous line# it was called with, it assumes this must be a new file and creates a new file entry in the line_table by calling linemap_add (which in our case is WRONG!!).

From a comment found in libcpp/line-map.c, linemaps clearly assume this fact: "Since the set (line_table) is built chronologically, the logical lines are monotonic increasing, and so the list is sorted and we can use a binary search."

@crowl: Maybe this is the issue with resume_scope?? I'm just thinking that since it seems lookups are fairly tied to the line_table, and the fact that it should be built in a monotonically increasing order.

In the past we have solved similar issues (caused by the backwards ordering) by replaying whatever it was in the correct order (before doing anything else), and then simply loading references using the pph_cache when the real ones were needed. BUT we can't do this with source_locations as they are a simple typedef unsigned int, not actual structs which are passed by pointer value...

Potential solution?: I thought of a few ways to solve this:

1) Catch the calls to the linemap functions when generating the pph, store them with the pph and replay them first before anything else when reading the pph file. The problem with that is that when trying to obtain the source location for a given binding streamed in (after calling linemap_line_start when needed), we call linemap_position_for_column, which ASSUMES that we are currently on the last line which linemap_line_start was called with. Thus doing the full replay first would not really work here. Is there another function we can call that can take a line as an argument as well as a column??

2) We could potentially output a reference to the source_location and use that to do what we always do using the cache in this sort of situation. However, we would need to figure out exactly which integer we need to output a reference to (since they are passed around by value...). This would be kinda hacky, and would also imply building some hackery to go around the fact that the lto version of the input location doesn't need to use references (i.e. we would need a streamer hook, but two different function signatures...). I don't think this is a nice way of going about it, but it could be an option...

@tromey: I hear you are the person in the know when it comes down to linemaps, do you have any hint on how we could rebuild a valid line_table given we are obtaining the bindings from the last one in the file to the first one (that is, using pph (pre-parsed headers) where we are trying to restore a cached version of the parser state for a header)? Thanks, Gabriel ^ permalink raw reply [flat|nested] 15+ messages in thread
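
[Editorial sketch] The behaviour described in this message can be reproduced with a small toy model. This is not the libcpp code: every `toy_` name, the fixed 7 column bits and the 64-map limit are assumptions made for illustration. The only thing it models is the rule stated above, namely that a line number lower than the previous one is treated as a change of file and opens a new map.

```c
#include <assert.h>
#include <stddef.h>

/* Toy model only: all toy_* names are hypothetical, not libcpp API. */
typedef unsigned int toy_location;

enum { TOY_MAX_MAPS = 64, TOY_COLUMN_BITS = 7 };

struct toy_map {
  unsigned int to_line;        /* first source line covered by this map */
  toy_location start_location; /* location handed out for to_line */
};

struct toy_table {
  struct toy_map maps[TOY_MAX_MAPS];
  size_t used;                   /* number of maps created so far */
  unsigned int highest_line;     /* last line passed to toy_line_start */
  toy_location highest_location; /* last location handed out */
};

void toy_init(struct toy_table *t)
{
  t->used = 0;
  t->highest_line = 0;
  t->highest_location = 0;
}

/* Return a location for the start of LINE.  A line lower than the previous
   one is assumed to mean "new file", so a fresh map is opened for it. */
toy_location toy_line_start(struct toy_table *t, unsigned int line)
{
  struct toy_map *m;
  toy_location loc;

  if (t->used == 0 || line < t->highest_line) {
    assert(t->used < TOY_MAX_MAPS);
    m = &t->maps[t->used++];
    m->to_line = line;
    /* New maps always start above every location handed out so far. */
    m->start_location = t->highest_location + (1u << TOY_COLUMN_BITS);
  }

  m = &t->maps[t->used - 1];
  loc = m->start_location + ((line - m->to_line) << TOY_COLUMN_BITS);
  t->highest_line = line;
  t->highest_location = loc;
  return loc;
}
```

Feeding lines 1, 2, 3 to this model leaves a single map in the table, while feeding 3, 2, 1 (the order in which the bindings are streamed back in) creates one map per call, which is the spurious "new file" effect described above.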
* Re: Linemap and pph 2011-07-22 20:40 Linemap and pph Gabriel Charette @ 2011-07-22 21:15 ` Tom Tromey 2011-07-22 22:12 ` Gabriel Charette 0 siblings, 1 reply; 15+ messages in thread From: Tom Tromey @ 2011-07-22 21:15 UTC (permalink / raw) To: Gabriel Charette; +Cc: Diego Novillo, Lawrence Crowl, gcc, Collin Winter >>>>> "Gabriel" == Gabriel Charette <gchare@google.com> writes: Gabriel> @tromey: We have a question for you: the problem is detailed Gabriel> here and our question to you is at the bottom. Thanks! Sure. I have not followed PPH progress very closely and after reading your message I am not sure I have much to offer. I'll give it a stab anyhow. Gabriel> This is a problem in the linemap case because when we read the last Gabriel> name, we also read it's input_location (line and column) and try to Gabriel> replay that location by calling linemap_line_start (so far so Gabriel> good....). This issue is that linemap_line_start EXPECTS to be called Gabriel> with lines in increasing orders (but since the bindings are backwards Gabriel> in the chain we actually call it with lines in a decreasing order), Gabriel> the logic in linemap_line_start is such that if the line# is less then Gabriel> the previous line# it was called with, it assumes this must be a new Gabriel> file and creates a new file entry in the line_table by calling Gabriel> linemap_add (which in our case is WRONG!!). My belief is that linemap was designed to support the typical case for a C and C++ compiler. However, this is not sacrosanct; you could perhaps change the implementation to work better for your case. I suspect, though, that this won't be easy -- and also be aware of Dodji's patch series, which complicates linemap. 
Gabriel> In the past we have solved similar issues (caused by the backwards Gabriel> ordering), by replaying whatever it was in the correct order (before Gabriel> doing anything else), and then simply loading references using the Gabriel> pph_cache when the real ones were needed. BUT we can't do this with Gabriel> source_locations as they are a simple typedef unsigned int, not actual Gabriel> structs which are passed by pointer value... A source_location is a reference to a particular line_map. It is sort of like a compressed pointer. Gabriel> @tromey: I hear you are the person in the know when it comes down to Gabriel> linemaps, do you have any hint on how we could rebuild a valid Gabriel> line_table given we are obtaining the bindings from the last one in Gabriel> the file to the first one (that is, using pph (pre-parsed headers) Gabriel> where we are trying to restore a cached version of the parser state Gabriel> for a header)? Can you not just serialize the line map when creating the PPH? Then, when using the PPH, read the serialized line map, in order, into the current global line map. This will preserve the include chains and all the rest. Then rewrite source locations coming from the PPH to new locations from the new line map. This rewriting could perhaps even be done efficiently, say just by adding an offset to all locations -- though I am not sure, this would require some research. (It seems doable to me, though; perhaps, but not definitely, requiring some additional linemap API.) I have no idea if this is practical for you. I guess it depends on how a PPH is read in. Tom ^ permalink raw reply [flat|nested] 15+ messages in thread
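
[Editorial sketch] The "add an offset to all locations" idea can be illustrated as follows. It is a minimal sketch under stated assumptions, not a description of any existing linemap API: the `toy_` types and functions are hypothetical, and it assumes the serialized maps occupy one contiguous range of locations that is re-inserted at the next free location of the reading compilation.

```c
#include <assert.h>
#include <stddef.h>

/* Toy model only: all toy_* names are hypothetical, not libcpp API. */
typedef unsigned int toy_location;

struct toy_map {
  toy_location start_location;
  unsigned int to_line;
};

/* Locations [first, limit) covered by the maps when they were written. */
struct toy_saved_range {
  toy_location first;
  toy_location limit;
};

/* Signed distance between where the maps lived when they were saved and
   where they will live in the reader's table. */
long long toy_rebase_offset(toy_location saved_first,
                            toy_location reader_next_free)
{
  return (long long) reader_next_free - (long long) saved_first;
}

/* Shift every de-serialized map into the reader's location space. */
void toy_rebase_maps(struct toy_map *maps, size_t n, long long offset)
{
  size_t i;
  for (i = 0; i < n; i++)
    maps[i].start_location =
      (toy_location) ((long long) maps[i].start_location + offset);
}

/* Shift one location carried by a streamed object.  Returns 1 on success,
   or 0 if LOC was never covered by the saved maps (the caller then has to
   decide what to do with it). */
int toy_rebase_location(struct toy_saved_range range, long long offset,
                        toy_location loc, toy_location *out)
{
  if (loc < range.first || loc >= range.limit)
    return 0;
  *out = (toy_location) ((long long) loc + offset);
  return 1;
}
```

The range check is there because a plain offset is only meaningful for locations that came from the serialized maps; anything outside that range would need separate handling.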
* Re: Linemap and pph 2011-07-22 21:15 ` Tom Tromey @ 2011-07-22 22:12 ` Gabriel Charette 2011-07-23 5:46 ` Tom Tromey 2011-07-23 17:54 ` Dodji Seketeli 0 siblings, 2 replies; 15+ messages in thread From: Gabriel Charette @ 2011-07-22 22:12 UTC (permalink / raw) To: Tom Tromey; +Cc: Diego Novillo, Lawrence Crowl, gcc, Collin Winter Thanks for the quick reply Tom, On Fri, Jul 22, 2011 at 1:39 PM, Tom Tromey <tromey@redhat.com> wrote: >>>>>> "Gabriel" == Gabriel Charette <gchare@google.com> writes: > > Gabriel> @tromey: We have a question for you: the problem is detailed > Gabriel> here and our question to you is at the bottom. Thanks! > > Sure. I have not followed PPH progress very closely and after reading > your message I am not sure I have much to offer. I'll give it a stab > anyhow. > > Gabriel> This is a problem in the linemap case because when we read the last > Gabriel> name, we also read it's input_location (line and column) and try to > Gabriel> replay that location by calling linemap_line_start (so far so > Gabriel> good....). This issue is that linemap_line_start EXPECTS to be called > Gabriel> with lines in increasing orders (but since the bindings are backwards > Gabriel> in the chain we actually call it with lines in a decreasing order), > Gabriel> the logic in linemap_line_start is such that if the line# is less then > Gabriel> the previous line# it was called with, it assumes this must be a new > Gabriel> file and creates a new file entry in the line_table by calling > Gabriel> linemap_add (which in our case is WRONG!!). > > My belief is that linemap was designed to support the typical case for a > C and C++ compiler. However, this is not sacrosanct; you could perhaps > change the implementation to work better for your case. I suspect, > though, that this won't be easy -- and also be aware of Dodji's patch > series, which complicates linemap. 
>

We are trying to keep pph as "pluginable" as possible (Diego correct me if I'm wrong), so changing the actual implementation of the linemap would be our very last resort I think.

> Gabriel> In the past we have solved similar issues (caused by the backwards
> Gabriel> ordering), by replaying whatever it was in the correct order (before
> Gabriel> doing anything else), and then simply loading references using the
> Gabriel> pph_cache when the real ones were needed. BUT we can't do this with
> Gabriel> source_locations as they are a simple typedef unsigned int, not actual
> Gabriel> structs which are passed by pointer value...
>
> A source_location is a reference to a particular line_map. It is sort
> of like a compressed pointer.
>

Right, but our cache works in a way that, say you output a reference to a struct, you first remember the pointer value, then output the struct; if you try to output the same reference (same pointer value) later, we simply mark that as a shared reference in the output (and have logic to link the two on the way in). However, since source_locations aren't pointers per se, this wouldn't work (at least with our current cache implementation, and changing that is also a last resort in my opinion).

> Gabriel> @tromey: I hear you are the person in the know when it comes down to
> Gabriel> linemaps, do you have any hint on how we could rebuild a valid
> Gabriel> line_table given we are obtaining the bindings from the last one in
> Gabriel> the file to the first one (that is, using pph (pre-parsed headers)
> Gabriel> where we are trying to restore a cached version of the parser state
> Gabriel> for a header)?
>
> Can you not just serialize the line map when creating the PPH?
>

We were wondering about that. The problem, we thought, is that a pph can be loaded from anywhere in many different C files (which would generate a different linemap entry in each case, I think?).
If there was a way to serialize the linemap entries from the LC_ENTER entry for the header file to the LC_LEAVE (i.e. ignoring builtins, command line, etc.), and then directly insert those entries in a given C file compilation's linemaps entries, that would be great! > Then, when using the PPH, read the serialized line map, in order, into > the current global line map. This will preserve the include chains and > all the rest. Then rewrite source locations coming from the PPH to new > locations from the new line map. This rewriting could perhaps even be > done efficiently, say just by adding an offset to all locations -- > though I am not sure, this would require some research. (It seems > doable to me, though; perhaps, but not definitely, requiring some > additional linemap API.) > I will investigate in that direction. > I have no idea if this is practical for you. I guess it depends on how > a PPH is read in. > A pph is read in by: reading the preprocessor tokens and replaying them (i.e. redefining them as if they had been pasted in the C file at the very location of the include triggering the pph). We then load the bindings attached to the global_namespace defined in the header and all bindings are recursively loaded in that fashion, the only special fact to be aware of regarding the linemap issue is that those bindings (by the very nature of the parser logic happening before we cache the parser's state into the pph), get to us in the reverse order they were originally seen in the header file (thus creating this linemap problem). Thank you very much, Gabriel ^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: Linemap and pph 2011-07-22 22:12 ` Gabriel Charette @ 2011-07-23 5:46 ` Tom Tromey 2011-07-23 17:54 ` Dodji Seketeli 1 sibling, 0 replies; 15+ messages in thread From: Tom Tromey @ 2011-07-23 5:46 UTC (permalink / raw) To: Gabriel Charette; +Cc: Diego Novillo, Lawrence Crowl, gcc, Collin Winter Gabriel> We are tying to keep pph as "pluginable" as possible (Diego correct me Gabriel> if I'm wrong), so changing the actual implementation of the linemap Gabriel> would be our very last resort I think. Gabriel> However since source_location aren't pointers per se, this wouldn't Gabriel> work (at least with our current cache implementation, and changing Gabriel> that is also last resort in my opinion) I think something's got to give :) Tom> Can you not just serialize the line map when creating the PPH? Gabriel> We were wondering about that, the problem we thought is that a pph can Gabriel> be loaded from anywhere in many different C files (which would Gabriel> generate a different linemap entry in each case I think?). If there Gabriel> was a way to serialize the linemap entries from the LC_ENTER entry for Gabriel> the header file to the LC_LEAVE (i.e. ignoring builtins, command line, Gabriel> etc.), and then directly insert those entries in a given C file Gabriel> compilation's linemaps entries, that would be great! Sure, I think you could probably serialize just some subset of the linemap. It is hard to be positive, as nobody has ever tried it, and I don't know enough about your needs to suggest what subsetting might make sense. The question then is whether you are certain that the other objects you write out are guaranteed not to reference locations outside this set. Tom ^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: Linemap and pph 2011-07-22 22:12 ` Gabriel Charette 2011-07-23 5:46 ` Tom Tromey @ 2011-07-23 17:54 ` Dodji Seketeli 2011-07-25 19:34 ` Gabriel Charette 1 sibling, 1 reply; 15+ messages in thread From: Dodji Seketeli @ 2011-07-23 17:54 UTC (permalink / raw) To: Gabriel Charette Cc: Tom Tromey, Diego Novillo, Lawrence Crowl, gcc, Collin Winter Gabriel Charette <gchare@google.com> a écrit: >> Gabriel> @tromey: I hear you are the person in the know when it comes down to >> Gabriel> linemaps, do you have any hint on how we could rebuild a valid >> Gabriel> line_table given we are obtaining the bindings from the last one in >> Gabriel> the file to the first one (that is, using pph (pre-parsed headers) >> Gabriel> where we are trying to restore a cached version of the parser state >> Gabriel> for a header)? >> >> Can you not just serialize the line map when creating the PPH? >> > > We were wondering about that, the problem we thought is that a pph can > be loaded from anywhere in many different C files (which would > generate a different linemap entry in each case I think?). Just to make sure I understand, a given header included from two different main C files with two different sets of macro context would yield two macro maps with different contents anyway. So I would believe that these would yield two different pph, one for each C file. Is that correct? For the cases where the same pph could be loaded from two different main CUs, I'd say that in the context of a given main CU being parsed, loading the PPH and its associated serialized line map would yield a new line map to be inserted into the struct linemaps of the current CU. 
Then, not only should the source_locations encoded by that new de-serialized line map be rewritten (probably by modifying things like struct linemap::to_line and struct linemap::included_from, if need be) to fit into the source_location space of the line map set carried by the struct linemaps of the current main CU being parsed, but the source_locations carried by the tokens present inside the pph must also be changed to look as if they were yielded by the newly rewritten line map. This is what could look expensive at first sight. Wouldn't that address the issue of a given pph loaded by multiple main C files? > If there was a way to serialize the linemap entries from the LC_ENTER > entry for the header file to the LC_LEAVE (i.e. ignoring builtins, > command line, etc.) I believe all the builtins have the same source_location, which is the BUILTIN_LOCATION constant. As for command line options, I believe they get the location UNKNOWN_LOCATION. So I would think these should not be a problem when de-serializing the line map. -- Dodji ^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: Linemap and pph 2011-07-23 17:54 ` Dodji Seketeli @ 2011-07-25 19:34 ` Gabriel Charette 2011-07-25 22:55 ` Gabriel Charette 0 siblings, 1 reply; 15+ messages in thread From: Gabriel Charette @ 2011-07-25 19:34 UTC (permalink / raw) To: Tom Tromey, Diego Novillo, Lawrence Crowl, gcc, Collin Winter, Gabriel Charette, Dodji Seketeli On Sat, Jul 23, 2011 at 8:51 AM, Dodji Seketeli <dodji@seketeli.org> wrote: > Gabriel Charette <gchare@google.com> a écrit: > >>> Gabriel> @tromey: I hear you are the person in the know when it comes down to >>> Gabriel> linemaps, do you have any hint on how we could rebuild a valid >>> Gabriel> line_table given we are obtaining the bindings from the last one in >>> Gabriel> the file to the first one (that is, using pph (pre-parsed headers) >>> Gabriel> where we are trying to restore a cached version of the parser state >>> Gabriel> for a header)? >>> >>> Can you not just serialize the line map when creating the PPH? >>> >> >> We were wondering about that, the problem we thought is that a pph can >> be loaded from anywhere in many different C files (which would >> generate a different linemap entry in each case I think?). > > Just to make sure I understand, a given header included from two > different main C files with two different sets of macro context would > yield two macro maps with different contents anyway. So I would believe > that these would yield two different pph, one for each C file. Is that > correct? > Before using a pph, we make sure the current macro context is the same as the one in which it was originally compiled (as far as what that header is using from that context). So yes, sometimes we have different pph for different C files, but it's very possible to use the same pph for different includes in independent C files. 
> For the cases where the same pph could be loaded from two different main > CUs, I'd say that in the context of a given main CU being parsed, loading > the PPH and its associated serialized line map would yield a new line > map to be inserted into the struct linemaps of the current CU. > > Then, not only should the source_locations encoded by that new > de-serialized line map be rewritten (probably by modifying things like > struct linemap::to_line and struct linemap::included_from, if need be) > to fit into the source_location space of the line map set carried by the > struct linemaps of the current main CU being parsed, but the > source_locations carried by the tokens present inside the pph must also > be changed to look as if they were yielded by the newly rewritten line > map. This is what could look expensive at first sight. > Right this is one of the option, but as you say, potentially an expensive change. > Wouldn't that address the issue of a given pph loaded by multiple main C > files? > If it works, yes, I think. >> If there was a way to serialize the linemap entries from the LC_ENTER >> entry for the header file to the LC_LEAVE (i.e. ignoring builtins, >> command line, etc.) > > I believe all the builtins have the same source_location, which is the > BUILTIN_LOCATION constant. As for command line options, I believe they > get the location UNKNOWN_LOCATION. So I would think these should not be > a problem when de-serializing the line map. > Right, but no point serializing them only to filter them on input. ^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: Linemap and pph
  2011-07-25 19:34 ` Gabriel Charette
@ 2011-07-25 22:55 ` Gabriel Charette
  2011-07-26 14:46 ` Dodji Seketeli
  0 siblings, 1 reply; 15+ messages in thread
From: Gabriel Charette @ 2011-07-25 22:55 UTC (permalink / raw)
To: Tom Tromey, Diego Novillo, Lawrence Crowl, gcc, Collin Winter, Gabriel Charette, Dodji Seketeli
Cc: Cary Coutant

Alright, so after looking even more at the linemap code, here is what I'm thinking now:

The main problem with the linemap logic is that linemap_line_start adds a new table entry if line_delta < 0. (I don't understand why this is needed; my assumption is that it is because the rest of the logic depends on set->highest_location, thus we want to make sure we are higher than the highest so far? Can someone clarify why we need this?)

My solution:

I will add a boolean flag to linemap_line_start's parameters, allowEarlierStartLineStart, which, when true, will not create a new entry in the line_table even if line_delta < 0. Instead of relying on highest_location to find the current line's source location, it will use the start_location and add to it the delta between to_line and map->to_line.

I will also add a new linemap function, linemap_position_for_line_and_column, which will be just like linemap_position_for_column, except that it won't assume the highest_line in the set is the one we are currently on (this function will thus allow us to get a source_location without depending on the current assumptions about the order these functions are called in).

As far as I understand, this solution would not modify the actual final result of the entries at the end of parsing. I would also of course only use this new flag in the call from lto_input_location (which we use in pph to get our source_location from the streamed expanded_location).

Let me know what you think,
Gabriel

^ permalink raw reply [flat|nested] 15+ messages in thread
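
[Editorial sketch] The arithmetic behind "use the start_location and add to it the delta between to_line and map->to_line" can be written out as below. This is a toy version with hypothetical `toy_` names, not the function proposed in the message; it assumes a location is laid out as the map's start_location, plus the line delta shifted left by column_bits, plus the column.

```c
#include <assert.h>

/* Toy model only: all toy_* names are hypothetical, not libcpp API. */
typedef unsigned int toy_location;

struct toy_map {
  toy_location start_location; /* location of (to_line, column 0) */
  unsigned int to_line;        /* first source line covered by the map */
  unsigned int column_bits;    /* low bits reserved for the column */
};

/* Compute a location from the map alone, with no dependence on which line
   was "started" last.  LINE must belong to MAP. */
toy_location toy_position_for_line_and_column(const struct toy_map *map,
                                              unsigned int line,
                                              unsigned int column)
{
  assert(line >= map->to_line);
  assert(column < (1u << map->column_bits));
  return map->start_location
         + ((line - map->to_line) << map->column_bits)
         + column;
}

/* Inverse operations, used here only to check the encoding round-trips. */
unsigned int toy_source_line(const struct toy_map *map, toy_location loc)
{
  return ((loc - map->start_location) >> map->column_bits) + map->to_line;
}

unsigned int toy_source_column(const struct toy_map *map, toy_location loc)
{
  return (loc - map->start_location) & ((1u << map->column_bits) - 1);
}
```

Because the result depends only on the map and the (line, column) pair, the calls can be made in any order, for example line 20 before line 11, and a later line still gets a higher location than an earlier one within the same map.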
* Re: Linemap and pph 2011-07-25 22:55 ` Gabriel Charette @ 2011-07-26 14:46 ` Dodji Seketeli 2011-07-26 18:49 ` Gabriel Charette 0 siblings, 1 reply; 15+ messages in thread From: Dodji Seketeli @ 2011-07-26 14:46 UTC (permalink / raw) To: Gabriel Charette Cc: Tom Tromey, Diego Novillo, Lawrence Crowl, gcc, Collin Winter, Cary Coutant Gabriel Charette <gchare@google.com> a écrit: > Alright, so after looking even more at the linemap code, here is what > I'm thinking now: > > So the main problem is: > with the linemap logic is that linemap_line_start adds a new table > entry if line_delta < 0 (I don't understand why this is needed, my > assumption is that it is because the rest of the logic depends on > set->highest_location, thus we want to make sure we are higher than > the highest so far? can someone clarify why we need this?) Here is how I understand this. Sorry if it's too obvious. The linemap_line_start model follows that of a parser that parses tokens from left to right, and from top to bottom. Let's imagine a main CU called M which has this layout: <---- start, locus #0 [... tokens ...] #include "A" <--- locus #1 [... tokens ...] There are going to be at least three line maps (instances of struct line_map) created and added to the line map set (struct line_maps) of M: one for loci from #0 to right before #1, at least one for loci coming from the included header A (there can be several line maps here, if A itself includes other headers) and another one from loci coming right after #1. A hard invariant here is that source_locations yielded by a given line map must be less than the struct line_map::start_location of the next line map. This is (partly) so that line map lookup (when we want to get the line map that corresponds to a given source_location) can be done using a binary search, which is faster than just doing a linear search. A side effect of this is that source_locations handed out by linemap_line_start increase monotonically. 
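
[Editorial sketch] The reason the ordering invariant matters for lookup can be shown with a toy binary search. The `toy_` names and the sample start locations are assumptions for illustration only; the sketch simply finds the last map whose start_location is not greater than the location being looked up, which is only meaningful when the maps are sorted by start_location.

```c
#include <assert.h>
#include <stddef.h>

/* Toy model only: all toy_* names are hypothetical, not libcpp API. */
typedef unsigned int toy_location;

struct toy_map {
  toy_location start_location;
  const char *to_file;
};

/* Check the invariant: start locations never decrease from one map to the
   next.  Returns 1 if it holds, 0 otherwise. */
int toy_maps_sorted(const struct toy_map *maps, size_t n)
{
  size_t i;
  for (i = 1; i < n; i++)
    if (maps[i - 1].start_location > maps[i].start_location)
      return 0;
  return 1;
}

/* Return the last map whose start_location is <= LOC.  The answer is only
   reliable when toy_maps_sorted holds for the array. */
const struct toy_map *toy_lookup(const struct toy_map *maps, size_t n,
                                 toy_location loc)
{
  size_t lo = 0, hi = n;

  assert(n > 0);
  while (hi - lo > 1) {
    size_t mid = lo + (hi - lo) / 2;
    if (maps[mid].start_location > loc)
      hi = mid;   /* LOC lies before map MID */
    else
      lo = mid;   /* LOC lies in map MID or a later one */
  }
  return &maps[lo];
}
```

With three maps for M, A and M again (the layout in the example above), a location inside the middle range resolves to the map for A in a logarithmic number of steps. Maps added out of order would break the sortedness check and make the search unreliable.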
With these assumptions in mind, when the parser code calls linemap_line_start to get the source_location corresponding to a new line at column 0, the only case where the new line is less than the preceding line (so that line_delta < 0) is if the parser is entering a new file. E.g., it is entering the header A, coming from M. Or, it is getting back to M, leaving A. In that case, it seems required for linemap_line_start to create the new line map for A or for M.

> My solution:
> I will add a boolean flag to linemap_line_start's parameters,
> allowEarlierStartLineStart, which when true, will not create a new
> entry in the line_table even if the line_delta < 0.

If I understand correctly, you are acting as if you were parsing from right to left and from bottom to top, for the content of A.

In that context, you need to keep the hard invariant I talked about above, so that line map lookup keeps working, at least.

In other words, when you generate a source_location for a token T0 that comes topologically /before/ the token T1 you generated a source_location for at the previous step, you need to make sure that the source_location of T0 is less than the source_location of T1, and that the struct line_map::start_location of the current map is less than T0. If T0 comes from a different header than T1 (suppose both T1 and T0 come from headers that are included by A) then a new line map (different from the line map for T1) needs to be created for T0. Now I am not sure if in your design, there is going to be only one pph for A, or if there is going to be one pph for each header included by A as well. I would have imagined the latter case to be more appropriate.

It seems to me that for your approach to work, you should at least make sure to add a new line map when T0 comes from a different header than T1, while processing the pph of A.
> Instead of relying on highest_location to find the current line's > source location, it will use the start_location and add to it the > delta between to_line and map->to_line). If the map->start_location was allocated correctly, I guess this could work assuming the constraints I raised above are respected. Note however that there are corner cases like having so long lines in the source code being parsed that we fill the column-space of the line map so quickly that we need to allocate more line maps for a given file. These are corner cases taken care of by linemap_line_start today that you could miss if you are trying to walk in reverse order like this. Those would have been addressed by serializing the original line map set of A that we knew was correct, barring the value of their map->start_location and map->to_line. > I will also add a new linemap function, > linemap_position_for_line_and_column, which will be just like > linemap_position_for_column, except that it won't assume the > highest_line in the set is the one we are currently on (this function > will thus allow us to get a source_location without depending on the > current assumptions on the order these functions are called in). Heh. I have added a function with that exact name in my patch set to track locations across macro expansions. Look at the source code at http://tinyurl.com/3egxp47, search for linemap_position_for_line_and_column (obviously). The implicit assumption made by that function is that the {line,column} pair which you are requesting a source_location for belongs to the current line map, though. HTH. -- Dodji ^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: Linemap and pph 2011-07-26 14:46 ` Dodji Seketeli @ 2011-07-26 18:49 ` Gabriel Charette 2011-07-26 22:54 ` Gabriel Charette 0 siblings, 1 reply; 15+ messages in thread From: Gabriel Charette @ 2011-07-26 18:49 UTC (permalink / raw) To: Gabriel Charette, Tom Tromey, Diego Novillo, Lawrence Crowl, gcc, Collin Winter, Cary Coutant On Tue, Jul 26, 2011 at 4:25 AM, Dodji Seketeli <dodji@seketeli.org> wrote: > Gabriel Charette <gchare@google.com> a écrit: > >> Alright, so after looking even more at the linemap code, here is what >> I'm thinking now: >> >> So the main problem is: >> with the linemap logic is that linemap_line_start adds a new table >> entry if line_delta < 0 (I don't understand why this is needed, my >> assumption is that it is because the rest of the logic depends on >> set->highest_location, thus we want to make sure we are higher than >> the highest so far? can someone clarify why we need this?) > > Here is how I understand this. Sorry if it's too obvious. > > The linemap_line_start model follows that of a parser that parses tokens > from left to right, and from top to bottom. Let's imagine a main CU > called M which has this layout: > > <---- start, locus #0 > > [... tokens ...] > > #include "A" <--- locus #1 > > [... tokens ...] > > There are going to be at least three line maps (instances of struct > line_map) created and added to the line map set (struct line_maps) of M: > one for loci from #0 to right before #1, at least one for loci coming > from the included header A (there can be several line maps here, if A > itself includes other headers) and another one from loci coming right > after #1. > > A hard invariant here is that source_locations yielded by a given line > map must be less than the struct line_map::start_location of the next > line map. 
This is (partly) so that line map lookup (when we want to get > the line map that corresponds to a given source_location) can be done > using a binary search, which is faster than just doing a linear search. > A side effect of this is that source_locations handed out by > linemap_line_start increase monotonically. > > With these assumptions in mind, when the parser code calls > linemap_line_start to get the source_location corresponding to a new > line at column 0, the only case where the new line is less than the > preceding line (so that line_delta < 0) is if the parser is entering a > new file. E.g, it is entering the header A, coming from M. Or, it is > getting back to M, leaving A. In that case, it seems required for > linemap_line_start to create the new line map for A or for M. > Thanks for this explanation, that's pretty much how I had understood the linemap code, but you clarified some of the things I wasn't clear about. >> My solution: >> I will add a boolean flag to linemap_line_start's parameters, >> allowEarlierStartLineStart, which when true, will not create a new >> entry in the line_table even if the line_delta < 0. > > If I understand correctly, you are acting like if you were parsing from > right to left and from bottom to top, for the content of A. > Right. > In that context, you need to keep the hard invariant I talked about > above, so that line map lookup keeps working, at least. > > In other words, when you generate a source_location for a token T0 that > comes topologically /before/ the token T1 you generated source_location > for at the previous step, you need to make sure that source_location of > T0 is less than source_location of T1, and that the struct > line_map::start_location of the current map is less than T0. If T0 > comes from a different header than T1 (suppose both T1 and T0 comes from > headers that are included by A) then a new line map (different from the > line map for T1) needs to be created for T0. 
Now I am not sure if in > your design, there is going to be only one pph for A, or if there is > going to be one pph for each header included by A as well. I would have > imagined the later case to be more appropriate. > Correct we are using the multiple pph approach (one for each included header in the header). This actually makes the problem harder than I had originally pictured it yesterday. I'm now thinking using serialization is going to be a better solution because of that. >> Instead of relying on highest_location to find the current line's >> source location, it will use the start_location and add to it the >> delta between to_line and map->to_line). > > If the map->start_location was allocated correctly, I guess this could > work assuming the constraints I raised above are respected. > > Note however that there are corner cases like having so long lines in > the source code being parsed that we fill the column-space of the line > map so quickly that we need to allocate more line maps for a given file. > These are corner cases taken care of by linemap_line_start today that > you could miss if you are trying to walk in reverse order like this. > Those would have been addressed by serializing the original line map set > of A that we knew was correct, barring the value of their > map->start_location and map->to_line. > One more reason for serialization! >> I will also add a new linemap function, >> linemap_position_for_line_and_column, which will be just like >> linemap_position_for_column, except that it won't assume the >> highest_line in the set is the one we are currently on (this function >> will thus allow us to get a source_location without depending on the >> current assumptions on the order these functions are called in). > > Heh. I have added a function with that exact name in my patch set to > track locations across macro expansions. 
Look at the source code at > http://tinyurl.com/3egxp47, search for > linemap_position_for_line_and_column (obviously). The implicit > assumption made by that function is that the {line,column} pair which > you are requesting a source_location for belongs to the current line > map, though. > Good to know, I had made the same assumption in mine that we were looking for {line, column} in the current line map. ---- So, I think serializing the linemap and tweaking it on the way in is going to be easier, in particular because of nested pph (header including headers). I'll put down my strategy for serializing the linemap and send a follow up email soon. Thanks, Gabriel ^ permalink raw reply [flat|nested] 15+ messages in thread
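Editorial sketch of the location arithmetic discussed in this message (start_location plus the delta between the requested line and map->to_line). This is a toy model and not libcpp code: `toy_line_map`, `toy_position_for_line_and_column` and the two decoding helpers are hypothetical names, the struct keeps only the three fields the arithmetic needs, and the packing of the line delta above `column_bits` bits of column is assumed here as a simplification. As noted above, the {line, column} pair is assumed to belong to the map that is passed in.

```c
#include <assert.h>

/* Toy stand-in for libcpp's struct line_map: only the fields this
   sketch needs.  All names here are hypothetical.  */
typedef unsigned int toy_location;

struct toy_line_map
{
  toy_location start_location; /* first location handed out by this map */
  unsigned int to_line;        /* source line that start_location maps to */
  unsigned int column_bits;    /* low bits of a location reserved for the column */
};

/* Compute a location from an explicit {line, column} pair.  Nothing
   here depends on a "current line" or on the order of calls; the pair
   is assumed to belong to MAP.  */
static toy_location
toy_position_for_line_and_column (const struct toy_line_map *map,
                                  unsigned int line, unsigned int column)
{
  assert (line >= map->to_line);
  assert (column < (1u << map->column_bits));
  return map->start_location
         + ((line - map->to_line) << map->column_bits)
         + column;
}

/* Inverse helpers, handy for checking the encoding.  */
static unsigned int
toy_line_of (const struct toy_line_map *map, toy_location loc)
{
  return ((loc - map->start_location) >> map->column_bits) + map->to_line;
}

static unsigned int
toy_column_of (const struct toy_line_map *map, toy_location loc)
{
  return (loc - map->start_location) & ((1u << map->column_bits) - 1);
}
```

Because the result depends only on the map and the pair, asking for line 3 and then for line 2 yields a smaller location the second time without any new map being created, which is the property the backwards-ordered bindings need.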
* Re: Linemap and pph 2011-07-26 18:49 ` Gabriel Charette @ 2011-07-26 22:54 ` Gabriel Charette 2011-07-27 10:11 ` Dodji Seketeli 0 siblings, 1 reply; 15+ messages in thread From: Gabriel Charette @ 2011-07-26 22:54 UTC (permalink / raw) To: Gabriel Charette, Tom Tromey, Diego Novillo, Lawrence Crowl, gcc, Collin Winter, Cary Coutant

As mentioned in my previous email, I'm now thinking of serializing the linemap from the header being pph'ed. Here is how I think this could work:

From what I understand, the source_locations allocated for everything in a given set of headers (from the LC_ENTER for the header in the line_table up to, and including everything in between, the corresponding LC_LEAVE) is dependent on only one thing; the value of line_table->highest_location when the header was inserted (i.e. if in two different contexts we happen to have the same line_table->highest_location when inserting header A.h, all of the tokens for A.h in each context will have the same source_location).

If the previous statement is true, then we calculate an offset between the line_table->highest_location as it was in the serialized version of the header and as it is in the C file in which we are about to de-serialize the line table.
We then need to update some things based on that offset: for each line_map entry in the line_table: start_location for each source_location field on tokens coming in from the header Some other things need to be updated with simple logic: for each line_map entry in the line_table: included_from for the line_table itself: highest_location, highest_line, max_column_hint Some things just stay as is when de-serialized: for each line_map entry in the line_table: reason, sysp, to_file, to_line, column_bits After doing this we should have the same line_map we would have had when doing a regular compile (non-pph), and the tokens coming from the header file should have the correct source_location, simply by adding the calculated offset to the one they had previously (this also allows us to only serialize the source_location for each token as opposed to serializing it's expanded location, saving space in the pph). Let me know what you think, Gabriel On Tue, Jul 26, 2011 at 11:10 AM, Gabriel Charette <gchare@google.com> wrote: > On Tue, Jul 26, 2011 at 4:25 AM, Dodji Seketeli <dodji@seketeli.org> wrote: >> Gabriel Charette <gchare@google.com> a écrit: >> >>> Alright, so after looking even more at the linemap code, here is what >>> I'm thinking now: >>> >>> So the main problem is: >>> with the linemap logic is that linemap_line_start adds a new table >>> entry if line_delta < 0 (I don't understand why this is needed, my >>> assumption is that it is because the rest of the logic depends on >>> set->highest_location, thus we want to make sure we are higher than >>> the highest so far? can someone clarify why we need this?) >> >> Here is how I understand this. Sorry if it's too obvious. >> >> The linemap_line_start model follows that of a parser that parses tokens >> from left to right, and from top to bottom. Let's imagine a main CU >> called M which has this layout: >> >> <---- start, locus #0 >> >> [... tokens ...] >> >> #include "A" <--- locus #1 >> >> [... tokens ...] 
>> >> There are going to be at least three line maps (instances of struct >> line_map) created and added to the line map set (struct line_maps) of M: >> one for loci from #0 to right before #1, at least one for loci coming >> from the included header A (there can be several line maps here, if A >> itself includes other headers) and another one from loci coming right >> after #1. >> >> A hard invariant here is that source_locations yielded by a given line >> map must be less than the struct line_map::start_location of the next >> line map. This is (partly) so that line map lookup (when we want to get >> the line map that corresponds to a given source_location) can be done >> using a binary search, which is faster than just doing a linear search. >> A side effect of this is that source_locations handed out by >> linemap_line_start increase monotonically. >> >> With these assumptions in mind, when the parser code calls >> linemap_line_start to get the source_location corresponding to a new >> line at column 0, the only case where the new line is less than the >> preceding line (so that line_delta < 0) is if the parser is entering a >> new file. E.g, it is entering the header A, coming from M. Or, it is >> getting back to M, leaving A. In that case, it seems required for >> linemap_line_start to create the new line map for A or for M. >> > > Thanks for this explanation, that's pretty much how I had understood > the linemap code, but you clarified some of the things I wasn't clear > about. > >>> My solution: >>> I will add a boolean flag to linemap_line_start's parameters, >>> allowEarlierStartLineStart, which when true, will not create a new >>> entry in the line_table even if the line_delta < 0. >> >> If I understand correctly, you are acting like if you were parsing from >> right to left and from bottom to top, for the content of A. >> > > Right. 
> >> In that context, you need to keep the hard invariant I talked about >> above, so that line map lookup keeps working, at least. >> >> In other words, when you generate a source_location for a token T0 that >> comes topologically /before/ the token T1 you generated source_location >> for at the previous step, you need to make sure that source_location of >> T0 is less than source_location of T1, and that the struct >> line_map::start_location of the current map is less than T0. If T0 >> comes from a different header than T1 (suppose both T1 and T0 comes from >> headers that are included by A) then a new line map (different from the >> line map for T1) needs to be created for T0. Now I am not sure if in >> your design, there is going to be only one pph for A, or if there is >> going to be one pph for each header included by A as well. I would have >> imagined the later case to be more appropriate. >> > > Correct we are using the multiple pph approach (one for each included > header in the header). > > This actually makes the problem harder than I had originally pictured > it yesterday. I'm now thinking using serialization is going to be a > better solution because of that. > >>> Instead of relying on highest_location to find the current line's >>> source location, it will use the start_location and add to it the >>> delta between to_line and map->to_line). >> >> If the map->start_location was allocated correctly, I guess this could >> work assuming the constraints I raised above are respected. >> >> Note however that there are corner cases like having so long lines in >> the source code being parsed that we fill the column-space of the line >> map so quickly that we need to allocate more line maps for a given file. >> These are corner cases taken care of by linemap_line_start today that >> you could miss if you are trying to walk in reverse order like this. 
>> Those would have been addressed by serializing the original line map set >> of A that we knew was correct, barring the value of their >> map->start_location and map->to_line. >> > > One more reason for serialization! > >>> I will also add a new linemap function, >>> linemap_position_for_line_and_column, which will be just like >>> linemap_position_for_column, except that it won't assume the >>> highest_line in the set is the one we are currently on (this function >>> will thus allow us to get a source_location without depending on the >>> current assumptions on the order these functions are called in). >> >> Heh. I have added a function with that exact name in my patch set to >> track locations across macro expansions. Look at the source code at >> http://tinyurl.com/3egxp47, search for >> linemap_position_for_line_and_column (obviously). The implicit >> assumption made by that function is that the {line,column} pair which >> you are requesting a source_location for belongs to the current line >> map, though. >> > > Good to know, I had made the same assumption in mine that we were > looking for {line, column} in the current line map. > ---- > So, I think serializing the linemap and tweaking it on the way in is > going to be easier, in particular because of nested pph (header > including headers). > I'll put down my strategy for serializing the linemap and send a > follow up email soon. > > Thanks, > Gabriel > ^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: Linemap and pph 2011-07-26 22:54 ` Gabriel Charette @ 2011-07-27 10:11 ` Dodji Seketeli 2011-07-27 14:17 ` Gabriel Charette 0 siblings, 1 reply; 15+ messages in thread From: Dodji Seketeli @ 2011-07-27 10:11 UTC (permalink / raw) To: Gabriel Charette Cc: Tom Tromey, Diego Novillo, Lawrence Crowl, gcc, Collin Winter, Cary Coutant

Gabriel Charette <gchare@google.com> writes:

> From what I understand, the source_locations allocated for
> everything in a given set of headers (from the LC_ENTER for the
> header in the line_table up to, and including everything in between,
> the corresponding LC_LEAVE) is dependent on only one thing; the
> value of line_table->highest_location when the header was inserted

The source_location handed out by a given line map is dependent on three things:

1/ The line_map::start_location. For a given map, this one equals line_table->highest_location + 1, because at the time the line map is created, its line_map::start_location must be greater than the highest source_location handed out by any line map previously created. At any point in time, line_table->highest_location equals the line_map::start_location of the lastly created line_map. This is part of the monotonically increasing property of source_location we were talking about earlier.

2/ The line number of the location

3/ The column number of the location

> (i.e. if in two different contexts we happen to have the same
> line_table->highest_location when inserting header A.h, all of the
> tokens for A.h in each context will have the same source_location).

Each token coming from a given A.h will have a different source_location, as they will presumably have a different {line,column} pair.

If in two different contexts we happen to have the same line_table->highest_location, the line_map::start_location of the two different line maps of the two different A.hs will be equal, though.
> If the previous statement is true, then we calculate an offset
> between the line_table->highest_location as it was in the serialized
> version of the header and as it is in the C file in which we are
> about to de-serialize the line table.
>
> We then need to update some things based on that offset:

[...]

It seems to me that you could just set the line_map::start_location of the de-serialized map for the current portion of A.h to the current line_table->highest_location of the main CU you are currently parsing, plus 1 (i.e., just don't take the serialized line_table->highest_location into account. Actually, I would think that it's unnecessary to serialize the line_table->highest_location at all).

Then you should also set the de-serialized line_map::to_line to the current line in your context. Then you should add that line map to the current line_table and set line_table->highest_location to the line_map::start_location you have just computed.

Then, assign the source_location of each token that belongs to that de-serialized map to a source_location that is handed out by your new linemap_position_for_line_and_column, with the de-serialized map passed as an argument. This is where you would progress "backward" in the token stream, for that given map.

Then somehow, when you need to suck in a new pph (so this time you are going downward in the stream again), just start this dance of de-serializing the map for that new pph, updating its properties, adding it to line_table, and setting the source_location of the tokens coming from that pph again.

--
Dodji
^ permalink raw reply [flat|nested] 15+ messages in thread
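Editorial sketch of the sequence Dodji describes: give the de-serialized map a fresh start_location, append it to the including CU's table, then re-derive each token location from its saved {line, column}. Everything here is a toy model with hypothetical names (`toy_line_table`, `toy_install_map`, `toy_token_location`), not the libcpp API. Two simplifications are assumed: the to_line adjustment mentioned above is left out, and highest_location is kept as the highest location handed out so far, so it is raised whenever a token lands above it.

```c
#include <assert.h>
#include <stddef.h>

typedef unsigned int toy_location;

struct toy_line_map
{
  toy_location start_location; /* first location handed out by this map */
  unsigned int to_line;        /* source line that start_location maps to */
  unsigned int column_bits;    /* low bits of a location reserved for the column */
};

#define TOY_MAX_MAPS 16

struct toy_line_table
{
  struct toy_line_map maps[TOY_MAX_MAPS];
  size_t used;
  toy_location highest_location;
};

/* Append a de-serialized map, ignoring whatever start_location it was
   saved with: the new start is one past the highest location handed
   out so far in the including CU.  Returns the stored copy.  */
static struct toy_line_map *
toy_install_map (struct toy_line_table *table, struct toy_line_map map)
{
  assert (table->used < TOY_MAX_MAPS);
  map.start_location = table->highest_location + 1;
  table->highest_location = map.start_location;
  table->maps[table->used] = map;
  return &table->maps[table->used++];
}

/* Re-derive one token's location from its saved {line, column}.  Tokens
   may arrive in any order (for example last-to-first), so the table's
   highest_location is only ever raised, never lowered.  */
static toy_location
toy_token_location (struct toy_line_table *table,
                    const struct toy_line_map *map,
                    unsigned int line, unsigned int column)
{
  toy_location loc = map->start_location
                     + ((line - map->to_line) << map->column_bits)
                     + column;
  if (loc > table->highest_location)
    table->highest_location = loc;
  return loc;
}
```

In this model the saved start_location is never read, which matches the remark that highest_location does not need to be serialized.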
* Re: Linemap and pph 2011-07-27 10:11 ` Dodji Seketeli @ 2011-07-27 14:17 ` Gabriel Charette 2011-07-27 14:26 ` Dodji Seketeli 0 siblings, 1 reply; 15+ messages in thread From: Gabriel Charette @ 2011-07-27 14:17 UTC (permalink / raw) To: Dodji Seketeli Cc: Tom Tromey, Diego Novillo, Lawrence Crowl, gcc, Collin Winter, Cary Coutant

I think I wasn't clear in the way I expressed my assumptions in my last email:

On Wed, Jul 27, 2011 at 1:11 AM, Dodji Seketeli <dodji@seketeli.org> wrote:
>
> Gabriel Charette <gchare@google.com> writes:
>
> > From what I understand, the source_locations allocated for
> > everything in a given set of headers (from the LC_ENTER for the
> > header in the line_table up to, and including everything in between,
> > the corresponding LC_LEAVE) is dependent on only one thing; the
> > value of line_table->highest_location when the header was inserted
>
> The source_location handed out by a given line map is dependent on
> three things:
>
> 1/ The line_map::start_location. For a given map, this one equals
> line_table->highest_location + 1, because at the time the line map is
> created, its line_map::start_location must be greater than the highest
> source_location handed out by any line map previously created. At any
> point in time, line_table->highest_location equals the
> line_map::start_location of the lastly created line_map. This is part
> of the monotonically increasing property of source_location we were
> talking about earlier.
>

Actually, from my understanding, highest_location is not equal to the last start_location, but to the last source_location returned by either linemap_line_start or linemap_position_for_column (which is >= the start_location of the current line_map).

> 2/ The line number of the location
>
> 3/ The column number of the location
>

Right, that's what I mean, I would not actually stream highest_location.
What I meant by they "all depend on only highest_location" is that IF highest_location is the same in file B.c and in a different compiled file C.c when they happen to include A.h, then all of the source_locations for the tokens in A.h (in both compilation) would be identical (i.e. token1's loc in B == token1's loc in C, etc.) (as 2/3 always is the same since we're talking about the same file A.h in both compilation, hence if 1 also holds, we get the same result). > > (i.e. if in two different contexts we happen to have the same > > line_table->highest_location when inserting header A.h, all of the > > tokens for A.h in each context will have the same source_location). > > Each token coming from a given A.h will have a different > source_location, as they will presumably have a different {line,column} > pair. > What I meant is that all of the source locations handed out in the first compilation will be the same as all of the source locations handed out in the second compilation, pairwise (not that ALL token's source locations themselves will be the same within a single compilation of course!). > If in two different contexts we happen to have the same > line_table->highest_location, the line_map::start_location of the two > different line maps of the two different A.hs will be equal, though. > > > If the previous statement is true, then we calculate an offset > > between the line_table->highest_location as it was in the serialized > > version of the header and as it is in the C file in which we are > > about to de-serialize the line table. > > > > We then need to update some things based on that offset: > > [...] > Hence, given that they only depend on start_location, I just have to calculate an offset between the serialized start_location and the start_location as it would be (highest_location + 1) in the C file including the header, and offset all of the source_locations on each token coming from the pph (without even needing to recalculate them!). 
> It seems to me that you could just set the line_map::start_location of > the de-serialized map for the current portion of A.h to the current > line_table->highest_location of the main CU your are currently parsing > (i.e, just forget about the serialized line_table->highest_location > into account. Actually I would think that it's unnecessary to > serialize the line_table->highest_location) + 1. > > Then you should also set the de-serialized line_map::to_line to the > current line in your context. Then you should add that line map to > the current line_table and set line_table->highest_location to the > line_map::start_location you have just computed. > > Then, assign the source_location of each token that belongs to that > de-serialized map to a source_location that is handed out by your new > linemap_position_for_line_and_column, with de-serialized map passed in > argument. This is where you would progress "backward" in the token > stream, for that given map. > > Then somehow, when you need to suck in a new pph (so this time you are > going downward in the stream again), just start this dance of > de-serializing the map for that new pph, updating its properties, > adding it to line_table, and setting the source_location of the tokens > coming for that pph again. > Doing it this way (with the offset) I would read in all the tokens and linemap entries inherited from that header and it's underlying include tree, thus no need to be tricky about inserting line tables for the header's included file, as they are part of the header's serialized line_table by recursion (a pph'ed header can include other pph'ed header), nor to recalculate source_locations (it would be hard anyway to replay the line_table getters to recalculate source_locations as in the serialized parsed state we don't have the notion of which line the #includes were on, nor which tokens came before or after... 
it's potentially possible to know that from the serialized line_table, but it would be tricky to replay the tokens before, in, and after the included headers in the correct way).

Thanks,
Gabriel
^ permalink raw reply [flat|nested] 15+ messages in thread
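Editorial sketch of the offset idea from this message: shift every saved map and every saved token location by one common delta, so nothing has to be recomputed from {line, column}. This is a toy model with hypothetical names (`toy_rebase`, `toy_line_of`), not libcpp code. It assumes the saved maps are ordered by increasing start_location and that line and column are decoded relative to the owning map's start_location, which is why a uniform shift leaves them unchanged. The other fields listed earlier in the thread (included_from, highest_line, max_column_hint) are not modelled.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef unsigned int toy_location;

struct toy_line_map
{
  toy_location start_location; /* first location handed out by this map */
  unsigned int to_line;        /* source line that start_location maps to */
  unsigned int column_bits;    /* low bits of a location reserved for the column */
};

/* Shift every saved map and token location by one common delta so that
   the first saved map starts right after HIGHEST_LOCATION of the
   including file.  Returns the delta that was applied.  The maps are
   assumed to be saved in increasing start_location order.  */
static int64_t
toy_rebase (struct toy_line_map *maps, size_t n_maps,
            toy_location *token_locs, size_t n_tokens,
            toy_location highest_location)
{
  int64_t delta;
  size_t i;

  assert (n_maps > 0);
  delta = ((int64_t) highest_location + 1) - (int64_t) maps[0].start_location;

  for (i = 0; i < n_maps; i++)
    maps[i].start_location = (toy_location) (maps[i].start_location + delta);
  for (i = 0; i < n_tokens; i++)
    token_locs[i] = (toy_location) (token_locs[i] + delta);
  return delta;
}

/* Line decoded from a location, relative to the map that owns it.  */
static unsigned int
toy_line_of (const struct toy_line_map *map, toy_location loc)
{
  return ((loc - map->start_location) >> map->column_bits) + map->to_line;
}
```

After the shift, the including file's highest_location would also need to be advanced past the largest rebased location, as the message notes for the real line_table.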
* Re: Linemap and pph 2011-07-27 14:17 ` Gabriel Charette @ 2011-07-27 14:26 ` Dodji Seketeli 2011-07-27 14:51 ` Gabriel Charette 0 siblings, 1 reply; 15+ messages in thread From: Dodji Seketeli @ 2011-07-27 14:26 UTC (permalink / raw) To: Gabriel Charette Cc: Tom Tromey, Diego Novillo, Lawrence Crowl, gcc, Collin Winter, Cary Coutant Gabriel Charette <gchare@google.com> a écrit: > Actually from my understanding, highest_location is not equal to the > last start_location, but to the last source_location returned by > either linemap_line_start or linemap_positition_for_column (which is > >>= to the start_location of the current line_map). Right, it equals the highest location yielded by any map in the system. Sorry. > >> 2/ The line number of the location >> >> 3/ The column number of the location >> > > Right, that's what I mean, I would not actually stream > highest_location. What I meant by they "all depend on only > highest_location" is that IF highest_location is the same in file B.c > and in a different compiled file C.c when they happen to include A.h, > then all of the source_locations for the tokens in A.h (in both > compilation) would be identical (i.e. token1's loc in B == token1's > loc in C, etc.) (as 2/3 always is the same since we're talking about > the same file A.h in both compilation, hence if 1 also holds, we get > the same result). > Oh, OK. >> > (i.e. if in two different contexts we happen to have the same >> > line_table->highest_location when inserting header A.h, all of the >> > tokens for A.h in each context will have the same source_location). >> >> Each token coming from a given A.h will have a different >> source_location, as they will presumably have a different {line,column} >> pair. 
>> > > What I meant is that all of the source locations handed out in the > first compilation will be the same as all of the source locations > handed out in the second compilation, pairwise (not that ALL token's > source locations themselves will be the same within a single > compilation of course!). > OK. >> If in two different contexts we happen to have the same >> line_table->highest_location, the line_map::start_location of the two >> different line maps of the two different A.hs will be equal, though. >> >> > If the previous statement is true, then we calculate an offset >> > between the line_table->highest_location as it was in the serialized >> > version of the header and as it is in the C file in which we are >> > about to de-serialize the line table. >> > >> > We then need to update some things based on that offset: >> >> [...] >> > > Hence, given that they only depend on start_location, I just have to > calculate an offset between the serialized start_location and the > start_location as it would be (highest_location + 1) in the C file > including the header, and offset all of the source_locations on each > token coming from the pph (without even needing to recalculate them!). > That could work. But then you'd need to do something for a map encoding the locations of tokens coming from the pph to appear in line_table, right? Otherwise, at lookup time, (when you want to find the map that matches the source_location of a token coming for that pph), you'll be in trouble. I am saying this b/c you are not calling linemap_line_start anymore. And that function was the one that was including the said map to line_table. And the map still must be inserted into line_table somehow. 
>> It seems to me that you could just set the line_map::start_location of >> the de-serialized map for the current portion of A.h to the current >> line_table->highest_location of the main CU your are currently parsing >> (i.e, just forget about the serialized line_table->highest_location >> into account. Actually I would think that it's unnecessary to >> serialize the line_table->highest_location) + 1. >> >> Then you should also set the de-serialized line_map::to_line to the >> current line in your context. Then you should add that line map to >> the current line_table and set line_table->highest_location to the >> line_map::start_location you have just computed. >> >> Then, assign the source_location of each token that belongs to that >> de-serialized map to a source_location that is handed out by your new >> linemap_position_for_line_and_column, with de-serialized map passed in >> argument. This is where you would progress "backward" in the token >> stream, for that given map. >> >> Then somehow, when you need to suck in a new pph (so this time you are >> going downward in the stream again), just start this dance of >> de-serializing the map for that new pph, updating its properties, >> adding it to line_table, and setting the source_location of the tokens >> coming for that pph again. >> > > Doing it this way (with the offset) I would read in all the tokens and > linemap entries inherited from that header and it's underlying include > tree, thus no need to be tricky about inserting line tables for the > header's included file, as they are part of the header's serialized > line_table by recursion (a pph'ed header can include other pph'ed > header), This is what I am not sure to understand. There is only one line table per CU. The headers included by the CU generate instances of struct line map that are inserted into the line table of the CU. 
So I don't understand what you mean by "header's serialized line_table", as I don't think there is such a thing as a header's line_table. -- Dodji ^ permalink raw reply [flat|nested] 15+ messages in thread
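Editorial sketch of why the maps themselves must end up in line_table: lookup is a binary search over maps ordered by start_location, so a token location only resolves to its header if that header's map is present. The code is a toy model with hypothetical names (`toy_lookup`, a two-field `toy_line_map`); it is not the libcpp lookup routine.

```c
#include <assert.h>
#include <stddef.h>

typedef unsigned int toy_location;

struct toy_line_map
{
  toy_location start_location; /* first location covered by this map */
  const char *to_file;         /* file the map describes */
};

/* Return the last map whose start_location is <= LOC, or NULL when LOC
   precedes every map.  Correct only if MAPS is sorted by increasing
   start_location, which is the invariant discussed in this thread.  */
static const struct toy_line_map *
toy_lookup (const struct toy_line_map *maps, size_t n_maps, toy_location loc)
{
  size_t lo = 0, hi = n_maps;

  if (n_maps == 0 || loc < maps[0].start_location)
    return NULL;
  /* Invariant: maps[lo].start_location <= loc, and every map at index
     >= hi starts after loc.  */
  while (hi - lo > 1)
    {
      size_t mid = lo + (hi - lo) / 2;
      if (maps[mid].start_location <= loc)
        lo = mid;
      else
        hi = mid;
    }
  return &maps[lo];
}
```

With maps for M.c, A.h and M.c again, a location inside A.h's range resolves to the A.h map. If the A.h map were never inserted, the same location would silently resolve to the preceding M.c map, which is the trouble at lookup time described above.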
* Re: Linemap and pph 2011-07-27 14:26 ` Dodji Seketeli @ 2011-07-27 14:51 ` Gabriel Charette 2011-07-27 15:45 ` Dodji Seketeli 0 siblings, 1 reply; 15+ messages in thread From: Gabriel Charette @ 2011-07-27 14:51 UTC (permalink / raw) To: Dodji Seketeli Cc: Tom Tromey, Diego Novillo, Lawrence Crowl, gcc, Collin Winter, Cary Coutant >> >> Hence, given that they only depend on start_location, I just have to >> calculate an offset between the serialized start_location and the >> start_location as it would be (highest_location + 1) in the C file >> including the header, and offset all of the source_locations on each >> token coming from the pph (without even needing to recalculate them!). >> > > That could work. But then you'd need to do something for a map encoding > the locations of tokens coming from the pph to appear in line_table, > right? Otherwise, at lookup time, (when you want to find the map that > matches the source_location of a token coming for that pph), you'll be > in trouble. I am saying this b/c you are not calling linemap_line_start > anymore. And that function was the one that was including the said map > to line_table. And the map still must be inserted into line_table > somehow. > The lookup, from what I understand, only depends on the line_map entries for the header to be present, and the same as they would be *had* the functions been called (not that each token actually called the linemap getters to get its location), and also of course depends on that the tokens have the correct source_locations *as if* obtained from the linemap getters. 
>> Doing it this way (with the offset) I would read in all the tokens and
>> linemap entries inherited from that header and it's underlying include
>> tree, thus no need to be tricky about inserting line tables for the
>> header's included file, as they are part of the header's serialized
>> line_table by recursion (a pph'ed header can include other pph'ed
>> header),
>
> This is what I am not sure to understand. There is only one line table
> per CU. The headers included by the CU generate instances of struct
> line map that are inserted into the line table of the CU. So I don't
> understand what you mean by "header's serialized line_table", as I don't
> think there is such a thing as a header's line_table.

What I mean by "serialized header line_table" is the serialized version of the line_table as it was when we were done parsing the header being pph'ed.

I would then de-serialize that and insert its line_map entries in the C file's line_table, doing the necessary offset adjustments in the process (and updating all other line_table variables, like highest_location, that would have changed if we had actually called the linemap functions).

Best,
Gabriel
^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: Linemap and pph 2011-07-27 14:51 ` Gabriel Charette @ 2011-07-27 15:45 ` Dodji Seketeli 0 siblings, 0 replies; 15+ messages in thread From: Dodji Seketeli @ 2011-07-27 15:45 UTC (permalink / raw) To: Gabriel Charette Cc: Tom Tromey, Diego Novillo, Lawrence Crowl, gcc, Collin Winter, Cary Coutant Gabriel Charette <gchare@google.com> a écrit: >>> >>> Hence, given that they only depend on start_location, I just have to >>> calculate an offset between the serialized start_location and the >>> start_location as it would be (highest_location + 1) in the C file >>> including the header, and offset all of the source_locations on each >>> token coming from the pph (without even needing to recalculate them!). >>> >> >> That could work. But then you'd need to do something for a map encoding >> the locations of tokens coming from the pph to appear in line_table, >> right? Otherwise, at lookup time, (when you want to find the map that >> matches the source_location of a token coming for that pph), you'll be >> in trouble. I am saying this b/c you are not calling linemap_line_start >> anymore. And that function was the one that was including the said map >> to line_table. And the map still must be inserted into line_table >> somehow. >> > > The lookup, from what I understand, only depends on the line_map > entries for the header to be present, Exactly. The line_map need to be present inside line_table->maps, so you need to insert it in there somehow. It wasn't clear to me from my reading of your previous messages. [...] >>> Doing it this way (with the offset) I would read in all the tokens and >>> linemap entries inherited from that header and it's underlying include >>> tree, thus no need to be tricky about inserting line tables for the >>> header's included file, as they are part of the header's serialized >>> line_table by recursion (a pph'ed header can include other pph'ed >>> header), >> >> This is what I am not sure to understand. 
There is only one line table >> per CU. The headers included by the CU generate instances of struct >> line map that are inserted into the line table of the CU. So I don't >> understand what you mean by "header's serialized line_table", as I don't >> think there is such a thing as a header's line_table. > > What I mean by "serialized header line_table" is the serialized > version of the line_table as it was when were done parsing the header > being pph'ed. OK. Please note that you don't necessarily need to serialize the entire line_map at the parsing point of the pph. I believe that just serializing the line maps of the pph'ed header should suffice. How to determine them is another question. :-) > I would then de-serialize that and insert it's line_map entries in the > C file's line_table, doing the necessary offset adjustements in the > process (and updating all other line_table variables like > highest_location that would have changed if we had actually called the > linemap functions) OK. We are on the same page now. Thanks. -- Dodji ^ permalink raw reply [flat|nested] 15+ messages in thread
end of thread, other threads: [~2011-07-27 14:51 UTC]

Thread overview: 15+ messages
2011-07-22 20:40 Linemap and pph Gabriel Charette
2011-07-22 21:15 ` Tom Tromey
2011-07-22 22:12 ` Gabriel Charette
2011-07-23 5:46 ` Tom Tromey
2011-07-23 17:54 ` Dodji Seketeli
2011-07-25 19:34 ` Gabriel Charette
2011-07-25 22:55 ` Gabriel Charette
2011-07-26 14:46 ` Dodji Seketeli
2011-07-26 18:49 ` Gabriel Charette
2011-07-26 22:54 ` Gabriel Charette
2011-07-27 10:11 ` Dodji Seketeli
2011-07-27 14:17 ` Gabriel Charette
2011-07-27 14:26 ` Dodji Seketeli
2011-07-27 14:51 ` Gabriel Charette
2011-07-27 15:45 ` Dodji Seketeli