public inbox for gnats-devel@sourceware.org
* 4.0 beta - question on parsing of subject line in PR header
@ 2002-05-10 16:21 Mel Hatzis
  2002-05-11  4:21 ` Lars Henriksen
  0 siblings, 1 reply; 13+ messages in thread
From: Mel Hatzis @ 2002-05-10 16:21 UTC (permalink / raw)
  To: help-gnats

Can anyone tell me why the regular expression match for
the subject header was changed so that it no longer
supports subjects such as 'Re: category/num'?

Here's the relevant cvs log entry...

revision 1.44
date: 2001/12/23 20:22:20;  author: pdm;  state: Exp;  lines: +22 -41
(checkIfReply): Matching changed.

Previously, the regular expression was:

  "(.*re[ \t]*(\\[[0-9]+\\])?:)?[ \t]*([-a-z0-9_+.]*[:/][ \t]*([0-9]+))"

and it's been changed to:

  "\\<((PR[ \t/])|([-a-z0-9_+.]+)/)([0-9]+)"

This breaks the ability to reply to an existing PR and have
the reply added to its Audit-Trail.

--
Mel Hatzis
Juniper Networks, Inc.


_______________________________________________
Help-gnats mailing list
Help-gnats@gnu.org
http://mail.gnu.org/mailman/listinfo/help-gnats


* Re: 4.0 beta - question on parsing of subject line in PR header
  2002-05-10 16:21 4.0 beta - question on parsing of subject line in PR header Mel Hatzis
@ 2002-05-11  4:21 ` Lars Henriksen
  2002-05-11 10:46   ` Mel Hatzis
  0 siblings, 1 reply; 13+ messages in thread
From: Lars Henriksen @ 2002-05-11  4:21 UTC (permalink / raw)
  To: Mel Hatzis; +Cc: help-gnats

On Fri, May 10, 2002 at 04:20:31PM -0700, Mel Hatzis wrote:
> Can anyone tell me why the regular expression match for
> the subject header was changed so that it no longer
> supports subjects such as 'Re: category/num'?

Take a look at the list archives. There was a thorough discussion
of this in December, Subject: Subject line processing in Gnats 4.0.
A couple of your colleagues participated.

Lars Henriksen


* Re: 4.0 beta - question on parsing of subject line in PR header
  2002-05-11  4:21 ` Lars Henriksen
@ 2002-05-11 10:46   ` Mel Hatzis
  2002-05-15 10:00     ` Milan Zamazal
  0 siblings, 1 reply; 13+ messages in thread
From: Mel Hatzis @ 2002-05-11 10:46 UTC (permalink / raw)
  To: Lars Henriksen; +Cc: help-gnats

[-- Attachment #1: Type: text/plain, Size: 1858 bytes --]


Lars Henriksen wrote:

>On Fri, May 10, 2002 at 04:20:31PM -0700, Mel Hatzis wrote:
>
>>Can anyone tell me why the regular expression match for
>>the subject header was changed so that it no longer
>>supports subjects such as 'Re: category/num'?
>>
>
>Take a look at the list archives. There was a thorough discussion
>of this in December, Subject: Subject line processing in Gnats 4.0.
>A couple of your colleagues participated.
>
Yes...I consulted with a couple of my colleagues after sending out the
help request. Thank you.

After understanding this a little more, we determined that there
was definitely a bug here. The regular expression used is incorrect
...for one, it requires a '\<' as the start of the subject line in order to
match. It is also missing an escape character before the '|' and is
incorrectly anchored to the beginning of the subject line.

I have attached a patch. For reference, here's the regular expression
from the patch:

(.*[ \t:])?((PR[ \t/])\\|([-a-z0-9_+.]+/))([0-9]+)

The patch allows for the following subject
lines:

  Fwd: Re: category/50
  Re:<tab>category/50
  Re: PR 50
  Re PR/50
  Re:PR 50 (note that there's no space after the colon)
  category/50
  PR<tab>50
  PR/50
  PR 50

with as much preceding or trailing gunk as desired. Note that trailing
gunk need not be separated by white space.

I added the colon to the preceding text separator as an afterthought...
thinking it might be useful. Since the existing regex didn't allow for
categories with colons in them, this seemed like a safe addition.

There are a few corner cases where this may not result in the most
desirable behaviour, such as:

   Re: PR 50 (was: Re: PR/75)

which matches PR/75.

However, for each such case, there's generally a counter argument...

   Re: closed PR 50 (fix documented in PR/75)
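
For illustration, the "last reference wins" behaviour described above can be
reproduced outside GNATS. This is only a minimal sketch under a few
assumptions: it uses POSIX regcomp()/regexec() with REG_EXTENDED instead of
the GNU re_compile_pattern()/re_match() calls in file-pr.c, so the
alternation is written as a plain '|' rather than '\\|'; the pattern is
anchored with '^' to mimic re_match(), which only matches at the start of the
string; and the helper name subject_pr_number is hypothetical.

```c
#include <regex.h>
#include <stdlib.h>

/* Hypothetical helper, not part of GNATS: returns the PR number found in
   SUBJECT, or -1 if the subject does not reference a PR.  */
long
subject_pr_number (const char *subject)
{
  /* Group 1: optional leading text ending in a space, tab or colon.
     Group 2: "PR" plus separator, or "category/".
     Group 5: the PR number.
     The greedy ".*" in group 1 swallows as much as it can, so the last
     PR reference on the line is the one that ends up in group 5.  */
  const char *pat = "^(.*[ \t:])?((PR[ \t/])|([-a-z0-9_+.]+/))([0-9]+)";
  regex_t re;
  regmatch_t m[6];
  long result = -1;

  if (regcomp (&re, pat, REG_EXTENDED | REG_ICASE) != 0)
    return -1;
  if (regexec (&re, subject, 6, m, 0) == 0 && m[5].rm_so != -1)
    {
      /* strtol stops at the first non-digit, so trailing text is ignored. */
      result = strtol (subject + m[5].rm_so, NULL, 10);
    }
  regfree (&re);
  return result;
}
```

Calling subject_pr_number ("Re: PR 50 (was: Re: PR/75)") picks up the 75, not
the 50, which is the corner case mentioned above.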

--
Mel Hatzis
Juniper Networks, Inc.

[-- Attachment #2: diffs --]
[-- Type: text/plain, Size: 2003 bytes --]

Index: file-pr.c
===================================================================
RCS file: /cvsroot/gnats/gnats/gnats/file-pr.c,v
retrieving revision 1.45
diff -b -u -p -r1.45 file-pr.c
--- file-pr.c	10 Feb 2002 18:23:42 -0000	1.45
+++ file-pr.c	11 May 2002 08:33:40 -0000
@@ -572,7 +572,7 @@ checkIfReply (PR *pr, ErrorDesc *err)
   const char *headerValue;
   struct re_pattern_buffer regex;
   struct re_registers regs;
-  int i, start, end, idstart;
+  int i, start, end, idstart, idend;
   char case_fold[256];
   char *possiblePrNum;
   reg_syntax_t old_syntax;
@@ -594,7 +594,7 @@ checkIfReply (PR *pr, ErrorDesc *err)
   regex.translate = case_fold;
   
   {
-    const char *const PAT = "\\<((PR[ \t/])|([-a-z0-9_+.]+)/)([0-9]+)";
+    const char *const PAT = "(.*[ \t:])?((PR[ \t/])\\|([-a-z0-9_+.]+/))([0-9]+)";
     re_compile_pattern (PAT, strlen (PAT), &regex);
   }
   i = re_match (&regex, headerValue, strlen (headerValue), 0, &regs);
@@ -607,9 +607,10 @@ checkIfReply (PR *pr, ErrorDesc *err)
       return NULL;
     }
 
-  start = regs.start[0];
-  end = regs.end[0];
-  idstart = regs.start[4] - start;
+  start = regs.start[2];
+  end = regs.end[2];
+  idstart = regs.start[5];
+  idend = regs.end[5];
 
   free (regs.start);
   free (regs.end);
@@ -618,7 +619,7 @@ checkIfReply (PR *pr, ErrorDesc *err)
   memcpy (possiblePrNum, headerValue + start, end - start);
   possiblePrNum[end - start] = '\0';
 
-  *(possiblePrNum + idstart - 1) = '\0';
+  *(possiblePrNum + end -start - 1) = '\0';
 
   /* See if the category exists: */
   cat = get_adm_record (CATEGORY (pr->database), possiblePrNum);
@@ -632,7 +633,9 @@ checkIfReply (PR *pr, ErrorDesc *err)
     {
       /* We only needed res, never cat, so free cat. */
       free_adm_entry (cat);
-      prID = xstrdup (possiblePrNum + idstart);
+      prID = xmalloc(idend - idstart + 1);
+      memcpy(prID, headerValue + idstart, idend - idstart);
+      *(prID + idend - idstart) = '\0';
     }
 
   free (possiblePrNum);


* Re: 4.0 beta - question on parsing of subject line in PR header
  2002-05-11 10:46   ` Mel Hatzis
@ 2002-05-15 10:00     ` Milan Zamazal
  2002-05-15 13:25       ` Mel Hatzis
  0 siblings, 1 reply; 13+ messages in thread
From: Milan Zamazal @ 2002-05-15 10:00 UTC (permalink / raw)
  To: hatzis; +Cc: Lars Henriksen, help-gnats

>>>>> "MH" == Mel Hatzis <hatzis@juniper.net> writes:

    MH> After understanding this a little more, we determined that there
    MH> was definitely a bug here. The regular expression used is
    MH> incorrect ...for one, it requires a '\<' as the start of the
    MH> subject line in order to match. It is also missing an escape
    MH> character before the '|' and is incorrectly anchored to the
    MH> beginning of the subject line.

    MH> I have attached a patch. 

Thank you, Mel, for the patch.  Before I apply it, I'd like to clarify
one little thing: Do you know why `\\<' didn't work?  It should mean
"beginning of a word".  Should it be notated in a different way (other
than the one you've used)?

Regards,

Milan Zamazal

-- 
The world is not something you can wrap your head around without needing years
of experience.                              -- Kent M. Pitman in comp.lang.lisp


* Re: 4.0 beta - question on parsing of subject line in PR header
  2002-05-15 10:00     ` Milan Zamazal
@ 2002-05-15 13:25       ` Mel Hatzis
  2002-05-15 16:21         ` Mel Hatzis
  2002-05-16 16:27         ` Milan Zamazal
  0 siblings, 2 replies; 13+ messages in thread
From: Mel Hatzis @ 2002-05-15 13:25 UTC (permalink / raw)
  To: Milan Zamazal; +Cc: Lars Henriksen, help-gnats

Milan Zamazal wrote:
>>>>>>"MH" == Mel Hatzis <hatzis@juniper.net> writes:
>>>>>>
> 
>     MH> After understanding this a little more, we determined that there
>     MH> was definitely a bug here. The regular expression used is
>     MH> incorrect ...for one, it requires a '\<' as the start of the
>     MH> subject line in order to match. It is also missing an escape
>     MH> character before the '|' and is incorrectly anchored to the
>     MH> beginning of the subject line.
> 
>     MH> I have attached a patch. 
> 
> Thank you, Mel, for the patch.  Before I apply it, I'd like to clarify
> one little thing: Do you know why `\\<' didn't work?  It should mean
> "beginning of a word".  Should it be notated in a different way (other
> than the one you've used)?
> 

Milan, I was incorrect regarding the use of '\\<'...it does work.
You can change the pattern match to the following:

   "(.*[^\\<])?\\<((PR[ \t/])\\|([-a-z0-9_+.]+/))([0-9]+)"

I tested this and it works. I guess this is what was originally intended.

--
Mel Hatzis
Juniper Networks, Inc.



* Re: 4.0 beta - question on parsing of subject line in PR header
  2002-05-15 13:25       ` Mel Hatzis
@ 2002-05-15 16:21         ` Mel Hatzis
  2002-05-16  1:00           ` Yngve Svendsen
  2002-05-16 16:27         ` Milan Zamazal
  1 sibling, 1 reply; 13+ messages in thread
From: Mel Hatzis @ 2002-05-15 16:21 UTC (permalink / raw)
  To: Milan Zamazal; +Cc: Lars Henriksen, help-gnats

Mel Hatzis wrote:
> Milan Zamazal wrote:
> 
>>>>>>> "MH" == Mel Hatzis <hatzis@juniper.net> writes:
>>>>>>>
>>
>>     MH> After understanding this a little more, we determined that there
>>     MH> was definitely a bug here. The regular expression used is
>>     MH> incorrect ...for one, it requires a '\<' as the start of the
>>     MH> subject line in order to match. It is also missing an escape
>>     MH> character before the '|' and is incorrectly anchored to the
>>     MH> beginning of the subject line.
>>
>>     MH> I have attached a patch.
>> Thank you, Mel, for the patch.  Before I apply it, I'd like to clarify
>> one little thing: Do you know why `\\<' didn't work?  It should mean
>> "beginning of a word".  Should it be notated in a different way (other
>> than the one you've used)?
>>
> 
> Milan, I was incorrect regarding the use of '\\<'...it does work.
> You can change the pattern match to the following:
> 
>   "(.*[^\\<])?\\<((PR[ \t/])\\|([-a-z0-9_+.]+/))([0-9]+)"
> 
> I tested this and it works. I guess this is what was originally intended.
> 

Thinking about this some more, it would be *really* useful to allow this
regular expression to be overridden in the dbconfig file. Allowing users
to define their own would provide them the flexibility to get as fancy
as they want with some of the corner cases where the default regex behaviour
is not desirable...such as "Re: PR 50 (was PR 33)" which matches PR 33.

Thoughts?
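
As an illustration of what a site-specific pattern could buy, here is a
minimal sketch that prefers the first PR reference on the line instead of the
last. The assumptions are worth spelling out: POSIX regcomp()/regexec() in
extended syntax stand in for the GNU regex calls GNATS uses, the pattern and
the helper name first_pr_number are made up for this example, and the '\\<'
word-boundary operator is left out because POSIX does not define it.
regexec() reports the leftmost match, so dropping the greedy ".*" prefix is
enough to make the earliest reference win.

```c
#include <regex.h>
#include <stdlib.h>

/* Hypothetical helper, not part of GNATS: returns the first PR number
   referenced in SUBJECT, or -1 if there is none.  */
long
first_pr_number (const char *subject)
{
  /* Group 1: "PR" plus separator, or "category/".
     Group 2: the PR number.
     There is no leading ".*", so the unanchored search stops at the
     leftmost place where the pattern can match.  */
  const char *pat = "(PR[ \t/]|[-a-z0-9_+.]+/)([0-9]+)";
  regex_t re;
  regmatch_t m[3];
  long result = -1;

  if (regcomp (&re, pat, REG_EXTENDED | REG_ICASE) != 0)
    return -1;
  if (regexec (&re, subject, 3, m, 0) == 0 && m[2].rm_so != -1)
    {
      /* strtol stops at the first non-digit after the number. */
      result = strtol (subject + m[2].rm_so, NULL, 10);
    }
  regfree (&re);
  return result;
}
```

With this pattern "Re: PR 50 (was PR 33)" yields 50. The price is that,
without a word-boundary check, text such as "xPR 50" would match as well.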

--
Mel Hatzis
Juniper Networks, Inc.



* Re: 4.0 beta - question on parsing of subject line in PR header
  2002-05-15 16:21         ` Mel Hatzis
@ 2002-05-16  1:00           ` Yngve Svendsen
  2002-05-16 16:27             ` Milan Zamazal
  0 siblings, 1 reply; 13+ messages in thread
From: Yngve Svendsen @ 2002-05-16  1:00 UTC (permalink / raw)
  To: Mel Hatzis, Milan Zamazal; +Cc: Lars Henriksen, help-gnats

At 16:19 15.05.2002 -0700, Mel Hatzis wrote:
>Thinking about this some more, it would be *really* useful to allow this
>regular expression to be overridden in the dbconfig file. Allowing users
>to define their own would provide them the flexibility to get as fancy
>as they want with some of the corner cases where the default regex behaviour
>is not desirable...such as "Re: PR 50 (was PR 33)" which matches PR 33.
>
>Thoughts?

I agree strongly with this.

Yngve Svendsen
Senior Engineer, HA Data Management
Sun Microsystems
yngve.svendsen@sun.com



* Re: 4.0 beta - question on parsing of subject line in PR header
  2002-05-15 13:25       ` Mel Hatzis
  2002-05-15 16:21         ` Mel Hatzis
@ 2002-05-16 16:27         ` Milan Zamazal
  2002-05-17  7:24           ` Chad Walstrom
  1 sibling, 1 reply; 13+ messages in thread
From: Milan Zamazal @ 2002-05-16 16:27 UTC (permalink / raw)
  To: Mel Hatzis; +Cc: help-gnats

>>>>> "MH" == Mel Hatzis <hatzis@juniper.net> writes:

    MH> Milan, I was incorrect regarding the use of '\\<'...it does
    MH> work.  You can change the pattern match to the following:

    MH>    "(.*[^\\<])?\\<((PR[ \t/])\\|([-a-z0-9_+.]+/))([0-9]+)"

    MH> I tested this and it works. I guess this is what was originally
    MH> intended.

Yes, thanks, Mel.  BTW, I've found that Andrew Gray has already
suggested a similar thing in March.  I'm sorry for the duplicate work, I
was busy with some RL events since February :-(.  Anyway, subject
parsing should be fixed now.

Regards,

Milan Zamazal

-- 
Wasting somebody else's time strikes me as the height of rudeness.
						      Bill Gates


* Re: 4.0 beta - question on parsing of subject line in PR header
  2002-05-16  1:00           ` Yngve Svendsen
@ 2002-05-16 16:27             ` Milan Zamazal
  2002-05-23 11:42               ` Mel Hatzis
  0 siblings, 1 reply; 13+ messages in thread
From: Milan Zamazal @ 2002-05-16 16:27 UTC (permalink / raw)
  To: Yngve Svendsen; +Cc: Mel Hatzis, help-gnats

>>>>> "YS" == Yngve Svendsen <yngve.svendsen@sun.com> writes:

    YS> At 16:19 15.05.2002 -0700, Mel Hatzis wrote:
    >> Thinking about this some more, it would be *really* useful to
    >> allow this regular expression to be overridden in the dbconfig
    >> file. Allowing users to define their own would provide them the
    >> flexibility to get as fancy as they want with some of the corner
    >> cases where the default regex behaviour is not desirable...such
    >> as "Re: PR 50 (was PR 33)" which matches PR 33.
    >> 
    >> Thoughts?

    YS> I agree strongly with this.

Me too.  If someone sends appropriate patches, I'll be happy to
incorporate them.  (If anyone wants to do it before the 4.0 release,
please test it carefully and include documentation patches as well.)

Regards,

Milan Zamazal

-- 
It's amazing how much better you feel once you've given up hope.
                                                (unknown source)


* Re: 4.0 beta - question on parsing of subject line in PR header
  2002-05-16 16:27         ` Milan Zamazal
@ 2002-05-17  7:24           ` Chad Walstrom
  2002-05-19 13:40             ` Milan Zamazal
  0 siblings, 1 reply; 13+ messages in thread
From: Chad Walstrom @ 2002-05-17  7:24 UTC (permalink / raw)
  To: help-gnats

[-- Attachment #1: Type: text/plain, Size: 744 bytes --]

On Fri, May 17, 2002 at 12:12:45AM +0200, Milan Zamazal wrote:
> Yes, thanks, Mel.  BTW, I've found that Andrew Gray has already
> suggested a similar thing in March.  I'm sorry for the duplicate work,
> I was busy with some RL events since February :-(.  Anyway, subject
> parsing should be fixed now.

THANK YOU!  I've been waiting for Gray's patch to make it in.  No
biggie.  I had applied the patch locally and created some local *.deb's
in the meantime.  In hindsight, I suppose I could have made these
available on my Debian public_html directory or done an NMU.

Oh, well. ;-)

-- 
Chad Walstrom <chewie@wookimus.net>                 | a.k.a. ^chewie
http://www.wookimus.net/                            | s.k.a. gunnarr

[-- Attachment #2: Type: application/pgp-signature, Size: 232 bytes --]


* Re: 4.0 beta - question on parsing of subject line in PR header
  2002-05-17  7:24           ` Chad Walstrom
@ 2002-05-19 13:40             ` Milan Zamazal
  0 siblings, 0 replies; 13+ messages in thread
From: Milan Zamazal @ 2002-05-19 13:40 UTC (permalink / raw)
  To: Chad Walstrom; +Cc: help-gnats

>>>>> "CW" == Chad Walstrom <chewie@wookimus.net> writes:

    CW> I had applied the patch locally and created some local *.deb's
    CW> in the meantime.  In hindsight, I suppose I could have made
    CW> these available on my Debian public_html directory or done an
    CW> NMU.

Thank you, but it's not necessary -- I'm going to upload a new version
as soon as I've written a few words about Bug#132961 in README.Debian.

Regards,

Milan Zamazal

-- 
Omigod, it's a flame war about a flame war.  You know, a meta-flame war!
                                                 Kenny Tilton in comp.lang.lisp


* Re: 4.0 beta - question on parsing of subject line in PR header
  2002-05-16 16:27             ` Milan Zamazal
@ 2002-05-23 11:42               ` Mel Hatzis
  2002-05-26  8:37                 ` Milan Zamazal
  0 siblings, 1 reply; 13+ messages in thread
From: Mel Hatzis @ 2002-05-23 11:42 UTC (permalink / raw)
  To: Milan Zamazal; +Cc: Yngve Svendsen, help-gnats

[-- Attachment #1: Type: text/plain, Size: 2159 bytes --]

Milan Zamazal wrote:
>>>>>>"YS" == Yngve Svendsen <yngve.svendsen@sun.com> writes:
>>>>>>
> 
>     YS> At 16:19 15.05.2002 -0700, Mel Hatzis wrote:
>     >> Thinking about this some more, it would be *really* useful to
>     >> allow this regular expression to be overridden in the dbconfig
>     >> file. Allowing users to define their own would provide them the
>     >> flexibility to get as fancy as they want with some of the corner
>     >> cases where the default regex behaviour is not desirable...such
>     >> as "Re: PR 50 (was PR 33)" which matches PR 33.
>     >> 
>     >> Thoughts?
> 
>     YS> I agree strongly with this.
> 
> Me too.  If someone sends appropriate patches, I'll be happy to
> incorporate them.  (If anyone wants to do it before the 4.0 release,
> please test it carefully and include documentation patches as well.)
> 

OK, I've attached a patchfile for this.

You can now (optionally) include the following in the database-info
section of the dbconfig file:

     # The regular expression used to determine whether a PR is referenced
     # on an email subject line
     subject-matching {
         "\\<((PR[ \t/])\\|([-a-z0-9_+.]+/))([0-9]+)"
         capture-group "4"
     }

(The above example is equivalent to the built-in default.)

Regarding documentation, I've updated the man page for dbconfig. One thing
I didn't mention is that if you include a regex which doesn't compile, GNATS
will revert to using the default - and will also email the gnats-admin via punt.

Invalid values for the capture-group similarly generate email to gnats-admin,
though in this case, GNATS does not revert to the default - it simply assumes
no PRs are referenced on the subject line. I'm not clear on whether this
too should revert to using the default, but decided against it since the
associated regex would be valid in this case.

A possible enhancement would be to allow multiple regular expressions to
be specified, each with their own capture group. Alternatively, it might
be nice to allow multiple capture groups, making it possible to update
multiple PRs using a single submission.
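
The fallback rules described above can be sketched in isolation as follows.
This is not the code from the attached patch: it assumes POSIX
regcomp() in extended syntax rather than GNU re_compile_pattern(), all type
and function names are hypothetical, and it checks the capture-group index
once, when the pattern is compiled, using the re_nsub count, rather than
after each search. Treating group 0 (the whole match) as invalid is a choice
made for this sketch only.

```c
#include <regex.h>
#include <stddef.h>

/* Outcome of setting up the subject matcher (hypothetical names). */
enum subject_setup
{
  SUBJECT_USER_PATTERN,     /* configured pattern and group are in use */
  SUBJECT_DEFAULT_PATTERN,  /* no usable configured pattern; default in use */
  SUBJECT_BAD_GROUP,        /* pattern compiled, but the group does not exist */
  SUBJECT_SETUP_FAILED      /* not even the default pattern compiled */
};

/* Compile USER_PAT if possible, otherwise DEFAULT_PAT.  On
   SUBJECT_USER_PATTERN or SUBJECT_DEFAULT_PATTERN the caller owns *RE and
   must regfree() it; *GROUP is the capture group holding the PR number.  */
enum subject_setup
setup_subject_regex (regex_t *re, size_t *group,
                     const char *user_pat, size_t user_group,
                     const char *default_pat, size_t default_group)
{
  if (user_pat != NULL
      && regcomp (re, user_pat, REG_EXTENDED | REG_ICASE) == 0)
    {
      /* re_nsub is the number of parenthesised groups in the pattern. */
      if (user_group == 0 || user_group > re->re_nsub)
        {
          /* Valid regex, unusable group: do not fall back to the default,
             mirroring the behaviour described above.  */
          regfree (re);
          return SUBJECT_BAD_GROUP;
        }
      *group = user_group;
      return SUBJECT_USER_PATTERN;
    }

  /* Nothing configured, or it failed to compile: use the default. */
  if (regcomp (re, default_pat, REG_EXTENDED | REG_ICASE) != 0)
    return SUBJECT_SETUP_FAILED;
  *group = default_group;
  return SUBJECT_DEFAULT_PATTERN;
}
```

A caller would send mail to gnats-admin on SUBJECT_DEFAULT_PATTERN (when a
pattern was configured) and on SUBJECT_BAD_GROUP, and skip subject matching
entirely in the latter case.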

--
Mel Hatzis
Juniper Networks, Inc.

[-- Attachment #2: patch_subj_regex --]
[-- Type: text/plain, Size: 8489 bytes --]

Index: database.c
===================================================================
RCS file: /cvsroot/gnats/gnats/gnats/database.c,v
retrieving revision 1.23
diff -b -u -p -r1.23 database.c
--- database.c	29 Oct 2001 22:40:26 -0000	1.23
+++ database.c	23 May 2002 17:17:10 -0000
@@ -29,6 +29,7 @@ struct databaseInfo
   int keepReceivedHeadersFlag;
   int notifyExpireFlag;
   char *binDirValue;
+  SubjectMatchInfo subjectMatch;
   int submitterAckFlag;
   unsigned int businessDay[2];
   unsigned int businessWeek[2];
@@ -69,6 +70,7 @@ newDatabaseInfo (void)
   res->keepReceivedHeadersFlag = 0;
   res->notifyExpireFlag = 0;
   res->binDirValue = NULL;
+  res->subjectMatch = NULL;
   res->submitterAckFlag = 0;
   res->businessDay[0] = 0; res->businessDay[1] = 0;
   res->businessWeek[0] = 0; res->businessWeek[1] = 0;
@@ -248,6 +250,23 @@ setCategoryDirPerms (DatabaseInfo databa
 }
 
 void
+setSubjectMatch (DatabaseInfo database, const char *regex, const char *prIdx)
+{
+  if (databaseValid (database))
+    {
+      if (database->subjectMatch != NULL)
+	{
+	  free (database->subjectMatch->regex);
+	  free (database->subjectMatch);
+	}
+      database->subjectMatch =
+        (SubjectMatchInfo) xmalloc (sizeof (struct subjectMatchInfo));
+      database->subjectMatch->regex = xstrdup (regex);
+      database->subjectMatch->prIdx = atoi (prIdx);
+    }
+}
+
+void
 addGlobalChangeActions (DatabaseInfo database, ChangeActions actions)
 {
   if (databaseValid (database))
@@ -531,6 +550,19 @@ categoryDirPerms (const DatabaseInfo dat
     }
 }
 
+SubjectMatchInfo
+subjectMatch (const DatabaseInfo database)
+{
+  if (databaseValid (database))
+    {
+      return database->subjectMatch;
+    }
+  else
+    {
+      return NULL;
+    }
+}
+
 char *
 gnats_adm_dir (const DatabaseInfo database, const char *filename)
 {
@@ -964,6 +996,11 @@ freeDatabaseInfo (DatabaseInfo database)
       if (database->binDirValue != NULL)
 	{
 	  free (database->binDirValue);
+	}
+      if (database->subjectMatch != NULL)
+	{
+	  free (database->subjectMatch->regex);
+	  free (database->subjectMatch);
 	}
       if (database->next != NULL)
 	{
Index: database.h
===================================================================
RCS file: /cvsroot/gnats/gnats/gnats/database.h,v
retrieving revision 1.12
diff -b -u -p -r1.12 database.h
--- database.h	4 Jul 2001 18:26:28 -0000	1.12
+++ database.h	23 May 2002 17:17:10 -0000
@@ -24,6 +24,13 @@ Software Foundation, 59 Temple Place - S
 struct databaseInfo;
 typedef struct databaseInfo * DatabaseInfo;
 
+struct subjectMatchInfo
+{
+    char *regex;
+    unsigned int prIdx;
+};
+typedef struct subjectMatchInfo *SubjectMatchInfo;
+
 #include "adm.h"
 #include "mail.h"
 
@@ -93,6 +100,9 @@ extern void addGlobalChangeActions (Data
 				    ChangeActions actions);
 extern void setCategoryDirPerms (DatabaseInfo database, 
 				 const char *value);
+extern void setSubjectMatch(DatabaseInfo database,
+                            const char *regex,
+                            const char *prIdx);
 extern void setInputTemplate (DatabaseInfo database,
 			      InputTemplate *template);
 extern void setQueryFormatList (DatabaseInfo database,
@@ -122,6 +132,7 @@ extern ChangeActions globalChangeActions
 extern int createCategoryDirs (const DatabaseInfo database);
 QueryFormat *getAuditTrailFormat (const DatabaseInfo database);
 extern int categoryDirPerms (const DatabaseInfo database);
+extern SubjectMatchInfo subjectMatch (const DatabaseInfo database);
 extern IndexDesc getIndexDesc (const DatabaseInfo database);
 extern InputTemplate *getInputTemplate (const DatabaseInfo database);
 extern QueryFormat *getQueryFormatList (const DatabaseInfo database);
Index: fconfig.y
===================================================================
RCS file: /cvsroot/gnats/gnats/gnats/fconfig.y,v
retrieving revision 1.35
diff -b -u -p -r1.35 fconfig.y
--- fconfig.y	8 Dec 2001 20:21:20 -0000	1.35
+++ fconfig.y	23 May 2002 17:17:10 -0000
@@ -43,8 +43,8 @@
 %token BODYTOK HEADERTOK AUDITTRAILFMTTOK ADDAUDITTRAILTOK 
 %token REQUIRECHANGEREASONTOK READONLYTOK BINARYINDEXTOK RAWTOK
 %token BADTOK AUXFLAGSTOK PRLISTTOK MAXPRSTOK EDITONLYTOK VIRTUALFORMATTOK
-%token CATPERMSTOK
-%type <sval> optChangeExpr
+%token CATPERMSTOK SUBJECTMATCHINGTOK CAPTUREGROUPTOK
+%type <sval> optChangeExpr captureGroup
 %type <qstr> QSTRING
 %type <intval> INTVAL
 %type <adm_field_des> enumFieldList enumFieldMember
@@ -104,6 +104,12 @@ databaseInfoEnt	: DEBUGMODETOK booleanVa
 		| CATPERMSTOK QSTRING {
 		    setCategoryDirPerms (databaseBeingDefined, qStrVal ($2));
 		}
+		| SUBJECTMATCHINGTOK '{' QSTRING captureGroup '}' {
+		    setSubjectMatch(databaseBeingDefined, qStrVal ($3), $4);
+		}
+		;
+
+captureGroup	: CAPTUREGROUPTOK QSTRING { $$ = takeQString ($2); }
 		;
 
 booleanVal	: FALSETOK { $$ = 0; }
Index: fconfigl.l
===================================================================
RCS file: /cvsroot/gnats/gnats/gnats/fconfigl.l,v
retrieving revision 1.23
diff -b -u -p -r1.23 fconfigl.l
--- fconfigl.l	29 Oct 2001 22:40:38 -0000	1.23
+++ fconfigl.l	23 May 2002 17:17:10 -0000
@@ -305,6 +305,14 @@ create-category-dirs {
     return CREATECATEGORYDIRSTOK;
 }
 
+subject-matching {
+    return SUBJECTMATCHINGTOK;
+}
+
+capture-group {
+    return CAPTUREGROUPTOK;
+}
+
 false {
     return FALSETOK;
 }
Index: file-pr.c
===================================================================
RCS file: /cvsroot/gnats/gnats/gnats/file-pr.c,v
retrieving revision 1.46
diff -b -u -p -r1.46 file-pr.c
--- file-pr.c	16 May 2002 23:24:57 -0000	1.46
+++ file-pr.c	23 May 2002 17:17:11 -0000
@@ -576,6 +576,9 @@ checkIfReply (PR *pr, ErrorDesc *err)
   char case_fold[256];
   char *possiblePrNum;
   reg_syntax_t old_syntax;
+  SubjectMatchInfo subject_match;
+  unsigned prIdx = 4;
+  const char *retval = NULL;
 
   headerValue = header_value (pr, SUBJECT);
 
@@ -593,16 +596,37 @@ checkIfReply (PR *pr, ErrorDesc *err)
     }
   regex.translate = case_fold;
   
+  subject_match = subjectMatch (pr->database);
+  if (subject_match != NULL)
   {
-    const char *const PAT = "\\<((PR[ \t/])\\|([-a-z0-9_+.]+)/)([0-9]+)";
+      retval = re_compile_pattern (subject_match->regex,
+                                   strlen (subject_match->regex), &regex);
+      if (retval == NULL)
+        {
+          prIdx = subject_match->prIdx; /* regex compiled successfully */
+        }
+      else
+        {
+          const char *PAT = "\\<((PR[ \t/])\\|([-a-z0-9_+.]+)/)([0-9]+)";
+          punt (pr->database, 0,
+              "Invalid regular expression defined for 'subject-matching' in dbconfig: %s\n\nReverting to default regular expression.\n\n", retval);
     re_compile_pattern (PAT, strlen (PAT), &regex);
   }
+    }
+
   i = re_search (&regex, headerValue, strlen (headerValue), 0,
 		 strlen (headerValue), &regs);
   regex.translate = NULL;
   regfree (&regex);
   re_set_syntax (old_syntax);
 
+  if (prIdx >= regs.num_regs || regs.start[prIdx] == -1)
+    {
+      punt (pr->database, 0,
+            "illegal capture-group defined for subject-matching in dbconfig\n");
+      return NULL;
+    }
+
   if (i < 0)
     {
       return NULL;
@@ -610,7 +634,7 @@ checkIfReply (PR *pr, ErrorDesc *err)
 
   start = regs.start[0];
   end = regs.end[0];
-  idstart = regs.start[4] - start;
+  idstart = regs.start[prIdx] - start;
 
   free (regs.start);
   free (regs.end);
Index: man/dbconfig.man
===================================================================
RCS file: /cvsroot/gnats/gnats/gnats/man/dbconfig.man,v
retrieving revision 1.1
diff -b -u -p -r1.1 dbconfig.man
--- man/dbconfig.man	10 Mar 2000 04:49:29 -0000	1.1
+++ man/dbconfig.man	23 May 2002 17:17:11 -0000
@@ -131,6 +131,18 @@ to \fItrue\fR.
 .P
 The default value is \fItrue\fR.
 .RE
+.TP
+\fBsubject-matching\fR { "\fIregexp\fR" \fBcapture-group\fR "\fIinteger\fR" }
+Specifies the regular expression used by GNATS to determine whether
+the subject line in a PR header references a PR. The regular expression
+must capture the PR number.  The integer specified by capture-group identifies
+which pattern match group contains the PR number.
+.RS 0.5i
+.P
+If subject-matching is not specified, GNATS uses
+"\\\\<((PR[ \\t/])\\\\|([-a-z0-9_+.]+/))([0-9]+)"
+as the default for \fIregexp\fR with 4 as the associated capture-group integer.
+.RE
 
 .SH "Individual field configuration"
 Each field in a PR is described with a field entry. It has the general 


* Re: 4.0 beta - question on parsing of subject line in PR header
  2002-05-23 11:42               ` Mel Hatzis
@ 2002-05-26  8:37                 ` Milan Zamazal
  0 siblings, 0 replies; 13+ messages in thread
From: Milan Zamazal @ 2002-05-26  8:37 UTC (permalink / raw)
  To: Mel Hatzis; +Cc: Yngve Svendsen, help-gnats

>>>>> "MH" == Mel Hatzis <hatzis@juniper.net> writes:

    MH> OK, I've attached a patchfile for this.

Thank you, Mel.  Could you please send me the ChangeLog entries for your
changes?

While you're at it, would you like to unify the regexp syntax with the
rest of GNATS, to avoid a mess (I know you're not responsible for it)?
`init_gnats' sets

  re_set_syntax ((RE_SYNTAX_POSIX_EXTENDED | RE_BK_PLUS_QM) & ~RE_DOT_NEWLINE);

while `checkIfReply' seems to set a different syntax.

    MH> Regarding documentation, I've updated the man page for
    MH> dbconfig. 

Please update the Texinfo documentation as well.  This one is more
important, changes to man pages can be omitted to avoid duplicate work.
Also, could you please check the section "Querying using regular
expressions" of the manual and add any missing information?

Thank you.

Regards,

Milan Zamazal

-- 
http://www.zamazal.org

