* [RFC] Updating patchwork patches on commit
@ 2020-12-07  5:48 Siddhesh Poyarekar
  2020-12-07  8:45 ` Florian Weimer
  2020-12-07 16:15 ` DJ Delorie
  0 siblings, 2 replies; 21+ messages in thread
From: Siddhesh Poyarekar @ 2020-12-07  5:48 UTC (permalink / raw)
  To: libc-alpha

[Re-sending because I don't know how to type email addresses.]

Hi,

I have been running some hacked-up scripts to update patch state on
patchwork for every commit that goes into the glibc repository.  The
script simply walks through commits in a date range, hashes the diffs
from each ref (using patchwork/hasher.py) and compares them with hashes
on patchwork.  If the patch has been committed with the diff unchanged,
the hashes match.  This is very similar to the git hook that patchwork
ships[1], so I hope to eventually add this to the glibc git hook.

In the last run (2020-12-07), of the 33 commits that went in since
2020-12-01, 19 were found in patchwork and 14 were missing.  The week
before (2020-11-23 - 2020-12-01) it was 19 found and 9 missing.  This
means that the diffs of 14 patches were modified before committing.
Our commit policy explicitly allows this and trusts committers to limit
these changes to trivial fixes.  However, for patchwork usage to be
valuable (and, in the process, improve transparency), a 1:1
correspondence between git commits and patchwork entries would be
ideal.  That is, every commit on git should have at least one[2]
patchwork entry.  This also answers the question "What finally went
in?" that I've had to ask myself repeatedly when cleaning up patchwork
state.

We could achieve this without additional busy work by having the git
hook send out [pushed] emails to the list in addition to glibc-cvs
(libc-alpha should be spared the private branch traffic, of course)
whenever it sees a commit that it can't find on patchwork.  A nightly
script can then trivially mark all [pushed] patches as committed.

Thoughts?
Siddhesh

[1] https://github.com/getpatchwork/patchwork/blob/master/tools/post-receive.hook
[2] People have been known to send out identical patches repeatedly as
    part of a series.

^ permalink raw reply	[flat|nested] 21+ messages in thread
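The hash-and-compare step described above can be sketched roughly as follows. This is only an illustration of the idea, not patchwork's actual hasher.py algorithm (which has its own normalization rules); `normalized_diff_hash` is a hypothetical helper, not code from either project:

```python
import hashlib
import re

def normalized_diff_hash(diff_text):
    """Hash a unified diff while ignoring metadata that often changes
    between posting and committing (index lines, hunk offsets, trailing
    whitespace).  A sketch of the idea only; patchwork's hasher.py
    differs in detail."""
    lines = []
    for line in diff_text.splitlines():
        # Skip metadata lines that git regenerates on commit.
        if line.startswith(("diff --git ", "index ")):
            continue
        # Normalize hunk headers: line offsets shift when context moves.
        if line.startswith("@@"):
            line = re.sub(r"@@ -\d+(,\d+)? \+\d+(,\d+)? @@", "@@", line)
        lines.append(line.rstrip())
    return hashlib.sha1("\n".join(lines).encode("utf-8")).hexdigest()
```

Two postings of the same change then hash identically even if the hunk offsets or index lines differ, which is what lets a committed patch be matched back to its patchwork entry.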
* Re: [RFC] Updating patchwork patches on commit
  2020-12-07  5:48 [RFC] Updating patchwork patches on commit Siddhesh Poyarekar
@ 2020-12-07  8:45 ` Florian Weimer
  2020-12-07  9:30   ` Siddhesh Poyarekar
  2020-12-07 16:15 ` DJ Delorie
  1 sibling, 1 reply; 21+ messages in thread
From: Florian Weimer @ 2020-12-07  8:45 UTC (permalink / raw)
  To: Siddhesh Poyarekar; +Cc: libc-alpha

* Siddhesh Poyarekar:

> We could achieve this without additional busy work by having the git
> hook send out [pushed] emails to the list in addition to glibc-cvs
> (libc-alpha should be spared the private branch traffic of course)
> whenever it sees a commit that it can't find on patchwork.  A nightly
> script can then trivially mark all [pushed] patches as committed.

I'm not sure if this is useful if we can't find the thread to which the
updated commit belongs.  If we can find the thread, it would be more
useful to post the diff between what was committed and the latest
posted patch.

Thanks,
Florian
-- 
Red Hat GmbH, https://de.redhat.com/ , Registered seat: Grasbrunn,
Commercial register: Amtsgericht Muenchen, HRB 153243,
Managing Directors: Charles Cachera, Brian Klemm, Laurie Krebs, Michael O'Neill

^ permalink raw reply	[flat|nested] 21+ messages in thread
* Re: [RFC] Updating patchwork patches on commit
  2020-12-07  8:45 ` Florian Weimer
@ 2020-12-07  9:30   ` Siddhesh Poyarekar
  0 siblings, 0 replies; 21+ messages in thread
From: Siddhesh Poyarekar @ 2020-12-07  9:30 UTC (permalink / raw)
  To: Florian Weimer; +Cc: libc-alpha

On 12/7/20 2:15 PM, Florian Weimer wrote:
> * Siddhesh Poyarekar:
>
>> We could achieve this without additional busy work by having the git
>> hook send out [pushed] emails to the list in addition to glibc-cvs
>> (libc-alpha should be spared the private branch traffic of course)
>> whenever it sees a commit that it can't find on patchwork.  A nightly
>> script can then trivially mark all [pushed] patches as committed.
>
> I'm not sure if this is useful if we can't find the thread to which the
> updated commit belongs.

That's a broader problem, not limited to these [pushed] patches; we
currently don't have a way to associate different versions of the same
patch.  My thinking was that adding these commits won't make things
worse and could at least give us confidence that the patches that
remain definitely did not make it into the repo, so that we can take
stronger actions on them.

It could let us do things like walking backwards in time from committed
patches to find patches with identical subject lines and close them off
as superseded.  It won't catch all superseded patches, but at least
we'll get a majority of them.  Once the process is bootstrapped, the
likelihood of false positives (i.e. marking unrelated patches with the
same subject lines) ought to be negligible.

Siddhesh

^ permalink raw reply	[flat|nested] 21+ messages in thread
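The subject-line matching idea above could be sketched like this. All names here (`canonical_subject`, `find_superseded`, the tuple layout) are hypothetical illustrations, not part of patchwork or any script in this thread:

```python
import re
from collections import defaultdict

def canonical_subject(subject):
    """Strip "[PATCH vN x/y]"-style prefixes so different versions of
    the same patch compare equal.  Hypothetical helper."""
    s = re.sub(r"^\s*(\[[^]]*\]\s*)+", "", subject)
    return re.sub(r"\s+", " ", s).strip().lower()

def find_superseded(patches):
    """patches: iterable of (id, date, subject) tuples.  Returns the ids
    of all but the newest posting of each canonical subject, i.e. the
    candidates to mark as superseded."""
    groups = defaultdict(list)
    for pid, date, subject in patches:
        groups[canonical_subject(subject)].append((date, pid))
    superseded = []
    for versions in groups.values():
        versions.sort()  # oldest first; keep only the newest posting
        superseded.extend(pid for _, pid in versions[:-1])
    return superseded
```

As the email notes, this is heuristic: it misses patches whose subject changed between versions, but false positives should be rare once the backlog is bootstrapped.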
* Re: [RFC] Updating patchwork patches on commit
  2020-12-07  5:48 [RFC] Updating patchwork patches on commit Siddhesh Poyarekar
  2020-12-07  8:45 ` Florian Weimer
@ 2020-12-07 16:15 ` DJ Delorie
  2020-12-07 16:39   ` Siddhesh Poyarekar
  1 sibling, 1 reply; 21+ messages in thread
From: DJ Delorie @ 2020-12-07 16:15 UTC (permalink / raw)
  To: Siddhesh Poyarekar; +Cc: libc-alpha

Siddhesh Poyarekar <siddhesh@gotplt.org> writes:
> This means that diffs of 14 patches were modified before committing.

Do you try removing the Reviewed-by tags and re-hashing?  My last step
before committing is usually to add those according to the reviews, so
my patches might never match patchwork.

^ permalink raw reply	[flat|nested] 21+ messages in thread
* Re: [RFC] Updating patchwork patches on commit
  2020-12-07 16:15 ` DJ Delorie
@ 2020-12-07 16:39   ` Siddhesh Poyarekar
  2020-12-07 17:02     ` DJ Delorie
  0 siblings, 1 reply; 21+ messages in thread
From: Siddhesh Poyarekar @ 2020-12-07 16:39 UTC (permalink / raw)
  To: DJ Delorie; +Cc: libc-alpha

On 12/7/20 9:45 PM, DJ Delorie wrote:
> Siddhesh Poyarekar <siddhesh@gotplt.org> writes:
>> This means that diffs of 14 patches were modified before committing.
>
> Do you try removing the Reviewed-by tags and re-hashing?  My last step
> before committing is usually to add those according to the reviews, so
> my patches might never match patchwork.

Well, your NSS patches did match and auto-close; in fact, the v4 of 3/6
in that patchset also got closed as Committed because diff-wise it was
identical to v5 3/6 :)

Patchwork stores the diff separately, and the hash is generated only on
the diff using the patchwork hasher in patchwork/hasher.py.  So changing
the commit message in any way does not change the hash.

Siddhesh

^ permalink raw reply	[flat|nested] 21+ messages in thread
* Re: [RFC] Updating patchwork patches on commit
  2020-12-07 16:39 ` Siddhesh Poyarekar
@ 2020-12-07 17:02   ` DJ Delorie
  2020-12-07 18:11     ` Joseph Myers
  0 siblings, 1 reply; 21+ messages in thread
From: DJ Delorie @ 2020-12-07 17:02 UTC (permalink / raw)
  To: Siddhesh Poyarekar; +Cc: libc-alpha

Siddhesh Poyarekar <siddhesh@gotplt.org> writes:
> Well, your NSS patches did match and auto-close; in fact, the v4 of 3/6 in
> that patchset also got closed as Committed because diff-wise it was
> identical to v5 3/6 :)

Ah, patchwork hash != git hash.  Nevermind ;-)

^ permalink raw reply	[flat|nested] 21+ messages in thread
* Re: [RFC] Updating patchwork patches on commit
  2020-12-07 17:02 ` DJ Delorie
@ 2020-12-07 18:11   ` Joseph Myers
  2020-12-08  2:57     ` Siddhesh Poyarekar
  0 siblings, 1 reply; 21+ messages in thread
From: Joseph Myers @ 2020-12-07 18:11 UTC (permalink / raw)
  To: DJ Delorie; +Cc: Siddhesh Poyarekar, libc-alpha

On Mon, 7 Dec 2020, DJ Delorie via Libc-alpha wrote:

> Siddhesh Poyarekar <siddhesh@gotplt.org> writes:
> > Well, your NSS patches did match and auto-close; in fact, the v4 of 3/6 in
> > that patchset also got closed as Committed because diff-wise it was
> > identical to v5 3/6 :)
>
> Ah, patchwork hash != git hash.  Nevermind ;-)

A previous discussion suggested "git patch-id" was appropriate to use for
this purpose, but I don't know if it's what patchwork actually uses.

-- 
Joseph S. Myers
joseph@codesourcery.com

^ permalink raw reply	[flat|nested] 21+ messages in thread
* Re: [RFC] Updating patchwork patches on commit
  2020-12-07 18:11 ` Joseph Myers
@ 2020-12-08  2:57   ` Siddhesh Poyarekar
  2020-12-08  9:08     ` Andreas Schwab
  0 siblings, 1 reply; 21+ messages in thread
From: Siddhesh Poyarekar @ 2020-12-08  2:57 UTC (permalink / raw)
  To: Joseph Myers, DJ Delorie; +Cc: libc-alpha

On 12/7/20 11:41 PM, Joseph Myers wrote:
> On Mon, 7 Dec 2020, DJ Delorie via Libc-alpha wrote:
>
>> Siddhesh Poyarekar <siddhesh@gotplt.org> writes:
>>> Well, your NSS patches did match and auto-close; in fact, the v4 of 3/6 in
>>> that patchset also got closed as Committed because diff-wise it was
>>> identical to v5 3/6 :)
>>
>> Ah, patchwork hash != git hash.  Nevermind ;-)
>
> A previous discussion suggested "git patch-id" was appropriate to use for
> this purpose, but I don't know if it's what patchwork actually uses.

It doesn't; it has its own hashing function where it normalizes spaces
and newline chars to avoid false negatives.  It could, however, do with
some rudimentary sorting of diff lines to ensure that it generates the
same hash for reordered diffs.  I'll play with that a bit later in the
week and see if it improves matching.

Siddhesh

^ permalink raw reply	[flat|nested] 21+ messages in thread
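The "sorting of diff lines" idea mentioned above (hashing the same for reordered diffs, similar to what `git patch-id` does) could look roughly like this. This is a sketch under the assumption that sorting whole per-file chunks by filename is sufficient; `order_insensitive_hash` is a hypothetical name, not patchwork code:

```python
import hashlib
import re

def order_insensitive_hash(diff_text):
    """Split a unified diff into per-file chunks, sort the chunks, and
    hash the result, so the same set of changes reordered across files
    produces the same hash.  Sketch of the idea only; requires
    Python >= 3.7 for zero-width re.split."""
    chunks = re.split(r"(?m)^(?=diff --git )", diff_text)
    chunks = sorted(c for c in chunks if c.strip())
    return hashlib.sha1("".join(chunks).encode("utf-8")).hexdigest()
```

A series re-posted with the files touched in a different order would then still match its earlier posting, which is the false-negative class Siddhesh describes.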
* Re: [RFC] Updating patchwork patches on commit
  2020-12-08  2:57 ` Siddhesh Poyarekar
@ 2020-12-08  9:08   ` Andreas Schwab
  2020-12-08 10:10     ` Siddhesh Poyarekar
  0 siblings, 1 reply; 21+ messages in thread
From: Andreas Schwab @ 2020-12-08  9:08 UTC (permalink / raw)
  To: Siddhesh Poyarekar; +Cc: Joseph Myers, DJ Delorie, libc-alpha

On Dez 08 2020, Siddhesh Poyarekar wrote:

> On 12/7/20 11:41 PM, Joseph Myers wrote:
>> On Mon, 7 Dec 2020, DJ Delorie via Libc-alpha wrote:
>>
>>> Siddhesh Poyarekar <siddhesh@gotplt.org> writes:
>>>> Well, your NSS patches did match and auto-close; in fact, the v4 of 3/6 in
>>>> that patchset also got closed as Committed because diff-wise it was
>>>> identical to v5 3/6 :)
>>>
>>> Ah, patchwork hash != git hash.  Nevermind ;-)
>>
>> A previous discussion suggested "git patch-id" was appropriate to use for
>> this purpose, but I don't know if it's what patchwork actually uses.
>
> It doesn't; it has its own hashing function where it normalizes spaces
> and newline chars to avoid false negatives.

Like git patch-id?

Andreas.

-- 
Andreas Schwab, schwab@linux-m68k.org
GPG Key fingerprint = 7578 EB47 D4E5 4D69 2510 2552 DF73 E780 A9DA AEC1
"And now for something completely different."

^ permalink raw reply	[flat|nested] 21+ messages in thread
* Re: [RFC] Updating patchwork patches on commit
  2020-12-08  9:08 ` Andreas Schwab
@ 2020-12-08 10:10   ` Siddhesh Poyarekar
  2020-12-16 18:35     ` Girish Joshi
  0 siblings, 1 reply; 21+ messages in thread
From: Siddhesh Poyarekar @ 2020-12-08 10:10 UTC (permalink / raw)
  To: Andreas Schwab; +Cc: Joseph Myers, DJ Delorie, libc-alpha

On 12/8/20 2:38 PM, Andreas Schwab wrote:
>> It doesn't; it has its own hashing function where it normalizes spaces
>> and newline chars to avoid false negatives.
>
> Like git patch-id?

Yeah, except that it (AFAICT) doesn't order the diff input like git
patch-id does :)  I suppose I could check if they're willing to add a
dependency on git for this and drop their custom hasher, or at least
provide a supported way to add a different hashing function or program.

Siddhesh

^ permalink raw reply	[flat|nested] 21+ messages in thread
* Re: [RFC] Updating patchwork patches on commit
  2020-12-08 10:10 ` Siddhesh Poyarekar
@ 2020-12-16 18:35   ` Girish Joshi
  2020-12-16 18:49     ` Siddhesh Poyarekar
  0 siblings, 1 reply; 21+ messages in thread
From: Girish Joshi @ 2020-12-16 18:35 UTC (permalink / raw)
  To: Siddhesh Poyarekar
  Cc: Andreas Schwab, Girish Joshi via Libc-alpha, Joseph Myers

[-- Attachment #1: Type: text/plain, Size: 1934 bytes --]

Hello all,

I tried a couple of very basic scripts for this.  (I know that there
are a lot of improvements needed there.)  I was able to merge 336
series out of 1114.  Since "git-pw patch apply <id>" gives "Resource
not found" for the older patches, right now only series are applied to
a branch.

Here is how the scripts work.  We have two scripts, "get-patches.py"
and "apply-patches.py" (we can change the names of course).
"get-patches.py" reads the patches/series starting from page 1 to page
100 (currently) in csv format and dumps them to stdout.  This output
is piped to the second script, "apply-patches.py", which tries to
apply each series/patch to the branch.

In the end we get two files as output, "merged.txt" and
"unmerged.txt", containing the IDs of the merged and unmerged series
respectively.  Currently these files are placed in the current
directory; I'll change it to /tmp or something else in the next patch.

Just to have it here, to apply patches using these two scripts:

$ python scripts/get-patches.py series | python
scripts/apply-patches.py series apply

I'm still not sure what happens to the older patches; do they get
applied by "git-pw series apply" or not (I'm looking into it)?  The
newer ones do get applied.

Is it going in the right direction?  Please share your thoughts.

Thanks.
Girish Joshi
girishjoshi.io

On Tue, Dec 8, 2020 at 3:40 PM Siddhesh Poyarekar <siddhesh@gotplt.org> wrote:
>
> On 12/8/20 2:38 PM, Andreas Schwab wrote:
> >> It doesn't; it has its own hashing function where it normalizes spaces
> >> and newline chars to avoid false negatives.
> >
> > Like git patch-id?
> >
>
> Yeah, except that it (AFAICT) doesn't order the diff input like git
> patch-id does :)  I suppose I could check if they're willing to add a
> dependency on git for this and drop their custom hasher, or at least
> provide a supported way to add a different hashing function or program.
>
> Siddhesh

[-- Attachment #2: get-patches.py --]
[-- Type: text/x-python, Size: 286 bytes --]

#!python3
import os
import sys

type_ = sys.argv[1]
command = "git-pw {0} list --page {1} -f csv"
if type_ == 'patch':
    command += " --state 'new'"

for i in range(1, 100):
    # print(command.format(type_, i))
    ret = os.system(command.format(type_, i))
    if ret:
        break

[-- Attachment #3: apply-patches.py --]
[-- Type: text/x-python, Size: 3382 bytes --]

#!python3
import re
import csv
import sys
import shlex
import subprocess as sp

# import time

prune_warning = "warning: There are too many unreachable loose objects; run 'git prune' to remove them."

# List for series entries
series = []

# These lists will contain merged and unmerged series data.
merged = []
unmerged = []

# Option that we will be operating upon, series or patch.
# This is the command line argument to git-pw,
# for example "git-pw patch apply 12345" or "git-pw series apply 12356".
type_ = "series"

# Get the csv data from stdin.
csv_data = []
for line in sys.stdin:
    if not '"ID","Date","Name","Version","Submitter"' in line:
        print(line)
        csv_data.append(line.strip())


# Parse the csv entries.
def read_rows(csvfile):
    spamreader = csv.reader(csvfile, delimiter=",", quotechar='"')
    for row in spamreader:
        # print(row)
        if not row:
            return
        if row and row[1] != "ID":
            series.append(row)


def get_output(cmnd):
    """
    Execute the command and check the output.  If git throws a warning
    saying "warning: There are too many unreachable loose objects; run
    'git prune' to remove them.", `git prune` will be executed.
    Otherwise the output will be printed and the exit code returned.
    """
    try:
        output = sp.check_output(
            cmnd, stderr=sp.STDOUT, shell=True, universal_newlines=True
        )
    except sp.CalledProcessError as exc:
        print("Status : FAIL", exc.returncode, exc.output)
        return exc.returncode
    else:
        print("Output: \n{}\n".format(output))
        if prune_warning in output:
            print("running git prune")
            get_output("git prune")
        return 0


def write_file(filename, list_):
    """
    Write the IDs of the patches/series that are merged/unmerged after
    we have processed everything.
    """
    with open(filename, "w") as f:
        for i in list_:
            f.write(i[0] + "\n")


if __name__ == "__main__":
    read_rows(csv_data)
    # This is crappy; it will be replaced by an arg parser.
    if len(sys.argv) >= 3:
        if sys.argv[1] == "series" or sys.argv[1] == "patch":
            type_ = sys.argv[1]
        if sys.argv[2] == "apply":
            print("applying ", type_)
            if series:
                for i in series:
                    try:
                        print("trying to apply:", i[0], i[1], i[2])
                        if i[0] == "ID":
                            pass
                        ret = get_output(f"git-pw {type_} apply {i[0]}")
                        print("git exit code: ", ret)
                        # time.sleep(0.5)
                        if ret:
                            # If `git-pw patch/series apply <id>` fails,
                            # reset to HEAD.
                            print("resetting...")
                            get_output("git reset --hard HEAD")
                            get_output("git am --abort")
                            unmerged.append(i)
                        else:
                            merged.append(i)
                    except KeyboardInterrupt as ke:
                        break
    print("total merged: {0}, total unmerged {1}".format(len(merged), len(unmerged)))
    write_file("merged.txt", merged)
    write_file("unmerged.txt", unmerged)

^ permalink raw reply	[flat|nested] 21+ messages in thread
* Re: [RFC] Updating patchwork patches on commit
  2020-12-16 18:35 ` Girish Joshi
@ 2020-12-16 18:49   ` Siddhesh Poyarekar
  2020-12-17 17:49     ` Girish Joshi
  0 siblings, 1 reply; 21+ messages in thread
From: Siddhesh Poyarekar @ 2020-12-16 18:49 UTC (permalink / raw)
  To: Girish Joshi; +Cc: Andreas Schwab, Girish Joshi via Libc-alpha, Joseph Myers

On 12/17/20 12:05 AM, Girish Joshi wrote:
> Hello all,
> I tried a couple of very basic scripts for this.  (I know that there
> are a lot of improvements needed there.)
> I was able to merge 336 series out of 1114.

I'm surprised there are 1114 series that need action; maybe it's
including series that have already been committed and you need to
filter those out?

> Since "git-pw patch apply <id>" gives "Resource not found" for the older
> patches, right now only series are applied to a branch.
> Here is how the scripts work.
> We have two scripts, "get-patches.py" and "apply-patches.py" (we can
> change the names of course).
> "get-patches.py" reads the patches/series starting from page 1 to page
> 100 (currently) in csv format and dumps them to stdout.  This output is
> piped to the second script "apply-patches.py", which tries to apply
> each series/patch to the branch.

It should become one script.

> In the end we get two files as output, "merged.txt" and
> "unmerged.txt", containing the IDs of the merged and unmerged series
> respectively.
> Currently these files are placed in the current directory; I'll change
> it to /tmp or something else in the next patch.
>
> Just to have it here, to apply patches using these two scripts:
>
> $ python scripts/get-patches.py series | python
> scripts/apply-patches.py series apply
>
> I'm still not sure what happens to the older patches; do they get
> applied by "git-pw series apply" or not (I'm looking into it)?  The
> newer ones do get applied.
The older ones do not have a series ID because they were ported over
from an ancient patchwork instance, so they won't work with `git-pw
series`.  They'll need some trickery to figure out series.  There ought
to be some relationship beyond the name, say, in the mbox of the patch
that could be exploited to make that connection.

Siddhesh

^ permalink raw reply	[flat|nested] 21+ messages in thread
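The relationship in the mbox that Siddhesh alludes to would most likely be the standard threading headers (Message-Id, In-Reply-To, References). A minimal sketch of extracting them, assuming the mbox text of a patch is available as a string (`thread_ids` is a hypothetical helper, not git-pw or patchwork API):

```python
from email.parser import Parser
from email.policy import default

def thread_ids(mbox_text):
    """Extract the Message-Id plus the References/In-Reply-To ids from a
    patch's mbox.  These are the headers one could use to reconstruct a
    series for old patches that lack a series ID.  Sketch only."""
    msg = Parser(policy=default).parsestr(mbox_text)
    refs = (str(msg.get("References", "")) + " "
            + str(msg.get("In-Reply-To", ""))).split()
    return str(msg.get("Message-Id", "")), refs
```

Patches whose References chains share a common ancestor (typically the series cover letter) could then be grouped into one logical series even without a patchwork series ID.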
* Re: [RFC] Updating patchwork patches on commit
  2020-12-16 18:49 ` Siddhesh Poyarekar
@ 2020-12-17 17:49   ` Girish Joshi
  2020-12-18  4:04     ` Siddhesh Poyarekar
  0 siblings, 1 reply; 21+ messages in thread
From: Girish Joshi @ 2020-12-17 17:49 UTC (permalink / raw)
  To: Siddhesh Poyarekar
  Cc: Andreas Schwab, Girish Joshi via Libc-alpha, Joseph Myers

[-- Attachment #1: Type: text/plain, Size: 1889 bytes --]

Hi Siddhesh,

On Thu, Dec 17, 2020 at 12:19 AM Siddhesh Poyarekar <siddhesh@gotplt.org> wrote:
> I'm surprised there are 1114 series that need action; maybe it's
> including series that have already been committed and you need to filter
> those out?

Yeah, in the git output we can see that a lot of those are already
applied.

> > Since "git-pw patch apply <id>" gives "Resource not found" for the older
> > patches, right now only series are applied to a branch.
> > Here is how the scripts work.
> > We have two scripts, "get-patches.py" and "apply-patches.py" (we can
> > change the names of course).
> > "get-patches.py" reads the patches/series starting from page 1 to page
> > 100 (currently) in csv format and dumps them to stdout.  This output is
> > piped to the second script "apply-patches.py", which tries to apply
> > each series/patch to the branch.
>
> It should become one script.

Will do that in a couple of iterations; right now I've modified it so
that it can take input from stdin as well as from a csv file.

> > In the end we get two files as output, "merged.txt" and
> > "unmerged.txt", containing the IDs of the merged and unmerged series
> > respectively.
> > Currently these files are placed in the current directory; I'll change
> > it to /tmp or something else in the next patch.

I've added one more file to it, for unavailable patches, so that the
author can be notified and asked to repost those patches (if needed).
Also added an argument for changing the output location for these
files.
> The older ones do not have a series ID because they were ported over
> from an ancient patchwork instance, so they won't work with `git-pw
> series`.  They'll need some trickery to figure out series.  There ought
> to be some relationship beyond the name, say, in the mbox of the patch
> that could be exploited to make that connection.

Looking into it.

Thanks.
Girish Joshi

[-- Attachment #2: apply-patches.py --]
[-- Type: text/x-python, Size: 6346 bytes --]

#!python3
import subprocess as sp
import argparse
import csv
import os
import sys

# If these strings are found in the output of git/git-pw,
# we need to take some actions.
prune_warning = "warning: There are too many unreachable loose objects; run 'git prune' to remove them."
resource_not_found_warning = "Resource not found"
already_applied_warning = "No changes -- Patch already applied."

# These lists will contain merged and unmerged series data.
merged = []
unmerged = []
unavailable = []


# Parse the csv entries.
def read_rows(csvfile):
    # List for series entries
    series_data = []
    csvreader = csv.reader(csvfile, delimiter=",", quotechar='"')
    for row in csvreader:
        # print(row)
        if not row:
            return
        if row and row[1] != "ID":
            series_data.append(row)
    return series_data


def run_cmd(cmd, debug=False):
    """
    Execute the command and check the output.  If git throws a warning
    saying "warning: There are too many unreachable loose objects; run
    'git prune' to remove them.", `git prune` will be executed.
    Otherwise the output will be printed and the exit code returned.
    """
    exit_code = 0
    output = ""
    try:
        output = sp.check_output(
            cmd, stderr=sp.STDOUT, shell=True, universal_newlines=True
        )
    except sp.CalledProcessError as exc:
        if debug:
            print("Status : FAIL", exc.returncode, exc.output)
        exit_code, output = exc.returncode, exc.output
    else:
        if debug:
            print("{}\n".format(output))
    return exit_code, output


def write_file(filename, list_):
    """
    Write the IDs of the patches/series that are merged/unmerged/
    unavailable after we have processed everything.
    """
    with open(filename, "w") as f:
        for i in list_:
            f.write(i[0] + "\n")


def apply_(series):
    for i in series:
        try:
            print(
                f"{bcolors.OKGREEN}trying to apply: {type_}{bcolors.OKCYAN} {i[0]} {bcolors.ENDC}"
            )
            print(f"{bcolors.OKBLUE} {i[1]}, {bcolors.UNDERLINE}{i[2]}{bcolors.ENDC}")
            if i[0] == "ID":
                pass
            exit_code, output = run_cmd(f"git-pw {type_} apply {i[0]}")
            if prune_warning in output:
                print("running: git prune")
                run_cmd("git prune")
            if exit_code == 1 and resource_not_found_warning in output:
                print(f"{bcolors.WARNING}patch unavailable{bcolors.ENDC}")
                unavailable.append(i)
            if exit_code:
                # If `git-pw patch/series apply <id>` fails,
                # reset to HEAD.
                print(
                    f"{bcolors.OKCYAN}git exit code: {bcolors.FAIL}{exit_code}{bcolors.ENDC}"
                )
                print(f"{bcolors.FAIL}resetting to HEAD: {bcolors.ENDC}")
                if os.path.exists(".git/rebase-apply"):
                    run_cmd("git am --abort")
                run_cmd("git reset --hard HEAD", debug=True)
                unmerged.append(i)
            else:
                if output.strip().endswith(already_applied_warning):
                    print(
                        f"{bcolors.WARNING}No changes -- already applied{bcolors.ENDC}\n"
                    )
                else:
                    print(f"{bcolors.OKCYAN}{type_} applied{bcolors.ENDC}\n")
                merged.append(i)
        except KeyboardInterrupt as ke:
            break


if __name__ == "__main__":
    parser = argparse.ArgumentParser(description="Initial CI script for patchwork")
    parser.add_argument(
        "-c", "--colors", default=False, action="store_true", help="Enable colors"
    )
    parser.add_argument(
        "-t",
        "--type",
        type=str,
        default="series",
        choices=["patch", "series"],
        help="type: patch/series",
    )
    parser.add_argument(
        "-a", "--action", type=str, default="apply", help="action: list/apply"
    )
    parser.add_argument(
        "-o",
        "--output-location",
        type=str,
        default="/tmp/pw-results",
        help="location for the output files containing merged, unmerged and unavailable patches/series.",
    )
    parser.add_argument(
        "-i",
        "--input-file",
        type=str,
        default="-",
        help="input file: csv file or '-' for the standard input",
    )
    args = parser.parse_args()
    print(args)

    csv_data = []
    if args.input_file == "-":
        # Get the csv data from stdin.
        for line in sys.stdin:
            if not '"ID"' in line:
                print(line)
                csv_data.append(line.strip())
    elif os.path.exists(args.input_file):
        data = open(args.input_file).read().strip().split("\n")
        for line in data:
            if not '"ID"' in line:
                print(line)
                csv_data.append(line.strip())

    # Option that we will be operating upon, series or patch.
    # This is the command line argument to git-pw,
    # for example "git-pw patch apply 12345" or "git-pw series apply 12356".
    type_ = args.type
    series_data = read_rows(csv_data)

    output_files_loc = args.output_location
    if not os.path.exists(output_files_loc):
        os.mkdir(output_files_loc)

    colors = args.colors

    class bcolors:
        if colors:
            HEADER = "\033[95m"
            OKBLUE = "\033[94m"
            OKCYAN = "\033[96m"
            OKGREEN = "\033[92m"
            WARNING = "\033[93m"
            FAIL = "\033[91m"
            ENDC = "\033[0m"
            BOLD = "\033[1m"
            UNDERLINE = "\033[4m"
        else:
            HEADER = ""
            OKBLUE = ""
            OKCYAN = ""
            OKGREEN = ""
            WARNING = ""
            FAIL = ""
            ENDC = ""
            BOLD = ""
            UNDERLINE = ""

    if args.action == "apply":
        apply_(series_data)

    print(
        "total merged: {0}, total unmerged {1}, total unavailable {2}".format(
            len(merged), len(unmerged), len(unavailable)
        )
    )
    write_file(f"{output_files_loc}/merged.txt", merged)
    write_file(f"{output_files_loc}/unmerged.txt", unmerged)
    write_file(f"{output_files_loc}/unavailable.txt", unavailable)

^ permalink raw reply	[flat|nested] 21+ messages in thread
* Re: [RFC] Updating patchwork patches on commit
  2020-12-17 17:49 ` Girish Joshi
@ 2020-12-18  4:04   ` Siddhesh Poyarekar
  2020-12-19 13:25     ` Girish Joshi
  0 siblings, 1 reply; 21+ messages in thread
From: Siddhesh Poyarekar @ 2020-12-18  4:04 UTC (permalink / raw)
  To: Girish Joshi; +Cc: Andreas Schwab, Girish Joshi via Libc-alpha, Joseph Myers

On 12/17/20 11:19 PM, Girish Joshi wrote:
> Hi Siddhesh,
> On Thu, Dec 17, 2020 at 12:19 AM Siddhesh Poyarekar <siddhesh@gotplt.org> wrote:
>> I'm surprised there are 1114 series that need action; maybe it's
>> including series that have already been committed and you need to filter
>> those out?
> Yeah, in the git output we can see that a lot of those are already applied.

Are they marked as committed though?  If not then they should be.

Siddhesh

^ permalink raw reply	[flat|nested] 21+ messages in thread
* Re: [RFC] Updating patchwork patches on commit
  2020-12-18  4:04 ` Siddhesh Poyarekar
@ 2020-12-19 13:25   ` Girish Joshi
  2020-12-22 15:13     ` Girish Joshi
  0 siblings, 1 reply; 21+ messages in thread
From: Girish Joshi @ 2020-12-19 13:25 UTC (permalink / raw)
  To: Siddhesh Poyarekar
  Cc: Andreas Schwab, Girish Joshi via Libc-alpha, Joseph Myers

On Fri, Dec 18, 2020 at 9:34 AM Siddhesh Poyarekar <siddhesh@gotplt.org> wrote:
>
> On 12/17/20 11:19 PM, Girish Joshi wrote:
> > Hi Siddhesh,
> > On Thu, Dec 17, 2020 at 12:19 AM Siddhesh Poyarekar <siddhesh@gotplt.org> wrote:
> >> I'm surprised there are 1114 series that need action; maybe it's
> >> including series that have already been committed and you need to filter
> >> those out?
> > Yeah, in the git output we can see that a lot of those are already applied.
>
> Are they marked as committed though?  If not then they should be.

Yes, the status of (almost all of) those patches is "committed" on the
patchwork instance.  To verify it I'm writing down the IDs of such
series in a separate file now.  However, I did not find an option in
the git-pw CLI for checking whether a series is already committed.
The workaround could be to go through all of the patches in that
series and check whether all of them are committed.

Girish Joshi

^ permalink raw reply	[flat|nested] 21+ messages in thread
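The workaround described above (treat a series as committed only when every member patch is committed) could be sketched like this. `patch_state` and `series_committed` are hypothetical helper names; the `git-pw patch show <id> -f csv` invocation is the same CLI used by the scripts in this thread, and the CSV field layout assumed here is illustrative:

```python
import csv
import subprocess

def patch_state(patch_id):
    """Query one patch's state via the git-pw CLI (assumes git-pw is
    installed and configured for the project).  The key/value CSV
    layout assumed here is illustrative."""
    out = subprocess.check_output(
        ["git-pw", "patch", "show", str(patch_id), "-f", "csv"],
        universal_newlines=True,
    )
    for row in csv.reader(out.splitlines()):
        if row and row[0].lower() == "state":
            return row[1]
    return None

def series_committed(states):
    """True only when every patch in the series is already committed;
    states would come from patch_state() for each member."""
    return bool(states) and all(s.lower() == "committed" for s in states)
```

A series whose member states are, say, ["Committed", "New"] would then still be treated as pending, which matches the per-patch check Girish describes.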
* Re: [RFC] Updating patchwork patches on commit
  2020-12-19 13:25 ` Girish Joshi
@ 2020-12-22 15:13   ` Girish Joshi
  2021-01-06 20:26     ` Girish Joshi
  0 siblings, 1 reply; 21+ messages in thread
From: Girish Joshi @ 2020-12-22 15:13 UTC (permalink / raw)
  To: Siddhesh Poyarekar
  Cc: Andreas Schwab, Girish Joshi via Libc-alpha, Joseph Myers

[-- Attachment #1: Type: text/plain, Size: 1480 bytes --]

I've created this[1] script to go through all available series and get
the patches that do not belong to any one of them.  It dumps a json
containing the individual patch ids in the /tmp directory.  This script
can be merged with the previous one, "apply-patches.py"; I'll do that
soon.  Currently we have around 106 individual patches with the state
"new" that do not belong to any of the series.

Girish Joshi
girishjoshi.io

On Sat, Dec 19, 2020 at 6:55 PM Girish Joshi <girish946@gmail.com> wrote:
>
> On Fri, Dec 18, 2020 at 9:34 AM Siddhesh Poyarekar <siddhesh@gotplt.org> wrote:
> >
> > On 12/17/20 11:19 PM, Girish Joshi wrote:
> > > Hi Siddhesh,
> > > On Thu, Dec 17, 2020 at 12:19 AM Siddhesh Poyarekar <siddhesh@gotplt.org> wrote:
> > >> I'm surprised there are 1114 series that need action; maybe it's
> > >> including series that have already been committed and you need to filter
> > >> those out?
> > > Yeah, in the git output we can see that a lot of those are already applied.
> >
> > Are they marked as committed though?  If not then they should be.
> Yes, the status of (almost all of) those patches is "committed" on
> the patchwork instance.
> To verify it I'm writing down the IDs of such series in a separate file now.
> However, I did not find an option in the git-pw CLI for checking if
> a series is already committed.
> The workaround for that could be to go through all of the patches in
> that series and check if all of them are committed.
>
> Girish Joshi

[-- Attachment #2: check-series.py --]
[-- Type: text/x-python, Size: 2980 bytes --]

#!/usr/bin/env python
import csv
import sys
import os
import subprocess as sp
import _thread as thread


def read_file(file_name):
    file_data = []
    if os.path.exists(file_name):
        data = open(file_name).read().strip().split("\n")
        for line in data:
            if '"ID"' not in line:
                file_data.append(line.strip())
    return file_data


def read_rows(csvfile):
    """Parse the csv entries."""
    # List for series entries
    series_data = []
    csvreader = csv.reader(csvfile, delimiter=",", quotechar='"')
    for row in csvreader:
        if not row:
            return
        if row[1] != "ID":
            series_data.append(row)
    return series_data


def run_cmd(cmd, debug=False):
    """Execute a command and return the exit code and output."""
    exit_code = 0
    output = ""
    try:
        output = sp.check_output(
            cmd, stderr=sp.STDOUT, shell=True, universal_newlines=True
        )
    except sp.CalledProcessError as exc:
        if debug:
            print("Status : FAIL", exc.returncode, exc.output)
        exit_code, output = exc.returncode, exc.output
    else:
        if debug:
            print("{}\n".format(output))
    return exit_code, output


def write_json(file_name, data):
    import json

    with open(file_name, "w") as f:
        f.write(json.dumps(data))


def check_data(list_, index):
    for i in list_:
        ret, op = run_cmd(f"git-pw series show {i} -f csv")
        print("****", index, "****")
        series_data = read_rows(op.strip().split("\n"))
        print(series_data[1])
        for j in series_data[11:]:
            patch_data = j[1].split()
            print(patch_data[0], patch_data[1])
            series_dict[i].append(patch_data[0])
            if patch_data[0] in patch_ids:
                patch_ids.remove(patch_data[0])
    done_lists[index] = True


if __name__ == "__main__":
    file_loc = "/tmp/pw-analysis"
    if not os.path.exists(file_loc):
        os.mkdir(file_loc)
    series_file = sys.argv[1]
    patches_file = sys.argv[2]
    series = [i for i in read_rows(read_file(series_file))]
    series_dict = {i[0]: [] for i in series}
    patches = read_rows(read_file(patches_file))
    patch_ids = [i[0] for i in patches]
    series_ids = [i for i in series_dict.keys()]
    chunk_size = 100
    all_lists = [
        series_ids[i : i + chunk_size]
        for i in range(0, len(series_ids), chunk_size)
    ]
    done_lists = [False for i in range(len(all_lists))]
    print(len(all_lists))
    for index, i in enumerate(all_lists):
        print(index)
        thread.start_new_thread(check_data, (i, index))
    while False in done_lists:
        pass
    print("Total individual patches: ", len(patch_ids))
    print("writing patch ids to", file_loc + "/patch_ids")
    write_json(file_loc + "/dict", series_dict)
    write_json(file_loc + "/patch_ids", {"patches": patch_ids})

^ permalink raw reply	[flat|nested] 21+ messages in thread
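[Editor's sketch: the workaround discussed above — walking every patch in a series and checking its state — can also be expressed against patchwork's REST API instead of scraping git-pw's CSV output. The instance URL below is hypothetical, and the response shapes (a series detail embedding a `patches` list whose entries carry a `url` back to the full patch object with a `state` field) are assumptions based on the patchwork 2.x API, so verify them against the real instance.]

    import json
    from urllib.request import urlopen

    # Hypothetical instance URL; point this at the real patchwork server.
    API = "https://patchwork.example.org/api/1.2"


    def fetch_json(url):
        """Fetch a patchwork API endpoint and decode the JSON body."""
        with urlopen(url) as resp:
            return json.load(resp)


    def series_is_committed(patch_states):
        """A series counts as committed when every patch is committed."""
        return bool(patch_states) and all(s == "committed" for s in patch_states)


    def check_series(series_id):
        """Resolve each patch in the series and test its state."""
        series = fetch_json(f"{API}/series/{series_id}/")
        states = [fetch_json(p["url"])["state"] for p in series["patches"]]
        return series_is_committed(states)

[This avoids shelling out to git-pw once per series and gives the state directly rather than inferring it from an apply attempt.]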
* Re: [RFC] Updating patchwork patches on commit
  2020-12-22 15:13 ` Girish Joshi
@ 2021-01-06 20:26 ` Girish Joshi
  2021-02-04 15:47 ` Girish Joshi
  0 siblings, 1 reply; 21+ messages in thread
From: Girish Joshi @ 2021-01-06 20:26 UTC (permalink / raw)
To: Siddhesh Poyarekar
Cc: Andreas Schwab, Girish Joshi via Libc-alpha, Joseph Myers

[-- Attachment #1: Type: text/plain, Size: 2320 bytes --]

I've combined the two scripts to pull data from the patchwork instance
instead of stdin or a csv file, and to get the patch IDs for the old
patches that do not belong to any series. To get these individual
patches, run:

    python scripts/apply-patches.py -u

I'll try to fix a few small things, like taking the page numbers for
pulling the data from the command line itself, by this weekend. Once
that is done, this script can be invoked at a regular interval to check
whether the new patches can be applied.

Also, I will try to set up a local patchwork instance this weekend (I
was supposed to do this a couple of weeks back but got too busy and
could not do it).

Thanks.

Girish Joshi
girishjoshi.io

On Tue, Dec 22, 2020 at 8:43 PM Girish Joshi <girish946@gmail.com> wrote:
>
> I've created this[1] script to go through all available series and get
> the patches that do not belong to any one of them.
> It dumps a json containing individual patch ids in /tmp directory.
> This script can be merged with the previous one "apply-patches.py".
> I'll do that soon.
> Currently we have around 106 individual patches with the state "new"
> that do not belong to any of the series.
>
> Girish Joshi
> girishjoshi.io
>
> On Sat, Dec 19, 2020 at 6:55 PM Girish Joshi <girish946@gmail.com> wrote:
> >
> > On Fri, Dec 18, 2020 at 9:34 AM Siddhesh Poyarekar <siddhesh@gotplt.org> wrote:
> > >
> > > On 12/17/20 11:19 PM, Girish Joshi wrote:
> > > > Hi Siddhesh,
> > > > On Thu, Dec 17, 2020 at 12:19 AM Siddhesh Poyarekar <siddhesh@gotplt.org> wrote:
> > > >> I'm surprised there are 1114 series that need action; maybe it's
> > > >> including series that have already been committed and you need to filter
> > > >> those out?
> > > > Yeah, in the git output we can see that a lot of those are already applied.
> > >
> > > Are they marked as committed though? If not then they should be.
> > Yes, the status for (almost all of) those patches is "committed" on
> > the patchwork instance.
> > To verify it I'm writing down the IDs for such series in a separate file now.
> > Although I did not find an option in the git-pw CLI for checking if
> > a series is already committed.
> > The workaround for that could be to go through all of the patches in
> > that series and check whether all of them are committed.
>
> Girish Joshi

[-- Attachment #2: apply-patches.py --]
[-- Type: text/x-python, Size: 9613 bytes --]

#!python3
import subprocess as sp
import argparse
import csv
import os
import sys

# If these strings are found in the output of git/git-pw,
# we need to take some action.
prune_warning = ("warning: There are too many unreachable loose objects; "
                 "run 'git prune' to remove them.")
resource_not_found_warning = "Resource not found"
already_applied_warning = "No changes -- Patch already applied."

# These lists will contain merged and unmerged series data.
merged = []
unmerged = []
unavailable = []
already_applied = []


def read_rows(csvfile):
    """Parse the csv entries."""
    # List for series entries
    series_data = []
    csvreader = csv.reader(csvfile, delimiter=",", quotechar='"')
    for row in csvreader:
        print(row)
        if not row:
            return
        if row[0] != "ID":
            series_data.append(row)
    return series_data


def run_cmd(cmd, debug=False):
    """Execute a command and return the exit code and output."""
    exit_code = 0
    output = ""
    try:
        output = sp.check_output(
            cmd, stderr=sp.STDOUT, shell=True, universal_newlines=True
        )
    except sp.CalledProcessError as exc:
        if debug:
            print("Status : FAIL", exc.returncode, exc.output)
        exit_code, output = exc.returncode, exc.output
    else:
        if debug:
            print("{}\n".format(output))
    return exit_code, output


def write_file(filename, list_):
    """Write the IDs of the patches/series that are
    merged/unmerged/unavailable after everything has been processed."""
    with open(filename, "w") as f:
        for i in list_:
            f.write(i[0] + "\n")


def write_json(file_name, data):
    import json

    with open(file_name, "w") as f:
        f.write(json.dumps(data))


def apply_(series):
    """Apply each series (or patch) in the list with git-pw.

    If git warns "warning: There are too many unreachable loose objects;
    run 'git prune' to remove them.", `git prune` is executed; otherwise
    the output is printed and the exit code is recorded.
    """
    for i in series:
        try:
            print(
                f"{bcolors.OKGREEN}trying to apply: {type_}{bcolors.OKCYAN} {i[0]} {bcolors.ENDC}"
            )
            print(f"{bcolors.OKBLUE} {i[1]}, {bcolors.UNDERLINE}{i[2]}{bcolors.ENDC}")
            if i[0] == "ID":
                # Skip the csv header row.
                continue
            exit_code, output = run_cmd(f"git-pw {type_} apply {i[0]}")
            if prune_warning in output:
                print("running: git prune")
                run_cmd("git prune")
            if exit_code == 1 and resource_not_found_warning in output:
                print(f"{bcolors.WARNING}patch unavailable{bcolors.ENDC}")
                unavailable.append(i)
            if exit_code:
                # If `git-pw patch/series apply <id>` fails,
                # reset to HEAD.
                print(
                    f"{bcolors.OKCYAN}git exit code: {bcolors.FAIL}{exit_code}{bcolors.ENDC}"
                )
                unmerged.append(i)
            else:
                if output.strip().endswith(already_applied_warning):
                    print(
                        f"{bcolors.WARNING}No changes -- already applied{bcolors.ENDC}\n"
                    )
                    already_applied.append(i)
                else:
                    print(f"{bcolors.OKCYAN}{type_} applied{bcolors.ENDC}\n")
                    merged.append(i)
            print(f"{bcolors.FAIL}resetting to HEAD:{bcolors.ENDC}")
            if os.path.exists(".git/rebase-apply"):
                run_cmd("git am --abort")
            run_cmd("git reset --hard master", debug=True)
        except KeyboardInterrupt:
            break
        except Exception as e:
            print(e)
            break


def get_patches(from_page=1, to_page=100):
    cmd = "git-pw patch list --page {0} -f csv --state 'new'"
    patches = []
    for i in range(from_page, to_page):
        exit_code, output = run_cmd(cmd.format(i), debug=True)
        if exit_code:
            print(f"git-pw exited with exit code {exit_code}")
            break
        patches.extend(output.strip().split("\n"))
    return patches


def get_series(from_page=1, to_page=100):
    cmd = "git-pw series list --page {0} -f csv"
    series = []
    for i in range(from_page, to_page):
        exit_code, output = run_cmd(cmd.format(i), debug=True)
        if exit_code:
            print(f"git-pw exited with exit code {exit_code}")
            break
        series.extend(output.strip().split("\n"))
    print(series)
    return series


def get_patches_for_series(list_, index, series_dict, patch_ids):
    for i in list_:
        print(f"running: git-pw series show {i} -f csv")
        ret, op = run_cmd(f"git-pw series show {i} -f csv")
        print("****", index, "****")
        if ret:
            print(f"exited with {ret}: {op}")
        series_data = read_rows(op.strip().split("\n"))
        print(series_data[1])
        for j in series_data[11:]:
            patch_data = j[1].split()
            print(patch_data[0], patch_data[1])
            series_dict[i].append(patch_data[0])
            if patch_data[0] in patch_ids:
                print("popping")
                patch_ids.remove(patch_data[0])


def get_individual_patches():
    file_loc = "/tmp/pwanalysis"
    if not os.path.exists(file_loc):
        os.mkdir(file_loc)
    series = [i for i in read_rows(get_series())]
    series_dict = {i[0]: [] for i in series}
    patches = read_rows(get_patches())
    patch_ids = [i[0] for i in patches]
    series_ids = [i for i in series_dict.keys()]
    get_patches_for_series(series_ids, 0, series_dict, patch_ids)
    print("Individual patches", len(patch_ids))
    print("series_dict", series_dict)
    write_json(file_loc + "/dict", series_dict)
    write_json(file_loc + "/patch_ids", {"patches": patch_ids})


if __name__ == "__main__":
    parser = argparse.ArgumentParser(description="Initial CI script for patchwork")
    parser.add_argument(
        "-c", "--colors", default=False, action="store_true", help="Enable colors"
    )
    parser.add_argument(
        "-t",
        "--type",
        type=str,
        default="series",
        choices=["patch", "series"],
        help="type: patch/series",
    )
    parser.add_argument(
        "-a", "--action", type=str, default="apply", help="action: list/apply"
    )
    parser.add_argument(
        "-o",
        "--output-location",
        type=str,
        default="/tmp/pw-results",
        help="location for the output files containing merged, unmerged and "
        "unavailable patches/series.",
    )
    parser.add_argument(
        "-i",
        "--input-file",
        type=str,
        default="",
        help="input file: csv file or '-' for the standard input. If no file "
        "is specified this data will be pulled from the patchwork instance.",
    )
    parser.add_argument(
        "-u",
        "--get-individual-patches",
        default=False,
        action="store_true",
        help="""Get individual patches.
        In this case the series data and the patches data is pulled and
        compared to find the individual patches that do not belong to
        any series.""",
    )
    args = parser.parse_args()
    print(args)
    if args.get_individual_patches:
        print("getting individual patches")
        get_individual_patches()
        sys.exit(0)
    csv_data = []
    if args.input_file == "-":
        # Get the csv data from stdin.
        for line in sys.stdin:
            if '"ID"' not in line:
                print(line)
                csv_data.append(line.strip())
    elif os.path.exists(args.input_file):
        data = open(args.input_file).read().strip().split("\n")
        for line in data:
            if '"ID"' not in line:
                csv_data.append(line.strip())
    # The object we will be operating upon, series or patch; this is the
    # command line argument to git-pw, for example
    # "git-pw patch apply 12345" or "git-pw series apply 12356".
    type_ = args.type
    series_data = read_rows(csv_data)
    output_files_loc = args.output_location
    if not os.path.exists(output_files_loc):
        os.mkdir(output_files_loc)
    colors = args.colors

    class bcolors:
        if colors:
            HEADER = "\033[95m"
            OKBLUE = "\033[94m"
            OKCYAN = "\033[96m"
            OKGREEN = "\033[92m"
            WARNING = "\033[93m"
            FAIL = "\033[91m"
            ENDC = "\033[0m"
            BOLD = "\033[1m"
            UNDERLINE = "\033[4m"
        else:
            HEADER = ""
            OKBLUE = ""
            OKCYAN = ""
            OKGREEN = ""
            WARNING = ""
            FAIL = ""
            ENDC = ""
            BOLD = ""
            UNDERLINE = ""

    print(len(series_data))
    if args.action == "apply":
        if args.input_file == "":
            series_data = [i for i in read_rows(get_series())]
        apply_(series_data)
        print(
            "total merged: {0}, total unmerged: {1}, total unavailable: {2}".format(
                len(merged), len(unmerged), len(unavailable)
            )
        )
        write_file(f"{output_files_loc}/merged.txt", merged)
        write_file(f"{output_files_loc}/unmerged.txt", unmerged)
        write_file(f"{output_files_loc}/unavailable.txt", unavailable)
        write_file(f"{output_files_loc}/already_applied.txt", already_applied)

^ permalink raw reply	[flat|nested] 21+ messages in thread
* Re: [RFC] Updating patchwork patches on commit
  2021-01-06 20:26 ` Girish Joshi
@ 2021-02-04 15:47 ` Girish Joshi
  2021-02-12  5:25 ` Siddhesh Poyarekar
  2021-02-12  9:02 ` Siddhesh Poyarekar
  0 siblings, 2 replies; 21+ messages in thread
From: Girish Joshi @ 2021-02-04 15:47 UTC (permalink / raw)
To: Siddhesh Poyarekar, Girish Joshi via Libc-alpha
Cc: Andreas Schwab, Joseph Myers

[-- Attachment #1: Type: text/plain, Size: 811 bytes --]

Hello Siddhesh,

On Thu, Jan 7, 2021 at 1:56 AM Girish Joshi <girish946@gmail.com> wrote:
>
> I've combined the two scripts to pull data from the patchwork instance
> instead of stdin or a csv file.
> Also to get the patch ids for the old patches that do not belong to any series.
> To get these individual patches
> python scripts/apply-patches.py -u
>
> I'll try to fix a few small things like taking page numbers for
> pulling the data from the command line itself by this weekend.

I've made this change in the attached script.

> Once this is done, this script can be invoked after a regular interval
> of time to check if the new patches can be applied.

We can do this now. There are a couple of functions that need
refactoring, but for now it does the job. Could you please review it?

Girish Joshi

[-- Attachment #2: apply-patches.py --]
[-- Type: text/x-python, Size: 10264 bytes --]

#!python3
import subprocess as sp
import argparse
import csv
import os
import sys

# If these strings are found in the output of git/git-pw,
# we need to take some action.
prune_warning = ("warning: There are too many unreachable loose objects; "
                 "run 'git prune' to remove them.")
resource_not_found_warning = "Resource not found"
already_applied_warning = "No changes -- Patch already applied."

# These lists will contain merged and unmerged series data.
merged = []
unmerged = []
unavailable = []
already_applied = []


def read_rows(csvfile):
    """Parse the csv entries."""
    # List for series entries
    series_data = []
    csvreader = csv.reader(csvfile, delimiter=",", quotechar='"')
    for row in csvreader:
        print(row)
        if not row:
            return
        if row[0] != "ID":
            series_data.append(row)
    return series_data


def run_cmd(cmd, debug=False):
    """Execute a command and return the exit code and output."""
    exit_code = 0
    output = ""
    try:
        output = sp.check_output(
            cmd, stderr=sp.STDOUT, shell=True, universal_newlines=True
        )
    except sp.CalledProcessError as exc:
        if debug:
            print("Status : FAIL", exc.returncode, exc.output)
        exit_code, output = exc.returncode, exc.output
    else:
        if debug:
            print("{}\n".format(output))
    return exit_code, output


def write_file(filename, list_):
    """Write the IDs of the patches/series that are
    merged/unmerged/unavailable after everything has been processed."""
    with open(filename, "w") as f:
        for i in list_:
            f.write(i[0] + "\n")


def write_json(file_name, data):
    import json

    with open(file_name, "w") as f:
        f.write(json.dumps(data))


def apply_(series):
    """Apply each series (or patch) in the list with git-pw.

    If git warns "warning: There are too many unreachable loose objects;
    run 'git prune' to remove them.", `git prune` is executed; otherwise
    the output is printed and the exit code is recorded.
    """
    for i in series:
        try:
            print(
                f"{bcolors.OKGREEN}trying to apply: {type_}{bcolors.OKCYAN} {i[0]} {bcolors.ENDC}"
            )
            print(f"{bcolors.OKBLUE} {i[1]}, {bcolors.UNDERLINE}{i[2]}{bcolors.ENDC}")
            if i[0] == "ID":
                # Skip the csv header row.
                continue
            exit_code, output = run_cmd(f"git-pw {type_} apply {i[0]}")
            if prune_warning in output:
                print("running: git prune")
                run_cmd("git prune")
            if exit_code == 1 and resource_not_found_warning in output:
                print(f"{bcolors.WARNING}patch unavailable{bcolors.ENDC}")
                unavailable.append(i)
            if exit_code:
                # If `git-pw patch/series apply <id>` fails,
                # reset to HEAD.
                print(
                    f"{bcolors.OKCYAN}git exit code: {bcolors.FAIL}{exit_code}{bcolors.ENDC}"
                )
                unmerged.append(i)
            else:
                if output.strip().endswith(already_applied_warning):
                    print(
                        f"{bcolors.WARNING}No changes -- already applied{bcolors.ENDC}\n"
                    )
                    already_applied.append(i)
                else:
                    print(f"{bcolors.OKCYAN}{type_} applied{bcolors.ENDC}\n")
                    merged.append(i)
            print(f"{bcolors.FAIL}resetting to HEAD:{bcolors.ENDC}")
            if os.path.exists(".git/rebase-apply"):
                run_cmd("git am --abort")
            run_cmd("git reset --hard master", debug=True)
        except KeyboardInterrupt:
            break
        except Exception as e:
            print(e)
            break


def get_patches(from_page=1, to_page=100):
    cmd = "git-pw patch list --page {0} -f csv --state 'new'"
    patches = []
    for i in range(from_page, to_page):
        exit_code, output = run_cmd(cmd.format(i), debug=True)
        if exit_code:
            print(f"git-pw exited with exit code {exit_code}")
            break
        patches.extend(output.strip().split("\n"))
    return patches


def get_series(from_page=1, to_page=100):
    cmd = "git-pw series list --page {0} -f csv"
    series = []
    for i in range(from_page, to_page):
        exit_code, output = run_cmd(cmd.format(i), debug=True)
        if exit_code:
            print(f"git-pw exited with exit code {exit_code}")
            break
        series.extend(output.strip().split("\n"))
    return series


def get_patches_for_series(list_, index, series_dict, patch_ids):
    for i in list_:
        print(f"running: git-pw series show {i} -f csv")
        ret, op = run_cmd(f"git-pw series show {i} -f csv")
        print("****", index, "****")
        if ret:
            print(f"exited with {ret}: {op}")
        series_data = read_rows(op.strip().split("\n"))
        print(series_data[1])
        for j in series_data[11:]:
            patch_data = j[1].split()
            print(patch_data[0], patch_data[1])
            series_dict[i].append(patch_data[0])
            if patch_data[0] in patch_ids:
                patch_ids.remove(patch_data[0])


def get_individual_patches():
    """Iterate over all of the series and all of the patches; the patches
    that do not belong to any series are dumped into a file."""
    # TODO: This function needs a refactor.
    file_loc = "/tmp/pwanalysis"
    if not os.path.exists(file_loc):
        os.mkdir(file_loc)
    series = [i for i in read_rows(get_series())]
    series_dict = {i[0]: [] for i in series}
    patches = read_rows(get_patches())
    patch_ids = [i[0] for i in patches]
    series_ids = [i for i in series_dict.keys()]
    get_patches_for_series(series_ids, 0, series_dict, patch_ids)
    print("Individual patches", len(patch_ids))
    print("series_dict", series_dict)
    write_json(file_loc + "/dict", series_dict)
    write_json(file_loc + "/patch_ids", {"patches": patch_ids})


if __name__ == "__main__":
    parser = argparse.ArgumentParser(description="Initial CI script for patchwork")
    parser.add_argument(
        "-c", "--colors", default=False, action="store_true", help="Enable colors"
    )
    parser.add_argument(
        "-t",
        "--type",
        type=str,
        default="series",
        choices=["patch", "series"],
        help="type: patch/series",
    )
    parser.add_argument(
        "-a", "--action", type=str, default="apply", help="action: list/apply"
    )
    parser.add_argument(
        "-o",
        "--output-location",
        type=str,
        default="/tmp/pw-results",
        help="location for the output files containing merged, unmerged and "
        "unavailable patches/series.",
    )
    parser.add_argument(
        "-i",
        "--input-file",
        type=str,
        default="",
        help="input file: csv file or '-' for the standard input. If no file "
        "is specified this data will be pulled from the patchwork instance.",
    )
    parser.add_argument(
        "-p",
        "--page-range",
        default="1-100",
        help="page range for patchwork in the format 'from_pageNo'-'to_pageNo', "
        "for example '1-100'",
    )
    parser.add_argument(
        "-u",
        "--get-individual-patches",
        default=False,
        action="store_true",
        help="""Get individual patches.
        In this case the series data and the patches data is pulled and
        compared to find the individual patches that do not belong to
        any series.""",
    )
    args = parser.parse_args()
    print(args)
    csv_data = []
    if args.input_file == "-":
        # Get the csv data from stdin.
        for line in sys.stdin:
            if '"ID"' not in line:
                print(line)
                csv_data.append(line.strip())
    elif os.path.exists(args.input_file):
        data = open(args.input_file).read().strip().split("\n")
        for line in data:
            if '"ID"' not in line:
                csv_data.append(line.strip())
    # The object we will be operating upon, series or patch; this is the
    # command line argument to git-pw, for example
    # "git-pw patch apply 12345" or "git-pw series apply 12356".
    type_ = args.type
    series_data = read_rows(csv_data)
    output_files_loc = args.output_location
    if not os.path.exists(output_files_loc):
        os.mkdir(output_files_loc)
    if args.get_individual_patches:
        print("getting individual patches")
        get_individual_patches()
        sys.exit(0)
    colors = args.colors

    class bcolors:
        if colors:
            HEADER = "\033[95m"
            OKBLUE = "\033[94m"
            OKCYAN = "\033[96m"
            OKGREEN = "\033[92m"
            WARNING = "\033[93m"
            FAIL = "\033[91m"
            ENDC = "\033[0m"
            BOLD = "\033[1m"
            UNDERLINE = "\033[4m"
        else:
            HEADER = ""
            OKBLUE = ""
            OKCYAN = ""
            OKGREEN = ""
            WARNING = ""
            FAIL = ""
            ENDC = ""
            BOLD = ""
            UNDERLINE = ""

    print(len(series_data))
    if args.action == "apply":
        if args.input_file == "":
            try:
                from_range, to_range = [int(i) for i in args.page_range.split("-")]
            except ValueError:
                print("invalid page range")
                sys.exit(1)
            series_data = [i for i in read_rows(get_series(from_range, to_range + 1))]
        apply_(series_data)
        print(
            "total merged: {0}, total unmerged: {1}, total unavailable: {2}".format(
                len(merged), len(unmerged), len(unavailable)
            )
        )
        write_file(f"{output_files_loc}/merged.txt", merged)
        write_file(f"{output_files_loc}/unmerged.txt", unmerged)
        write_file(f"{output_files_loc}/unavailable.txt", unavailable)
        write_file(f"{output_files_loc}/already_applied.txt", already_applied)

^ permalink raw reply	[flat|nested] 21+ messages in thread
* Re: [RFC] Updating patchwork patches on commit
  2021-02-04 15:47 ` Girish Joshi
@ 2021-02-12  5:25 ` Siddhesh Poyarekar
  2021-02-12  9:02 ` Siddhesh Poyarekar
  0 siblings, 0 replies; 21+ messages in thread
From: Siddhesh Poyarekar @ 2021-02-12 5:25 UTC (permalink / raw)
To: Girish Joshi, Girish Joshi via Libc-alpha; +Cc: Andreas Schwab, Joseph Myers

On 2/4/21 9:17 PM, Girish Joshi wrote:
> There are a couple of functions that need refactoring.
> But for now it does the job.
> Could you please review it?

Thanks for doing this; I've got the script running now to see what it
gives. While it was running, I noticed that it lists (and processes)
patches that have already been committed and are also marked as
committed on patchwork. Perhaps the git-pw invocation needs tweaking?

Siddhesh

^ permalink raw reply	[flat|nested] 21+ messages in thread
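[Editor's sketch: the already-committed patches Siddhesh observes could be dropped before the apply loop by filtering the listing rows on their state column. The column position of the State field in git-pw's CSV output is a guess here and should be checked against the actual header row:]

    import csv
    import io


    def filter_by_state(csv_text, wanted_state="New", state_col=4):
        """Keep only rows whose state column matches, skipping the header.

        `state_col` is an assumption about where `git-pw patch list -f csv`
        puts the State column; adjust it after inspecting the real header.
        """
        kept = []
        for row in csv.reader(io.StringIO(csv_text)):
            if not row or row[0] == "ID":
                continue
            if row[state_col] == wanted_state:
                kept.append(row)
        return kept

[The apply loop would then only ever see entries still awaiting review, instead of re-applying everything the server returns.]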
* Re: [RFC] Updating patchwork patches on commit
  2021-02-04 15:47 ` Girish Joshi
  2021-02-12  5:25 ` Siddhesh Poyarekar
@ 2021-02-12  9:02 ` Siddhesh Poyarekar
  2021-02-12 13:04 ` Carlos O'Donell
  1 sibling, 1 reply; 21+ messages in thread
From: Siddhesh Poyarekar @ 2021-02-12 9:02 UTC (permalink / raw)
To: Girish Joshi, Girish Joshi via Libc-alpha; +Cc: Andreas Schwab, Joseph Myers

On 2/4/21 9:17 PM, Girish Joshi wrote:
>> Once this is done, this script can be invoked after a regular interval
>> of time to check if the new patches can be applied.
> We can do this now.
>
> There are a couple of functions that need refactoring.
> But for now it does the job.
> Could you please review it?

The script run is now done and I have gone through the outputs. Some notes:

1. git-pw runs are leaving /tmp/git-pw* directories behind; you need to
   clean them up.

2. The script must ignore patches that are not in the New state.
   Currently it seems to be going through everything.

3. The output in pw-results seemed to mostly be old patches, from 2014
   or so. Perhaps it's hitting the server's limit at 2014 because it's
   not filtering correctly on patch state?

If the output does not correspond with what you're seeing, then please
send the command line you'd like me to run to get the output you're
seeing. The primary goal of this script set is to identify patches from
2019/2020 that are out of date and no longer apply, so that we can mark
them accordingly.

Thanks,
Siddhesh

^ permalink raw reply	[flat|nested] 21+ messages in thread
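[Editor's sketch: the stale /tmp/git-pw* scratch directories from note 1 can be swept at the end of each run. A minimal sketch, assuming the directories are safe to delete once git-pw has exited:]

    import glob
    import os
    import shutil


    def clean_git_pw_tmpdirs(tmp_root="/tmp", pattern="git-pw*"):
        """Remove leftover git-pw scratch directories; return their paths."""
        removed = []
        for path in glob.glob(os.path.join(tmp_root, pattern)):
            if os.path.isdir(path):
                shutil.rmtree(path, ignore_errors=True)
                removed.append(path)
        return removed

[Calling this once after the apply loop keeps repeated cron-style runs from accumulating scratch directories.]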
* Re: [RFC] Updating patchwork patches on commit
  2021-02-12  9:02 ` Siddhesh Poyarekar
@ 2021-02-12 13:04 ` Carlos O'Donell
  0 siblings, 0 replies; 21+ messages in thread
From: Carlos O'Donell @ 2021-02-12 13:04 UTC (permalink / raw)
To: Siddhesh Poyarekar, Girish Joshi, Girish Joshi via Libc-alpha
Cc: Andreas Schwab, Joseph Myers

On 2/12/21 4:02 AM, Siddhesh Poyarekar wrote:
> On 2/4/21 9:17 PM, Girish Joshi wrote:
>>> Once this is done, this script can be invoked after a regular interval
>>> of time to check if the new patches can be applied.
>> We can do this now.
>>
>> There are a couple of functions that need refactoring.
>> But for now it does the job.
>> Could you please review it?
>
> The script is now done and I have gone through the outputs. Some notes:
>
> 1. git-pw runs are leaving /tmp/git-pw* directories, you need to clean them up
>
> 2. The script must ignore patches that are not in the New state. Currently it seems to be going through everything.
>
> 3. The output in pw-results seemed to mostly be old patches from 2014 or so by default. Perhaps it's hitting the limit for the server in 2014 because it's not filtering correctly on patch state?
>
> If the output does not correspond with what you're seeing, then please send the commandline you'd like me to run to get the output you're seeing. The primary goal with this script set is to identify patches in 2019/2020 that are out of date and no longer apply so that we can mark them accordingly.

FYI, the kernel has a patchwork bot here:

https://git.kernel.org/pub/scm/linux/kernel/git/mricon/korg-helpers.git/tree/git-patchwork-bot.py

in case they are doing something interesting that we're not. Their bot
knows how to mark patches superseded based on vN markup, and it uses a
local sqlite3 database to track processing state.

-- 
Cheers,
Carlos.

^ permalink raw reply	[flat|nested] 21+ messages in thread
end of thread, other threads:[~2021-02-12 13:04 UTC | newest]

Thread overview: 21+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2020-12-07  5:48 [RFC] Updating patchwork patches on commit Siddhesh Poyarekar
2020-12-07  8:45 ` Florian Weimer
2020-12-07  9:30 ` Siddhesh Poyarekar
2020-12-07 16:15 ` DJ Delorie
2020-12-07 16:39 ` Siddhesh Poyarekar
2020-12-07 17:02 ` DJ Delorie
2020-12-07 18:11 ` Joseph Myers
2020-12-08  2:57 ` Siddhesh Poyarekar
2020-12-08  9:08 ` Andreas Schwab
2020-12-08 10:10 ` Siddhesh Poyarekar
2020-12-16 18:35 ` Girish Joshi
2020-12-16 18:49 ` Siddhesh Poyarekar
2020-12-17 17:49 ` Girish Joshi
2020-12-18  4:04 ` Siddhesh Poyarekar
2020-12-19 13:25 ` Girish Joshi
2020-12-22 15:13 ` Girish Joshi
2021-01-06 20:26 ` Girish Joshi
2021-02-04 15:47 ` Girish Joshi
2021-02-12  5:25 ` Siddhesh Poyarekar
2021-02-12  9:02 ` Siddhesh Poyarekar
2021-02-12 13:04 ` Carlos O'Donell
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox; as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).