* [Bug Infrastructure/31551] New: Better fail2ban scripts for search/ai spider fighting
@ 2024-03-25 0:22 mark at klomp dot org
From: mark at klomp dot org @ 2024-03-25 0:22 UTC (permalink / raw)
To: overseers
https://sourceware.org/bugzilla/show_bug.cgi?id=31551
Bug ID: 31551
Summary: Better fail2ban scripts for search/ai spider fighting
Product: sourceware
Version: unspecified
Status: NEW
Severity: normal
Priority: P2
Component: Infrastructure
Assignee: overseers at sourceware dot org
Reporter: mark at klomp dot org
Target Milestone: ---
Search and AI spiders are difficult to deal with. Since everything we do
is open and public, we actually want people to be able to find anything
our projects publish. But these spiders (especially the new AI ones) are
often very aggressive and ignore our robots.txt, causing service
overload.
We have some fail2ban scripts that help, and in the worst case we add
aggressive spider IP addresses to the httpd block.include list
(by hand). But this doesn't really scale. One solution is smarter
fail2ban scripts. Another is providing sitemaps (https://www.sitemaps.org/)
so spiders have a known list of resources to index and we can more
easily block any that go outside those.
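As a rough illustration of the sitemap idea, here is a minimal sketch in
Python. It only shows the shape of the approach: build a sitemap from a
list of URLs, then read it back as the set of "known" resources that a
request can be checked against. The function names and the example URLs
are hypothetical and not part of any existing sourceware script.

```python
import xml.etree.ElementTree as ET

# Namespace defined by the sitemaps.org protocol.
NS = "http://www.sitemaps.org/schemas/sitemap/0.9"

def build_sitemap(urls):
    """Return sitemap XML text listing the given URLs (hypothetical helper)."""
    urlset = ET.Element("urlset", xmlns=NS)
    for url in urls:
        ET.SubElement(ET.SubElement(urlset, "url"), "loc").text = url
    return ET.tostring(urlset, encoding="unicode")

def sitemap_urls(xml_text):
    """Return the set of URLs listed in a sitemap (hypothetical helper)."""
    root = ET.fromstring(xml_text)
    return {loc.text.strip() for loc in root.iter("{%s}loc" % NS)}

def outside_sitemap(requested_url, known):
    """True if a requested URL is not among the published sitemap URLs."""
    return requested_url not in known
```

A spider that keeps requesting URLs for which outside_sitemap() is true
would then be a candidate for closer inspection or blocking.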
We should have some kind of automation that ties fail2ban to robots.txt.
Anything that aggressively hits URLs that are disallowed in robots.txt
should get banned.
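One possible shape for that automation, as a minimal sketch in Python: read
the Disallow prefixes from robots.txt, count requests to those prefixes
per client address in an access log, and report the addresses above a
threshold. This is an illustration only. It assumes Common/Combined Log
Format, merges all user-agent groups, and ignores robots.txt wildcards;
the function names and the max_hits default are hypothetical.

```python
import re
from collections import Counter

# Client address and request path in a Common/Combined Log Format line.
LOG_LINE = re.compile(r'^(\S+) \S+ \S+ \[[^\]]*\] "[A-Z]+ (\S+)')

def disallowed_prefixes(robots_txt):
    """Collect Disallow path prefixes from robots.txt text.

    Simplification: all user-agent groups are merged and the
    '*' and '$' pattern characters are not interpreted.
    """
    prefixes = []
    for line in robots_txt.splitlines():
        line = line.split("#", 1)[0].strip()  # drop comments
        if line.lower().startswith("disallow:"):
            path = line.split(":", 1)[1].strip()
            if path:  # an empty Disallow disallows nothing
                prefixes.append(path)
    return prefixes

def offenders(log_lines, prefixes, max_hits=20):
    """Return addresses with more than max_hits disallowed requests."""
    hits = Counter()
    for line in log_lines:
        m = LOG_LINE.match(line)
        if m and m.group(2).startswith(tuple(prefixes)):
            hits[m.group(1)] += 1
    return sorted(ip for ip, n in hits.items() if n > max_hits)
```

The resulting addresses could then be handed to fail2ban, for example
with "fail2ban-client set <JAIL> banip <IP>", where the jail name is
whatever the local configuration defines.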
--
You are receiving this mail because:
You are the assignee for the bug.
* [Bug Infrastructure/31551] Better fail2ban scripts for search/ai spider fighting
@ 2024-03-25 0:24 ` mark at klomp dot org
From: mark at klomp dot org @ 2024-03-25 0:24 UTC (permalink / raw)
To: overseers
https://sourceware.org/bugzilla/show_bug.cgi?id=31551
Mark Wielaard <mark at klomp dot org> changed:
What     |Removed |Added
----------------------------------------------------------------------------
See Also |        |https://sourceware.org/bugzilla/show_bug.cgi?id=31549