= phenomenon = nb reported that running report_mirror would report hundreds of directories deleted on each run.
= reason = The crawler creates HostCategoryDir entries for directories that are not up to date on the mirror (in fact, directories that nb has excluded on his mirror). report_mirror then deletes these entries, so every run reports them as deleted again.
= recommendation =
commit 20986e503db481d4760e6e3ea74b07863a8e1cf9
Author: Matt Domsch <matt@domsch.com>
Date:   Sun Oct 4 22:25:15 2009 -0500

    crawler: don't create HCDs for directories which aren't up2date

diff --git a/server/crawler_perhost b/server/crawler_perhost
index 9553fb8..8964c7e 100755
--- a/server/crawler_perhost
+++ b/server/crawler_perhost
@@ -303,6 +303,9 @@ def sync_hcds(host, host_category_dirs):
         if hcd.count() > 0:
             hcd = hcd[0]
         else:
+            # don't create HCDs for directories which aren't up2date on the mirror
+            # chances are the mirror is excluding that directory
+            if not up2date: continue
             hcd = HostCategoryDir(host_category=hc, path=path, directory=d)
         if hcd.directory is None:
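
The effect of the patch can be illustrated with a small standalone sketch. This is not the MirrorManager code: the `HostCategoryDir` dataclass, the dict-based storage, and the function name `sync_hcds_sketch` are simplified stand-ins assumed here for illustration. It only shows the rule the commit introduces: an existing entry is still updated, but a new entry is not created for a directory that is not up to date.

```python
from dataclasses import dataclass


@dataclass
class HostCategoryDir:
    # Simplified stand-in for the real database-backed class (assumption).
    path: str
    up2date: bool


def sync_hcds_sketch(existing, crawled):
    """Update `existing` (path -> HostCategoryDir) from crawl results.

    `crawled` maps each directory path to whether the mirror's copy
    is up to date. Both structures are hypothetical simplifications.
    """
    for path, up2date in crawled.items():
        hcd = existing.get(path)
        if hcd is None:
            # Don't create an entry for a directory that isn't up to date;
            # chances are the mirror is excluding that directory.
            if not up2date:
                continue
            hcd = HostCategoryDir(path=path, up2date=up2date)
            existing[path] = hcd
        # Entries that already exist are still refreshed.
        hcd.up2date = up2date
    return existing


if __name__ == "__main__":
    known = {"pub/a": HostCategoryDir("pub/a", True)}
    seen = {"pub/a": False, "pub/b": True, "pub/excluded": False}
    for entry in sync_hcds_sketch(known, seen).values():
        print(entry)
```

With this rule, an excluded directory never gets an entry from the crawler, so report_mirror has nothing to delete for it on the next run.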
The fix is now live with the released MM 1.3.3.