[Dshield] Help Me Sort this Out (Apache Logs)

David Cary Hart DShield at TQMcube.com
Thu Apr 6 15:09:00 GMT 2006


On Thu, 6 Apr 2006 11:04:20 +1000
Cefiar <cef at optus.net> opined:
> > "Java/1.4.1_04" /var/log/httpd/access_log:88.208.194.64 - -
> > [05/Apr/2006:12:57:08 -0400]
> > "GET /https://www.oag.state.tx.us/forms/cpd/images/zombies.php
> > HTTP/1.1" 302 288related:
> > "-" "Java/1.4.1_04"
> 
> Well, I found this page on your site/domain (tqmcube.com) that
> contains the url (minus the /) -
> http://tqmcube.com/fss_fight_back.php - via Google.
> 
> Could be a badly written robot/spider that is getting confused and
> assuming that because the link doesn't start with http: it's
> actually local, and requesting it as though it's a local page? You
> might try looking through the access logs and seeing if the addresses
> that hit these URLs fetch robots.txt at some point prior to this.
> 
> BTW: If you're redirecting the pages yourself (as I see they're
> returning 302's) then they should really return 301's. If these are
> robots, they may pick up on the fact that it's actually moved if
> it's a 301, but not if it's a 302. This of course assumes that the
> robot author actually coded them well. Of course, if it isn't coded
> well, it mightn't make a difference, or they may even crash (their
> loss). For more info on why I suggest 301 instead of 302, see
> http://www.w3.org/Protocols/rfc2616/rfc2616-sec10.html for general
> info on HTTP status codes.
> 
Yes, we do have a link to that domain which I think is coincidental.
Our link: https://www.oag.state.tx.us/forms/cpd/cpd_getcounty.php

Robot?: https://www.oag.state.tx.us/forms/cpd/images/zombies.php
(sample) where zombies.php is in the virtual server root.

I cannot figure out why we are returning a 302 rather than a 404.
All of the redirects in our table (in httpd.conf) are associated
with .htm -> .php conversion. In other words, something like:
"Redirect permanent /index.html http://www.tqmcube.com/index.php"

Finally, most of these clients are in dynamic space and not expected
to make a recursive content fetch. I considered the possibility that
this might be from ODP/DMOZ volunteers but they would first get
robots.txt.
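
For anyone who wants to run the robots.txt check against their own logs,
here is a rough Python sketch. It assumes Apache "combined" format lines
(with or without a grep "filename:" prefix) and treats any request path
starting with /http:// or /https:// as one of the odd requests. The
function name and the prefix list are made up for illustration; nothing
here is what we actually run.

```python
import re
from datetime import datetime

# Apache "combined" log line, optionally prefixed by grep's "filename:".
LINE_RE = re.compile(
    r'^(?:[^:\s]+:)?(?P<ip>\d{1,3}(?:\.\d{1,3}){3}) \S+ \S+ '
    r'\[(?P<time>[^\]]+)\] "(?P<method>\S+) (?P<path>\S+)[^"]*" '
    r'(?P<status>\d{3}) (?P<size>\S+)'
)

# Assumption: an "odd" request is a full URL requested as a local path.
ODD_PREFIXES = ("/http://", "/https://")


def robots_before_odd_request(lines):
    """Map each client IP that made an odd request to True if it fetched
    /robots.txt at or before its first odd request, else False."""
    first_robots = {}  # ip -> time of earliest /robots.txt fetch
    first_odd = {}     # ip -> time of earliest odd request
    for line in lines:
        m = LINE_RE.match(line)
        if m is None:
            continue  # skip lines that are not in the expected format
        when = datetime.strptime(m.group("time"), "%d/%b/%Y:%H:%M:%S %z")
        ip = m.group("ip")
        path = m.group("path")
        if path.split("?", 1)[0] == "/robots.txt":
            target = first_robots
        elif path.startswith(ODD_PREFIXES):
            target = first_odd
        else:
            continue
        if ip not in target or when < target[ip]:
            target[ip] = when
    return {
        ip: ip in first_robots and first_robots[ip] <= when
        for ip, when in first_odd.items()
    }
```

Feeding it the lines of /var/log/httpd/access_log gives one entry per
client that made an odd request. A False value means that client never
asked for robots.txt before its first odd request.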

Yes, I always assume something is sinister until proven benign.
-- 
Our DNSRBL - Eliminate Spam: http://www.TQMcube.com
Multi-RBL Check: http://www.TQMcube.com/rblcheck.php
The Dirty Dozen Spammiest Ranges: http://tqmcube.com/dirty12.php


