[Dshield] Trend analysis at DShield

Johannes B. Ullrich jullrich at sans.org
Tue Jun 15 14:43:00 GMT 2004


On Tue, 2004-06-15 at 07:06, Pete Cap wrote:
> List,
>  
> Does anyone remember when the Trends page at DShield used to include
> an explanation of how the trends are calculated?
>  
> It still has a quick blurb in there about what the values mean and it
> still says "For more details, see the end of this page."  But since
> the re-org the explanation of the maths involved seems to have vanished.
>  
> Hoping it will be possible to get that back--I want to implement
> something similar here but I don't want to reinvent the wheel...
>  

Well, the reason I removed the blurb was that I am reinventing the
wheel ;-). Trying out different ways to get a better trend number.

The basic idea was (and will likely still be):

(1) calculate the average number of reports per day for
    a particular port.
(2) divide this number by the average total number of reports
    received per day, which gives the port's share of all
    reports. This is important, as we try to compare this
    number to recent reports, and we may not have all submissions
    in yet for the current day.
(3) divide the number of reports received for this port today
    by the total number of reports received for today.

So "in math" you got:

C = total number of reports
Cp = total number of reports for a particular port
Cpd = number of reports for a particular port during the
      last day
Cd = number of all reports during the last day

Dp = Average share of all reports that go to the port.
Rp = Recent (last day) share of reports that go to the port.

Dp=Cp/C  Rp=Cpd/Cd
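
To illustrate why the counts are normalized, here is a minimal
Python sketch. The function name share and all the counts are
made up for this example; they are not taken from DShield.

```python
def share(port_reports, all_reports):
    """Fraction of all reports that target one port (Cp/C or Cpd/Cd)."""
    if all_reports <= 0:
        raise ValueError("need at least one report in total")
    return port_reports / all_reports


# Hypothetical numbers: the same day seen with all submissions in,
# and again with only half of them in so far.
full_day = share(400, 100_000)
half_day = share(200, 50_000)
```

Both values come out as 0.004, so a day with incomplete
submissions can still be compared with the long-term average, as
long as the missing reports are not biased toward particular ports.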

ok. to get a trend, we could just divide these two 
numbers:

T=Rp/Dp... if it's >1, we have an increase, if it's
           <1 we have a decrease.

However, if we take the log of this, we get <0
for a decrease and >0 for an increase

T=ln(Rp/Dp) is what I used to calculate the trend.
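
A minimal Python sketch of this calculation, using the symbols
defined above as lowercase argument names. The function name
trend and the decision to reject zero counts are assumptions of
the sketch, not a description of the DShield code.

```python
import math


def trend(c, cp, cd, cpd):
    """Return T = ln(Rp/Dp), with Dp = Cp/C and Rp = Cpd/Cd."""
    # ln is undefined for zero, so a port with no reports (overall or
    # in the last day) has to be handled separately by the caller.
    if min(c, cp, cd, cpd) <= 0:
        raise ValueError("all counts must be positive")
    dp = cp / c      # long-term share of reports for the port
    rp = cpd / cd    # last-day share of reports for the port
    return math.log(rp / dp)
```

With hypothetical counts C=1000000, Cp=5000, Cd=40000 and
Cpd=400, the port's share doubles from 0.5% to 1%, so T = ln(2),
about 0.69. Halving the share instead gives about -0.69, which is
the symmetry that makes the log convenient.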

Next, you would like to know if the trend is
significant. So you need to figure out the error.
I used standard counting statistics (Poisson) for
that. The std error on a count is the square
root of the count. Then you plug it into standard
error propagation,

Terr= sqrt( 1/C + 1/Cp + 1/Cpd + 1/Cd )

(or something close to this... proof left as an exercise.. but
essentially, the more reports, the smaller the error ;-) )
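
One way to carry out the propagation, as a sketch: write
T = ln(Cpd) - ln(Cd) - ln(Cp) + ln(C). A Poisson count N has a
relative error of 1/sqrt(N), so ln(N) has an error of about
1/sqrt(N), and the four terms add in quadrature. This treats the
four counts as independent, which is only an approximation since
they overlap. The names trend_error and is_significant, and the
2-sigma cutoff, are assumptions of this example.

```python
import math


def trend_error(c, cp, cd, cpd):
    """Approximate std error of T = ln((Cpd/Cd) / (Cp/C)).

    Assumes each count is an independent Poisson variable, so the
    error on ln(N) is about 1/sqrt(N).
    """
    if min(c, cp, cd, cpd) <= 0:
        raise ValueError("all counts must be positive")
    return math.sqrt(1 / c + 1 / cp + 1 / cd + 1 / cpd)


def is_significant(t, t_err, n_sigma=2.0):
    """True if the trend differs from zero by more than n_sigma errors.

    The default of 2 sigma is an arbitrary choice for illustration.
    """
    return abs(t) > n_sigma * t_err
```

The smallest count dominates the error. A port whose share
doubles with 400 reports in the last day has an error of roughly
0.05 on T = 0.69, while the same doubling with only 2 reports in
the last day has an error of roughly 0.8 and would not pass the
2-sigma test.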

-- 
CTO SANS Internet Storm Center               http://isc.sans.org
phone: (617) 837 2807                          jullrich at sans.org 

contact details: http://johannes.homepc.org/contact.htm



