.Open Mathematics in Action: What jobs have been running too long . Note This a real problem. . Background As part of my task of keeping an FTP server running I needed to be able to terminate FTP sessions that have been running for more than a certain number of hours - which I call the 'grace period'. I can get a list of these sessions with either the date or the time when they started in particular position. If there is a date then the session is more than 24 hours long and needs killing. When there is a start time then this is a number between 0 and 23 inclusive and indicates the time on a twenty-four hour clock. I can also get hold of the time now on the same twenty-four hour clock. All these times are whole numbers: 8 means 8 in the morning and 20 means 8 in the evening. 23 means 11 at night and 0 stands for midnight. My problem is to get a function that tells me whether or not a the difference between two 24-hour times is greater than a given grace period. . Examples When the grace period is 4 hours and a job started at 4 (24 hour clock time) then it needs to be killed when the 24 hour time is 8. If it starts at 11pm or 23 (on the 24 hour clock) then it needs to be stopped at 3am. . Quality Requirement We do not want to be sued for stopping jobs before the announced time. But we can not afford to let jobs accumulate.... the system crashes. . Formal Analysis We will use the C/C++ notation for remainder or modulus: Use C.expressions. ()|-(mod): For int i,j>0, i%j::=i - j*(i/j). (The |- symbol is used to indicate that a formaula has been asserted to be true. It may be preceeded by a list of reasons.) We can show that (mod)|- (r0): For i,j>0, i=i%j+j*(i/j). (mod)|- (r1): For i,j>0, for some k(i=k*j+i%j)). .Net s::int=`the start time`, S::0..23=`the start time on the 24 hour clock`. t::int=`the time now`, T::0..23=`the time, now, on the 24 hour clock`. g::0..23=`the grace period in hours`. D::int=T-S, the difference between to 24 hour clock times. d::int=t-s, the real time difference in hours. Given $D and $g we want to determine if d > g or if d<=g. We have been told: (given)|-(1): T = t % 24 (given)|-(2): S = s % 24 (given)|-(3): 0 <= t-s and t-s < 24. So we deduce: (1,2)|-(4): (T-S)%24 = (t-s)%24. (4,r1)|-(5): for some k, d = D + 24*k (1,2)|-(6): 0 <= T and S < 24, (6,3)|-(7): d in { D, D+24, D+48, ...} and 0..23 = { D, D+24 }. Now (above)|-(D0): D<0 or D=0 or D>0. (7,5)|-(11): if D=0 then d0 then (dk1,ei,k=>k2)|-(5.2): D+k1*24=d+24*k2, (5.2)|-(5.3): D= -k1*24+d+24*k2, (5.3)|-(5.4): D= d+24*(k2-k1), (5.4,eg,k=>k2-k1)|-(5.5): D= d+24*k, .Close.Let . Proof of 11 .Let |-(8): D=0. (8,7,3)|-(9): d=0, (9)|-(10): d -24, (13,7)|- (14): d = D+24, .Close.Let . Proof of 18 .Let |-(16): D>0. (16,7)|- (17):d=D. .Close.Let .Close.Net . C and C++ Code We would therefore use the following function .As_is int hangable(int T, int S, int g) .As_is { .As_is if (T-S < 0) .As_is return T-S+24 < g; .As_is else .As_is return T-S < g; .As_is } . Actual Implementation The "ftp.hangman" program is a shell script: .As_is : Hang up day old FTP sessions. Uses BSD -u option, RTFM .As_is tmp=/tmp/ftp.hang.$$ .As_is ps -axu >$tmp .As_is grep "^ftp" <$tmp | .As_is awk '$9!~/:/{print $2}' | .As_is while read pid; do kill -9 $pid; done .As_is rm $tmp This works because the set of running jobs is listed in $tmp by the `ps -axu` command. The first "word" on a line is the owner of the job, and so `grep "^ftp"` extracts only the FTP processes. The `ps -axu` format puts the time or month as the 9th word on the line and the process Id as the second word. Hence the `awk` command selects the 24 or more hour old processes and outputs their process Identifiers. This set of Ids (the old ftp processes) are read in, one at a time by the `while - do - done` loop and `kill`ed. Clearly the only change is check for the presence of a ":" in the 9th word, extract the hour (when it started) and compare: .As_is awk '$9!~/:/{print $2} .As_is $9~/:/{S=substr($9,0,2); .As_is if(S-T>=g || S-T<0 && (S-T+24)>g)print $2; .As_is }' g=$grace T=`date +%H` . Testing There is a hopeful theory that a program that is properly developed does not need testing. Given the unprovable state of most software tools this is a dangerous theory. In this case testing showed up a bug caused by the weak typing in 'awk'. The correct statement to select outstanding jobs is: .As_is if(S-T>=0+g || S-T<0 && (S-T+24)>0+g)print $2; since `g` is interpreted as a string and so lexographic rather than numeric ordering is used in the comparisons. . Conclusion Notice that the condition was not wrong. It is the coding that fell into a well known and avoidable trap. .Close What jobs have been running too long . Notes (ei): The "existential instanciation" rule of inference .See http://www/dick/maths/logic_2_Proofs.html#ei (eg): The "existential generalization" rule of inference .See http://www/dick/maths/logic_2_Proofs.html#eg given::=`a reason for assuming something`. above::=`short hand for using all previous assertions as the reason for the next one`.