Tuesday, August 01, 2006

Crawling out of the woodwork

Now this is weird. To say the least.

Over the weekend I tickled the sourcecode very slightly - I added a single new function to do fuzzy matching inside the code rather than externally (as mentioned in my last post). It's not working yet, this is just the hook that lets me develop it. Four lines changed in the code, none of them difficult, or with anything unusual about them.

When I released last night I started getting bug reports from the placefinder. It was blowing up in a line that multiplied a score by 100. When I adjusted that to stop it happening, the same happened in another line that multipled a different value by 100.

I got up early this morning as this was driving me bonkers and got down to work on it. The errors were coming out of the parsing part of the code - which hasn't been touched since September 2004. The adjustments I'd made to the code weren't the sort to have introduced any memory corruption bugs (plus the fact it was doing exactly the same on the development and server machine, and was utterly consistent), so what the heck was happening?

A bit of experiment showed that:
print 2 * 95
worked just as it should, as did
print 2 * 101
but try to do anything between 95 and 100 and it died. It wasn't the multiplication that did it, just the same happened with addition etc.

I know that it won't help, but I fruitlessly search for 94s and 95s, 100s and 99s in the code.

When I was first developing the CanalplanAC interpreter, which is far and away the most complicated thing I've ever written, I took the trouble to build a dedicated debugging and tracing feature into it, and left all the calls in the code - but they are excluded normally by some macro hackery.

So I changed a constant, rebuilt from clean, and this time, before typing
x = 2 * 95
I typed
trace "EF"
'E' means trace expressions, and 'F' means trace syntactical flow.

In another console I did the same with x = 2 * 10 and I compared the outputs. Each generated a page that looked like this (the full trace for a successful evaluation and assignment of 20 to x)

syntax.c (147) Read_Everything Read: x = 2 * 10 as 1
syntax.c (162) Read_Everything Identifier token is: x
syntax.c (187) Read_Everything Parsing [SET/GOSUB]
syntax.c (188) Read_Everything Syntax table reads: %IK#
syntax.c (190) Read_Everything Using syntax table entry %
syntax.c (190) Read_Everything Using syntax table entry I
syntax.c (853) Match_Simple Read identifier - Identifier token is: x
syntax.c (190) Read_Everything Using syntax table entry K
syntax.c (301) Read_Everything Next token read: Token is operator: =
syntax.c (530) Read_Everything Keyword changed to [IMPLICIT SET]
syntax.c (190) Read_Everything Using syntax table entry #
syntax.c (285) Read_Everything Spliced to syntax table: #%A~X
syntax.c (190) Read_Everything Using syntax table entry %
syntax.c (190) Read_Everything Using syntax table entry A
token.c (389) Syntactic_Unit First token: Identifier token is: x
token.c (421) Pull_Expression Expression token: Token is operator: =
syntax.c (747) Match_Simple Item is an expression: x
syntax.c (190) Read_Everything Using syntax table entry ~
syntax.c (190) Read_Everything Using syntax table entry X
NUMBER: 2
token.c (389) Syntactic_Unit First token: Integer constant is: 2
token.c (421) Pull_Expression Expression token: Token is operator: *
NUMBER: 10
token.c (421) Pull_Expression Expression token: Integer constant is: 10
token.c (421) Pull_Expression Expression token: empty token
syntax.c (731) Match_Simple Item is an expression: 2 * 10
syntax.c (595) Read_Everything ?Separator empty token
syntax.c (606) Read_Everything End of line: x = 2 * 10
syntax.c (607) Read_Everything Lastpushed: 0, depth: 0
syntax.c (634) Read_Everything About to resolve procedure calls
expression.c (313) Real_Evaluate Caching this expression
expression.c (336) Real_Evaluate Getting token from expression into new storage
expression.c (354) Real_Evaluate Evaluating - Identifier token is: x
variables.c (148) Create_Global_Variable Creating global variable x
variables.c (97) hashdup Making item of size 10 to hold string 'x' - allocated at 0x86bc5b8
variables.c (103) hashdup The name item points at 0x86bc5c0
variables.c (105) hashdup Copied in
variables.c (152) Create_Global_Variable Done!
expression.c (958) Push_Variable_Onto_Valstack Pushing variable called x onto the stack
expression.c (959) Push_Variable_Onto_Valstack It holds [NULL]
expression.c (336) Real_Evaluate Getting token from expression into new storage
expression.c (354) Real_Evaluate Evaluating - empty token
expression.c (560) Real_Evaluate Returning from Evaluate
expression.c (1069) Drop_Valstk Dropping value stack
expression.c (188) Evaluate_String - EXPRESSION TO EVALUATE IS 2 * 10
expression.c (313) Real_Evaluate Caching this expression
expression.c (336) Real_Evaluate Getting token from expression into new storage
NUMBER: 2
expression.c (354) Real_Evaluate Evaluating - Integer constant is: 2
expression.c (921) Push_Value_Onto_Valstack Pushing value of 2 onto the stack
expression.c (336) Real_Evaluate Getting token from expression into new storage
expression.c (354) Real_Evaluate Evaluating - Token is operator: *
expression.c (336) Real_Evaluate Getting token from expression into new storage
NUMBER: 10
expression.c (354) Real_Evaluate Evaluating - Integer constant is: 10
expression.c (921) Push_Value_Onto_Valstack Pushing value of 10 onto the stack
expression.c (336) Real_Evaluate Getting token from expression into new storage
expression.c (354) Real_Evaluate Evaluating - empty token
expression.c (557) Real_Evaluate End-loop. Have popped *
expression.c (1018) Pull_Top_Value_Off_Stack Result was a new value 10
expression.c (1018) Pull_Top_Value_Off_Stack Result was a new value 2
expression.c (663) Operate_On_Stack Calculating 2 * 10
expression.c (665) Operate_On_Stack Evaluates to 20
expression.c (921) Push_Value_Onto_Valstack Pushing value of 20 onto the stack
expression.c (560) Real_Evaluate Returning from Evaluate
expression.c (1018) Pull_Top_Value_Off_Stack Result was a new value 20
expression.c (210) Evaluate_String -END OF EXPRESSION- returning 20
expression.c (1069) Drop_Valstk Dropping value stack

The faulty calculation stopped dead with an error after the line I've marked in red.

Armed with this it was easy to see what was happening. The code was wrong, it was as simple as that - a test that should only have been performed when checking an "operator" was being performed for numbers as well. CanalplanAC stores all symbols in what is called a "union" - where different sorts of things are stored in the same space.

Now the first lesson. The time and effort I put into writing the trace module and embedding trace statements in the code this has paid off just today. This part of the code runs like clockwork, and I've never looked at it for ages. It's like debugging someone else's code, not mine now. Without the debugging this problem would have taken days to find.

Lesson two - you are not clever enough to use "fall-through" in C, even if you think you are. Even if it's one of the accepted reasons for doing it, and you comment it. It will still bite you on the arse.

When I added the new function I increased the size of the table that holds function names, and this in turn changed the values of the operators by one. This meant that the largest operator now had a value of 100.

Would you believe it, but for the last two years Canalplan has been incapable of evaluating any expression that ended with a literal number between 94 and 99? Well it has. And at no time at all has anyone tried to do this. There aren't any of them, anywhere in the code. But there are several 100s.

Lesson three - nobody ever tests their code enough.

And on that cheery note, I'm off to work.

0 Comments:

Post a Comment

<< Home