One of the pain points I encountered was trying to determine which rules were causing performance issues. A typical scenario would be that the rules would pass validation testing by the business users, but then weeks or months later an atypically large data file would be processed uncovering a performance issue (requiring hours for the data to be processed rather than seconds). Isolating the problem was accomplished by strategically deleting rules until the performance issue went away. The offending rule(s) were then inspected and analyzed with respect to the data to determine the cause of the problem. This was hardly an ideal approach and only worked best if there were a few rules causing the issue.
From a user perspective, It would be nice if you could profile rule performance much as you can profile the amount of time spent in functions/methods in languages such as C and Java. In fact, CLIPS does have profiling functions which allow you to track the amount of time spent in the actions of a rule. Unfortunately, the performance issues in rules are usually caused by the way the conditions have been written. Because rule conditions can be shared, it's difficult to assign time spent while pattern matching to a specific rule.
When I was developing performance improvements for CLIPS 6.3, I added code to track the number of operations performed on partial matches in the join network to gauge the benefit of using hash tables rather than linked lists for the join memories. It occurred to me that this could be a useful metric for identifying performance issues, and from this idea came the join-activity command in CLIPS 6.3.
This new command easily identifies the offending rule in the manners benchmark program:
When used in terse mode, the join-activity command returns a multifield value containing the number of comparisons made between the left and right memories, the number of partial matches added to the join memories, and the number of partial matches removed from the join memories. For this program, the find_seating rule has much more join activity than all the other rules combined.CLIPS> (clear) CLIPS> (unwatch all) CLIPS> (load manners.clp) :%%%%%%%******** TRUE CLIPS> (reset) CLIPS> (load-facts manners128.fct) TRUE CLIPS> (run) CLIPS> (progn$ (?r (get-defrule-list)) (printout t ?r ": " (join-activity ?r terse) crlf)) assign_first_seat: (1314 1315 1315) find_seating: (5847660 7067211 7067211) make_path: (49549 16510 16510) path_done: (127 254 254) are_we_done: (128 255 255) continue: (0 127 127) print_results: (257 258 128) all_done: (0 1 0) CLIPS>
Using the verbose mode allows you to examine the information for each join in the rule.
In conjunction with the sort function, you can create a ranked listing of all the rules. Here's an example using the waltz benchmark.CLIPS> (join-activity find_seating verbose) Activity for CE 1 Compares: 0 Adds: 127 Deletes: 127 Activity for CE 2 Compares: 8128 Adds: 8128 Deletes: 8128 Activity for CE 3 Compares: 28255 Adds: 28255 Deletes: 28255 Activity for CE 4 Compares: 2487321 Adds: 1241614 Deletes: 1241614 Activity for CE 5 Compares: 2483228 Adds: 2483228 Deletes: 2483228 Activity for CE 6 Compares: 823428 Adds: 1660994 Deletes: 1660994 Activity for CE 7 Compares: 17300 Adds: 1644865 Deletes: 1644865 (5847660 7067211 7067211) CLIPS>
In this case I've added the compares, adds, and deletes together to create a single join activity value for each rule displayed in the output. Hopefully this new command will provide CLIPS developers with more options for analyzing the performance of their rules.CLIPS> (clear) CLIPS> (unwatch all) CLIPS> (load waltz.clp) %%%%:!!!!!!******************************** TRUE CLIPS> (reset) CLIPS> (load-facts waltz50.fct) TRUE CLIPS> (run) CLIPS> (deffunction ja-lt (?r1 ?r2) (< (+ (expand$ (join-activity ?r1 terse))) (+ (expand$ (join-activity ?r2 terse))))) CLIPS> (progn$ (?r (sort ja-lt (get-defrule-list))) (format t "%-35s %10d%n" (str-cat ?r ":") (+ (expand$ (join-activity ?r terse))))) initial_boundary_junction_L: 1165683 initial_boundary_junction_arrow: 511101 make_L: 163364 make-3_junction: 128960 label_arrow-1B: 59799 label_tee_B: 55991 label_arrow-2B: 39455 label_arrow-1A: 35692 match_edge: 32072 second_boundary_junction_L: 26984 label_tee_A: 17095 label_arrow-2A: 16217 label_L: 14926 label_arrow-4A: 13811 label_arrow-4B: 13811 label_fork-1: 12727 reverse_edges: 11618 plot_boundaries: 9830 second_boundary_junction_arrow: 9641 done_reversing: 3878 label_fork-2: 2213 label_fork-3: 2213 label_arrow-3A: 1838 label_arrow-3B: 1838 label_arrow-5A: 1838 label_arrow-5B: 1838 plot_remaining: 1808 label_fork-4: 614 done_detecting: 7 done_plotting: 5 done_labeling: 2 begin: 0 "" CLIPS>
No comments:
Post a Comment