Selasa, 23 Juli 2013

Loop Detection - Chapter 3 Algorithms Used to Detect Loops


Loop Detection
Peter Silberman



Chapter 3
Algorithms Used to Detect Loops
 
A lot of research has been done on the subject of loop detection. The research, however, was not done for the purpose of finding and exploiting vulnerabilities that exist inside of loops. Most research has been done with an interest in recognizing and optimizing loops (A good article about loop optimization and compiler optimization is http://www.cs.princeton.edu/courses/archive/spring03/cs320/notes/loops.pdf.

Research on the optimization of loops has led scientists to classify various types of loops. There are two distinct categories to which any loop will belong. Either the loop will be an irreducible loop (Irreducible loops are defined as ”loops with multiple entry [points]” (http://portal.acm.org/citation.cfm?id=236114.236115) or a reducible loop (3Reducible loops are defined as ”loops with one entry [point]” (http://portal.acm.org/citation.cfm?id=236114.236115). Given that there are two different distinct categories, it stands to reason that the two types of loops are detected in different fashions. Two popular papers on loop detection are Interval Finding Algorithm[1] and Identifying Loops Using DJ Graphs[2]. This document will cover the most widely accepted theory on loop detection.

3.1 Natural Loop Detection
One of the most well known algorithms for loop detection is demonstrated in the book Compilers Principles, Techniques, and Tools by Alfred V. Aho, Ravi Sethi and Jeffrey D. Ullman. In this algorithm, the authors use a technique that consists of two components to find natural loops (A natural loop ”Has a single entry point. The header dominates all nodes in the loop.” (http://www-2.cs.cmu.edu/afs/cs.cmu.edu/academic/class/15745-s03/public/lectures/L7 handouts.pdf all loops are not natural loops).

The first component of natural loop detection is to build a dominator tree out of the control flow graph (CFG). A dominator can be found when all paths to a given node have to go through another node. A control flow graph is essentially a map of code execution with directional information. The algorithm in the
book calls for the finding of all the dominators in a CFG. Let’s look at the actual algorithm.

Starting from the entry node, the algorithm needs to check if there is a path to the slave from the entry node. This path has to avoid the master node.

If it is possible to get to the slave node without touching the master node, it can be determined that the master node does not dominate the slave node.

If it is not possible to get to the slave node, it is determined that the master node does dominate the slave. To implement this routine the user would call the is path to(ea t from, ea t to, ea t avoid) function included in loop detection.cpp. This function will essentially check to see if there is a path from the parameter from that can get to the parameter to, and will avoid the node specified in avoid.

As the reader can see from Figure 1, there is a loop in this CFG. Let B to C to D be the path of nodes that create a loop, it will be represented as B->C->D. There is also another loop from nodes B->D. Using the algorithm described above it is possible to verify which of these nodes is involved in the natural loop. The first question to ask is if the flow of the program can get from A to D while avoidingB. As the reader can see, it is impossible in this case to get to D avoiding B.

As such, a call to the is path to function will tell the user that B Dominates D. This can be represented as B Dom D, and B Dom C. This is due to the fact that there is no way to reach C or D without going through B. One question that might be asked is how exactly does this demonstrate a loop? The answer is that, in fact, it doesn’t. The second component of the natural loop detection checks to see if there is a link, or backedge, from D to B that would allow the flow of the program to return to node B to complete the loop. In the case of B->D there exists a backedge that does complete the loop.

3.2 Problems with Natural Loop Detection
There is a very big problem with natural loops. The problem is with the natural loop definition which is “a single entry point whose header dominates all the nodes in the loop”. Natural loop detection does not deal with irreducible loops, as defined previously.

As the reader can see both B and D are entry points into C. Also neither D nor B dominates C. This throws a huge wrench into the algorithm and makes it only able to pick up loops that fall under the specification of a natural loop or reducible loop (5It is important to note that it is next that it is next to impossible to reproduce ??F:fig2) without using nested for loops and goto’s statements. For this reason it is rare that the reader will see an example of this in a binary. However, it is possible therefore the author thought it important to mention).

Tidak ada komentar:

Posting Komentar