CSCI 104 - Fall 2019 Data Structures and Design

Midterm Review

The focus of this lab is to best prepare you for the midterm, so if there's anything you want to cover, feel free to stop us and ask!

This review briefly goes over all the main topics, with a more in-depth look at runtime, since that's what people have been struggling with lately, but again, we can expand on any of these topics (or anything not covered in this lab, too)!

1 - Runtime Analysis

What's the difference between O, Theta, and Omega?

O - Upper bound

Theta - Tight bound

Omega - Lower bound

A common misconception that students have is that O is the same as worst-case, Theta is the same as average-case, and Omega is the same as best-case. This is NOT accurate! "Cases" are based on what branch the code takes (if the code branches).

Look at this for loop as a basic example.

for(int i = 0; i < nums.size(); i++) {
    cout << nums[i] << endl;
}

It would be accurate (though not particularly useful) to say this code is O(n^2), but it would NOT be accurate to say that its worst-case runtime is n^2, because worst case and O are not the same thing. This code has NO branches, so there's no difference in runtime between worst-case, average-case, and best-case.

What is the runtime?

for(int i = 0; i < nums.size(); i++) {
    if(nums[i] % 3 == 0)
    {
        for(int j = 0; j < nums.size(); j++) {
            cout << nums[j] << endl;
        }
    }
}

Now, the code branches. The WORST case runtime would be if the if statement is true every time. What's the worst-case runtime (theta)? What's the best-case runtime (theta)?

Some things to think about:

Sums you should definitely put on your notes sheet:

Runtime analysis could appear in two separate forms, iterative or recursive. The strategies used to solve these types are different and will be discussed below.

Iterative

You all should have a good understanding of runtime analysis from homework 3. The summations are very handy when you encounter a pattern that you are familiar with.

Lets take a simple example for now:

for (int i = 1; i < n; i++)
{
    for (int j = 1; j < n; j *= 2)
    { 
        // Do something
    }
}

We can see that the inner loop executes over values of j and doubles j at every iteration: 1, 2, 4, 8, 16..., n. The loop stops when n > 2^k, where k is the number of times the loop executes. So what is k? Note that k = log base 2 of n, making the runtime O(logn).

What about the outer loop?

Now, let's put them together. During every iteration of the outer loop we do an execution of the inner loop. This means that we can multiple the two runtimes together, making the final runtime O(nlogn).

Recursive

In order to solve recursive runtimes, you will need to identify two things: the recurrence relation, and the base case. The recurrence relation is how the input size changes at every recursive call. Take the following example:

int recursion(int num) {
    int val;
    if (num > 0) {
        val = recursion(num/2) + recursion(num/2);
        return val;
    }
    else {
        return 1;
    }
} 

The recurrence relation is: T(n) = 2T(n/2) + C for n > 0. Where do the *2 and the /2 come from?

The 'C' is the constant amount of computation needed at each recursive step.

The base case is: T(n) = 1 for n <= 0. This is when we no longer recurse and just return.

Lets solve this using a recursion tree. In order to do this we need to determine 2 things: the depth of the tree, and the work done at each level.

Depth of Tree: Because we are halving the size of n at every level, there are logn levels.

Work at Each Level: Let's assign i to be the level we're on. At each level, there are 2^i recursive calls, each doing 'C' amount of work.

Let's write out the summation.

Which series can we use to solve this? What's the final answer?

2 - Sorting

There's code for this in the lecture notes, so we're not going to repeat it, but you should understand MergeSort and QuickSort, and know how to do them both by hand.

What is the worst choice of pivot for QuickSort? What's the best?

3 - Template

What's the benefit of using templates? Why do we put implementations in .hpp instead of .cpp?

4 - Graphs

What are the different ways of representing graphs? When might you use one over another?

BFS and DFS can be written with very similar code — what's the main difference? Which would you use if you wanted to find the shortest path from a start node to an end node?

What is Dijkstra's algorithm? What is A*? What's an example of when you would use A*? What would the heuristic be?

5 - Trees

What makes a graph a tree? A full tree? A complete tree? A BST?

What is the difference between pre-order, in-order, and post-order traversal? What's an example of when you would need to use post-order traversal?

6 - More practice

If you haven't done the practice midterm yet, DO IT!

Also go over your lecture notes and the handout questions, the web exercises, the assignments, and the course notes. You could also look online for coding problems — some areas that are missing from the web exercises that you should practice are graph problems and recursion.

7 - Q&A