Unit 05: Algorithm efficiency

  1. Introduction 1
  2. Introduction 2
  3. Example: Fibonacci 1
  4. Example: Fibonacci 2
  5. Example: Fibonacci 3
  6. Comparing iterative and recursive 1
  7. Comparing iterative and recursive 2
  8. Big-O Notation and Growth Rates
  9. Constant time, O(1)
  10. Logarithmic time, O(log(n))
  11. Linear time, O(n)
  12. Quadratic time, O(n^2)
  13. Cubic time, O(n^3)
  14. Exponential time, O(2^n)
  15. Growth rate comparisons


Introduction: Algorithm Efficiency (and why we care)

Processing power continues to increase over time. Many operations finish so quickly that we barely notice them, if at all. That doesn't mean we can write code as inefficiently as we like and just let it run because "computers are fast enough, right?" As technology evolves we crunch more and more data, and as "big data" becomes a bigger field we need to work with that data efficiently - because no matter how powerful a machine is, processing large quantities of data with an inefficient algorithm can still take a very long time.

And some computers today are slow. Not usually the ones the average person is using (even though it might feel that way if they're running Windows). Plenty of computers that run dedicated systems are basic or small and don't have the luxury of great specs: fitness trackers, dedicated GPS devices, a thermostat, and so on. For systems like these, processing efficiency is especially important.

Introduction: Algorithm Efficiency (and why we care)

Let's say our algorithm's time to execute increases quadratically. That means that as we increase the number of items to process (n), the time to process that data goes up with the square of n.

For processing 1 piece of data, it takes 1 unit of time (we aren't focused so much on the actual unit of time, since each machine runs at a different speed) - that's fine. Processing 2 pieces of data? 4 units of time. 8 pieces of data? 64 units of time. The time isn't just proportional to the data - it's the square of the amount of data, so doubling the data quadruples the time.

How much data do you think is generated every day on a social media website that has millions of users? Now imagine the processing time for an algorithm with a quadratic increase…

From a design standpoint, we also need to know the efficiency of different algorithms (such as the algorithm to find data in a linked list, or the algorithm to resize a dynamic array) in order to make design decisions about the best option for our software (a short linked-list sketch follows the table below).

Data (n)   Time units
1          1
2          4
3          9
4          16
100        10,000
1,000      1,000,000
10,000     100,000,000
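
As a hedged sketch of the linked-list case mentioned above (the Node type and Find function here are our own illustration, not part of the course code): finding a value in a singly linked list means walking node by node from the front, so in the worst case the time grows linearly with the length of the list.

struct Node
{
  int data;
  Node* next;
};

// Walks the list from the head; in the worst case it visits every node.
Node* Find(Node* head, int search)
{
  for (Node* current = head; current != nullptr; current = current->next)
    {
      if (current->data == search) { return current; }
    }
  return nullptr;   // reached the end without finding the value
}

Knowing that this lookup is linear (while, say, indexing into an array is constant time) is exactly the kind of information those design decisions rely on.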

Example: Fibonacci sequence, iterative and recursive

As an example, you shouldn't use a recursive algorithm to generate a Fibonacci sequence (each number F[n] in the sequence is equal to the sum of the two previous items: F[n] = F[n-1] + F[n-2]). It's just not as efficient! Let's look at generating this sequence of numbers:

1, 1, 2, 3, 5, 8, 13, 21, 34, 55

The most efficient way to do this would be with an iterative approach using a loop. However, we could also figure it out with a recursive approach, though the recursive approach is much less efficient.

Fibonacci sequence, iterative:

int GetFib_Iterative(int n)
{
  if (n == 0 || n == 1) { return 1; }

  int n_prev = 1;       // F[i-1]
  int n_prevprev = 1;   // F[i-2]
  int result = 0;

  for (int i = 2; i <= n; ++i)
    {
      result = n_prev + n_prevprev;   // F[i] = F[i-1] + F[i-2]
      n_prevprev = n_prev;
      n_prev = result;
    }

  return result;
}

This algorithm has a simple loop that runs from 2 up to n to find the given Fibonacci number, so it iterates n - 1 times. The number of times the loop iterates for a given n is:

n            3   4   5   6   7   8   9   10   20
iterations   2   3   4   5   6   7   8    9   19

Fibonacci sequence, recursive:

int GetFib_Recursive(int n)
{
  if (n == 0 || n == 1) { return 1; }

  return GetFib_Recursive(n - 1) + GetFib_Recursive(n - 2);
}

The way this version is written, any time we call this function with a value n, it has to compute the Fibonacci numbers Fib(n-1), Fib(n-2), Fib(n-3), …, Fib(1), Fib(0) - and most of them more than once.

This produces duplicate work because the function effectively doesn't "remember" what, say, Fib(3) was after computing it, so every call that needs Fib(3) again has to recompute it from scratch…

n            3   4   5    6    7    8    9   10       20
iterations   2   4   7   12   20   33   54   88   10,945
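
How might numbers like the ones in these tables be measured? Here's one minimal sketch (the counter and the CountFib names are ours, added just for illustration): take the two functions as written and bump a counter once per loop pass, or once per addition.

int g_work = 0;   // bumped once per "unit of work"

int CountFib_Iterative(int n)
{
  if (n == 0 || n == 1) { return 1; }

  int n_prev = 1;
  int n_prevprev = 1;
  int result = 0;

  for (int i = 2; i <= n; ++i)
    {
      ++g_work;   // one unit of work per loop pass
      result = n_prev + n_prevprev;
      n_prevprev = n_prev;
      n_prev = result;
    }

  return result;
}

int CountFib_Recursive(int n)
{
  if (n == 0 || n == 1) { return 1; }

  ++g_work;       // one unit of work per addition
  return CountFib_Recursive(n - 1) + CountFib_Recursive(n - 2);
}

With n = 10, the iterative version bumps the counter 9 times while the recursive version bumps it 88 times - the same values as the tables above.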

Comparing iterative and recursive

Chart: red solid line = exponential growth (the recursive version); blue dotted line = linear growth (the iterative version).

int GetFib_Iterative(int n)
{
  if (n == 0 || n == 1) { return 1; }

  int n_prev = 1;
  int n_prevprev = 1;
  int result = 0;

  for (int i = 2; i <= n; ++i)
    {
      result = n_prev + n_prevprev;
      n_prevprev = n_prev;
      n_prev = result;
    }

  return result;
}

The growth rate of the iterative version ends up being linear: as n rises, the number of iterations goes up proportionally.

int GetFib_Recursive(int n)
{
  if (n == 0 || n == 1) { return 1; }

  return GetFib_Recursive(n - 1) + GetFib_Recursive(n - 2);
}

The growth rate of the recursive function is exponential: as n rises, the number of calls goes up exponentially.

Comparing iterative and recursive

Charts: one plotted with a y scale from 0 to 25 (the 0th through 25th Fibonacci number to generate), the other with a y scale from 0 to 11,000 (up to the 11,000th Fibonacci number to generate).

For small values of n this might not be too bad. However, the further along in the sequence the Fibonacci number we want is, the slower the recursive version gets…

If we are trying to generate the 11,000th item in the sequence, the iterative approach requires 11,000 iterations in a loop. That linear increase is still much faster than each Fib(n) calling Fib(n-1) and Fib(n-2) recursively.

Big-O Notation and Growth Rates

For the most part, we don't care much about the exact number of times a loop runs. After all, while the program is running, n could change depending on user input, how much data is stored, or other factors. Instead, we are more interested in the big picture: the generalized growth rate of the algorithm.

We use something called "Big-O notation" to indicate the growth rate of an algorithm. Some of the rates we care about are:

O(1)        Constant
O(log(n))   Logarithmic
O(n)        Linear
O(n^2)      Quadratic
O(n^3)      Cubic
O(2^n)      Exponential

Constant time, O(1)

int F(int x)
{
  return 3 * x + 2;
}

We think of any single command as being constant time: the function above does one multiply and one add no matter what x is. Likewise, the operation a = b + c will take the same amount of computing time no matter what the values of a, b, and c are.
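
Another everyday constant-time operation (a small sketch of our own, not from the slide): indexing into an array is a single lookup, so it takes the same time whether the array holds ten elements or ten million.

int GetElement(int arr[], int index)
{
  return arr[index];   // one lookup - O(1), no matter how big the array is
}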

Logarithmic time, O(log(n))

int Search(int l, int r,
           int search, int arr[])
{
  while ( l <= r )
    {
      int m = l + (r-l) / 2;       // midpoint of the current range
      if ( arr[m] == search )
        { return m; }              // found it
      else if ( arr[m] < search )
        { l = m+1; }               // keep only the upper half
      else if ( arr[m] > search )
        { r = m-1; }               // keep only the lower half
    }
  return -1;                       // not found
}

This is a binary search: it assumes arr is already sorted, and each pass of the loop halves the range it still has to consider. Any algorithm that halves its range each iteration of the loop will result in a logarithmic growth rate.
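
As a usage sketch for the Search function above (the array contents are just made-up example values): the array has to be sorted already, and l and r are the first and last indexes of the range to search.

int main()
{
  int arr[] = { 2, 5, 8, 12, 16, 23, 38, 56, 72, 91 };   // must be sorted

  int index = Search(0, 9, 23, arr);   // l = 0, r = 9 (the last index)

  // index is now 5. With n = 10 elements, the loop runs at most about
  // log2(10), i.e. roughly 4 times, before finding (or ruling out) the value.
  return 0;
}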

Linear time, O(n)

int Sum( int n )
{
  int sum = 0;
  for (int i=1; i<=n; i++)
    {
      sum += i;   // adds 1 + 2 + ... + n, one iteration per value
    }
  return sum;
}

A single loop that iterates over a range of n values gives a linear time increase: double n and the number of iterations doubles.

Quadratic time, O(n^2)

#include <iostream>
using namespace std;

void TimesTables(int n)
{
  // n rows, each with n columns: n * n cells of output
  for (int y=1; y<=n; y++)
    {
      for (int x=1; x<=n; x++)
        {
          cout << x << "*"
               << y << "="
               << x*y << "\t";
        }
      cout << endl;
    }
}

Quadratic time comes into play when you have one loop iterating n times nested within another loop that also iterates n times. For example, if we were writing out times tables from 1 to 10 (n = 10), then we would need 10 rows and 10 columns, giving us 10^2 = 100 cells.

Cubic time, O(n^3)

#include <iostream>
using namespace std;

void CubicThing(int n)
{
  // three nested loops of n iterations each: n * n * n lines of output
  for (int z=1; z<=n; z++)
    {
      for (int y=1; y<=n; y++)
        {
          for (int x=1; x<=n; x++)
            {
              cout << x << " "
                   << y << " "
                   << z << endl;
            }
        }
    }
}

Just like nesting two loops iterating n times each gives us an O(n^2) result, having three nested loops iterating n times each gives us an O(n^3) growth rate.

Exponential time, O(2^n)

int Fib(int n)
{
  if (n == 0 || n == 1)
    return 1;

  // each call spawns two more recursive calls
  return Fib(n - 2) + Fib(n - 1);
}

With an exponential algorithm, each step up in n multiplies the amount of work rather than just adding to it. The Fibonacci example is a good illustration: figuring out Fib(0) and Fib(1) is constant, but Fib(2) requires calling Fib(0) and Fib(1), and Fib(3) calls Fib(1) and Fib(2), with that Fib(2) calling Fib(0) and Fib(1) all over again - each additional level adds that many more duplicated calls.
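
A back-of-the-envelope note of our own, written in the same style as the F[n] = F[n-1] + F[n-2] formula above: if calls(n) is the total number of calls made while computing Fib(n), then

calls(0) = calls(1) = 1
calls(n) = calls(n-1) + calls(n-2) + 1

so the call count follows (almost) the same recurrence as the Fibonacci numbers themselves, multiplying by roughly 1.6 each time n goes up by one. That multiplicative, per-step increase is what puts this algorithm in the exponential O(2^n) category.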

Growth rate comparisons