Welcome to my introduction to evaluating the asymptotic complexity of algorithms. Now that I've lost half of you: I'm talking about making algorithms run faster. There is almost always more than one way to solve a problem, but some ways take much longer than others. I'll start with an example.

Actually, before I start, I want to say one thing about logarithms, since they come up constantly in this subject: the logarithm undoes exponentiation the same way division undoes multiplication.
We know that (4*2) / 2 = 4.
Likewise, log(2^4) = 4 (assuming log is base 2).
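
If you want to check that on a computer, Python's math module has a base-2 log:

import math

print(math.log2(2 ** 4))   # prints 4.0: log base 2 undoes raising 2 to a power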
Now for the example:

Say we have 100 boxes, and the first n boxes have gifts in them (e.g., if n = 24, boxes 1-24 have gifts and boxes 25-100 are empty). How would you figure out how many gifts there are while opening as few boxes as possible?
The easiest way to do this is to scan from box 1 towards box 100 looking for the first empty box; that box's number minus one is the number of gifts. In the worst case you have to check all 100 boxes, so one would call this a big-Oh of n algorithm (denoted O(n)).
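
Here is a minimal sketch of that scan in Python. I'm representing the boxes as a list of booleans where True means "has a gift"; the representation and the function name are just my choices for illustration.

def count_gifts_linear(boxes):
    # Walk from the first box until we find an empty one.
    # Worst case: every box gets checked, hence O(n).
    for i, has_gift in enumerate(boxes):
        if not has_gift:
            return i  # the first i boxes held gifts
    return len(boxes)  # every box had a gift

boxes = [True] * 24 + [False] * 76  # 100 boxes, first 24 hold gifts
print(count_gifts_linear(boxes))    # 24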
There is a more efficient, though less obvious, way to do this. We separate the series of boxes into groups of size ceiling[logBase2(n)] (for 100 boxes this is 7). We check box 7; if it has a gift, we check box 14; if that has a gift, we check box 21, and so on. If, say, box 14 is empty, we scan the boxes between 7 and 14 for gifts (since box 7 had one). In the worst case we jump all the way out to box 98 and then scan the boxes in between, which takes about 13 + 7 = 20 checks instead of 100. In general, though, the jump phase needs about n / ceiling[logBase2(n)] checks, so this scheme is really O(n/log(n)): better than O(n), but we can do better still. If instead we halve the range of candidate boxes at every step (a binary search: check box 50; a gift means the last gift is higher, an empty box means it is lower; repeat on the remaining half), we need only ceiling[logBase2(n)] checks, 7 for 100 boxes. That is an O(log(n)) algorithm.
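And here is a sketch of the binary-search version, using the same boolean-list representation as above (again, the names are mine):

def count_gifts_binary(boxes):
    # Invariant: boxes[0 .. lo-1] all hold gifts, boxes[hi ..] are all empty.
    lo, hi = 0, len(boxes)
    while lo < hi:
        mid = (lo + hi) // 2
        if boxes[mid]:
            lo = mid + 1  # gift here, so the boundary is further right
        else:
            hi = mid      # empty, so the boundary is here or further left
    return lo  # number of gift-filled boxes

boxes = [True] * 24 + [False] * 76
print(count_gifts_binary(boxes))  # 24, found in about 7 checks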

Big-Oh notation is concerned with the running times of algorithms on large sets of data. 100 is not considered large. By large we mean data sets of 10000 or 100000 or even 10000000 elements. For these reasons there are steps to follow, and assumptions that can be made, when evaluating algorithms.
Step one is to count how many operations are being made.
Step two is to sum those operations, producing an equation for the total number of operations.
Step three is to realize that for large data sets, only the most significant term matters. That is, additive constants, coefficients, and lower-order terms can all be dropped. Hopefully this will be clearer with an example.
Let's look at an algorithm that finds the maximum value in an unsorted array. I will include the number of operations in brackets beside each instruction.

Algorithm arrayMax(A, n)
Input: An array A storing n integers
Output: The maximum value of A

currentMax = A[0]                 (2 operations)
for i = 1 to n - 1 do             (1 operation + n-1 tests)
    if currentMax < A[i] then     (2 operations, n-1 times)
        currentMax = A[i]         (2 operations, up to n-1 times)
loop                              (2 operations, n-1 times)
return currentMax                 (1 operation)

One important note is that big-Oh evaluations are based on worst-case running times of algorithms. For this reason, we assume that
currentMax = A[i] (2 operations, up to n-1 times)
runs the full n-1 times.
To find the worst-case number of operations we add up the total operations.

2 + ( 1 + n-1) + 2*(n-1) + 2*(n-1) + 2*(n-1) + 1
= 2 + n + 2*n - 2 + 2*n - 2 + 2*n - 2 + 1
= 7*n - 3

If we follow step 3 from above, 7*n - 3 becomes O(n).
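
For reference, here is the pseudocode above as runnable Python (a direct transcription, nothing added):

def array_max(a):
    # One pass over the array, so O(n) in the worst case.
    current_max = a[0]
    for i in range(1, len(a)):
        if current_max < a[i]:
            current_max = a[i]
    return current_max

print(array_max([3, 41, 7, 0, 12]))  # 41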

Another example:
Take the equation 8*n^2*log(n) + 5*n^2 + n.
From this equation we can drop the lower-order terms (by lower order I mean terms that, for large values of n, are smaller than the dominant one; one says such a term is "eventually less" than the other, denoted <<).
We are left with 8*n^2*log(n), and from this we can drop the coefficient.
This algorithm would be O(n^2*log(n)).
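
To see why dropping those terms is safe, plug a large n into each term and compare (a quick sanity check, nothing more):

import math

n = 1_000_000
print(8 * n**2 * math.log2(n))  # about 1.6e14 -- dwarfs everything below
print(5 * n**2)                 # about 5.0e12
print(n)                        # 1.0e6 -- negligible by comparison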

All these assumptions I have made can be proved mathematically, but that is beyond the scope of this discussion.

A simple hierarchy of functions includes

1 << log(n) << log^2(n) << sqrt(n) << n << n*log(n) << n^2 << n^3 << 2^n << n!
note: 1 means constant time, that is, any algorithm that does a fixed number of operations regardless of input size. An algorithm that adds two numbers, divides the sum by 2, and then multiplies it by 5 does 3 operations, but since that count never changes, it runs in O(1), or constant, time.
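
You can watch the hierarchy emerge by tabulating a few of these functions as n grows (I've left out the fast-growing ones so the numbers stay printable):

import math

print("n      log2(n)  sqrt(n)  n*log2(n)  n^2")
for n in (10, 100, 1000, 10000):
    print(n, round(math.log2(n), 1), round(math.sqrt(n), 1),
          round(n * math.log2(n)), n ** 2)

Notice that at n = 10, sqrt(n) is actually smaller than log2(n); the ordering only holds eventually, which is exactly what << means.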

What does all this do for us though?

When we are trying to solve a problem we should look at as many solutions as we can and pick the most efficient one. For real-world data sets, choosing an O(n*log(n)) algorithm over an O(n^2) algorithm can save significant time even on mid-sized data sets.
For example, look at the running times of heap sort (O(n*log(n))) vs. insertion sort (O(n^2)) on a data set of size 16384.
Heap Sort takes 14.36 seconds
Insertion Sort takes 47.95 seconds
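
If you want to reproduce that kind of comparison on your own machine, here is a minimal benchmark sketch in Python (heapq provides the heap; the absolute times depend entirely on your hardware, so expect numbers very different from those above):

import heapq
import random
import time

def heap_sort(a):
    heapq.heapify(a)  # build a heap in O(n)
    return [heapq.heappop(a) for _ in range(len(a))]  # n pops, O(log n) each

def insertion_sort(a):
    for i in range(1, len(a)):  # O(n^2) comparisons in the worst case
        key = a[i]
        j = i - 1
        while j >= 0 and a[j] > key:
            a[j + 1] = a[j]  # shift larger elements right
            j -= 1
        a[j + 1] = key
    return a

data = [random.random() for _ in range(16384)]

start = time.perf_counter()
heap_sort(list(data))
print("heap sort:     ", time.perf_counter() - start, "seconds")

start = time.perf_counter()
insertion_sort(list(data))
print("insertion sort:", time.perf_counter() - start, "seconds")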

This is a very powerful technique to keep in mind when writing code. Say, for example, one were sorting database records. For a large enterprise database with 1000000000 records, heap sort would take about 1505755 seconds, or roughly 17 days, while
insertion sort would take approximately 2.0 * 10^11 seconds, which is roughly 6000 years.
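
The extrapolation itself is just arithmetic: scale each measured time by the ratio of its cost function at the two input sizes (taking the timings quoted above as the baseline):

import math

n0, big_n = 16384, 1_000_000_000
heap_time, ins_time = 14.36, 47.95  # measured seconds at n0, from above

# heap sort scales like n*log(n); insertion sort scales like n^2
heap_big = heap_time * (big_n * math.log2(big_n)) / (n0 * math.log2(n0))
ins_big = ins_time * (big_n / n0) ** 2

print(heap_big / 86400, "days")          # on the order of 20 days
print(ins_big / (86400 * 365), "years")  # on the order of 6000 years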
Similarly, this has a significant application to cryptography when it comes to encrypting data. If one used inefficient algorithms to encrypt and decrypt data, one would face very long waits whenever large bit keys are used.

Hopefully this is helpful to some of you programmers out there, and it at least makes some sort of sense.

If there is anything anyone feels they need more detail on, or something that can be expanded further, please let me know, and I'll try to put together a part 2.