1. ## Algorithm Efficiencies

Welcome to my introduction to evaluating asymptotic complexity of alogrithms. Now that I've lost half of you, I'm talking about making algorithms run faster. There is almost always more than one way to solve a problem, but some of them require a lot longer than others. I'll start with an example.

Actually before I start, I want to say one thing about logarithms:
We know that (4*2) / 2 = 4
If we take log(2^4) = 4 (assuming log is base2)
Now for the example:

Say we had 100 boxes and the first n boxes had gifts in them. (i.e. Boxes 1 - 24 will have gifts in them but boxes 25-100 are empty) How would you tell how many gifts we have without making more than one wrong guess?
The easiest way to do this is to scan from box 1 to 100 looking for an empty box, and the number of the first empty box minus one tells us how many gifts there are. The worst case scenario is that you have to scan 100 boxes so one would call this a big-Oh n algorithm (denoted O(n)).
There is a more efficient, though less obvious way to do this though. We seperate the series of boxes into groups of size ceiling[logBase2(n)] (for 100 boxes this is 7). We can then check box 7. If it has a gift, we scan box 14, if it has a gift we scan box 21 and so on until we reach 100. If, say box 14 is empty, we scan all boxes between 7 and 14 for gifts (since 7 had a gift). The worst-case running time for this algorithm is to scan up to box 91, and then have to scan between 91 and 98 for gifts. This takes 13 + 7 operations. One could approximate this (since big-Oh notation is an approximation) to be 2 * ceiling[logBase2(n)] + 7. For now I will just say this is O(log(n)).

Big-Oh notation is concerned about running times of alogrithms on large sets of data. 100 is not considered large. By large we are considering data sets of 10000 or 100000 or even 10000000 elements. For these reasons there are steps to follow and assumptions that can be made when evaluating algorithms.
Step one is to count how many operations are being made.
Step two is sum the operations to produce an equation from the number of operations.
Step three is to realize that for large data sets, only the most signficant value matters. That is additive constants, coefficients and lower order terms can be dropped. Hopefully this will be clearer with an example.
Lets look at an algorithm that finds the maximum value in an unsorted array. I will include the number of operations in brackets beside the instruction.

Alogrithm arrayMax(A, n)
Input: An array A storing n integers
Output: The maximum value of A

currentMax = A[0] (2 operations)

for i = 1 to n - 1 do (1 operation + n-1 tests)
if currentMax < A[i] then (2 operations n-1 times)
currentMax = A[i] (2 operations up to n-1 times)
loop (2 operations n-1 times)
return currentMax (1 operation)

One important note is that big-Oh evaluations are based on Worst-Case running times of algorithms. For this reason, we assume
currentMax = A[i] (2 operations up to n-1 times)
runs n-1 times.
To find the worst-case number of operations we add up the total operations.

2 + ( 1 + n-1) + 2*(n-1) + 2*(n-1) + 2*(n-1) + 1
= 2 + n + 2*n - 2 + 2*n - 2 + 2*n - 2 + 1
= 7*n - 3

If we follow step 3 from above 7*n - 3 becomes O(n).

Another example is:
Take the equation 8*n^2*log(n) + 5*n^2 + n
From this equation, we can drop the lower order terms (by lower order I mean those that for large values of n are less than the others. This is termed Eventually Less and denoted <<)
We are left with 8*n^2*log(n)
and from this we can drop the coefficient.
This algorithm would be O(n^2*log(n))

All these assumptions I have made can be proved mathematically, but that is beyond the scope of this discussion.

A simple heiarchy of functions includes

1 << log(n) << log^2(n) << sqrt(n) << n << n*log(n) << n^2 << n^3 << 2^n << n!
note: 1 means constant time (that is any algorithm that does a fixed number of operations. An algorithm that adds two numbers, divides the sum by 2 and then multiplies it by 5 takes 3 operations, but runs in O(1) or constant time)

What does all this do for us though?

When we are trying to solve a problem we should look at as many solutions as we can and figure out the most efficient way to solve the problem. For real world data sets chosing a O(n*log(n)) algorithm over a O(n^2) alogrithm can save significant time on even mid-sized data sets.
For example look at the running times of heap sort (O(n*log(n))) vs. insertion sort (O(n^2)) running on a data set of size 16384.
Heap Sort takes 14.36 seconds
Insertion Sort takes 47.95 seconds

This is a very powerful techinque to use when writing code. Say for example one were sorting database records. If we were talking about a large enterprise database with 1000000000 records Heap Sort would take 1505755 seconds or 174 days
Insertion sort would approx. 2.0 * 10^11 seconds which is approx. 6000 years.
Similarly this has a significant application to cryptography when it comes to encrypting data. If one used inefficient alogrithms to encrypt and decrypt data, one would have very long wait times when large bit keys are used.

Hopefully this is helpful to some of you programmers out there, and it at least makes some sort of sense.

If there is anything anyone feels they need more detail on, or something that can be expanded further, please let me know, and I'll try to put together a part 2.

2. For some additional reading on algorithms and algorithmic complexity I recomend the following books:

The Design and Analysis of Computer Algorithms by A.V. Aho, J.E. Hopcroft, and J.D. Ullman(thorough explainations of important big-O, big-Omega, and big-Theta results

Handbook of Theoretical Computer Science, Volume A: Algorithms and Complexity by Jan Van Leeuwen(good chapters on abelian groups and vector analysis algorithms)

The Art of Computer Programming by Donald Knuth (this has 3 volumes and all are well worth the hefty price tag, great coverage of sorting algorithms)

Introduction to Algorithms by Thomas H. Cormen(despite the title this book is NOT written on an elementary level and is mathematically intensive, but if you are possess the necessary talent to excel in this field that won't be a problem anyway)

http://www.infosyssec.com/infosyssec/onlinebooks1.htm

4. Ok lets see if I got this.
Lets say I have 200 boxes.
3/4 of the boxes have something in them.
I want to find how many empty boxes there are
so i split the boxes into 2 groups

And I count untill I found an empty box lets say the empty box is the 10th. So then i go directly to 20 if that box is empty i go to box 40 and so on untill i hit 200 but if box 20 is empty i have to check every box from 10 to 20 to see if there are any other empty ones?

5. Nooooooooo....having hellish flashbacks to college...aaarrrrgghh. Man Big-oh and NP complete were always the things the CS department would try to hammer people with, good writeup.

/nebulus

EDIT: Was going to change Big-O to how it should be..but then I decided I liked it the way I typoed it

6. phazeout420:
If you had 200 boxes, and 3/4 of the boxes had something in them, then the first 150 boxes would have gifts. IMPORTANT: All the boxes with gifts are grouped together starting at positon 0. 1 to 150 have gifts and 151 to 200 are empty.
Using the more efficient alogrithm, you would take ceiling[log(200)] = 8
So you would scan box 8, if it were full, you would scan box 16 and so on until you scanned box 152 and it was empty. Then you would scan between boxes 144 and 152 looking for a present.
Hope this clears it up a bit.

7. Similarly this has a significant application to cryptography when it comes to encrypting data. If one used inefficient alogrithms to encrypt and decrypt data, one would have very long wait times when large bit keys are used.
i would have to say no

but thank you for the info

8. May be this will help you phazeout 420. Lets start from very basic intro to Algorithm.
As you know that computer is extremely powerful machine and it can calculates so many calculations in Pico second. Otherwise it would take lot of time if we do it manually.
We can tell computer to do what we want it to do in step by step manner and to tell computer we must develop an algorithm, which is a step - by- step procedure to solve a problem. get it.?

#### Posting Permissions

• You may not post new threads
• You may not post replies
• You may not post attachments
• You may not edit your posts
•

×