.Open CSci202 Computer Science II, Session 15 Searching and Sorting (previous): Strings .See http://www/dick/cs202/14.html . Preparation -- 19 Searching and Sorting 922 Study this page and the special handout .See http://www/dick/cs202/alg.html and the .See http://www/dick/cs202/algorithms.txt.html example page. 19.1 Introduction 923 19.2 Searching Algorithms 923 19.2.1 Efficiency of Linear Search 923 19.2.2 Binary Search 925 19.3 Sorting Algorithms 931 19.3.1 Efficiency of Selection Sort 931 19.3.2 Efficiency of Insertion Sort 931 19.3.3 Merge Sort (A Recursive Implementation) 932 19.4 Wrap-Up 939 . Coding Horror on the BigO .See http://www.codinghorror.com/blog/archives/000957.html . Assigned work -- send a question and bring a deck of cards. . Chapter 19 pages 923-924 -- Searching and sorting Can we make algorithms do anything else other than search? Yes .... examples?? . Chapter 19 pages 923 -- Algorithm timing How big does a program have to be (or any other factor) to notice a difference between one algorithm and an other algorithm? Twelve (12) lines can do it. Indeed faster algorithms are often bigger than slower ones. The extra choices and loops can reduce the amount of work to be done. Exception -- a badly coded slow algorithm can be a lot longer than an elegant fast algorithm. . Chapter 19 pages 923 -- big O notation How would one find out what the big O notation would be for an algorithm? Take CSCI 431 -- Algorithm Analysis. To prepare for these -- CSCI330 and MATH 372. . Chapter 20 pages 976 -- Algorithms Is there an relatively easy way of determining an algorithm's worst case scenario short of trying every possible outcome? Hire a computer scientist to do it for you! Or look in one of the reference books -- Knuth -- He has analysed evrything -- and is in the library. In most cases you just look at the loops and think about the maximum times they can possibly repeat. In particular look for deeply nested loops. Example -- if you see .As_is for(int row=0; row functions: .See ./timeMergeSort.cpp .See ./mergeMainC.cpp .See ./mergeSortC.h .See ./mergeSortC.cpp . Chapter 19 pages 932 -- Merge Sort Can you give a clearer explanation of how Merge Sort works? I can try.... I'll do a demo if we have time. . Why are constants thrown out when trying to figure out the order of the big_O? Partially convenience. The constants tend to reflect the speed of the computer. You can always get a better CPU and solve the problem faster. Worse -- we can show that clever programming and using more storage can speed up an algorithm by a fixed constant multiplier. The nonconstant part tells you the important properties of the algorithm. For example calculate the table .Table n 100*n n^2 .Row 10 1000 10 .Row 20 2000 400 .Row ... .Close.Table until the last column is larger than the second column. Ultimately an `a*n` is overtaken by `b*n*n`, when `n` is biggest. The `a` and `b` just tell you when it happens. So most computer scientists don't care whether an algorithm takes 10*`n`^2 or 5*`n`^2. (Note I have colleagues who like to argue that we `should` worry about the constants... He always reminds me that shifting a C++ program from using vectors to using arrays always makes it run faster. I reply that it also adds bugs...). . What does it mean when the time to solve an algorithm tends towards the exponential, and can problems like these ever be solved? It means that the problems can be solved -- if we have time to solve them. Small amounts of data a done quickly. Large amounts of data take enormous amounts of time. The time increases drastically with the size of the problem. With an "exponential algorithm", each time you add one item to the data the time is multiplied by a number greater than one. Like this .Table Size of data Time doubles .Row 1 2 .Row 2 4 .Row 3 8 .Row 4 16 .Row ... .Row 10 1024 .Row ... .Row 20 ???? .Close.Table Exercise -- how long is a million seconds? a billion seconds? a trillion seconds? . What does it mean when a algorithm is tractable or intractable? A tractable problem has an algorithm that solves the problem in polynomial time. This means that for size `n` the time is O(`n^p`) for some `p`. The intractable problem haven't got a known polynomial algorithm to solve them. Worse they can all be used to solce each other -- they are equivalent. Think of it. several hundred different problems that we can't find a fast solution... AND if any one of them did have a polynomial solution .... then they ALL would. Worse we can not prove that there is NO polynomial solution. This gets rather technical. But if you can find a polynomial solution to one of the intractable problems: .List You will be famous. All passwords will become unsafe. You can win 1 million dollars. .Close.List Most computer scientists think that this won'e happen. Instead someone will become rich and famous by showing that it can not be done. . Chapter 19 pages 939 -- libraries Could you give a brief history of the library? A big debate on the internet/usenet/Google groups.... much experimental programming. The specs put into the standard. SGI releases the source code and every body (but microsoft) uses it. Has it been included in C? It does not fit -- uses templates -- C has no templates. Are there other algorithm libraries in C++? Probably. BOOST probably has some, and the next revision of the standard will add more. See .See ./19.html later. . Chapter 19 pages 938-939 -- searching and sorting Which type of search and sort seem to be the most popular The ones in your library! In practice you need to go beyond the theoretical Big-Os and think about the particular problem in hand. (1) What do you know about the data? If there is any thing special about the data you can probably come up with a variation of an algorithm that takes advantage of the quirk in the data. (2) How fast must it respond to keep the users happy? No point in wasting programmer time (expensive) to save CPU time(cheap). But if you have to respond quickly (Think games, heart pace makers, missiles, autos, proton therapy, ....) then it is worth tuning the algorithms. . Chapter 19 pages 923 -- algorithm Is a binary search quicker then a linear search an whats the difference See next. . Chapter 19 pages 932 Algorithms Merge sort and Binary search are the most efficient algorithms, are they also the ones that will get used the most? No! The sort you will find on most computer systems is "quicksort with randomization". For small amounts of data most programmers use "Insertion sort" which is faster on upto about 10 items. Mergesort is not used because it may have a good worst case but its average performance is worse than quicksort. We make a tricky change to quicksort so that the worst case never happens twice on the same data! This keeps the users happy. I expect the card searching demo to clarify the choice between linear and binary search. I will post my conclusion later. . Chapter 19 pages 925-930 -- Binary search Can you give another example of a binary search algorithm and a further explanation of why binary search is more efficient than linear search. Try looking up something in the index of the book. Demo. I just wrote .See ./binary.cpp which uses a binary search to find roots of certain equations. . Chapter 19 pages ppp-ppp -- Linear/Binary Can you give board examples of both please? See next. . Which search method is better, linear or binary search Good question -- let us find out. Time for a demo with cards.... . Special Preparation: Bring a deck of playing cards . Demonstrations of sorting and searching using playing cards. .Close CSci202 Computer Science II, Session 15 Algorithms . Laboratory 7 Comparing Algorithms for Sorting ?? .See http://www/dick/cs202/lab07.html . Next -- Linked Data Structures .See http://www/dick/cs202/16.html Project 4 is due in next class!