1
|
- Session 7
- LBSC 790 / INFM 718B
- Building the Human-Computer Interface
|
2
|
- Questions
- Some useful algorithms
- Project
- Some useful data structures
- Including Java implementations
|
3
|
- Some generic problems come up repeatedly
- Sorting
- Searching
- Graph traversal
- Need a way to compare alternative solutions
- Reusing algorithms is easy and productive
- Focusing on the algorithm reveals the key ideas
- Language and interface make reusing code hard
|
4
|
- Given an array, put the elements in order
- Numerical or lexicographic
- Desirable characteristics
- Fast
- In place (don’t need a second array)
- Able to handle any values for the elements
- Easy to understand
|
5
|
- Simple, able to handle any data
- Grow a sorted array from the beginning
- Create an empty array of the proper size
- Pick the elements one at a time in any order
- Put them in the new array in sorted order
- If the element is not last, make room for it
- Repeat until done
- Can be done in place if well designed
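The steps above can be sketched in Java. This is a minimal in-place variant: the "new array" is the front of the same array, and making room means shifting larger elements one spot to the right.

```java
/** In-place insertion sort: grows a sorted region at the front of the array. */
public class InsertionSort {
    public static void sort(int[] a) {
        for (int i = 1; i < a.length; i++) {
            int value = a[i];              // next element to place
            int j = i - 1;
            // If the element is not last, make room for it:
            // shift larger elements right until its spot is open
            while (j >= 0 && a[j] > value) {
                a[j + 1] = a[j];
                j--;
            }
            a[j + 1] = value;              // drop it into sorted position
        }
    }
}
```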
|
6
|
|
7
|
|
8
|
|
9
|
|
10
|
|
11
|
|
12
|
|
13
|
|
14
|
|
15
|
- Sorting can actually be done in place
- Never need the same element in both arrays
- Every insertion can cause lots of copying
- If there are N elements, need to do N insertions
- Worst case is about N/2 copies per insertion
- N elements can take nearly N² operations to sort
- But each operation is very fast
- So this is fine if N is small (20 or so)
|
16
|
- Fast, able to handle any data
- But can’t be done in place
- View the array as a set of small sorted arrays
- Initially only the 1-element “arrays” are sorted
- Merge pairs of sorted arrays
- Repeatedly choose the smallest element in each
- This produces sorted arrays that are twice as long
- Repeat until only one array remains
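The merge step described above — repeatedly choosing the smaller front element of two sorted arrays — might be sketched like this; the extra output array is why merge sort cannot run in place:

```java
/** Merge two sorted arrays into one sorted array (needs extra space). */
public class Merge {
    public static int[] merge(int[] left, int[] right) {
        int[] out = new int[left.length + right.length];
        int i = 0, j = 0, k = 0;
        while (i < left.length && j < right.length) {
            // Repeatedly choose the smallest remaining element
            out[k++] = (left[i] <= right[j]) ? left[i++] : right[j++];
        }
        while (i < left.length)  out[k++] = left[i++];   // copy leftovers
        while (j < right.length) out[k++] = right[j++];
        return out;
    }
}
```

Merging two sorted N/2-element arrays this way touches each element once, which is where the "each array size requires N steps" cost comes from.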
|
17
|
|
18
|
|
19
|
|
20
|
|
21
|
|
22
|
|
23
|
|
24
|
- Each array size requires N steps
- But 8 elements requires only 3 array sizes
- In general, 2^k elements require k array sizes
- So the complexity is N*log(N)
- No faster sort (based on comparisons) exists
- Faster sorts require assumptions about the data
- There are other N*log(N) sorts, though
- Merge sort is most often used for large disk files
|
25
|
- Run time typically depends on:
- How long things take to set up
- How many operations there are in each step
- How many steps there are
- Insertion sort can be faster than merge sort
- One array, one operation per step
- But N*log(N) eventually beats N² for large N
- And once it does, the advantage increases rapidly
|
26
|
- Split a problem into simpler subproblems
- Keep doing that until trivial subproblems result
- Solve the trivial subproblems
- Combine the results to solve a larger problem
- Keep doing that until the full problem is solved
- Merge sort illustrates divide and conquer
- But it is a general strategy that is often helpful
|
27
|
- Divide and conquer problems are recursive
- Solve the same problem at increasing granularity
- Construct a Java method to solve the problem
- Divide the problem into subproblems
- Call the same method to solve each subproblem
- Unless the subproblems are trivial
- Use the parameters to control the granularity
- See this week’s notes page for merge sort example
|
28
|
- First, craft a divide and conquer strategy
- Create a non-recursive top-level method
- Calls recursive method with initial parameters
- In the recursive method:
- First solve the problem if it is trivial and return
- Be sure you eventually get here!
- Otherwise, split the problem and call itself
- Combine the results and return them
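The recipe above, applied to merge sort, might look like this sketch (a non-recursive top-level method plus a recursive worker whose `lo`/`hi` parameters control the granularity; this week's notes page has the course's own version):

```java
import java.util.Arrays;

/** Divide-and-conquer merge sort following the recipe above. */
public class MergeSort {
    /** Non-recursive top-level method: calls the worker with initial parameters. */
    public static void sort(int[] a) {
        mergeSort(a, 0, a.length - 1);
    }

    /** Recursive worker; lo and hi control the granularity. */
    private static void mergeSort(int[] a, int lo, int hi) {
        if (lo >= hi) return;          // trivial subproblem: 0 or 1 elements
        int mid = (lo + hi) / 2;       // split the problem...
        mergeSort(a, lo, mid);         // ...solve each subproblem...
        mergeSort(a, mid + 1, hi);
        merge(a, lo, mid, hi);         // ...then combine the results
    }

    private static void merge(int[] a, int lo, int mid, int hi) {
        int[] tmp = Arrays.copyOfRange(a, lo, hi + 1);
        int i = 0, j = mid - lo + 1, k = lo;
        while (i <= mid - lo && j <= hi - lo) {
            a[k++] = (tmp[i] <= tmp[j]) ? tmp[i++] : tmp[j++];
        }
        while (i <= mid - lo) a[k++] = tmp[i++];
        // any remaining right-half elements are already in place
    }
}
```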
|
29
|
- Find something by following links
- Web pages
- Connections in the flight finder
- Winning moves in chess
- This may seem like an easy problem
- But computational complexity can get really bad
- Simple tricks can help in some cases
|
30
|
- Goal is to find everything on the web
- Build a balanced tree, sorted by search terms
- Start anywhere, follow every link
- If every page has 1 kB of text and 10 links
- Then 10 levels would find a terabyte of data!
- Avoid links that are likely to be uninteresting
- Detect duplicates quickly with hashing
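The duplicate-detection trick can be sketched as follows. This is not a real crawler: the `links` map stands in for fetching pages over HTTP, and only the hashing idea — an O(1) "have I seen this URL?" check — is the point.

```java
import java.util.*;

/** Sketch of duplicate detection in a crawl: a HashSet of visited
 *  URLs is checked before each link is followed, so no page is
 *  fetched twice even when many pages link to it. */
public class CrawlSketch {
    public static List<String> crawl(String start, Map<String, List<String>> links) {
        Set<String> visited = new HashSet<>();   // hashing -> fast duplicate check
        Deque<String> frontier = new ArrayDeque<>();
        List<String> order = new ArrayList<>();
        frontier.add(start);
        visited.add(start);
        while (!frontier.isEmpty()) {
            String page = frontier.remove();
            order.add(page);                     // "fetch" the page
            for (String link : links.getOrDefault(page, List.of())) {
                if (visited.add(link)) {         // add() is false if already seen
                    frontier.add(link);
                }
            }
        }
        return order;
    }
}
```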
|
31
|
- Explore multiple paths between two points
- Usually trying to find the best by some measure
- Flight finder searches like a web crawler
- Every possible continuation of every route
- Also search backward from the destination
- Assuming 10 departures per airfield:
- 3 connections takes Flight finder 10,000 steps
- 1 connection twice would take 200 steps
|
32
|
- The paths are the legal moves
- And the “places” are possible board positions
- You are seeking to make things better
- Your opponent seeks to make things worse
- Such “zero sum games” are common
- Although many lack chess’ shared information
- Any problem structure makes search easier
- The trick is to exploit constraints effectively
|
33
|
- Decide how many half-moves to look ahead
- Develop a scoring strategy for final positions
- Based on piece count and positional factors
- Follow a promising path
- Helpful to guess the best moves for each side
- With several moves available, pick the best
- But stop any search that can’t improve things
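That search can be sketched as minimax with alpha-beta cutoffs. This toy version scores a hand-built tree rather than real board positions (the leaf scores stand in for the piece-count evaluation, and the look-ahead depth is fixed by the tree itself rather than a parameter):

```java
import java.util.List;

/** Minimax with alpha-beta cutoffs over a toy game tree.
 *  A node is either a scored final position or a list of legal moves. */
public class Minimax {
    interface Node {}
    record Leaf(int score) implements Node {}           // final position: score it
    record Branch(List<Node> moves) implements Node {}  // legal moves from here

    /** maximizing alternates each half-move; alpha and beta record what
     *  each side can already guarantee, so hopeless lines stop early. */
    static int search(Node n, boolean maximizing, int alpha, int beta) {
        if (n instanceof Leaf leaf) return leaf.score();
        int best = maximizing ? Integer.MIN_VALUE : Integer.MAX_VALUE;
        for (Node child : ((Branch) n).moves()) {
            int s = search(child, !maximizing, alpha, beta);
            if (maximizing) { best = Math.max(best, s); alpha = Math.max(alpha, s); }
            else            { best = Math.min(best, s); beta  = Math.min(beta, s); }
            if (beta <= alpha) break;    // this line can't improve things: stop
        }
        return best;
    }
}
```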
|
34
|
|
35
|
- Find the cheapest way of visiting 10 cities
- Given the airfare between every city pair
- Only visit each city once, finish where you start
- There are only 90 city pairs
- But there are a LOT of possible tours
- The best known algorithm is VERY slow
- Because the problem is “NP complete”
|
36
|
- No “polynomial time” algorithm is known
- Haven’t proved that none exists
- But if it does, many hard problems would be easy
- Approximate solutions with heuristic methods
- Greedy methods
- Genetic algorithms
- Simulated annealing
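The simplest of these, a greedy nearest-neighbor heuristic, might be sketched like this: from each city, fly to the cheapest unvisited city, then return home. It produces a legal tour in O(N²) steps but makes no promise of finding the cheapest one.

```java
/** Greedy heuristic for the tour problem: always take the cheapest
 *  hop to an unvisited city, then fly home. Fast, not optimal. */
public class GreedyTour {
    public static int tourCost(int[][] fare, int start) {
        int n = fare.length;
        boolean[] visited = new boolean[n];
        int city = start, cost = 0;
        visited[start] = true;
        for (int step = 1; step < n; step++) {
            int next = -1;
            for (int c = 0; c < n; c++) {          // cheapest unvisited hop
                if (!visited[c] && (next == -1 || fare[city][c] < fare[city][next])) {
                    next = c;
                }
            }
            cost += fare[city][next];
            visited[next] = true;
            city = next;
        }
        return cost + fare[city][start];           // finish where you start
    }
}
```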
|
37
|
- Must specify maximum size when declared
- And the maximum possible size is always used
- Can only index with integers
- For efficiency they must be densely packed
- Adding new elements is costly
- If the elements are stored in order
- Every element must be the same type
|
38
|
- Can get any element quickly
- If you know what position it is in
- Natural data structure to use with a loop
- Do the same thing to different data
- Efficiently uses memory
- If the array is densely packed
- Naturally encodes an order among elements
|
39
|
- A way of making variable length arrays
- In which insertions and deletions are easy
- Very easy to do in Java
- But nothing comes for free
- Finding an element can be slow
- Extra space is needed for the links
- There is more complexity to keep track of
|
40
|
- In Java, all objects are accessed by reference
- Object variables store the location of the object
- New instances must be explicitly constructed
- Add reference to next element in each object
- Handy to also have a reference to the prior one
- Keep a reference to the first object
- And then walk down the list using a loop
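A minimal sketch of those steps: each node object holds references to the next and prior nodes, the list itself keeps only a reference to the first node, and a loop walks down the chain.

```java
/** Hand-built doubly linked list: each node references the next and
 *  prior nodes; the list keeps a reference to the first node. */
public class LinkedDemo {
    static class Node {
        int value;
        Node next, prior;
        Node(int value) { this.value = value; }  // explicitly constructed
    }

    Node first;

    /** Add at the front: constant time, no shifting as in an array. */
    void addFirst(int value) {
        Node n = new Node(value);
        n.next = first;
        if (first != null) first.prior = n;
        first = n;
    }

    /** Walk down the list using a loop, summing the values. */
    int sum() {
        int total = 0;
        for (Node n = first; n != null; n = n.next) total += n.value;
        return total;
    }
}
```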
|
41
|
|
42
|
- Add an element
- Easy to put it in sorted order
- Examine every element
- Just as fast as using an array
- Find just one element
- May be as slow as examining every element
- Delete an element after you find it
- Fast if you keep both next and prior links
|
43
|
|
44
|
- Linked list with multiple next elements
- Just as easy to create as linked lists
- Binary trees are useful for relationships like “<”
- Insertions and deletions are easy
- Useful for fast searching of large collections
- But only if the tree is balanced
- Efficiently balancing trees is complex, but possible
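A binary tree ordered by “<” might be sketched like this naive version, which makes insertion and search easy but does nothing to keep the tree balanced (so a worst-case insertion order degrades it to a linked list):

```java
/** Unbalanced binary search tree: each node has two "next" references,
 *  left for smaller values, right for larger ones. */
public class Bst {
    static class Node {
        int value;
        Node left, right;
        Node(int value) { this.value = value; }
    }

    Node root;

    void insert(int value) {
        root = insert(root, value);
    }

    private Node insert(Node n, int value) {
        if (n == null) return new Node(value);        // found an empty spot
        if (value < n.value) n.left = insert(n.left, value);
        else if (value > n.value) n.right = insert(n.right, value);
        return n;                                     // duplicates ignored
    }

    boolean contains(int value) {
        Node n = root;
        while (n != null) {                           // "<" picks the branch,
            if (value == n.value) return true;        // halving the search
            n = (value < n.value) ? n.left : n.right; // when balanced
        }
        return false;
    }
}
```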
|
45
|
|
46
|
- Resizable array [O(n) insertion, O(1) access]:
- ArrayList
- Linked list [O(1) insertion, O(n) access, sorted]:
- LinkedList
- Hash table [object index, unsorted, O(1)]:
- HashSet (key only)
- HashMap (key+value)
- Balanced Tree [object index, sorted, O(log n)]:
- TreeSet (key only)
- TreeMap (key+value)
|
47
|
- Find an element nearly as fast as in an array
- With easy insertion and deletion
- But without the ability to keep things in order
- Fairly complex to implement
- But Java defines a class to make it simple
- Helpful to understand how it works
- “One size fits all” approaches are inefficient
|
48
|
- Create an array with enough room
- It helps a lot to guess the right size first
- Choose a variable in each object as the key
- But it doesn’t need to be an integer
- Choose a spot in the array for each key value
- Using a fast mathematical function
- Best if things are scattered around well
- Choose how to handle multiple keys at one spot
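Those choices can be shown in miniature. This toy table (an illustration, not Java's own implementation) hashes string keys into a fixed-size array and handles multiple keys at one spot by keeping a small list per spot:

```java
import java.util.LinkedList;

/** Toy hash table: a fast function maps each key to an array spot,
 *  and a per-spot list handles keys that collide there. */
public class ToyHashTable {
    private final LinkedList<String>[] buckets;

    @SuppressWarnings("unchecked")
    ToyHashTable(int size) {                 // guessing the right size helps
        buckets = new LinkedList[size];
        for (int i = 0; i < size; i++) buckets[i] = new LinkedList<>();
    }

    private int spot(String key) {
        // Fast mathematical function; works best when keys scatter well
        return Math.floorMod(key.hashCode(), buckets.length);
    }

    void add(String key) {
        if (!contains(key)) buckets[spot(key)].add(key);
    }

    boolean contains(String key) {
        return buckets[spot(key)].contains(key);  // scan only one short list
    }
}
```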
|
49
|
- Hashtables are objects like any other
- import java.util.*
- Must be declared and instantiated
- The constructor optionally takes a size parameter
- put(key, value) adds an element
- containsKey(key) checks for an element
- get(key) returns the “value” object for that key
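Those calls look like this in practice, using java.util.Hashtable as the slide describes (the phone-book keys and values here are made-up example data):

```java
import java.util.Hashtable;

/** Using java.util.Hashtable: declare, instantiate, then put/get. */
public class HashtableDemo {
    public static String lookup() {
        Hashtable<String, String> phone = new Hashtable<>(100); // optional size
        phone.put("alice", "555-1212");           // put(key, value) adds
        if (phone.containsKey("alice")) {         // containsKey(key) checks
            return phone.get("alice");            // get(key) returns the value
        }
        return null;
    }
}
```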
|
50
|
- Maintain an implicit order
- Easy additions and deletions
- Maps naturally to certain problems
- Interrupt a task to compute an intermediate value
- Implemented as a Java class
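A short sketch using java.util.Stack (one class that provides this behavior; ArrayDeque is a common modern alternative): pushing interrupts the current task, and popping resumes the most recently interrupted one first.

```java
import java.util.Stack;

/** java.util.Stack keeps an implicit last-in, first-out order:
 *  push to interrupt the current task, pop to resume the newest one. */
public class StackDemo {
    public static int demo() {
        Stack<Integer> tasks = new Stack<>();
        tasks.push(1);                   // start a task...
        tasks.push(2);                   // ...interrupt it for another
        int intermediate = tasks.pop();  // finish the interruption first
        return intermediate * 10 + tasks.pop();  // then resume the original
    }
}
```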
|
51
|
- What operations do you need to perform?
- Reading every element is typically easy
- Other things depend on the representation
- Hashing finds single elements quickly
- But cannot preserve order
- Stacks and linked lists preserve order easily
- But they can only read one element at any time
- Balanced trees are best when you need both
- Which operations dominate the processing?
|