Contents
- Gale-Shapley
- Big-O Notation
- Data Structures
- Greedy Algorithms
- Solving Recurrence Relations
- Divide and Conquer Algorithms
- Dynamic Programming
Gale-Shapley (Stable Matching)
The Gale-Shapley algorithm produces a stable matching between $A=\{a_1,\cdots,a_n\}$ and $B=\{b_1,\cdots,b_n\}$, where each $a_i$ orders $B$ and each $b_j$ orders $A$ (preferences). The matching is represented by a 1-to-1 permutation function $\pi:\{1,\cdots,n\}\rightarrow\{1,\cdots,n\}$.
Pseudocode.
initialise $\pi(i)=0$ for all $i$
while $\exists i$ such that $\pi(i)=0$ do
choose $i$
$b_j\leftarrow$ most preferred pairing for $a_i$ that has not yet rejected $a_i$
if $b_j$ unmatched then
$\pi(i)=j$ match $a_i$ and $b_j$
else if $b_j$ matched with $a_{i'}$ but prefers $a_i$ to $a_{i'}$ then
unmatch $b_j$ and $a_{i'}$
$\pi(i)=j$ match $a_i$ and $b_j$
else keep current matching $\pi$, $b_j$ rejects $a_i$
Proofs.
- The algorithm terminates. Each iteration of the loop pairs an $a_i$ with a $b_j$ that has not yet rejected it. There are at most $n^2$ such pairings because there are $n$ $a$'s and $n$ $b$'s. As such, the loop must terminate after at most $n^2$ iterations, once all possible pairings have been visited.
- The result is a matching. Suppose on the contrary that there is some $a_{i'}$ that is unmatched at the end. Since there are $n-1$ other $a_i$'s to be matched to $n$ $b_j$'s, there must also be some $b_{j'}$ that is unmatched. This means that $b_{j'}$ must never have been paired at all, because once any $b_j$ is paired, it can only trade its existing match for one it prefers more; it can never become unmatched. However, this is a contradiction because $a_{i'}$ must have been paired with every $b_j$ (and rejected) before ending up unmatched.
- The matching is stable. Assume the matching is not stable. Then there exist some $a_i,b_j$ that are ultimately not matched together such that $a_i$ prefers $b_j$ to $b_{\pi(i)}$ and $b_j$ prefers $a_i$ to $a_{\pi^{-1}(j)}$. However, this cannot be the case: $a_i$ must have been paired with $b_j$ at some point, because it prefers $b_j$ to its final match. Either $b_j$ accepted $a_i$ and later unmatched, or $b_j$ rejected $a_i$ outright. In the first case, we have a contradiction because $b_j$ would not unmatch from $a_i$ to match with a less-preferred $a_{\pi^{-1}(j)}$. Similarly, $b_j$ would not have rejected $a_i$ outright, because it prefers $a_i$ to its final match.
- Def. $b_j$ is available to $a_i$ if $b_j$ did not reject $a_i$ and if either $b_j$ is unmatched or $b_j$ prefers $a_i$ to its current match.
Other information.
- The order of execution of the algorithm (choices of $i$) does not affect the matches produced.
- Def. $S(i)=\{j\mid \exists\text{ stable matching } \pi':\pi'(i)=j\}$
- the set of possible partners for $a_i$
- G-S assigns each $a_i$ the best possible partner in $S(i)$
- the solution favours the $a_i$'s but not the $b_j$'s
- the $b_j$'s are only guaranteed increasingly preferred partners after they are first matched
- Lemma. In an execution of the algorithm, $\not\exists i,j$ such that $j\in S(i)$ and $b_j$ not available to $a_i$. That is, for all executions, either $b_j$ is available to $a_i$ or $j\notin S(i)$.
- Proof. Suppose not, and consider the first time this happens during the execution. Then there is some $j\in S(i)$ such that $b_j$ is not available to $a_i$. That is, there exists $a_{i'}$ such that $b_j$ accepted $a_{i'}$ and $b_j$ ranks $a_{i'}$ above $a_i$. Since $j\in S(i)$, there exists a stable matching $\pi'$ that matches $a_i$ and $b_j$; let $b_{j'}$ be $a_{i'}$'s partner in $\pi'$. If $a_{i'}$ prefers $b_j$ to $b_{j'}$, then $\pi'$ is not stable ($a_{i'}$ and $b_j$ prefer each other to their partners), which is a contradiction. On the other hand, if $a_{i'}$ prefers $b_{j'}$ to $b_j$, then before making an offer to $b_j$, $a_{i'}$ must have made an offer to $b_{j'}$ and been turned down or turned away. Since $j'\in S(i')$, this is an earlier occurrence where $j'\in S(i')$ and $b_{j'}$ is unavailable to $a_{i'}$, contradicting the choice of $i$ and $j$ as the first such occurrence.
- At the end of the process, each $a$ ends up with the best possible $b$ it could be matched with in any stable matching, and each $b$ ends up with the worst possible $a$.
def gale_shapley(
    a_pref: dict[str, list[str]],
    b_pref: dict[str, list[str]]
) -> dict[str, str]:
    unmatched_a = list(a_pref.keys())
    matches = {}  # maps each matched b_j to its current a_i
    rejects = {a: [] for a in a_pref}
    while unmatched_a:
        ai = unmatched_a.pop()  # choose i
        for bj in a_pref[ai]:  # consider a_i's preferences (list of b's)
            if bj in rejects[ai]:  # b_j has already rejected a_i
                continue
            elif bj not in matches:  # b_j is unmatched
                matches[bj] = ai  # match a_i and b_j
                break  # move to the next unmatched a_i
            elif b_pref[bj].index(ai) < b_pref[bj].index(matches[bj]):  # b_j prefers a_i to its current match
                unmatched_a.append(matches[bj])  # b_j unmatches with current match
                matches[bj] = ai  # match a_i and b_j
                break  # move to the next unmatched a_i
            else:
                rejects[ai].append(bj)  # b_j rejects a_i
    return {bj: matches[bj] for bj in sorted(matches)}
import random
def shuffle_prefs(prefs):
for x in prefs:
random.shuffle(prefs[x])
return prefs
a_pref = {
'A': ['X', 'Y', 'Z'],
'B': ['Y', 'X', 'Z'],
'C': ['Z', 'Y', 'X']
}
b_pref = {
'X': ['B', 'A', 'C'],
'Y': ['C', 'B', 'A'],
'Z': ['A', 'C', 'B']
}
print(gale_shapley(a_pref, b_pref))
print(gale_shapley(shuffle_prefs(a_pref), shuffle_prefs(b_pref)))
print(gale_shapley(shuffle_prefs(a_pref), shuffle_prefs(b_pref)))
print(gale_shapley(shuffle_prefs(a_pref), shuffle_prefs(b_pref)))
{'X': 'A', 'Y': 'B', 'Z': 'C'}
{'X': 'B', 'Y': 'C', 'Z': 'A'}
{'X': 'B', 'Y': 'A', 'Z': 'C'}
{'X': 'C', 'Y': 'B', 'Z': 'A'}
Big-O Notation
Def. Given $f,g:\mathbb{N}\rightarrow\mathbb{R}^+$,
- $f(n)$ is $O(g(n))$ if there exists $n_0, C>0$ such that for all $n\geq n_0$, $f(n)\leq Cg(n)$ (upper bound for $f$).
- $f(n)$ is $\Omega(g(n))$ if there exists $n_0, C>0$ such that for all $n\geq n_0$, $f(n)\geq Cg(n)$ (lower bound for $f$; equivalently, $g(n)$ is $O(f(n))$).
- $f(n)$ is $\Theta(g(n))$ if $f(n)$ is both $O(g(n))$ and $\Omega(g(n))$. Equivalently, we say that $f(n)$ is $\Theta(g(n))$ if there exists $n_0$, $c,C>0$ such that for all $n\geq n_0$, $cg(n)\leq f(n)\leq Cg(n)$ ($f$ and $g$ are approximately equivalent).
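For example, $f(n)=3n^2+5n$ is $O(n^2)$: taking $n_0=1$ and $C=8$, for all $n\geq 1$ we have $3n^2+5n\leq 3n^2+5n^2=8n^2$. Since $3n^2+5n\geq 3n^2$ for all $n$, it is also $\Omega(n^2)$ (take $c=3$), and hence $\Theta(n^2)$.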
Other information.
- Transitivity: If $f(n)$ is $O(g(n))$ and $g(n)$ is $O(h(n))$, then $f(n)$ is $O(h(n))$.
- Sum: If $f_1(n)$ is $O(g_1(n))$ and $f_2(n)$ is $O(g_2(n))$, then $f_1(n)+f_2(n)$ is $O(\max\{g_1(n),g_2(n)\})$.
- Product: If $f_1(n)$ is $O(g_1(n))$ and $f_2(n)$ is $O(g_2(n))$, then $f_1(n)f_2(n)$ is $O(g_1(n)g_2(n))$.
Interval Scheduling
Maximising the number of jobs accepted
There are $n$ jobs. Each job $i$ has a set time interval $[a_i, b_i]$ when it must run its job if it is accepted. We want to find a schedule $S$ which accepts the maximum number of jobs without having two jobs which run simultaneously.
Def. A sequence $S=(i_1,\cdots,i_m)$ is a valid schedule if for all $j\in[1,m-1]$, $b_{i_j}\leq a_{i_{j+1}}$, that is, job $i_{j+1}$ does not start before $i_j$ finishes. Let $|S|$ be the number of jobs accepted.
- Goal: Find a valid schedule which maximises $|S|=m$.
- Possible solution (greedy algorithm): At each step, choose the job (out of the remaining available jobs) which finishes first.
- The optimal schedule from the greedy algorithm is not necessarily unique.
Pseudocode.
sort jobs in order of $b_j$ so that $b_1\leq\cdots\leq b_n$
initialise $S=\emptyset$, $k=0$, and $b_{i_0}=0$
for $j=1,\cdots,n$ do
if $a_j\geq b_{i_k}$ then
$i_{k+1}=j$
add $i_{k+1}$ to $S$
$k=k+1$
take schedule $S$
Proof. The greedy algorithm stays ahead: it finishes its $j$-th job at the same time as or before any other schedule finishes its $j$-th job.
- The greedy algorithm gives the longest valid schedule.
- Def. Given a valid schedule $S=(i_1,\cdots,i_m)$, we define $t_j(S)=b_{i_j}$ to be the time $S$ finishes job $i_j$ (its $j$-th job). If $j>m$ (job is not accepted), set $t_j(S)=\infty$.
- Lemma. Let $S$ be the schedule given by the greedy algorithm. For any other schedule $S'=(i_1',\cdots, i_{m'}')$, $t_j(S)\leq t_j(S')$ for all $j$.
- Proof. By induction on $j$.
Base case. $j=1$. Then $t_1(S)=b_{i_1}\leq b_{i'_1}=t_1(S')$.
Inductive case. Assume $t_j(S)\leq t_j(S')$. If $t_{j+1}(S')=\infty$, the proof is complete. Otherwise, $t_j(S)\leq t_j(S')\leq a_{i_{j+1}'}$ by the inductive hypothesis and the definition of a valid schedule. Since $t_j(S)\leq a_{i_{j+1}'}$, job $i_{j+1}'$ is also available to $S$. Since $S$ chooses the job that finishes first, we have that $t_{j+1}(S)=b_{i_{j+1}}\leq b_{i_{j+1}'}=t_{j+1}(S')$, which completes the proof.
- This algorithm can be implemented in $O(n\log n)$ by first sorting the jobs in increasing order of $b_i$ and then going through the jobs and accepting any job which does not start before the previously accepted job ends.
def interval_scheduling_max_jobs(
jobs: dict[int, tuple[int, int]]
) -> list[int]:
res = []
prev_end_time = -1
# sort jobs by end time
jobs_sorted = sorted(jobs.items(), key=lambda x: x[1][1])
for job, (start, end) in jobs_sorted:
if start >= prev_end_time:
res.append(job)
prev_end_time = end
return res
j1 = {1: (0, 6), 2: (1, 4), 3: (3, 5), 4: (5, 7), 5: (5, 9)}
sched1 = interval_scheduling_max_jobs(j1)
print(f"scheduled jobs: {sched1}")
j2 = {6: (2, 5), 7: (4, 8), 8: (6, 10), 9: (0, 3), 10: (7, 11), 11: (30, 40), 12: (12, 13), 13: (100, 200)}
sched2 = interval_scheduling_max_jobs(j2)
print(f"scheduled jobs: {sched2}")
j3 = {14: (5, 9), 15: (8, 12), 16: (1, 2), 17: (2, 4), 18: (3, 6)}
sched3 = interval_scheduling_max_jobs(j3)
print(f"scheduled jobs: {sched3}")
scheduled jobs: [2, 4]
scheduled jobs: [9, 7, 12, 11, 13]
scheduled jobs: [16, 17, 14]
Minimising maximum lateness
There are $n$ jobs, each job $i$ with length $l_i>0$ and deadline $d_i$. The jobs can be scheduled at any time $t\geq 0$ but we want to complete the jobs so that no two jobs run simultaneously and the maximum lateness of any single job is minimised.
Def. Given a schedule $S=(i_1,\cdots, i_n)$, which is a permutation of $[n]$, define $t_S(i_k)=\sum_{j=1}^k l_{i_j}$ to be the time at which job $i_k$ is completed.
- Goal: Find a schedule $S$ that minimises $\max_{j\in[n]}\{t_S(i_j)-d_{i_j}\}$.
- Possible solution (greedy algorithm): Perform the jobs in order of their deadlines.
Proof. We can transform any optimal schedule $S'$ into the greedy schedule $S$ by exchanging pairs of jobs without increasing the maximum lateness (exchange argument).
- The schedule produced by the greedy algorithm is optimal.
- Lemma. If $S=(i_1,\cdots, i_n)$ is an optimal schedule, for any $j\in[n-1]$, if $d_{i_j}>d_{i_{j+1}}$ (swapped compared to the schedule generated by the greedy algorithm), then the schedule $S'$ obtained by swapping $i_j$ and $i_{j+1}$ is also optimal.
- Proof. Notice that swapping $i_j$ and $i_{j+1}$ does not affect the completion times of the other jobs, that is, $t_{S'}(i_{j'})=t_S(i_{j'})$ for all $j'\neq j, j+1$. Also, since $S'$ schedules $i_{j+1}$ directly before $i_j$, we have that $t_{S'}(i_j)=t_S(i_{j+1})=t_{S'}(i_{j+1})+l_{i_j}$. For job $i_j$: since $d_{i_j}>d_{i_{j+1}}$, $t_{S'}(i_j)-d_{i_j}=t_S(i_{j+1})-d_{i_j}<t_S(i_{j+1})-d_{i_{j+1}}$. For job $i_{j+1}$: since $t_{S'}(i_{j+1})<t_S(i_{j+1})$, $t_{S'}(i_{j+1})-d_{i_{j+1}}<t_S(i_{j+1})-d_{i_{j+1}}$. Thus, the lateness of each of $i_j$ and $i_{j+1}$ in $S'$ is less than the lateness of $i_{j+1}$ in $S$. As such, the maximum lateness of $S'$ is less than or equal to the maximum lateness of $S$.
- We can use the lemma to swap two adjacent jobs whenever they are out of order. This is repeated until everything is in order (which produces the schedule generated by the greedy algorithm).
def interval_scheduling_min_max_lateness(
    jobs: dict[int, tuple[int, int]]
) -> tuple[list[int], int]:
res = []
current_time = 0
max_lateness = 0
# sort jobs by deadline
jobs_sorted = sorted(jobs.items(), key=lambda x: x[1][1])
for jobno, (duration, deadline) in jobs_sorted:
current_time += duration
max_lateness = max(max_lateness, max(0, current_time - deadline))
res.append(jobno)
return res, max_lateness
j1 = {1: (2, 4), 2: (1, 3), 3: (2, 5), 4: (3, 7)}
sched1, max_lateness1 = interval_scheduling_min_max_lateness(j1)
print(f'schedule: {sched1}, max lateness: {max_lateness1}')
j2 = {1: (3, 6), 2: (2, 4), 3: (1, 3)}
sched2, max_lateness2 = interval_scheduling_min_max_lateness(j2)
print(f'schedule: {sched2}, max lateness: {max_lateness2}')
j3 = {1: (4, 5), 2: (2, 2), 3: (1, 4), 4: (3, 7)}
sched3, max_lateness3 = interval_scheduling_min_max_lateness(j3)
print(f'schedule: {sched3}, max lateness: {max_lateness3}')
j4 = {1: (1, 2), 2: (2, 5), 3: (3, 8), 4: (1, 3)}
sched4, max_lateness4 = interval_scheduling_min_max_lateness(j4)
print(f'schedule: {sched4}, max lateness: {max_lateness4}')
j5 = {1: (2, 3), 2: (1, 4), 3: (3, 6)}
sched5, max_lateness5 = interval_scheduling_min_max_lateness(j5)
print(f'schedule: {sched5}, max lateness: {max_lateness5}')
schedule: [2, 1, 3, 4], max lateness: 1
schedule: [3, 2, 1], max lateness: 0
schedule: [2, 3, 1, 4], max lateness: 3
schedule: [1, 4, 2, 3], max lateness: 0
schedule: [1, 2, 3], max lateness: 0
Dijkstra's Algorithm (Shortest Path)
Directed graph $G$ with non-negative edge lengths $\{l_{uv}\mid (u,v)\in E\}$, with starting vertex $s\in V$ and a destination vertex $t\in V$.
- Goal: Find the shortest path from $s$ to $t$.
Review of DFS and BFS.
- Direct connectivity: Given a directed graph $G$ and vertices $s$ and $t$, is there a path from $s$ to $t$?
- Pseudocode
initialise $S_{reached}=\{s\}$
$E_{unexplored}\leftarrow\{(s,v)\mid (s,v)\in E\}$
while $E_{unexplored}\neq\emptyset$ do
remove an edge $(u,v)\in E_{unexplored}$
if $v\in S_{reached}$ then
continue
else if $v\neq t$ and $v\notin S_{reached}$ then
add $v$ to $S_{reached}$
add all of the edges $\{(v,w)\mid (v,w)\in E\}$ to $E_{unexplored}$
else $v=t$
return True
return False
- adding and removing edges from $E_{unexplored}$:
- storing $E_{unexplored}$ using a stack $\implies$ DFS
- storing $E_{unexplored}$ using a queue $\implies$ BFS
- runtime is $O(|E|)$ because we explore all edges, with each edge being added and removed from $E_{unexplored}$ at most once
Pseudocode.
initialise $S=\{s\}$
initialise $d_s=0$
$E_{candidate}\leftarrow \{(s,v)\mid (s,v)\in E\}$
initialise $d_e=l_e$ for every edge $e=(s,v)\in E_{candidate}$
while $E_{candidate}\neq\emptyset$ do
remove $e=(v,w)$ from $E_{candidate}$, which has minimum $d_e$
if $w\in S$ then
continue, we have already found a path with length $\leq$ $d_e$
else if $w\neq t$ and $w\notin S$ then
add $w$ to $S$
$d_w=d_e=d_v+l_{vw}$
add all of the edges $\{(w,x)\mid (w,x)\in E\}$ to $E_{candidate}$
set $d_{e'}=d_w+l_{e'}$ for each newly added edge $e'=(w,x)$
else $w=t$
add $t$ to $S$
$d_t=d_e=d_v+l_{vt}$
return $d_t$
return None
Proofs.
- Lemma. For all $v\in S$, $d_v$ is the minimum length of a path from $s$ to $v$.
- Proof. By induction on $|S|$.
Base case: $|S|=1$. Thus, $S=\{s\}$, and $d_s=0$ is indeed the minimum length of a path from $s$ to $s$. The base case holds.
Inductive case: Let $|S|=k$. Suppose that for all $v\in S$, $d_v$ is the minimum length of a path from $s$ to $v$. Consider what happens when we add the next vertex $w$ to $S$ via the edge $(v,w)$, where $v\in S$. We'll show that $d_w=d_v+l_{vw}$ is the minimum length of a path from $s$ to $w$. Suppose on the contrary that there is another path $P'$ from $s$ to $w$ which has smaller length. Let $w'$ be the first vertex on this path which is not in $S$ and let $(v',w')$ be the edge used to reach $w'$. By the inductive hypothesis, the portion of $P'$ from $s$ to $v'$ has length at least $d_{v'}$. Also, $d_{v'}+l_{v'w'}\geq d_v+l_{vw}=d_w$, because otherwise Dijkstra's algorithm would have taken the edge $(v',w')$ and added $w'$ instead of taking $(v,w)$ and adding $w$ when considering the next edge. Since edge lengths are non-negative, the length of $P'$ is at least $d_{v'}+l_{v'w'}\geq d_w$, which contradicts the assumption that the length of $P'$ is less than $d_w$. As such, we have that $d_w$ is the minimum length of a path from $s$ to $w$.
Other information.
- In order for Dijkstra's algorithm to work, the edge weights must be non-negative for the comparison to hold.
- Dijkstra's algorithm can be implemented in $O(|E|\log|E|)$ time.
- We can use a priority queue to store the candidate edges $(v,w)$ and distances $d_e$.
- Each edge is added at most once and removed at most once, so we perform at most $2|E|$ operations on this priority queue.
- The priority queue never has more than $|E|$ elements, so each operation takes $O(\log|E|)$ time.
import heapq
def dijkstra(graph: dict[str, list[tuple[str, int]]], start: str) -> tuple[dict[str, float], dict[str, str | None]]:
    distances = {node: float('inf') for node in graph}
    distances[start] = 0
    predecessors = {node: None for node in graph}
    priority_queue = [(0, start)]
    while priority_queue:
        current_distance, current_node = heapq.heappop(priority_queue)
        if current_distance > distances[current_node]:
            continue  # stale queue entry: a shorter path to this node was already found
        for neighbor, weight in graph[current_node]:
            distance = current_distance + weight
            if distance < distances[neighbor]:
                distances[neighbor] = distance
                predecessors[neighbor] = current_node
                heapq.heappush(priority_queue, (distance, neighbor))
    return distances, predecessors
def shortest_path(graph: dict[str, list[tuple[str, int]]], start: str, end: str) -> tuple[list[str] | str, float]:
distances, predecessors = dijkstra(graph, start)
path = []
current_node = end
while current_node is not None:
path.insert(0, current_node)
current_node = predecessors[current_node]
return (path, distances[end]) if path[0] == start else ("NO PATH", float('inf'))
graph = {
'A': [('B', 3), ('C', 1), ('E', 2)],
'B': [('D', 2), ('E', 4)],
'C': [('B', 2), ('F', 5)],
'D': [('E', 1), ('G', 8)],
'E': [('H', 5)],
'F': [('E', 1), ('H', 6)],
'G': [('H', 4)],
'H': []
}
p1, d1 = shortest_path(graph, 'A', 'D')
print(f"Shortest path from 'A' to 'D': {p1}, distance: {d1}")
p2, d2 = shortest_path(graph, 'A', 'E')
print(f"Shortest path from 'A' to 'E': {p2}, distance: {d2}")
p3, d3 = shortest_path(graph, 'A', 'G')
print(f"Shortest path from 'A' to 'G': {p3}, distance: {d3}")
Shortest path from 'A' to 'D': ['A', 'B', 'D'], distance: 5
Shortest path from 'A' to 'E': ['A', 'E'], distance: 2
Shortest path from 'A' to 'G': ['A', 'B', 'D', 'G'], distance: 13
Data Structures
Data structures are used for efficiently storing and accessing data needed for algorithms.
- arrays, linked lists, heaps, binary search trees, and hash tables specify how the data should be stored
- abstract data types, such as stacks, queues, and priority queues, specify operations that can be performed
Array: An array of length $n$ has $n$ elements, $A[0],A[1],\cdots,A[n-1]$.
- $A[i]$ gives the element of $A$ at index $i$
- allow us to quickly store and access data but we have to remember where we put everything
- need to sort an array to put data in order (takes $O(n\log n)$ time using mergesort or another efficient sorting algorithm)
- advantages: can use binary search to quickly find elements; can traverse the array in order
- disadvantages: sorted arrays are static, can't add or remove elements
Linked list: A linked list is a set of nodes where each node contains an element and points to the next node (null at the end). Starting from head, we can iterate through the rest of the linked list.
- may be useful to keep the tail pointer as well.
- linked lists can also be doubly-linked, each node contains pointer to previous node
- advantages: easy to insert elements; easy to remove elements for a doubly-linked list
- disadvantages: hard to access elements at a specific index
Stack: Has $O(1)$ operations, can be implemented using either arrays or linked lists.
push(x): adds $x$ to the top of the stack
pop(): removes and returns the top of the stack
peek(): returns but does not remove the top of the stack
Queue: Has $O(1)$ operations, can be implemented using either arrays or linked lists that keep track of the tail.
enqueue(x): adds $x$ to the back of the queue
dequeue(): removes and returns the first element of the queue
peek(): returns but does not remove the first element of the queue
Priority queue: Similar to queue, but each element has a priority value and instead of taking the element at the front of the queue, we take the element with the smallest priority value.
insert(x,v): adds $x$ to the priority queue with priority $v$
delete min(): removes and returns the element with the smallest priority value
peek(): returns but does not remove the element with the smallest priority value
Binary min-heap: Structure where each node has 0, 1, or 2 children and each node has a value which is less than or equal to its children. Key idea: Smallest element of the heap must be at the top of the heap and may be useful to require heap to be an almost-complete binary tree.
insert(x): add $x$ to the next spot and, while it is smaller than its parent, swap it with its parent. $O(\log n)$
delete(): remove and return the top of the heap; take the element in the last-occupied position, put it at the top, and while it is larger than at least one child, swap it with its smaller child. $O(\log n)$
- can be implemented with an array
- each $A[i]$ is a node, top is $A[1]$, and left and right children of $A[i]$ are $A[2i]$ and $A[2i+1]$
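A minimal Python sketch of these two operations, using the 1-indexed array layout described above (a dummy entry occupies index $0$); in practice Python's built-in heapq module provides the same functionality on a plain list:
class MinHeap:
    def __init__(self):
        self.a = [None]  # dummy entry so the top is a[1] and children of a[i] are a[2i], a[2i+1]

    def insert(self, x):
        # add x at the next spot, then swap upwards while smaller than its parent: O(log n)
        self.a.append(x)
        i = len(self.a) - 1
        while i > 1 and self.a[i] < self.a[i // 2]:
            self.a[i], self.a[i // 2] = self.a[i // 2], self.a[i]
            i //= 2

    def delete(self):
        # remove the top; move the last element up, then swap downwards with the
        # smaller child while larger than at least one child: O(log n)
        top = self.a[1]
        last = self.a.pop()
        if len(self.a) > 1:
            self.a[1] = last
            i = 1
            while True:
                smallest = i
                for c in (2 * i, 2 * i + 1):
                    if c < len(self.a) and self.a[c] < self.a[smallest]:
                        smallest = c
                if smallest == i:
                    break
                self.a[i], self.a[smallest] = self.a[smallest], self.a[i]
                i = smallest
        return top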
Minimum Spanning Trees
Def. Given a graph $G=(V,E)$ where each edge $e$ has length $\ell_e$, a minimum spanning tree $T$ is a set of edges $T\subseteq E$ such that $(V,T)$ is a tree and $\sum_{e\in T}\ell_e$ is minimised.
Kruskal's Algorithm
Algorithm.
- Sort the edges of $E$ in order of length.
- Go through the edges one by one and take each edge which does not form a cycle with the edges taken so far.
Pseudocode.
sort $E$ by edge length
initialise $T=\emptyset$, where $T$ is the set of edges we've taken so far
repeat:
take next edge $e\in E$
if $T\cup\{e\}$ contains a cycle then
discard $e$
else
add $e$ to $T$
until $|T|=|V|-1$ or there are no edges left to consider
Proof. To see that Kruskal's algorithm gives the minimum spanning tree $T$, observe that:
- If Kruskal's algorithm adds an edge $e=\{u,v\}$, let $L$ be the set of vertices $u'$ such that there is a path from $u$ to $u'$ using the edges which we have added to $T$ so far. Let $R=V\setminus L$. Note that $v\notin L$; otherwise adding $e$ would create a cycle. By definition of $L$, no edges between $L$ and $R$ were added to $T$. Thus, $e$ must be the shortest edge between $L$ and $R$; if there were some $e'$ between $L$ and $R$ that were shorter, it would be added by the algorithm sooner. Therefore, $e\in T$.
- If there exists a cut $C=(L,R)$ such that $e$ is the shortest edge between $L$ and $R$, then $e$ is the first edge between $L$ and $R$ that Kruskal's algorithm will consider. At this point, no other edges between $L$ and $R$ have been added so adding $e$ will not create a cycle. Therefore $e\in T$.
Additional information.
- Checking whether adding an edge to $T$ creates a cycle can be done in amortised $O(\alpha(n))$ time per edge.
- $\alpha(n)$ is the inverse Ackermann function.
- This uses the union-find data structure.
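A Python sketch of Kruskal's algorithm using union-find with path compression (union by rank is omitted for brevity, so this sketch does not attain the full amortised $O(\alpha(n))$ bound); the representation of vertices as $0,\dots,n-1$ and edges as (length, u, v) triples is an assumption:
def kruskal(n: int, edges: list[tuple[int, int, int]]) -> list[tuple[int, int, int]]:
    parent = list(range(n))  # union-find: parent pointer for each vertex

    def find(u):  # find the representative of u's component, compressing the path
        while parent[u] != u:
            parent[u] = parent[parent[u]]
            u = parent[u]
        return u

    T = []
    for length, u, v in sorted(edges):  # consider edges in order of length
        ru, rv = find(u), find(v)
        if ru != rv:           # u and v are in different components: no cycle
            parent[ru] = rv    # merge the two components
            T.append((length, u, v))
        if len(T) == n - 1:    # a spanning tree has |V|-1 edges
            break
    return T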
Prim's Algorithm
- Key idea: At each step, we take the shortest edge to a new destination.
- Similar to Dijkstra's algorithm and takes $O(|E|\log|E|)$ time.
- Instead of storing the total length of the path from $s$ to $v$ using an edge $e=(u,v)$, we store $\ell_e$ and put it in the priority queue.
- Prim's algorithm does not require the initial sorting that Kruskal's algorithm requires.
Proof. To see that Prim's algorithm gives the minimum spanning tree $T$, observe that:
- Whenever the algorithm adds an edge $e$, $e$ is the shortest edge between the set $S$ of vertices which have been reached so far and $V\setminus S$, so $e\in T$.
- If $e$ is an edge of $T$, then when the algorithm considers $e$, it will add it: since Prim's algorithm only adds edges of $T$, adding $e$ cannot create a cycle, so $e$ must reach a new vertex.
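A Python sketch of Prim's algorithm in the style of the Dijkstra implementation above, keying the priority queue on $\ell_e$ alone; the adjacency-list representation (each vertex maps to (neighbour, length) pairs, undirected so each edge appears in both lists) is an assumption:
import heapq

def prim(graph: dict[str, list[tuple[str, int]]]) -> list[tuple[str, str, int]]:
    start = next(iter(graph))
    S = {start}                    # vertices reached so far
    T = []                         # MST edges taken so far
    candidates = [(l, start, v) for v, l in graph[start]]
    heapq.heapify(candidates)      # priority queue of candidate edges, keyed on length
    while candidates and len(S) < len(graph):
        l, u, v = heapq.heappop(candidates)  # shortest candidate edge
        if v in S:
            continue               # both endpoints already reached: would form a cycle
        S.add(v)
        T.append((u, v, l))
        for w, lw in graph[v]:     # add edges from the newly reached vertex
            if w not in S:
                heapq.heappush(candidates, (lw, v, w))
    return T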
Correctness of MST Algorithms
- Note that if $e$ is the shortest edge incident to a vertex $v$, then $e\in T$.
- Def. A cut is a partition $(L,R)$ of the vertices of $G$ where $L,R\neq\emptyset$. An edge $e=(u,v)$ crosses the cut $(L,R)$ if $u\in L$ and $v\in R$, or vice versa.
- Thm. For each edge $e\in E$, $e\in T$ if and only if there exists a cut $(L, R)$ such that $e$ is the shortest edge crossing $(L,R)$.
- Proof. ($\rightarrow$) Let $e=(u,v)\in T$. Take $L$ to be the set of vertices reachable from $u$ using the edges $T\setminus\{e\}$, and take $R=V\setminus L$. If there were a shorter edge $e'$ crossing $(L,R)$, then $T'=(T\cup\{e'\})\setminus\{e\}$ would be a spanning tree with shorter total length, which would contradict our assumption that $T$ is a MST.
($\leftarrow$) Let $e$ be the shortest edge crossing $(L,R)$. Suppose on the contrary that $e\notin T$. Then adding $e$ to $T$ creates a cycle. This cycle contains some other edge $e'$ crossing $(L,R)$. If we add $e$ to $T$ and remove $e'$ from $T$, $T$ will have shorter length and is still a spanning tree, which contradicts our assumption that $T$ is a MST. Therefore, $e\in T$.
- Thm. For all edges $e\in E$, $e\notin T$ if and only if there exists a cycle $C$ such that $e$ is the longest edge of $C$.
Solving Recurrence Relations
- Expand out recurrence relation by iteratively substituting the recurrence relation into itself
- Make educated guesses for the solution to recurrence relations by solving the homogeneous and inhomogeneous parts of the relation
Master Theorem
The Master Theorem allows us to obtain the runtime of a recursive algorithm from the recurrence relation describing its runtime. Let $T(n)$ be the worst-case runtime for an algorithm. If $T(n)=aT(\frac{n}{b})+O(n^c)$, let $c_{crit}=\log_b a$. Then:
- If $c<c_{crit}$, then $T(n)$ is $O(n^{c_{crit}})$.
- If $c>c_{crit}$, then $T(n)$ is $O(n^c)$.
- If $c=c_{crit}$, then $T(n)$ is $O(n^{c_{crit}}\log n)$.
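For example, the mergesort recurrence $T(n)=2T(\frac{n}{2})+O(n)$ below has $a=b=2$ and $c=1$, so $c_{crit}=\log_2 2=1=c$ and $T(n)$ is $O(n\log n)$; the Karatsuba recurrence $T(n)=3T(\frac{n}{2})+O(n)$ has $c_{crit}=\log_2 3\approx 1.58>1=c$, so $T(n)$ is $O(n^{\log_2 3})$.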
Recursive Algorithms
Binary Search
The binary search algorithm checks if an element $x$ is in a sorted array $A$ as follows:
- Input: The algorithm requires indices $i$ and $j$ such that if $x\in A$, it must be located at some index $i'\in[i,j]$.
- Initialisation: We call binary search on $A$ with $i=0$ and $j=n-1$.
- Recursive step: If $j<i$, then $x\notin A$. Otherwise take $k=\lfloor\frac{i+j}{2}\rfloor$. There are three cases:
- $x=A[k]$. Then we have found $x$.
- $x>A[k]$. Then we use binary search with $i'=k+1$ and $j'=j$.
- $x<A[k]$. Then we use binary search with $i'=i$ and $j'=k-1$.
- Recurrence relation: $T(n)=T(\frac{n}{2})+1$, so $T(n)$ is $O(\log n)$.
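A Python sketch of binary search, written iteratively rather than recursively (the two are equivalent here); it returns the index of $x$ in $A$, or None if $x\notin A$:
def binary_search(A: list[int], x: int) -> int | None:
    i, j = 0, len(A) - 1
    while i <= j:  # if j < i, then x is not in A
        k = (i + j) // 2
        if x == A[k]:
            return k      # we have found x
        elif x > A[k]:
            i = k + 1     # x can only lie in A[k+1..j]
        else:
            j = k - 1     # x can only lie in A[i..k-1]
    return None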
Finding Local Maximum
This algorithm finds an index $i$ that is a local maximum in an array $A$, that is, an index whose value is at least as large as those of its neighbours:
- Initialisation: We call the algorithm on $A$ with $l=0$ and $r=n-1$.
- Recursive step: If $l=r$, then $l$ is a local maximum. Otherwise take $m=\lfloor\frac{l+r}{2}\rfloor$. There are two cases:
- $A[m]\geq A[m+1]$. Then we call algorithm with $l'=l$ and $r'=m$.
- $A[m]<A[m+1]$. Then we call algorithm with $l'=m+1$ and $r'=r$.
- Key idea (for local minimum): We compare the middle element with its neighbours. If the middle element is not greater than either of its neighbours, then it is a local minimum. If the middle element is greater than its left neighbour, then there is always a local minimum in the left half. If the middle element is greater than its right neighbour, then there is always a local minimum in the right half.
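A Python sketch of the recursive step above for the local maximum, written iteratively; it returns an index $m$ such that $A[m]$ is at least as large as its neighbours:
def local_maximum(A: list[int]) -> int:
    l, r = 0, len(A) - 1
    while l < r:
        m = (l + r) // 2
        if A[m] >= A[m + 1]:
            r = m        # a local maximum exists in A[l..m]
        else:
            l = m + 1    # a local maximum exists in A[m+1..r]
    return l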
Mergesort
This algorithm sorts an array $A$:
- Split the array into two subarrays by dividing $A$ in half.
- Call mergesort on elements $0$ through $\lfloor\frac{n}{2}\rfloor-1$ of the array.
- Call mergesort on elements $\lfloor\frac{n}{2}\rfloor$ through $n-1$ of the array.
- If $A$ and $B$ are two sorted arrays of lengths $n_1$ and $n_2$, we can merge these arrays into a sorted array $C$ of length $n_1+n_2$:
- Initialisation: Start with $i=0$ and $j=0$.
- Iterative step: If $j\geq n_2$, or $i<n_1$ and $A[i]<B[j]$, then set $C[i+j]=A[i]$ and increment $i$. If $i\geq n_1$, or $j<n_2$ and $B[j]\leq A[i]$, then set $C[i+j]=B[j]$ and increment $j$.
- Recurrence relation: $T(n)=2T(\frac{n}{2})+n$, so $T(n)$ is $O(n\log n)$.
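A Python sketch of mergesort following the steps above (this version returns a new sorted list rather than sorting in place):
def mergesort(A: list[int]) -> list[int]:
    n = len(A)
    if n <= 1:
        return A
    L = mergesort(A[:n // 2])   # sort elements 0 through floor(n/2)-1
    R = mergesort(A[n // 2:])   # sort elements floor(n/2) through n-1
    C, i, j = [], 0, 0
    while i < len(L) or j < len(R):   # merge the two sorted halves
        if j >= len(R) or (i < len(L) and L[i] <= R[j]):
            C.append(L[i])
            i += 1
        else:
            C.append(R[j])
            j += 1
    return C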
Counting Inversions
Problem: Given a sequence $a_1,\cdots, a_n$ of distinct numbers, how many pairs $i,j\in[n]$ are there such that $i<j$ but $a_i>a_j$?
Naive solution: Consider every pair $i,j$. This takes $O(n^2)$ time.
Algorithm: Given an array $A$ of $n$ distinct elements, we can simultaneously count the number of inversions in $A$ and sort $A$ as follows:
- Base case: If $n=1$, the array is already sorted and there are no inversions.
- Divide and conquer: Recursively apply this algorithm to elements $0$ through $\lfloor\frac{n}{2}\rfloor-1$ of $A$, and let $count_L$ be the number of inversions that were in these elements of $A$. Similarly, let $count_R$ be the number of inversions that are in elements $\lfloor\frac{n}{2}\rfloor$ through $n-1$ of $A$.
- Merge: If $A$ and $B$ are two sorted arrays of lengths $n_1$ and $n_2$, then we can count the number of pairs $i,j$ such that $a_i>b_j$ and merge these sequences into a sorted array $C$ of length $n_1+n_2$:
- Initialisation: Start with $i=0$, $j=0$, and $count_{between}=0$.
- Iterative step: If $j\geq n_2$, or $i<n_1$ and $a_i\leq b_j$, then set $c_{i+j}=a_i$ and increase $i$ by $1$. If $i\geq n_1$, or $j<n_2$ and $b_j<a_i$, then set $c_{i+j}=b_j$, increase $count_{between}$ by $n_1-i$ and increase $j$ by $1$.
- Solution: After the merging step of the two sorted halves, the total number of inversions in $A$ is $count_L+count_R+count_{between}$.
Proof of correctness: At each step, we have the elements in $C$, followed by the remaining elements of $A$ and then the remaining elements of $B$. When an element of $A$ is added to $C$, nothing changes. When an element $b_j$ of $B$ is added to $C$, $b_j<a_{i'}$ for all $a_{i'}$ remaining in $A$, so each of these remaining $a_{i'}$ was previously inverted with $b_j$. Shifting $b_j$ before all of these $a_{i'}$ removes all of these inversions, reducing the number of inversions by $n_1-i$, the number of elements remaining in $A$. Thus, $count_{between}$ keeps track of the number of inversions between $A$ and $B$ which we have removed so far. When we are finished, there will be no inversions, so $count_{between}$ will be the number of inversions between $A$ and $B$ at the start.
Runtime: This algorithm can count many inversions at once, which allows the algorithm to run in $O(n\log n)$ time as opposed to $O(n^2)$ time.
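A Python sketch of the algorithm, returning the inversion count together with a sorted copy of the array:
def count_inversions(A: list[int]) -> tuple[int, list[int]]:
    n = len(A)
    if n <= 1:
        return 0, A   # base case: no inversions
    count_l, L = count_inversions(A[:n // 2])
    count_r, R = count_inversions(A[n // 2:])
    C, i, j, count_between = [], 0, 0, 0
    while i < len(L) or j < len(R):   # merge while counting cross inversions
        if j >= len(R) or (i < len(L) and L[i] <= R[j]):
            C.append(L[i])
            i += 1
        else:
            C.append(R[j])
            count_between += len(L) - i   # R[j] is inverted with every element left in L
            j += 1
    return count_l + count_r + count_between, C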
Closest Pair of Points
Problem: Given a set $P=\{p_1,\cdots,p_n\}\subseteq\mathbb{R}^2$ of $n$ points in the plane, which two points are closest to each other?
Naive solution: Consider every pair of points $p_i,p_j$. This takes $O(n^2)$ time.
Algorithm: Given arrays $X$ and $Y$ of the points sorted by their $x$-coordinate and $y$-coordinate respectively, the algorithm does the following:
- Base cases: If $n\leq 1$, then the algorithm returns infinity as there is no pair of points. If $n=2$, then the algorithm returns $p_1,p_2$ because it is the only pair of points.
- Divide: Let $m=\lfloor\frac{n}{2}\rfloor$. Then $X[m]$ is the middle entry of $X$. Take $P_L$ to be the set of all points to the left of $X[m]$, including $X[m]$. Take $P_R$ to be the set of all points to the right of $X[m]$, excluding $X[m]$. Then let $X_L$ and $Y_L$ contain all the points in $P_L$ sorted by $x$- and $y$-coordinate, respectively. We can do the same for $P_R$. This construction takes $O(n)$ time by going through $X$ and $Y$ and filtering by $P_L$ and $P_R$.
- Conquer: We recursively find the closest pair of points $p_{i_L}, p_{j_L}$ in the left half and $p_{i_R}, p_{j_R}$ in the right half.
- Merge: To find the closest pair of points overall, we also need to check whether there are any other pairs of points across the two halves which are closer than the pairs of points that we have already found:
- Take $d=\min\{d(p_{i_L}, p_{j_L}), d(p_{i_R}, p_{j_R})\}$. Let $P_{mid}=\{p_i=(x_i,y_i)\mid |x_i-X[m]|<d\}$ be the subset of points which are within $d$ of the dividing line $x=X[m]$. Let $Y_{mid}$ be an array of the points in $P_{mid}$ sorted by their $y$-coordinate.
- Consider all pairs $i,j$ such that one of the following is true:
- $i=i_L$ and $j=j_L$, or $i=i_R$ and $j=j_R$.
- $p_i,p_j\in P_{mid}$ and $p_j$ comes between $1$ and $11$ positions after $p_i$ in $Y_{mid}$ (visually, this involves splitting the middle strip into $\frac{d}{2}\times\frac{d}{2}$ cells; only cells within the next 3 cells on the same row and the next 8 cells over the next 2 rows can be within $d$ of the current cell).
- Take $i_{min},j_{min}$ to be the pair which minimises $d(p_i,p_j)$ among all such pairs $p_i,p_j$, and take $d_{min}=d(p_{i_{min}},p_{j_{min}})$.
Proof of correctness: With the two observations below, we can see that the algorithm checks all pairs of points which could possibly be within distance $d$ of each other:
- If $p_i\in P_L$ and $p_j\in P_R$, then $d(p_i,p_j)\geq d$ unless $p_i,p_j\in P_{mid}$. Note that $x_j-x_i=(x_j-x_{mid})+(x_{mid}-x_i)$. If $p_i\notin P_{mid}$, then $x_j-x_i\geq x_{mid}-x_i\geq d$. If $p_j\notin P_{mid}$, then $x_j-x_i\geq x_j-x_{mid}\geq d$. Therefore, any pair of points at distance less than $d$ either lies entirely within one half or has both points in $P_{mid}$.
- For each point $p_i\in P_{mid}$, there are at most $11$ points $p_j\in P_{mid}$ such that $0\leq y_j-y_i\leq d$. We can see this by partitioning the region $M=\{(x,y)\mid |x-x_{mid}|<d\}$ into rows of $4$ boxes, where each box has side length $d/2$. Observe that each such box can have at most $1$ point in it because otherwise there would be a pair of points in either $P_L$ or $P_R$ which are less than distance $d$ apart. Then, for any $p_i\in P_{mid}$, we can only have that $p_j\in P_{mid}$ and $0\leq y_j-y_i\leq d$ if $p_j$ is in the same row of boxes as $p_i$ or is in one of the two rows of boxes above $p_i$. Thus, there are only $11$ boxes which $p_j$ could be in, so there can be at most $11$ such $p_j$.
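A Python sketch of the algorithm, assuming at least two points and all points distinct; for brevity it checks up to $11$ strip successors directly rather than reasoning about cells:
import math

def closest_pair(points: list[tuple[float, float]]) -> tuple[float, tuple, tuple]:
    def dist(p, q):
        return math.hypot(p[0] - q[0], p[1] - q[1])

    def rec(X, Y):  # X, Y: the same points sorted by x- and y-coordinate
        n = len(X)
        if n <= 3:  # base case: brute force over all pairs
            return min((dist(p, q), p, q)
                       for i, p in enumerate(X) for q in X[i + 1:])
        m = n // 2
        x_mid = X[m - 1][0]                   # the dividing line
        XL, XR = X[:m], X[m:]
        left = set(XL)
        YL = [p for p in Y if p in left]      # filter Y in O(n), keeping y-order
        YR = [p for p in Y if p not in left]
        best = min(rec(XL, YL), rec(XR, YR))  # closest pair in each half
        d = best[0]
        strip = [p for p in Y if abs(p[0] - x_mid) < d]  # P_mid, sorted by y
        for i, p in enumerate(strip):
            for q in strip[i + 1:i + 12]:     # at most 11 candidates per point
                if q[1] - p[1] >= d:
                    break
                best = min(best, (dist(p, q), p, q))
        return best

    return rec(sorted(points), sorted(points, key=lambda p: p[1]))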
Karatsuba Multiplication Algorithm
The grade school algorithm for multiplying two $n$-digit numbers takes $O(n^2)$ time. We can use the Karatsuba multiplication divide-and-conquer algorithm: Given two $n$-digit numbers $x$ and $y$, taking $k=\lceil\frac{n}{2}\rceil$, we can write $x=10^kx_1+x_2$ and $y=10^ky_1+y_2$, where $x_1,y_1$ are $(n-k)$-digit numbers and $x_2,y_2$ are $k$-digit numbers. We then have that: $$xy=10^{2k}x_1y_1+10^kx_1y_2+10^kx_2y_1+x_2y_2$$
This allows us to compute $xy$ by taking the sum of smaller products. But this still requires four products, which does not give us a faster algorithm. However, notice that $x_1y_2+x_2y_1=(x_1+x_2)(y_1+y_2)-x_1y_1-x_2y_2$. Therefore, we have that: $$xy=10^{2k}x_1y_1+10^k((x_1+x_2)(y_1+y_2)-x_1y_1-x_2y_2)+x_2y_2$$
Recurrence relation: $T(n)=3T(\frac{n}{2})+O(n)$, so $T(n)$ is $O(n^{\log_2 3})$.
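A Python sketch of Karatsuba multiplication on non-negative integers (here the split uses $k=\lfloor\frac{n}{2}\rfloor$ rather than $\lceil\frac{n}{2}\rceil$; either choice works):
def karatsuba(x: int, y: int) -> int:
    if x < 10 or y < 10:              # base case: single-digit multiply
        return x * y
    k = max(len(str(x)), len(str(y))) // 2
    x1, x2 = divmod(x, 10 ** k)       # x = 10^k * x1 + x2
    y1, y2 = divmod(y, 10 ** k)       # y = 10^k * y1 + y2
    a = karatsuba(x1, y1)
    b = karatsuba(x2, y2)
    c = karatsuba(x1 + x2, y1 + y2) - a - b   # = x1*y2 + x2*y1, one product instead of two
    return 10 ** (2 * k) * a + 10 ** k * c + b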
Dynamic Programming
In dynamic programming, we solve a problem by solving smaller subproblems. Using the solutions to these subproblems, we can obtain solutions for larger and larger subproblems until we solve the entire problem. To design a dynamic programming algorithm, we need to do the following:
- Determine the subproblems which we need to solve.
- Find a recurrence relation for how to obtain the answer to a subproblem from the answers to smaller subproblems.
To implement the dynamic programming algorithm:
- Store the answers to the subproblems which we’ve solved so far (memoisation).
- This allows us to avoid repeating our work if we need the answer to a given subproblem multiple times.
- Iteratively compute the answer to the next subproblem using the recurrence relation.
Weighted Interval Scheduling
Problem: Given $n$ intervals $[a_1,b_1],\dots,[a_n,b_n]$ with weights $w_1,\dots,w_n$, find a valid schedule $S=(i_1,\dots,i_m)$ which maximises $\sum_{j=1}^{|S|} w_{i_j}$.
Preprocessing: Sort the times $T=\{a_i:i\in[n]\}\cup\{b_i:i\in[n]\}$ and let $(t_1,\dots,t_{2n})$ be the sorted times. Here, we assume that all the start and end times are distinct so there are no duplicates.
Algorithm: To solve this problem, we consider the following subproblems: For all $t\in T$, what is the maximum total weight which can be completed by time $t$? Let $S_t=\{S=(i_1,\dots,i_m): S\text{ is a valid schedule,} b_{i_m}\leq t\}$, the set of all valid schedules which are completed by time $t$. Also, let $w(t)=\max_{S\in S_t}(\sum_{j=1}^{|S|} w_{i_j})$ be the maximum total weight which can be completed by time $t$. Then we have the following recurrence relation for all $k\in[2n-1]$:
- If $t_{k+1}=a_i$ for some $i\in[n]$, then $w({t_{k+1}})=w(t_k)$.
- If $t_{k+1}=b_j$ for some $j\in[n]$, then $w(t_{k+1})=\max\{w(t_k), w(a_j)+w_j\}$.
Proof of correctness: If $t_{k+1}=a_i$ for some $i\in[n]$, then any schedule which finishes by time $t_{k+1}$ also finishes by time $t_k$ so $w(t_{k+1})=w(t_k)$. If $t_{k+1}=b_j$, then we can show that $w(t_{k+1})\geq \max\{w(t_k),w(a_j)+w_j\}$ and $w(t_{k+1})\leq\max\{w(t_k),w(a_j)+w_j\}$.
(LHS $\geq$ RHS) Observe that by definition of $w(t_k)$, there is a schedule that finishes by time $t_k$ with total weight $w(t_k)$. This schedule also finishes by $t_{k+1}$, so $w({t_{k+1}})\geq w(t_k)$. Similarly, by definition of $w(a_j)$, there is a schedule which finishes by time $a_j$ with total weight $w(a_j)$. Adding job $j$ to this schedule gives a schedule which finishes by time $t_{k+1}=b_j$ and has total weight $w(a_j)+w_j$, so $w(t_{k+1})\geq w(a_j)+w_j$.
(LHS $\leq$ RHS) Observe that by definition of $w(t_{k+1})$, there is a schedule with weight $w(t_{k+1})$ that finishes by time $t_{k+1}$. If this schedule does not contain job $j$, then this schedule finishes by $t_k$ and has weight $w(t_k)$, so $w(t_{k+1})\leq w(t_k)$. If this schedule contains job $j$, then the schedule consists of a subschedule $S$ with weight $w(S)$ which finishes by time $a_j$, plus job $j$. It must be the case that $w(S)\leq w(a_j)$, so we have that $w(t_{k+1})=w(S)+w_j\leq w(a_j)+w_j$.
Alternative proof of correctness: We can partition the schedules $S$ which finish by time $t=b_j$ based on whether they contain job $j$ or not. Observe that:
- The schedules that finish by time $t_{k+1}=b_j$ and do not contain job $j$ also finish by time $t_k$. The maximum weight completed by such a schedule is $w(t_k)$.
- The schedules $S$ which finish by time $t_{k+1}=b_j$ and contain job $j$ are the schedules of the form $S'\cup \{j\}$, where $S'$ is a schedule which finishes by time $a_j$. The maximum weight completed by such a schedule is $w(a_j)+w_j$.
Runtime: The preprocessing step takes $O(n\log n)$ time to sort the starting and ending times of the jobs. In the algorithm itself, there are $2n$ subproblems and each subproblem takes $O(1)$ time, so the algorithm runs in $O(n)$. Therefore, the entire algorithm takes $O(n\log n)$.
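A Python sketch of the recurrence above; jobs are given as $(a_i, b_i, w_i)$ triples, and as in the preprocessing step we assume all start and end times are distinct:
def max_weight_schedule(jobs: list[tuple[int, int, int]]) -> int:
    times = sorted(t for a, b, w in jobs for t in (a, b))
    job_ending_at = {b: (a, w) for a, b, w in jobs}  # look up a job by its end time
    best = {}     # best[t] = w(t), the maximum total weight completable by time t
    w_prev = 0    # w(t_k) for the previously processed time
    for t in times:
        w_t = w_prev                   # if t = a_i for some i, then w(t_{k+1}) = w(t_k)
        if t in job_ending_at:         # if t = b_j, also consider taking job j
            a, w = job_ending_at[t]
            w_t = max(w_t, best[a] + w)
        best[t] = w_t
        w_prev = w_t
    return w_prev
For example, on jobs $(0,3,5)$, $(2,5,6)$, $(4,7,4)$ this returns $9$, taking the first and third jobs.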
Bellman-Ford Algorithm for Shortest Path
Problem: Given a directed graph $G=(V,E)$, a starting vertex $s\in V$, a destination vertex $t\in V$, and edge lengths $l_e=l_{uv}$ for all $e=(u,v)\in E$, find a path $P$ from $s$ to $t$ such that $\sum_{e\in E(P)}l_e$ is minimised.
Algorithm: To solve this problem, consider the following subproblems: For all $k\in[|V(G)|]$, what is the minimum length of a walk from $s$ to $v$ which uses at most $k$ edges? Let $d_k(v)$ be the minimum length of a walk from $s$ to $v$ which uses at most $k$ edges. We then have the following recurrence relation: $$d_{k+1}(v)=\min\{d_k(v),\min_{u\in V(G):(u,v)\in E(G)} (d_k(u)+l_{uv})\}$$
Proof of correctness: First we'll show that if there are no negative cycles in $G$, then for all $k\in\mathbb{N}$ and all $v\in V(G)$, $d_k(v)$ is also equal to the minimum length of a path from $s$ to $v$ which uses at most $k$ edges. Since all paths are walks, for any walk $W$ from $s$ to $v$ which uses at most $k$ edges, there is a path $P$ from $s$ to $v$ which uses at most $k$ edges with length less than or equal to length of $W$. We can turn $W$ into $P$ by removing all cycles from $W$. Since there are no negative length cycles in $G$, the length of $P$ is at most the length of $W$.
(LHS $\geq$ RHS) Let $W$ be a walk from $s$ to $v$ with minimum length which uses at most $k+1$ edges. If $W$ uses at most $k$ edges, then $W$ has a length of at least $d_k(v)$, so $d_{k+1}(v)\geq d_k(v)$. And if $W$ uses exactly $k+1$ edges with the final edge being $(u,v)$, then $W$ has length at least $d_k(u)+l_{uv}$, so $d_{k+1}(v)\geq\min_{u\in V(G):(u,v)\in E(G)}(d_k(u)+l_{uv})$.
(LHS $\leq$ RHS) By definition, there is a walk of length $d_k(v)$ from $s$ to $v$ which uses at most $k$ edges. As such, it uses at most $k+1$ edges, so $d_{k+1}(v)\leq d_k(v)$. Similarly, for all $u\in V(G)$, there is a walk $W$ of length $d_k(u)$ from $s$ to $u$ which uses at most $k$ edges. Adding the edge $(u,v)$ to this walk gives a walk from $s$ to $v$ of length $d_k(u)+l_{uv}$, which uses at most $k+1$ edges, so $d_{k+1}(v)\leq\min_{u\in V(G):(u,v)\in E(G)}(d_k(u)+l_{uv})$.
Alternative proof of correctness: We can partition the walks $W$ from $s$ to $v$ using at most $k+1$ edges based on the vertex immediately preceding $v$. Observe that any walk $W$ from $s$ to $v$ using at most $k+1$ edges must satisfy one of the following:
- $W$ uses at most $k$ edges. The shortest such walk has length $d_k(v)$.
- The vertex before $v$ in $W$ is $u$ for some $u\in V$ such that $(u,v)\in E(G)$. These walks consist of a walk $W'$ from $s$ to $u$ using at most $k$ edges with the edge $(u,v)$ added at the end. The shortest such walk has length $d_k(u)+l_{uv}$.
Runtime: Let $n=|V(G)|$ and let $m=|E(G)|$. Naively, there are $n^2$ subproblems $d_k(v)$ to compute and each $d_k(v)$ takes $O(n)$ time to compute. We can improve this by observing that each $d_k(v)$ takes $O(indegree(v))$ time to compute. Thus, for each $k\in[n]$, computing all of the $d_k(v)$ takes $O(m)$ time, since $\sum_{v\in V} indegree(v)=|E|=m$. Thus, this algorithm takes $O(mn)$ time.
Note: If there is a negative cycle in $G$ which is reachable from $s$, then there are walks of arbitrarily negative length to any vertex $v$ which is reachable from the negative cycle. Using the Bellman-Ford algorithm, we can check whether this happens by checking if there are any updates when we look at walks of length $n$. If there are no negative cycles reachable from $s$, then since all shortest paths in $G$ use at most $n-1$ edges, there will be no updates. Conversely, if there are no updates at a given $k$, then we have already found the shortest walks from $s$ to every vertex $v$ and we can terminate the algorithm. If there is a negative cycle reachable from $s$, there will be updates for all $k$.
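A Python sketch of the algorithm with the early-termination check described above; graph is assumed to map every vertex $u$ (including sinks) to a list of $(v, l_{uv})$ pairs, and no negative cycle is assumed reachable from $s$:
def bellman_ford(graph: dict[str, list[tuple[str, int]]], s: str) -> dict[str, float]:
    dist = {v: float('inf') for v in graph}  # dist[v] plays the role of d_k(v)
    dist[s] = 0
    for k in range(len(graph) - 1):          # shortest paths use at most |V|-1 edges
        updated = False
        for u in graph:
            for v, l in graph[u]:
                if dist[u] + l < dist[v]:    # relax edge (u, v)
                    dist[v] = dist[u] + l
                    updated = True
        if not updated:                      # no updates: all shortest walks found
            break
    return dist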
Knapsack Problem
Problem: Given $n$ objects with weights $w_i$ and values $v_i$ and a knapsack with capacity $C$, what is $\max_{I\subseteq[n]:\sum_{i\in I}w_i\leq C}(\sum_{i\in I}v_i)$, the maximum total value of objects which can be carried in the knapsack?
Algorithm: To solve this problem with integer weights and capacity, we consider the following subproblems: What is the maximum total value which can be taken from the first $k$ objects without exceeding weight $w$? Let $v(k,w)$ represent this value, where $v(k,w)=\max_{I\subseteq[k]:\sum_{i\in I}w_i\leq w}(\sum_{i\in I}v_i)$. Then we have the following recurrence relation (where the second term is omitted if $w_{k+1}>w$): $$v(k+1, w)=\max\{v(k,w), v(k, w-w_{k+1})+v_{k+1}\}$$
Proof of correctness: (LHS $\geq$ RHS) By definition, there is an $I\subseteq[k]$ such that $\sum_{i\in I}w_i\leq w$ and $\sum_{i\in I}v_i=v(k,w)$. Since $I\subseteq[k+1]$, we have that $v(k+1,w)\geq v(k,w)$. Also, there is an $I\subseteq[k]$ such that $\sum_{i\in I}w_i\leq w-w_{k+1}$ and $\sum_{i\in I}v_i=v(k, w-w_{k+1})$. Taking $I'=I\cup \{k+1\}$, we have that $\sum_{i\in I'}w_i\leq w$ and $\sum_{i\in I'}v_i=v(k,w-w_{k+1})+v_{k+1}$. Thus, $v(k+1,w)\geq v(k,w-w_{k+1})+v_{k+1}$.
(LHS $\leq$ RHS) By definition, there is an $I\subseteq[k+1]$ such that $\sum_{i\in I}w_i\leq w$ and $\sum_{i\in I}v_i=v(k+1,w)$. If $k+1\notin I$, then $I\subseteq[k]$, so $v(k+1,w)\leq v(k,w)$. If $k+1\in I$, then taking $I'=I\setminus\{k+1\}$, $\sum_{i\in I'}w_i\leq w-w_{k+1}$ and $\sum_{i\in I'}v_i=v(k+1,w)-v_{k+1}\leq v(k,w-w_{k+1})$, so $v(k+1,w)\leq v(k,w-w_{k+1})+v_{k+1}$.
Runtime: Though the input size is $O(n\log C)$, this algorithm runs in $O(Cn)$ time, which is not polynomial in the input size if $C$ is very large. Therefore, we say this algorithm runs in pseudo-polynomial time (and the problem is NP-hard).
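A Python sketch of the pseudo-polynomial DP, filling a table of the $v(k,w)$ values for integer weights and capacity:
def knapsack(weights: list[int], values: list[int], C: int) -> int:
    n = len(weights)
    # v[k][w] = maximum total value from the first k objects within weight w
    v = [[0] * (C + 1) for _ in range(n + 1)]
    for k in range(n):
        for w in range(C + 1):
            v[k + 1][w] = v[k][w]        # option 1: skip object k+1
            if weights[k] <= w:          # option 2: take object k+1 if it fits
                v[k + 1][w] = max(v[k + 1][w], v[k][w - weights[k]] + values[k])
    return v[n][C]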
Longest Common Subsequence
Problem: Given two strings $A=a_1a_2\cdots a_n$ and $B=b_1b_2\cdots b_m$, find a sequence of indices $I=(i_1,i_2,\cdots, i_l)$ and $J=(j_1,j_2,\cdots, j_l)$ such that:
- for all $k\in [2,l]$, $1\leq i_{k-1}<i_k\leq n$ and $1\leq j_{k-1}<j_k\leq m$ (the sequences of indices $I$ and $J$ are in increasing order),
- for all $k\in [l]$, $a_{i_k}=b_{j_k}$ (the subsequences of $A$ and $B$ with indices $I$ and $J$ are the same), and
- the length $l$ is maximised.
Algorithm: To solve this problem, consider the following subproblems: what is the longest common subsequence given the first $i$ letters of $A$ and the first $j$ letters of $B$? We define $S(i,j)$ to be the longest common subsequence in $A_i=a_1a_2\cdots a_i$ and $B_j=b_1b_2\cdots b_j$. Then we have the following recurrence relation: $$S(i,j)=\begin{cases} 0 &\text{if $i=0$ or $j=0$}\\ S(i-1,j-1)+1 &\text{if $a_i=b_j$}\\ \max\{S(i-1,j), S(i,j-1)\} &\text{if $a_i\neq b_j$} \end{cases}$$
Proof of correctness: If $a_i=b_j$, then it is always optimal to match up $a_i$ with $b_j$ and find the longest common subsequence in $A_{i-1}$ and $B_{j-1}$ (an exchange argument shows any common subsequence of $A_i$ and $B_j$ can be modified to match them without becoming shorter). On the other hand, if $a_i\neq b_j$, then we cannot use both $a_i$ and $b_j$. If we do not use $a_i$, then our subsequence will have length at most $S(i-1,j)$. If we do not use $b_j$, then our subsequence will have length at most $S(i,j-1)$. Thus, the longest common subsequence in $A_i$ and $B_j$ has length $\max\{S(i-1,j),S(i,j-1)\}$.
Runtime: There are $mn$ subproblems $S(i,j)$. Each subproblem takes $O(1)$ time, so this algorithm runs in $O(mn)$ time.
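A Python sketch of the recurrence, computing the table of $S(i,j)$ values (returning only the length; the subsequence itself can be recovered by tracing back through the table):
def lcs_length(A: str, B: str) -> int:
    n, m = len(A), len(B)
    # S[i][j] = length of the longest common subsequence of A_i and B_j
    S = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            if A[i - 1] == B[j - 1]:              # a_i = b_j: match them up
                S[i][j] = S[i - 1][j - 1] + 1
            else:                                  # a_i != b_j: drop one of them
                S[i][j] = max(S[i - 1][j], S[i][j - 1])
    return S[n][m]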
Non-crossing Matchings
Problem: Given a graph $G$ with vertices $v_1,\cdots,v_n$, a non-crossing matching is a set of edges $M\subseteq E(G)$ such that if we arrange the vertices $v_1,\cdots,v_n$ clockwise on a circle and draw the edges as chords, no two edges cross or share an endpoint. Equivalently, for all pairs of edges $(v_i,v_j),(v_{i'},v_{j'})\in M$ where $i<j$ and $i'<j'$, neither $i<i'<j<j'$ nor $i'<i<j'<j$. Given this undirected graph, what is the largest non-crossing matching $M$?
Algorithm: To solve this problem, consider the following subproblems: For all $i<j\in[n]$, what is the size of the largest non-crossing matching $M$ containing only edges between the vertices $v_i,\cdots,v_j$? We define $M(i,j)$ to be the size of the largest non-crossing matching $M$ such that $M$ only contains edges between vertices $v_i,\cdots, v_j$. Then we have the following recurrence relation: $$M(i,j)=\max\{M(i,j-1), \max_{i':i\leq i'<j, (v_{i'},v_j)\in E(G)}(M(i,i'-1)+M(i'+1,j-1)+1)\}$$
Proof of correctness: Note that either $v_j$ is matched with a vertex $v_{i'}$, where $i'\in[i,j-1]$ and $(v_{i'},v_j)\in E(G)$, or $v_j$ is not matched with any of these vertices. In the second case, the largest non-crossing matching we can obtain has $M(i,j-1)$ edges. In the first case, the chord $(v_{i'},v_j)$ splits the remaining vertices into $v_i,\cdots,v_{i'-1}$ and $v_{i'+1},\cdots,v_{j-1}$ (no other edge may cross it), so the largest non-crossing matching we can obtain within the vertices $v_i,\cdots, v_j$ has $M(i,i'-1)+M(i'+1,j-1)+1$ edges. Thus, the largest matching we can obtain if $v_j$ is matched has $\max_{i':i\leq i'<j, (v_{i'},v_j)\in E(G)}(M(i,i'-1)+M(i'+1,j-1)+1)$ edges.
Runtime: There are $n^2$ subproblems $M(i,j)$, each of which takes $O(n)$ time to compute. As such, the total running time is $O(n^3)$.
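A Python sketch of the recurrence using memoisation; vertices are numbered $1,\dots,n$, and edges is assumed to be a set of $(i,j)$ pairs with $i<j$:
from functools import lru_cache

def largest_noncrossing_matching(n: int, edges: set[tuple[int, int]]) -> int:
    @lru_cache(maxsize=None)
    def M(i: int, j: int) -> int:
        if i >= j:
            return 0                    # fewer than two vertices: no edges possible
        best = M(i, j - 1)              # case: v_j is unmatched
        for ip in range(i, j):          # case: v_j is matched with v_{i'}
            if (ip, j) in edges:
                best = max(best, M(i, ip - 1) + M(ip + 1, j - 1) + 1)
        return best
    return M(1, n)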