This post summarizes several common concepts and algorithms that can be helpful for coding interview. The concept is taken from the Book: Cracking the Coding Interview.

Bit Manipulation

Bit operations: AND (&), OR (|), XOR (^), NOT (~), shift (<<,>>). Left shifting by 1 corresponds to a multiplication of 2.

  • AND: x & 0s = 0, x & 1s = x, x & x = x;
  • OR: x | 0s = x, x | 1s = 1s, x | x = x;
  • XOR: x ^ 0s = x, x ^ 1s = ~x, x ^ x = 0.

remove the least significant bit (LSB)
Given an integer num, consider its binary representation, how do we set its LBS from 1 to 0? The trick is num & (nums-1).

Representation of Negative Numbers

Suppose there is a N-bit integer, to represent a negative number, the sign bit is set to 1. We then have N-1 bits to represent values. The negative number is represented as the complement of its absolute number with respect to 2^{N-1}.

If N=4, then -3 is represented as 1101, while 3 represented as 0011. In other words, -K is represented as concat(1, 2^{N-1}-k).

From the bit operation, to represent -K, we invert the bits of K and then add 1, that is: ~K + 1.

The range of numbers represented by a N-bit number is [-2^{N-1}, 2^{N-1}-1]. It is interesting to know that 1s (all 1 in every bit) is -1, while 10s (1 followed by N-1 0s) is -2^{N-1}.

Arithmetic and Logical Right Shift

  • Logical Right Shift: shift the bits and put a 0 in the most significant bit. Note this is indicated with a >>> operator.
  • Arithmetic Right Shift: shift bits and fill in the most significant bit with the value of the sign bit. Indicated by a >> operator. This has the effect of dividing by 2 (taking the lower bound).

Prime Numbers

  • Every positive integer can be decomposed into a product of primes.
  • All non-prime numbers are divisible by a prime number other than itself.

Sorting

Bubble Sort | Time: O(N^2). Memory: O(1)

We aim to put the minimum number at the beginning of array by comparing the number at 0 index with the rest and swapping them if the number at 0 is larger. Then, repeat this process for the index 1, 2 until the end of array.

Selection Sort | Time: O(N^2). Memory: O(1)

We find the minimum number simply by scanning through the whole array, pick the min, and swap it with the number at the index of 0. Repeat this process for the second smallest number,....

Insertion Sort | Time: O(N^2). Memory: O(1)

The basic idea is that assuming we have a sorted array, and given a new number, we simply need to find the correct position in the sorted array for this new number and insert.

Start from the second element of a unsorted array. (Note that the left side array is trivially sorted.) Compare the current element to numbers on the left side (sorted array) from right to left, when we find the current element is larger than a number at index i, then insert the current element just after the index i. Repeat this process for the third element, ....

Merge Sort | Time: O(N logN). Memory: O(N)

This is a recursive algorithm. Given a unsorted array, divide it into two halves, sort these two halves, and then merge them together (O(N)). To sort those halves, the same merge sort algorithm is applied.

PriorityQueue Sort | Time: O(N logN). Memory: O(N)

Simply add all elements to a priority queue, and then extract elements.

Counting Sort | Time: O(N+k). Memory: O(k)

Suppose the range of numbers is known to be within [0,k], and k is not too large.

Initialize a k-length count array C. Scan through the unsorted array, for each number n, add by for C[n]. Then simply scan through the count array and print according to C[n].

Radix Sort | Time: O(cN). Memory: O(N)

Suppose k=N^c. Represent all numbers in the base of N. The number of bits to represent a number is: log_{N}k = c. To sort, we first sort numbers by the least significant digit by counting sort, and sort number by the next-least significant digit, and .... Therefore, we have the time complexity O(c(N+N)) = O(cN).

Graph Search

We skip the discussions of DFS and BFS, which are covered in this post.

Topological Sort

A topological sort of a directed graph is a way of ordering the list of nodes such that if (a,b) is an edge in the graph then a will appear before b in the list. The topological sort can be helpful, for example, in ordering the assembly line.

The key is to maintain the "ancestor/descendant" relation as indicated in the graph. Therefore, we can first find a node without any incoming edge, and start with this node to run a DFS while printing nodes we have visited in order.

Another way without the need of finding the node without incoming edges is to run a DFS for each node (only if we have not visited), while using a stack to collect nodes. And then print them by popping nodes from the stack at the end. The detailed implementation is below (generated by ChatGPT 4).

import java.util.*;

public class TopologicalSort {
    private int numVertices;
    private ArrayList<ArrayList<Integer>> adj;

    // Constructor
    public TopologicalSort(int numVertices) {
        this.numVertices = numVertices;
        adj = new ArrayList<ArrayList<Integer>>(numVertices);
        for (int i = 0; i < numVertices; i++)
            adj.add(new ArrayList<Integer>());
    }

    // Function to add an edge into the graph
    public void addEdge(int v, int w) {
        adj.get(v).add(w);
    }

    // A recursive function used by topologicalSort
    private void topologicalSortUtil(int v, boolean visited[], Stack<Integer> stack) {
        // Mark the current node as visited.
        visited[v] = true;
        Integer i;

        // Recur for all the vertices adjacent to this vertex
        for (Integer neighbor : adj.get(v)) {
            if (!visited[neighbor])
                topologicalSortUtil(neighbor, visited, stack);
        }

        // Push current vertex to stack which stores result
        stack.push(new Integer(v));
    }

    // The function to do Topological Sort. It uses recursive topologicalSortUtil()
    public void topologicalSort() {
        Stack<Integer> stack = new Stack<Integer>();

        // Mark all the vertices as not visited
        boolean visited[] = new boolean[numVertices];
        for (int i = 0; i < numVertices; i++)
            visited[i] = false;

        // Call the recursive helper function to store Topological Sort
        // starting from all vertices one by one
        for (int i = 0; i < numVertices; i++)
            if (!visited[i])
                topologicalSortUtil(i, visited, stack);

        // Print contents of stack
        while (!stack.isEmpty())
            System.out.print(stack.pop() + " ");
        System.out.println();
    }

    public static void main(String args[]) {
        // Create a graph given in the above diagram
        TopologicalSort g = new TopologicalSort(6);
        g.addEdge(5, 2);
        g.addEdge(5, 0);
        g.addEdge(4, 0);
        g.addEdge(4, 1);
        g.addEdge(2, 3);
        g.addEdge(3, 1);

        System.out.println("Following is a Topological sort of the given graph");
        // Function Call
        g.topologicalSort();
    }
}

Note that in the above code, we always push the descendants first while root last. It works in the similar way as the Post-Order Traversal of the tree.

Topological sort basically needs to realize the Pre-Order Traversal of tree on a graph.

Dijkstra's Algorithm

When a graph has (positive) weights on each edge, we want to find the shortest path between two points in a weighted directed graph.

Suppose we want to go from the node S to another node D. We need a priority queue to maintain the current total weight from the source S for each node. Initially, total weights are set to infinity for all node except for S, which is 0. As long as the queue is not empty, we take the node n with the minimum weight (this weight is the min weight from the source to this node), and loop over its adjacent nodes v, check whether the current weight in the queue d[v] is larger than d[n]+w(n,v), if so, update d[w].

In sum, we need |V| extract_min operations and |E| update operations. If the priority queue is implemented with a min_heap, then time complexity is O(log|V|(|V|+|E|)).

Bellman-Ford Algorithm

The algorithm can be applied to a directed graph with negative weights. Note that if it contains negative cycle, the algorithm is capable to detect.

Similarly, we need to maintain an array holding the total weight from the source S for each node, which are set to infinity for all node except for S. Then the algorithm works as below:

for i = 1 to |V|-1
    for each edge (u,v) in E
        relax(u,v,W)
    end
end
// now we should have the shortest path, if the graph contains a negative cycle, the code below could detect this
for each edge (u,v) in E
    if d[v] > d[u] + w(u,v)
        report that negative cycle exists!
    end
end

The time complexity is O(|V||E|).

Binary Search

Given a sorted array arr and a number num, there are a few different cases:

Find the index of num in arr, return -1 if not found

Code 1

int l = 0, r = arr.length-1;
while(l<=r) {
    int mid = l + (r-l)/2;
    if(arr[mid]==num) return mid;
    if(arr[mid]>num) r=mid-1;
    else l=mid+1;
}
// if reach here, means not found
return -1;

Note the range of l, r, termination condition of while loop and the update of l and r.

Find the insertion index of num in arr to keep arr sorted

Here comes two cases, consider an array 1 | 3 | 5, if num=4

  • right-most insertion will insert 4 at the index 2, in other words, we find the minimum of numbers that are larger or equal to num.
  • left-most insertion will insert 4 at the index 1, in other words, we find the maximum of numbers that are smaller or equal to num.

A slight change of the above code would achieve this goal.

Code 2

int l = 0, r = arr.length-1;
while(l<=r) {
    int mid = l + (r-l)/2;
    if(arr[mid]==num) return mid;
    if(arr[mid]>num) r=mid-1;
    else l=mid+1;
}
// if use right-most insertion
return l;
// if use left-most insertion
return r;

Note that in the above code, when the code reaches after while loop, we have r<l, or specifically, r=l-1. The range of l is [0, arr.length], while the range of r is [-1, arr.length-1]. It is worth mentioning that the l pointer is used to search the right half of the array when the middle element is less than the target. If the target is not found, l will eventually point to the smallest element greater than the target. On the other hand, the r pointer is used to search the left half of the array when the middle element is greater than the target. After the loop, if the target is not found, r will point to the largest element less than the target.

Duplicate numbers in arr

Note that if numbers in arr are not unique, the returned index is not unique if the target number occurs multiple times in arr.

Now suppose we have an array 1|3|3|5, if num=3, we want to return index of 2 (the right-most element of 3), the above code is not working anymore. Suppose we want to find the left-most insertion or the index of right-most element of target, we can change this problem to: find the minimum of all numbers that are strictly larger than num, and return the found index minus one. For example, the following code aims to find the minimum of all numbers that are strictly larger than num, as indicated by l pointer.

Code 3

int l=0, r=arr.length;
while(l<r) {
    int mid = l + (r-l)/2;
    if(arr[mid]>num) r = mid;
    else l = mid+1;
}
// return the index of the minimum number (left-most number when duplicate numbers) that is larger than num
return l;

Note that here the update condition for r is r = mid; without decrementing mid by one. Once we use such update conditions, the terminating condition of while loop can not be while(l<=r) since this would lead to infinite loop when mid==l==r and arr[mid]>num. Instead, we use the condition while(l<r). Another thing to note is that r is initialized to arr.length instead of arr.length-1. If for example, we set r=arr.length-1, which would lead to the issue: mid will never point to arr.length-1, since this requires l point to arr.length-1, which would exit from while loop. In other word, we never check the last element of arr. The range of l is [0, arr.length-1]. If num=6, then we would get the index of 3 instead of 4. In order to check all elements in arr, we initialize r to arr.length. In this case, the range of l is [0, arr.length]. When exiting from while loop, we have l==r, so either return l; or return r; works.

In fact, to find the index of minimum number that is larger than num, we can still use the original terminating condition of while loop: while(l<=r). We could modify the code as below:

Code 4

int l = 0, r = arr.length-1;
int result = arr.length;
while(l<=r) {
    int mid = l + (r-l)/2;
    if(arr[mid]>num) {
        result = mid;
        r = mid-1;
    }else l = mid+1;
}
return result;

Note that in the previous code 3, the update condition for r is r = mid; instead of r = mid-1; since mid itself could be the answer. In order to use the update condition r = mid-1; (that is, discarding mid), we use an additional variable result to log these possible candidates. And everything else is the same as Code 2.

Similarly, to find the index of largest number (right-most number when duplicate numbers) that is smaller or equal to num:

Code 5

int l = 0, r = arr.length-1;
int result = -1;
while(l<=r) {
    int mid = l + (r-l)/2;
    if(arr[mid]<=num) {
        result = mid;
        l=mid+1;
    }else r=mid-1;
}
return result;

Finally, note that the update condition for l can never be l = mid; whatever in case of while(l<r) or while(l<=r), as this will lead to infinite loop again.