Things I Learned

Computer System 3

Optimization Writing an efficient program requires several type of activities. First, we must select an appropriate set of algorithms and data structures. Second, we must write source code that the compiler can effectively optimize to turn into efficient executable code. A third technique for dealing with especially demanding computations is to divide a task into portions that can be computed in parallel, on some combination of multiple cores and multiple processors. ...

Computer System 2

Assembler Computer execute machine code, sequences of bytes encoding the low-level operations that manipulate data manage memory, read and write data on storage devices, and communicate over networks. We will focus on x86-64, the commonest machine language used in processor with laptop and PC, also it’s widely used in supercomputer and lager data center. the x86-64 machine code is much different with the corresponding C code, the processor states below are everywhere: ...

Computer System 1

Hardware Organization of A System A typical system has those hadrware below: Buses: Running throughout the system is a collection of electrical conduits called buses that carry bytes of ingormation back and forth betweent the components. Buses are typically designed to transfer fiexed sized chunks of bytes know as words. The number of bytes in a word is a fundamental system parameter that varies across systems. Most have word sizes of 8 bytes (64 bits)today ...

字符串处理算法

字符串处理算法具有很高的重要性以及应用领域的多样性. 以下讨论默认的是扩展的ASCII字符集(R=256). 字符串排序算法对于许多排序应用而言, 决定顺序的键都是字符串. 利用字符串的特殊性质来将其排序通常能够获得更高的效率. 键索引计数法我们首先介绍一种适用于小整数键的简单排序算法. 1 2 3 4 5 6 7 8 9 10 11 12 13 14 public class KeyIndexedCount { public static void sort(char[] chr, int R) { int[] count = new int[R + 1]; char[] temp = new char[chr.length]; for (char aChr : chr) { count[aChr + 1]++; } for (int i = 0; i < R; i++) { count[i + 1] += count[i]; } for (char aChr : chr) { temp[count[aChr]] = aChr; count[aChr]++; } System.arraycopy(temp, 0, chr, 0, chr.length); } public static void sort(char chr[]) { sort(chr, 256); } } 进行频率统计, 将频率转换为索引, 任意一个键的起始位置总是在比它小的键的频率之和的后面, 用一个辅助数组进行排序, 将排序好的内容回写到原来的数组. 很容易发现, 算法仅仅使用了四个循环, 所以其时间复杂度是线性的(四次循环共访问数组10N+3R次). 上述算法的另一个显而易见的特性是排序的稳定性(因为该算法按顺序遍历数组来排序). ...

图算法

采用哪种数据结构来表示图, 主要考虑以下两个方面: 必须为可能在应用中碰到的各种类型的图预留出足够的空间, 以及 Graph的实例方法一定要高效. 常见的表示方法有以下几种: 邻接矩阵, 对于有N个顶点的图而言, 邻接矩阵需要$ N^2 $的空间. 边的数组, 通过定义边来定义图, 这种方法在寻找相邻的点时需要遍历整个数组. 邻接表数组, 使用一个顶点为索引的列表数组, 其中每个元素都是和该顶点相邻的顶点列表. 这种结构能够满足上述的两个条件. 常见实现的性能比较(V表示结点数, E表示边数): 数据结构所需空间添加边检查顶点是否相邻遍历所有相邻顶点边的列表 E 1 E E 邻接矩阵 $ V^2 $ 1 1 V 邻接表 E+V 1 degree(V) degree(V) 非稠密图的标准表示是邻接表, 这种表示具有以下特性: 使用的空间和V+E成正比, 添加一条边所需要的时间为常数, 以及遍历顶点v的所有相邻顶点所需要的时间和v的度数成正比. 无向图无向图只定义了顶点以及顶点之间的关系. 深度优先搜索 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 public class DepthFirstSearch { private boolean[] marked; private int count; public DepthFirstSearch(Graph G, int s) { marked = new boolean[G.V()]; dfs(G, s); } private void dfs(Graph G, int v) { marked[v] = true; count++; for (int w : G.adj(v)) { if (!marked[w]) { dfs(G, w); } } } public boolean marked(int w) { return marked[w]; } public int count() { return count; } } 深度优先搜索能够所有与起点相连的顶点, 且所需时间和所有连通顶点的度数之和成正比. ...

查找算法

符号表是一种存储键值对的数据结构, 支持插入和查找操作. 要确定一个给定的键是否存在于符号表中, 首先要建立对象等价性的概念. 为了保证一致性, 最好选择不可变的数据类型作为键. 成本模型: 统计比较的次数, 在内循环不进行比较的情况下, 转为统计数组的访问次数. 二叉查找树每个结点的键都大于其左子树的任意结点的键而小于右子树的任一结点的键. 一棵二叉查找树代表了一组键(及其相应的值)的集合, 同一个集合可以用多棵不同的二叉树表示, 如果将所有键都投影到一条直线上, 则可以保证得到一个有序的键列. 查找的惯用做法是: 如果含有该键的结点存在于表中, 则返回相应的值或者返回null. 我们将采用Java 8的Optional类型使得查找的结果总是返回Optional, 将显示的调用交给使用者. 使用二叉查找树的算法的运行时间取决于树的形状. 在最好的情况下, 一棵含有N个结点的树是完全平衡的, 每条空链接和根结点的距离都为lgN; 而在最坏的情况下树的高度为N. 当键的分布是随机的情况下, 二叉查找树一般具有较好的平衡性, 此时查找或者插入操作平均所需的次数为2lnN(1.39lgN). 虽然二叉查找树的查找成本比二分查找高约39%, 但插入操作是对数级别的. 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 public class BST<Key extends Comparable<Key>, Value> { private Node root; private class Node { private Node left, right; private Key key; private Value val; private int N; Node(Key key, Value val, int N) { this.key = key; this.val = val; this.N = N; } } public int size() { return size(root); } private int size(Node node) { if (node == null) return 0; return node.N; } public Optional<Value> get(Key key) { return Optional.of(get(root, key)); } private Value get(Node node, Key key) { if (node == null) return null; int cmp = key.compareTo(node.key); if (cmp > 0) get(node.right, key); else if (cmp < 0) get(node.left, key); return node.val; } public void put(Key key, Value val) { root = put(root, key, val); } private Node put(Node node, Key key, Value val) { if (node == null) return new Node(key, val, 1); int cmp = key.compareTo(node.key); if (cmp > 0) node.right = put(node.right, key, val); else if (cmp < 0) node.left = put(node.left, key, val); else node.val = val; node.N = size(node.left) + size(node.right) + 1; return node; } } 平衡查找树 2-3查找树一棵2-3查找树或为一棵空树, 或者由以下结点组成: ...

优先队列

优先队列当一颗二叉树的每个节点都大于等于它的两个子结点时, 它被称为堆有序, 于是从任一结点向上都能够得到一列非递减的元素. 二叉堆(以下简称堆)是一组能够用堆有序的完全二叉树排序的元素, 并在数组中按照层级存储. 在一个堆中, 位置k的节点的父结点的位置为$ \lfloor k/2 \rfloor $, 而它的两个子节点的位置是2k和 2k+1. 对于一个含有N个元素的堆, 插入元素操作只需不超过lgN+1次比较, 删除最大元素的操作不超过2lgN次比较. 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 public class MaxPQ<Key extends Comparable<Key>> { private Key[] pq; private int N = 0; @SuppressWarnings("unchecked") private Key[] cast(Object obj) { return (Key[]) obj; } private boolean less(int i, int j) { return pq[i].compareTo(pq[j]) < 0; } private void exch(int i, int j) { Key t = pq[i]; pq[i] = pq[j]; pq[j] = t; } private void swim(int k) { while (k > 1 && less(k / 2, k)) { exch(k, k / 2); k = k / 2; } } private void sink(int k) { while (2 * k <= N) { int j = 2 * k; if (j < N && less(j, j + 1)) j++; if (!less(k, j)) break; exch(k, j); k = j; } } private void resize(int newSize) { Key[] t = cast(new Comparable[newSize]); System.arraycopy(pq, 1, t, 1, N); pq = t; } public MaxPQ() { pq = cast(new Comparable[1]); } public MaxPQ(int max) { pq = cast(new Comparable[max + 1]); } public MaxPQ(Key[] a) { } public void insert(Key v) { if (N >= pq.length - 1) resize(pq.length * 2); pq[++N] = v; swim(N); } public Key max() { return pq[1]; } public Key delMax() { if (N < (pq.length - 1) / 2) resize(pq.length / 2); Key max = pq[1]; exch(1, N); pq[N--] = null; sink(1); return max; } public boolean isEmpty() { return N == 0; } public int size() { return N; } } 索引优先队列索引优先队列可以看成一个能够快速访问其中最小元素的数组, 事实上, 它能够快速访问数组的一个特定子集中的最小元素. ...

排序算法

排序算法的目标就是将所有元素的主键按照某种方式排列. 成本模型: 计算比较和交换的次数, 对于不交换元素的算法, 计算访问数组的次数. 排序算法的额外内存开销和运行时间是同等重要的, 排序算法可以分为原地排序算法以及需要额外内存空间来存储另一份数组副本的其他排序算法. 我们使用的算法模板如下 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 public interface Sort { static void sort(Comparable[] a) { } static boolean less(Comparable v, Comparable w) { return v.compareTo(w) < 0; } static void exch(Comparable[] a, int i, int j) { Comparable t = a[i]; a[i] = a[j]; a[j] = t; } static void show(Comparable[] a) { for (Comparable x : a) { StdOut.print(x + " "); StdOut.println(); } } static boolean isSorted(Comparable[] a) { for (int i = 1; i < a.length; i++) { if (less(a[i], a[i - 1])) return false; } return true; } static void main(String[] args) { String[] a = StdIn.readAllStrings(); sort(a); assert isSorted(a); show(a); } } 选择排序最简单的算法是选择排序, 不断寻找数组剩下元素的最小值, 并将其放在适当的位置. 其特点在于: ...