Object Lifetime and Storage Management

When thinking about identifiers and bindings, the key is to distinguish identifiers from the objects they refer to, and to distinguish the following events: the creation of objects; the creation of bindings; all uses of bindings, such as references to variables, subroutines, types, and so on; the deactivation and reactivation of bindings that are temporarily unusable; the destruction of bindings; and the destruction of objects. The lifetime of a binding is the period from its creation to its destruction; an object's lifetime is defined similarly. In general an object's lifetime may exceed that of the corresponding binding, i.e. the object may still exist after an identifier no longer refers to it (for example, when a reference to an object is passed into a subroutine, as with a & parameter in C++). Conversely, a binding's lifetime may also exceed that of the corresponding object, though this is usually considered a bug. An object's lifetime generally corresponds to one of the following storage allocation mechanisms: static objects are given an absolute address that is retained throughout the program's execution; stack objects are allocated with subroutine calls and deallocated with subroutine returns, in LIFO order; heap objects may be allocated and deallocated at arbitrary times, which requires a more general and more expensive storage management algorithm.

Static allocation. The most obvious example of static allocation is global objects, but they are not the only one. The instructions that constitute a program's machine-language translation can also be regarded as statically allocated, as can numeric and string constants; in addition, many compilers generate a variety of tables to support run-time debugging, dynamic type checking, garbage collection, exception handling, and the like, and these tables are statically allocated as well. Statically allocated objects whose values should never change are often placed in protected, read-only memory, so that any attempt to modify them triggers an interrupt and raises a run-time error. In many languages, a named constant is required to have a value that can be determined at compile time; usually such initial values are restricted to known constants and built-in functions. These named constants, together with literal constants, are often called manifest constants or compile-time constants. In some languages (C, Ada), a constant is merely a value that cannot change after elaboration time; such a value may depend on other values that are not known until run time. Elaboration-time constants that are local to a recursive subroutine must be allocated on the stack. C# explicitly provides both kinds of constants, via the const and readonly keywords. In addition, compilers usually adopt particular allocation strategies for certain values associated with a subroutine: arguments and return values, which the compiler keeps in registers whenever possible; temporaries, usually the intermediate values of complex computations, which a good compiler also keeps in registers; and bookkeeping information, which typically includes the subroutine's return address.

Stack-based allocation. If a language is to support recursion, static allocation of local variables no longer suffices (Fortran did not support recursion before Fortran 90), because the number of instances of a variable that may be needed is unknown in advance. Fortunately, recursion lends itself naturally to stack-based allocation. At run time, each subroutine has a stack frame (also called an activation record) containing its arguments, local variables, temporaries, and bookkeeping information. Arguments are usually placed at the top of the frame, where the callee can locate them easily; the rest of the layout is implementation-dependent. Maintenance of the stack is the responsibility of the subroutine calling sequence, the code the caller executes before the call (the prologue) and after it (the epilogue). A frame pointer usually holds the address of the current frame, and in most language implementations the stack grows toward lower addresses. Under this convention, local variables, temporaries, and bookkeeping information have negative offsets from the frame pointer, while arguments and return values have positive offsets, because they are stored in the caller's frame. ...
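To make the three storage classes concrete, here is a minimal C++ sketch; the names (global_counter, factorial, dangling) are illustrative, not from the text above:

```cpp
#include <memory>

int global_counter = 0;           // static allocation: one absolute address
                                  // for the whole run of the program
const char *greeting = "hello";   // the string literal itself is static,
                                  // typically placed in read-only memory

int factorial(int n) {            // each recursive activation gets its own
    int local = n;                // stack frame, so 'local' is stack-allocated
    return local <= 1 ? 1 : local * factorial(local - 1);
}

int *dangling() {
    int x = 42;
    return &x;                    // BUG: the binding (the returned pointer)
}                                 // outlives the stack object it refers to

int main() {
    auto heap_obj = std::make_unique<int>(7);  // heap allocation: created and
                                               // destroyed at arbitrary times
    return factorial(5) - 120 + *heap_obj - 7 + global_counter;
}
```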

Process

The abstraction the OS provides of a running program is what we call a process. Certain APIs must be included in any interface of an operating system: create; destroy; wait; miscellaneous control (most OSes provide some way to suspend a process and later resume it); and status (there are usually interfaces to get some status information about a process). The first thing the OS must do to run a program is to load its code and any static data into memory, into the address space of the process. Once the code and static data are loaded into memory, there are a few other things the OS needs to do before running the process. Some memory must be allocated for the program's run-time stack, and the OS may also allocate some memory for the program's heap. The OS will also perform some other initialization tasks, particularly related to I/O; for example, in Unix systems each process by default has three open file descriptors, for standard input, output, and error. ...
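The APIs listed above map directly onto POSIX calls. A minimal C++ sketch of create, wait, and status, assuming a Unix-like system (the choice of running ls -l is arbitrary):

```cpp
#include <sys/wait.h>
#include <unistd.h>
#include <cstdio>

int main() {
    pid_t pid = fork();               // create: duplicate the calling process
    if (pid == 0) {
        // child: replace its image with a new program
        execlp("ls", "ls", "-l", (char *)nullptr);
        perror("execlp");             // reached only if exec fails
        return 1;
    }
    int status;
    waitpid(pid, &status, 0);         // wait: block until the child exits
    if (WIFEXITED(status))            // status: ask how the child terminated
        std::printf("child exited with %d\n", WEXITSTATUS(status));
    return 0;
}
```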

Functional Data Structure 2

Amortization Implementations with good amortized bounds are often simpler and faster than implementations with comparable worst-case bounds. Given a sequence of operations, we may wish to know the running time of the entire sequence, but not care about the running time of any individual operation. For instance, given a sequence of n operations, we may wish to bound the total running time of the sequence by O(n) without insisting that every individual operation run in O(1) time. We might be satisfied if a few operations run in O(log n) or even O(n) time, provided the total cost of the sequence is only O(n). This freedom opens up a wide design space of possible solutions, and often yields new solutions that are simpler and faster than worst-case solutions with equivalent bounds. ...
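A standard first example of this trade-off is the batched queue: an individual dequeue can cost O(n), yet any sequence of n operations costs O(n) in total. Here is an imperative C++ rendering of the idea (the book itself works in a functional language; BatchedQueue is an illustrative name):

```cpp
#include <stack>

// Two-stack queue: enqueue is always O(1); dequeue occasionally reverses
// the whole rear stack (O(n)), yet each element is moved at most once per
// enqueue, so every operation is O(1) amortized.
template <typename T>
class BatchedQueue {
    std::stack<T> front_, rear_;
public:
    void enqueue(T v) { rear_.push(std::move(v)); }

    T dequeue() {                     // precondition: queue is non-empty
        if (front_.empty()) {
            while (!rear_.empty()) {  // the rare, expensive reversal
                front_.push(std::move(rear_.top()));
                rear_.pop();
            }
        }
        T v = std::move(front_.top());
        front_.pop();
        return v;
    }
};
```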

Functional Data Structure 1

Introduction To implement data structures in a functional way, there are two basic problems. First, from the point of view of designing and implementing efficient data structures, functional programming's stricture against destructive updates (i.e. assignments) is a staggering handicap, tantamount to confiscating a master chef's knives. Imperative data structures often rely on assignments in crucial ways, and so different solutions must be found for functional programs. The second difficulty is that functional data structures are expected to be more flexible than their imperative counterparts. In particular, when we update an imperative data structure we typically accept that the old version of the data structure will no longer be available, but when we update a functional data structure, we expect that both the old and the new versions of the data structure will be available for further processing; a data structure with this property is called persistent, while one without it is called ephemeral. And we are not surprised if the persistent version is more complicated and even less efficient than the ephemeral one. ...
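To illustrate persistence, here is a minimal sketch of a persistent singly linked list in C++, using shared_ptr for the structural sharing that a garbage-collected functional language provides implicitly (cons and List are illustrative names):

```cpp
#include <memory>

// A persistent list: 'cons' builds a new version that shares its tail
// with the old one, so both versions remain usable afterwards.
template <typename T>
struct Node {
    T value;
    std::shared_ptr<const Node> next;
};

template <typename T>
using List = std::shared_ptr<const Node<T>>;

template <typename T>
List<T> cons(T v, List<T> tail) {
    return std::make_shared<const Node<T>>(Node<T>{std::move(v), std::move(tail)});
}

int main() {
    List<int> xs = cons(2, cons(3, List<int>{}));  // xs = [2, 3]
    List<int> ys = cons(1, xs);                    // ys = [1, 2, 3]
    // xs is still [2, 3]: the "update" did not destroy the old version,
    // and ys shares xs's cells instead of copying them.
    return 0;
}
```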

Address Space

The Address Space The address space is the abstraction that the OS provides of memory to the running program. The address space of a process contains all of the memory state of the running program: for example, the code of the program, the stack, and the heap. From the program's point of view, it is loaded at a particular address and has a potentially very large address space, even though it really resides at some arbitrary physical address; thus, we say that the OS is virtualizing memory. ...
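A common way to see the abstraction is to print the locations of code, heap, and stack, in the spirit of the classic demonstration; every address printed below is virtual, not physical:

```cpp
#include <cstdio>
#include <cstdlib>

int main() {
    std::printf("location of code : %p\n", (void *)main);            // code
    std::printf("location of heap : %p\n", (void *)std::malloc(100)); // heap
    int x = 3;
    std::printf("location of stack: %p\n", (void *)&x);              // stack
    return x - 3;
}
```

On a typical Unix system the code sits low in the address space, the heap above it growing upward, and the stack near the top growing downward.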

Computer System 4

Linking Linking is the process of collecting and combining various pieces of code and data into a single file that can be loaded (copied) into memory and executed. Linking can be performed at compile time, when the source code is translated into machine code; at load time, when the program is loaded into memory and executed by the loader; and even at run time, by application programs. On modern systems, linking is performed automatically by programs called linkers. ...
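A minimal sketch of what the linker resolves, assuming two translation units, main.cpp and sum.cpp (hypothetical file names), compiled separately and then combined:

```cpp
// main.cpp -- references a symbol it does not define; the reference is
// left unresolved in main.o and fixed up by the linker.
extern int sum(int *a, int n);      // defined elsewhere, in sum.cpp

int array[2] = {1, 2};

int main() {
    return sum(array, 2);           // the linker binds this call to the
}                                   // definition found in sum.o

// sum.cpp -- provides the definition the linker matches by symbol name.
int sum(int *a, int n) {
    int s = 0;
    for (int i = 0; i < n; ++i) s += a[i];
    return s;
}
```

Building with g++ -c main.cpp sum.cpp and then g++ -o prog main.o sum.o makes the two stages explicit: translation leaves the call to sum unresolved, and linking binds it.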

Computer System 3

Optimization Writing an efficient program requires several types of activities. First, we must select an appropriate set of algorithms and data structures. Second, we must write source code that the compiler can effectively optimize to turn into efficient executable code. A third technique, for dealing with especially demanding computations, is to divide a task into portions that can be computed in parallel, on some combination of multiple cores and multiple processors. ...
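A classic illustration of the second point is hoisting a loop-invariant call out of the loop condition, sketched here with a lower-casing routine (lower1/lower2 are illustrative names):

```cpp
#include <cstring>

// The compiler cannot hoist strlen out of the condition on its own: it
// must assume the loop body might change the string's length. The call
// therefore runs every iteration, making the loop quadratic overall.
void lower1(char *s) {
    for (std::size_t i = 0; i < std::strlen(s); ++i)
        if (s[i] >= 'A' && s[i] <= 'Z')
            s[i] -= ('A' - 'a');
}

// Making the invariant computation explicit in the source lets the
// generated code run in linear time.
void lower2(char *s) {
    std::size_t len = std::strlen(s);   // computed once
    for (std::size_t i = 0; i < len; ++i)
        if (s[i] >= 'A' && s[i] <= 'Z')
            s[i] -= ('A' - 'a');
}
```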

Computer System 2

Assembler Computers execute machine code: sequences of bytes encoding the low-level operations that manipulate data, manage memory, read and write data on storage devices, and communicate over networks. We will focus on x86-64, the most common machine language for the processors found in laptops and PCs, and one that is also widely used in supercomputers and large data centers. x86-64 machine code looks quite different from the corresponding C code; the following pieces of processor state are visible everywhere:
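One way to see the difference is to compile a tiny function to assembly with g++ -S; a sketch, with the kind of output a typical optimizing compiler produces shown in comments (the exact instructions vary by compiler and flags):

```cpp
// mult2.cpp -- inspect the generated x86-64 with `g++ -O1 -S mult2.cpp`.
// A typical compiler emits something like:
//
//   leaq (%rdi,%rdi), %rax   # x + x, i.e. 2 * x, into the return register
//   ret                      # pop the return address and jump to it
//
// exercising processor state with no counterpart in the C source: the
// program counter (%rip) and the integer registers (%rdi, %rax).
long mult2(long x) {
    return 2 * x;
}
```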