CSAPP-Data Lab

发布于 2022-03-27  103 次阅读


Introduction

Students implement simple logical, two's complement, and floating point functions, 
but using a highly restricted subset of C. For example, they might be asked to compute 
the absolute value of a number using only bit-level operations and straightline code. 
This lab helps students understand the bit-level representations of C data types and 
the bit-level behavior of the operations on data.
学生实现简单的逻辑, 二进制补码和浮点函数, 但使用高度受限的C子集. 例如, 他们会被要求仅使用位级
运算和直线代码去计算一个数字的绝对值。这个实验能帮助学生理解C数据类型的位级表示以及操作数据的
位级行为。

一:Overview

In this lab, students work on a C file, called bits.c, that consists
of a series of programming "puzzles".  Each puzzle is an empty
function body that must be completed so that it implements a specified
mathematical function, such as "absolute value". Students must solve
the non-floating point puzzles using only straight-line C code and a
restricted set of C arithmetic and logical operators. For the
floating-point puzzles they can use conditionals and arbitrary
operators.

Students use the following three tools to check their work.
Instructors use the same tools to assign grades.

1. dlc: A "data lab compiler" that checks each function in bits.c for
compliance with the coding guidelines, checking that the students use
less than the maximum number of operators, that they use only
straight-line code, and that they use only legal operators. The
sources and a Linux binary are included with the lab.
Running './dlc -z' to identify coding rules violations.

2. btest: A test harness that checks the functions in bits.c for
correctness. This tool has been significantly improved, now checking
wide swaths around the edge cases for integers and floating point
representations such as 0, Tmin, denorm-norm boundary, and inf.
Compiling and running './btest -g' to determine correctness score.

3. driver.pl: A autograding driver program that uses dlc and btest to
check each test function in bits.c for correctness and adherence to
the coding guidelines
在这个lab中, 学生处理叫做bits.c的C文件, 其中包含一系列编程谜题. 每个谜题都是一个空的且必须完
成的函数体以便它实现指定的数学函数, 例如绝对值.学生必须仅使用straight-line的C代码和一个受限的
C算术和逻辑运算符来解决非浮点难题.对于浮点数谜题,他们可以使用条件和任意运算符。

学生使用以下三个工具来检查他们的工作。
教师使用相同的工具来分配成绩。

1. dlc: 一个data lab compiler, 它检查bits.c中的每个函数是否符合编程规则,  检查学生使用的运算
符数是否少于最大数量,检查他们是否只使用直线代码,并且检查他们是否只使用了合法的操作符. 源代码
和二进制文件包含在实验中. 
运行'./dlc -z'去识别编码规则违规.

2. btest: 一个测试工具,用于检查bits.c中的函数的正确性. 此工具已得到显着改进,现在可以检查整
数和浮点表示(例如 0、Tmin、denorm-norm 边界和 inf)的边缘情况周围的宽幅。
编译并运行 './btest -g' 以确定正确性分数. 

3. driver.pl: 自动分级驱动程序,使用dlc和btest检查bits.c中的每个测试函数的正确性和是否符
合编程规则。

二:Files

All CS:APP labs have the same simple top-level directory structure:

Makefile	   Builds the entire lab.
README		   This file.
src/		   Contains all source files for the lab.
datalab-handout/   Handout directory that goes to the students. Generated
		   by the Makefile from files in ./src. Never modify anything
		   in this directory. 
grade/		   Autograding scripts that instructors can use to 
		   grade student handins.
writeup/	   Sample Latex lab writeup.
contest/           Everything needed for the optional "Beat the Prof" contest.

三:Building the Lab

Step 0. If you decide to run the "Beat the Prof" contest (section 5),
then edit the ./contest/Contest.pm file so that the driver knows where
to send the results. See ./contest/README for the simple
instructions. If you decide *not* to offer the contest, then do
nothing in this step.

Step 1. Select the puzzles you want to include by editing the file
./src/selections.c.

The default ./src/selections.c comes from a previous instance of the
Data Lab at CMU.  The file ./src/selections-all.c contains the
complete list of puzzles to choose from.

Step 2. Modify the Latex lab writeup in ./writeup/datalab.tex to 
tailor it for your course. 

Step 3. Type the following in the current directory:
     unix> make clean
     unix> make 

The Makefile generates the btest source files, builds the dlc binary
(if necessary), formats the lab writeup, and then copies btest, the
dlc binary, and the driver to the handout directory.  After that, it
builds a tarfile of the handout directory (in ./datalab-handout.tar)
that you can then hand out to students.

Note on Binary Portability: dlc is distributed in the datalab-handout
directory as a binary. Linux binaries are not always portable across
distributions due to different versions of dynamic libraries. You'll
need to be careful to compile dlc on a machine that is compatible with
those that the students will be using.

Note: Running "make" also automatically generates the solutions to the
puzzles, which you can find in ./src/bits.c and ./src/bits.c-solution.
cd datalab-handout
make
如果报错bits.h未找到, sudo apt-get install gcc-multilib之后再make即可

四:bits.c

/*
 * Instructions to Students:
 *
 * STEP 1: Read the following instructions carefully.
 */

You will provide your solution to the Data Lab by
editing the collection of functions in this source file.

INTEGER CODING RULES:
 
  Replace the "return" statement in each function with one
  or more lines of C code that implements the function. Your code 
  must conform to the following style:
 
  int Funct(arg1, arg2, ...) {
      /* brief description of how your implementation works */
      int var1 = Expr1;
      ...
      int varM = ExprM;

      varJ = ExprJ;
      ...
      varN = ExprN;
      return ExprR;
  }

  Each "Expr" is an expression using ONLY the following:
  1. Integer constants 0 through 255 (0xFF), inclusive. You are
      not allowed to use big constants such as 0xffffffff.
  2. Function arguments and local variables (no global variables).
  3. Unary integer operations ! ~
  4. Binary integer operations & ^ | + << >>
    
  Some of the problems restrict the set of allowed operators even further.
  Each "Expr" may consist of multiple operators. You are not restricted to
  one operator per line.

  You are expressly forbidden to:
  1. Use any control constructs such as if, do, while, for, switch, etc.
  2. Define or use any macros.
  3. Define any additional functions in this file.
  4. Call any functions.
  5. Use any other operations, such as &&, ||, -, or ?:
  6. Use any form of casting.
  7. Use any data type other than int.  This implies that you
     cannot use arrays, structs, or unions.

 
  You may assume that your machine:
  1. Uses 2s complement, 32-bit representations of integers.
  2. Performs right shifts arithmetically.
  3. Has unpredictable behavior when shifting if the shift amount
     is less than 0 or greater than 31.

EXAMPLES OF ACCEPTABLE CODING STYLE:
  /*
   * pow2plus1 - returns 2^x + 1, where 0 <= x <= 31
   */
  int pow2plus1(int x) {
     /* exploit ability of shifts to compute powers of 2 */
     return (1 << x) + 1;
  }

  /*
   * pow2plus4 - returns 2^x + 4, where 0 <= x <= 31
   */
  int pow2plus4(int x) {
     /* exploit ability of shifts to compute powers of 2 */
     int result = (1 << x);
     result += 4;
     return result;
  }

FLOATING POINT CODING RULES

For the problems that require you to implement floating-point operations,
the coding rules are less strict.  You are allowed to use looping and
conditional control.  You are allowed to use both ints and unsigneds.
You can use arbitrary integer and unsigned constants. You can use any arithmetic,
logical, or comparison operations on int or unsigned data.

You are expressly forbidden to:
  1. Define or use any macros.
  2. Define any additional functions in this file.
  3. Call any functions.
  4. Use any form of casting.
  5. Use any data type other than int or unsigned.  This means that you
     cannot use arrays, structs, or unions.
  6. Use any floating point data types, operations, or constants.

NOTES:
  1. Use the dlc (data lab checker) compiler (described in the handout) to 
     check the legality of your solutions.
  2. Each function has a maximum number of operations (integer, logical,
     or comparison) that you are allowed to use for your implementation
     of the function.  The max operator count is checked by dlc.
     Note that assignment ('=') is not counted; you may use as many of
     these as you want without penalty.
  3. Use the btest test harness to check your functions for correctness.
  4. Use the BDD checker to formally verify your functions
  5. The maximum number of ops for each function is given in the
     header comment for each function. If there are any inconsistencies 
     between the maximum ops in the writeup and in this file, consider
     this file the authoritative source.

/*
 * STEP 2: Modify the following functions according the coding rules.
 * 
 *   IMPORTANT. TO AVOID GRADING SURPRISES:
 *   1. Use the dlc compiler to check that your solutions conform
 *      to the coding rules.
 *   2. Use the BDD checker to formally verify that your solutions produce 
 *      the correct answers.
 */
/*
 * 学生须知:
 *
 * 第 1 步:仔细阅读以下说明。
 */

您将通过编辑此源文件中的函数向Data Lab提供您的解决方案。

整数编码规则:
	将每个函数中的“return”语句替换为一个或多行实现该功能的 C 代码。你的代码
  必须符合以下风格:
	int Funct(arg1, arg2, ...) {
      /*简要描述你的实现是如何工作的*/
      int var1 = Expr1;
      ...
      int varM = ExprM;

      varJ = ExprJ;
      ...
      varN = ExprN;
      return ExprR;
  }
	每个“Expr”都是一个仅使用以下内容的表达式:
	1. 整数常量 0 到 255 (0xFF),包括端点。你是
      不允许使用大常量,例如 0xffffffff。
	2. 函数参数和局部变量(无全局变量)。
	3. 一元整数运算! ~
  4. 二元整数运算&^| + << >>

	一些问题甚至进一步限制了允许的运算符集。每个“Expr”可能由多个运算符组成。
	您不限于每行一个操作符。
明确禁止:
  1. 使用任何控制结构,例如 if、do、while、for、switch 等。
  2. 定义或使用任何宏。
  3. 在此文件中定义任何附加功能。
  4. 调用任何函数。
  5. 使用任何其他操作,例如 &&、||、- 或 ?:
  6. 使用任何形式的铸造。
  7. 使用除int以外的任何数据类型。这意味着你
     不能使用数组、结构或联合。

可以假设机器:
  1. 使用 2s 补码,整数的 32 位表示。
  2. 以算术方式执行右移。
  3. 如果移位量小于 0 或大于 31,则在移位时具有不可预测的行为。

可接受的编码风格示例:
  /*
   * pow2plus1 - 返回 2^x + 1,其中 0 <= x <= 31
   */
  int pow2plus1(int x) {
     /* 利用移位的能力来计算 2 的幂 */
     返回 (1 << x) + 1;
  }
	/*
   * pow2plus4 - returns 2^x + 4, where 0 <= x <= 31
   */
  int pow2plus4(int x) {
     /* exploit ability of shifts to compute powers of 2 */
     int result = (1 << x);
     result += 4;
     return result;
  }

浮点编码规则:
对于需要实现浮点运算的问题,编码规则不那么严格。您可以使用循环和条件控制。您可以同时使用整
数和无符号数。您可以使用任意整数和无符号常量。您可以对 int 或无符号数据使用任何算术、逻辑
或比较运算。

明确禁止:
  1. 定义或使用任何宏。
  2. 在此文件中定义任何附加功能。
  3. 调用任何函数。
  4. 使用任何形式的转换。
  5. 使用除int或unsigned之外的任何数据类型。这意味着你
     不能使用数组、结构或联合。
  6. 使用任何浮点数据类型、运算或常量。

NOTES:
	1. 使用 dlc(data lab checker)编译器(在讲义中描述)来检查您的解决方案的合法性。
	2. 每个函数都有一个最大数量的操作(整数、逻辑或比较),您可以用于实现该函数。
	最大运算符计数由dlc检查。注意赋值 ('=') 不计算在内;您可以根据需要使用任意数量的这些,
	而不会受到惩罚。
	3. 使用 btest 测试工具检查您的功能是否正确。
	4.使用BDD检查器正式验证你的函数。
	5. 每个函数的最大操作数在每个函数的标题注释中给出。如果writeup中的最大操作数与此文件中
	的最大操作数之间存在任何不一致,请将此文件视为权威来源。

/*
 * STEP 2:根据编码规则修改以下函数。
 *
 * 重要的。为避免评分意外:
 * 1.使用dlc编译器检查您的解决方案是否符合
 * 编码规则。
 * 2. 使用BDD检查器正式验证您的解决方案是否产生
 * 正确答案。
 */
  1. 异或的表示:
int bitXor(int x, int y) {
    //x | y = ~(~x & ~y)
    return ~(~(~x & y) & ~(x & ~y));
    //(~x & y) | (x & ~y)
    //(a | b) & (~a | ~b)
    //(a | b) & ~(a & b)
}
  1. 返回整型的最小二进制补码
int tmin(void)
{
	return 1 << 31;     //0x80000000
}
  1. 判断是否是int有符号整型的最大值(0x7fffffff)

利用+1后溢出来判断, 但是-1, 也就是0xffffffff,也是+1后溢出,但是抓住两个溢出后的不同点,如果是0x7fffffff(0b01111111111111111111111111111111),加1后变成(0b10000000000000000000000000000000)(-0x7fffffff),即变成自己的反,但是没有溢出int的四字节大小范围,而0xffffffff(-1),溢出后直接所有的bit位都变为0,变为(0b00000000000000000000000000000000)。

所以 !!(~x)就可以区分开这两个,~是bit位取反,而!是值取反,所以0xffffffff进行!!(~x)运算后返回0,0x7fffffff进行!!(~x)运算后返回1。

int isTmax(int x)
{
    int a = !(~(x + x + 1));
    int b = !!(~x);
    return a & b;
}
  1. 判断所有的奇数位是否为1

首先我们就要构造一个奇数位上全是1,偶数位上全是0的数(0xaaaaaaaa)(0b1010 1010 1010 1010 1010 1010 1010 1010),然后将值与这个数进行异或,异或后所有的奇数位都变为0

int allOddBits(int x) {
    int temp = 0xaa | (0xaa << 8) | (0xaa << 16) | (0xaa << 24);    //构造出奇数位全为1, 偶数位全为0的数
    int judge = temp & x;  //先进行&操作得到偶数位全为0,奇数位不变的数, 如果x的奇数位全为1,那么这里得到的judge就是和temp相等
    int result = !(temp ^ judge);  //相同的值进行异或后为0, !取反后就为1
    return result;
}
  1. 返回相反数(二进制位取反再加1)
int negate(int x) {
  return ~x + 1;
}
  1. 判断是否属于数字'0'-'9'的ASCII码范围

先判断二进制位在倒数下标4,5的值是否是1,以及前面的bit位是否是0,不是则先排除

之后再判断倒数下标3的值是否是0,是0则后面的bit不管是0还是1都可以(30-37)

如果倒数的下标3的值为1,那么倒数1,2下标都必须为0,倒数0,1下标随意(38-39)

int isAsciiDigit(int x)
{
    int r1 = 0xf; //0b1111
    int r2 = 0x30; //0b110000
    int jud1 = !(~((~((r1 | x) ^ r2)) | r1));
    int jud2 = !(x & (1 << 3));
    int jud3 = !(x & 6);
    return jud1 & (jud2 | ((!jud2) & jud3));
}
  1. 实现?条件运算符
conditional - same as x ? y : z
Example: conditional(2,4,5) = 4
int conditional(int x, int y, int z)
{
    int a = (!!x) + (~1 + 1);     //(~1 + 1)得到0xffffffff,如果x有值,!!x为1,如果x为0,!!x为0
    //这个式子是x有值,返回0(a=0),如果x为0,返回0xffffffff(a=0xffffffff)

    //当x有值时, a为0,我们必须返回y,可以(y & ~a),这样可以得到y
    //当x为0时,a为0xffffffff,我们必须返回z,可以(z & a),这样可以得到z
    return (y & ~a) | (z & a);//最后用或来进行连接,因为x要么为0,要么为1
}
  1. 实现≤返回1,否则返回0

如果符号不相等,且y的符号位为0,那么显然y>x,返回1 (0和正数 > 负数)

如果符号相等,如果y≥x,对x取反后,加上y,肯定是≥0的,不管正负,符号位都是0,>>31后变为全为0,然后!对值取反即可得到1

int isLessOrEqual(int x, int y) {
  int sy = (y >> 31);      //得到y的符号位
  int signSame = (sy ^ (x >> 31)); //判断x的符号位和y的符号位是否相等
  int xor = signSame & (!sy); //当符号位不同且y为正, 肯定y就大于或等于x
  int same = !((y + (~x + 1)) >> 31) & (!signSame);
  //当符号位相同时, 如果y>=x, 对x取反, 相加后肯定大于等于0, 符号位肯定为0, 取~后bit位全为1 
  return xor | same;
}
  1. 实现!操作符,使用除!操作符之外的所有操作符

只有当0的时候返回1,其他时候都返回0,找0的特点,0的相反数等于它本身,但是可以找到另外一个值TMIN(1 << 31),也是相反数等于它本身,所以必须排除TMIN,这样即可判断出0从而返回1

int logicalNeg(int x)
{
    //先判断其相反数是否等于本身
    int r1 = (x ^ (~x + 1)) >> 31;//当x为0的时候返回0,有值的时候(x ^ (~x + 1))返回(一个数异或其相反数得到负的两倍这个数), 所以>>31得到0xffffffff
    r1 = (~r1 & 1);//判断r1的值是否为0
    //其次判断x的符号位是否为1, 为1就说明是TMIN返回0, 为0就说明不是TMIN返回1
    int r2 = (~(x >> 31) & 1);
    return r1 & r2;//同时满足
}
  1. 计算利用二进制补码表示一个数所需要的最小bit位 -1 和0 是一组。 1,和-2是一组。(表示1 至少需要01,表示-2需要10) 表示2需要(010) 根据这个法则。一个负数的取反 转为正数,则那个正数的位数 既是负数的位数。 判断一个数不是正的,就是看符号位,我需要把符号位变成全1 或全0 来&真实的数。 正数是全0, 正数要和X 是一组。 所以 (~isPos & x) 得到正数结果(如果负数为0) 负数要和~X一组。 所以(isPos & (~x)) 得到负数结果(如果正数为0) 变为正数以后,就看最高位的1 在哪了。 比如 2,最高的为的1 在第二位。那么结果需要+1(只有是0的时候,不加这个1) 我们可以用2分法。 来看下高16位有没有1(等价于是不是全0)。 tmp 就是高位有1,则为1. 高位没1,则为0 有1 的情况,需要+16 然后丢掉后16位, 没1 啥也不加,丢掉前16位。 为了做条件选择,就把TMP左移31位 右移31位。高位有1,则为全1. 高位没1,则为0 全1 和 丢掉后16位(为cur) 是一组。 全0 和 原来的X是一组(为原x,因为高位没1,前16位本来就是0) 上述步骤为一个循环。 接下来只有后16位可能有1., 16位里2分8这个步骤 接下来 8位里2分4, 4位里2分2,

2位里二分1. 因为结果要么是1,要么是0了。可以res直接加上。 最后1位不用2分,此位一定要直接+1.(如果最后位是0,则不需要加)

int howManyBits(int x) {//step 1. if neg -> ~x, pos -> x
  int isPos = (x >> 31); //0000 is pos
  int res = 0; // for cal ans
  int cur,tmp,b16,isZero;
  x = (isPos & (~x)) | (~isPos & x); //if x is pos use post part, is neg user pre part
  isZero = !x; //incl 0 & -1, i is zero
  cur = x >> 16;
  tmp = !!cur;
  res = res + (tmp << 4);
  b16 = (tmp << 31) >> 31;
  x = (~b16 & x) | (b16 & cur);

  cur = x >> 8;
  tmp = !!cur;
  res = res + (tmp << 3);
  b16 = (tmp << 31) >> 31;
  x = (~b16 & x) | (b16 & cur);

  cur = x >> 4;
  tmp = !!cur;
  res = res + (tmp << 2);
  b16 = (tmp << 31) >> 31;
  x = (~b16 & x) | (b16 & cur);

  cur = x >> 2;
  tmp = !!cur;
  res = res + (tmp << 1);
  b16 = (tmp << 31) >> 31;
  x = (~b16 & x) | (b16 & cur);
  cur = x >> 1;

  res =  res + cur;
  
  return res + 1 + !isZero;
}
  1. 返回float型变量f的2*f
unsigned floatScale2(unsigned uf) {
    int sig = uf >> 31 & 1; 
    int exp = uf >> 23 & 0xff;

    if(exp == 0xff){
        return uf;
    }

    if(exp == 0){
        return (uf << 1) | (sig << 31);
    }

    return uf + (1 << 23);
}
  1. 将float转换为int
int floatFloat2Int(unsigned uf) {
  unsigned sign = (uf & 0x80000000);
  unsigned exp = (uf & 0x7f800000) >> 23;
  unsigned frac = (1 << 23) | ((uf << 9) >> 9);
  int E = exp - 127;
  int val;
  if (exp < 127) return 0;
  if (exp >= 158) return 0x80000000;
  if (E <= 23) {
    val = (frac >> (23 - E));
    return !sign ? val : -val;
  }
  val = (frac << (E - 23));
  return !sign ? val : -val;
}
  1. 返回float类型f的2的f次幂
unsigned floatPower2(int x) {
    int frac = x + 149;
    if (x > 127) return 0x7f800000u;
    if (x < -149) return 0u;
    if (x >= -126) {
      unsigned exp = x + 127;
      return exp << 23;
    } 
    return 1u << frac;
}


一沙一世界,一花一天堂。君掌盛无边,刹那成永恒。