马上注册,结交更多好友,享用更多功能,让你轻松玩转社区。
您需要 登录 才可以下载或查看,没有账号?注册
x
Consider the following C++ program which randomly generates a terrain for a 3D racing game. It first initializes a large 2 dimensional array with some non-zero random values using the random_value() function (don't care about its internals but assume it returns a random unsigned integer value)1000000. It then loops over the array 10 times to smooth it out. unsigned char a[1000000][1000000]; for (unsigned int j=0; j<1000000; j++) { for (unsigned int i=0; i<1000000; i++) { a[j] = random_value() % 256; if (a[j] == 0) { a[j]=1; } } } for (unsigned int k=0; k<10; k++) { for (unsigned int j=1; j<999999; j++) { for (unsigned int i=1; i<999999; i++) { a[j] = (a[i-1][j-1] + a[j-1] + a[i+1][j-1] + a[i-1][j ] + a[j ] + a[i+1][j ] + a[i-1][j+1] + a[j+1] + a[i+1][j+1]) / 9; } } } 1. Write the program above in assembly form using the kinds of instructions listed below. You can assume the large array a[][] is already allocated at M[R1] and that R1 is initialized for you. Note that if you change R1, you should make sure to have a way to get its value back. Otherwise you will not be able to access the array (a) anymore. Also note that memory is byte addressable and the array is allocated in row-major-order. This means M(address) returns the byte stored at the address and that M(1000) is array element a[0][0], M(1001) is a[0][1], M(1002) is a[0][2], M(1000+1000000) is a[1][0], and so forth. ___________________________ | | LOADi R, address | // R = M(address) | STOREi R, address | // M(address) = R | LOAD R, Rm | // R = M(Rm) | STORE R, Rm | // M(Rm) = R | ADD Rs1, Rs2, Rd | // Rd = Rs1 + Rs2 | DIV Rs1, Rs2, Rd | // Rd = Rs1 / Rs2 | MOD Rs1, Rs2, Rd | // Rd = Rs1 % Rs2 | MOV Rs, Rd | // Rd = Rs | MOVi Rs, val | // Rs = val | JLT R, val, LABEL | // if (R < val) jump to LABEL | JGT R, val, LABEL | // if (R > val) jump to LABEL | JEQ R, val, LABEL | // if (R == val) jump to LABEL | JLE R, val, LABEL | // if (R <= val) jump to LABEL | JGE R, val, LABEL | // if (R >= val) jump to LABEL | JNE R, val, LABEL | // if (R != val) jump to LABEL | J LABEL | // jump to LABEL | JSUB LABEL | // jumps to function at LABEL (can return with RET) | RET | // return to the address after the last JSUB. | ___________________________ | |
You can assume the random_value() function is at label RANDOM_VALUE. After the function returns, the value in register R2 will have a new random value for you to use. Note that the function overwrites register R2. Hint: Use JSUB for this. 2. What is a better architecture for this program: VLIW or SuperScalar? Explain your answer. 3. Identify the temporary variables (values stored in registers) in your code. Draw the conflict graph (the graph where each node corresponds to a temporary and there is an edge between nodes that at some point must exist at the same time). Assume there is no pipelining so a variable is read at the beginning of instruction execution and written at the end. (20 points) 3.1. Explain an algorithm that can be used to minimize the number of registers used in your code without using integer-linear programming. 4. Demonstrate how you can use lp_solve to minimize the number of registers in your code above. Hint: Start by providing a general outline of your approach (the steps you need to take to accomplish this). Then try to elaborate on each step separately. Even if you aren't able to get a result for some step, you can get significant partial credit by CLEARLY explaining what you have to do in each step and why. 5. The following questions are using SimpleScalar: Consider the following two programs that calculate the average of 1000 numbers: avg0.cc and avg1.cc Compile each program, 3 times, for simplescalar using the following gcc options: So you should have 6 different compiled binaries. 5.1. Run all six cases with sim-safe, using the -v option. Briefly explain what is different in each case, and why. More specifically, focus on the total number of instructions executed. 5.2. Pick one of the cases from avg0.cc and one of the cases from avg1.cc and run using sim-bpred. Try -bpred taken and -bpred nottaken. Explain what you see and why in the simulator outputs. 5.3. Again using the -v output (lists all instructions as they are executed), and your favorite binary from the above explain what is different between sim-safe and sim-outoforder. Errata: Q: For given assembly program can I assume that register can hold value more than 1 byte (for example R1 can go from 0 to 999999) and for first loop can I assume it continuous array from 0 to 1000000000000? (means can I increment R1 value upto 1000000000000 ?
A: Yes. Q: Is there a limit to the amount of memory we have available?
A: No. |