cpu-architecture

Accessing and updating 2-way associative cache with same tag and offset bits

Submitted by 偶尔善良 on 2019-12-02 16:57:55

Question: I am confused about how data is accessed in a 2-way set-associative cache. For example:

    C = A × B × S
    C = 32 KB, A = 2 (ways), B = 32 bytes (block size), S = 256 (sets)
    offset = lg(B) = 5 bits
    index  = lg(S) = 8 bits
    tag    = 32 - offset - index = 19 bits

Say I have the following addresses:

    tag                     | index      | offset
    1000 0000 0000 0000 000 | 0 0000 000 | 1 0000
    1000 0000 0000 0000 000 | 0 0000 000 | 0 0000
    1000 0000 0000 0000 000 | 0 0000 000 | 1 1010

and my cache looks like:

    index | valid dirty tag    data       | valid dirty tag    data
    0:    | 1     0     0x80… some data1  | 1     0     0x80…

descriptor concept in NIC

Submitted by 徘徊边缘 on 2019-12-02 15:19:29

I am trying to understand the concept of Rx and Tx descriptors used in network driver code. Are descriptors in software (RAM) or hardware (the NIC card)? How do they get filled?

EDIT: In a Realtek card driver I have the following struct defined:

    struct Desc {
        uint32_t opts1;
        uint32_t opts2;
        uint64_t addr;
    };

    txd->addr  = cpu_to_le64(mapping);
    txd->opts2 = cpu_to_le32(opts2);
    txd->opts1 = cpu_to_le32(opts1 & ~DescOwn);

So are opts1 and opts2 and their bits, like DescOwn, card-specific? Will they be defined by the manufacturer in the datasheet? Thanks, Nayan

Quick answer: They are software

How to find the size of the L1 cache line size with IO timing measurements?

Submitted by 我们两清 on 2019-12-02 15:09:03

As a school assignment, I need to find a way to get the L1 data cache line size, without reading config files or using API calls. I am supposed to use read/write timing of memory accesses to analyze and get this info. So how might I do that? In an incomplete attempt at another part of the assignment, to find the levels and size of the cache, I have:

    for (i = 0; i < steps; i++) {
        arr[(i * 4) & lengthMod]++;
    }

I was thinking maybe I just need to vary line 2, the `(i * 4)` part? So once I exceed the cache line size, I might need to replace it, which takes some time? But is it so straightforward? The required block might

Calculating actual/effective CPI for 3 level cache

Submitted by 倖福魔咒の on 2019-12-02 14:43:22

Question: (a) You are given a memory system that has two levels of cache (L1 and L2). The specifications are:

    Hit time of L1 cache:                       2 clock cycles
    Hit rate of L1 cache:                       92%
    Miss penalty to L2 cache (hit time of L2):  8 clock cycles
    Hit rate of L2 cache:                       86%
    Miss penalty to main memory:                37 clock cycles

Assume for the moment that the hit rate of main memory is 100%. Given a 2000-instruction program with 37% data transfer instructions (loads/stores), calculate the CPI (clock cycles per instruction)

Program Counter and Instruction Register

Submitted by 你说的曾经没有我的故事 on 2019-12-02 14:08:14

The program counter holds the address of the instruction that should be executed next, while the instruction register holds the actual instruction to be executed. Wouldn't one of them be enough? And what is the length of each of these registers? Thanks.

Haleeq Usman: You will need both, always. The program counter (PC) holds the address of the next instruction to be executed, while the instruction register (IR) holds the encoded instruction. Upon fetching the instruction, the program counter is incremented by one "address value" (to the location of the next instruction). The instruction is then

Where is the L1 memory cache of Intel x86 processors documented?

Submitted by 家住魔仙堡 on 2019-12-02 13:52:38

I am trying to profile and optimize algorithms, and I would like to understand the specific impact of the caches on various processors. For recent Intel x86 processors (e.g. the Q9300), it is very hard to find detailed information about the cache structure. In particular, most web sites (including Intel.com) that post processor specs do not include any reference to the L1 cache. Is this because the L1 cache does not exist, or is this information for some reason considered unimportant? Are there any articles or discussions about the elimination of the L1 cache? [edit] After running various tests and

Why is x86 ugly? Why is it considered inferior when compared to others? [closed]

Submitted by 拥有回忆 on 2019-12-02 13:50:24

Recently I've been reading some SO archives and encountered statements against the x86 architecture. "Why do we need different CPU architecture for server & mini/mainframe & mixed-core?" says "PC architecture is a mess; any OS developer would tell you that." "Is learning Assembly Language worth the effort?" (archived) says "Realize that the x86 architecture is horrible at best." "Any easy way to learn x86 assembler?" says "Most colleges teach assembly on something like MIPS because it's much simpler to understand; x86 assembly is really ugly," and many more comments like "Compared to most

Calculating actual/effective CPI for 3 level cache

Submitted by 半城伤御伤魂 on 2019-12-02 12:25:36

(a) You are given a memory system that has two levels of cache (L1 and L2). The specifications are:

    Hit time of L1 cache:                       2 clock cycles
    Hit rate of L1 cache:                       92%
    Miss penalty to L2 cache (hit time of L2):  8 clock cycles
    Hit rate of L2 cache:                       86%
    Miss penalty to main memory:                37 clock cycles

Assume for the moment that the hit rate of main memory is 100%. Given a 2000-instruction program with 37% data transfer instructions (loads/stores), calculate the CPI (clock cycles per instruction) for this scenario. For this part, I calculated it like this (am I doing this right?): (m1: miss rate of

What does “extend immediate to 32 bits” mean in MIPS?

Submitted by 你离开我真会死。 on 2019-12-02 10:10:41

I'm reading about the Instruction Decode (ID) phase in the MIPS datapath, and I've come across the following quote: "Once operands are known, read the actual data (from registers) or extend the data to 32 bits (immediates)." Can someone explain what the "extend the data to 32 bits (immediates)" part means? I know that registers all contain 32 bits, and I know what an immediate is. I just don't understand why you need to extend the immediate from 16 to 32 bits. Thanks!

On a 32-bit CPU, most of the operations you do (like adding, subtracting, or dereferencing a pointer) are done with 32-bit numbers. When

How exactly to count the hit rate of a direct mapped cache?

Submitted by 岁酱吖の on 2019-12-02 10:05:37

We are given a direct-mapped cache with 8 frames. The following access sequence on main-memory blocks has been observed:

    2 5 0 13 2 5 10 8 0 4 5 2

Count the hit rate of this cache organization. Solution: I understand how and why the numbers are placed in the table like that, but I don't understand why 2 and 5 have been bold-printed and why we get a hit rate of 17%. This has been solved by our professor, but I don't understand it completely. As was mentioned by @Margaret Bloom in the comments, the numbers in bold refer to cache hits; non-bold refer to cache misses. You might understand it