amd-processor

Intel MKL vs. AMD Math Core Library

霸气de小男生 submitted on 2019-12-03 04:52:33
Question: Does anybody have experience programming for both the Intel Math Kernel Library and the AMD Math Core Library? I'm building a personal computer for high-performance statistical computations and am debating which components to buy. An appeal of the AMD Math Core Library is that it is free, but I am in academia, so the MKL is not that expensive. I'd be interested in hearing thoughts on: Which provides a better API? Which provides better performance, on average, per dollar, including

Creating a proper Task State Segment (TSS) structure with and without an IO Bitmap?

≡放荡痞女 submitted on 2019-11-30 09:26:25
Question: Comparing the Intel and AMD documentation and looking at existing code, it is at times difficult to understand how to create a proper Task State Segment (TSS) that has no IO port bitmap (IOPB). There also seems to be confusion over creating a TSS with an IOPB, since it seems ambiguous whether an IO bitmap (IOPB) requires a trailing 0xff byte. I'm aware that there is a dependency between the TSS and a TSS Descriptor (in the GDT). The TSS descriptor governs the base address of
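
For the no-IOPB case asked about above, a minimal C sketch of the 32-bit TSS layout may help. It assumes the commonly documented rule that an I/O map base at or beyond the TSS limit means no I/O permission bitmap is present. The struct name, field names, and the helper `tss32_init_no_iopb` are hypothetical and not taken from the question; loading the descriptor into the GDT and executing `ltr` are left out.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* 32-bit TSS as laid out in the Intel/AMD manuals: 104 bytes, with the
 * 16-bit I/O map base at offset 102. Field names are illustrative. */
struct tss32 {
    uint16_t prev_task, reserved0;
    uint32_t esp0; uint16_t ss0, reserved1;
    uint32_t esp1; uint16_t ss1, reserved2;
    uint32_t esp2; uint16_t ss2, reserved3;
    uint32_t cr3, eip, eflags;
    uint32_t eax, ecx, edx, ebx, esp, ebp, esi, edi;
    uint16_t es, reserved4, cs, reserved5, ss, reserved6;
    uint16_t ds, reserved7, fs, reserved8, gs, reserved9;
    uint16_t ldt_selector, reserved10;
    uint16_t trap_flag;   /* bit 0 is the debug trap (T) flag */
    uint16_t iomap_base;  /* offset of the IOPB from the TSS base */
};

_Static_assert(sizeof(struct tss32) == 104, "unexpected padding in tss32");
_Static_assert(offsetof(struct tss32, iomap_base) == 102, "iomap_base misplaced");

/* Hypothetical helper: zero the TSS, set the ring-0 stack, and mark the TSS
 * as having no IOPB by pointing iomap_base past the end of the structure.
 * Returns the byte limit to place in the TSS descriptor. */
uint32_t tss32_init_no_iopb(struct tss32 *tss, uint16_t kernel_ss, uint32_t kernel_esp)
{
    assert(tss != NULL);
    memset(tss, 0, sizeof *tss);
    tss->ss0 = kernel_ss;
    tss->esp0 = kernel_esp;
    tss->iomap_base = (uint16_t)sizeof *tss;  /* 104 >= limit, so no IOPB */
    return (uint32_t)sizeof *tss - 1;         /* descriptor limit = 103 */
}
```

With this setup the descriptor limit covers only the 104 fixed bytes, so the question of a trailing 0xff byte does not arise. It only matters when a bitmap is actually placed inside the limit.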

Where is the Write-Combining Buffer located? x86

巧了我就是萌 submitted on 2019-11-29 10:42:20
How is the Write-Combine buffer physically hooked up? I have seen block diagrams illustrating a number of variants: between L1 and the memory controller; between the CPU's store buffer and the memory controller; between the CPU's AGUs and/or store units. Is it microarchitecture-dependent? Write buffers can have different purposes or different uses in different processors. This answer may not apply to processors not specifically mentioned. I'd like to emphasize that the term "write buffer" may mean different things in different contexts. This answer is about Intel and AMD processors only. Write-Combining Buffers

Difference between intel and AMD multithreading

社会主义新天地 submitted on 2019-11-29 05:18:26
I have an application meant for data transfer between 2 databases. Most of the operations of this application are independent and run concurrently. Earlier this application was running on a 4-core Intel machine, and now it needs to be ported to an AMD quad-core (4-core) machine. I have doubts about a couple of points below. I found that AMD does not support Hyper-Threading (HTT); this obviously means application performance (throughput) will degrade. Will performance degrade due to context switching? If yes, will decreasing the number of threads running concurrently help? Whether any code changes
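
On the thread-count point raised above, one hedged starting point is to size the worker pool from the number of logical processors the OS reports rather than hard-coding a value tuned for the old machine. The sketch below assumes a Unix-like system where `sysconf(_SC_NPROCESSORS_ONLN)` is available (a widely supported extension, not strictly required by POSIX). The function names are hypothetical. Capping at the logical CPU count is an assumption that suits CPU-bound work; a database transfer that mostly waits on I/O may well benefit from more threads, so measure before settling.

```c
#include <assert.h>
#include <unistd.h>

/* Number of logical processors currently online, or 1 if the query fails. */
long online_logical_cpus(void)
{
    long n = sysconf(_SC_NPROCESSORS_ONLN);
    return n > 0 ? n : 1;
}

/* Hypothetical policy: never run more workers than logical CPUs,
 * and never fewer than one. */
long suggested_worker_threads(long logical_cpus, long requested)
{
    assert(logical_cpus > 0);
    if (requested < 1)
        return 1;
    return requested < logical_cpus ? requested : logical_cpus;
}
```

For example, `suggested_worker_threads(online_logical_cpus(), 8)` yields 4 on a quad-core machine without SMT and 8 on a quad-core machine that exposes 8 logical processors.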

Running Android emulator on computer with AMD processor

寵の児 submitted on 2019-11-28 19:16:11
Is there any way to run Android virtual devices through Eclipse on a machine with an AMD processor? I had Genymotion for a while, and although it worked, it was too much of a kerfuffle to dabble with. For an AMD processor, create a new Virtual Device and, while selecting the system image, select the ABI as armeabi instead of the default x86 one. You don't need an Intel processor to run the emulator; it's just so much faster with the HAXM technology, which obviously is not available to you. I recommend buying a cheap Android device for testing, as none of the emulations provided with the ADK are

Where is the Write-Combining Buffer located? x86

喜你入骨 submitted on 2019-11-28 03:58:02
Question: How is the Write-Combine buffer physically hooked up? I have seen block diagrams illustrating a number of variants: between L1 and the memory controller; between the CPU's store buffer and the memory controller; between the CPU's AGUs and/or store units. Is it microarchitecture-dependent? Answer 1: Write buffers can have different purposes or different uses in different processors. This answer may not apply to processors not specifically mentioned. I'd like to emphasize that the term "write buffer" may mean different

Does my AMD-based machine use little endian or big endian?

江枫思渺然 submitted on 2019-11-27 19:20:20
I'm going through a computer systems course and I'm trying to establish, for sure, whether my AMD-based computer is a little-endian machine. I believe it is, because it would be Intel-compatible. Specifically, my processor is an AMD Athlon 64 X2. I understand that this can matter in C programming. I'm writing C programs, and a method I'm using would be affected by this. I'm trying to figure out if I'd get the same results if I ran the program on an Intel-based machine (assuming that is a little-endian machine). Finally, let me ask this: Would any and all machines capable of running Windows (XP, Vista,
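
For the endianness question above, a short runtime check in C settles it on any given machine. x86 and x86-64 processors from both AMD and Intel are little-endian, so `is_little_endian` below should return 1 on the Athlon 64 X2 mentioned. This is a minimal sketch assuming a C99 compiler; both function names are hypothetical. The second function shows one way to make a method independent of host byte order by decoding bytes explicitly.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Returns 1 if the lowest-addressed byte of a multi-byte integer holds its
 * least significant byte (little-endian host), 0 otherwise. */
int is_little_endian(void)
{
    const uint32_t probe = 1u;
    unsigned char first_byte;
    memcpy(&first_byte, &probe, 1);
    return first_byte == 1;
}

/* Decodes 4 bytes stored least-significant-byte first. The result is the
 * same on little-endian and big-endian hosts because only shifts are used. */
uint32_t load_le32(const unsigned char *bytes)
{
    assert(bytes != NULL);
    return (uint32_t)bytes[0]
         | (uint32_t)bytes[1] << 8
         | (uint32_t)bytes[2] << 16
         | (uint32_t)bytes[3] << 24;
}
```

Code that reinterprets raw memory (for example, casting a `char` buffer to `int *`) depends on host byte order, while code written like `load_le32` does not.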

Difference between intel and AMD multithreading

烂漫一生 submitted on 2019-11-27 19:04:38
Question: I have an application meant for data transfer between 2 databases. Most of the operations of this application are independent and run concurrently. Earlier this application was running on a 4-core Intel machine, and now it needs to be ported to an AMD quad-core (4-core) machine. I have doubts about a couple of points below. I found that AMD does not support Hyper-Threading (HTT); this obviously means application performance (throughput) will degrade. Will performance degrade due to Context

Android Studio emulator and AMD CPU

不羁岁月 submitted on 2019-11-27 15:32:56
I can't run my app on the standard Nexus 5 emulator. It seems it requires Intel HAXM, but I have an AMD processor. So how can I use the emulator without buying an Intel processor (or installing Linux)? blues667: If you have an AMD processor, you can download an ARM image, but it is super slow on x86 platforms. The x86 image does not work with AMD CPUs, because the x86 image needs HAXM installed, which needs VT-x support, and only Intel CPUs support it. So you can download the Genymotion emulator, which supports both VT-x and AMD-V technology. College Dude: Genymotion is super fast. Other than hooking

Is vxorps-zeroing on AMD Jaguar/Bulldozer/Zen faster with xmm registers than ymm?

岁酱吖の submitted on 2019-11-27 15:06:17
AMD CPUs handle 256b AVX instructions by decoding into two 128b operations, e.g. vaddps ymm0, ymm1, ymm1 on AMD Steamroller decodes to 2 macro-ops, with half the throughput of vaddps xmm0, xmm1, xmm1. XOR-zeroing is a special case (no input dependency, and on Jaguar at least it avoids consuming a physical register file entry, and enables movdqa from that register to be eliminated at issue/rename, like Bulldozer does all the time even for non-zeroed regs). But is it detected early enough that vxorps ymm0, ymm0, ymm0 still only decodes to 1 macro-op with equal performance to vxorps xmm0, xmm0, xmm0?