cpu

How does the clock work in Windows 7?

两盒软妹~` submitted on 2019-12-05 17:30:01
Question: I have read this answer somewhere but I don't understand it exactly. I understand that Windows increments the clock every curTimeIncrement (156001 units of 100 nanoseconds) by the value of curTimeAdjustment (156001 ± N). But when the clock is read using GetSystemTime, does the routine interpolate within the 156001 × 100 ns interval to produce the precision indicated? Can someone try to explain it to me? What are curTimeIncrement and curTimeAdjustment, and how can Windows do this? What is the effect for…
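The numbers in the question can be sanity-checked with simple arithmetic (a sketch: the 156001 value is quoted from the question; on a real system the increment is reported by the GetSystemTimeAdjustment Windows API):

```python
# Windows advances the system clock once per timer interrupt, by an
# increment expressed in units of 100 ns (here: the question's value).
increment_100ns = 156_001                        # value quoted in the question
tick_ms = increment_100ns * 100 / 1_000_000      # 100 ns units -> milliseconds
ticks_per_second = 1_000 / tick_ms

print(f"tick interval: {tick_ms:.4f} ms")        # ~15.6 ms
print(f"interrupts/s:  {ticks_per_second:.1f}")  # ~64 Hz
```

This matches the classic default Windows timer rate of roughly 64 interrupts per second; without interpolation, successive GetSystemTime reads within one tick would return the same value.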

Is there a CPU that runs Java in microcode? [closed]

久未见 submitted on 2019-12-05 16:55:12
Question: Closed. This question is off-topic and is not currently accepting answers. Closed 6 years ago. Java is a beautifully crafted OO language, but the first thing I noticed is how slow it is (compared to C++). This is probably because it has to go through another layer of translation (the VM) instead of running directly on the CPU's native microcode. My question: do you know of any attempts to create Java…

Sandy-Bridge CPU specification

喜欢而已 submitted on 2019-12-05 16:43:55
I was able to put together bits here and there about the Sandy Bridge-E architecture, but I am not totally sure about all the parameters, e.g. the size of the L2 cache. Can anyone please confirm they are all correct? My main source was the 64-ia-32-architectures-optimization-manual.pdf. On Sandy Bridge, each core has 256 KB of L2 (see the datasheet, section 1.1). For 6 cores, that's 1.5 MB, but since each core only accesses its own, it's better to always think of it as 256 KB per core. Moreover, the peak GFLOPS figure looks completely wrong. AVX is 16 flops/cycle (as single-precision floats). With 6 cores, that's…
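The peak-GFLOPS correction can be reproduced with simple arithmetic (a sketch: the 3.2 GHz clock is an assumed example frequency, not a figure from the excerpt):

```python
flops_per_cycle = 16   # AVX: 8 single-precision multiplies + 8 adds per cycle
cores = 6
clock_ghz = 3.2        # assumed example frequency

peak_gflops = flops_per_cycle * cores * clock_ghz
print(peak_gflops)     # ~307 single-precision GFLOPS for the whole chip
```

The same formula scales linearly with the clock, so plugging in the actual rated frequency of a given Sandy Bridge-E part gives its theoretical single-precision peak.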

Which architecture to call Non-uniform memory access (NUMA)?

試著忘記壹切 submitted on 2019-12-05 16:21:33
According to Wikipedia: Non-uniform memory access (NUMA) is a computer memory design used in multiprocessing, where the memory access time depends on the memory location relative to a processor. But it is not clear whether this covers any memory, including caches, or main memory only. For example, the Xeon Phi processor has the following architecture: memory access to main memory (GDDR) is the same for all cores, while access to L2 cache differs between cores, since a core's own L2 cache is checked first and then the L2 caches of the other cores are checked via the ring. Is that a NUMA or a UMA architecture?

GPU PoolAllocator explodes the CPU memory

扶醉桌前 submitted on 2019-12-05 15:08:20
Question: I made a TensorFlow model with relatively common operations (apart from a couple of tf.where and some index handling), but I call it with widely varying input shapes (many undefined tensor shapes in the model). Everything works fine on the CPU. But when using the GPU, the RAM usage (host memory, not GPU memory) steadily increases until it fills the machine's 256 GB and the process is killed. During the process, I get the usual messages: 2017-03-17 16:42:22.366601: I tensorflow/core/common…
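A common workaround for allocator growth under varying shapes (not from the excerpt itself) is to reduce the number of distinct input shapes the framework ever sees, e.g. by padding inputs into a few fixed-size buckets. A minimal, framework-agnostic sketch of the bucketing idea, with hypothetical bucket sizes:

```python
BUCKETS = [64, 128, 256, 512]   # assumed bucket sizes, tune for your data

def pad_to_bucket(x, pad_value=0.0):
    """Pad a 1-D sequence up to the next bucket length, so downstream
    code (and its allocator) only ever sees a handful of distinct shapes."""
    n = len(x)
    target = next(b for b in BUCKETS if b >= n)   # raises if input too long
    return list(x) + [pad_value] * (target - n)

padded = pad_to_bucket([1.0] * 100)
print(len(padded))   # 128
```

With only a few shape buckets, a pooling allocator can reuse its buffers instead of accumulating one pool entry per unique shape.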

Would buffering cache changes prevent Meltdown?

本秂侑毒 submitted on 2019-12-05 14:50:27
If new CPUs had a cache buffer which was committed to the actual CPU cache only if the instructions themselves are ever committed, would attacks similar to Meltdown still be possible? The proposal is to let speculative execution load from memory, but not write to the CPU caches until the loads actually commit. TL:DR: yes, I think it would solve Spectre (and Meltdown) in their current form (using a flush+read cache-timing side channel to copy the secret data from a physical register), but it would probably be too expensive (in power cost, and maybe also performance) to be a likely implementation. But…

Set cpu affinity on a loadable linux kernel module

*爱你&永不变心* submitted on 2019-12-05 14:45:42
I need to create a kernel module that enables the ARM PMU counters on every core in the machine. I am having trouble setting the CPU affinity: I've tried sched_get_affinity, but apparently it only works for user-space processes. My code is below. Any ideas?

#define _GNU_SOURCE
#include <linux/module.h>  /* Needed by all modules */
#include <linux/kernel.h>  /* Needed for KERN_INFO */

int init_module(void)
{
    unsigned reg;
    /* enable user-mode access to the performance counters */
    asm volatile("MRC p15, 0, %0, C9, C14, 0\n\t" : "=r"(reg));
    reg |= 1;
    asm volatile("MCR p15, 0, %0, C9, C14, 0\n\t" :: "r"(reg));…

Theano CNN on CPU: AbstractConv2d Theano optimization failed

半城伤御伤魂 submitted on 2019-12-05 12:30:09
Question: I'm trying to train a CNN for object detection on images from the CIFAR10 dataset for a seminar at my university, but I get the following error: AssertionError: AbstractConv2d Theano optimization failed: there is no implementation available supporting the requested options. Did you exclude both "conv_dnn" and "conv_gemm" from the optimizer? If on GPU, is cuDNN available and does the GPU support it? If on CPU, do you have a BLAS library installed Theano can link against? I am running Anaconda 2…
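For the "do you have a BLAS library installed" part of the message, one common fix is to point Theano at a BLAS library via its configuration file (a sketch, assuming a system `libblas` is installed; check the Theano configuration documentation for the exact flags on your platform):

```
# ~/.theanorc  (hypothetical minimal example)
[blas]
ldflags = -lblas
```

With a linkable BLAS available, the CPU convolution implementation that the optimizer is looking for can be selected.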

What's the advantage of compiler instruction scheduling compared to dynamic scheduling? [closed]

≡放荡痞女 submitted on 2019-12-05 12:05:46
Nowadays, superscalar RISC CPUs usually support out-of-order execution, with branch prediction and speculative execution; they schedule work dynamically. What is the advantage of compiler instruction scheduling compared to an out-of-order CPU's dynamic scheduling? Does compile-time static scheduling matter at all for an out-of-order CPU, or only for simple in-order CPUs? It seems that currently most software instruction-scheduling work focuses on VLIW or simple CPUs, and the GCC wiki's scheduling page also shows little interest in updating GCC's scheduling algorithms. Advantage of static (compiler)…

Java limiting resource usage

北慕城南 submitted on 2019-12-05 11:32:35
Question: Is there a way to limit the number of cores that Java uses? And in the same vein, is it possible to limit how much of each core is used? Answer 1: You can use taskset on Linux. You can also lower the priority of a process, but unless the CPU(s) are busy, a process will get as much CPU as it can use. I have a library for dedicating threads to cores, called Java Thread Affinity, but it may have a different purpose from what you have in mind. Can you clarify why you want to do this? Answer 2: I…