profiling | 易学教程

When L1 misses are a lot different than L2 accesses… TLB related?

Question: I have been benchmarking several algorithms and profiling their memory behavior (L1/L2/TLB accesses and misses), and some of the results puzzle me. In an inclusive cache hierarchy (L1 and L2 caches), shouldn't the number of L1 cache misses match the number of L2 cache accesses? One explanation I have found is TLB-related: when a virtual address is not mapped in the TLB, the system automatically skips lookups in some cache levels.

Optimizing for PyPy

Question: (This is a follow-up to "Statistical profiler for PyPy".) I'm running some Python code under PyPy and would like to optimize it. In CPython, I would use statprof or line_profiler to find out exactly which lines cause the slowdown and try to work around them. In PyPy, though, neither tool reports sensible results, because PyPy might optimize some lines away. I would also prefer not to use cProfile, as I find it very difficult to work out which part of a reported function is the bottleneck.

How can I profile a request in Ruby on Rails?

Question: How can I profile a controller action? One of my views takes quite some time to render, and I'd like to break it down. I see script/performance/profiler, but that seems to have access only to the global scope.

Answer 1: ruby-prof is the way to go. Here's a how-to: "How to profile your Rails and Ruby applications with ruby-prof". If you use it in combination with a visualization tool such as kcachegrind, you can easily separate your application code from the framework code. I

gcc: undefined reference to _mcount (gprof instrumentation)

Question: When I compile my C++ sources with the -pg option to inject gprof profiling instrumentation, the build fails with an "undefined reference to _mcount" error. Without this option everything compiles (and runs) fine. What is wrong in my case? (Solaris 10, SPARC platform)

Answer 1: Are you using the '-pg' flag both when compiling each object file and when linking the final executable?

Source: https://stackoverflow.com/questions/4603298/gcc-undefined-reference-to-mcount-gprof-instrumentation

Python Function calls are really slow

Question: This is mostly to make sure my methodology is correct, but my basic question is whether it is worth checking outside a function whether I need to call the function at all. I know, I know, premature optimization, but in many cases it's the difference between putting an if statement inside the function to decide whether to run the rest of the code, or putting it before the function call. In other words, it takes no extra effort to do it one way or the other. Right now, all the checks are
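The trade-off described in the question can be measured rather than guessed. The following is a minimal sketch using the standard-library timeit module; the function names guarded_inside, unguarded, and compare are hypothetical and not taken from the original post, and the actual numbers depend on the interpreter and machine.

```python
import timeit


def unguarded():
    # Hypothetical stand-in for the real work the function would do.
    return sum(range(10))


def guarded_inside(needed):
    # Variant 1: the check lives inside the function, so the call
    # overhead is paid even when there is nothing to do.
    if not needed:
        return None
    return unguarded()


def compare(number=200_000):
    """Time both variants for the case where the work is NOT needed."""
    env = {"guarded_inside": guarded_inside, "unguarded": unguarded, "flag": False}
    # Check inside: the function is always called.
    inside = timeit.timeit("guarded_inside(flag)", globals=env, number=number)
    # Check outside: the call is skipped entirely when flag is false.
    outside = timeit.timeit("if flag: unguarded()", globals=env, number=number)
    return {"check_inside": inside, "check_outside": outside}


if __name__ == "__main__":
    for name, seconds in compare().items():
        print(f"{name}: {seconds:.4f} s")
```

Only the "work not needed" case is timed here, since that is where the two placements differ; when the work is needed, both variants end up calling the function anyway.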

Any tool that says how long each method takes to run?

Question: Some parts of my program are very slow, and I was wondering whether there is a tool that can tell me, for example, that running methodA() took 100 ms, or give similarly useful information.

Answer 1: The System.Diagnostics namespace offers a helpful class called Stopwatch, which can be used to time parts of your code (think of it as a "poor man's profiler"). This is how you would use it: Stopwatch stopwatch = new Stopwatch(); stopwatch.Start(); // Start timing // This is what we want to
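The same "poor man's profiler" idea can be sketched outside .NET. The example below is in Python rather than C#, purely as an illustration of the technique; the names stopwatch, method_a, and timings are hypothetical, and it is not a replacement for the Stopwatch class the answer describes.

```python
import time
from contextlib import contextmanager


@contextmanager
def stopwatch(label, results):
    """Record the elapsed wall-clock time of the enclosed block in milliseconds."""
    start = time.perf_counter()
    try:
        yield
    finally:
        # Store the result even if the timed block raises.
        results[label] = (time.perf_counter() - start) * 1000.0


def method_a():
    # Hypothetical stand-in for the slow method being investigated.
    return sum(i * i for i in range(10_000))


timings = {}
with stopwatch("method_a", timings):
    method_a()

print(f"method_a took {timings['method_a']:.3f} ms")
```

Wrapping each suspect section in its own labelled block gives a rough per-method breakdown, which is often enough to decide where a real profiler should be pointed.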

What do the colours mean for detached DOM nodes in the Chrome Heap Profiler?

Question: When analyzing heap snapshots using Chrome DevTools, I can't figure out what the colours mean when viewing Detached DOM Trees. What is the difference between red and yellow?

Answer 1: There is a good explanation available here. From the article: Red nodes do not have direct references from JavaScript to them, but are alive because they’re part of a detached DOM tree. There may be a node in the tree referenced from JavaScript (maybe as a closure or variable) but is coincidentally preventing

How do I get the raw SQL underlying a LINQ query when using Entity Framework CTP 5 “code only”?

Question: I'm using Entity Framework CTP5 in "code only" mode. I'm running a LINQ query on an object that was returned from the database, and the query is running really slowly. Is there any way to get the SQL statement that is generated from the query? Topic currentTopic = (from x in Repository.Topics let isCurrent = (x.StoppedAt <= x.StartedAt || (x.StartedAt >= currentTopicsStartedAtOrAfter)) where x.Meeting.Manager.User.Id == user.Id && isCurrent orderby x.StartedAt descending select

JVM Memory : Why memory on task manager difference with JProbe (or JConsole tool)

Question: The problem I am facing is that my application's memory use is only 100 MB, and after that it decreased to 50 MB, but Windows Task Manager shows 150 MB, which always stays the same or increases and never decreases. How can we reduce the memory (private working set) shown in Task Manager?

Answer 1: What you are seeing in JConsole (or other monitoring tools) is the pattern in which the Java memory is being used. The memory of the JVM is usually divided among these areas (which you also see in monitoring tools). Heap memory which is for Java