profiling

When L1 misses are a lot different than L2 accesses… TLB related?

孤街浪徒 posted on 2019-12-21 04:42:24
Question: I have been running benchmarks on some algorithms and profiling their memory usage and efficiency (L1/L2/TLB accesses and misses), and some of the results are quite intriguing to me. Considering an inclusive cache hierarchy (L1 and L2 caches), shouldn't the number of L1 cache misses coincide with the number of L2 cache accesses? One explanation I found would be TLB related: when a virtual address is not mapped in the TLB, the system automatically skips searches in some cache levels.

Optimizing for PyPy

流过昼夜 posted on 2019-12-21 04:27:27
Question: (This is a follow-up to Statistical profiler for PyPy.) I'm running some Python code under PyPy and would like to optimize it. In Python, I would use statprof or lineprofiler to find out exactly which lines are causing the slowdown and try to work around them. In PyPy, though, neither tool reports sensible results, as PyPy might optimize away some lines. I would also prefer not to use cProfile, as I find it very difficult to distil which part of the reported function is the bottleneck.

How can I profile a request in Ruby on Rails?

孤人 posted on 2019-12-21 03:48:28
Question: How can I profile a controller action? One of my views is taking quite some time to render, and I'd like to break it down. I see script/performance/profiler, but that seems to only have access to the global scope.
Answer 1: ruby-prof is the way to go. Here's a howto, How to profile your Rails and Ruby applications with ruby-prof. If you use it in combination with a visualisation tool such as kcachegrind, you can easily separate the code that is your application code from the framework code. I

gcc: undefined reference to _mcount (gprof instrumentation)

做~自己de王妃 posted on 2019-12-21 03:26:11
Question: When compiling my C++ sources with the -pg option to inject gprof profile instrumentation code, the compile fails with the undefined reference to _mcount error. Without this option everything compiles (and runs) fine. What is wrong in my case? (Solaris 10 SPARC platform)
Answer 1: Are you both compiling each object file and linking the final executable using the '-pg' flag?
Source: https://stackoverflow.com/questions/4603298/gcc-undefined-reference-to-mcount-gprof-instrumentation

Python Function calls are really slow

点点圈 posted on 2019-12-21 01:57:10
Question: This is mostly to make sure my methodology is correct, but my basic question is whether it is worth checking outside of a function if I need to call the function at all. I know, I know, premature optimization, but in many cases it's the difference between putting an if statement inside the function to determine whether I need to run the rest of the code, or putting it before the function call. In other words, it takes no effort to do it one way or the other. Right now, all the checks are
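The comparison the question describes (a guard inside the function versus a guard before the call) can be measured directly. The following is a minimal sketch, not the asker's code: `process`, `process_guarded`, `run_check_inside`, `run_check_outside` and `compare` are hypothetical names, and the trivial `value * 2` body only stands in for real work. It uses the standard-library `timeit` module.

```python
import timeit


def process(value):
    """Hypothetical unit of work; stands in for the real function body."""
    return value * 2


def process_guarded(value, needed):
    """Same work, but the 'is this needed?' check sits inside the function."""
    if not needed:
        return None
    return value * 2


def run_check_inside(flags):
    """Call the function every time and let it decide whether to do anything."""
    results = []
    for i, needed in enumerate(flags):
        out = process_guarded(i, needed)
        if out is not None:
            results.append(out)
    return results


def run_check_outside(flags):
    """Check first and only pay for the call when the work is needed."""
    results = []
    for i, needed in enumerate(flags):
        if needed:
            results.append(process(i))
    return results


def compare(flags, repeats=200):
    """Return (seconds with check inside, seconds with check outside)."""
    inside = timeit.timeit(lambda: run_check_inside(flags), number=repeats)
    outside = timeit.timeit(lambda: run_check_outside(flags), number=repeats)
    return inside, outside


if __name__ == "__main__":
    # Assumed workload: the work is needed for one item in ten.
    flags = [i % 10 == 0 for i in range(10_000)]
    inside, outside = compare(flags)
    print(f"check inside:  {inside:.4f} s")
    print(f"check outside: {outside:.4f} s")
```

Both variants produce the same results, so any timing difference comes from the calls that the outside check avoids. How large that difference is depends on the interpreter and on how often the work is actually needed, so it is worth running the measurement on the real workload.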

Any tool that says how long each method takes to run?

萝らか妹 posted on 2019-12-21 01:11:30
Question: Some parts of my program are very slow, and I was wondering if there is a tool that can tell me, for example, that running methodA() took 100 ms, or similarly useful information.
Answer 1: The System.Diagnostics namespace offers a helpful class called Stopwatch, which can be used to time parts of your code (think of it as a "poor man's profiler"). This is how you would use it:
Stopwatch stopwatch = new Stopwatch();
stopwatch.Start(); // Start timing
// This is what we want to

What do the colours mean for detached DOM nodes in the Chrome Heap Profiler?

本小妞迷上赌 posted on 2019-12-20 18:25:41
Question: When analyzing heap snapshots using Chrome DevTools, I can't figure out what the colours mean when viewing Detached DOM Trees. What is the difference between red and yellow?
Answer 1: There is a good explanation available here. From the article: Red nodes do not have direct references from JavaScript to them, but are alive because they’re part of a detached DOM tree. There may be a node in the tree referenced from JavaScript (maybe as a closure or variable) but is coincidentally preventing

How do I get the raw SQL underlying a LINQ query when using Entity Framework CTP 5 “code only”?

纵饮孤独 posted on 2019-12-20 17:48:09
Question: I'm using Entity Framework CTP5 in "code only" mode. I'm running a LINQ query on an object that was returned from the database, and the query is running really slowly. Is there any way I can get the SQL statement that is being generated from the query?
Topic currentTopic = (from x in Repository.Topics
    let isCurrent = (x.StoppedAt <= x.StartedAt || (x.StartedAt >= currentTopicsStartedAtOrAfter))
    where x.Meeting.Manager.User.Id == user.Id && isCurrent
    orderby x.StartedAt descending
    select

JVM memory: why does memory in Task Manager differ from JProbe (or the JConsole tool)?

走远了吗. posted on 2019-12-20 12:36:25
Question: The problem I face is that my application uses only 100 MB of memory, which then drops to 50 MB, but Windows Task Manager shows 150 MB, and that figure stays the same or increases but never decreases. How can we reduce the memory (private working set) shown in Task Manager?
Answer 1: What you are seeing in JConsole (or other monitoring tools) is the pattern in which the Java memory is being used. The memory of the JVM is usually divided among these areas (which you also see in monitoring tools). Heap memory which is for Java