Micro optimisations iterating through a tree in C#

Posted by 旧时模样 on 2019-12-01 06:20:37

Question


I'm working on a massive number crunching project. I've been optimising everything since the start as I knew it would be important. Performance analysis shows that my code spends almost 40% of its life in one function - the binary tree iterator.

        public ScTreeNode GetNodeForState(int rootIndex, float[] inputs)
        {
0.2%        ScTreeNode node = RootNodes[rootIndex].TreeNode;

24.6%       while (node.BranchData != null)
            {
0.2%            BranchNodeData b = node.BranchData;
0.5%            node = b.Child2;
12.8%           if (inputs[b.SplitInputIndex] <= b.SplitValue)
0.8%                node = b.Child1;
            }

0.4%        return node;
        }

Do any C# optimisation experts have any tips for optimising this further? All comparisons are floats. I know that in theory it shouldn't matter, but I'm using fields rather than properties to ensure optimisation. A small saving here could shave days off the process.

Please no replies saying "These optimisations don't matter in the real world" - because in this instance they do. :-)

Edit: I've updated the code to what I've got now following the comments below, and added in the performance analysis output for each line of code. As you can see, the main killer is the null check - why? I tried using a boolean flag IsLeaf on the node instead of the null check, but it was an equal performance hit on that line.

The code for branch node object is as follows:

public sealed class BranchNodeData
{
    /// <summary>
    /// The index of the data item in the input array on which we need to split
    /// </summary>
    internal int SplitInputIndex = 0;

    /// <summary>
    /// The value that we should split on
    /// </summary>
    internal float SplitValue = 0;

    /// <summary>
    /// The node's children
    /// </summary>
    internal ScTreeNode Child1;
    internal ScTreeNode Child2;
}

Another Edit: Yet more thinking here... I was wondering why the line

BranchNodeData b = node.BranchData;

was registering 0.2% of execution and the null comparison line was registering 17.7%. I'm guessing this is a branch prediction failure? With that comparison being hit multiple times and almost always returning true, it is very hard for the CPU to predict when it is going to return false. I'm not very clued up on the low-level workings of a CPU, but is this likely to be the case?


Answer 1:


Just some code rewrite. It might help because it avoids at least two jumps.

public ScTreeNode GetNodeForState(int rootIndex, float[] inputs)
{

    ScTreeNode node = RootNodes[rootIndex].TreeNode;

    while (node.BranchData != null)
    {
        BranchNodeData b = node.BranchData;
        node = b.Child2;
        if (inputs[b.SplitInputIndex] <= b.SplitValue)
            node = b.Child1;
    }

    return node;

}



Answer 2:


BranchNodeData looks like a reference type. It's only 0.2% of your runtime because it's just making a pointer to data that already exists, not actually copying or assigning anything.

You're probably getting such a hit on the null check because the CLR is having to do a cast in order to check the sealed class you've pasted in. Checking for nullity there isn't necessarily what you're after. There are a ton of ways to modify that class to give you a boolean to check against that wouldn't require as much computing power. I'd honestly go the route of having that be something that your ScTreeNode class can provide.
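
A minimal sketch of what such a flag might look like, assuming a hypothetical `IsLeaf` field on `ScTreeNode` and hypothetical `Leaf`/`Branch` factory helpers (the question does not show that class, so its shape here is a guess). The asker reports having already tried a flag like this with no measurable gain, so treat it as an illustration of the idea rather than a proven fix:

```csharp
public sealed class BranchNodeData
{
    internal int SplitInputIndex;
    internal float SplitValue;
    internal ScTreeNode Child1;
    internal ScTreeNode Child2;
}

// Hypothetical shape: the real ScTreeNode is not shown in the question.
public sealed class ScTreeNode
{
    internal BranchNodeData BranchData; // null for leaves
    internal bool IsLeaf;               // assumed flag, set once when the tree is built

    internal static ScTreeNode Leaf()
    {
        return new ScTreeNode { IsLeaf = true };
    }

    internal static ScTreeNode Branch(int index, float value, ScTreeNode child1, ScTreeNode child2)
    {
        return new ScTreeNode
        {
            IsLeaf = false,
            BranchData = new BranchNodeData
            {
                SplitInputIndex = index,
                SplitValue = value,
                Child1 = child1,
                Child2 = child2
            }
        };
    }
}

public static class FlagTreeWalker
{
    // Same traversal as in the question, but the loop tests the flag
    // instead of comparing BranchData against null.
    public static ScTreeNode GetNodeForState(ScTreeNode root, float[] inputs)
    {
        ScTreeNode node = root;
        while (!node.IsLeaf)
        {
            BranchNodeData b = node.BranchData;
            node = inputs[b.SplitInputIndex] <= b.SplitValue ? b.Child1 : b.Child2;
        }
        return node;
    }
}
```

Either way the loop still has to read the node from memory before it can test it, so only a profiler run on the real data can show whether the flag changes anything.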




Answer 3:


Given the points made in the other answer about caching, but not in relation to the null check, try ordering the references to the BranchNodeData fields so that the first reference allows all of the following fields to be loaded into the cache.

That is, I assume the Jitter, or the CPU, is not smart enough to load "backwards" to cache SplitInputIndex, SplitValue and Child1 when Child2 is referenced first in the current code.

So either change the order of the fields in the BranchNodeData class, or change the "assign Child2, then conditionally overwrite with Child1" pattern to an if ... else.
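
The if ... else variant could be sketched as follows, assuming a simplified stand-in for `ScTreeNode` (not shown in the question) and a hypothetical `IfElseTreeWalker` wrapper class. The fields are read in the order they are declared, and only the child that is actually taken gets loaded. Note that the CLR is free to reorder the fields of a class in memory by default, so declaration order is a hint at best; measure before relying on it:

```csharp
public sealed class BranchNodeData
{
    // Declared in the same order the loop below reads them.
    internal int SplitInputIndex;
    internal float SplitValue;
    internal ScTreeNode Child1;
    internal ScTreeNode Child2;
}

// Simplified stand-in: the real ScTreeNode is not shown in the question.
public sealed class ScTreeNode
{
    internal BranchNodeData BranchData; // null marks a leaf
}

public static class IfElseTreeWalker
{
    public static ScTreeNode GetNodeForState(ScTreeNode root, float[] inputs)
    {
        ScTreeNode node = root;
        BranchNodeData b;

        // Read BranchData once per iteration and reuse it for the null test.
        while ((b = node.BranchData) != null)
        {
            // SplitInputIndex and SplitValue are touched first, then one child.
            if (inputs[b.SplitInputIndex] <= b.SplitValue)
                node = b.Child1;
            else
                node = b.Child2;
        }

        return node;
    }
}
```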



Source: https://stackoverflow.com/questions/16416084/micro-optimisations-iterating-through-a-tree-in-c-sharp
