Can someone explain in simple terms to me what a directed acyclic graph is?

暖寄归人 2020-12-22 16:23

Can someone explain in simple terms to me what a directed acyclic graph is? I have looked on Wikipedia but it doesn't really make me see its use in programming.

13 answers
  • 2020-12-22 16:43

    From a source code or even three-address code (TAC) perspective, you can visualize the problem really easily on this page:

    http://cgm.cs.mcgill.ca/~hagha/topic30/topic30.html#Exptree

    If you go to the expression tree section and then page down a bit, it shows the "topological sorting" of the tree and the algorithm for evaluating the expression.

    So in that case you can use the DAG to evaluate expressions. This is handy because such evaluation is normally interpreted, and a DAG evaluator should in principle make simple interpreters faster: it does not push to and pop from a stack, and it eliminates common sub-expressions.

    The basic algorithm to compute the DAG, in plain English, is this:

    1) Make your DAG object like so

    You need a live list, which holds all the currently live DAG nodes and DAG sub-expressions. A DAG sub-expression is a DAG node; you can also call it an internal node. By "live" I mean this: when you assign to a variable X, its node becomes live, and a common sub-expression that then uses X uses that instance. If X is assigned to again, a new DAG node is created and added to the live list and the old X is removed, so the next sub-expression that uses X refers to the new instance and does not conflict with sub-expressions that merely use the same variable name.

    Once you assign to a variable X, all the DAG sub-expression nodes that use X and are live at the point of assignment become not-live, since the new assignment invalidates the meaning of sub-expressions using the old value.

    class Dag {
      TList LiveList;
      DagNode Root;
    }
    
    // In your DagNode you need a way to refer to the original things that
    // the DAG is computed from. Here I just assume an integer index into
    // the list of variables, and an integer index for the operator for
    // nodes that refer to operators. You can also create sub-classes for
    // different kinds of DAG nodes.
    class DagNode {
      int Variable;
      int Operator;// You can also use a class
      DagNode Left;
      DagNode Right;
      DagNodeList Parents;
    }
    

    What you do is walk through your own tree, for example a tree of expressions from source code. Call the existing nodes XNodes.

    So for each XNode you need to decide how to add it into the DAG, and there is the possibility that it is already in the DAG.

    This is very simple pseudo code. Not intended for compilation.

    DagNode XNode::GetDagNode(Dag dag) {
      if (XNode.IsAssignment) {
        // The assignment is a special case. A common sub expression is not
        // formed by the assignment since it creates a new value.
    
        // Evaluate the right hand side like normal
        XNode.RightXNode.GetDagNode(dag);
    
    
        // And now take the variable being assigned to out of the current live list
        dag.RemoveDagNodeForVariable(XNode.VariableBeingAssigned);
    
        // Also remove all DAG sub expressions using the variable - since the new value
        // makes them redundant
        dag.RemoveDagExpressionsUsingVariable(XNode.VariableBeingAssigned);
    
        // Then make a new variable in the live list in the dag, so that references to
        // the variable later on will see the new dag node instead.
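        // (Assumes AddDagNodeForVariable returns the node it creates.)
        XNode.DagNode = dag.AddDagNodeForVariable(XNode.VariableBeingAssigned);
        return XNode.DagNode;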
    
      }
      else if (XNode.IsVariable) {
        // A variable node has no child nodes, so you can just process it directly
        DagNode n = dag.GetDagNodeForVariable(XNode.Variable);
        if (n) XNode.DagNode = n;
        else {
          XNode.DagNode = dag.CreateDagNodeForVariable(XNode.Variable);
        }
        return XNode.DagNode;
      }
      else if (XNode.IsOperator) {
        DagNode leftDagNode = XNode.LeftXNode.GetDagNode(dag);
        DagNode rightDagNode = XNode.RightXNode.GetDagNode(dag);
    
    
        // Given the operator id and both operand nodes, this looks in the DAG's
        // live list to check whether the expression is already there. If it is,
        // the existing node is returned, and that is how a common sub-expression
        // is formed. This is called an internal node.
        XNode.DagNode =
          dag.GetOrCreateDagNodeForOperator(XNode.Operator, leftDagNode, rightDagNode);
    
        return XNode.DagNode;
      }
    }
    

    So that is one way of looking at it: a basic walk of the tree that adds and refers to DAG nodes as it goes. The root of the DAG is whatever DagNode the root of the tree returns.

    Obviously the example procedure can be broken up into smaller parts or made as sub-classes with virtual functions.
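
    For readers who want something runnable, here is a minimal Python sketch of the same idea. It is not the pseudocode above made literal: the names `Node`, `DagBuilder`, `var`, `op` and `assign` are hypothetical, and instead of removing stale sub-expressions from a live list it keys each expression on the identity of its operand nodes, so an expression built on an old value of a variable simply stops matching once the variable is reassigned.

```python
from dataclasses import dataclass
from typing import Dict, Optional, Tuple


@dataclass(frozen=True)
class Node:
    """One DAG node: a leaf (a variable's initial value) or an operator."""
    id: int
    label: str                      # variable name or operator symbol
    left: Optional["Node"] = None
    right: Optional["Node"] = None


class DagBuilder:
    def __init__(self) -> None:
        self._next_id = 0
        # variable name -> node holding its current value
        self._current: Dict[str, Node] = {}
        # (operator, left node id, right node id) -> existing operator node
        self._exprs: Dict[Tuple[str, int, int], Node] = {}

    def _new(self, label: str, left: Optional[Node] = None,
             right: Optional[Node] = None) -> Node:
        node = Node(self._next_id, label, left, right)
        self._next_id += 1
        return node

    def var(self, name: str) -> Node:
        """Node for the current value of `name`; a leaf is created on first use."""
        if name not in self._current:
            self._current[name] = self._new(name)
        return self._current[name]

    def op(self, symbol: str, left: Node, right: Node) -> Node:
        """Reuse the node for this (operator, operands) triple if it exists."""
        key = (symbol, left.id, right.id)
        if key not in self._exprs:
            self._exprs[key] = self._new(symbol, left, right)
        return self._exprs[key]

    def assign(self, name: str, value: Node) -> None:
        """After `name = value`, later uses of `name` refer to `value`'s node."""
        self._current[name] = value


builder = DagBuilder()
# t1 = a + b and t2 = a + b share one node (a common sub-expression).
t1 = builder.op("+", builder.var("a"), builder.var("b"))
t2 = builder.op("+", builder.var("a"), builder.var("b"))
# a = t1 * b changes what "a" means, so the next a + b is a different node.
builder.assign("a", builder.op("*", t1, builder.var("b")))
t3 = builder.op("+", builder.var("a"), builder.var("b"))
```

    Here `t1 is t2` holds, while `t3` is a separate node whose left operand is the `*` node.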

    As for sorting the DAG, you visit each DagNode from left to right: follow a DagNode's left-hand edge, then its right-hand edge. The numbers are assigned bottom-up. When you reach a DagNode with no children, assign it the current sorting number and increment the number; as the recursion unwinds, each parent is numbered after its children, so the numbers are assigned in increasing order.

    The example above only handles trees whose nodes have zero or two children. Some trees have nodes with more than two children, but the logic is the same: instead of computing left and then right, process the children from left to right.

    // Most basic DAG topological ordering example.
    void DagNode::OrderDAG(int* counter) {
      if (this->AlreadyCounted) return;
    
      // Count from left to right
      for (int x = 0; x < this->Children.Count; x++)
        this->Children[x].OrderDAG(counter);
    
      // And finally number the DAG Node here after all
      // the children have been numbered
      this->DAGOrder = *counter;
    
      // Increment the counter so the caller gets a higher number
      *counter = *counter + 1;
    
      // Mark as processed so it will not be counted again
      this->AlreadyCounted = TRUE;
    }
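
    The same numbering can be sketched in runnable Python. This is only an illustration: the function name `order_dag` and the dictionary-of-children representation are assumptions, and the input is assumed to be acyclic (a cycle would recurse forever).

```python
from typing import Dict, List


def order_dag(children: Dict[str, List[str]], root: str) -> Dict[str, int]:
    """Number nodes so that every node gets a higher number than its children."""
    order: Dict[str, int] = {}

    def visit(node: str) -> None:
        if node in order:          # already numbered (shared sub-expression)
            return
        for child in children.get(node, []):   # left to right
            visit(child)
        order[node] = len(order)   # number the node after all its children

    visit(root)
    return order


# (a + b) * (a + b): the "+" node is shared by both operands of "*".
children = {"*": ["+", "+"], "+": ["a", "b"]}
order = order_dag(children, "*")
# order == {"a": 0, "b": 1, "+": 2, "*": 3}
```

    Evaluating the nodes in increasing order guarantees that operands are ready before the operator that uses them.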
    
  • 2020-12-22 16:46

    I assume you already know basic graph terminology; otherwise you should start from the article on graph theory.

    Directed refers to the fact that the edges (connections) have directions. In the diagram, these directions are shown by the arrows. The opposite is an undirected graph, whose edges don't specify directions.

    Acyclic means that, if you start from any node X and follow the edges in their direction, you can never get back to X.
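
    The acyclic property can be checked mechanically. The following is a small sketch, not taken from any article mentioned here: the name `has_cycle` and the adjacency-dictionary format are assumptions. It does a depth-first search and reports a cycle when an edge leads back to a node that is still on the current path.

```python
from typing import Dict, List


def has_cycle(graph: Dict[str, List[str]]) -> bool:
    """Return True if the directed graph contains a cycle."""
    WHITE, GRAY, BLACK = 0, 1, 2   # unvisited, on current path, finished
    nodes = set(graph) | {v for targets in graph.values() for v in targets}
    color = {n: WHITE for n in nodes}

    def visit(n: str) -> bool:
        color[n] = GRAY
        for m in graph.get(n, []):
            if color[m] == GRAY:                 # edge back into the current path
                return True
            if color[m] == WHITE and visit(m):
                return True
        color[n] = BLACK
        return False

    return any(color[n] == WHITE and visit(n) for n in nodes)


diamond = {"a": ["b", "c"], "b": ["d"], "c": ["d"]}   # two paths to d, no cycle
loop = {"a": ["b"], "b": ["c"], "c": ["a"]}           # a -> b -> c -> a
```

    Note that the diamond is still a DAG: reaching `d` by two different routes is fine, as long as no route leads back to where it started.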

    Several applications:

    • Spreadsheets; this is explained in the DAG article.
    • Revision control: if you have a look at the diagram in that page, you will see that the evolution of revision-controlled code is directed (it goes "down", in this diagram) and acyclic (it never goes back "up").
    • Family tree: it's directed (you are your parents' child, not the other way around) and acyclic (your ancestors can never be your descendants).
  • 2020-12-22 16:46

    The name tells you most of what you need to know of its definition: It's a graph where every edge only flows in one direction and once you crawl down an edge your path will never return you to the vertex you just left.

    I can't speak to all the uses (Wikipedia helps there), but for me DAGs are extremely useful when determining dependencies between resources. My game engine for instance represents all loaded resources (materials, textures, shaders, plaintext, parsed json etc) as a single DAG. Example:

    A material is N GL programs, each of which needs two shaders, and each shader needs a plaintext shader source. By representing these resources as a DAG, I can easily query the graph for existing resources to avoid duplicate loads. Say you want several materials to use vertex shaders with the same source code. It is wasteful to reload the source and recompile the shaders for every use when you can just establish a new edge to the existing resource. In this way you can also use the graph to determine if anything depends on a resource at all, and if not, delete it and free its memory. In fact, this happens pretty much automatically.
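
    As a rough illustration of that sharing and cleanup (a sketch only, not how any particular engine does it; `ResourceGraph`, `require`, `release` and the resource names are all hypothetical), each resource records what it depends on and what depends on it:

```python
from typing import Dict, List, Sequence, Set


class ResourceGraph:
    def __init__(self) -> None:
        self.deps: Dict[str, List[str]] = {}   # resource -> what it depends on
        self.users: Dict[str, Set[str]] = {}   # resource -> what depends on it
        self.loads = 0                         # how many real loads happened

    def require(self, name: str, deps: Sequence[str] = ()) -> None:
        """Load `name` unless it is already in the graph, then link its deps."""
        if name in self.deps:
            return                             # reuse the existing node
        self.loads += 1
        self.deps[name] = list(deps)
        self.users[name] = set()
        for dep in deps:
            self.require(dep)
            self.users[dep].add(name)          # new edge to a shared resource

    def release(self, name: str) -> None:
        """Free `name` if nothing uses it, then any deps that became unused."""
        if self.users[name]:
            return
        for dep in self.deps.pop(name):
            self.users[dep].discard(name)
            self.release(dep)
        del self.users[name]


g = ResourceGraph()
g.require("shader_source")
g.require("vertex_shader", ["shader_source"])
g.require("material_a", ["vertex_shader"])
g.require("material_b", ["vertex_shader"])     # shares the compiled shader
loads_after_setup = g.loads                    # 4, not 6

g.release("material_a")
shader_still_loaded = "vertex_shader" in g.deps   # True: material_b uses it
g.release("material_b")
everything_freed = not g.deps                  # True: the release cascaded
```

    Because the graph is acyclic, the cascading `release` always terminates.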

    By extension, DAGs are useful for expressing data processing pipelines. The acyclic nature means you can safely write contextual processing code that follows pointers down the edges from a vertex without ever looping back to a vertex it has already passed through on the way. Visual programming languages like VVVV, Max MSP or Autodesk Maya's node-based interfaces all rely on DAGs.

  • 2020-12-22 16:55

    If you know what trees are in programming, then DAGs in programming are similar, but they allow a node to have more than one parent. This can be handy when you want to let a node be clumped under more than just a single parent, yet not have the problem of a knotted mess of a general graph with cycles. You can still navigate a DAG easily, but there are multiple ways to get back to the root (because there can be more than one parent). A single DAG could in general have multiple roots, but in practice it may be better to just stick with one root, like a tree. If you understand single vs. multiple inheritance in OOP, then you know tree vs. DAG. I already answered this here.
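
    To make the inheritance analogy concrete, here is a tiny Python illustration (the class names are made up). With multiple inheritance the class hierarchy is a DAG rather than a tree: `Both` has two parents, and both of them lead back to `Base`.

```python
class Base:
    pass


class Left(Base):
    pass


class Right(Base):
    pass


class Both(Left, Right):   # two parents: the hierarchy is a DAG, not a tree
    pass


parents = [c.__name__ for c in Both.__bases__]
# Python linearizes the DAG into a method resolution order,
# in which every class appears before its own parents.
mro = [c.__name__ for c in Both.__mro__]
# mro == ["Both", "Left", "Right", "Base", "object"]
```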

  • 2020-12-22 16:57

    Example uses of a directed acyclic graph in programming include more or less anything that represents connectivity and causality.

    For example, suppose you have a computation pipeline that is configurable at runtime, in which computations A, B, C, D, E, F, and G depend on each other: A depends on C, C depends on E and F, B depends on D and E, and D depends on F. This can be represented as a DAG. Once you have the DAG in memory, you can write algorithms to:

    • make sure the computations are evaluated in the correct order (topological sort)
    • if computations can be done in parallel but each computation has a maximum execution time, you can calculate the maximum execution time of the entire set

    among many other things.
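
    Both algorithms can be sketched in a few lines of Python. The dependencies are the ones listed above (G depends on nothing); the durations, the function names and the assumption of unlimited parallelism are made up for illustration. The ordering uses Kahn's algorithm, one common way to do a topological sort.

```python
from collections import deque
from typing import Dict, List

# computation -> the computations it depends on
deps: Dict[str, List[str]] = {
    "A": ["C"], "B": ["D", "E"], "C": ["E", "F"],
    "D": ["F"], "E": [], "F": [], "G": [],
}


def topological_order(deps: Dict[str, List[str]]) -> List[str]:
    """Kahn's algorithm: repeatedly emit a node whose dependencies are all done."""
    remaining = {n: len(ds) for n, ds in deps.items()}
    dependents: Dict[str, List[str]] = {n: [] for n in deps}
    for n, ds in deps.items():
        for d in ds:
            dependents[d].append(n)
    ready = deque(sorted(n for n, count in remaining.items() if count == 0))
    order = []
    while ready:
        n = ready.popleft()
        order.append(n)
        for m in dependents[n]:
            remaining[m] -= 1
            if remaining[m] == 0:
                ready.append(m)
    if len(order) != len(deps):
        raise ValueError("dependency cycle: not a DAG")
    return order


def total_time(deps: Dict[str, List[str]], duration: Dict[str, int]) -> int:
    """Finish time of the whole set if independent computations run in parallel."""
    finish: Dict[str, int] = {}
    for n in topological_order(deps):
        finish[n] = duration[n] + max((finish[d] for d in deps[n]), default=0)
    return max(finish.values())


order = topological_order(deps)
# Hypothetical maximum execution times per computation.
duration = {"A": 2, "B": 1, "C": 3, "D": 4, "E": 1, "F": 2, "G": 5}
worst_case = total_time(deps, duration)   # 7, via F -> C -> A or F -> D -> B
```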

    Outside the realm of application programming, any decent automated build tool (make, ant, scons, etc.) will use DAGs to ensure proper build order of the components of a program.

  • 2020-12-22 16:59

    Graphs, of all sorts, are used in programming to model various different real-world relationships. For example, a social network is often represented by a graph (cyclic in this case). Likewise, network topologies, family trees, airline routes, ...
