melp.nl


Flow charts in code: enter graphviz and the "dot" language

If you're like me, you like GUIs as long as they don't push you in a direction other than your train of thought. Whenever a tool distracts you from the task at hand, you get annoyed. Stuff like "why am I searching for such obvious functionality", or "why didn't they think of making the clickable area a bit bigger", or simply "aaaarggghhh, it crashed on me again".

I like typing code. It says what it is. I even considered writing graphs such as flow charts or database models directly in SVG once, but that's a bridge too far. What is pretty helpful, though, is Graphviz and its "dot" language.

In essence, DOT[1] is a declarative language in which you express nodes and their relations in a graph. You can label the nodes and their edges (links between nodes) and you have an array of styling and shaping tools at hand. I have never drawn a flow chart for some documenting work more quickly than with this tool. So I decided I'd use it for my database model too, and behold, it helped me focus on just the content and logic instead of any other crap.

I'll focus just on the flow chart here, but I hope it'll get you enthusiastic to apply it to other types of graphs.

Installing graphviz

Graphviz is oooold. It should be available in your distro's package repository, or you can download binaries from the graphviz website.

The tools

Graphviz comes with a set of layout engines, each with its own command line tool. I have only really used dot so far and tried a bit of fdp and neato, but as long as I don't understand how they differ, I can't tell you anything useful about them.

So, you'll only need dot for now.

How to get started

There are two basic structures you can use, a graph and a digraph, which stand for an undirected graph and a directed graph respectively. The difference is that in a directed graph every edge has a direction, i.e., you'd speak of a "parent"-"child" type of relationship. A regular (undirected) graph is just nodes that are somehow connected, but there is no direction to their relationship. For example, synonyms are synonyms either way.

In practice, most of the graphs or charts you'll do are directed graphs. For now it suffices to say that a directed graph connects nodes by arrows (->) and regular graphs by a line (--).
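
For comparison, here is a minimal sketch of what an undirected graph could look like, using the synonyms example from above. The words are just placeholders picked for illustration:

#!dot
graph {
    // undirected: "big" is a synonym of "large" and vice versa
    big -- large;
    large -- huge;
    huge -- big;
}

Note that the two edge operators don't mix: as far as I know, dot reports a syntax error if you use -> inside a graph or -- inside a digraph.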

My first graph

#!dot
digraph {
    A -> B;
}

Using the dot utility, you can render the graph as an image or many other output formats:

#!shell
dot -Tpng -o graph.png graph.dot


Here we declare a relationship between the parent (A) and the child (B). By declaring the relationship, we also declare the nodes, so there is no need to list A and B separately. If we render this graph, we see two nodes connected by an arrow from A to B.

We can continue declaring relationships and nodes like this, in this declarative form:

#!dot
digraph {
    A -> B -> C -> D;
    C -> E -> F;
    F -> B;
}

As you can see, it is pretty darn easy. I doubt you could have drawn this faster by hand, and on top of that you get the advantages of plain text: you can easily edit the graph structure, put it under version control, insert comments, etc.
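
To illustrate the point about comments: DOT accepts C-style /* */ and // comments. A small sketch of the same graph with annotations (the comment texts are made up for illustration):

#!dot
digraph {
    /* main path through the graph */
    A -> B -> C -> D;

    // side branch that starts at C
    C -> E -> F;

    // F loops back to B
    F -> B;
}

Since this is just text, a diff in your version control system shows exactly which relationship changed.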


How to start writing a flow chart

Flow charts are a typical good example of where Graphviz shines. You'll usually only need a few different shapes, the layout of the resulting graph isn't overly important, and specifying relationships between the different nodes really is all you need.

#!dot
digraph {
    label="How to make sure 'input' is valid";

    // terminators: rounded boxes
    start[shape="box", style="rounded"];
    end[shape="box", style="rounded"];
    // decision: diamond
    if_valid[shape="diamond", style=""];
    // input/output: parallelograms
    message[shape="parallelogram", style=""];
    input[shape="parallelogram", style=""];

    // the flow itself
    start -> input;
    input -> if_valid;
    if_valid -> message[label="no"];
    if_valid -> end[label="yes"];
    message -> input;
}


Since every mention of a node is declarative, you can mention the same node more than once and easily put the labels in a different section than the structure of your graph:

#!dot
digraph {
    label="How to make sure 'input' is valid"

    start[shape="box", style=rounded];
    end[shape="box", style=rounded];
    if_valid[shape="diamond", style=""];
    message[shape="parallelogram", style=""]
    input[shape="parallelogram", style=""]

    start -> input;
    input -> if_valid;
    if_valid -> message[label="no"];
    if_valid -> end[label="yes"];
    message -> input;

    if_valid[label="Is input\nvalid?"]
    message[label="Show\nmessage"]
    input[label="Prompt\nfor input"]
}


Moreover, you can use the node keyword to set default attributes for all nodes declared after it, which lets you style sets of nodes in one go:

#!dot
digraph {
    label="How to make sure 'input' is valid";

    node[shape="box", style="rounded"]
       start; end;
    node[shape="parallelogram", style=""]
       message; input;
    node[shape="diamond", style=""]
       if_valid;

    start -> input;
    input -> if_valid;
    if_valid -> message[label="no"];
    if_valid -> end[label="yes"];
    message -> input;     

    if_valid[label="Is input\nvalid?"]
    message[label="Show\nmessage"]
    input[label="Prompt\nfor input"]
}
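
The same trick works for edges: the edge keyword sets default attributes for the edges declared after it. A minimal sketch on a trimmed-down version of the chart, where the font size and colors are arbitrary values picked for illustration:

#!dot
digraph {
    node[shape="box", style="rounded"]
       start; end;
    node[shape="diamond", style=""]
       if_valid;

    // defaults for every edge declared from here on
    edge[fontsize=10, color="gray40"];

    start -> if_valid;
    if_valid -> end[label="yes"];
    // an attribute set on a single edge overrides the default
    if_valid -> start[label="no", color="red"];
}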


This can add so much productivity to drawing graphs that it is silly not to use it. If you're not happy about the layout, you can give nodes the same rank, which makes sure they are placed next to each other:

#!dot
digraph {
    label="How to make sure 'input' is valid";

    node[shape="box", style="rounded"]
       start; end;
    node[shape="parallelogram", style=""]
       message; input;
    node[shape="diamond", style=""]
       if_valid;

    start -> input;
    input -> if_valid;
    if_valid -> message[label="no"];
    if_valid -> end[label="yes"];
    message -> input;     

    {rank=same; message input}
}
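
Another layout knob that may be worth knowing is the rankdir graph attribute, which changes the direction in which the ranks are laid out. A sketch of the same chart flowing from left to right instead of top to bottom; with rankdir=LR, nodes of the same rank end up above each other rather than side by side:

#!dot
digraph {
    label="How to make sure 'input' is valid";
    // lay out ranks from left to right instead of top to bottom
    rankdir=LR;

    node[shape="box", style="rounded"]
       start; end;
    node[shape="parallelogram", style=""]
       message; input;
    node[shape="diamond", style=""]
       if_valid;

    start -> input;
    input -> if_valid;
    if_valid -> message[label="no"];
    if_valid -> end[label="yes"];
    message -> input;

    {rank=same; message input}
}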


But if I'm honest, you shouldn't bother too much. If you want the layout to be more presentable, just render to one of the vector-based output formats and tweak it yourself in any drawing program out there, or even better: output SVG and write an XSL transformation to do that for you ;). The essence of it all is that you are productive in documenting something, not in making beautifully laid out graphs. Focus on that rather than on having people "ooh" and "aah" over your handiwork.

Summarizing

Keep your focus on the structure and decisions in your chart, instead of on how stuff should be placed and laid out. You should be worrying about the concept behind the chart instead of the chart itself. What better way to do that than to concern yourself only with the nodes, their relations and their type?

The key is that you are documenting domain logic, which is expressed in terms of logic rather than graphics. You have many possibilities here: for example, generate these charts from static code analysis, generate model relationships from your domain model, class diagrams, etc. I'd say the possibilities are endless, but that seems tautological.

Have fun, and let me know if you have any tips regarding graphviz or the DOT language!


  1. Don't worry, you don't need to be able to read the BNF syntax description. There are more docs out there.




You're looking at a very minimalistic archived version of this website. I wish to preserve the content, but I no longer wish to maintain WordPress, nor handle the abuse that comes with that.