Making Sense of Erlang’s Event Tracer

PostsMaking Sense of Erlang’s Event Tracer


Let’s face it people, while Erlang is wonderful in a number of ways documentation is not one of them.  I mean, it’s there, and it’s better than some other projects.  However, there are incredibly useful things staring you in the face that are just impenetrably difficult.  The et module is one of those.

Fear not, gentle reader.  After more time than I’d care to admit, I’ve managed to figure out roughly how the pieces fit together.  As usual, they exhibit the combination of weirdness and inspiration that have driven us to embrace Erlang.  Without futher ado, let me take a stab at laying out exactly how to actually use it.
Four Modules
The event tracer framework is made up of four modules:
  • et
  • et_collector
  • et_viewer
  • et_selector

In addition, you’ll probably want to familiarize yourself with the dbg module and possibly seq_trace module as well.

The Event Tracer Interface
This is perhaps the most confusing module.  It contains a function that you call to report arbitrary events to the tracing interface.  The method is named report_event.  There is also a humorous alias named phone_home.  Get it?  et:phone_home?  Yeah, it makes me sad, too.
Ostensibly, they’re supposed to be called with options something like:

report_event(85,from,to,message,extra_stuff)

The number (in this case 85) is an integer from 1 to 100 that specifies the “detail level” of the message.  The higher the number, the more important it is.  This provides a crude form of priority filtering.  Avoid using 100, since it seems to disagree with being displayed in the viewer.
The from, to, and message variables are exactly what they sound like.  From and To are visualized as “lifelines”, with the message passing from one to the other.  If from and to are the same value, then it is displayed next to the lifeline as an “action”.  The extra_stuff value is simply data that you can attach that will be displayed with someone actually clicks on the action or message.
In a perfect world, this would be enough to start to get your hands direty.  We do not live in that perfect world.  Oddly enough, these functions do absolutely nothing.  Let me repeat that.  They do nothing, except return an ominous atom hopefully_traced.  This confused me at first, and let me down the long rabbit hole that is the event tracer.
Their purpose is to be traced.  Rather than having them do something, they’re just there so that the system can notice when you call them (via tracing) and use the data accordingly.  While this is a great idea it has three faults.  The first is that it makes the process rather magical.  The second is that Erlang tracing is a seething pile of pain that involves reasonably complex knowledge of clever ports, tracing return formats, and specialized tracing MatchSpecs (which are really their own special kind of hell).  The third problem is that this conspires to make an incredibly useful and reasonably flashy feature inaccessible to users.
The Collector and Viewer
These two pieces work in concert.  Basically, the collector receives trace events and processes them.  The viewer interrogates the collector and displays a relatively nice, interactive representation of them.  As usual, it’s in TK, which is a shame.
You might wonder why these aren’t just one module.  It turns out that, in typical Erlang style, the collector is a generic full-fledged framework that allows processes to “subscribe” to the events that it collects.  This would probably be really useful if there were any other subscribers.  Someday I may write one to help debug some code, but for now just trust that it really is a pretty interesting architecture.
You also have the advantage that the viewer creates a collector for you.  With a few options and some debugging settings you can start collecting events.
The Selector
This is perhaps the most useful and frustrating module in the entirety of the et suite.  This is mostly because it isn’t really mentioned in the User’s Guide.  It turns out that the collector needs “filters” to convert the raw trace data into “events” that it can display.  The et_selector module provides the default filter and some API calls to manage the filter pattern.  Due to the architecture of the collector, this module is quite a bit of a bear.
Effectively, it’s a mishmash of functions that achieve the following:
  • Convert Any Trace Message Into An Appropriate Event
  • Magically Notice Traces of the et Module and Make Appropriate Events
  • Carefully Prevent Translating A Message Twice
  • Manage A “Trace Pattern”
It turns out the there are a few strange interactions.  First of all, you still have to tell the system to trace the right things.  Secondly, without the correct incantation you will get infuriatingly strange “trace” events instead of the nice events you’re reporting with et:report_event/5.  Finally, the documentation doesn’t do a decent job of telling you what the trace pattern even is.
How To Put It Together
It turns out that the collector automatically registers itself to listen for debugging events, so all you have to do is enable them.
For those people who want to do general tracing, consult the dbg module on how to trace whatever you’re interested in and let it work its magic.  In my case, I just wanted et:report_event/5 to work, so that’s what I’ll illustrate here.  I did the following:
  1. Create A Collector
  2. Create A Viewer (this can do step #1 for you)
  3. Turn On and Pare Down Debugging
The following module achieves this.
-module(trace_test).

-export([test/0]).

test() ->
  et_viewer:start([
    {title,”Coffee Order”},
    {trace_global,true},
    {trace_pattern,{et,max}},
    {max_actors,10}
  ]),
  dbg:p(all,call),
  dbg:tpl(et, report_event, 5, []),
  Drink = {drink,iced_chai_latte},
  Size = {size,grande},
  Milk = {milk,whole},
  Flavor = {flavor,vanilla},
  et:report_event(99,customer,barrista1,place_order,[Drink,Size,Milk,Flavor]),
  et:report_event(80,barrista1,register,enter_order,[Drink,Size,Flavor]),
  et:report_event(80,register,barrista1,give_total,”$5″),
  et:report_event(80,barrista1,barrista1,get_cup,[Drink,Size]),
  et:report_event(80,barrista1,barrista2,give_cup,[]),
  et:report_event(90,barrista1,customer,request_money,”$5″),
  et:report_event(90,customer,barrista1,pay_money,”$5″),
  et:report_event(80,barrista2,barrista2,get_chai_mix,[]),
  et:report_event(80,barrista2,barrista2,add_flavor,[Flavor]),
  et:report_event(80,barrista2,barrista2,add_milk,[Milk]),
  et:report_event(80,barrista2,barrista2,add_ice,[]),
  et:report_event(80,barrista2,barrista2,swirl,[]),
  et:report_event(80,barrista2,customer,give_tasty_beverage,[Drink,Size]),
  ok.
Running through the above, the most important points are:
  • Turn On Global Tracing (it doesn’t work for me if I don’t)
  • Set a Trace Pattern
  • Tell The Debugger to Trace Function Calls
  • Tell It Specifically To Trace The et:report_event/5 Function
The Aftermath
The most vexing part of figuring this out was the trace pattern of {et,max}.  The trace pattern is basically a tuple of a module and a detail level (either an integer or the atom max for full detail).
The specified module flows from your instantiation of the viewer, to the collector that it automatically creates, gets stashed in as the trace pattern, and eventually goes down into the bowels of the selector.  Tracking this down sucked.
This fact is documented basically nowhere.  It also doesn’t make a lot of sense, either.  The module that you specify gets passed down (eventually) into et_selector‘s default filter.  The format of the report_event/5 function call is hardcoded in that filter.  That makes it very hard for me to imagine why you would ever specify another module.  I suppose you could replace the default filter, but it turns out that doing so is pretty obtuse to do even if you wanted to.  At any rate, just pass it through and it works.
That said, I think it was worth it.  If you compile the above example code and execute trace_test:test(), you’ll see something like the following screenshot.  Source code here, screenshot here.
Beautiful.  I think I’m going to get a cup of coffee now.

About Jayson Vantuyl

I live in San Francisco, California; have an awesome son, David; and make machines do (subjectively) interesting things. I'm generally an all around handy fellow.

5 comments

  • Reply Ulf Wiger said: April 18, 2009 12:25 pm

    Impressive work researching all this! :)

    I've long thought that et should be used more, but your post illustrates well why it isn't.

  • Reply darkua said: May 27, 2010 9:20 am

    Thanks a lot, great job!

  • Reply Gianfranco Alongi said: September 7, 2010 8:22 am

    Great page!
    Seriously outperforms the current documentation.

    Cheers

  • Pingback: The Erlang stack | the web dev
  • Reply Yu-ri said: January 28, 2014 2:27 pm

    good post.. kept going from site to site wondering what pieces I'm missing.. thanks for explaining all the "read between the lines" stuff

Leave a Comment

Your email address will not be published. Required fields are marked *