API review

Proposer: your name here

Present at review:

Blaise
Josh
James

Question / concerns / comments

Enter your thoughts on the API and any questions / concerns you have here. Please sign your name. Anything you want to address in the API review should be marked down here before the start of the meeting.

Josh

The example for using / still has a \ in it.
The whole doxygen frontpage should be in the wiki (this is more doc-review, I know)
Aggregator
- Does it need the publishData() method? Can you do it automatically with a ros::Timer instead?
- diagnostic callback is public
- init is public -- is this something the user needs to call, or something that is called internally in the constructor?
StatusItem
- update has no way of telling you if it succeeded or not
- level being a uint8 is a bit strange. I understand you want it to match the message, but it'd be nice to have that be an enum. You could do, for example:

enum DiagnosticLevel
{
  Level_OK = diagnostic_msgs::DiagnosticStatus::OK,
  Level_Warn = diagnostic_msgs::DiagnosticStatus::WARN,
  Level_Error = diagnostic_msgs::DiagnosticStatus::ERROR,
};

update and constructor take bare pointers to the messages, is that correct?

GenericAnalyzer
- Why init vs. initOther? The description of initOther doesn't make it clear when one should be used vs. the other, or if initOther should ever really be used
- In fact, I assume GenericAnalyzer is not supposed to be instantiated directly at all, just from yaml, or automatically by the Aggregator? If so, that should be mentioned in the class description.

Comments to Josh

GenericAnalyzer can analyze messages that match the settings of its YAML file, or it can analyze the remainder. The aggregator node automatically makes an "Other" GenericAnalyzer to analyze any messages that haven't been looked at already.

StatusItem

The item is made from a DiagnosticStatus pointer, so that should be correct

Enum for Level

Should do that

Public v. Private

Should do that, too

Blaise

This is my first look at this. I basically wrote down all the whys that came to mind as I looked at this, so there are a lot of them. I'm not saying we have to change these, they are just the things I want to understand and discuss.

"remove_prefix" instead of "start_name"?
Not sure I like the term of prefix for the categories that diagnostics are being sorted into. I would prefer something like category.
In many cases (all the ones I get my hands on) you can use the prefix to match all the diagnostics coming from a node. Would that be better that explicitely naming them all for each component in the robot?
The check for argc in aggregator_node should happen after ros is initialized (ros will remove some of the parameters).
Why did we go for C++? Speed? (Just curious...)

In the component analyzer example, why are we giving each component a name that isn't getting used? (tilt_hk, imu, base_hk) Also, does it make sense for that name to be at the same level as keywords like "type", "prefix" and "timeout"? It would make more sense to me to have a list of components like this:

sensors:
  type: ComponentAnalyzer
  prefix: Sensors
  timeout: 5.0
  components:
    - name: Base Hokuyo
      start_name: base_hokuyo_node
      fields: [
        'base_hokuyo_node: Connection Status',
        'base_hokuyo_node: Frequency Status']
    - name: Tilt Hokuyo
      start_name: tilt_hokuyo_node
      fields: [
        'tilt_hokuyo_node: Connection Status',
        'tilt_hokuyo_node: Frequency Status']
    - name: IMU
      start_name: imu_node
      fields: [
        'imu_node: Calibration Status',
        'imu_node: Frequency Status',
        'imu_node: IMU Status']

I would find regular expressions preferable over having 'contains', 'startswith', 'name', etc. I would at least suggest adding the option of using a 'regex'.
Same comment for top-level names. It isn't clear to me that it is good to assume that your whole private namespace is full of diagnostic_analyzers. If you want to have them as separate parameters, I would push them into ~analyzers or something like that.
I don't have a good sense for how we are going to be using this in the multi-robot case. Should diagnostics be going to "/diagnostics" or to "diagnostics"? (This is more a question for me.)
Why is the "start_prefix" a command line argument rather than a parameter?
Does the GenericAnalyzer do anything the ComponentAnalyzer doesn't do?
Is there any support for including files. It seems that could be useful if you don't want to have to work with a monolithic yaml file for the whole robot.
Why the plugin_list.cpp file? That information should usually lie with each plugin. As it is, you get the impression that all plugins will have to be registered in that file.
In case I don't get to dig into the code enough: What are the semantics of receiving diagnostics and then republishing them all at once later?
How do I specify an empty initial prefix? Currently if I put just / or "", the tree has a root that is the empty string.
I can't help wondering if a lot of this could be avoided by specifying titles for the diagnostics. At the very least specifying titles would allow a large part of the diagnostics to be aggregated automatically without needing to be listed explicitly in the aggregator. (Though perhaps we want them all listed explicitly.)
The question mark icon (when you stop getting diagnostics) seems too friendly to me. I would want something more worrying.
Would a way of resetting staleness be useful?
The class descriptions for diagnostic_aggregator should not contain references to pr2 as this is non-pr2 specific.
Would it be cleaner to have a derived class of GenericAnalyzer to handle the "Other" case?
Why not make maintenance easier and put your analyzers into a boost::shared_ptr?
Currently it looks like if you have a diagnostic that blips to a bad state and then returns to normal the blip may not show up at all in the robot monitor. Is this a good thing?
Are there cases where diagnostics from a given source would be matched by a different analyzer depending on circumstances? If not, publishData could be greatly sped up by caching which analyzer was used for a given diagnostic.
Likewise, is it intended for analyzers to do more than just reclassify where a message is displayed? If not then they only need to look at messages from a given source the first time they come in.

Meeting agenda

To be filled out by proposer based on comments gathered during API review period

Conclusion

$/!\$ Prefix -> Path, First Prefix -> Base Path (X)
$/!\$ Combine generic/component analyzers (X - component removed)
$/!\$ start_name -> remove_prefix (X)
$/!\$ Adopt Blaise's yaml changes (dictionary->list) (X - component removed, other YAML changes in)
$/!\$ Push analyzer parameters into ~analyzers (X)
$/!\$ Add regex supplement for matching analyzers (X)
$/!\$ Change start_prefix to be a parameter called base_path (X)
$/!\$ Analyzers need access to every message from last timestep -- give access to all messages since last update. (X)
$/!\$ Remove initOther -> create OtherAnalyzer (X)
$/!\$ Make GenericAnalyzer possible/easy to subclass (X - GenericAnalyzerBase)
$/!\$ Empty base path should not create a tree item in the robot_monitor (X)
$/!\$ Remove references to pr2 from doxygen docs (X - component removed)
$/!\$ Detailed class descriptions need to be fleshed out (Doc review, but they've been updated)
$/!\$ Pull out all analysis, status level aggregating, etc. into a library so it can be used by any analyzer.
$/!\$ StatusItem::getUpdateInterval() -> getLastUpdateTime(), return a ros::Time (X)
$/!\$ Analyzer init() returning false should not crash aggregator, instead continue and report with a diagnostic (X)
$/!\$ StatusItem should warn when time goes backwards (X)
$/!\$ Simplified analyzer API: (X)

bool match(name of item) const; // return whether or not analyze() should be called on this item
bool analyze(status item);  // happens as messages arrive on individual StatusItems. Return true to indicate that you have handled this message (rather than just observed it).
vector<> report(); // happens immediately before publish()

$/!\$ Compononent analyzer -- should not need to list individual fields. Instead: (X - removed, changed merged to generic)

base_hk:
    name: Base Hokuyo
    start_name: base_hokuyo_node
    fields: 2

/X\ Need a way for a node to say that a diagnostic should be removed (node shutting down, etc.). Schedule meeting to discuss.