rospack/2008-09-30 Code Review

Package Developer:

Examiner:Rob Wheeler

Present at code review:

Notes from examiner

Documentation

API
- Not intended for external consumption
ROS API
- rospack does not expose a ROS API
command-line
- $/!\$ The ill-named --profile command is not documented
- The rest of the documentation looks fine
tutorials / examples
- Adequate

Naming

$/!\$ This code pre-dated the WG naming convention document, and thus does not follow our new naming conventions. That being said, the code doesn't appear to follow a self-consistent naming style either. For example:
- Most function names are underscored (cmd_deps_manifests), while others are not (cmd_depsindent)
- Some private member variables have a leading underscore (_deps), others do not (deps_calculated)
- The code is not consistent in its use of user-defined types (e.g. when is vector<Package *> used instead of VecPkg?)

Source Code

* Choose your containers wisely

When choosing a container, consider:
Run-time efficiency
Memory efficiency
Ease of use and maintainability

The main data structure of rospack is a collection of packages. The two operations typically performed on the collection is to lookup a package by name and to insert a package into the collection.

When the filesystem is crawled to find packages it checks to see if the package name is already in use, and if not it inserts the package. This results in n lookups and n insertions.

Once the filesystem has been successfully crawled, the results can be stored in a cache. To load the cache, we simply do n insertions.

The most common operation is find. Here is how some different containers perform:

container	insertion	lookup	crawl
vector (unordered)	O(1)	O(n)	nO(1) + nO(n)
map	O(log n)	O(log n)	nO(log n) + nO(log n)
unordered_map (hash table)	O(1)	O(1)	nO(1)+nO(1)

Also, consider choices for:

Priority queue
Deque

* Containers of pointers to objects are clumsy and error-prone

Containers will call the destructor for their contents when they are cleared or destroyed before freeing the container memory itself. However, builtin types like pointers do not have destructors. Therefore, it becomes necessary to write code like:

ROSPack::~ROSPack()
{
  for (VecPkg::iterator p = Package::pkgs.begin();
       p != Package::pkgs.end(); ++p)
    delete (*p);
  Package::pkgs.clear();
}

And memory leaks are easy to code. For example,

void ROSPack::crawl_for_packages(bool force_crawl)
{
  Package::pkgs.clear();

leaks all of the packages read previously from the cache. A good solution to this problem is to use smart pointers.

* If it feels dirty when you write it, then it probably is

The collection of all packages in the system is stored as a static public member of the Package class. It is initialized by the ROSPack class, used exclusively by ROSPack member functions, and destroyed in the ROSPack destructor. In other words, it probably should be a private member of the ROSPack class.

* When in Rome...

The command-line parsing still feels unconventional to me. Other programs with a similar flavor have the structure:
```
program command [options] [package]
```
Examples include svn and openssl. However, rospack has the following syntax:
```
rospack [options] command [package]
```
Options such as --lang and --deps-only are accepted by all commands, whether it makes sense or not.

* Don't reinvent the wheel

For example,

  string token("\n");
  for (string::size_type s = cmd.find(token); s != string::npos;
       s = cmd.find(token, s))
  {
    cmd.replace(s,token.length(),string(" "));
  }

Could be written as,

  replace(s.begin(), s.end(), '\n', ' ');

Another example,

  string s(cstr);
  while (1) // every decent C program has a while(1) in it
  {
    int i = s.find(string("${prefix}"));
    if (i < 0)
      break; // no more occurrences
    s.replace(i, string("${prefix}").length(), path);
  }

Could be written as,

  boost::replace_all(s, "${prefix}", path);

* Memory leak. The pointer newp is leaked:

        // Filter out duplicates; first encountered takes precedence
        Package* newp = new Package(child_path);
        // TODO: make this check more efficient
        bool dup = false;
        for(std::vector<Package *>::const_iterator it = Package::pkgs.begin();
            it != Package::pkgs.end();
            it++)
        {
          if((*it)->name == newp->name)
          {
            dup=true;
            break;
          }
        }
        if(dup)
          delete newp;
        else
          Package::pkgs.push_back(new Package(child_path));

* Streaming I/O can be cleaner. Instead of:

    string cache_path(string(ros_root) + "/.rospack_cache");
    FILE *cache = fopen(cache_path.c_str(), "r");
    if (cache) // one last check just in case nutty stuff happened in between
    {
      char linebuf[30000];
      while (!feof(cache))
      {
        linebuf[0] = 0;
        fgets(linebuf, sizeof(linebuf), cache);
        if (!linebuf[0] || linebuf[0] == '#')
            continue;
        char *newline_pos = strchr(linebuf, '\n');
        if (newline_pos)
          *newline_pos = 0;
        Package::pkgs.push_back(new Package(linebuf));
      }
      fclose(cache);
      return; // cache load went OK; we're done here.
    }
  }

try:

  string cache_path(string(getenv("ROS_ROOT")) + "/.rospack_cache");
  string line;
  ifstream cache(cache_path);
  if (cache.is_open())
  {
    while (!cache.eof())
    {
      getline(cache, line);
      if (!line.length() || line[0] == '#')
          continue;
      Package::pkgs.push_back(new Package(line));
    }
    cache.close();
  }

* Do the easy stuff first

Don't bother checking if the cache is good if a crawl is forced:

  if (cache_is_good() && !force_crawl)

It is cheaper to check a single char is '.' than it is to make a system call:

    struct dirent *ent;
    while ((ent = readdir(d)) != NULL)
    {
      struct stat s;
      string child_path = cqe.path + fs_delim + string(ent->d_name);
      if (stat(child_path.c_str(), &s) != 0)
        continue;
      if (!S_ISDIR(s.st_mode))
        continue;
      if (ent->d_name[0] == '.')
        continue; // ignore hidden dirs

* Don't optimized prematurely, but don't pessimize either...

While you shouldn't spend too much time worrying about optimization, especially in a program like this, you shouldn't get in the habit of writing really inefficient code. For example,
```
  vector<string> name_tokens;
  string_split(path, name_tokens, fs_delim);
  name = name_tokens.back();
```
Could be replaced with the simpler, more efficient:
```
  return path.substr(path.rfind('/')+1);
```

* snarfing flags is confusing, error-prone, and difficult to verify. Boost::regex makes things easier. After defining the regular expression:

  string fmt("(?1$2)(?3$5)");
  string fmt_other("");
  boost::regex cflags("(^-I(\\S*\\s*))|"
                      "((?<=\\s)(-I)(\\S*\\s*))");

The calls to snarf get replaced by one liners. For example,

  string cflags_I = boost::regex_replace(source, cflags, fmt, boost::match_default |           
                                                              boost::format_no_copy | 
                                                              boost::format_all);
  string cflags_other = boost::regex_replace(source, cflags, fmt_other);

* Crawling for packages seems embarrassingly recursive

* Code to produce --profile is unnecessarily complicated:

limited-size priority queue vs. priority queue
vector vs. priority queue

* Are missing XML elements handled properly?

* Calling exit() makes valgrind unhappy if destructors need to be called.

Manifest / Build System

build
- $/!\$ make clean does not blow away LIBOBJ (i.e. rospack.o)
dependencies
manifest data

Testing

all tests pass
$/!\$ unit tests
- The unit tests provide approximately 90% code coverage. The biggest chunk of the untested code is related to the untested --profile command. Most of the remaining uncovered 10% deals with exceptional situations, such as file system errors or malformed XML, however, there are a few remaining common use cases that could be covered:
  - cycle detection in descendants() [rospack.cpp:242]
  - using cached descendants in descendants() [rospack.cpp:246]
  - duplicate package name found in descendants() [rospack.cpp:275]
  - OS specific export flags [rospack.cpp:352]
  - newline replacement of backtick command [rospack.cpp:388]
  - backspace escaped space [rospack.cpp:623 & 663)
  - legacy export is not tested, is it time to remove the code? [rospack.cpp:867]
  - the help command is not tested [rospack.cpp:894]
  - Discarding a package from the lang list named ros* that isn't a language [rospack.cpp:924]
  The exceptional situations not covered are:
  - bad XML not tested in direct_deps() [rospack.cpp:310]
  - popen failed [rospack.cpp:396]
  - error exit status of popen [rospack.cpp:408]
  - empty manifest file [rospack.cpp:444]
  - cache file not writable [rospack.cpp:1133]

Conclusions

This section will be filled out during the group code review meeting, and will include

$/!\$ Any action items that need to be taken before clearing
Change in official package status