Stack Proposal
Proposers: Daniel Stonier, Jihoon Lee
Add comments and raise concerns here before October 1st 2012, with your name in brackets [] after each comment.
Definitions
So we're all on the same page:
Ros API : a single topic/service/action or a collection of such for a node or launched system.
Gateway : the object connecting local and remote ros systems.
Public Interface : the set of ros api offered by a gateway for others to use.
Flipping : the process of pushing local ros api out to a remote system.
Pulling : the process of pulling ros api from a remote gateway’s advertised interface to the local system.
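To make the terms above concrete, here is a minimal sketch of how they could relate in code. The names `RosApi`, `ConnectionType` and `Gateway` and all of their methods are hypothetical, chosen only for illustration. They are not part of any existing ros package, and no real ros master is contacted.

```python
from dataclasses import dataclass, field
from enum import Enum


class ConnectionType(Enum):
    # The ros api types named in this proposal.
    TOPIC = "topic"
    SERVICE = "service"
    ACTION = "action"


@dataclass(frozen=True)
class RosApi:
    """A single topic/service/action, identified by its kind and name."""
    kind: ConnectionType
    name: str


@dataclass
class Gateway:
    """Hypothetical gateway: a public interface plus records of flips and pulls."""
    name: str
    public_interface: set = field(default_factory=set)
    flipped_out: dict = field(default_factory=dict)  # remote name -> set of RosApi
    pulled_in: dict = field(default_factory=dict)    # remote name -> set of RosApi

    def advertise(self, api):
        """Put a local ros api on the public interface."""
        self.public_interface.add(api)

    def flip(self, remote, api):
        """Flipping: the local side decides to send api out to a remote gateway."""
        self.flipped_out.setdefault(remote.name, set()).add(api)

    def pull(self, remote, api):
        """Pulling: only api on the remote's public interface may be pulled."""
        if api not in remote.public_interface:
            raise LookupError("%s is not on the public interface of %s"
                              % (api.name, remote.name))
        self.pulled_in.setdefault(remote.name, set()).add(api)
```

The difference between the two operations is who initiates: a flip is driven by the owner of the api, a pull by the consumer and is limited to what the owner has made public.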
Concept
The gateway will be the public frontend for a ros master. It is intended to act much like a gateway on a local area network, controlling what is exposed and what is forwarded between the local ros master and the outside world (other ros systems). The intention is to generalise this kind of interface beyond the tools that previously existed (foreign_relay, master_sync and fki_multimaster) and to make configuration and usage as simple as possible.
Design
Goals
- Gate_G01 : do not burden non-multimaster systems with unnecessary overhead
- Gate_G02 : do not interface directly with foreign ros masters (put the gateway in between)
- Gate_G03 : support all ros api types (topics, services and actions).
- Gate_G04 : auto-discover other gateways (pre-configured, zeroconf or name server)
- Gate_G05 : provide quality of network connection statistics between systems
- Gate_G06 : control what ros api should be put on a public interface (simplicity and security)
- Gate_G07 : support flipping of local ros api to a remote gateway (control what and where you share)
- Gate_G08 : support pulling of publicly exposed remote ros api from a remote gateway (open sharing)
- Gate_G09 : configure/manage the type of connections created between systems (unreliable, compressed, ...)
- Gate_G10 : peer to peer gateway interactions (true multimaster, not two multimaster)
- Gate_G11 : access control to permit/block requests for flipping/pulling
- Gate_G12 : decouple from higher level components such as the app_manager (re-usable building block).
Decisions
- Gate_D01a : sit alongside, do not extend rosmaster itself
- Gate_D01b : modular components called when necessary (discovery, sync, zeroconf, multicast)
- Gate_D02 : adapter like interface that accepts/makes requests on one side and interacts with the ros master api on the other
- Gate_D03 : leave concepts like bundling of ros api in capabilities (or similar) to higher level components
- Gate_D04 : gateway discovery mechanisms should be optional and varied: by hand (yaml), centralised server (redis), zeroconf or multicast.
- Gate_D05 : assume each gateway represents a single system interface that needs to be monitored. This makes it simpler and is typical for robots, even if there are multiple machines connected internally.
- Gate_D06a : default settings for a gateway should not expose anything.
- Gate_D06b : convenience option to dump all local topics on the public interface.
- Gate_D07a : gateways should have the option to block flip requests.
- Gate_D07b : flip requests should have enough details for passing on to the ros master api.
- Gate_D07c : convenience option to flip all local interfaces to a remote.
- Gate_D07d : convenience option to flip all public interfaces to a remote.
- Gate_D09a : specify transport type and hints at the system level (e.g. reliable/unreliable configuration via roslaunch).
- Gate_D09b : more complete transport types (multi-language unreliable, compression etc).
- Gate_D12 : do not make decisions about what to expose, where and how - gateways should be just a mechanism that can be controlled by policies dictated by higher level components.
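Decisions Gate_D06a, Gate_D06b, Gate_D07a and Gate_D12 could fit together roughly as sketched below: the gateway only asks a policy object, and a higher level component supplies that policy. The class `GatewayPolicy`, its methods and the glob style name patterns are all assumptions made for illustration. In particular, blocking every flip by default is one possible reading of Gate_D07a, not something this proposal has settled.

```python
import fnmatch


class GatewayPolicy:
    """Hypothetical policy handed to a gateway by a higher level component."""

    def __init__(self, public_patterns=(), accept_flips_from=()):
        # Both default to empty: nothing is exposed and no flips are accepted.
        self.public_patterns = list(public_patterns)
        self.accept_flips_from = set(accept_flips_from)

    @classmethod
    def expose_all(cls, accept_flips_from=()):
        # Convenience option in the spirit of Gate_D06b.
        return cls(public_patterns=["*"], accept_flips_from=accept_flips_from)

    def is_public(self, api_name):
        """True if the name matches any pattern on the public interface."""
        return any(fnmatch.fnmatchcase(api_name, pattern)
                   for pattern in self.public_patterns)

    def accepts_flip_from(self, remote_gateway):
        """True if flip requests from this remote gateway are permitted."""
        return remote_gateway in self.accept_flips_from


# A default gateway exposes nothing; a configured one exposes only the camera.
default_policy = GatewayPolicy()
camera_policy = GatewayPolicy(public_patterns=["/camera/*"],
                              accept_flips_from=["concert_hub"])
```

Keeping the rules in a separate object means the gateway itself never decides what to expose, which is the point of Gate_D12.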
Other Notes
Gateway Model - some more details here.
First Implementation - some technical steps outlining progress towards a first implementation.
Comments/Issues
Gateways as relays only?
Piyush
In the initial implementation, will gateways act as relays? That is, will the gateway subscribe to every topic in the public API and republish it to the foreign gateway?
I had thought about this as well. One problem is that there is currently no way to guarantee unreliable connections. Unreliable transport hints go in at subscriber creation - there is nothing at the publisher end (if an unreliable subscriber requests a connection with a publisher, it will be an unreliable connection if the transport types can do it - i.e. roscpp can do it since it has a udpros implementation, but rospy connections can't). This has some interesting consequences.
- Flipping roscpp publishers out to a remote ros system doesn't need anything extra done, and so doesn't need relays.
- Flipping rospy publishers wouldn't create any unreliable connections, so these could use a relay to convert to a roscpp publisher.
- The resulting connection to a publisher on a remote ros system is at the mercy of the subscriber (we don't have any control over whether the user is connecting a reliable or unreliable subscriber).
- Similarly, a relayed unreliable subscriber connection to a remote system is at the mercy of the kind of publisher (roscpp ok, rospy fail).
Still, at least it would ensure that the gateway has done its part in making sure the connections would be unreliable. I'd like to raise some of these issues on the ros-ng sig [DS].
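The cases in this reply can be written down as a small decision table. This is only a sketch of the reasoning in the comment, not of any ros call: the function names are hypothetical, and the set of udpros capable client libraries is taken directly from the statement above that roscpp has a udpros implementation and rospy does not.

```python
# Client libraries whose publishers can serve an unreliable (udpros)
# connection, as stated in the discussion above.
UDPROS_CAPABLE_PUBLISHERS = {"roscpp"}


def connection_is_unreliable(publisher_lib, subscriber_requests_unreliable):
    """An unreliable link needs both a requesting subscriber and a capable publisher."""
    return (subscriber_requests_unreliable
            and publisher_lib in UDPROS_CAPABLE_PUBLISHERS)


def flip_needs_relay(publisher_lib):
    """A relay is only useful when the publisher cannot serve udpros itself."""
    return publisher_lib not in UDPROS_CAPABLE_PUBLISHERS


cases = [(lib, wants, connection_is_unreliable(lib, wants))
         for lib in ("roscpp", "rospy")
         for wants in (True, False)]
```

Only one of the four combinations yields an unreliable connection, which is why a gateway relaying through roscpp can do its part but cannot guarantee the result.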
Transport Types
Piyush
Gate_G09 will be a difficult problem. Currently you can't configure a connection's type or transport hints from the system level, e.g. from roslaunchers, the way remaps are done. You can't introspect on them either (introspection would allow you to interpose relays if desired). See the discussion on Transports [DS].
The TF Tree
Piyush
Exposing TF is important as it allows out of the box usage of a number of existing ROS applications. Ideally, only a subset of the TF tree should be exposed as part of the public ROS API. This is necessary for ensuring some privacy of data of the local machine, as well as not burdening foreign TF trees with unnecessary transforms.
Good point - Nick said they compressed TF trees. Also we should see what has changed with tf2 and whether it is easier now [DS].
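As a rough illustration of exposing only a subset of the TF tree, the sketch below filters a list of parent/child links against a set of public frames. Transforms are plain `(parent, child)` tuples here rather than real tf messages, and the function name and frame names are made up for the example.

```python
def exposed_transforms(transforms, public_frames):
    """Keep only (parent, child) links whose frames are both public."""
    public = set(public_frames)
    return [(parent, child) for parent, child in transforms
            if parent in public and child in public]


local_tree = [
    ("map", "odom"),
    ("odom", "base_link"),
    ("base_link", "gripper_internal"),  # private detail of the local robot
]
shared = exposed_transforms(local_tree, ["map", "odom", "base_link"])
```

Hiding an intermediate frame would split the exposed tree unless the links around it are composed first, which this sketch does not attempt.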
Clock Synchronization
Piyush
This might be important for things like tf, but I am not sure how crucial it is.
Topic type availability
Jihoon
What if you wish to flip a message type to a remote gateway that knows nothing of that topic type? This will probably fail when registering using the remote's ros master api...
Message types are generally publicly accessible (even for closed source systems). It could be the role of a higher level system to make sure that these are always locally available (with tools to guarantee this) for multimaster participants. Should this in any way be the responsibility of the gateway? At the least, the gateway should provide error handling, i.e. inform the gateway that wishes to flip that the flip failed... [Daniel]
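The error handling suggested here might look something like the following on the receiving gateway: check the type before touching the local ros master and hand a reason back to the flipping gateway. The function name, the `(ok, reason)` return shape and the set of locally known types are assumptions for the sketch. How a real gateway would discover which types are installed is left open.

```python
def check_flip_request(topic_name, msg_type, locally_known_types):
    """Return (ok, reason) so the flipping gateway can be told why a flip failed."""
    if msg_type not in locally_known_types:
        reason = ("cannot accept flip of %s: message type %s is not available locally"
                  % (topic_name, msg_type))
        return (False, reason)
    return (True, "")


known = {"std_msgs/String", "sensor_msgs/Image"}
accepted = check_flip_request("/chatter", "std_msgs/String", known)
rejected = check_flip_request("/arm/state", "my_robot_msgs/ArmState", known)
```

Here `my_robot_msgs/ArmState` stands in for any type the remote side has never installed.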
Message Compatibility
Jihoon
How do we ensure that all participants in a multimaster system resolve message compatibility issues?
They talked about this at roscon. Until now I think they haven't really had an absolute need for it, since message definitions are so publicly accessible. Next gen ros is talking about moving to protobufs, which actually do implement message version compatibility. Given that things might change, it may be better just to wait and see what happens. [Daniel]
Conclusion
Todo
Action items that need to be taken.
Major issues that need to be resolved