DARPP Distributed Whiteboard design v1.0

CSE515
Distributed Whiteboard Design
Team name: "DARPP"


Membership

Each member (a computer running an implementation of the agreed-upon software) maintains the following state as best it can:

Objects

Member Object
Display Object
The pair of Lamport number and member ID is a unique ID and is totally ordered, so displayObjects can be sorted unambiguously. Lamport numbers usually suffice; any duplicates can be disambiguated by IP address.
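
The ordering described above can be sketched as follows. This is a minimal illustration in Python, not the team's implementation: the class name `DisplayObject` and the fields `lamport`, `member_id`, and `shape` are assumed names, and the member ID is assumed to be a string such as an IP address.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True, order=True)
class DisplayObject:
    # Comparison uses (lamport, member_id) in that order; shape is ignored.
    lamport: int
    member_id: str  # assumed to be a string such as an IP address
    shape: str = field(compare=False, default="")


objs = [
    DisplayObject(3, "10.0.0.2", "circle"),
    DisplayObject(2, "10.0.0.9", "line"),
    DisplayObject(3, "10.0.0.1", "arrow"),  # same Lamport number as the circle
]

# Every member sorting the same set of objects gets the same sequence,
# because ties on the Lamport number are broken by the member ID.
ordered = sorted(objs)
```

Member IDs are compared here as plain strings, which is enough for a consistent tie-break even though it is not numeric IP order.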

Some kinds of displayObjects:
A pointer is easily implemented as a series of moves of some object such as an Arrow or a colored dot.

We thought about whether to ever remove a displayObject from the list. If there is value in being able to replay the whole conversation, keep the unerased version, and put it initially in the original place. If only the final state of the board is desired, subsumed objects could be omitted from the list, as long as retransmitted zombies also get omitted. Unless the list becomes huge, we do not anticipate ever needing the optimization of omitting objects that are no longer extant.

Another idea is to put undesired objects in the Trash. This is easily implemented by making the position be off-screen. Emptying the trash would be an optimization that need not be implemented in the first release of the product.

Another thought for Garbage Collection for future releases: If we have the ability to take a snapshot of the board, it need not contain the history, only the final view.

A simple semantics of taking a snapshot is to render the snapshot as a published, immutable object. It can be referred to by name and copied from, but not modified in place.
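
A minimal sketch of these snapshot semantics, under stated assumptions: display objects are represented as `(lamport, member_id, shape)` tuples, and the names `Snapshot`, `take_snapshot`, and `copy_from` are hypothetical, chosen only for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Snapshot:
    """A published, immutable view of the board."""
    name: str
    objects: tuple  # (lamport, member_id, shape) tuples in display order


def take_snapshot(name, display_list):
    # Copy into a tuple so later edits to the live board do not leak in.
    return Snapshot(name, tuple(sorted(display_list)))


def copy_from(snapshot):
    # Copying yields a fresh, editable list; the snapshot itself is untouched.
    return list(snapshot.objects)


board = [(2, "b", "line"), (1, "a", "dot")]
snap = take_snapshot("monday", board)
board.append((3, "a", "circle"))      # the live board keeps changing
working = copy_from(snap)
working.append((4, "b", "arrow"))     # edits go to the copy only
```

Only the final view is stored, which matches the idea that a snapshot need not carry the history.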

Future releases might allow different named conferences, which might or might not be open to further modification.

Protocol


  1. A member wishing to join sends a #Hello message to any known computer/conference address

  2. Response from the other computer is ....

  3. In response to #Welcome (same as #update??), the recipient notes if the update message includes any more recent time-stamp from any members, possibly including mention of members currently unknown, or thought to be flakey.

    The recipient of an #update message also ensures that any displayObjects are included in the local cache (a SortedSet) of displayObjects.

    Any new, higher value of maxLamport is kept.

  4. Afterwards, the joiner says "Hello" to any newly-introduced or recently updated group members (list accumulated from any other messages).

  5. In response to "Hello" at any time, the recipient sends its current state to the sender of the "Hello". It could also serve as a trigger to send an update (=#Welcome?) to anybody you haven't talked with "lately". The algorithm needs to be careful that update loops don't form, but other than that, there is no harm in sending state info to someone.

  6. Send a #disconnect message to all known group members when leaving. Recipients of the message then mark the leaving member #disconnected. It is likely that keeping track of those you have talked to lately will be useful in the future, so don't be hasty to expunge all knowledge of them from your member list.

  7. On message timeout, the destination is marked #flakey and removed from the sender's view, but not from the members list.

  8. Lamport clock is used locally to order objects.

  9. Refresh is available for local redraw from the locally cached displayList.

  10. Re-synchronization is done by requesting an update (=? #Hello message?) from some apparently active member of the local view.

  11. A reason NOT to unify #Hello and #LookAtMe: when a member is #disconnected, they would not want to get woken up just because the conversation continues. If someone explicitly sends #Hello, a #disconnected member should wake up if it is running.

  12. Alternately (additionally?) a member may send a #Welcome (update) message to confederates.

  13. Can be implemented using Java swing and RMI. 3 of us humans like this approach.
  14. If implemented in non-language-specific protocol such as SOAP, can also be implemented in Squeak.
  15. Does Ruby do SOAP?

  16. Drawing, or any other modification of the display, creates a local displayObject, which is either new or modifies a previous object. This local object is given its Lamport number and sent to every active member with a #newDisplayObject: message.

  17. On getting a #newDisplayObject: message, a member should ensure that the displayObject is in the local displayList. THEN it should add itself to the message's memberList (the set of members claiming to have seen the message) and forward it to any active members who are not already on the message's memberList.

  18. Two people may "simultaneously" create display objects that overlap.
    Sorting by Lamport number, with memberID as the secondary key, is a total ordering, so everyone who gets both objects will see them in the same order.
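
The #newDisplayObject: handling in steps 16 and 17 can be sketched as below. This is an in-process illustration under assumptions, not the agreed implementation: peers are called directly instead of over RMI or sockets, display objects are `(lamport, member_id, shape)` tuples, and the names `Member`, `draw`, `on_new_display_object`, and `active` are hypothetical.

```python
class Member:
    """One whiteboard participant; peers are plain objects, not remote hosts."""

    def __init__(self, member_id):
        self.member_id = member_id
        self.max_lamport = 0
        self.display_list = set()  # (lamport, member_id, shape) tuples
        self.active = {}           # member_id -> Member believed to be active

    def draw(self, shape):
        # Step 16: stamp the new object and send it to every active member.
        self.max_lamport += 1
        obj = (self.max_lamport, self.member_id, shape)
        self.display_list.add(obj)
        self._forward(obj, {self.member_id})
        return obj

    def on_new_display_object(self, obj, seen):
        # Keep any higher Lamport value, then make sure the object is cached.
        self.max_lamport = max(self.max_lamport, obj[0])
        self.display_list.add(obj)
        # Step 17: THEN add ourselves to the memberList and forward.
        self._forward(obj, seen | {self.member_id})

    def _forward(self, obj, seen):
        for peer_id, peer in self.active.items():
            if peer_id not in seen:
                peer.on_new_display_object(obj, set(seen))


# a knows only b, b knows a and c, c knows only b.
a, b, c = Member("a"), Member("b"), Member("c")
a.active = {"b": b}
b.active = {"a": a, "c": c}
c.active = {"b": b}

first = a.draw("circle")   # reaches c through b
second = c.draw("line")    # stamped after the circle, reaches a through b
```

Because the memberList grows by one member on every hop, forwarding stops once every reachable member is on it, so no forwarding loop can form in this sketch.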


A diagram showing some message sequences can be seen in the DARPP Block Diagram.


Our design is very good at error recovery: no one ever needs to assume the role of leader. Any other member can usefully update a new or lapsed member.

In case of network separation, any subset could carry on with a semantically plausible result on re-merger: worst case is that drawings occlude each other, which could be remedied by any participant moving things to new places and erasing other things.

Any single member could make drawings completely in isolation, but have them be viewed as colleagues wake up.

In practice, we think that a lot of independent, uncommunicative drawing would result in chaos when disjoint parts of the network merge. Reasonable semantics suggest a fairly easy technical solution: a desirable kind of displayObject is a Group. When objects are in a Group, they can be moved, swapped out (= erased or trashed), or swapped back in as a unit. A person creating an elaborate drawing independently off-line would politely put its objects in a Group.
This feature is not for the first release of the product.
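
A Group of this kind could look roughly like the following. It is only a sketch under assumptions: the names `Shape`, `Group`, `move`, `swap_out`, and `swap_in` are hypothetical, and a `visible` flag stands in for whatever erase or trash mechanism is eventually chosen.

```python
from dataclasses import dataclass, field


@dataclass
class Shape:
    name: str
    x: int
    y: int
    visible: bool = True


@dataclass
class Group:
    """A set of shapes that are moved, hidden, or restored together."""
    members: list = field(default_factory=list)

    def move(self, dx, dy):
        for shape in self.members:
            shape.x += dx
            shape.y += dy

    def swap_out(self):
        # Assumed stand-in for "erased or trashed".
        for shape in self.members:
            shape.visible = False

    def swap_in(self):
        for shape in self.members:
            shape.visible = True


house = Group([Shape("roof", 10, 0), Shape("wall", 10, 10)])
house.move(5, -2)   # the whole drawing shifts as a unit
house.swap_out()    # and can be hidden as a unit
```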

We hassled about more different kinds of messages, and more intricate protocols. We agree on the desire to keep it as simple as possible. We are not fully in agreement on what constitutes simple.

One position is to be liberal about sending state-info since we are careful to assure that redundancies are detected and not entered in duplicate.
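
This position depends on merging being idempotent: receiving the same state twice must leave the local state unchanged. A minimal sketch, assuming a member's state is a dictionary with a set of display objects, a maxLamport integer, and a per-member time-stamp (the keys `display_list`, `max_lamport`, and `members`, and the function name `merge_update`, are hypothetical):

```python
def merge_update(local, update):
    """Fold a received state into the local state; return the new objects."""
    added = update["display_list"] - local["display_list"]
    local["display_list"] |= update["display_list"]  # set union drops duplicates
    local["max_lamport"] = max(local["max_lamport"], update["max_lamport"])
    for member_id, stamp in update["members"].items():
        # Keep only the most recent time-stamp, learning unknown members too.
        if stamp > local["members"].get(member_id, float("-inf")):
            local["members"][member_id] = stamp
    return added


local = {
    "display_list": {(1, "a", "dot")},
    "max_lamport": 1,
    "members": {"a": 100},
}
update = {
    "display_list": {(1, "a", "dot"), (2, "b", "line")},
    "max_lamport": 2,
    "members": {"a": 90, "b": 120},
}

first_pass = merge_update(local, update)    # one new object arrives
second_pass = merge_update(local, update)   # a redundant resend adds nothing
```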

Another position is to make the messages have very simple semantics. A problem is that there are then more different ways to fail, and more coding is needed to implement recovery.

While our initial thought is to use TCP, we have been designing the semantics such that UDP might fit wonderfully. There is really no place that a lost message causes anything worse than a temporarily obsolete view.

We discussed the difference between #disconnected and #flakey. We had originally used the term #dead instead of #flakey, but it is too permanent for possibly transient unreachability. If a member observes that a colleague does not answer, the colleague is marked #flakey in the local {members}, but gossip about the flakiness is not broadcast. We decided that this simplifies our design. A side effect is that transient network partitions might very well go unnoticed, as those machines which can continue will do so.

When a machine sends #disconnect, it presumably is going off-line. Perhaps the hardware is shutting down soon.

A machine which sends #disconnect, but does not terminate the whiteboard server, might well be rejoined at any time by some colleague sending #Hello. We don't want to create a notion that anyone must remember how long the member will be gone.
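
The local bookkeeping for #flakey and #disconnected might be sketched as follows. The names `Status`, `MemberList`, `on_hello`, `on_timeout`, `on_disconnect`, and `view` are assumptions made for illustration; the sketch also assumes that hearing #Hello from a member marks it active again.

```python
from enum import Enum


class Status(Enum):
    ACTIVE = "active"
    FLAKEY = "flakey"
    DISCONNECTED = "disconnected"


class MemberList:
    """Purely local knowledge about colleagues; none of it is gossiped."""

    def __init__(self):
        self.status = {}  # member_id -> Status

    def on_hello(self, member_id):
        # Assumption: any #Hello from a member makes it active again.
        self.status[member_id] = Status.ACTIVE

    def on_timeout(self, member_id):
        # Unreachable for now; kept in the list because it may come back.
        self.status[member_id] = Status.FLAKEY

    def on_disconnect(self, member_id):
        # The member announced it is leaving; still not expunged.
        self.status[member_id] = Status.DISCONNECTED

    def view(self):
        # Only active members receive new display objects.
        return sorted(m for m, s in self.status.items() if s is Status.ACTIVE)


members = MemberList()
for name in ("a", "b", "c"):
    members.on_hello(name)
members.on_timeout("b")
members.on_disconnect("c")
view_after_trouble = members.view()
members.on_hello("c")   # c rejoins without anyone tracking how long it was gone
view_after_rejoin = members.view()
```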

User interface:

The UI includes at least the following:

A plausible User Interface might look like this User Interface Example: DARPP_UI.pdf