Title of Invention

COMMUNICATION NETWORK AND METHOD FOR MANAGING A COMMUNICATION NETWORK

Abstract The present invention relates to a communication network comprising an arrangement of nodes, each of the nodes being a leader node (11-15) or a child node (130) of one of the leader nodes (11-15), each of the leader nodes (11-15) having unidirectional links to a number of other leader nodes (11-15), the number of other leader nodes (11-15) being a function of n1, n1 being a total number of leader nodes (11-15).
Full Text Description
Communication network and method for managing a communication network
The present invention relates to a communication network and a method for managing the communication network.
In distributed systems, knowledge of the entities which form the system is essential for the sy3tem. For instance in a cluster of workstations employing a distributed load balancing algorithm to serve client requests, each workstation needs to be aware of all other workstations in the cluster and the load on each one of them. If a workstation becomes unavailable in the event of a failure, the remaining workstations need to quickly identify the failure and inform their respective load balancing algorithms of the unavailability of the workstation, ensuring optimal load distribution. In highly dynamic environments where entities join and leave the group frequently and the communication network fails sporadically, maintaining group information becomes a non-trivial issue.
To maintain this group membership information, protocols use different graph topologies like rings, stars and trees to represent the group. A group member is represented as a vertex or node in the graph and the logical communication link between members as an edge between nodes.
One solution is periodic broadcast or multicast by every
node to all other nodes in the network. The amount of messages that gets generated with the size of the cluster

is multi-fold (0(N2) - where N is the size of network). This ensures fast convergence time in case of state changes in the network but is very un-reliable with large number of interconnected systems. The higher the number of messages generated in the network, the more is the chance of a message loss. This will raise false alarms among few nodes and the group membership protocol will work inappropriately.
In another method, each node sends messages to a central node and the central node will inform to all other nodes using some broadcast mechanism. This will make the single node to be a bottleneck and may have many delays while processing large number of messages from ail the nodes in the network.
There is another method in which few nodes act as membership servers, and when any of the other nodes in the system needs membership information, they send a request to one of these servers. To maintain the state information across membership servers becomes a tedious task and may tend to perform inefficiently when the network size grows.
It is an object of the present invention to provide an efficient network topology.
This object is achieved by a communication network,
- comprising an arrangement of nodes, each of the nodes being a leader node or a child node of one of the leader nodes,

- each of the leader nodes having unidirectional
links to a number of other leader nodes, the number
of other leader nodes being a function of n1, n1 being
a total number of leader nodes.
This object is achieved by a method for managing a communication network, comprising:
- arranging nodes of the network, each of the nodes
being a leader node or a child node of one of the
leader nodes, depending on a total number of nodes;
- each of the leader nodes having unidirectional
links to a number of other leader nodes, the number
of other leader node3 being a function of n1 n1 being
a total number of leader nodes.
The underlying idea of the present invention is to provide a network having the nodes arranged as leader nodes and child nodes of the leader nodes. A leader node is linked to only a number of other leader nodes in the network which is derived from the total number of leader nodes. For peer detection, child nodes periodically send messages to their leader node, which transmit these messages to other leaoers nodes linked to it. The leader nodes transmit the received message to their child nodes This results in a uniform view of the entire network.
In a preferred embodiment, a maximum number of child nodes of each leader node equals n].-l The present network arranges the node in a balanced state, such that the number of child nodes along with their leader node equals total number of leader nodes in the network. Hence if gCb is the number of leader nodes, and ncb is the number of

child nodes of each leader node along with the leader node, then total number of nodes in the network (N) would be gcb * ncb. In a completely balanced state, gch = ncb Therefore gcb = ncb = VN.
In a further preferred embodiment, the leader nodes maintain skip lists for the unidirectional links to the other leader nodes in the network. Skip lists are based on the concept of linked lists. In addition to pointers provided in linked lists to the next node, skip lists have pointers to further nodes also. Such number of further nodes can be calculated based on the total number of nodes.
In another embodiment, a new node is added to the network as one of, a child node of one of the leader nodes, depending on nj, a respective number of child nodes of the leader node, and a maximum size of the network; or a new leader node in the network when each leader node in the network has a number of child nodes equal to n1-l. In a completely balanced state, as described above, gcb = nCb. Hence a new node is made a separate leader node. In case the. network is not in a completely balanced state, the new node is assigned as a child node to one of the leader nodes in the network depending on the number of child nodes of each leader node, the number of leader nodes and the total number of nodes in the network.
The present invention is further described hereinafter with reference to illustrated embodiments shown in the accompanying drawinqs, in which:
FIG 1 shows a general structure of the topology of an embodiment of the present invention,

FIG 2 - FIG 7 illustrate by an example how a network grows, using an embodiment of the present invention,
FIG 8 shows a flowchart relating to a method of managing a topology,
FIG 9 shows a flowchart relating to a method of message
transmission in a network,
FIG 10 shows a graph comparing message complexity of various known topologies with the topology according to an embodiment of the present invention,
FIG 11 shows a graph comparing efficiency of various known topologies with the topology according to an embodiment of the present invention, and
FIG 12 shows a graph comparing robustness of various known topologies with the topology according to an embodiment of the present invention.
All nodes in the present network are assigned a unique node ID (NID) and a group ID (GID). The GID is usually the same for all child nodes attached to the same leader node and is the same as the leader node's NID. All node3 having the same GID are henceforth referred to as a group.
In a preferred embodiment, the leader nodes are connected via a data structure which provides link to a number of other leader nodes in the network, depending on the data structure. In one embodiment the leader nodes are connected via skip lists. A skip list is a probabilistic

data structure, ba3cd on linked lists having additional forward pointers. Each leader node, in the present network, maintains logc(n1) links, where there are n1 leader nodes and c is configurable.
FIG 1 shows the structure of the network as per an embodiment of the present invention. FIG 1 shows the nodes arranged in groups. Each group has a leader node 11-15. The leader node of a group is directly connected to a number of leader nodes, which are in turn directly connected to other leader nodes via a data structure 16-25. The leader nodes are also connected to their child nodes 130. Thus there is a connection between all the nodes. This increases efficiency and robustness of the topology.
As per a preferred embodiment of the present invention, the present network is said to be in a completely balanced state when the total number of nodes, N7, in the network, is exactly the square of a positive integer greater than 1. The node joining immediately after the network attains a completely balanced state is assigned a leader node status. In a completely balanced state, the number of groups in the network (gCb) equals the next positive integer of (square root of size of the network, N) i.e. [√N] each leader node of the groups having \'(N -1) child nodes attached to it. And in a completely balanced state, the number of nodes in each group, ncb, equals number of groups in the network, gcb,
As an example, for a balanced network, the number of skip-lists a leader node maintains is log2(√N) , where the number of leader nodes in the network is √N. So, if

i= log2(√N), a node with NID x will be directly linked to nodes with NID X+2°. . .X+2i-1.
GID of nodes is derived as i * [√N]' where {i = 1, 2, 3 ... [√N] }, "S" being a maximum size the network can grow.
The GID of a group to which a new node joining the network is assigned, is based on the conditions provided in following Table 1 where S is the maximum size the network is expected to grow.

The NIDs are retrieved from a Failure Node Index (FNI) if available, for assigning Lo new nodes. When a node fails, its NID is cut in a Failure Node Index which allows reuse of the NID. If there is none available in the FNI, the "highest numbered NID within the group to which the node is assigned + 1" is assigned to the new node. In case of a new group, NID of the first child node of the group is GID +1.
The above network is now explained with the help of an example shown in FIG 2 - FIG 7.
In this example S is given a value of 1000000 i.e. the maximum size the network is expected to grow is 1000000. Fit" 2 shows one node 20 in the network i.e. N=l.

According to table 1, when N On addition of a second node 21, N=2, 30 GID is 2*1000-2000 as shown in FIG 3.
For the third node 23 N In FIG 5 N=4 . So gcb=2, hence (gcb + 1) *ncb gives 6. Since N>3 and Similarly FTG 6 and FIG 7, where N=5 and N=6 and nodes 25 and 2 6 are added.
FIG 8 relates to a flowchart for a method 50 for managing the present network, as per a preferred embodiment.
In step 51, the method comprises arranging the nodes in groups, each group having leader nodes and child nodes of the leader node?. The number of groups and nodes in each group depends on the total number of nodes in the
network.
The groups formed in step 51 are linked using a data structure in step 52. In a preferred embodiment, skip list are used to link the groups. Skip lists provide a forward pointer in addition to a pointer to the next adjacent node. Each group is coupled to a predetermined

number of groups, depending on total groups in the
network.
FIG 9 relates to a flowchart for a method 60, describing the message transmission in the network of the present invention, as per a preferred embodiment.
To manage the network, child nodes in a group send messages to their respective leader nodes in sLep 61. The leader nodes consolidate these messages to determine the child nodes present in the group and exchange between themselves information, contained in the messages, about child nodes using skip-lists in step 62. The use of skip-lists reduces the VCT and increases the robustness of the topology with minimal increase in the messaging complexity. Only updates are sent to reduce communication bandwidth usage. The leader nodes get complete information about all other nodes in the network. They consolidate information received from other leader nodes and propagate this information to child nodes attached to it resulting in a uniform view of the entire network, in seep 63. Finally the status of the network gets updated at each node —
In a preferred embodiment, failures in the network are also taken care of, Failures can occur at two different places, at the child node or at the leader node. When a child node fails, the leader node of the group will not receive any message from the failed child node. After one or more cycles of the leader node not receiving messages from the child node, the leader node declares the child node to be failed and propagates this information to other leader nodes and the child nodes attached to it.

When a leader node fails, the child nodes attached to a leader node detect the failure of the leader node when it fails to provide update messages. The child nodes confirm the failure of the leader node and select a new leader node from the child nodes attached Lo the failed leader node. For instance the child node with the least NID can become the leader node for the group. The newly elected leader node then propagates this information to the other leader nodes.
When a node fails, its NID is put in a Failure Node Index or FNI which allows re-use of NID's. The re-organization of the network is not done on each and every node failure. The re-organization happens when all the nodes of a particular group / groups fail.
In an embodiment, a leader node sends a re-organization request to all other leader: nodes in the network. A request is sent to peer leader nodes for identifying the potential candidate node for leader node of the failed group. If at least one suggestion is received from the existing peer leaders, one among the suggested nodes is selected as the leader of the failed group.
The identified new leader node gives his acceptance for becoming a leader. This information of the new leader is propagated to all the peer leaders in the network. This information is further propagated down to all the
children.
FIG 10, FIG 11 and FIG 12 evaluate various conventional topologies and compare them with the topology of the network according to an embodiment of the present invention.

For the purpose of the present document,
efficiency is a measure of the view convergence time (VCT) or the time required for all members of the group to have a consistent view of the entire group. VCT is measured as the maximum number of hops required for updates to reach between two nodes in the group. It is inversely proportional to the average path length of a network, hence Lhe lower the distance or hop count, the higher is the efficiency;
robustness is a measure of the reachability of the nodes, if the removal of a node results in the network to split. A combination of functional and structural robustness is considered here. The robustness of a graph with respect to a vertex is the ratio of accessibility of the graph obtained after removing the vertex. Average robustness is the average of robustness over all the nodes. Worst case robustness is the minimum of robustness over all the nodes.
the notion of cost is a measure of messaging complexity. In a network, an edge represents a communication link between two vertices along which messages are transmitted. The messaging complexity is proportional to the number of edges in the network.
FIG 10 is a graph indicating messaging complexity involved in different topologies.
The x-axis 100 represents the number of nodes and the y-axis 200 is the number of messages. Line 101 in the graph
is for all-to-all topology, line 102 is for tree, ring

and star topologies and line 103 is the topology as per an embodiment of the present invention.
An all-to-all broadcast has a messaging complexity of the order of O(n2) and the tree, ring, star and the embodiment according to the present invention all have a complexity of the order O(n).
The VCT of different topologies can be seen in FIG 11.
The x-axis 100 in FIG 11 represents the number of nodes and the y-axis 200 is the VCT. Line 110 in the graph is for all-to-all topology, line 111 is for tree, line 112 is for star topology and line 113 is for the topology as per an embodiment of the present invention.
All-to-all broadcast has a VCT of 1 and the star topology has a VCT of 2. A tree topology has a VCT of the order of O(log(n)). The graph clearly indicates that the topology as per the shown embodiment of the present invention outperforms the tree with respect to the view convergence time.
FIG 12 shows the worst case normalized robustness characteristics of the various topologies.
The x-axis 10.0 represents the number of nodes and the y-axis 200 is the normalized worst case robustness. Line 120 in the graph is for all-to-all topology, line 121 is tree, line 122 is star topology, and line 123 is ths topology as per an embodiment, of the present invention.
The star topology has a robustness of zero. The ring and all-to-all broadcast topologies have the best worst case

robustness characteristics with a value of one. With the increasing number of nodes, the robustness characteristics of the present topology approach almost one, while the robustness of the tree falls off towards zero. Accordingly the shown embodiment of the present invention clearly has a more robust topology compared to that of the tree.
In light of the above, the present topology provides a significant improvement in comparison to hierarchical trees, with respect to view convergence time and robustness without compromising on messaging complexity.
Although the invention has been described with reference to specific embodiments, this description is not meant to be construed in a limiting sense. Various modifications of the disclosed embodiments, as well as alternate embodiments of the invention, will become apparent to persons skilled in the art upon reference to the description of the invention. It is therefore contemplated that such modifications can be made without departing from the spirit or scope of the present invention as defined.

We claim:
1. A communication network,
- comprising an arrangement of nodes, each of the
nodes being a leader node (11-15) or a child node
(130) of one of the leader nodes (11-15),
- each of the leader nodes (11-15) having
unidirectional, links (16-25) to a number of other
leader nodes (11-15), the number of other leader
nodes (11-15) being a function of n, n1 being a total
number of leader nodes (11-15).

2. The communication network according to claim 1, wherein
a maximum number of child nodes (130) of each leader
node (11-15) equals n1-1.
3. The communication network according to any of the
preceding claims, wherein the leader nodes (11-15)
maintain skip lists for the unidirectional links to the
other leader nodes (11-15).
4. The communication network according to any of the
preceding claims, wherein a new node is added to the
network as one of,
• a child node (130) of one of the leader nodes (11-15), depending on n1, a respective number of child nodes (130) of the leader node (11-15), and a maximum size of the network; or

• a new leader node (11-15) in the network when each leader node (11-15) in the network has a number of child nodes (130) equal to n1-l.
b. The communication network according to any of the
preceding claims, wherein the child nodes (130) and the leader nodes (11-15) in the network are configured to perform:
• message transmission from the child nodes (130)
to their respective leader node (11-15);
• circulation of the messages received by the
leader node (11-15), to the other linked leader
nodes (11-15) of the network; and
• message transmission from the ether linked
leader nodes (11-15) to their respective child
nodes (130),
the messages comprising da updated status of the network.
£. A method for managing a communication network, comprising:
- arranging nodes of the network, each of the nodes being a leader node (11-15) or a child node (130) of one of the leader nodes (11-15) , depending on a total number of nodes;

- each of the leader nodes (11-15) having unidirectional links to a number of other leader nodes (11-15), the number of other leader nodes (11-15) being a function of n1, n1 being a total number of leader node3 (11-15).
7. The method according to claim 6, comprising arranging
the nodes in the network such that a maximum number of
child nodes (130) of each leader node (11-15) equals
n1-l.
8. The method according to claim 6 or 7, comprising the
leader nodes (11-15) maintaining skip lists for the
unidirectional links to the other leader nodes (11-15)
in the network.
9. The method according to any of claims 6 to 8, wherein
adding a new node to the network comprises one of,
• assigning the new node as a child node (130) of one of the leader nodes (11-lb) in the network depending on in, a respective number of child nodes (130) of the leader node (11-15), and a maximum size of the network.
■ adding the new node as a new leader node (11-15) in the network when each leader node (11-15) in the network has a number of child nodes (130) equal to
n2-1
10. The meLhod according to any of claims 6 to 9,
comprising,

• transmitting messages from the child nodes
(130)to their respective leader node (11-15);
• circulating the messages received from the
child nodes (130), by the leader node (11-15),
to the other linked leader nodes (11-15) of the
network; and
• further transmitting the messages from the
other linked leader nodes (11-15) to their
respective child nodes (130),
the messages comprising an updated status of the network.
11. A network or method substantially as herein described and illustrated in the figures of the accompanying drawings.

The present invention relates to a communication network comprising an arrangement of nodes, each of the nodes being a leader node (11-15) or a child node (130) of one of the leader nodes (11-15), each of the leader nodes (11-15) having unidirectional links to a number of other leader nodes (11-15), the number of other leader nodes (11-15) being a function of n1, n1 being a total number of leader nodes (11-15).

Documents:

http://ipindiaonline.gov.in/patentsearch/GrantedSearch/viewdoc.aspx?id=IFWhOf+xRHAFxkR+yXCfzA==&loc=wDBSZCsAt7zoiVrqcFJsRw==


Patent Number 270236
Indian Patent Application Number 1196/KOL/2008
PG Journal Number 49/2015
Publication Date 04-Dec-2015
Grant Date 03-Dec-2015
Date of Filing 10-Jul-2008
Name of Patentee SIEMENS INFORMATION SYSTEM LTD.
Applicant Address 43, SHANTIPALLY, E M BYPASS- RASHBEHARI CONNECTOR, KOLKATA-700 042
Inventors:
# Inventor's Name Inventor's Address
1 SOWRIRAJAN ANTHONY DASS SIEMENTS INFORMATION SYSTEM LTD. OF 43, SHANTIPALLY; EM BYPASS-RASHBEHARI CONNECTOR; KOLKATA-700 042
2 RAVI KRISHNA MADIPADAGA SIEMENTS INFORMATION SYSTEM LTD. OF 43, SHANTIPALLY; EM BYPASS-RASHBEHARI CONNECTOR; KOLKATA-700 042
3 SAIKAT MUKHARJEE SIEMENTS INFORMATION SYSTEM LTD. OF 43, SHANTIPALLY; EM BYPASS-RASHBEHARI CONNECTOR; KOLKATA-700 042
4 SABI SHAW BHASKARAN SIEMENTS INFORMATION SYSTEM LTD. OF 43, SHANTIPALLY; EM BYPASS-RASHBEHARI CONNECTOR; KOLKATA-700 042
PCT International Classification Number H04B1/40; H04B7/15
PCT International Application Number N/A
PCT International Filing date
PCT Conventions:
# PCT Application Number Date of Convention Priority Country
1 NA