Wednesday, September 2, 2009

Paper Review: Chord: A Scalable Peer-to-peer Lookup Service for Internet Applications


With the advancement of P2P technology comes the different approaches to solve different network routing issues. The paper proposed by Ion Stoica, Robert Morris, David Karger, M. Frans Kaashoek, and Hari Balakrishnan introduces an algorithm for P2P networks and explains how it solves the Look Up problem in P2P systems.

Chord is a peer-to-peer algorithm that imlements the Distributed Hash Table (DHT) protocols for finding a single node in a structured network of peers. The nodes in a Chord system handle a number of keys that represent an entity of information. The Chord aims to improve the core operation in P2P systems which is to efficiently store and locate data items. The object of the paper is to discuss a scalable protocol for lookup in a dynamic P2P system subject to frequent node arrivals and departures.

With the idea of Consistent Hashing, nodes in a Chord system can locate a data item in no more than O(log N) time. It also allows the nodes in the system to store only the O(log N) locations of other nodes in the network. The Chord routing mechanism reinforced by the Stabilization operation allows any node to join or leave the network at any time which makes the technique highly scalable. The authors claim that even with very large number of nodes connected to the network, Chord system still provides a relatively fast way for locating data items.

Chord also aims to provide load balancing, however, it is not defined in the paper how it is done. There are machines/nodes that are capable of storing very large files and providing faster way of communicating data items. Authors must take into consideration that not all machines have the same computing power or storage system. Additionally, quantification of the data items are not based on how many they are but based on how large or small they are. These issues must be considered by the authors when addressing the load balancing issue.

Another issue that must be reinvestigated is the mechanism for file storage. Storage of files in Chord is done through DHT mapping making the detailed operations abstracted to the end users. With this mapping system, it is possible that a particular data item which is geographically close to a user be assigned to a node far from the user therefore producing more querying delays. A mechanism for addressing this issue should be implemented on top of the DHT protocol.

The Chord protocol may have solved many of the P2P issues making it comparable to other P2P algorithms. However, there are still problems that the authors and, perhaps, other network programmers should take into considerations to improve the P2P Chord system. Addionally, given the rapid increase in data items and number of nodes that can accomodate these data, O(log N) running time would not be enough in the future. Network programmers should continously devise new ways to improve the overall performance of P2P systems.

Reference:
Chord: A Scalable Peer-to-peer Lookup Service for Internet Applications. Ion Stoica , Robert Morris, David Karger, M. Frans Kaashoek, Hari Balakrishnan. MIT Laboratory for Computer Science.

Paper Review: Looking Up Data in P2P Systems



P2P systems offer a way on how to distribute the load to the different nodes connected to the network. P2P or Peer to Peer is a distributed network topology that is composed of nodes that make up a portion of the resources with minimal or no centralized infrastructure or server interference. The technology is often used in services like file sharing and is intended to accomodate high scalability and service robustness.

Operations involving the nodes connected to a P2P system are generally unpredictable and dynamic. A node may connect to or disconnect from the network at any given time. Because of the absence of a centralized host, handling of these operations is not that easy. Additionally, a reliable mechanism for looking up the data items available in a P2P network is not yet completely addressed by most popular systems currently in use. Hari Balakrishnan, et. al, enumerated the recent advances to solve the lookup problem. The algorithms developed by several research groups present a simple and general interface, a distributed hash table (DHT).

Before the advancement of the P2P systems, handling of data was done by a central database that maps a file name to the locations of servers that store the file. However, this approach obviously inherented one of the worst problems in a traditional centralized system: the central database is a single point of failure. The techniques discussed in the paper tried to solve the lookup problem without adapting any contralized mechanism.

One of these algorithms is the Chord skiplist routing. Every node in a Chord system maintains a finger table of the IP addresses of the nodes halfway, quarter-of-the-way, eight of the way and so forth, around the ID space that resembles a skiplist data structure. A node forwads a particular query to the nodes listed in its finger table until the successor of the key that matches the query is found. Another approach is the tree-like routing, which is implemented by well known techniques like Pastry, Tapestry, and Kademlia. In a Tree-like routing system, prefixes of the IDs are used to determine the route to a particular node in the network. This kind of routing system is also regarded as location-aware routing. Last approach discussed in the paper is the CAN which is implemented using a d-dimensional Cartesian coordinate system. Each node in CAN is responsible for a single hyper-rectagle called zone. These zones are used to determine the boundaries of the nodes where the neighbors or the nodes of adjacent zones are also defined.

The algorithms discussed in the paper are still under improvement and yet to be further explored. As stated in the paper, developers of these algorithms identified their designs based on relevant priorities and issues. Distance, operation costs, reliability, proximity and security are the main issues that network developers and programmers must take into consideration when establishing another topology or approach to P2P system.


References:
Looking Up Data in P2P Systems. Ion Stoica , Robert Morris, David Karger, M. Frans Kaashoek, Hari Balakrishnan. MIT Laboratory for Computer Science. 2003 ACM.

Sunday, July 12, 2009

Paper Review: Interdomain Internet Routing

The Border Gataway Protocol or BGP is a major routing protocol in the Internet.The protocol route selection is based on the prefixes assigned to the different Autonomous Systems (AS). The BGP is responsible for exchanging routing information between ASes which made it considered as wide-area routing protocol. It has 2 known classes. First is the eBGP or the session between BGP routers in different Ases and iBGP or the session between BGP routers in the same AS.

The paper discusses the different operations in the routing process of the BGP. More importantly, the paper explains the design goals that motivate the implementation of BGP. One of which is scalibility, according to the paper an important requirement for BGP was to ensure that the Internet routing infrastructure remained scalable as the number of connected networks increased. Along with the scability goal is to enforce various types of routing policy to allow route filtering. And lastly, routing protocol must allow the ASes to administer local decisions on how to route packets.

BGP may be perfect for most of the network topologies, however there are still issues that need to be considered by system/network administrators. Convergence problem or the disablity of the BGP to converge quickly after a fault is detected is a major problem. Convergence in BGP, according to Labovitz et al in their paper from ACM SIGCOMM 2000, would require super-exponential number of steps for some cases. Administrators must establish reliable checking and corrections to avoid such network problems. Additionaly, issues like route validity, path visibility, and route safety should also be given priority when establishing interdomain route.

Paper Review: On Inferring Autonomous System Relationships in the Internet


Policies governing the routing of information among the ASes are essential in understanding the behavior of the Internet as well as the rules that are incorporated in different network domains. These policies determine how packets are transmitted from one AS to another and how information are routed across networks. Creation of these routing policies can be attributed to the infrastracture of the administrative domains and their relationships to one another.

Due to the rapid increase of machines interconnected by networks and the packets routed to different network paths, a way of analyzing the relationships among the different domains is very helpful in determining how every piece of information is securely and efficiently transmitted. The paper by Lixin Gao tried to study the architecture of the Internet by identifying the relationship between administrative domains. Relationships or the commercial agreements between pairs of administrative domains can be classified into customer–provider, peering, mutual-transit, and mutual-backup agreements.

Each class of relationships has its own function in the Internet. An AS can be a provider to another AS in which makes it entitled to functions that determine its accessibility, transmission authority, and the security it provides. ASes can also be a costumer, peer or a sibling of another AS. To be able to determine how routing is done accross different domains, relationships between ASes in the network must first be identified. The paper presented different algorithms that will determine the relationships of the ASes in the network. These algorithms are essential to properly identify the policies between the ASes making the systems capable to detect any misconfiguration in the ASes. Once an error is detected, ASes can easily update their policies and routing tables which is based on their relationships to their neighbors.

Inferring the relationship between ASes helps identify not only the terms of the agreement but also promotes a quicker response to any errors related to network policies. Although the algorithms presented are quite complicated, the author tried to deliver her arguments in an interesting and very informative manner. This remarkable study will surely serve as a foundation for future advancements relating to networks and administrative domains.

Thursday, July 2, 2009

Paper Review: The Design Philosophy of the DARPA Internet Protocols

The main objective of the DARPA Internet Protocol design as explained by David Clark is to connect different networks allowing them to communicate with one another and to ensure that communication media are reliable and fault-tolerant.

This design was originally conceptualized by DARPA prior to the creation of the Internet protocol Suite.The concepts discussed in this paper led to the discovery of some of the most commonly used network technologies today like packet switching and fate-sharing. Packet switching was designed to allow different hosts to communicate using a fixed-sized blocks, called packets, that will be transmitted through a shared communication network. The author supplemented this idea by explaining that networks should be interconnected by Internet packet switches, called gateways. Fate-sharing, on the other hand, means that as long as there is a path or communcation channel that exist between two applications, then the connection between these applications can not be terminated.

Other items discussed in the paper include the goal of establishing a system that will provide various network services which later on led to the creation of TCP/IP layered protocols. Management of resources with low level of effort is also an objective of the design.

All these factors and objectives that were considered by the authors will explain why modern network systems including the Internet are equipped with maximum possible reliability and speed.

Paper Review: End to End Arguments in System Design

The paper written by Saltser, Reed and Clark proposes a design in which programs within the host side play a very important role to achive maximum security and reliability. As compared to the system-level functions, application functions seem to be simpler to implement and more economical.

The arguments presented in the paper pointed out that system-level functions may not support all the functionalities needed by host-side applications, and therefore, higher-level applications would tend to implement the necessary functionalities. There are also some functions that are not necessary to the hosts and just add overhead to the system performance.

End to End design provides not only the desired functionalities needed by the application programs but it also guarantees secured data transmission, ordered message delivery and the capability of suppressing duplicate messages. With these features offered by End to End design, its underlying concept became the foundation for the Internet Protocol Suite.

The Internet Protocol Suite or TCP/IP implements a set of of communication protocol wherein TCP acts as a smart transport protocol that resides at the host, and IP, a simple network protocol responsible only for moving data packets across network. TCP does almost all the complex tasks including error detection and retransmission. IP, on the other hand, does the very simple task of delivering datagrams from one machine to another based on IP addresses.

The authors have successfully presented the advantages of the end to end design over other designs. End to End design also became one of the standards in computer networking. However, there are still issues that need to be further evaluated in order to achieve maximum network performance.