Cooperative network supercomputing is becoming increasingly popular for harnessing the power of the global Internet computing platform.
A typical Internet supercomputer consists of a master computer or server and a large number of computers, called workers, that perform computation on behalf of the master.
Despite the simplicity and benefits of a single-master approach, as the scale of such computing environments grows, it becomes unrealistic to assume the existence of an infallible master able to coordinate the activities of multitudes of workers.
Large-scale distributed systems are inherently dynamic and subject to perturbations, such as failures of computers and network links; it is therefore also necessary to consider fully distributed peer-to-peer solutions. We present a study of cooperative computing, focusing on models of distributed computing settings, algorithmic techniques that combine efficiency and fault tolerance in distributed systems, and the trade-offs between efficiency and fault tolerance in robust cooperative computing.
The exposition focuses on an abstract problem, called Do-All, formulated in terms of a system of cooperating processors that must together perform a collection of tasks in the presence of adversity.
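To make the Do-All setting concrete, the following toy simulation sketches one round-based scenario: crash-prone processors repeatedly pick outstanding tasks, with an adversary crashing processors between rounds (sparing at least one, so the computation can finish). All names and parameters here are illustrative assumptions, not an algorithm from the text; the uncoordinated task choice shows why redundant work arises and why work complexity, not just task count, is the natural efficiency measure.

```python
import random

def do_all(num_tasks, num_procs, crash_prob=0.2, seed=0):
    """Toy round-based Do-All simulation (illustrative only).

    Each round, every live processor picks one task it believes is
    outstanding and performs it; the adversary then crashes processors
    with probability crash_prob, but never the last survivor.
    """
    rng = random.Random(seed)
    done = [False] * num_tasks
    alive = set(range(num_procs))
    rounds = 0
    work = 0  # total task executions, counting redundant ones

    while not all(done):
        rounds += 1
        # Processors act on a snapshot of outstanding tasks, so two
        # processors may redundantly perform the same task this round.
        pending = [t for t in range(num_tasks) if not done[t]]
        for _ in alive:
            t = rng.choice(pending)
            done[t] = True
            work += 1
        # Adversary crashes processors, sparing at least one.
        for p in list(alive):
            if len(alive) > 1 and rng.random() < crash_prob:
                alive.remove(p)

    return rounds, work, len(alive)

rounds, work, survivors = do_all(num_tasks=20, num_procs=5)
```

Because at least one processor always survives and each round completes at least one outstanding task, every run terminates with all tasks done; `work` may exceed `num_tasks`, reflecting the redundancy that efficient Do-All algorithms aim to minimize.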
Our presentation deals with models, algorithmic techniques, and analysis.
Our goal is to present the most interesting approaches to algorithm design and analysis leading to many fundamental results in cooperative distributed computing.
The algorithms selected for inclusion are among the most efficient known and additionally serve as good pedagogical examples.
Each chapter concludes with exercises and bibliographic notes that include a wealth of references to related work and relevant advanced results.