Projects for Distributed Systems

5/29/2025

Distributed Systems (EECS 491)

EECS 491 is an advanced undergraduate course focused on the design and implementation of scalable, consistent, and fault-tolerant distributed systems. The course explores the principles and practices that underpin real-world cloud infrastructure and large-scale services.

Key Skills & Experience:

This course significantly deepened my understanding of distributed coordination, reliability under failure, and systems-level programming, while strengthening my ability to build robust infrastructure software in a concurrent, asynchronous environment.


Projects

Project 1: MapReduce

As an introduction to distributed programming in Go, I implemented a simplified version of the MapReduce programming model, mirroring many ideas from the original MapReduce paper but adapted for a smaller, educational-scale system.

Key Contributions:

This project strengthened my ability to write concurrent, networked systems in Go, and introduced me to foundational concepts in distributed coordination, resilience, and the orchestration of parallel computation.
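
To make the programming model concrete, here is a minimal in-process sketch of the map, group, and reduce phases using word count. It is an illustration only, not the course's code: the names `mapWords`, `reduceCount`, and `runMapReduce` are hypothetical, map tasks are plain goroutines rather than networked workers, and failure handling is left out entirely.

```go
package main

import (
	"fmt"
	"sort"
	"strings"
	"sync"
)

// KeyValue is an intermediate pair emitted by a map task.
type KeyValue struct {
	Key   string
	Value int
}

// mapWords is a hypothetical map function: it emits (word, 1) for every word.
func mapWords(doc string) []KeyValue {
	var out []KeyValue
	for _, w := range strings.Fields(strings.ToLower(doc)) {
		out = append(out, KeyValue{Key: w, Value: 1})
	}
	return out
}

// reduceCount is a hypothetical reduce function: it sums the values of one key.
func reduceCount(key string, values []int) int {
	total := 0
	for _, v := range values {
		total += v
	}
	return total
}

// runMapReduce runs one map task per document in parallel, groups the
// intermediate pairs by key, and then reduces each group.
func runMapReduce(docs []string) map[string]int {
	results := make([][]KeyValue, len(docs))
	var wg sync.WaitGroup
	for i, doc := range docs {
		wg.Add(1)
		go func(i int, doc string) {
			defer wg.Done()
			results[i] = mapWords(doc) // each task writes only its own slot
		}(i, doc)
	}
	wg.Wait()

	grouped := make(map[string][]int)
	for _, kvs := range results {
		for _, kv := range kvs {
			grouped[kv.Key] = append(grouped[kv.Key], kv.Value)
		}
	}

	counts := make(map[string]int, len(grouped))
	for key, values := range grouped {
		counts[key] = reduceCount(key, values)
	}
	return counts
}

func main() {
	counts := runMapReduce([]string{"the quick fox", "the lazy dog", "The end"})
	keys := make([]string, 0, len(counts))
	for k := range counts {
		keys = append(keys, k)
	}
	sort.Strings(keys)
	for _, k := range keys {
		fmt.Printf("%s: %d\n", k, counts[k])
	}
}
```

In a distributed version, the grouping step would correspond to the shuffle between map and reduce workers, and a coordinator would reassign tasks from workers that fail.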


Project 2: Primary/Backup Key-Value Store with Viewservice

Built a fault-tolerant, replicated key/value store in Go using a primary/backup model coordinated by a centralized view service. This project emphasized managing server roles, ensuring consistency across failures, and maintaining correctness during network partitions and crashes.

Key Skills & Experience:

This project deepened my understanding of server coordination, replication under network uncertainty, and the subtleties of distributed consistency guarantees in an asynchronous, failure-prone environment. It served as a bridge between basic fault tolerance and more advanced techniques such as Paxos consensus and sharded replication.
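
The role management described above can be sketched as a small state machine. This is a simplified illustration under stated assumptions, not the project's implementation: the `ViewService` type, the `deadPings` threshold, and the logical `Tick` clock are all hypothetical, RPC is replaced by direct method calls, and primary-restart detection is omitted. The one rule it tries to capture is that the view only advances after the current primary has acknowledged the current view, so a backup that never received the primary's state cannot be promoted.

```go
package main

import "fmt"

// deadPings is a hypothetical threshold: ticks without a ping before a
// server is presumed dead.
const deadPings = 3

// View names the primary and backup for one numbered configuration.
type View struct {
	Viewnum uint
	Primary string
	Backup  string
}

// ViewService tracks liveness through pings and decides when to change views.
type ViewService struct {
	view     View
	acked    bool           // has the primary acknowledged the current view?
	lastPing map[string]int // server name -> tick of its most recent ping
	now      int            // logical clock, advanced by Tick
}

func NewViewService() *ViewService {
	return &ViewService{acked: true, lastPing: make(map[string]int)}
}

func (vs *ViewService) alive(s string) bool {
	t, ok := vs.lastPing[s]
	return ok && vs.now-t < deadPings
}

// idle returns the smallest-named live server other than exclude, or "".
func (vs *ViewService) idle(exclude string) string {
	best := ""
	for s := range vs.lastPing {
		if s == exclude || !vs.alive(s) {
			continue
		}
		if best == "" || s < best {
			best = s
		}
	}
	return best
}

// update moves to a new view if one is needed and the current one is acked.
func (vs *ViewService) update() {
	if !vs.acked {
		return
	}
	next := vs.view
	switch {
	case vs.view.Primary == "": // bootstrap: first live server becomes primary
		next.Primary = vs.idle("")
	case !vs.alive(vs.view.Primary):
		if !vs.alive(vs.view.Backup) {
			return // no up-to-date replica to promote
		}
		next.Primary = vs.view.Backup
		next.Backup = vs.idle(next.Primary)
	case !vs.alive(vs.view.Backup): // backup missing or dead
		next.Backup = vs.idle(vs.view.Primary)
	}
	if next != vs.view {
		next.Viewnum++
		vs.view = next
		vs.acked = false
	}
}

// Ping records liveness; a ping from the primary carrying the current view
// number counts as an acknowledgement of that view.
func (vs *ViewService) Ping(server string, viewnum uint) View {
	vs.lastPing[server] = vs.now
	if server == vs.view.Primary && viewnum == vs.view.Viewnum {
		vs.acked = true
	}
	vs.update()
	return vs.view
}

// Tick advances the logical clock by one interval.
func (vs *ViewService) Tick() {
	vs.now++
	vs.update()
}

func main() {
	vs := NewViewService()
	vs.Ping("s1", 0) // s1 becomes primary in view 1
	vs.Ping("s2", 0) // s2 waits: view 1 is not acknowledged yet
	vs.Ping("s1", 1) // s1 acks view 1, so s2 can join as backup
	vs.Ping("s1", 2) // s1 acks view 2
	var v View
	for i := 0; i < deadPings; i++ { // s1 goes silent while s2 keeps pinging
		vs.Tick()
		v = vs.Ping("s2", 2)
	}
	fmt.Printf("%+v\n", v)
}
```

With these inputs the final view should promote `s2` to primary with no backup, since `s1` stopped pinging after acknowledging view 2.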


Project 3: Paxos-Based Key/Value Store

Designed and implemented a fault-tolerant, linearizable key/value store by building a replicated state machine powered by the Paxos consensus algorithm. This system eliminated the need for a central coordinator by ensuring all replicas agreed on the global order of operations through distributed consensus.

Key Skills & Experience:

This project deepened my experience with distributed consensus, log replication, and state machine coordination, while significantly advancing my skills in networked systems programming, fault-tolerant design, and concurrent logic in Go.
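
For readers unfamiliar with how replicas agree on a value, the following is a minimal single-decree Paxos sketch for one slot of the operation log. It is an illustration under simplifying assumptions, not the project's code: acceptors are in-memory structs called directly instead of over RPC, the `Down` flag is a hypothetical stand-in for an unreachable replica, and proposers run one at a time.

```go
package main

import "fmt"

// Acceptor holds the per-instance state every Paxos acceptor must remember.
type Acceptor struct {
	Down      bool   // hypothetical flag simulating an unreachable replica
	promised  int    // highest proposal number promised so far
	acceptedN int    // highest proposal number accepted (0 if none)
	acceptedV string // value accepted together with acceptedN
}

// Prepare is phase 1: promise to reject proposals numbered below n and
// report any value that was already accepted.
func (a *Acceptor) Prepare(n int) (ok bool, na int, va string) {
	if a.Down || n <= a.promised {
		return false, 0, ""
	}
	a.promised = n
	return true, a.acceptedN, a.acceptedV
}

// Accept is phase 2: accept (n, v) unless a higher number was promised.
func (a *Acceptor) Accept(n int, v string) bool {
	if a.Down || n < a.promised {
		return false
	}
	a.promised, a.acceptedN, a.acceptedV = n, n, v
	return true
}

// Propose runs one round with proposal number n (n > 0). It returns the
// value chosen in this round, or false if no majority was reached.
func Propose(acceptors []*Acceptor, n int, v string) (string, bool) {
	majority := len(acceptors)/2 + 1

	promises, highest, value := 0, 0, v
	for _, a := range acceptors {
		if ok, na, va := a.Prepare(n); ok {
			promises++
			if na > highest {
				// Adopt the value of the highest-numbered accepted
				// proposal instead of our own.
				highest, value = na, va
			}
		}
	}
	if promises < majority {
		return "", false
	}

	accepts := 0
	for _, a := range acceptors {
		if a.Accept(n, value) {
			accepts++
		}
	}
	if accepts < majority {
		return "", false
	}
	return value, true
}

func main() {
	accs := []*Acceptor{{}, {}, {}}

	v, ok := Propose(accs, 1, "put(x,1)")
	fmt.Println(v, ok)

	// A later proposer with a different value must still learn the chosen
	// one, even with one replica unreachable.
	accs[0].Down = true
	v, ok = Propose(accs, 2, "put(x,2)")
	fmt.Println(v, ok)

	// A stale proposal number is rejected in the prepare phase.
	v, ok = Propose(accs, 1, "put(x,3)")
	fmt.Println(v, ok)
}
```

A replicated state machine would run one such instance per log slot and apply the chosen operations in slot order, which is what gives every replica the same global order of operations.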


Project 4: Sharded Key/Value Store with Paxos-Based Shardmaster

Designed and implemented a fault-tolerant, sharded key/value storage system coordinated by a Paxos-replicated Shardmaster. This project extended the Paxos-based key/value infrastructure to support horizontal scalability, shard reconfiguration, and cross-group coordination, building on principles found in real-world systems like BigTable, Spanner, and HBase.

Key Skills & Experience:

This project solidified my ability to design and implement reconfigurable, consistent distributed systems. It combined core distributed systems concepts (sharding, leader election, replication, and fault tolerance) with system-level concurrency and RPC-based coordination in Go.
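
As a rough illustration of shard reconfiguration, the sketch below maps keys to a fixed number of shards and rebalances shard ownership when replica groups join or leave, moving as few shards as possible. Everything here is an assumption made for the example rather than the project's design: the `NShards` constant, the FNV-based `keyToShard`, and the `rebalance` policy are hypothetical, group IDs are assumed to be unique and nonzero, and the actual transfer of shard data between groups is not shown.

```go
package main

import (
	"fmt"
	"hash/fnv"
	"sort"
)

// NShards is a hypothetical fixed shard count; 0 marks an unassigned shard.
const NShards = 10

// keyToShard hashes a key onto one of the NShards shards.
func keyToShard(key string) int {
	h := fnv.New32a()
	h.Write([]byte(key))
	return int(h.Sum32() % NShards)
}

// load counts how many shards each group currently owns.
func load(shards [NShards]int) map[int]int {
	counts := make(map[int]int)
	for _, g := range shards {
		counts[g]++
	}
	return counts
}

// rebalance returns a new assignment in which every group in gids owns
// either floor or ceil of NShards/len(gids) shards, keeping shards with
// their current owner whenever that owner is still under its quota.
func rebalance(shards [NShards]int, gids []int) [NShards]int {
	if len(gids) == 0 {
		return [NShards]int{}
	}
	current := load(shards)

	// Heaviest groups first (ties by ID) so they keep the remainder shards.
	order := append([]int(nil), gids...)
	sort.Slice(order, func(i, j int) bool {
		if current[order[i]] != current[order[j]] {
			return current[order[i]] > current[order[j]]
		}
		return order[i] < order[j]
	})
	quota := make(map[int]int, len(order))
	for i, g := range order {
		quota[g] = NShards / len(order)
		if i < NShards%len(order) {
			quota[g]++
		}
	}

	// Pass 1: free shards whose owner has left or is over quota.
	var free []int
	held := make(map[int]int)
	for s, g := range shards {
		if q, ok := quota[g]; ok && held[g] < q {
			held[g]++
		} else {
			free = append(free, s)
		}
	}

	// Pass 2: hand the freed shards to groups that are under quota.
	for _, g := range order {
		for held[g] < quota[g] {
			shards[free[0]] = g
			free = free[1:]
			held[g]++
		}
	}
	return shards
}

func main() {
	var shards [NShards]int
	shards = rebalance(shards, []int{1})       // group 1 joins
	shards = rebalance(shards, []int{1, 2})    // group 2 joins
	shards = rebalance(shards, []int{1, 2, 3}) // group 3 joins
	fmt.Println(shards, load(shards))
	shards = rebalance(shards, []int{1, 3}) // group 2 leaves
	fmt.Println(shards, load(shards))
	fmt.Println("shard for key \"alice\":", keyToShard("alice"))
}
```

Because the shard count is fixed and only ownership changes, a client can compute a key's shard locally and only needs the latest configuration to find the owning group.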