Network File System
Distributed File System with Concurrent Access
Overview
A high-performance distributed file system built from scratch in C, implementing a client-server architecture that enables multiple clients to perform concurrent file operations across a network. The system supports full CRUD operations (Create, Read, Update, Delete) with robust error handling and fault tolerance mechanisms.
This project demonstrates advanced systems programming concepts including socket programming, multi-threading, inter-process communication, and distributed systems design. The implementation focuses on performance optimization, achieving a 3x performance improvement for frequently accessed files through LRU caching and efficient data structures.
Key Features
Concurrent Access Management
Supports 100+ simultaneous users with thread-safe operations using mutexes and semaphores. Implements reader-writer locks so concurrent reads proceed in parallel while data consistency is maintained.
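A minimal sketch of how per-file reader-writer locking can look with POSIX `pthread_rwlock_t`; the `file_entry` struct and function names are illustrative assumptions, not the project's actual code:

```c
#include <pthread.h>
#include <sys/types.h>
#include <unistd.h>

/* Illustrative per-file entry: many readers may hold the lock at once,
 * but a writer gets exclusive access. */
typedef struct {
    int fd;                    /* open file descriptor on the storage server */
    pthread_rwlock_t lock;     /* reader-writer lock guarding this file      */
} file_entry;

/* Read path: shared lock lets many clients read concurrently. */
ssize_t file_read(file_entry *f, void *buf, size_t len, off_t off) {
    pthread_rwlock_rdlock(&f->lock);
    ssize_t n = pread(f->fd, buf, len, off);
    pthread_rwlock_unlock(&f->lock);
    return n;
}

/* Write path: exclusive lock blocks readers and other writers. */
ssize_t file_write(file_entry *f, const void *buf, size_t len, off_t off) {
    pthread_rwlock_wrlock(&f->lock);
    ssize_t n = pwrite(f->fd, buf, len, off);
    pthread_rwlock_unlock(&f->lock);
    return n;
}
```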
LRU Caching System
Caching layer based on the Least Recently Used (LRU) algorithm, implemented with a hash map and doubly linked list, reducing file access latency by 70% for frequently accessed files.
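A condensed sketch of the hash-map-plus-doubly-linked-list combination described above; the struct layout, bucket count, and omission of the insert/evict path (dropping the tail node when the cache is full) are simplifications for illustration:

```c
#include <stddef.h>
#include <string.h>

#define NBUCKETS 1024

/* Cache node lives in both a hash-bucket chain (hnext) and the global
 * LRU doubly linked list (prev/next, most recently used at the head). */
typedef struct node {
    char *path;                  /* key: file path                    */
    char *data;  size_t len;     /* cached file contents              */
    struct node *hnext;          /* hash-bucket chain                 */
    struct node *prev, *next;    /* LRU list links                    */
} node;

typedef struct {
    node *bucket[NBUCKETS];
    node *head, *tail;           /* head = most recently used         */
    size_t count, capacity;      /* evict c->tail when count == capacity */
} lru_cache;

static size_t hash(const char *s) {
    size_t h = 5381;
    while (*s) h = h * 33 + (unsigned char)*s++;
    return h % NBUCKETS;
}

static void unlink_lru(lru_cache *c, node *n) {
    if (n->prev) n->prev->next = n->next; else c->head = n->next;
    if (n->next) n->next->prev = n->prev; else c->tail = n->prev;
}

static void push_front(lru_cache *c, node *n) {
    n->prev = NULL; n->next = c->head;
    if (c->head) c->head->prev = n; else c->tail = n;
    c->head = n;
}

/* O(1) average lookup: find in the hash table, then move the node to the
 * front of the LRU list so it becomes the most recently used entry. */
node *cache_get(lru_cache *c, const char *path) {
    for (node *n = c->bucket[hash(path)]; n; n = n->hnext)
        if (strcmp(n->path, path) == 0) {
            unlink_lru(c, n);
            push_front(c, n);
            return n;
        }
    return NULL;   /* miss: caller fetches the file from the storage server */
}
```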
Fault-Tolerant Replication
Automatic file replication across multiple storage servers with consistency protocols. Implements primary-backup replication strategy ensuring data availability even during server failures.
Network Communication
Robust client-server communication using TCP sockets with custom protocol design. Implements efficient serialization/deserialization of data structures for network transmission.
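A sketch of what the header-body message format might look like on the wire; the exact field layout and opcode names are assumptions, shown here to illustrate serialization to network byte order before transmission:

```c
#include <stdint.h>
#include <unistd.h>
#include <arpa/inet.h>

/* Assumed wire format (illustrative): a fixed-size header carrying the
 * operation code and payload length, followed by the payload bytes. */
typedef struct {
    uint32_t opcode;        /* e.g. OP_READ, OP_WRITE, OP_CREATE */
    uint32_t payload_len;   /* number of body bytes that follow  */
} msg_header;

/* Write exactly len bytes, retrying on short writes. */
static int write_all(int fd, const void *buf, size_t len) {
    const char *p = buf;
    while (len > 0) {
        ssize_t n = write(fd, p, len);
        if (n <= 0) return -1;
        p += n;
        len -= (size_t)n;
    }
    return 0;
}

/* Serialize header fields to network byte order, then send header + body. */
int send_message(int sock, uint32_t opcode, const void *body, uint32_t body_len) {
    msg_header h = { htonl(opcode), htonl(body_len) };
    if (write_all(sock, &h, sizeof h) < 0) return -1;
    return body_len ? write_all(sock, body, body_len) : 0;
}
```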
Complete CRUD Operations
Full support for Create, Read, Update, and Delete operations with atomic transactions. Includes directory operations, file metadata management, and permission handling.
Error Handling & Recovery
Comprehensive error handling with automatic retry mechanisms, connection pooling, and graceful degradation. Implements heartbeat protocol for server health monitoring.
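One way the heartbeat monitoring could be structured; the interval, miss threshold, and `send_ping`/`wait_for_pong` helpers are hypothetical, standing in for whatever request/response the real protocol uses:

```c
#include <stdbool.h>
#include <unistd.h>

#define HEARTBEAT_INTERVAL_SEC 2   /* assumed values, not the project's */
#define MAX_MISSED_BEATS       3

typedef struct {
    int  sock;     /* connection to the storage server  */
    int  missed;   /* consecutive unanswered heartbeats */
    bool alive;
} server_state;

/* Hypothetical helpers: send a PING request and wait briefly for a PONG. */
extern int send_ping(int sock);
extern int wait_for_pong(int sock, int timeout_ms);

/* Heartbeat loop run for each registered storage server: after
 * MAX_MISSED_BEATS unanswered pings, the server is marked dead so requests
 * can be redirected to its backup replica. */
void *heartbeat_thread(void *arg) {
    server_state *s = arg;
    while (s->alive) {
        if (send_ping(s->sock) < 0 || wait_for_pong(s->sock, 500) < 0) {
            if (++s->missed >= MAX_MISSED_BEATS)
                s->alive = false;            /* trigger failover to backup */
        } else {
            s->missed = 0;
        }
        sleep(HEARTBEAT_INTERVAL_SEC);
    }
    return NULL;
}
```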
Technical Implementation
Architecture Design
• Client-Server architecture with naming server for file location mapping
• Multiple storage servers for horizontal scalability
• Centralized metadata server for file system consistency
• Modular design with clear separation of concerns
Socket Programming
• TCP socket implementation for reliable data transfer
• Non-blocking I/O using select() for efficient multiplexing (sketched after this list)
• Custom protocol with header-body format for message passing
• Connection pooling to reduce socket creation overhead
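The select()-based multiplexing mentioned above can be sketched roughly as follows; the single-threaded accept/serve loop and the `handle_request` helper are illustrative assumptions rather than the project's exact structure:

```c
#include <sys/select.h>
#include <sys/socket.h>
#include <unistd.h>

/* Hypothetical request handler: returns <= 0 when the client disconnects. */
extern int handle_request(int fd);

/* select() watches the listening socket plus every connected client, so one
 * thread can multiplex many connections without blocking on any single one. */
void serve(int listen_fd) {
    fd_set all;
    FD_ZERO(&all);
    FD_SET(listen_fd, &all);
    int maxfd = listen_fd;

    for (;;) {
        fd_set ready = all;                      /* select() modifies its set */
        if (select(maxfd + 1, &ready, NULL, NULL, NULL) < 0) continue;

        for (int fd = 0; fd <= maxfd; fd++) {
            if (!FD_ISSET(fd, &ready)) continue;
            if (fd == listen_fd) {               /* new client connection */
                int conn = accept(listen_fd, NULL, NULL);
                if (conn >= 0) {
                    FD_SET(conn, &all);
                    if (conn > maxfd) maxfd = conn;
                }
            } else if (handle_request(fd) <= 0) { /* client request or hangup */
                close(fd);
                FD_CLR(fd, &all);
            }
        }
    }
}
```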
Concurrency Control
• POSIX threads (pthreads) for handling multiple client requests
• Reader-writer locks for optimized concurrent file access
• Mutex locks for critical section protection
• Deadlock prevention using lock ordering and timeouts (lock ordering sketched after this list)
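The lock-ordering rule can be illustrated like this; `file_meta` and the path-comparison ordering are assumptions used to show how a globally consistent acquisition order removes the circular wait that causes deadlock:

```c
#include <pthread.h>
#include <string.h>

typedef struct {
    char path[256];
    pthread_mutex_t lock;
} file_meta;

/* An operation such as copy or rename needs locks on two files. Always
 * locking the lexicographically smaller path first means two threads can
 * never hold one lock each while waiting for the other. */
void lock_pair(file_meta *a, file_meta *b) {
    if (strcmp(a->path, b->path) > 0) { file_meta *t = a; a = b; b = t; }
    pthread_mutex_lock(&a->lock);
    pthread_mutex_lock(&b->lock);
}

void unlock_pair(file_meta *a, file_meta *b) {
    pthread_mutex_unlock(&a->lock);
    pthread_mutex_unlock(&b->lock);
}
```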
Data Structures
• Hash table for O(1) file lookup in cache and metadata
• Doubly linked list for LRU cache eviction policy
• Tree structure for hierarchical file system representation
• Priority queue for request scheduling and load balancing
Challenges & Solutions
Challenge: Race Conditions in Concurrent Access
Multiple clients accessing the same file simultaneously led to data corruption and inconsistent file states.
Solution: Implemented reader-writer locks that allow multiple concurrent readers while giving writers exclusive access. Added version numbers to detect concurrent modifications and used atomic operations for critical updates.
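A small sketch of how version numbers plus an atomic operation can detect a conflicting write; the `file_meta` struct and C11 atomics shown here are an assumed concrete form of the idea, not necessarily the project's implementation:

```c
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

/* Illustrative metadata record: each successful write bumps the version,
 * so a client whose write raced with another writer can detect the conflict. */
typedef struct {
    _Atomic uint64_t version;
    /* ... other file metadata ... */
} file_meta;

/* Commit a write only if the file is still at the version the client read.
 * Returns false if another writer got there first; the caller must retry. */
bool commit_write(file_meta *m, uint64_t expected_version) {
    uint64_t next = expected_version + 1;
    return atomic_compare_exchange_strong(&m->version, &expected_version, next);
}
```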
Challenge: Network Latency and Performance
High latency for frequently accessed files due to repeated network calls, causing poor user experience.
Solution: Designed and implemented an LRU cache with hash map for O(1) lookups and doubly linked list for efficient eviction. Added cache coherence protocol to maintain consistency across distributed caches, achieving 3x performance improvement.
Challenge: Server Failure and Data Loss
Single point of failure where server crashes resulted in complete data unavailability.
Solution: Implemented fault-tolerant replication with primary-backup architecture. Added heartbeat monitoring and automatic failover mechanism. Designed write-ahead logging (WAL) for crash recovery and consistency.
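The write-ahead logging idea can be sketched as below; the record layout and `wal_append` helper are assumptions illustrating the core invariant that the log record reaches disk before the data file is touched:

```c
#include <stdio.h>
#include <stdint.h>
#include <unistd.h>

/* Assumed log record layout (illustrative): the operation is described in
 * the log and flushed to disk *before* the data file is modified, so a crash
 * mid-write can be repaired by replaying committed records at startup. */
typedef struct {
    uint32_t op;            /* e.g. WAL_WRITE, WAL_CREATE, WAL_DELETE */
    uint64_t offset;        /* byte offset of the modification        */
    uint32_t len;           /* payload length                         */
    char     path[256];     /* file being modified                    */
} wal_record;

int wal_append(FILE *wal, const wal_record *rec, const void *payload) {
    if (fwrite(rec, sizeof *rec, 1, wal) != 1) return -1;
    if (rec->len && fwrite(payload, rec->len, 1, wal) != 1) return -1;
    fflush(wal);
    return fsync(fileno(wal));   /* durable before the real write happens */
}
```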
Challenge: Memory Management in C
Memory leaks and segmentation faults due to manual memory management in long-running server processes.
Solution: Developed custom memory pool allocator to reduce fragmentation. Used Valgrind for memory leak detection and implemented RAII-like patterns with cleanup handlers. Added comprehensive logging for tracking memory allocations.
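A minimal fixed-size-block pool in the spirit of the custom allocator described above; block size, structure names, and the single-slab design are illustrative assumptions:

```c
#include <stddef.h>
#include <stdlib.h>

/* All request/response buffers come from one slab carved into equal blocks,
 * so the long-running server avoids per-request malloc/free and the
 * fragmentation that comes with it. */
typedef struct block { struct block *next; } block;

typedef struct {
    void  *slab;        /* one large allocation backing every block */
    block *free_list;   /* singly linked list of free blocks        */
} mem_pool;

int pool_init(mem_pool *p, size_t block_size, size_t nblocks) {
    if (block_size < sizeof(block)) block_size = sizeof(block);
    p->slab = malloc(block_size * nblocks);
    if (!p->slab) return -1;
    p->free_list = NULL;
    for (size_t i = 0; i < nblocks; i++) {        /* thread blocks onto the free list */
        block *b = (block *)((char *)p->slab + i * block_size);
        b->next = p->free_list;
        p->free_list = b;
    }
    return 0;
}

void *pool_alloc(mem_pool *p) {                   /* O(1), no system call */
    block *b = p->free_list;
    if (b) p->free_list = b->next;
    return b;                                     /* NULL when the pool is empty */
}

void pool_free(mem_pool *p, void *ptr) {          /* return the block to the pool */
    block *b = ptr;
    b->next = p->free_list;
    p->free_list = b;
}
```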
Performance Metrics
• ~3x performance improvement for frequently accessed files via LRU caching
• 70% reduction in file access latency for frequently accessed files
• 100+ simultaneous users supported with thread-safe concurrent operations
Key Learnings
Distributed Systems Design
Deep understanding of distributed systems concepts including consistency models, replication strategies, and fault tolerance mechanisms.
Low-Level Systems Programming
Mastered C programming, memory management, pointer arithmetic, and debugging complex systems-level issues.
Concurrency & Synchronization
Expertise in multi-threaded programming, race condition prevention, deadlock avoidance, and concurrent data structure design.
Network Programming
Hands-on experience with socket programming, protocol design, serialization, and handling network failures gracefully.
Performance Optimization
Learned profiling techniques, cache optimization strategies, and algorithm selection for high-performance systems.
Data Structures Application
Practical implementation of hash tables, linked lists, and trees for solving real-world performance challenges.
Technologies & Tools
Core Technologies
• C Programming Language - Systems implementation
• POSIX API - Thread management and IPC
• Socket API - Network communication
• TCP/IP Protocol - Reliable data transfer
• Multi-threading - Concurrent request handling
Data Structures & Algorithms
• Hash Tables - O(1) file lookups
• Doubly Linked Lists - LRU implementation
• Trees - File system hierarchy
• Priority Queues - Request scheduling
• LRU Algorithm - Cache eviction policy
Project Outcome
Successfully developed a production-ready distributed file system demonstrating enterprise-level systems programming capabilities. The project showcases deep understanding of operating systems concepts, network programming, and distributed systems architecture.
The implementation achieved significant performance improvements through intelligent caching and maintained high availability through fault-tolerant replication, making it suitable for real-world deployment scenarios with stringent reliability requirements.