This project implements a distributed system for calculating the arithmetic mean of a large array using the **Message Passing Interface (MPI)** in Java. It demonstrates core parallel computing concepts such as data distribution, local processing, and result aggregation.
## 📋 Project Requirements
The task involves the following core requirements:
1. **Data Generation:** Rank 0 generates a random array of 1000 elements.
2. **Distribution:** Elements are distributed among the participating processes.
3. **Local Computation:** Each process calculates its own local average.
4. **Global Aggregation:** Rank 0 calculates the global average from the local averages, following the **Weighted Mean** logic to ensure mathematical accuracy.
5. **Performance Analysis:** Testing the system with different numbers of processes (NP = 1, 2, 4, 8).
---
## 🛠 Prerequisites & Setup
To run this project, you need:
* **Java JDK 11 or higher**.
* **MPJ Express Library (v0.44)**.
* **Environment Variables:**
  * `MPJ_HOME`: Should point to your MPJ Express directory (e.g., `C:\mpj-v0_44`).
  * `Path`: Should include `%MPJ_HOME%\bin`.
### IDE Configuration (IntelliJ IDEA):
1. Add `mpj.jar` (found in `MPJ_HOME/lib`) to your Project Structure as an external library.
2. Ensure the class is within the correct package (e.g., `package org.example;`).
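To verify the setup end to end, a minimal MPJ Express program along the lines of the sketch below (the class name `MpiSetupCheck` is illustrative, not part of the project) should compile against `mpj.jar` and print one line per process:

```java
package org.example;

import mpi.MPI;

public class MpiSetupCheck {
    public static void main(String[] args) throws Exception {
        MPI.Init(args);                        // initialize the MPI runtime
        int rank = MPI.COMM_WORLD.Rank();      // this process's id
        int size = MPI.COMM_WORLD.Size();      // total number of processes
        System.out.println("Hello from rank " + rank + " of " + size);
        MPI.Finalize();                        // shut the runtime down cleanly
    }
}
```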
---
## 💻 Code Explanation
### 1. Workload Distribution
The system uses **Point-to-Point Communication** (`Send`/`Recv`) to handle cases where the array size is not perfectly divisible by the number of processes.
* **Base Size:** `totalElements / size`
* **Remainder Handling:** The last process receives all remaining elements to ensure no data loss.
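A minimal sketch of this distribution pattern, assuming the data is a `double[]` generated on Rank 0 (variable names such as `localChunk` are illustrative, not taken from the project source):

```java
int rank = MPI.COMM_WORLD.Rank();
int size = MPI.COMM_WORLD.Size();
int totalElements = 1000;
int baseSize = totalElements / size;

double[] localChunk;
if (rank == 0) {
    // Rank 0 generates the full array of random values.
    double[] data = new double[totalElements];
    for (int i = 0; i < totalElements; i++) data[i] = Math.random();

    // Send each worker its slice; the last rank also receives the remainder.
    for (int dest = 1; dest < size; dest++) {
        int count = (dest == size - 1) ? totalElements - dest * baseSize : baseSize;
        MPI.COMM_WORLD.Send(data, dest * baseSize, count, MPI.DOUBLE, dest, 0);
    }

    // Rank 0 keeps the first slice for its own local computation.
    localChunk = new double[baseSize];
    System.arraycopy(data, 0, localChunk, 0, baseSize);
} else {
    int count = (rank == size - 1) ? totalElements - rank * baseSize : baseSize;
    localChunk = new double[count];
    MPI.COMM_WORLD.Recv(localChunk, 0, count, MPI.DOUBLE, 0, 0);
}
```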
### 2. Weighted Mean Logic
As per the requirement, the global average is not simply the average of local averages. We implemented the **Weighted Average** formula:
```text
Global Average = Σ (Local_Average_i * Local_Size_i) / Total_Elements
```
This is achieved by using `MPI.COMM_WORLD.Gather()` to collect all partial results at Rank 0.
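Continuing the same sketch, the partial results could be gathered and combined at Rank 0 roughly as follows (again, the variable names are assumptions):

```java
// Each process contributes its local average and its local element count.
double localSum = 0.0;
for (double v : localChunk) localSum += v;
double[] localAvg = { localSum / localChunk.length };
int[] localLen = { localChunk.length };

double[] allAvgs = new double[size];
int[] allLens = new int[size];
MPI.COMM_WORLD.Gather(localAvg, 0, 1, MPI.DOUBLE, allAvgs, 0, 1, MPI.DOUBLE, 0);
MPI.COMM_WORLD.Gather(localLen, 0, 1, MPI.INT, allLens, 0, 1, MPI.INT, 0);

if (rank == 0) {
    // Weight each local average by the number of elements it represents.
    double weightedSum = 0.0;
    for (int i = 0; i < size; i++) {
        weightedSum += allAvgs[i] * allLens[i];
    }
    double globalAverage = weightedSum / totalElements;
    System.out.println("Global average = " + globalAverage);
}
```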
### 3. Synchronization & Timing
* **MPI.COMM_WORLD.Barrier():** Used to synchronize processes before and after computation for accurate performance measurement.
* **System.nanoTime():** Provides high-precision timing to analyze the execution overhead, as in the sketch below.
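The timing pattern is roughly the following (a sketch, with Rank 0 printing the result):

```java
MPI.COMM_WORLD.Barrier();                 // ensure every process starts together
long start = System.nanoTime();

// ... distribution, local averaging, and gathering happen here ...

MPI.COMM_WORLD.Barrier();                 // wait until every process has finished
long elapsedNanos = System.nanoTime() - start;

if (rank == 0) {
    System.out.printf("Elapsed: %.3f ms%n", elapsedNanos / 1_000_000.0);
}
```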
---
## 🚀 How to Run
Navigate to the source directory (`src/main/java`) and use the following commands:
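The exact commands depend on the project's main class; assuming a main class named `org.example.Main` (a placeholder), the typical MPJ Express workflow is to compile against `mpj.jar` and launch with `mpjrun`, varying `-np`:

```text
javac -cp %MPJ_HOME%\lib\mpj.jar org\example\Main.java
mpjrun.bat -np 1 org.example.Main
mpjrun.bat -np 2 org.example.Main
mpjrun.bat -np 4 org.example.Main
mpjrun.bat -np 8 org.example.Main
```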
---
## 📊 Performance Analysis
During testing, you might notice that increasing the number of processes (NP) for 1000 elements results in a **higher** execution time.
**Key Insights for Distributed Systems:**
* **Communication Overhead:** The time taken to create processes and pass messages (`Send`/`Recv`) outweighs the computation time for a small array (1000 elements).
* **Scalability:** Parallelism becomes beneficial when the workload (computation) is heavy enough to justify the communication cost (e.g., 10+ million elements).