# MPI Parallel Array Averaging System 🚀
This project implements a distributed system for calculating the arithmetic mean of a large array using the **Message Passing Interface (MPI)** in Java. It demonstrates core parallel computing concepts such as data distribution, local processing, and result aggregation.
## 📋 Project Requirements
The task involves the following core requirements:
1. **Data Generation:** Rank 0 generates a random array of 1000 elements.
2. **Distribution:** Elements are distributed among participating processes.
3. **Local Computation:** Each process calculates its own local average.
4. **Global Aggregation:** Rank 0 computes the global average from the local averages using the **Weighted Mean** formula, which stays mathematically exact even when the chunks have different sizes.
5. **Performance Analysis:** Testing the system with different numbers of processes (NP=1, 2, 4, 8).
---
## 🛠 Prerequisites & Setup
To run this project, you need:
* **Java JDK 11 or higher**.
* **MPJ Express Library (v0.44)**.
* **Environment Variables:**
  * `MPJ_HOME`: Should point to your MPJ Express directory (e.g., `C:\mpj-v0_44`).
  * `Path`: Should include `%MPJ_HOME%\bin`.
### IDE Configuration (IntelliJ IDEA):
1. Add `mpj.jar` (found in `MPJ_HOME/lib`) to your Project Structure as an external library.
2. Ensure the class is within the correct package (e.g., `package org.example;`).
---
## 💻 Code Explanation
### 1. Workload Distribution
The system uses **Point-to-Point Communication** (`Send`/`Recv`) to handle the case where the array size is not evenly divisible by the number of processes:
* **Base Size:** each process receives `totalElements / size` elements.
* **Remainder Handling:** the last process additionally receives the leftover `totalElements % size` elements, so no data is lost (see the sketch below).
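A minimal sketch of this distribution step, assuming the mpiJava-style API bundled with MPJ Express (`Send`/`Recv` take a buffer, offset, count, datatype, peer rank, and tag); the class and variable names are illustrative, not taken from `MPI_AVG.java`:

```java
import mpi.MPI;

public class DistributeSketch {
    public static void main(String[] args) throws Exception {
        MPI.Init(args);
        int rank = MPI.COMM_WORLD.Rank();
        int size = MPI.COMM_WORLD.Size();

        int totalElements = 1000;
        int baseSize = totalElements / size;
        // The last rank absorbs the remainder so every element is assigned.
        int myCount = (rank == size - 1)
                ? totalElements - baseSize * (size - 1)
                : baseSize;

        double[] chunk = new double[myCount];
        if (rank == 0) {
            double[] data = new double[totalElements];
            for (int i = 0; i < totalElements; i++) data[i] = Math.random();

            // Rank 0 keeps the first chunk and ships the rest point-to-point.
            System.arraycopy(data, 0, chunk, 0, myCount);
            for (int p = 1; p < size; p++) {
                int count = (p == size - 1)
                        ? totalElements - baseSize * (size - 1)
                        : baseSize;
                MPI.COMM_WORLD.Send(data, p * baseSize, count, MPI.DOUBLE, p, 0);
            }
        } else {
            MPI.COMM_WORLD.Recv(chunk, 0, myCount, MPI.DOUBLE, 0, 0);
        }
        System.out.println("Rank " + rank + " holds " + myCount + " elements");
        MPI.Finalize();
    }
}
```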
### 2. Weighted Mean Logic
As per the requirement, the global average is not simply the mean of the local averages: because the last chunk can be larger than the others, we apply the **Weighted Average** formula:
```text
Global Average = Σ (Local_Average_i * Local_Size_i) / Total_Elements
```
This is achieved with `MPI.COMM_WORLD.Gather()`, which collects all partial results at Rank 0, as sketched below.
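Continuing the sketch above (names still illustrative), each process averages its own chunk, then rank 0 gathers and reweights the results:

```java
// Each process averages its own chunk.
double localSum = 0.0;
for (double v : chunk) localSum += v;
double[] localAvg = { localSum / myCount };

// Gather one double per process into allAvgs on rank 0
// (the receive buffer is only read on the root).
double[] allAvgs = new double[size];
MPI.COMM_WORLD.Gather(localAvg, 0, 1, MPI.DOUBLE, allAvgs, 0, 1, MPI.DOUBLE, 0);

if (rank == 0) {
    double weightedSum = 0.0;
    for (int p = 0; p < size; p++) {
        int count = (p == size - 1)
                ? totalElements - baseSize * (size - 1)
                : baseSize;
        weightedSum += allAvgs[p] * count;   // Local_Average_i * Local_Size_i
    }
    System.out.println("Global average = " + weightedSum / totalElements);
}
```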
### 3. Synchronization & Timing
* **`MPI.COMM_WORLD.Barrier()`:** synchronizes all processes before and after the computation so every rank measures the same parallel phase.
* **`System.nanoTime()`:** provides high-precision timing for analyzing the execution overhead (combined in the sketch below).
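Put together, the timing pattern looks like this (assumed structure; the exact placement in `MPI_AVG.java` may differ):

```java
MPI.COMM_WORLD.Barrier();             // all ranks start the timed phase together
long start = System.nanoTime();

// ... distribution, local averaging, and the Gather from the sketches above ...

MPI.COMM_WORLD.Barrier();             // all ranks finish before the clock stops
long elapsedNs = System.nanoTime() - start;
if (rank == 0) {
    System.out.printf("NP=%d took %.3f ms%n", size, elapsedNs / 1e6);
}
```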
---
## 🚀 How to Run
Navigate to the source directory (`src/main/java`) and use the following commands:
**1. Compile:**
```powershell
javac -cp "C:\mpj-v0_44\lib\mpj.jar" org/example/MPI_AVG.java
```
**2. Run (Example with 4 processes):**
```powershell
mpjrun.bat -np 4 org.example.MPI_AVG
```
---
## 📊 Performance Discussion (Important)
During testing, you might notice that increasing the number of processes (NP) for 1000 elements results in a **higher** execution time.
**Key Insight for Distributed Systems:**
* **Communication Overhead:** The time taken to create processes and pass messages (Send/Recv) outweighs the computation time for a small array (1000 elements).
* **Scalability:** Parallelism becomes beneficial when the workload (Computation) is heavy enough to justify the communication cost (e.g., 10+ million elements).
---
## 📂 Project Structure
```
MPI_ArrayTask/
├── src/
│   └── main/
│       └── java/
│           └── org/example/
│               └── MPI_AVG.java   # Core Logic
├── run_tests.ps1                  # Automation Script (Optional)
└── README.md                      # Project Documentation
```
---