Update README.md + Add Results.xlsx

Commit 673dc40c authored Nov 02, 2024 by Almouhannad Hafez

Showing with 22 additions and 2 deletions

README.md +22 -2

Results.xlsx +0 -0

--- a/README.md
+++ b/README.md
@@ -6,6 +6,8 @@
 **[Part1. Data preprocessing](#part1-data-preprocessing)**  
 **[Part2. Basic Morphological analyzer](#part2-basic-morpholgical-analyzer)**  
 **[Part3. Lemmatization, POS Tagging, and N-Gram](#part3-lemmatization-pos-tagging-and-n-gram)**  
+**[Results](#results)**
 ## ***Description***  
 **Classifying symptom (as a text data) into a disease**  
 > [Dataset link](https://www.kaggle.com/datasets/niyarrbarman/symptom2disease)
@@ -14,6 +16,8 @@
 - **Data folder**: Containing dataset and train and test sets
 - **Constants.py**: Some fixed values to use in other files as `CONSTANTS` class
 - **Other .ipynb files**: Jupyter notebooks containing actual work
+- **Results.xlsx**: Excel worksheet containing results
+- **conda_nlp_environment.yml**: Python modules requirements
 ## ***How to run***
 > **Using [Anaconda](https://www.anaconda.com/)**  
@@ -79,3 +83,19 @@
    1. `3-Gram`
    1. `4-Gram`
    1. `5-Gram`
+## ***Results***
+| Case\\Criterion         | Accuracy(Train) | Accuracy(Test) | Precision(Test-Average) | Recall(Test-Average) | F1-Score(Test-Average) |
+| ----------------------- | --------------- | -------------- | ----------------------- | -------------------- | ---------------------- |
+| nltk stemmer            | 0.994211288     | 0.91991342     | 0.925513814             | 0.923767509          | 0.919308               |
+| nltk lemmatizer         | 0.994211288     | 0.924242424    | 0.929407177             | 0.927885156          | 0.923453               |
+| Stanza lemmatizer       | 0.994211288     | 0.928571429    | 0.932850383             | 0.931115994          | 0.927117               |
+| SpaCy lemmatizer        | 0.995658466     | 0.928571429    | 0.934227363             | 0.931373992          | 0.928329               |
+| Lemma + Verbs only      | 0.781476122     | 0.608225108    | 0.656473336             | 0.62431736           | 0.6131                 |
+| Lemma + Adjectives only | 0.868306802     | 0.606060606    | 0.681515062             | 0.620177307          | 0.614097               |
+| Lemma + Nouns only      | 0.97829233      | 0.876623377    | 0.886798865             | 0.880636574          | 0.873959               |
+| Text + 1Gram            | 0.997105644     | 0.898268398    | 0.912889052             | 0.90487335           | 0.89945                |
+| Text + 2Gram            | 0.998552822     | 0.885281385    | 0.894742015             | 0.891828538          | 0.883421               |
+| Text + 3Gram            | 0.997105644     | 0.867965368    | 0.881810904             | 0.874727644          | 0.866752               |
+| Text + 4Gram            | 1               | 0.800865801    | 0.848577524             | 0.814521589          | 0.809801               |
+| Text + 5Gram            | 1               | 0.707792208    | 0.839340945             | 0.72337248           | 0.739326               |    
\ No newline at end of file
--- a/Results.xlsx
+++ b/Results.xlsx
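
The Results table added to README.md reports train and test accuracy side by side, so one quick way to read it is the gap between the two. The following is a minimal sketch, not part of the commit: the accuracy values are copied from the table above, while the `results` dictionary and the `generalization_gap` helper are hypothetical names introduced only for illustration.

```python
# Accuracy values copied from the Results table in README.md.
# Each entry maps a case to (train accuracy, test accuracy).
# The names `results` and `generalization_gap` are illustrative assumptions.
results = {
    "nltk stemmer": (0.994211288, 0.91991342),
    "nltk lemmatizer": (0.994211288, 0.924242424),
    "Stanza lemmatizer": (0.994211288, 0.928571429),
    "SpaCy lemmatizer": (0.995658466, 0.928571429),
    "Lemma + Verbs only": (0.781476122, 0.608225108),
    "Lemma + Adjectives only": (0.868306802, 0.606060606),
    "Lemma + Nouns only": (0.97829233, 0.876623377),
    "Text + 1Gram": (0.997105644, 0.898268398),
    "Text + 2Gram": (0.998552822, 0.885281385),
    "Text + 3Gram": (0.997105644, 0.867965368),
    "Text + 4Gram": (1.0, 0.800865801),
    "Text + 5Gram": (1.0, 0.707792208),
}


def generalization_gap(train_acc: float, test_acc: float) -> float:
    """Return train accuracy minus test accuracy (larger means more overfitting)."""
    return train_acc - test_acc


gaps = {case: generalization_gap(*accs) for case, accs in results.items()}

# Print the cases from the smallest gap to the largest.
for case, gap in sorted(gaps.items(), key=lambda item: item[1]):
    print(f"{case:<24} gap = {gap:.3f}")
```

Read this way, the gap widens steadily from `1Gram` to `5Gram`, while the four stemmer and lemmatizer cases stay within a narrow band.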