(5) Add results

Commit 8fe8e8a4 authored Nov 16, 2024 by Almouhannad Hafez

Showing with 29 additions and 14 deletions

README.md +29 -14

Results.xlsx +0 -0

--- a/README.md
+++ b/README.md
@@ -90,12 +90,25 @@
 - **Applying data augmentation on the original dataset, added 5 new rephrased rows for each original row using LLM `LLama3`**
 ![Augmentation_effect](./images/Augmentation_effect.png)

+## ***Part4. Dependency tree***
+**Files:**
+> **`5.0.Process_texts_stanza.ipynb`**  
+- **Applying a Stanza pipeline with the processors `tokenize,mwt,pos,lemma,depparse` and storing the results for later use**
+> **`5.1.Dep_parsing_classifier.ipynb`**  
+- **Running the classification task with dependency-tree features encoded as tuples:**
+    1. `(head_word, dependent_word, dependency_relation)`
+        - Syntactic relations between the words of a sentence
+    1. `(head_pos, dependent_pos, dependency_relation)`
+        - pos: Part Of Speech
+    1. `(root_word, "ROOT")`
+        - i.e. the head (root) word of each sentence
+
 ## ***Results***

 > ***Using augmented dataset*** 

 | Case\\Criterion                                    | Accuracy(Train) | Accuracy(Test) | Precision(Test-Average) | Recall(Test-Average) | F1-Score(Test-Average) | Notes                     |
-| ----------------------- | --------------- | -------------- | ----------------------- | -------------------- | ---------------------- | ------------------------- |
+| -------------------------------------------------- | --------------- | -------------- | ----------------------- | -------------------- | ---------------------- | ------------------------- |
 | nltk stemmer                                       | 0.9629          | 0.9524         | 0.9513                  | 0.9522               | 0.9509                 | alpha=0.1, 300features    |
 | nltk lemmatizer                                    | 0.9832          | 0.9699         | 0.9703                  | 0.9699               | 0.9696                 | alpha=0.1, 700features    |
 | Stanza lemmatizer                                  | 0.9783          | 0.9671         | 0.9673                  | 0.9672               | 0.9668                 | alpha=0.1, 550features    |
@@ -108,6 +121,8 @@
 | Text + (1,4)Gram                                   | 0.9967          | 0.9807         | 0.9802                  | 0.9805               | 0.9802                 | alpha=0.01, 12100features |
 | Text + (2,3)Gram                                   | 0.9951          | 0.9695         | 0.9688                  | 0.9694               | 0.9688                 | alpha=0.01, 9100features  |
 | Text + (2,4)Gram                                   | 0.9951          | 0.9646         | 0.9634                  | 0.9645               | 0.9635                 | alpha=0.01, 14100features |
+| Stanza Dep. Relation tuples                        | 0.9984          | 0.9781         | 0.9783                  | 0.9784               | 0.9781                 | alpha=0.01, 7000features  |
+| Stanza Dep.Relation+POS Relations+Headwords tuples | 0.9981          | 0.9747         | 0.9747                  | 0.9749               | 0.9745                 | alpha=0.01, 8000features  |
 ---

 > ***Applied features selection and model's hyperparameters tuning*** 

--- a/Results.xlsx
+++ b/Results.xlsx
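
The three tuple types listed under Part4 of the README diff above can be sketched roughly as follows. This is only an illustration, not the code from `5.1.Dep_parsing_classifier.ipynb`: the function name `dependency_tuples` and the dict-based token records are assumptions, chosen to mirror the `id`, `text`, `upos`, `head`, and `deprel` fields that Stanza exposes on parsed words (with `head == 0` marking the root).

```python
from typing import Dict, List, Tuple

# Hypothetical token record: a plain dict using the same field names as
# Stanza's parsed words (id, text, upos, head, deprel). head == 0 means root.
Token = Dict[str, object]


def dependency_tuples(sentence: List[Token]) -> List[Tuple[str, ...]]:
    """Build the three feature-tuple types for one parsed sentence."""
    by_id = {tok["id"]: tok for tok in sentence}
    features: List[Tuple[str, ...]] = []
    for tok in sentence:
        if tok["head"] == 0:
            # (root_word, "ROOT"): the head word of the sentence
            features.append((tok["text"], "ROOT"))
            continue
        head = by_id[tok["head"]]
        # (head_word, dependent_word, dependency_relation)
        features.append((head["text"], tok["text"], tok["deprel"]))
        # (head_pos, dependent_pos, dependency_relation)
        features.append((head["upos"], tok["upos"], tok["deprel"]))
    return features


# Hand-written parse of "Dogs bark loudly" (illustrative data, not from the dataset)
example = [
    {"id": 1, "text": "Dogs", "upos": "NOUN", "head": 2, "deprel": "nsubj"},
    {"id": 2, "text": "bark", "upos": "VERB", "head": 0, "deprel": "root"},
    {"id": 3, "text": "loudly", "upos": "ADV", "head": 2, "deprel": "advmod"},
]
feats = dependency_tuples(example)
```

Each tuple could then be joined into a single string token and passed to a bag-of-features vectorizer, though how the notebook actually encodes the tuples is not shown in this commit.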