(5) Add results

Commit 8fe8e8a4 authored Nov 16, 2024 by Almouhannad Hafez

Showing with 29 additions and 14 deletions

README.md README.md +29 -14

Results.xlsx Results.xlsx +0 -0

--- a/README.md
+++ b/README.md
@@ -90,24 +90,39 @@
 - **Applying data augmentation on the original dataset, added 5 new rephrased rows for each original row using LLM `LLama3`**
 ![Augmentation_effect](./images/Augmentation_effect.png)
+## ***Part4. Dependency tree***
+**Files:**
+> **`5.0.Process_texts_stanza.ipynb`**  
+- **Applying a Stanza pipeline with the `tokenize,mwt,pos,lemma,depparse` processors and storing the results for later use**
+> **`5.1.Dep_parsing_classifier.ipynb`**  
+- **Running the classification task with dependency-tree features represented as tuples, including:**
+    1. `(head_word, dependent_word, dependency_relation)`
+        - Syntactic inner relations between words in a sentence (Shallow parsing)
+    1. `(head_pos, dependent_pos, dependency_relation)`
+        - pos: Part Of Speech
+    1. `(root_word, "ROOT")`
+        - i.e. Head words for sentences
 ## ***Results***
 > ***Using augmented dataset*** 
-| Case\\Criterion         | Accuracy(Train) | Accuracy(Test) | Precision(Test-Average) | Recall(Test-Average) | F1-Score(Test-Average) | Notes                     |
+| Case\\Criterion                                    | Accuracy(Train) | Accuracy(Test) | Precision(Test-Average) | Recall(Test-Average) | F1-Score(Test-Average) | Notes                     |
-| ----------------------- | --------------- | -------------- | ----------------------- | -------------------- | ---------------------- | ------------------------- |
+| -------------------------------------------------- | --------------- | -------------- | ----------------------- | -------------------- | ---------------------- | ------------------------- |
-| nltk stemmer            | 0.9629          | 0.9524         | 0.9513                  | 0.9522               | 0.9509                 | alpha=0.1, 300features    |
+| nltk stemmer                                       | 0.9629          | 0.9524         | 0.9513                  | 0.9522               | 0.9509                 | alpha=0.1, 300features    |
-| nltk lemmatizer         | 0.9832          | 0.9699         | 0.9703                  | 0.9699               | 0.9696                 | alpha=0.1, 700features    |
+| nltk lemmatizer                                    | 0.9832          | 0.9699         | 0.9703                  | 0.9699               | 0.9696                 | alpha=0.1, 700features    |
-| Stanza lemmatizer       | 0.9783          | 0.9671         | 0.9673                  | 0.9672               | 0.9668                 | alpha=0.1, 550features    |
+| Stanza lemmatizer                                  | 0.9783          | 0.9671         | 0.9673                  | 0.9672               | 0.9668                 | alpha=0.1, 550features    |
-| SpaCy lemmatizer        | 0.9776          | 0.9657         | 0.9655                  | 0.9656               | 0.9652                 | alpha=0.1, 550features    |
+| SpaCy lemmatizer                                   | 0.9776          | 0.9657         | 0.9655                  | 0.9656               | 0.9652                 | alpha=0.1, 550features    |
-| Lemma + Verbs only      | 0.7106          | 0.6321         | 0.6293                  | 0.6278               | 0.6214                 | alpha=0.1, 400features    |
+| Lemma + Verbs only                                 | 0.7106          | 0.6321         | 0.6293                  | 0.6278               | 0.6214                 | alpha=0.1, 400features    |
-| Lemma + Adjectives only | 0.7990          | 0.7357         | 0.7383                  | 0.7351               | 0.7299                 | alpha=0.1, 450features    |
+| Lemma + Adjectives only                            | 0.7990          | 0.7357         | 0.7383                  | 0.7351               | 0.7299                 | alpha=0.1, 450features    |
-| Lemma + Nouns only      | 0.9678          | 0.9419         | 0.9406                  | 0.9419               | 0.9406                 | alpha=0.1, 600features    |
+| Lemma + Nouns only                                 | 0.9678          | 0.9419         | 0.9406                  | 0.9419               | 0.9406                 | alpha=0.1, 600features    |
-| Text + (1,2)Gram        | 0.9965          | 0.9800         | 0.9801                  | 0.9799               | 0.9798                 | alpha=0.01, 3100features  |
+| Text + (1,2)Gram                                   | 0.9965          | 0.9800         | 0.9801                  | 0.9799               | 0.9798                 | alpha=0.01, 3100features  |
-| Text + (1,3)Gram        | 0.9960          | 0.9807         | 0.9806                  | 0.9805               | 0.9803                 | alpha=0.01, 6600features  |
+| Text + (1,3)Gram                                   | 0.9960          | 0.9807         | 0.9806                  | 0.9805               | 0.9803                 | alpha=0.01, 6600features  |
-| Text + (1,4)Gram        | 0.9967          | 0.9807         | 0.9802                  | 0.9805               | 0.9802                 | alpha=0.01, 12100features |
+| Text + (1,4)Gram                                   | 0.9967          | 0.9807         | 0.9802                  | 0.9805               | 0.9802                 | alpha=0.01, 12100features |
-| Text + (2,3)Gram        | 0.9951          | 0.9695         | 0.9688                  | 0.9694               | 0.9688                 | alpha=0.01, 9100features  |
+| Text + (2,3)Gram                                   | 0.9951          | 0.9695         | 0.9688                  | 0.9694               | 0.9688                 | alpha=0.01, 9100features  |
-| Text + (2,4)Gram        | 0.9951          | 0.9646         | 0.9634                  | 0.9645               | 0.9635                 | alpha=0.01, 14100features |
+| Text + (2,4)Gram                                   | 0.9951          | 0.9646         | 0.9634                  | 0.9645               | 0.9635                 | alpha=0.01, 14100features |
+| Stanza Dep. Relation tuples                        | 0.9984          | 0.9781         | 0.9783                  | 0.9784               | 0.9781                 | alpha=0.01, 7000features  |
+| Stanza Dep.Relation+POS Relations+Headwords tuples | 0.9981          | 0.9747         | 0.9747                  | 0.9749               | 0.9745                 | alpha=0.01, 8000features  |
 ---
 > ***Applied features selection and model's hyperparameters tuning*** 

--- a/Results.xlsx
+++ b/Results.xlsx
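
The three tuple feature types listed under Part4 of the README diff can be illustrated with a minimal sketch. The notebook `5.1.Dep_parsing_classifier.ipynb` is not shown in this commit, so everything here is an assumption: the `Word` record is a hypothetical stand-in whose field names mirror the `id`, `text`, `upos`, `head`, and `deprel` attributes of a parsed Stanza word (with `head == 0` marking the root), and the `|`-joined string encoding and the helper name `dependency_tuple_features` are illustrative choices only.

```python
from collections import namedtuple

# Hypothetical stand-in for one parsed word. The field names mirror the
# attributes of a Stanza word; head is the 1-based id of the governing
# word, and 0 marks the sentence root.
Word = namedtuple("Word", ["id", "text", "upos", "head", "deprel"])


def dependency_tuple_features(sentence):
    """Turn one parsed sentence into a list of tuple-style string features."""
    by_id = {word.id: word for word in sentence}
    features = []
    for word in sentence:
        if word.head == 0:
            # (root_word, "ROOT"): the head word of the sentence
            features.append(f"{word.text}|ROOT")
            continue
        head = by_id[word.head]
        # (head_word, dependent_word, dependency_relation)
        features.append(f"{head.text}|{word.text}|{word.deprel}")
        # (head_pos, dependent_pos, dependency_relation)
        features.append(f"{head.upos}|{word.upos}|{word.deprel}")
    return features


# Hand-written parse of "Dogs chase cats", used only for illustration.
sentence = [
    Word(1, "Dogs", "NOUN", 2, "nsubj"),
    Word(2, "chase", "VERB", 0, "root"),
    Word(3, "cats", "NOUN", 2, "obj"),
]
features = dependency_tuple_features(sentence)
print(features)
```

Encoding each tuple as a single string lets the features be counted like ordinary tokens by a bag-of-words vectorizer; whether the notebook does this, or applies any normalization such as lowercasing, is not stated in the README.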