almohanad.hafez
NLP-Project
Commits
220bba63
Commit
220bba63
authored
Jan 18, 2025
by
Almouhannad Hafez
Update results
parent
85b5c0db
Showing 4 changed files with 40 additions and 22 deletions:
- README.md (+40 -22)
- Results.xlsx (+0 -0)
- images/Embedding_result.png (+0 -0)
- images/Ontology_results.png (+0 -0)
README.md
...
...
@@ -116,12 +116,29 @@
1. `(root_word, "ROOT")` - i.e. Head words for sentences
## ***Part6. Ontology***
**Files:**
> **`6.1.BO_synsets_classifier.ipynb`** - **Classification using Bag Of Synsets (BO)**

> **`6.2.BOS_ParsingTree_NGrams.ipynb`** - **Classification using Bag Of Synsets (BO) and other features from previous steps**
![Ontology_results](./images/Ontology_results.png)
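A minimal sketch of the bag-of-synsets idea named above, not the notebooks' actual code. It assumes each token can be mapped to a single synset id; the `TOY_SYNSETS` lookup, the synset ids, and the helper names are hypothetical stand-ins for a real WordNet query. The point it illustrates is that synonyms collapse onto the same feature.

```python
from collections import Counter

# Hypothetical word -> synset-id lookup, standing in for a real WordNet query.
TOY_SYNSETS = {
    "car": "car.n.01",
    "automobile": "car.n.01",
    "dog": "dog.n.01",
    "hound": "dog.n.01",
    "runs": "run.v.01",
}


def bag_of_synsets(tokens, lookup=TOY_SYNSETS):
    """Count synset ids for the tokens; tokens without a known synset are skipped."""
    return Counter(lookup[t] for t in tokens if t in lookup)


def to_vector(bag, vocabulary):
    """Turn a synset bag into a fixed-length count vector over the vocabulary."""
    return [bag.get(synset, 0) for synset in vocabulary]


vocab = sorted(set(TOY_SYNSETS.values()))
vec_a = to_vector(bag_of_synsets("the car runs".split()), vocab)
vec_b = to_vector(bag_of_synsets("the automobile runs".split()), vocab)
```

With this toy lookup, both sentences produce the same count vector because "car" and "automobile" share one synset id. Such vectors could then be passed to any count-based classifier.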
## ***Part7. Word embedding***
**Files:**
> **`7.1.Word2Vec_classifier.ipynb`** - **Classification using Word2Vec embedding weighted average for words vectors based on POS**

> **`7.2.BERT_classifier.ipynb`** - **Classification using BERT CLS token**
![Embedding_result.png](./images/Embedding_result.png)
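One way to read "weighted average for words vectors based on POS" is sketched below. This is an illustration under stated assumptions, not the notebook's implementation: the `POS_WEIGHTS` values, the default weight, and the 3-dimensional `TOY_VECTORS` are all hypothetical, and a real setup would take the vectors from a trained Word2Vec model.

```python
# Hypothetical per-POS weights; the notebook's real weights are not shown in this diff.
POS_WEIGHTS = {"NOUN": 2.0, "VERB": 1.5, "ADJ": 1.0}
DEFAULT_WEIGHT = 0.5  # assumed weight for any other POS tag

# Toy 3-dimensional vectors standing in for trained Word2Vec embeddings.
TOY_VECTORS = {
    "cat": [1.0, 0.0, 0.0],
    "sleeps": [0.0, 1.0, 0.0],
    "the": [0.0, 0.0, 1.0],
}


def weighted_average(tagged_tokens, vectors=TOY_VECTORS, dim=3):
    """Average word vectors, weighting each word by its POS tag.

    Words without a vector are skipped; an all-zero vector is returned
    when nothing in the sentence has an embedding.
    """
    total = [0.0] * dim
    weight_sum = 0.0
    for word, pos in tagged_tokens:
        vec = vectors.get(word)
        if vec is None:
            continue
        weight = POS_WEIGHTS.get(pos, DEFAULT_WEIGHT)
        total = [t + weight * v for t, v in zip(total, vec)]
        weight_sum += weight
    if weight_sum == 0.0:
        return total
    return [t / weight_sum for t in total]


sentence = [("the", "DET"), ("cat", "NOUN"), ("sleeps", "VERB")]
doc_vector = weighted_average(sentence)
```

Dividing by the sum of weights keeps the document vector on the same scale regardless of sentence length, so content words dominate without inflating the norm.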
## ***Results***
> ***Using augmented dataset***
| Case \\ Criterion | Accuracy(Train) | Accuracy(Test) | Difference(%) | Precision(Test-Average) | Recall(Test-Average) | F1-Score(Test-Average) | Notes |
| ------------------------------------------------------------ | --------------- | -------------- | ------------- | ----------------------- | -------------------- | ---------------------- | ------------------------------ |
| nltk stemmer | 0.9852 | 0.9604 | 2.5 | 0.9593 | 0.9587 | 0.9574 | alpha=0.1, 450features |
| nltk lemmatizer | 0.9891 | 0.9625 | 2.7 | 0.9635 | 0.9626 | 0.9608 | alpha=0.1, 700features |
| Stanza lemmatizer | 0.9843 | 0.9646 | 2.0 | 0.9652 | 0.9642 | 0.9623 | alpha=0.1, 550features |
...
...
@@ -141,7 +158,8 @@
| BO synsets + POS filtering | 0.9810 | 0.9271 | 5.4 | 0.9287 | 0.9256 | 0.9224 | alpha=0.01, 1500features |
| BO synsets + WSD | 0.9961 | 0.9563 | 4.0 | 0.9594 | 0.9564 | 0.9542 | alpha=0.01,1750features |
| BO synsets + WSD + Stanza Dep. Relation tuples + (1,3) Grams | 0.9963 | 0.9708 | 2.5 | 0.9713 | 0.9706 | 0.9683 | alpha=0.01,5500features |
| | | | | | | | |
| Word2Vec embedding using weighted vector based on POS | 0.9632 | 0.9000 | 6.3 | 0.9079 | 0.8970 | 0.8943 | KNN, n=20, cosine, 185features |
| BERT embedding using CLS token | 1.0000 | 0.9899 | 1.0 | 0.9861 | 0.9886 | 0.9872 | KNN, n=20, cosine, 695features |
---
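The Notes column for the two embedding rows lists "KNN, n=20, cosine". The sketch below shows what a cosine-distance nearest-neighbour vote looks like in plain Python. It assumes `n` is the number of neighbours; the function names, the toy 2-dimensional vectors, and `k=3` are hypothetical choices for illustration, and the notebooks most likely rely on a library classifier instead.

```python
import math
from collections import Counter


def cosine_distance(a, b):
    """Return 1 - cosine similarity; 1.0 if either vector has zero length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / norm if norm else 1.0


def knn_predict(query, train_vectors, train_labels, k=3):
    """Predict the majority label among the k training vectors closest to query."""
    ranked = sorted(
        zip(train_vectors, train_labels),
        key=lambda pair: cosine_distance(query, pair[0]),
    )
    votes = Counter(label for _, label in ranked[:k])
    return votes.most_common(1)[0][0]


# Toy embeddings: two points near the x-axis (class A), two near the y-axis (class B).
train_vectors = [[1.0, 0.1], [0.9, 0.2], [0.1, 1.0], [0.2, 0.9]]
train_labels = ["A", "A", "B", "B"]
prediction = knn_predict([0.8, 0.3], train_vectors, train_labels, k=3)
```

Cosine distance compares direction only, so it ignores differences in vector length between documents.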
> ***Applied features selection and model's hyperparameters tuning***
...
...
Results.xlsx
No preview for this file type
images/Embedding_result.png
0 → 100644
278 KB
images/Ontology_results.png
0 → 100644
145 KB