Commit 673dc40c in almohanad.hafez / NLP-Project, authored Nov 02, 2024 by Almouhannad Hafez
Update README.md + Add Results.xlsx
Parent: c53f2af5

Showing 2 changed files with 22 additions and 2 deletions:
- README.md (+22, -2)
- Results.xlsx (+0, -0)
README.md
@@ -5,7 +5,9 @@
**[How to run](#how-to-run)**
**[Part1. Data preprocessing](#part1-data-preprocessing)**
**[Part2. Basic Morphological analyzer](#part2-basic-morpholgical-analyzer)**
**[Part3. Lemmatization, POS Tagging, and N-Gram](#part3-lemmatization-pos-tagging-and-n-gram)**
**[Results](#results)**
## ***Description***
**Classifying a symptom description (text data) into a disease**
> [Dataset link](https://www.kaggle.com/datasets/niyarrbarman/symptom2disease)
...
...
@@ -14,6 +16,8 @@
- **Data folder**: Containing dataset and train and test sets
- **Constants.py**: Some fixed values to use in other files as `CONSTANTS` class
- **Other .ipynb files**: Jupyter notebooks containing actual work
- **Results.xlsx**: Excel worksheet containing results
- **conda_nlp_environment.yml**: Python modules requirements
## ***How to run***
> **Using [Anaconda](https://www.anaconda.com/)**
...
...
@@ -78,4 +82,20 @@
1. `2-Gram`
1. `3-Gram`
1. `4-Gram`
1. `5-Gram`
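
The n-gram sizes listed above can be illustrated with a minimal sketch. It assumes word-level n-grams over lowercased, whitespace-split text; the notebooks may tokenize differently or use a library vectorizer instead. The function name `word_ngrams` and the sample sentence are hypothetical and do not come from the repository or the dataset.

```python
from typing import List, Tuple


def word_ngrams(text: str, n: int) -> List[Tuple[str, ...]]:
    """Return all contiguous word n-grams of `text`.

    Assumption: tokens are produced by lowercasing and splitting on whitespace.
    """
    if n < 1:
        raise ValueError("n must be a positive integer")
    tokens = text.lower().split()
    # A window of n tokens starts at every index i with i + n <= len(tokens);
    # if the text has fewer than n tokens, the range is empty.
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


# Made-up sentence for illustration only (not taken from the dataset).
sample = "I have a skin rash on my arms"
bigrams = word_ngrams(sample, 2)
trigrams = word_ngrams(sample, 3)
```

A text with `k` tokens yields `k - n + 1` n-grams, so longer n-grams are both fewer and sparser. That may be one reason test accuracy falls as `n` grows in the results, though the commit itself does not state a cause.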
## ***Results***
| Case \\ Criterion | Accuracy(Train) | Accuracy(Test) | Precision(Test-Average) | Recall(Test-Average) | F1-Score(Test-Average) |
| ----------------------- | --------------- | -------------- | ----------------------- | -------------------- | ---------------------- |
| nltk stemmer | 0.994211288 | 0.91991342 | 0.925513814 | 0.923767509 | 0.919308 |
| nltk lemmatizer | 0.994211288 | 0.924242424 | 0.929407177 | 0.927885156 | 0.923453 |
| Stanza lemmatizer | 0.994211288 | 0.928571429 | 0.932850383 | 0.931115994 | 0.927117 |
| SpaCy lemmatizer | 0.995658466 | 0.928571429 | 0.934227363 | 0.931373992 | 0.928329 |
| Lemma + Verbs only | 0.781476122 | 0.608225108 | 0.656473336 | 0.62431736 | 0.6131 |
| Lemma + Adjectives only | 0.868306802 | 0.606060606 | 0.681515062 | 0.620177307 | 0.614097 |
| Lemma + Nouns only | 0.97829233 | 0.876623377 | 0.886798865 | 0.880636574 | 0.873959 |
| Text + 1Gram | 0.997105644 | 0.898268398 | 0.912889052 | 0.90487335 | 0.89945 |
| Text + 2Gram | 0.998552822 | 0.885281385 | 0.894742015 | 0.891828538 | 0.883421 |
| Text + 3Gram | 0.997105644 | 0.867965368 | 0.881810904 | 0.874727644 | 0.866752 |
| Text + 4Gram | 1 | 0.800865801 | 0.848577524 | 0.814521589 | 0.809801 |
| Text + 5Gram | 1 | 0.707792208 | 0.839340945 | 0.72337248 | 0.739326 |
\ No newline at end of file
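
The table's "Test-Average" columns do not say which averaging scheme was used. As a sketch, the code below assumes macro averaging, i.e. the unweighted mean of per-class scores; the notebooks may use a different scheme, such as weighted averaging. The function name `macro_metrics` and the toy labels are hypothetical.

```python
from typing import Dict, Sequence


def macro_metrics(y_true: Sequence[str], y_pred: Sequence[str]) -> Dict[str, float]:
    """Compute accuracy and macro-averaged precision, recall, and F1.

    Assumption: "average" means the unweighted mean over classes (macro).
    """
    if len(y_true) != len(y_pred) or len(y_true) == 0:
        raise ValueError("y_true and y_pred must be non-empty and equally long")

    pairs = list(zip(y_true, y_pred))
    labels = sorted(set(y_true) | set(y_pred))
    precisions, recalls, f1s = [], [], []

    for label in labels:
        tp = sum(1 for t, p in pairs if t == label and p == label)
        fp = sum(1 for t, p in pairs if t != label and p == label)
        fn = sum(1 for t, p in pairs if t == label and p != label)
        # Undefined ratios (zero denominator) are scored as 0.0 here.
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        denom = precision + recall
        f1 = 2 * precision * recall / denom if denom else 0.0
        precisions.append(precision)
        recalls.append(recall)
        f1s.append(f1)

    return {
        "accuracy": sum(1 for t, p in pairs if t == p) / len(pairs),
        "precision": sum(precisions) / len(labels),
        "recall": sum(recalls) / len(labels),
        "f1": sum(f1s) / len(labels),
    }


# Toy labels for illustration only (not results from this project).
y_true = ["acne", "acne", "flu", "flu", "flu", "migraine"]
y_pred = ["acne", "flu", "flu", "flu", "migraine", "migraine"]
metrics = macro_metrics(y_true, y_pred)
```

In this toy example the macro F1 (2/3) is lower than both the macro precision and the macro recall (13/18 each), because F1 is averaged per class rather than derived from the averaged precision and recall. The reported rows show the same kind of gap: each F1 value is below the harmonic mean of that row's precision and recall, which is consistent with per-class averaging but does not confirm which scheme was used.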
Results.xlsx
0 → 100644
File added