NLP Intern
Machine Learning Based Test Automation System
-
Made clusters of sentence with same percentage similarity using NLP and increasing the software speed through the use of parallel computing.
-
First the document is cleaned using scrubbing which includes removing noise, lemmatization, removing numbers and then comparing the resulting sentence using spacy module.
-
The speed of comparison is increased using distributed systems using pyspark which uses master slave configuration where data.
-
Check the repository here