Norvig's Spell Checker Algorithm for the Azerbaijani Language

The purpose of this project is to build a spell checker for the Azerbaijani language by applying Norvig's algorithm to an Azerbaijani corpus. In general, spell checking tools are trained on a corpus: they learn the correct spelling of words from it, and when a misspelled word is encountered later, they use the correctly spelled words in the corpus as a reference. Choosing the right corpus is therefore very important in spell checking. For this purpose I tried several Azerbaijani corpora available on the Internet, but most of them contained misspelled words themselves, because they consist of crawled data in which errors may exist. I therefore decided to create a new corpus based on books written in Azerbaijani. The corpus I created consists of 1,478,667 words collected from 47 books in 6 fields (biology, geography, detective, literature, encyclopedia, novel).
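As a rough illustration of how Norvig's approach might be adapted to Azerbaijani, here is a minimal sketch. It is not the project's actual code: the names `AZ_LETTERS`, `az_lower`, `train`, and `correction` are hypothetical, and the tiny `SAMPLE` string only stands in for the book corpus. The sketch assumes the 32-letter Azerbaijani Latin alphabet and handles the dotted and dotless I explicitly, since Python's default `str.lower()` maps `I` to `i` rather than `ı`.

```python
import re
from collections import Counter

# Assumption: the modern Azerbaijani Latin alphabet (32 letters).
AZ_LETTERS = "abcçdeəfgğhxıijkqlmnoöprsştuüvyz"

def az_lower(text):
    """Lowercase with Azerbaijani rules: I -> ı and İ -> i."""
    return text.replace("I", "ı").replace("İ", "i").lower()

def words(text):
    """Split text into lowercase tokens made only of Azerbaijani letters."""
    return re.findall(f"[{AZ_LETTERS}]+", az_lower(text))

def train(text):
    """Count how often each word occurs in the corpus text."""
    return Counter(words(text))

def edits1(word):
    """All strings one edit (delete, transpose, replace, insert) away."""
    splits = [(word[:i], word[i:]) for i in range(len(word) + 1)]
    deletes = [L + R[1:] for L, R in splits if R]
    transposes = [L + R[1] + R[0] + R[2:] for L, R in splits if len(R) > 1]
    replaces = [L + c + R[1:] for L, R in splits if R for c in AZ_LETTERS]
    inserts = [L + c + R for L, R in splits for c in AZ_LETTERS]
    return set(deletes + transposes + replaces + inserts)

def edits2(word):
    """All strings two edits away (generated lazily)."""
    return (e2 for e1 in edits1(word) for e2 in edits1(e1))

def known(candidates, counts):
    """Keep only candidates that appear in the corpus."""
    return {w for w in candidates if w in counts}

def correction(word, counts):
    """Most frequent known word within two edits, else the word itself."""
    word = az_lower(word)
    candidates = (known([word], counts)
                  or known(edits1(word), counts)
                  or known(edits2(word), counts)
                  or [word])
    return max(candidates, key=lambda w: counts[w])

# Illustrative stand-in for the real corpus (hypothetical sample text).
SAMPLE = "Kitab kitab məktəb şəhər dil"
COUNTS = train(SAMPLE)

print(correction("kitap", COUNTS))   # kitab
print(correction("məktb", COUNTS))   # məktəb
```

With a real corpus, `train` would be called on the full text of the collected books instead of `SAMPLE`. The quality of the suggestions depends directly on the word counts, which is why a clean corpus matters.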

© 2023 Nijat Zeynalov