Most common Persian words
As a second-generation Iranian American who has spent practically no time in Iran, I have found it difficult to learn the Persian language beyond mere kitchen talk. In an effort to improve my vocabulary, I sought out a list of the most common Persian words. I could not find such a list, so I searched for a Persian-language corpus that I could use to produce the list myself.
I came across the Hamshahri Persian Corpus and decided to use it. I ran a word count on the corpus to determine which words are the most common in the Persian language. I posted the results here, sorted from most to least frequently used.
The list was rather long, so I’ve included only words that appeared in the corpus more than 1,000 times. I plan to start at the top of the list and make flashcards out of any words I don’t know or am unsure of. This should help me focus on words that are more commonly used. I hope you find it useful as well. I will post the Java code I used to parse the corpus if anybody is interested.
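A word count of this kind can be sketched in Java roughly as follows. This is a minimal illustration, not the code used to produce the list: the class name WordFrequency, its methods, and the tokenization rule are all assumptions, and it expects the corpus text to have already been extracted into a plain UTF-8 file.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper class; the names here are illustrative only.
class WordFrequency {

    // Adds the words found in one line of text to the running counts.
    // Assumption: a "word" is a run of letters, combining marks, or the
    // zero-width non-joiner (U+200C), which Persian uses inside some words.
    static void countLine(String line, Map<String, Integer> counts) {
        for (String token : line.split("[^\\p{L}\\p{M}\\u200C]+")) {
            if (!token.isEmpty()) {
                counts.merge(token, 1, Integer::sum);
            }
        }
    }

    // Returns the entries that occur more than minCount times,
    // sorted from most to least frequent.
    static List<Map.Entry<String, Integer>> mostCommon(Map<String, Integer> counts, int minCount) {
        List<Map.Entry<String, Integer>> result = new ArrayList<>();
        for (Map.Entry<String, Integer> entry : counts.entrySet()) {
            if (entry.getValue() > minCount) {
                result.add(entry);
            }
        }
        result.sort((a, b) -> b.getValue().compareTo(a.getValue()));
        return result;
    }

    public static void main(String[] args) throws IOException {
        if (args.length < 1) {
            System.err.println("usage: java WordFrequency <utf8-text-file>");
            return;
        }
        Map<String, Integer> counts = new HashMap<>();
        // Read the file line by line so a large corpus need not fit in memory.
        try (BufferedReader reader =
                 Files.newBufferedReader(Paths.get(args[0]), StandardCharsets.UTF_8)) {
            String line;
            while ((line = reader.readLine()) != null) {
                countLine(line, counts);
            }
        }
        // Print only words seen more than 1000 times, as in the posted list.
        for (Map.Entry<String, Integer> entry : mostCommon(counts, 1000)) {
            System.out.println(entry.getKey() + "\t" + entry.getValue());
        }
    }
}
```

Splitting on anything that is not a letter is a simplification. It does not normalize variant spellings or Arabic versus Persian forms of the same letter, so a real run over the corpus would likely need some cleanup first.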
If I ever find the time, my next goal is to try to find phrases, word combinations, and word patterns. If anybody is interested in helping out, please let me know. I’d also be interested in finding out about similar (non-commercial) efforts for other languages, particularly other Indo-European languages or other languages that use an Arabic script.
December 5th, 2008 at 4:13 pm
I came across your blog from one of your articles on DevX.com.
I work as a Java/J2EE developer in Canada.
It was exciting to find an Iranian with Java expertise and many articles and books.
I just want to say hello and wish you the best of luck.