Most common Persian words

Saturday, November 10th, 2007

As a second-generation Iranian American who has spent practically no time in Iran, I have found it difficult to learn the Persian language beyond mere kitchen talk. To improve my vocabulary, I looked for a list of the most common Persian words. I could not find one, so I searched for a Persian-language corpus that I could use to produce the list myself.

I came across the Hamshahri Persian Corpus and decided to use it. I ran a word count on the corpus to determine which words are most common in Persian. I posted the results, sorted from most to least frequent, here.
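For anyone curious what such a word count can look like, here is a minimal sketch in Java. It is not the code used to produce the list. It assumes the corpus text has already been extracted into a single plain UTF-8 file (the actual Hamshahri distribution format may differ), and the class and method names are made up for illustration. Tokens are split on anything that is not a letter or the zero-width non-joiner (U+200C), which Persian uses inside some words.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.regex.Pattern;

// Hypothetical helper: counts word occurrences in a block of text.
class WordFrequency {
    // Separator = one or more characters that are neither letters nor U+200C (ZWNJ).
    private static final Pattern SEPARATORS = Pattern.compile("[^\\p{L}\\u200C]+");

    // Returns a map from each token to the number of times it occurs.
    static Map<String, Integer> count(String text) {
        Map<String, Integer> counts = new HashMap<>();
        for (String token : SEPARATORS.split(text)) {
            if (!token.isEmpty()) { // split() can yield a leading empty string
                counts.merge(token, 1, Integer::sum);
            }
        }
        return counts;
    }

    // Returns the entries ordered by descending count, ties broken by the word itself.
    static List<Map.Entry<String, Integer>> sortedByFrequency(Map<String, Integer> counts) {
        List<Map.Entry<String, Integer>> entries = new ArrayList<>(counts.entrySet());
        entries.sort((a, b) -> {
            int byCount = Integer.compare(b.getValue(), a.getValue());
            return byCount != 0 ? byCount : a.getKey().compareTo(b.getKey());
        });
        return entries;
    }

    // Usage (assumed invocation): java WordFrequency corpus.txt
    public static void main(String[] args) throws IOException {
        String text = new String(Files.readAllBytes(Paths.get(args[0])), StandardCharsets.UTF_8);
        for (Map.Entry<String, Integer> e : sortedByFrequency(count(text))) {
            System.out.println(e.getKey() + "\t" + e.getValue());
        }
    }
}
```

Reading the whole file into memory keeps the sketch short; a large corpus would more likely be processed line by line or file by file, updating the same map as it goes.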

The list was rather long, so I’ve only included words that appeared in the corpus more than 1,000 times. I plan to start at the top of the list and make flashcards out of any words I don’t know or am unsure of. This should help me focus on words that are more commonly used. I hope you find it useful as well. I will post the Java code I used to parse the corpus if anybody is interested.
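The cutoff itself is a small filtering step. As a rough sketch (again illustrative rather than the code behind the posted list), given a map from each word to its count, the hypothetical `moreFrequentThan` method below keeps only the words seen more than a given number of times and returns them most frequent first.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

// Hypothetical helper: trims a word-count map down to the frequent words.
class FrequentWords {
    // Keeps words whose count is strictly greater than minCount,
    // ordered by descending count (order among equal counts is unspecified).
    static List<String> moreFrequentThan(Map<String, Integer> counts, int minCount) {
        List<Map.Entry<String, Integer>> kept = new ArrayList<>();
        for (Map.Entry<String, Integer> entry : counts.entrySet()) {
            if (entry.getValue() > minCount) {
                kept.add(entry);
            }
        }
        kept.sort((a, b) -> Integer.compare(b.getValue(), a.getValue()));

        List<String> words = new ArrayList<>();
        for (Map.Entry<String, Integer> entry : kept) {
            words.add(entry.getKey());
        }
        return words;
    }
}
```

Calling `FrequentWords.moreFrequentThan(counts, 1000)` would apply the same threshold described above.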

If I ever find the time, my next goal is to try to find phrases, word combinations, and word patterns. If anybody is interested in helping out, please let me know. I’d also be interested in finding out about similar (non-commercial) efforts for other languages, particularly other Indo-European languages or other languages that use the Arabic script.
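One possible starting point for word combinations is to count adjacent word pairs (bigrams). The sketch below is only an assumption about how that could begin, not something I have run on the corpus. It takes an already tokenized list of words, and the class and method names are hypothetical.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper: counts how often each pair of adjacent words occurs.
class BigramCounter {
    // Keys are "first second" (joined by a single space); values are occurrence counts.
    static Map<String, Integer> count(List<String> tokens) {
        Map<String, Integer> counts = new HashMap<>();
        for (int i = 0; i + 1 < tokens.size(); i++) {
            String pair = tokens.get(i) + " " + tokens.get(i + 1);
            counts.merge(pair, 1, Integer::sum);
        }
        return counts;
    }
}
```

This treats the token list as one continuous stream, so it also counts pairs that straddle sentence boundaries. Splitting the text into sentences first and counting within each would avoid that.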