I just made a nice little pipeline. I love the Linux shell.
cat file | tr A-Z a-z | tr -c a-z "\n" | sort | uniq -c | sort -nr
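That gives a nice ordered count of the most used words in the file, most frequent first. It's pretty dumb -- it breaks words at anything which isn't an alphabetical character, so "don't" comes out as "don" and "t", and runs of spaces or punctuation leave empty lines that get counted too -- but it does the trick.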
The first tr changes uppercase characters to lowercase; the second changes anything which isn't a lowercase letter to a newline, leaving one word per line. The first sort puts the words in alphabetical order, which matters because uniq only collapses adjacent identical lines. uniq -c then counts each run of identical lines and outputs the count alongside the word, and the final sort -nr orders that list numerically, largest count first. Lovely.
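
If you want to reuse it, here is a minimal sketch of the same pipeline wrapped in a shell function. The name wordfreq and the optional "top N" argument are my own additions for illustration, not part of the one-liner above. It also adds -s to the second tr so each run of non-letters is squeezed into a single newline, which gets rid of nearly all the empty "words" (one can still sneak in if the file starts with a non-letter).

```shell
# wordfreq: hypothetical helper name, not from the original one-liner.
# Usage: wordfreq FILE [N]   (N defaults to 10)
wordfreq() {
    tr 'A-Z' 'a-z' < "$1" |   # lowercase everything
    tr -cs 'a-z' '\n' |       # squeeze each run of non-letters into one newline
    sort |                    # bring identical words next to each other
    uniq -c |                 # count each run of identical lines
    sort -nr |                # biggest counts first
    head -n "${2:-10}"        # keep only the top N lines
}
```

Something like `wordfreq file 5` should then print the five most frequent words, each with its count in front.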