Linux SORT problem

**aura2** · April 27th, 2006, 10:02 PM

I am using Slackware 10 and want to get more familiar with piping and sorting. I have a document named output.txt and I want to display its 20 most frequent words, along with how many times each one is used in the document.
I know I have to use sort and wc but I don't know how to pipe them together.
Any suggestions?

**aura2** · April 27th, 2006, 10:13 PM

Ah, never mind, I posted too soon.

I just used
cat output.txt | sort | uniq -c | sort -nr | head -20

***preacherman481*** · April 27th, 2006, 10:27 PM

Are you sure that gives you what you want? It doesn't seem to be working for me. That pipeline counts repeated lines, not words, so on a regular document it mostly gives me whole lines that each occur only once. It would only work the way you asked if the document were just a list of words, one per line, like this:

apple
pear
bear
car
house
etc
etc
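
For a regular document, one possible approach is to split the text into one word per line first and then reuse the same counting pipeline. The following is only a sketch under some assumptions: `top_words` is a made-up helper name, a "word" is taken to be any run of letters, and upper and lower case are treated as the same word.

```shell
# Hypothetical helper: top_words FILE [N]
# Prints the N most frequent words in FILE (default 20), most frequent first.
top_words() {
    # Replace every run of non-letter characters with a single newline,
    # so each word ends up on its own line.
    tr -cs '[:alpha:]' '\n' < "$1" |
        # Fold to lower case so "The" and "the" are counted together.
        tr '[:upper:]' '[:lower:]' |
        # Drop the empty line produced when the file starts with a non-letter.
        grep -v '^$' |
        # Group identical words, count them, and sort by count, highest first.
        sort | uniq -c | sort -nr |
        head -n "${2:-20}"
}
```

With that defined, `top_words output.txt 20` should print a count followed by the word on each line. On a file that is already one word per line, it should report the same counts as the shorter pipeline, apart from the case folding.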

***preacherman481*** · April 28th, 2006, 12:33 AM

See this page. You will have to modify their example to suit your own needs, but it will help you get what you want. Look at the exercise "Frequency analysis of Text" and its solution on the same page.

**aura2** · April 28th, 2006, 02:40 AM

Yeah, it works just fine for me; the file I had was already broken up into one word per line.
