Linux SORT problem

  1. #1
    Banned
    Join Date
    Jun 2004
    Posts
    154

    Linux SORT problem

    I am using Slackware 10 and want to get more familiar with piping and sorting. I have a document named output.txt, and I want to display its 20 most frequently occurring words along with how many times each one was used.
    I know I have to use sort and wc, but I don't know how to pipe them together.
    Any suggestions?

  2. #2
    Banned
    Join Date
    Jun 2004
    Posts
    154
    Ah, never mind, I posted too soon.
    I just used:
    cat output.txt | sort | uniq -c | sort -nr | head -20
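For anyone following along, here is a rough sketch of what each stage of that pipeline does, run on a tiny made-up file. The file contents and the variable names (sample, top_words) are assumptions for illustration only, and the input is assumed to hold one word per line.

```shell
# Hypothetical sample data: one word per line.
sample=$(mktemp)
printf '%s\n' pear apple pear car apple pear > "$sample"

# sort      : puts identical lines next to each other
# uniq -c   : collapses each run of identical lines and prefixes it with a count
# sort -nr  : orders numerically by that count, largest first
# head -20  : keeps at most the first 20 lines
top_words=$(sort "$sample" | uniq -c | sort -nr | head -20)
printf '%s\n' "$top_words"

rm -f "$sample"
```

The first sort matters because uniq only merges adjacent duplicate lines. Reading the file directly with sort also makes the leading cat unnecessary, though both forms give the same result.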

  3. #3
    Senior Member
    Join Date
    Feb 2002
    Posts
    856
    Are you sure that gives you what you want? It doesn't seem to be working for me. That pipeline counts whole lines, not words, so on a regular document, where almost every line is unique, it just gives you 20 lines that each have a count of 1. The only way it would work as you asked is if the document were just a list of words, one per line, like this:

    apple
    pear
    bear
    car
    house
    etc
    etc
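If the document is regular prose instead, one possible approach is to split it into one word per line before counting. This is only a sketch under some assumptions: the sample text and the variable names (doc, word_counts) are made up, a "word" is taken to be any run of letters, and case is folded so that "The" and "the" count together.

```shell
# Hypothetical sample text; point doc at your own file instead.
doc=$(mktemp)
printf '%s\n' 'The cat saw the dog.' 'The dog ran; the cat slept.' > "$doc"

# tr -cs '[:alpha:]' '\n' turns every run of non-letters into a single newline,
# which leaves one word per line; the second tr lowercases everything.
word_counts=$(tr -cs '[:alpha:]' '\n' < "$doc" | tr '[:upper:]' '[:lower:]' |
  sort | uniq -c | sort -nr | head -20)
printf '%s\n' "$word_counts"

rm -f "$doc"
```

Treating only letters as word characters means contractions such as "don't" are split in two, so the character class may need adjusting for real text.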
    For the wages of sin is death, but the free gift of God is eternal life in Christ Jesus our Lord.
    (Romans 6:23, WEB)

  4. #4
    Senior Member
    Join Date
    Feb 2002
    Posts
    856
    See this page. You will have to modify the example they use according to your own needs, but it will help you get what you are wanting to do. See the exercise "Frequency analysis of Text" and the solution for the exercise on the same page.

  5. #5
    Banned
    Join Date
    Jun 2004
    Posts
    154
    Yeah, it works just fine for me. The file I had was already broken up into one word per line.
