AO RSS Feed.

  1. #1
    Super Moderator
    Know-it-All Master Beaver

    Join Date
    Jan 2003
    Posts
    3,914

    AO RSS Feed.

    Hey Hey,

    I've had a bit of a bout of insomnia tonight, so I decided to create a shell script.

    Code:
    #!/bin/bash
    nc www.antionline.com 80 < get.txt | grep -B 5 "END TEMPLATE: P_activetopicbits -->" > ao.html
    html2text -style pretty -nobs -o ao.txt ao.html
    cat ao.html | grep showthread | cut -d " " -f 4 | cut -d "\"" -f 2 | cut -d "<" -f 3 | grep http > links.txt
    cat ao.html | grep "Today at" | cut -d ">" -f 4 | cut -d "<" -f 1 > title.txt
    cat links.txt | cut -d "&" -f 1 > link1.txt
    cat links.txt | cut -d "&" -f 2 > link2.txt
    counter=1
    echo "<?xml version=\"1.0\" ?>" > ao.xml
    echo "<rss version=\"2.0\">" >> ao.xml
    echo "  <channel>" >> ao.xml
    echo "     <title>AO RSS Feed</title>" >> ao.xml
    echo "     <link>http://www.antionline.com</link>" >> ao.xml
    echo "     <description>Computer Security</description>" >> ao.xml
    # Note: the counter runs 1..19, so this emits 19 items; use 21 for all 20.
    until [ $counter -eq 20 ]; do
            title=`head -n $counter title.txt | tail -n 1`
            link1=`head -n $counter link1.txt | tail -n 1`
            link2=`head -n $counter link2.txt | tail -n 1`
            echo "          <item>" >> ao.xml
            echo "          <title>$title</title>" >> ao.xml
            # The link was split on "&" so it can be rejoined with an XML-safe "&amp;".
            echo "          <link>$link1&amp;$link2</link>" >> ao.xml
            echo "          </item>" >> ao.xml
            counter=`expr $counter + 1`
    done
    echo "  </channel>" >> ao.xml
    echo "</rss>" >> ao.xml
    As you can see, it's a fairly simple shell script (you can download it from http://www.seeminglyrandom.info/ao/script.sh if you wish). It generates a simple RSS feed from the links on the main page (the first 20), with the title of each thread and a link to it. It's nothing fancy, but it took a long time since I've never touched XML before and had to fix parse errors. Anyway, it seems to be working, so feel free to use it as you wish. For those of you not interested in generating your own, I have a cron job on my website running it every 15 minutes. If my host doesn't complain, I plan on changing it to every 5 minutes. You can access the XML feed at http://www.seeminglyrandom.info/ao/ao.xml.

    It's not much but hopefully you'll enjoy it.

    Peace,
    HT

    PS... is this an OK forum to post this in? I figure it's security related (it's AO Headlines and I wanted everyone to know it existed).....
    IT Blog: .:Computer Defense:.
    PnCHd (Pronounced Pinched): Acronym - Point 'n Click Hacked. As in: "That website was pinched" or "The skiddie pinched my computer because I forgot to patch".

  2. #2
    Leftie Linux Lover the_JinX's Avatar
    Join Date
    Nov 2001
    Location
    Beverwijk Netherlands
    Posts
    2,535
    great stuff..

    will look nice next to the lwn.net rss
    saves me the time of coming to AO with nothing interesting there

    <edit type=add>
    don't forget you need the get.txt file http://www.seeminglyrandom.info/ao/get.txt

    People might need html2text
    The problem is there are a lot of versions of html2text http://www.google.com/search?q=html2text
    Just guessing that this http://userpage.fu-berlin.de/~mbayer....html#download is the version I need..
    And it seems to work well !!
    </edit>
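
    For anyone recreating get.txt by hand: it is just the raw HTTP request that nc pipes to the server. The real file isn't shown in this thread, so the following is only a minimal sketch, assuming a plain HTTP/1.0 GET for the front page; the actual file may carry other headers.

```shell
#!/bin/bash
# Assumption: get.txt holds a bare HTTP/1.0 request for the front page.
# The request path and headers are guesses, not copied from the original file.
# HTTP lines end in CRLF, and an empty line terminates the request.
printf 'GET / HTTP/1.0\r\nHost: www.antionline.com\r\n\r\n' > get.txt
```
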
    ASCII stupid question, get a stupid ANSI.
    When in Russia, pet a PETSCII.

    Get your ass over to SLAYRadio the best station for C64 Remixes !

  3. #3
    Super Moderator
    Know-it-All Master Beaver

    Join Date
    Jan 2003
    Posts
    3,914
    Hey Hey,

    Thanks for the comments, JinX. I completely forgot that they'd need get.txt. Anyway, my feed is a lil slow currently; I'm having crontab issues with my host. I think I'm going to move it over to another location, and I'll post an updated link when I get a chance. For now I'm updating the feed manually whenever I remember.

    Peace,
    HT


  4. #4
    Super Moderator
    Know-it-All Master Beaver

    Join Date
    Jan 2003
    Posts
    3,914
    Hey Hey,

    Well, it's a roundabout method of doing this, but I've got 2 shells: one that doesn't have webspace and one that "apparently" doesn't have a functioning crontab. So I've come to a conclusion. The one with the crontab and no webspace is running lynx every 5 minutes and requesting a page off my server. I'm using -dump and sending the output to /dev/null. The page it's requesting contains a single SSI statement to exec the shell script that generates the feed. The result is http://www.seeminglyrandom.info/ao/ao.xml
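
    The two halves of that setup could look roughly like the sketch below. The page name, script path, and URL are all hypothetical placeholders, and it assumes the web host runs Apache-style SSI with exec enabled.

```shell
#!/bin/bash
# Hypothetical names: substitute your own page, script path, and URL.
trigger_page="update.shtml"
feed_script="/home/user/ao/script.sh"
trigger_url="http://www.example.com/ao/update.shtml"

# Web host side: a page whose only content is an SSI exec of the feed script.
printf '<!--#exec cmd="%s" -->\n' "$feed_script" > "$trigger_page"

# Cron host side: the crontab line that requests the page every 5 minutes
# and throws the output away. Add it with "crontab -e".
cron_line="*/5 * * * * lynx -dump $trigger_url > /dev/null 2>&1"
echo "$cron_line"
```

    Each request makes the web server run the script, so the feed is only as fresh as the last fetch.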

    Enjoy,... and as always Peace,
    HT

  5. #5
    Leftie Linux Lover the_JinX's Avatar
    Join Date
    Nov 2001
    Location
    Beverwijk Netherlands
    Posts
    2,535

    BUGFIX

    damn.. it seems the new banner (or something else they (jupmedia) changed) killed your rss thingy.. I'm getting all blanks too !!

    If I find a fix, I'll post it here . .

    <edit type="add">
    Well that was easy.. The entire layout of the AO html has changed..

    the line could/should be
    nc www.antionline.com 80 < get.txt | grep -A 122 "Active In AntiOnline's Forums" > ao.html
    (this greps you all the active topics..)

    also, I've noticed that although you are using html2text, nothing is done with the outcome.. (ao.txt)
    so let's lose that line (and the dependency on html2text)

    the weird thing I get now is 19 lines and the last is the same as number 18, well the title is, the link isn't..

    Found out the cause..

    you are grepping for "Today at" and the later posts (numbers 19 and further) are posted "Yesterday at"

    That fixed it !

    my updated version..

    Code:
    #!/bin/bash
    nc www.antionline.com 80 < get.txt | grep -A 122 "Active In AntiOnline's Forums" > ao.html
    cat ao.html | grep showthread | cut -d " " -f 4 | cut -d "\"" -f 2 | cut -d "<" -f 3 | grep http > links.txt
    cat ao.html | grep "Today at" | cut -d ">" -f 4 | cut -d "<" -f 1 > title.txt
    cat ao.html | grep "Yesterday at" | cut -d ">" -f 4 | cut -d "<" -f 1 >> title.txt
    cat links.txt | cut -d "&" -f 1 > link1.txt
    cat links.txt | cut -d "&" -f 2 > link2.txt
    counter=1
    echo "<?xml version=\"1.0\" ?>" > ao.xml
    echo "<rss version=\"2.0\">" >> ao.xml
    echo "  <channel>" >> ao.xml
    echo "     <title>AO RSS Feed</title>" >> ao.xml
    echo "     <link>http://www.antionline.com</link>" >> ao.xml
    echo "     <description>Computer Security</description>" >> ao.xml
    until [ $counter -eq 21 ]; do
            title=`head -n $counter title.txt | tail -n 1`
            link1=`head -n $counter link1.txt | tail -n 1`
            link2=`head -n $counter link2.txt | tail -n 1`
            echo "          <item>" >> ao.xml
            echo "          <title>$title</title>" >> ao.xml
            echo "          <link>$link1&amp;$link2</link>" >> ao.xml
            echo "          </item>" >> ao.xml
            counter=`expr $counter + 1`
    done
    echo "  </channel>" >> ao.xml
    echo "</rss>" >> ao.xml
    you can also download the latest rss feed and all the scripts..

    http://etv.cx/~the_jinx/ao.xml

    http://etv.cx/~the_jinx/ao_rss
    http://etv.cx/~the_jinx/get.txt

    </edit>
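
    One thing to watch with the two separate greps: every "Yesterday at" title lands after all the "Today at" titles, so title.txt only lines up with links.txt if the page already lists threads in that order. A single pass keeps rows in page order either way. This is a minimal sketch; the sample markup is made up to fit the cut fields used above and is not the real AO HTML.

```shell
#!/bin/bash
# Hypothetical sample rows; the real front-page markup is not reproduced here.
cat > sample.html <<'EOF'
<td><a href="t1"><b>Newest thread</b></a> Today at 09:15 AM</td>
<td><a href="t2"><b>Older thread</b></a> Yesterday at 11:40 PM</td>
<td><a href="t3"><b>Unrelated row</b></a> 01-02-2005</td>
EOF

# One extended-regex grep matches both date labels and preserves page order.
grep -E "(Today|Yesterday) at" sample.html | cut -d ">" -f 4 | cut -d "<" -f 1 > title.txt
cat title.txt
```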

  6. #6
    Banned
    Join Date
    Jul 2001
    Posts
    1,100
    Greetings All:

    Unfortunately, with scripts like these you're always opening yourself up to things breaking every time the webmaster adds a new banner (which happens pretty often on this website), moves things around, etc.

    You also have to worry about injection of code from people making creative topics, etc. etc. (which probably wouldn't be a big worry considering that it's AO that you're pulling from, but regardless).
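
    As a rough illustration of guarding against that, titles can be escaped before they are written into the feed. The helper below is hypothetical and not part of the posted script; it only covers the three characters that matter inside element text.

```shell
#!/bin/bash
# xml_escape: hypothetical helper, not part of the script posted in this thread.
# Escapes &, < and > so a thread title cannot break or inject markup.
# The ampersand is handled first so the other entities are not escaped twice.
xml_escape() {
    printf '%s\n' "$1" | sed -e 's/&/\&amp;/g' -e 's/</\&lt;/g' -e 's/>/\&gt;/g'
}

xml_escape 'Q&A: is <script> safe?'
```

    In the feed script this would wrap each value, for example title=`xml_escape "$title"`, before the echo that writes the item.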

    Although you did a good job with this HTRegz, perhaps someone should request that JupiterMedia start an official thread dump.

    I used to have one, so the code is probably still integrated into the site. All they'd have to do is add it as a cron job, and there you have it....

  7. #7
    Leftie Linux Lover the_JinX's Avatar
    Join Date
    Nov 2001
    Location
    Beverwijk Netherlands
    Posts
    2,535
    It could even be made with ssi (php)

    I mean.. the queries that make up the frontpage could also very easily make an rss (without much overhead)

    PS I made an official request here

  8. #8
    Leftie Linux Lover the_JinX's Avatar
    Join Date
    Nov 2001
    Location
    Beverwijk Netherlands
    Posts
    2,535

    Major Update

    Another major upgrade of the AO rss thingy..

    the new version has added checking (better ?) for & problems ( makes it &amp; )
    Also the new version has an added description with:

    Topic starter
    In what forum
    Number of replies
    Poster of last reply
    Time of last reply

    Hope you'll like..

    and I'll keep it updated

    Code:
    nc www.antionline.com 80 < get.txt | grep -A 122 "Active In AntiOnline's Forums" > ao.html
    cat ao.html | grep showthread | cut -d " " -f 4 | cut -d "\"" -f 2 | cut -d "<" -f 3 | grep http > links.txt
    cat ao.html | grep "Today at" | cut -d ">" -f 4 | cut -d "<" -f 1 > title.txt
    cat ao.html | grep "Yesterday at" | cut -d ">" -f 4 | cut -d "<" -f 1 >> title.txt
    cat ao.html | grep "Today at" | cut -d ">" -f 6 | cut -d "<" -f 1 > time.txt
    cat ao.html | grep "Yesterday at" | cut -d ">" -f 6 | cut -d "<" -f 1 >> time.txt
    cat ao.html | grep "Topic Started By: " | cut -d ">" -f2 | cut -d "<" -f1 > starter.txt
    cat ao.html | grep "Topic Started By: " | cut -d ">" -f6 | cut -d "<" -f1 > lastreply.txt
    cat ao.html | grep "Thread Is In: " | cut -d ">" -f2 | cut -d "<" -f1 > cat.txt
    cat ao.html | grep "Thread Is In: " | cut -d ">" -f4 | cut -d "<" -f1 > rep.txt
    counter=1
    echo "<?xml version=\"1.0\" ?>" > ao.xml
    echo "<rss version=\"2.0\">" >> ao.xml
    echo "  <channel>" >> ao.xml
    echo "     <title>Antionline RSS Feed</title>" >> ao.xml
    echo "     <link>http://www.antionline.com</link>" >> ao.xml
    echo "     <description>Maximum Security for a Connected World</description>" >> ao.xml
    until [ $counter -eq 21 ]; do
            title=`head -n $counter title.txt | tail -n 1`
            title=${title//&/&amp;}
            links=`head -n $counter links.txt | tail -n 1`
            links=${links//&/&amp;}
            desc="Topic Started By: "`head -n $counter starter.txt | tail -n 1`", "
            desc=$desc" In "`head -n $counter cat.txt | tail -n 1`", "
            desc=$desc`head -n $counter rep.txt | tail -n 1`" Replies, "
            desc=$desc"Last Reply By: "`head -n $counter lastreply.txt | tail -n 1`
            desc=$desc","`head -n $counter time.txt | tail -n 1`
            desc=${desc//&/&amp;}
            echo "          <item>" >> ao.xml
            echo "          <title>$title</title>" >> ao.xml
            echo "          <link>$links</link>" >> ao.xml
            echo "          <description>$desc</description>" >> ao.xml
            echo "          </item>" >> ao.xml
            counter=`expr $counter + 1`
    done
    echo "  </channel>" >> ao.xml
    echo "</rss>" >> ao.xml
    latest version: http://www.etv.cx/~the_jinx/ao_rss
    the rss feed: http://www.etv.cx/~the_jinx/ao.xml
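
    A possible simplification of the loop, offered only as a sketch: instead of running head and tail once per field per item, paste can walk the extracted files in parallel. The sample data is hypothetical, only two of the fields are shown, escaping is left out for brevity, and it assumes no field contains a tab.

```shell
#!/bin/bash
# Hypothetical sample data standing in for the files the script extracts.
printf 'First topic\nSecond topic\n' > title.txt
printf 'http://example.com/t=1\nhttp://example.com/t=2\n' > links.txt

# paste joins line N of each file with a tab; read splits the pair apart again.
# head -n 20 caps the feed at 20 items, as the counter does in the script above.
paste title.txt links.txt | head -n 20 | while IFS=$'\t' read -r title link; do
    echo "          <item>"
    echo "          <title>$title</title>"
    echo "          <link>$link</link>"
    echo "          </item>"
done > items.xml
cat items.xml
```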
