-
October 13th, 2006, 05:59 PM
#1
Junior Member
Creating a wordlist from a website
Is there a way to create a wordlist from a website? For example, by scanning a site and collecting all the words on it into a list?
Thanks
-
October 13th, 2006, 06:15 PM
#2
You could always just grab the HTML, write a simple script to strip out all the HTML tags and place every word on a new line, and then go through the file to remove duplicate words.
If you think I have this script, you're wrong. ;p
-[WebCarnage]
-
October 13th, 2006, 09:03 PM
#3
Member
Can you elaborate a bit more?
Do you need specific words? If so, you can just use the search button.
Or do you need all the words in the HTML listed, for example in alphabetical order?
Cheerio!
-
October 13th, 2006, 11:01 PM
#4
<?php
$url = "http://something.com/index.html";
// Fetch the whole page as one string (needs allow_url_fopen, same as fopen on a URL)
$data = file_get_contents($url);
if ($data === false) {
    die("Could not fetch $url\n");
}
$data = preg_replace('#<[^>]*>#s', " ", $data); // Replace HTML tags with a space so adjacent words don't merge
$data = html_entity_decode($data); // Translate entities like &amp; to actual characters
$data = preg_replace('#[\r\n]+#', " ", $data); // Replace new lines with spaces (\n* would also match empty strings)
$data = preg_replace('#[^A-Za-z0-9\-\s]#', "", $data); // Strip characters that do not produce words (A-z would let [ \ ] ^ _ ` through)
$data = preg_replace('#\s{2,}#', " ", $data); // Collapse runs of whitespace
$dataex = explode(" ", trim($data));
print_r($dataex); // <-- word list
// Created in 3 minutes and not really tested, but it's a start :P
?>
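For the remaining steps mentioned earlier in the thread (one word per line, duplicates removed, alphabetical order), something along these lines might do. It is only a rough sketch: it assumes the words are already in an array like $dataex, and the function name build_wordlist and the sample input are made up for illustration.

```php
<?php
// Hypothetical helper (not from the thread): turn a raw word array
// into a lowercase, de-duplicated, alphabetically sorted list.
function build_wordlist(array $words): array
{
    // Lowercase so "Word" and "word" count as the same entry
    $words = array_map('strtolower', $words);
    // Drop empty strings left over from splitting on spaces
    $words = array_filter($words, function ($w) {
        return $w !== '';
    });
    // Remove duplicates, then sort alphabetically (sort() also reindexes from 0)
    $words = array_unique($words);
    sort($words, SORT_STRING);
    return $words;
}

// Sample input standing in for $dataex
$list = build_wordlist(["The", "quick", "fox", "", "the", "Fox"]);

// Print one word per line
echo implode("\n", $list) . "\n";
```

Run from the command line, the output can be redirected into a file to get the actual wordlist. Lowercasing everything is a design choice; drop the array_map line if case matters for your list.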