Google Indexing Binaries too - For good or for bad


  1. #1
    Junior Member
    Join Date
    Oct 2005
    Posts
    15

    Google Indexing Binaries too - For good or for bad

    During my regular routine of going through news across the IT world, I stumbled upon the following link:

    http://arstechnica.com/news.ars/post/20060710-7225.html


    Should Google continue to index binary files, despite the potential drawbacks? The company's position is that the more things on the Internet that are searched, the better things are for everyone, and that people shouldn't worry too much about any possible misuse. Still, the more powerful a tool becomes, the more the potential for abuse increases. This applies not only to Google, but to the Internet in general. As always, skeptical computing is the best defense.
    Google's bots crawling the web have become capable enough to index binary file contents too:

    http://homemade-tutorials.blogspot.c...ble-files.html


    Are the unforeseen dangers of spreading malware and spyware via Google's bots now staring us straight in the eye? Just throwing this topic out for discussion.
    For a successful technology, reality must take precedence over public relations, for Nature cannot be fooled.

    Let's make poverty history
    http://www.makepovertyhistory.org/

  2. #2
    Dissident 4dm1n brokencrow's Avatar
    Join Date
    Feb 2004
    Location
    Shawnee country
    Posts
    1,243
    There are going to be pluses and minuses with Google indexing binaries.

    The downside is some 'genius' will figure out a way to socially engineer the feature into a spyware/malware setup.

    The upside is that such indexing casts a light into those shadowy corners of the web that harbor spyware and malware, not to mention warez.
    “Everybody is ignorant, only on different subjects.” — Will Rogers

  3. #3
    Senior Member
    Join Date
    Jan 2003
    Posts
    3,914
    Hey Hey,

    There are no drawbacks to Google providing binary search capabilities... Human stupidity has always existed, and calling the ability to be socially engineered a drawback of the searching isn't right. By that logic, everything has the drawback of human stupidity... it's one of those global constants you just can't factor in; it always exists. I could say that this forum's drawback is that I could social engineer someone out of their password... that's human stupidity, not a drawback. Whoever wrote the initial article is a drawback (they're part of the problem of human stupidity)...

    Peace,
    HT
    IT Blog: .:Computer Defense:.
    PnCHd (Pronounced Pinched): Acronym - Point 'n Click Hacked. As in: "That website was pinched" or "The skiddie pinched my computer because I forgot to patch".

  4. #4
    Dissident 4dm1n brokencrow's Avatar
    Join Date
    Feb 2004
    Location
    Shawnee country
    Posts
    1,243
    I could say that this forum's drawback is that I could social engineer someone out of their password... that's human stupidity...
    The question then is, whose stupidity?
    “Everybody is ignorant, only on different subjects.” — Will Rogers

  5. #5
    Senior Member
    Join Date
    Jan 2003
    Posts
    3,914
    Originally posted here by brokencrow
    The question then is, whose stupidity?
    Judging by that comment... I'm going to say yours
    IT Blog: .:Computer Defense:.
    PnCHd (Pronounced Pinched): Acronym - Point 'n Click Hacked. As in: "That website was pinched" or "The skiddie pinched my computer because I forgot to patch".

  6. #6
    Junior Member
    Join Date
    Oct 2005
    Posts
    15

    Searching the deeper end of the World Wide Web

    There is still a vast part of the World Wide Web beyond the reach of today's search engines, which I prefer to call the 'deeper end of the Web'.


    The deeper end of the web consists of web pages created dynamically in response to user requests, along with the huge databases linked to those sites. Though a search engine query can fetch you a static HTML page, depending on its metadata and so on, your query cannot fetch any response from the huge amount of data residing in public databases. Search engines are not yet capable of digging this deep into the web.

    Searching on the Internet today can be compared to dragging a net across the surface of the ocean. While a great deal may be caught in the net, there is still a wealth of information that is deep, and therefore, missed. The reason is simple: Most of the Web's information is buried far down on dynamically generated sites, and standard search engines never find it.

    Traditional search engines create their indices by spidering or crawling surface Web pages. To be discovered, a page must be static and linked to other pages. Traditional search engines cannot "see" or retrieve content in the deep Web; those pages do not exist until they are created dynamically as the result of a specific search. Because traditional search engine crawlers cannot probe beneath the surface, the deep Web has heretofore been hidden.

    The deep Web is qualitatively different from the surface Web. Deep Web sources store their content in searchable databases that only produce results dynamically in response to a direct request. But a direct query is a "one at a time" laborious way to search. BrightPlanet's search technology automates the process of making dozens of direct queries simultaneously using multiple-thread technology and thus is the only search technology, so far, that is capable of identifying, retrieving, qualifying, classifying, and organizing both "deep" and "surface" content.
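    To make the "direct query" idea above concrete, here is a minimal Python sketch that fans one search term out to several searchable sources concurrently and merges the results, in the spirit of the multiple-thread approach the paper describes. The source names and the query function are hypothetical stand-ins; a real implementation would submit the term to each site's actual search form over HTTP.

```python
from concurrent.futures import ThreadPoolExecutor

# Hypothetical stand-in for submitting a term to one site's search form.
# A real version would issue an HTTP request to the site's query endpoint.
def query_source(source, term):
    fake_databases = {
        "library-catalog": {"malware": ["Analyzing Malicious Code"]},
        "patent-db": {"malware": ["US1234567: Detection method"]},
    }
    return [(source, hit) for hit in fake_databases.get(source, {}).get(term, [])]

def deep_search(term, sources):
    # Issue one direct query per source in parallel, then merge the batches.
    with ThreadPoolExecutor(max_workers=len(sources)) as pool:
        batches = pool.map(lambda s: query_source(s, term), sources)
    return [hit for batch in batches for hit in batch]

hits = deep_search("malware", ["library-catalog", "patent-db"])
```

    The point of the sketch is the shape of the search, not the data: each deep source only answers a direct query, so the searcher must issue one request per source and aggregate, rather than walking links the way a surface crawler does.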

    This is an interesting paper on this topic.


    P.S. Posting slightly off-topic but interesting info here only to keep the spirit of the thread alive.


    Peace
    For a successful technology, reality must take precedence over public relations, for Nature cannot be fooled.

    Let's make poverty history
    http://www.makepovertyhistory.org/

  7. #7
    Dissident 4dm1n brokencrow's Avatar
    Join Date
    Feb 2004
    Location
    Shawnee country
    Posts
    1,243
    I wonder to what extent a robots.txt file will keep Google and other search engines out of websites, or parts of them.

    I use robots.txt files on some of my sites, particularly the stuff I locally host via a DSL connection. They seem to work.
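    For reference, a minimal robots.txt that asks every crawler to stay out of one directory looks like this (a hypothetical path for illustration). Worth keeping in mind that the file is purely advisory: well-behaved bots like Googlebot honor it, but it is not access control, and anything you really want hidden shouldn't rely on it.

```
User-agent: *
Disallow: /private/
```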
    “Everybody is ignorant, only on different subjects.” — Will Rogers

  8. #8
    Senior Member
    Join Date
    Jan 2003
    Posts
    3,914
    Those of you interested in experimenting in this may wish to check out http://metasploit.com/research/misc/mwsearch/?q=bagle

    Peace,
    HT
    IT Blog: .:Computer Defense:.
    PnCHd (Pronounced Pinched): Acronym - Point 'n Click Hacked. As in: "That website was pinched" or "The skiddie pinched my computer because I forgot to patch".
