Thread: Steganography and Steganalysis

  1. #1
    Senior Member therenegade's Avatar
    Join Date
    Apr 2003
    Posts
    400

    Steganography and Steganalysis

    This is a brief draft of a paper I’ve been working on recently. It introduces steganography as a means of transmitting data covertly, along with methods to detect it. I’ve deliberately kept all the math out; I’m hoping this doesn’t make things too unclear. Certain parts of this text have been taken directly from the original sources; in the part about a few programs available on the net, for instance, I’ve quoted material from the manufacturers themselves. References have been included as and where necessary. Any comments or criticism would be appreciated.

  2. #2
    Senior Member therenegade's Avatar
    Join Date
    Apr 2003
    Posts
    400
    Someone asked me to paste it in here, sorry it took so long:


    STEGANOGRAPHY AND STEGANALYSIS


    Introduction:

    Steganography, simply put, is a way to send information without revealing that the information is present at all. The information can be sent by manipulating seemingly normal media such as text, pictures, and even audio and video files, with astonishingly little difference between the original and the doctored file to the naked eye. This differs from cryptography, where you encrypt your information (the plaintext): if the ciphertext is intercepted, the original plaintext can only be viewed by someone who knows the key and the algorithm. Steganography, however, aims to conceal the fact that the information was ever sent, the logic being that if you don’t know it was sent, you can’t begin trying to obtain it. This makes the combination quite disconcerting: a plaintext that is encrypted first and then hidden in a ‘carrier’. Many admins configure their firewalls to block exe or zip files; most, though, let pictures, audio, video and text files through. I recall having read somewhere that 70% of the pictures surveyed contained significantly different information from the original image. This is cause for at least a little concern: what would be the point in plugging all the holes in your network, only to have people smuggle valuable information out through a picture of Pamela Anderson, which you’d ogle and let through, or an mp3 of Britney Spears screeching, which you’d probably grimace at (if you have good taste :P) but let through anyway?


    I hope this has provided at least a brief idea of what steganography is. Here’s a brief glossary of related terms:

    Steganography: The art of hiding information within other information so as not to arouse suspicion.

    Watermarking: Watermarking embeds a digital signal in text, image, audio or video files, which may contain information and proof of rights to a product's owner or publisher. It can be thought of as a steganographic application in which the data embedded verifies authenticity.

    Lossless compression: Preferred when the original information must remain intact (as with steganographic images), since the original data can be reconstructed exactly. This type of compression is typical of GIF and BMP images.

    Lossy compression: While it also saves space, it may not maintain the integrity of the original image. This method is typical of JPG images and yields very good compression.

    Carrier: Any file that is modified to hide data while leaving very little visible or audible difference from the original.

    The ‘carriers’ generally used to transmit hidden messages are ones that are common and don’t arouse suspicion, hence pictures and audio formats tend to get used a lot. Let us look at the ways of using them in a little more detail.

    Text:
    While text is not generally a suitable medium for hiding other types of information such as pictures, it can be used in more ways than one to hide text within itself. Methods range from simple ‘null ciphers’, which may rely on word shifting, to hiding text in the properties of Word files or in the comments in a webpage’s source.
    Simple example of a null cipher:
    Fishing freshwater bends and saltwater
    coasts rewards anyone feeling stressed.
    Resourceful anglers usually find masterful
    leapers fun and admit swordfish rank
    overwhelming anyday.

    Reading the third letter of every word, we get ‘Send lawyers, guns and money’.
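    The extraction step can be checked with a few lines of Python. This is only a sketch: the function name read_null_cipher and its position parameter are made up for illustration, and it assumes every word is at least three letters long, as in the example text.

```python
def read_null_cipher(cover_text: str, position: int = 3) -> str:
    """Collect the letter at `position` (1-based) from every word that is long enough."""
    letters = []
    for word in cover_text.split():
        # Keep letters only, so trailing punctuation does not affect the count.
        cleaned = "".join(ch for ch in word if ch.isalpha())
        if len(cleaned) >= position:
            letters.append(cleaned[position - 1])
    return "".join(letters).lower()


cover = (
    "Fishing freshwater bends and saltwater "
    "coasts rewards anyone feeling stressed. "
    "Resourceful anglers usually find masterful "
    "leapers fun and admit swordfish rank "
    "overwhelming anyday."
)

hidden = read_null_cipher(cover)
print(hidden)  # sendlawyersgunsandmoney
```

    The word boundaries of the hidden message are not stored anywhere; the reader has to supply them.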

    Spam, surprisingly, is also a way to send these kinds of messages. Spam Mimic (www.spammimic.com) lets you encode a short message into seemingly normal spam, which most users don’t look at anyway.
    The following is an encoded ‘spam’ message of the plaintext ‘This is a test message’:

    Dear Friend; Thank-you for your interest in our publication. If you no longer wish to receive our publications simply reply with a Subject: of "REMOVE" and you will immediately be removed from our club! This mail is being sent in compliance with Senate bill 1816; Title 3; Section 304. This is not multi-level marketing. Why work for somebody else when you can become rich within 45 days. Have you ever noticed more people than ever are surfing the web & people love convenience! Well, now is your chance to capitalize on this .We will help you SELL MORE and use credit cards on your website. You are guaranteed to succeed because we take all the risk! But don't believe us! Ms Ames of Montana tried us and says, "I was skeptical but it worked for me”! We are licensed to operate in all states. We implore you - act now. Sign up a friend and you'll get a discount of 80%! Best regards. Dear E-Commerce professional, especially for you - this breath-taking news. We will comply with all removal requests. This mail is being sent in compliance with Senate bill 1626; Title 1; Section 301. This is different than anything else you've seen! Why work for somebody else when you can become rich in 38 weeks. Have you ever noticed most everyone has a cell phone plus people love convenience! Well, now is your chance to capitalize on this. We will help you decrease perceived waiting time by 200% plus use credit cards on your website! You are guaranteed to succeed because we take all the risk. But don't believe us. Mr. Jones of Georgia tried us and says, "Now I'm rich many more things are possible”! This offer is 100% legal! So make yourself rich now by ordering immediately! Sign up a friend and you'll get a discount of 60%. Best regards!



    Colour theory:

    In order to understand how pictures can be digitally manipulated, it is necessary to understand a little about how colours are represented. Some knowledge of the file formats in use, such as jpeg, gif and bmp, is also useful. The following section is a brief introduction to just that.
    Colours are generally represented on the screen in the form of pixels.

    A colour can be either a primary colour or a secondary colour. The primary colours are Red, Green and Blue. All other colours are secondary colours, as they can be made by combining the primary colours. For example, green + red = yellow, green + blue = cyan and red + blue = magenta. Similarly, by varying the intensity of the primary colours, any colour can be displayed. Each RGB (Red, Green, Blue) component of a colour is specified in 8 bits (one byte), hence the intensity of the Red, Green or Blue component can vary from 0 to 255. If three bytes are combined to constitute one pixel, the system is called a 24-bit (or 3-byte) colour scheme, or the 24-bit true colour scheme as it is known today. The encoding determines the size of the actual picture. For example, an uncompressed 640 X 480 pixel picture with an 8-bit colour scheme would occupy 640 X 480 = 307,200 bytes, or approximately 307 kB, whereas the same picture using a 24-bit colour scheme would occupy 640 X 480 X 3 = 921,600 bytes, or approximately 921 kB, three times the size.
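    The size arithmetic is easy to check in Python. This is a minimal sketch that assumes uncompressed pixel data with no file header; the function name raw_image_bytes is made up for illustration. Note that the results are in bytes (one byte per 8 bits of colour depth).

```python
def raw_image_bytes(width: int, height: int, bits_per_pixel: int) -> int:
    """Size in bytes of uncompressed pixel data (no header, no padding)."""
    return width * height * bits_per_pixel // 8


size_8bit = raw_image_bytes(640, 480, 8)
size_24bit = raw_image_bytes(640, 480, 24)

print(size_8bit)                # 307200
print(size_24bit)               # 921600
print(size_24bit // size_8bit)  # 3
```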
    Another way to represent images with an 8-bit scheme is by using a palette. The colours themselves are still 24-bit true colour, but a palette specifies which colours are used in the image. Each pixel is encoded in 8 bits, where the value points to a 24-bit colour entry in the palette. This method limits the number of unique colours in a given image to 256 (2^8).
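    A small sketch of how a palette lookup works. The four-entry palette, the pixel values and the name decode_pixels are all invented for illustration; a real 8-bit palette can hold up to 256 entries.

```python
# Hypothetical palette: each entry is a 24-bit (R, G, B) colour.
palette = [
    (0, 0, 0),      # index 0: black
    (255, 0, 0),    # index 1: red
    (0, 255, 0),    # index 2: green
    (255, 255, 0),  # index 3: yellow (red + green)
]

# Each pixel is stored as one byte holding an index into the palette.
pixel_indices = bytes([0, 1, 1, 3, 2])


def decode_pixels(indices: bytes, table: list) -> list:
    """Translate 8-bit palette indices into 24-bit RGB tuples."""
    return [table[i] for i in indices]


rgb_pixels = decode_pixels(pixel_indices, palette)
print(rgb_pixels[3])  # (255, 255, 0)
```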
    Colour palettes and eight-bit color are commonly used with the Graphics Interchange Format (GIF) and Bitmap (BMP) image formats. GIF and BMP are generally considered to offer lossless compression because the image recovered after encoding and compression is bit-for-bit identical to the original image.
    The Joint Photographic Experts Group (JPEG) image format uses discrete cosine transforms rather than a pixel-by-pixel encoding. In JPEG, the image is divided into 8 X 8 blocks for each separate color component. The goal is to find blocks where the amount of change in the pixel values (the energy) is low. If the energy level is too high, the block is subdivided into 8 X 8 sub blocks until the energy level is low enough. Each 8 X 8 block (or sub block) is transformed into 64 discrete cosine transform coefficients that approximate the luminance (brightness, darkness, and contrast) and chrominance (color) of that portion of the image. JPEG is generally considered to be lossy compression because the image recovered from the compressed JPEG file is a close approximation of, but not identical to, the original.

    As a simple example, here is how LSB (least significant bit) substitution could be done:

    The ASCII code for the letter A is 01000001. Say you want to hide the letter A in the following block of data:


    10010101 00001101 11001001 10010110
    00001111 11001011 10011111 00010000


    To hide 01000001, we just replace the LSB of every byte with the corresponding bit from 01000001. The result turns out to be:


    10010100 00001101 11001000 10010110
    00001110 11001010 10011110 00010001
    By overwriting the least significant bit, the numeric value of the byte changes very little and is least likely to be detected by the human eye or ear.
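    The substitution above can be reproduced with a short Python sketch. The function names embed_lsb and extract_lsb are made up for illustration, and the carrier is treated as a plain run of bytes rather than a real image file.

```python
def embed_lsb(carrier: bytes, message_bits: str) -> bytes:
    """Overwrite the least significant bit of each carrier byte with one message bit."""
    if len(message_bits) > len(carrier):
        raise ValueError("carrier too small for message")
    out = bytearray(carrier)
    for i, bit in enumerate(message_bits):
        # Clear the lowest bit, then set it to the message bit.
        out[i] = (out[i] & 0xFE) | int(bit)
    return bytes(out)


def extract_lsb(stego: bytes, n_bits: int) -> str:
    """Read back the least significant bit of the first n_bits bytes."""
    return "".join(str(b & 1) for b in stego[:n_bits])


carrier = bytes([
    0b10010101, 0b00001101, 0b11001001, 0b10010110,
    0b00001111, 0b11001011, 0b10011111, 0b00010000,
])

stego = embed_lsb(carrier, "01000001")  # ASCII 'A'
print(" ".join(f"{b:08b}" for b in stego))
# 10010100 00001101 11001000 10010110 00001110 11001010 10011110 00010001

recovered = chr(int(extract_lsb(stego, 8), 2))
print(recovered)  # A
```

    Each byte changes by at most 1, which is why the difference is so hard to see or hear.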



    Data can also be hidden in various audio bit streams. With the rising popularity of the mp3 format (due to good sound quality at compression ratios of nearly 11:1), it was inevitable that there would be attempts to hide data in it. The ISO-MPEG Audio Layer-3 format, or mp3 as it is more popularly known, also has bit redundancies. These can be used to send data with an almost imperceptible change in sound quality to the human ear.

    Hiding data in protocols:

    Similarly, data can be hidden by exploiting redundancies in protocols such as TCP/IP. There are flags and header fields in both TCP and IP that are not necessarily set every time a packet passes through a router or other device. The number of packets sent by even a small server every day is quite large. Hence, if data can be sent by changing the flags or manipulating the header, it would be a virtually undetectable way to send data. Most admins wouldn’t worry about an odd flag being set :P
    The main flaw with this method is that header information is sometimes changed as it passes through routers, so there’d be no guarantee that your information would remain intact.
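    To make the idea of header manipulation concrete, here is a sketch that tucks two bytes into the 16-bit Identification field of an IPv4 header. The choice of that field is an assumption made for illustration (the text above only speaks of flags and headers in general), the function names are made up, the addresses are documentation addresses, and nothing is actually sent on the network.

```python
import struct


def build_ipv4_header(identification: int) -> bytes:
    """Pack a minimal 20-byte IPv4 header with the given Identification value."""
    version_ihl = (4 << 4) | 5          # IPv4, header length of 5 32-bit words
    return struct.pack(
        "!BBHHHBBH4s4s",
        version_ihl,
        0,                              # type of service
        20,                             # total length (header only)
        identification,                 # 16-bit field carrying the hidden bytes
        0,                              # flags and fragment offset
        64,                             # time to live
        6,                              # protocol number for TCP
        0,                              # checksum left at zero in this sketch
        bytes([192, 0, 2, 1]),          # source address (documentation range)
        bytes([192, 0, 2, 2]),          # destination address (documentation range)
    )


def read_identification(header: bytes) -> int:
    """Read the Identification field back from bytes 4 and 5 of the header."""
    return struct.unpack("!H", header[4:6])[0]


secret = b"hi"
header = build_ipv4_header(int.from_bytes(secret, "big"))
recovered = read_identification(header).to_bytes(2, "big")
print(recovered)  # b'hi'
```

    At two bytes per packet this is a slow channel, and as noted above, any device that rewrites the header along the way would destroy the hidden bytes.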


    Steganalysis:

    Steganalysis is the detection of hidden information present in a carrier. The methods of steganalysis are quite similar to those of cryptanalysis in the sense that they can be classified by how much prior information is available. Mere detection is not always sufficient, as the analyst may also want to extract the hidden data from the carrier. Hence steganalysis can be divided into two kinds: passive and active. Passive steganalysis involves only detecting the presence of hidden information or a modified carrier, while active steganalysis involves extracting the hidden information as well. The primary difference between cryptanalysis and steganalysis is that in cryptanalysis the goal is to break the encryption, whereas in steganalysis, if the hidden data cannot be extracted, you can always modify the carrier so that the hidden data in it becomes ‘garbage’. Consider a scenario in which an admin encounters both encrypted data and carriers containing hidden data. With the encrypted data, his only option is to hope to break the encryption. With the carrier, however, even if he is unable to extract the data, he always has the alternative of modifying it again by running it through a few stego applications himself (sort of an ‘if I can’t see your data, then I’m not letting the other person see it either’ approach).
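    The ‘turn it into garbage’ option can be sketched as follows. This assumes the hidden data sits in the least significant bits of raw carrier bytes; a real image or audio file would have to be decoded to its sample data first. The name scrub_lsbs is made up for illustration.

```python
import random


def scrub_lsbs(data: bytes, seed=None) -> bytes:
    """Overwrite every least significant bit with a random bit.

    Whatever was hidden in the LSBs is destroyed, while each byte
    changes by at most 1, so the carrier still looks the same.
    """
    rng = random.Random(seed)
    return bytes((b & 0xFE) | rng.getrandbits(1) for b in data)


suspect = bytes([
    0b10010100, 0b00001101, 0b11001000, 0b10010110,
    0b00001110, 0b11001010, 0b10011110, 0b00010001,
])
cleaned = scrub_lsbs(suspect)

# The upper seven bits of every byte are untouched.
print(all(a >> 1 == b >> 1 for a, b in zip(suspect, cleaned)))  # True
```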

    Based on how much prior information is available to the analyst, steganalysis can be classified as:
    Steganography-only attack: The steganography medium is the only item available for analysis.

    Known-carrier attack: The carrier and steganography media are both available for analysis. The original carrier and the modified one with the hidden data are known.

    Known-message attack: The hidden message is known. This may help by looking for patterns and comparing the data with future transmissions intercepted.

    Chosen-steganography attack: The steganography medium and algorithm are both known. The steganalyst generates modified carriers by running them through different stego algorithms and searches for similar patterns with the intercepted data.

    Chosen-message attack: A known message and steganography algorithm are used to create steganography media for future analysis and comparison. By running known data through known stego algorithms, patterns can be stored for later reference.

    Known-steganography attack: The carrier and steganography medium, as well as the steganography algorithm, are known. This is the dream of any steganalyst.
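    As an illustration of a known-carrier attack, here is a sketch that compares an original carrier with the modified one and lists where they differ. It uses the two byte blocks from the LSB example earlier in this post; the function name changed_positions is made up.

```python
def changed_positions(carrier: bytes, stego: bytes) -> list:
    """Return the indices at which the two byte strings differ."""
    if len(carrier) != len(stego):
        raise ValueError("carrier and stego medium differ in length")
    return [i for i, (a, b) in enumerate(zip(carrier, stego)) if a != b]


carrier = bytes([
    0b10010101, 0b00001101, 0b11001001, 0b10010110,
    0b00001111, 0b11001011, 0b10011111, 0b00010000,
])
stego = bytes([
    0b10010100, 0b00001101, 0b11001000, 0b10010110,
    0b00001110, 0b11001010, 0b10011110, 0b00010001,
])

diffs = changed_positions(carrier, stego)
print(diffs)  # [0, 2, 4, 5, 6, 7]

# Every difference is confined to the lowest bit, a hint that LSB substitution was used.
print(all((carrier[i] ^ stego[i]) == 1 for i in diffs))  # True
```

    Bytes 1 and 3 are unchanged only because their LSB already matched the message bit, so the number of differences understates the message length.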

    Tools of the trade:

    These are a few products I thought worth mentioning. A more comprehensive list of tools can be found at: http://www.jjtc.com/Security/stegtools.htm


    Outguess: (www.outguess.org)

    OutGuess is a steganographic tool used to insert hidden information into the redundant bits of data sources. The program works by extracting redundant bits for use in embedding and writing them back after modification. What makes OutGuess unique is its ability to preserve frequency count statistics, a main property used in the detection of steganographic content. OutGuess automatically determines the maximum message size that can be hidden in an image while still maintaining frequency count statistics to avoid detection.
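    To show why frequency counts matter, here is a sketch of one simple check. It is not taken from OutGuess or any other tool mentioned here; it assumes plain LSB overwriting of byte values, which tends to even out how often the two values in each pair (2k, 2k+1) occur. The name pair_chi_square is made up for illustration.

```python
from collections import Counter


def pair_chi_square(values) -> float:
    """Chi-square style statistic over value pairs (0,1), (2,3), ..., (254,255).

    A result near zero means the two values in each pair occur about
    equally often, which is what random LSB overwriting tends to produce.
    """
    counts = Counter(values)
    stat = 0.0
    for even in range(0, 256, 2):
        a, b = counts[even], counts[even + 1]
        expected = (a + b) / 2
        if expected > 0:
            stat += (a - expected) ** 2 / expected
    return stat


lopsided = bytes([0] * 100)            # only the even value of the pair appears
balanced = bytes([0] * 50 + [1] * 50)  # both values of the pair appear equally

print(pair_chi_square(lopsided))  # 50.0
print(pair_chi_square(balanced))  # 0.0
```

    A tool that preserves the original frequency counts would keep such a statistic close to what an untouched image gives, which is the property described above.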


    JPHS (JPHide and JPSeek): ( linux01.gwdg.de/~alatham/stego.html )

    JPHIDE and JPSEEK are freeware programs that allow you to hide data within a JPEG file. The author designed them so as to make it nearly impossible to prove that an image contains hidden data. By keeping the insertion rate at 5% or less, the author believes that it’s impossible to conclude that there is any hidden data without having the original image to compare against. The basic rule is that as the insertion percentage increases, the statistical nature of the jpeg coefficients differs more from normal. Typically, an insertion rate greater than 15% starts to affect the image enough to become visible to the naked eye. The author suggests using host images that have a lot of fine detail, as they are better at masking the effects of the data hiding.


    MP3stego: (http://www.petitcolas.net/fabien/ste...aphy/mp3stego/)

    MP3Stego will hide information in MP3 files during the compression process. The data is first compressed, encrypted and then hidden in the MP3 bit stream.


    S-Tools:

    S-Tools uses least significant bit substitution in files that employ lossless compression, such as eight- or 24-bit color and pulse code modulation. S-Tools employs a password for least significant bit randomization and can encrypt data using the Data Encryption Standard (DES), International Data Encryption Algorithm (IDEA), Message Digest Cipher (MDC), or Triple-DES.
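    The idea of using a password to randomize which least significant bits are used can be sketched like this. It is only an illustration of the general idea and not S-Tools’ actual algorithm; the function names are made up, and Python’s random module is not a cryptographically secure generator.

```python
import random


def positions_for(password: str, carrier_len: int, n_bits: int) -> list:
    """Derive a repeatable, password-dependent list of distinct byte positions."""
    rng = random.Random(password)  # same password gives the same sequence
    return rng.sample(range(carrier_len), n_bits)


def embed(carrier: bytes, bits: str, password: str) -> bytes:
    """Write each message bit into the LSB of a password-selected byte."""
    out = bytearray(carrier)
    for pos, bit in zip(positions_for(password, len(out), len(bits)), bits):
        out[pos] = (out[pos] & 0xFE) | int(bit)
    return bytes(out)


def extract(stego: bytes, n_bits: int, password: str) -> str:
    """Read the LSBs back in the same password-selected order."""
    return "".join(
        str(stego[pos] & 1) for pos in positions_for(password, len(stego), n_bits)
    )


carrier = bytes(range(64))
stego = embed(carrier, "01000001", "secret")
print(extract(stego, 8, "secret"))  # 01000001
```

    Without the password, an analyst does not know which bytes carry the message or in what order to read them.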

    Gif-it-Up: (http://packetstormsecurity.nl/crypt/stego/gif-it-up/)

    This is another freeware program which can be used to hide data within gif files using the least significant bit substitution method. It also includes an encryption option.

    Steganos Security Suite: (www.steganos.com)

    This is a paid tool that is supposed to be pretty good. Registration is required after seven days of use, though.

    EnCase: (www.guidancesoftwares.com)

    In EnCase, investigators must identify and match the MD5 hash value of each suspected file. They must import or build a library of hash sets (in this particular case, for steganography software) with the library feature in EnCase [5]. The hash sets will identify stego file matches.

    Ilook Investigator: (www.ilook-forensics.org)


    ILook Investigator © is a forensic analysis tool used by thousands of law enforcement labs and investigators around the world for the investigation of forensic images created by many different imaging utilities.


    StegDetect: (http://www.outguess.org/download.php)

    Stegdetect was written in 2001 by Niels Provos, who is also the author of the steganography program OutGuess. Stegdetect is reliable in detecting JPEG images that have content embedded with JSteg, JPHide and OutGuess.

    Where the motive is to verify data integrity, hash functions like MD5, which produce a practically unique digest for a file, can be used. The hash values can then be compared to see whether any modification has been made to the original file.
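    A minimal sketch of the hash comparison using Python’s standard hashlib module. The byte strings stand in for file contents, and the helper name md5_hex is made up for illustration.

```python
import hashlib


def md5_hex(data: bytes) -> str:
    """Return the MD5 digest of the data as a hex string."""
    return hashlib.md5(data).hexdigest()


original = bytes([0b10010101, 0b00001101, 0b11001001, 0b10010110])
modified = bytes([0b10010100, 0b00001101, 0b11001000, 0b10010110])  # two LSBs flipped

files_match = md5_hex(original) == md5_hex(modified)
print(files_match)  # False
```

    Even a single flipped bit gives a different digest, but the check only works if a hash of the untouched original is available for comparison.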


    Jpegger:

    Jpegger is a Java web crawler that scans and saves JPEG images from the Internet. It is multi-threaded, stores a list of the URLs it has visited to avoid backtracking, and saves its current state while crawling in case of a failure. Jpegger uses HTMLParser, an open source HTML parser available at http://htmlparser.sourceforge.net

    Conclusion:

    Steganography needs to be recognized as a growing threat. Even though its scope is limited (for instance, a large amount of data cannot be sent without a very large carrier, which could arouse suspicion), small chunks of data can be sent with increasing efficiency. There’s not much point plugging up holes in our systems when a user can send sensitive data out anyway.


    REFERENCES:


    1) Neil F. Johnson and Sushil Jajodia
    “Steganalysis of Images Created Using Current Steganography Software”
    URL: http://www.jjtc.com/ihws98/jjgmu.html


    2) Niels Provos and Peter Honeyman
    “Detecting Steganographic Content on the Internet”
    URL: http://www.citi.umich.edu/techreport...i-tr-01-11.pdf


    3) Gary C. Kessler
    “An Overview of Steganography for the Computer Forensics Examiner”
    URL: http://www.garykessler.net/library/fsc_stego.html


    4) “Steganalysis: Detecting hidden information with computer forensic analysis”
    URL: http://www.sans.org/rr/whitepapers/s...raphy/1014.php
