
Thread: using dd?

  1. #1 Senior Member (Join Date: Oct 2004; Posts: 172)

    using dd?

    I'm trying to back up my Windows NTFS partition using a Knoppix disc, a secondary hard drive, and the dd command. I've been using this command: "dd if=/dev/hda | gzip > /mnt/hdb1/hda1bak.img.gz", which I found in a tutorial, http://wiki.linuxquestions.org/wiki/Dd. I have a few questions/problems that I'm hoping you guys could help me with:

    1. I've seen several examples that change the bs to 512, 1024, etc. without saying why. Do I need to change the bs depending on the drive I'm grabbing? For example, if my NTFS drive was formatted with a 1024-byte cluster size, would I have to set the bs to 1024 or something?

    2. Also, some descriptions say the bs is the number of bytes read at a time, and others say it's the number of blocks read at a time. Which is it?

    3. Why does dd take so long? I did a zero-fill on my 160 GB hard drive with the command "dd if=/dev/zero of=/dev/hda bs=1M" and it took eleven hours to finish. Also, when I try to grab an image of my NTFS partition (only 40 GB), it takes about 5 hours.

    4. I tried to increase the bs so that it might go faster. I increased it from the default to 3M, so the command I ran was "dd if=/dev/hda1 bs=3M | gzip > /mnt/hdb1/hda1bak.img.gz". This took a very long time to finish and, when it did, it gave me an error about the file size being too large. Can anybody tell me what I should be setting the bs to for grabbing my 40 GB Windows partition, and about how long it should take? I'm no expert, but I don't think it should be taking this long.
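
    To recap, the exact commands in question (copied from above, with a note on what each one does):

        # backup attempt from the tutorial: read the whole disk, compress the stream to the second drive
        dd if=/dev/hda | gzip > /mnt/hdb1/hda1bak.img.gz

        # zero-fill of the 160 GB drive (took about eleven hours)
        dd if=/dev/zero of=/dev/hda bs=1M

        # retry with a 3M block size (very slow, and ended with a "file size too large" error)
        dd if=/dev/hda1 bs=3M | gzip > /mnt/hdb1/hda1bak.img.gz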

  2. #2 Jaded Network Admin nebulus200 (Join Date: Jun 2002; Posts: 1,356)
    It's been a little while since I have messed with any of this directly, and I don't have my handy-dandy references in front of me, but I am going to try to go off memory (not always a good thing).

    OK, I know why I was thrown (yes, age is part of it). Is there any reason why you aren't using of= in the dd command? Is the second drive large enough to hold the image from the first drive? For just imaging the drive, it's fine to take the default block size (just: dd if=/xxxx of=yyyy; you may want to check out the options, depending on your objective, like skipping over errors). The only time the cluster/block size becomes critical is if you are going to slice data out of the image at a later point, in which case you will need to know the partition layout recorded in the MBR of the image.

    Check out the Sleuth Kit (http://www.sleuthkit.org/sleuthkit/desc.php); it has some handy tools for reading disk images, the relevant one here being 'mmls', which will analyze the image and report the starting/ending sectors, block size, etc. of the partition you are interested in. You could, for example, use that information to slice off, say, your C drive, mount the image, and then grab a file you lost off of it, without having to restore the whole disk.
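
    Roughly, the workflow I'm describing would look something like this (the image path, mount point, and the 63-sector offset are just placeholder values; you'd take the real start sector from the mmls output):

        # straight disk-to-file image, default block size
        dd if=/dev/hda of=/mnt/hdb1/hda.img

        # list the partitions inside the image (-t dos = DOS-style partition table);
        # mmls reports the start sector and length of each entry
        mmls -t dos /mnt/hdb1/hda.img

        # loop-mount one partition out of the whole-disk image, read-only;
        # offset is in bytes, so multiply the start sector by the 512-byte sector size
        mkdir -p /mnt/image
        mount -o ro,loop,offset=$((63 * 512)) /mnt/hdb1/hda.img /mnt/image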


    Another thing you might consider (and something that was taught at SANS) is doing the transfer over to another system. The reasoning in the class was that you were imaging compromised systems, which isn't the case here, but if you have the systems lying around, this method can still be useful. The easiest way is to set up a netcat listener on the machine receiving the file (redirecting the data received to a file) and to send the data from the system you are backing up using netcat. It was pretty effective and fairly fast, depending on your disk setup (i.e., whether they share the same controller). Given the slowness you describe, I almost wonder whether you have an I/O or controller issue (i.e., maybe both hard drives being on the same controller?). We were able to image 4 GB drives fairly quickly (a few minutes), which you can extrapolate for yourself.
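
    The netcat setup is basically two one-liners, something like this (the IP address, port number, and output path are just placeholders):

        # on the machine receiving the image: listen on a port and write whatever arrives to a file
        nc -l -p 9000 > /backup/hda1bak.img

        # on the machine being backed up: read the partition and push it across the LAN
        dd if=/dev/hda1 | nc 192.168.1.10 9000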



    EDIT: Also wondering if your choice of a very large block size is creating some other kind of issue (like requesting bigger chunks than the hard drive can read comfortably; I'll have to look at that tomorrow when I am not tired and brain-fried). I'd stick with the smaller block sizes, if you change them at all.
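
    If you want to find out whether block size even matters on your hardware, a quick read-only test along these lines would tell you (the sizes and counts are just example values that read the same 512 MB each time; repeat runs may be served from the buffer cache, so take the numbers with a grain of salt):

        time dd if=/dev/hda1 of=/dev/null bs=512 count=1048576
        time dd if=/dev/hda1 of=/dev/null bs=64k count=8192
        time dd if=/dev/hda1 of=/dev/null bs=1M count=512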

    There is only one constant, one universal, it is the only real truth: causality. Action. Reaction. Cause and effect...There is no escape from it, we are forever slaves to it. Our only hope, our only peace is to understand it, to understand the 'why'. 'Why' is what separates us from them, you from me. 'Why' is the only real social power, without it you are powerless.

    (Merovingian - Matrix Reloaded)

  3. #3 Senior Member (Join Date: Oct 2004; Posts: 172)
    Thanks a lot. I think the hard drives may be on the same controller; they're plugged into the same ribbon cable. The cable has two connectors at one end and one connector that plugs into the mobo, so I guess the drives are probably on the same controller. The reason I didn't specify the output file is that the output is automatically sent to stdout, which I am piping to gzip to make a compressed file of my hard drive image. In the end, the compressed file should only be about 500 MB, because most of the partition is empty space. The netcat thing seems like it might work; I had read something about the wonders of dd and netcat that I found on Google, but I don't see how it would speed anything up in my case. Wouldn't netcat mean I would be doing this over a network somehow? I just want to image my one drive periodically. I had also thought that you could mount images created with dd by using the mount command (I've never tried it, but I'm sure I've read it somewhere). I'll try putting the drives on different cables and giving it another crack; I'm not even sure I've got another controller on my mobo, though...
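
    For reference, the loop-mount I read about would look something like this for a partition image (this assumes the image was taken from /dev/hda1 rather than the whole disk, the mount point is just my guess, and the image has to be decompressed first):

        gunzip /mnt/hdb1/hda1bak.img.gz
        mkdir -p /mnt/image
        # mount the uncompressed partition image read-only through the loop device
        mount -t ntfs -o ro,loop /mnt/hdb1/hda1bak.img /mnt/image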

    edit:
    Well, it looks like I only have one IDE controller and two channels (I guess I kind of mixed up the channels and the controller).


  4. #4 Senior Member (Join Date: Oct 2004; Posts: 172)
    One thing that doesn't add up: I always used to use a bootable Ghost disc with a hard drive image on it to re-image my hard drive, and it would always finish within about 10 minutes. The hard drive being imaged and my CD drive were both on the same IDE controller, so I don't think both of my hard drives being on the same controller is causing the slow speed. It could be, though; I don't know.

  5. #5 AO Curmudgeon rcgreen (Join Date: Nov 2001; Posts: 2,716)
    Just for kicks, you might test the speed without gzip, if you have space. You never know how much it affects the speed until you experiment without it.
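
    For example, timing an uncompressed copy straight to the second drive (needs roughly 40 GB free there; the block size is just an example) and comparing it against the gzip pipeline would show how much the compression is costing:

        time dd if=/dev/hda1 of=/mnt/hdb1/hda1bak.img bs=64k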
    I came in to the world with nothing. I still have most of it.

  6. #6 Jaded Network Admin nebulus200 (Join Date: Jun 2002; Posts: 1,356)
    Originally posted here by slinky2004
    One thing that doesn't add up: I always used to use a bootable Ghost disc with a hard drive image on it to re-image my hard drive, and it would always finish within about 10 minutes. The hard drive being imaged and my CD drive were both on the same IDE controller, so I don't think both of my hard drives being on the same controller is causing the slow speed. It could be, though; I don't know.
    I want to say that I have seen it slow things down (stuff trying to read/write to different disks on the same controller), but my reasoning for netcat possibly being faster is that your biggest hit in running those programs is going to be the disk I/O (and the waits that accompany it). Assuming you are doing this over a 100 Mb (or faster) LAN, you would essentially have one PC doing the reading, one doing the writing, and the LAN providing enough bandwidth to accommodate the data transfer rate (it depends on your controller and drives, but 100 Mb/s should cover most things pretty sufficiently).
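
    As a rough sanity check (assuming the LAN, not the disks, is the bottleneck): 100 Mbit/s is roughly 12 MB/s, so a 4 GB drive works out to about 4096 MB / 12 MB/s ≈ 340 seconds, i.e. five or six minutes, and a 40 GB partition would be on the order of 40960 MB / 12 MB/s ≈ 55 minutes; either way, nowhere near five hours.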

    Regardless, like rcgreen said, experiment. I agree that it would probably be better to wait to do the gzip until after the file transfer is complete (by the way, if memory serves, with the disk images I have used in the past, bzip2 did a little better job of shrinking down the size).
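
    In other words, once the raw image is sitting on the second drive, compressing it afterwards is just (bzip2 replaces the .img with a .img.bz2):

        bzip2 /mnt/hdb1/hda1bak.img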

    I think you will be surprised, if you have a fairly good LAN setup, at how fast it can image the drive over the LAN.

    EDIT: One other thing (still no references, but sleep helps). I am pretty sure that cluster size and block size are the same thing; they are just the terms used by different OSes (i.e., Windows/M$ calls it a cluster, Unix calls it a block).


  7. #7 Senior Member (Join Date: Oct 2004; Posts: 172)
    Thanks, I've made an image of my Windows drive. I don't actually know how long it took, because for some reason the window closed when it finished, so I didn't get any output from dd besides a .img.gz file. I thought it was supposed to report the number of blocks read/written and the number of seconds or something.
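
    It looks like dd writes that summary to stderr, so it disappears when the window closes. Next time I'll capture it to a file with something like this (the paths are just what I've been using; dd's stderr goes to the stats file while the data still goes through the pipe):

        dd if=/dev/hda1 bs=64k 2> /mnt/hdb1/dd_stats.txt | gzip > /mnt/hdb1/hda1bak.img.gz
        cat /mnt/hdb1/dd_stats.txt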
