Hey Hey,

It's been a while since one of these tutorials has surfaced, but I've been bored, so I'm going to start trying to introduce a Python module or two each night. Tonight's code is something that most of you will probably laugh at. I'm going to introduce the tarfile module and portions of the string and sys modules.
As always I'm assuming you have Python and have read my previous tutorials so that you at least have a basic understanding of where this is going.

Anyways, let's do this just like the old ones.


********************
Step-by-Step Process
********************
1. Open your favourite editor (Vi, Pico, Notepad, Wordpad, Textpad, DOS Edit).
2. Type

Code:
"""
Python Script to Deal with Tarballs
Flags: c(ompress), d(ecompress), g(zip)
"""
import tarfile
import string
import sys
try : strFlags = sys.argv[1]
except : strFlags = "-h"
decompress = "yes"
compress = "yes"
gzip = "yes"
if string.find(strFlags, "-") == 0 :
    if string.find(strFlags, "h") != -1 :
        print "Usage: %s -<flags> <archive> <file name if compressing>" % sys.argv[0]
        print "Flags: d[ecompress]\n       c[ompress]\n       g[zip]"
        sys.exit(0)
    try :
        strArchive = sys.argv[2]
    except :
        print "Error Argument Missing"
        sys.exit(0)
    if string.find(strFlags, "d") == -1 :
        decompress = "no"
    if string.find(strFlags, "c") == -1 :
        compress = "no"
    if string.find(strFlags, "g") == -1 :
        gzip = "no"
    if compress == "yes" and decompress == "yes" :
        print "ERROR - CANNOT DECOMPRESS AND COMPRESS"
        sys.exit(0)
    elif compress == "yes" and gzip == "no" :
        try : strFile = sys.argv[3]
        except :
            print "Error Argument Missing"
            sys.exit(0)
        tarball = tarfile.open(strArchive, "w")
        tarball.add(strFile)
        tarball.close()
    elif compress == "yes" and gzip == "yes" :
        try : strFile = sys.argv[3]
        except :
            print "Error Argument Missing"
            sys.exit(0)
        tarball = tarfile.open(strArchive, "w:gz")
        tarball.add(strFile)
        tarball.close()
    elif decompress == "yes" and gzip == "no" :
        try:
            tarball = tarfile.open(strArchive, "r")
        except :
            print "ERROR - FILE MISSING"
            sys.exit(0)
        for tarfile.tarinfo in tarball :
            tarball.extract(tarfile.tarinfo)
        tarball.close()
    elif decompress == "yes" and gzip == "yes" :
        try :
            tarball = tarfile.open(strArchive, "r:gz")
        except :
            print "Error - File Missing"
            sys.exit(0)
        for tarfile.tarinfo in tarball :
            tarball.extract(tarfile.tarinfo)
        tarball.close()    
else :
    print "ERROR NO FLAGS GIVEN"
    sys.exit(0)
3. Save the script as tar.py
4. Open a command prompt and type python tar.py -h
5. Now for the walk through.

We start off with a docstring, which works like a block comment and is opened with 3 quotation marks
Code:
"""
The same 3 quotation marks also end it.
Following this we import the 3 modules we are going to use (tarfile, sys and string) with the import statement. In previous tutorials we used from <module> import * so that we wouldn't have to reference the module. However, I now feel that you can keep up and reference the correct module, which is a more proper way of programming.

I have used a fair amount of error checking in this, so I will cover all those lines right now. While the error checking and the code are by no means complete, I decided to cover some of it. I mentioned error checking in Introduction to Python #3 if you need to go back and look at it. Basically, it tries to execute the code following try : and if that succeeds it carries on with the rest of the program. If the code fails (if the argument isn't present, for example), it runs the except : code, which prints an error and then uses sys.exit(0) to tell the program to exit. (Strictly speaking, an exit status of 0 reports success to the shell; a non-zero value such as sys.exit(1) is the usual way to signal an error.)
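The try/except pattern can be tried out on its own. This is only a minimal sketch and not part of tar.py: the helper name get_argument is made up for the example, and it catches IndexError specifically instead of using a bare except like the script does. It is written to run on Python 3 as well as the Python 2 used in the script.

```python
def get_argument(argv, position, default=None):
    """Return argv[position], or default when that argument is missing."""
    try:
        return argv[position]
    except IndexError:
        # Indexing past the end of a list raises IndexError.
        return default


# Simulated command lines, so the example runs without real arguments.
print(get_argument(["tar.py", "-c", "backup.tar"], 2))  # the archive name is present
print(get_argument(["tar.py"], 1, "-h"))                # missing, so the default is used
```

Catching one specific exception means other mistakes (a typo in a variable name, for instance) still show up as errors instead of being silently treated as a missing argument.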

Next I set three variables equal to yes. I suppose I could have used 1/0, but yes/no gave me the simplicity I wanted. Basically these three variables store the values of our flags (on or off).

Now we check whether the flags start with a -. The code is slightly redundant here: it has already checked that the argument is present, and if it wasn't, it set it to -h (the first try and except). This check is just being picky about the -; if it is missing, the program prints an error and exits.
If the flags contain h (help), the script prints a usage message and exits. As you can see, the message makes use of %s to include the value of a variable in our string, as well as \n, which represents a new line (for more info on either of these see Introduction to Python #2 and Introduction to Python #3).
The string.find(strFlags, "-") call simply checks where the hyphen is in the strFlags variable. If the - didn't exist, -1 would be returned; since it is in the first position, the index 0 is returned.
Next comes a collection of if statements (I explored if statements in the original Python Introduction). Each one switches a flag variable to no when its letter isn't found in strFlags.
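The flag checks can be sketched on their own like this. A few assumptions to flag: parse_flags is a made-up helper name, it uses True/False instead of the yes/no strings, and it calls the find method on the string itself, because the string.find function used in the script no longer exists in Python 3. The method returns the same index or -1.

```python
def parse_flags(flags):
    """Return (compress, decompress, use_gzip) for a flag string such as "-cg"."""
    if flags.find("-") != 0:
        raise ValueError("flags must start with a hyphen")
    # find returns -1 when the letter is absent, so != -1 means "present".
    compress = flags.find("c") != -1
    decompress = flags.find("d") != -1
    use_gzip = flags.find("g") != -1
    return compress, decompress, use_gzip


print("-cg".find("-"))     # position of the hyphen
print("-cg".find("x"))     # a letter that is not there
print(parse_flags("-cg"))
```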

We are now into our tarfile module code. This is what we really want to explore. I have used three options since they will be the most recognized: tar, untar, and gzip. The first thing we do every time is open the file we want to work with (this could be a new archive or an already existing one). We open a file by creating a variable to "store the file" (sorry, I'm a networking guy, not a programmer... I'm not up on all the lingo). We use tarfile.open to reference the file. The first value passed to tarfile.open is the name of the archive we wish to open/create (in this case stored as strArchive). The second value is the mode (r[ead] or w[rite]). If we are dealing with gzip compression we add :gz to the mode to tell the module about the compression.
If we are compressing the file, it is rather simple: we access the file by referencing its variable (tarball) and use the add function, which we pass the name of the file we are adding to the archive. We then close our file stream (hey, I remember the word... I think... but I'm not changing it in case I'm wrong) by calling the .close function on the variable/stream (tarball).
If we are decompressing the file, we must extract once for each file in the tar. We use a for statement (addressed in a previous tutorial, I believe; if not, it simply says "for each file in this archive"). Each time around the loop, the loop variable (written as tarfile.tarinfo in the script) is given an object describing the current file in the archive, including its name. An ordinary variable name would work just as well there. We then pass that object to the extract function on the file stream to extract the file.
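Here is the compress path on its own as a small sketch. The file names (notes.txt, notes.tar.gz) and the temporary directory are just for the example, and it is written so it also runs on Python 3. The arcname argument is something the script doesn't use; I've added it so the file is stored under a short name instead of its full temporary path.

```python
import os
import tarfile
import tempfile

# Make a throwaway file to archive.
workdir = tempfile.mkdtemp()
source_path = os.path.join(workdir, "notes.txt")
with open(source_path, "w") as handle:
    handle.write("hello tarball\n")

# "w:gz" means write mode with gzip compression; plain "w" gives an uncompressed tar.
archive_path = os.path.join(workdir, "notes.tar.gz")
tarball = tarfile.open(archive_path, "w:gz")
tarball.add(source_path, arcname="notes.txt")
tarball.close()

print(tarfile.is_tarfile(archive_path))
```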
We then close the filestream in the same way we did while compressing a file.
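And the decompress loop as a sketch, using an ordinary loop variable (member) in place of tarfile.tarinfo. Again the file names and the temporary directories are made up for the example, and the first half only exists to build an archive to extract. Passing an output directory to extract is my addition; the script extracts into the current directory.

```python
import os
import tarfile
import tempfile

# Build a small uncompressed archive to work with.
workdir = tempfile.mkdtemp()
source_path = os.path.join(workdir, "notes.txt")
with open(source_path, "w") as handle:
    handle.write("hello tarball\n")
archive_path = os.path.join(workdir, "notes.tar")
tarball = tarfile.open(archive_path, "w")
tarball.add(source_path, arcname="notes.txt")
tarball.close()

# Extract every member, one at a time, into a separate directory.
output_dir = os.path.join(workdir, "out")
os.mkdir(output_dir)
extracted = []
tarball = tarfile.open(archive_path, "r")
for member in tarball:
    # member is a TarInfo object; member.name is the file's name in the archive.
    tarball.extract(member, output_dir)
    extracted.append(member.name)
tarball.close()

print(extracted)
```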

The only thing I didn't touch on was arguments. For you C/C++ programmers, this should seem fairly familiar (at least based on my basic knowledge of those languages). sys.argv is a list that stores all the arguments. The first item, sys.argv[0], is the name of the script being executed, and sys.argv[1] is the first argument following the script name. There is, however, no sys.argc; to get the equivalent of argc in C/C++ you would use len(sys.argv).
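To see how the pieces of sys.argv line up, here is a short sketch. The describe_arguments helper and the sample command line are made up for the example; a simulated list is used so it runs the same way no matter how you start it.

```python
import sys


def describe_arguments(argv):
    """Split an argv-style list the way a C program would think of argc/argv."""
    return {
        "script": argv[0],       # the script name
        "argc": len(argv),       # the closest thing to argc
        "arguments": argv[1:],   # everything after the script name
    }


summary = describe_arguments(["tar.py", "-cg", "backup.tar.gz", "notes.txt"])
print(summary["script"])
print(summary["argc"])
print(summary["arguments"])

# The real list for this run is always available as sys.argv.
print(len(sys.argv))
```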


Anyways, let me remind you again... I'm into networking, not programming, and it's rather late. I hope this is understandable and coherent. It isn't the most basic of tutorials, because I'm hoping that by now, if you are interested in Python, you will either have read or are going to read the other tutorials I have posted. I will continue to post more of these if there is community interest.

For now I must say g'nite

Peace,
HT

[Edit]
I wanted to add that you can also use the TarInfo objects to spoof the gid/uid on a *nix environment. If anyone is interested in more info on that, they can send me a PM... I won't publish it here yet because I'm thinking about doing a tutorial on the security concerns presented by Python.
[/Edit]