
Input and Output

Generally, all computational entities, be they variables, functions, or programs, require some sort of input and output. Input roughly corresponds to the process of supplying data and output roughly corresponds to the process of retrieving data. For a variable, assignment is the way values (data) are supplied to the variable, while referencing the variable (as part of an expression) is the way that data is retrieved. For functions, the input is the set of arguments that are supplied in a function call, while the output of a function is its return value. For programs, input refers to pulling in data and output refers to pushing out results. In this chapter, we focus on managing the input and output of programs.

Input

Input is the process of obtaining data that a program needs to perform its task. There are generally three ways for a program to obtain data: reading it from the keyboard, reading it from the command line, and reading it from a file.

We examine these three approaches in turn.

Reading from the keyboard

To read from the keyboard, one uses the input function:

    name = input("What is your name? ")

The input function prints the given message to the console and waits until a response is typed. In the example above, the message is "What is your name? "; this message is known as a prompt. The input function (as of Python 3) always returns a string. If you wish to read an integer, you can wrap the call to input in a call to int:

    age = int(input("How old are you? "))

Other conversion functions similar to int are float, which converts the string that input returns to a real number, and eval, which evaluates a string as a Python expression and returns the resulting value. For example, we could substitute eval for int or float and we would get the exact same result, provided an integer or a real number, respectively, were typed in response to the input prompt. The eval function can even do some math for you:

    >>> eval("3 + 7")
    10
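
As a small sketch of these conversions, the block below applies int, float, and eval to strings that stand in for typed responses. The names typed_age and typed_height, and their values, are illustrative and not part of the examples above. Because eval evaluates whatever expression it is given, it is best reserved for input you trust.

```python
# Strings standing in for what a user might type at an input prompt
# (hypothetical values chosen for illustration).
typed_age = "42"
typed_height = "1.75"

age = int(typed_age)            # the string "42" becomes the integer 42
height = float(typed_height)    # the string "1.75" becomes the real number 1.75

# eval yields an integer or a real number, depending on what the string looks like
same_age = eval(typed_age)
same_height = eval(typed_height)

print(age, type(age))
print(height, type(height))
print(same_age == age, same_height == height)
```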

Reading from the command line

The second way to pass information to a program is through command-line arguments. The command line is the line typed in a terminal window that runs a Python program (or any other program). Here is a typical command line on a Linux system:

    lusth@sprite:~/l1/activities$ python3 prog3.py

Everything up to and including the dollar sign is the system prompt. As with all prompts, it is used to signify that the system is waiting for input. The user of the system (me) has typed in the command:

    python3 prog3.py

in response to the prompt. Suppose prog3.py is a file with the following code:

    import sys

    def main():
        print("command-line arguments:")
        print("    ",sys.argv)
        return 0

    main()

In this case, the output of this program would be:

    command-line arguments:
        ['prog3.py']

Note that the program imports the sys module and when the value of the variable sys.argv is printed, we see its value is:

    ['prog3.py']

This tells us that sys.argv points to an array (the square brackets indicate this; Python calls such an array a list) and that the program file name, as a string, is found in this array.

Any whitespace-delimited tokens following the program file name are stored in sys.argv along with the name of the program being run by the Python interpreter. For example, suppose we run prog3.py with this command:

    python3 prog3.py 123 123.4 True hello, world

Then the output would be:

    command-line arguments:
        ['prog3.py', '123', '123.4', 'True', 'hello,', 'world']

From this result, we can see that all of the tokens are stored in sys.argv and that they are stored as strings, regardless of whether they look like some other entity, such as an integer, a real number, or a Boolean.

If we wish for "hello, world" to be a single token, we would need to enclose the tokens in quotes:

    python3 prog3.py 123 123.4 True "hello, world"

In this case, the output is:

    command-line arguments:
        ['prog3.py', '123', '123.4', 'True', 'hello, world']

There are certain characters that have special meaning to the system. A couple of these are '*' and ';'. To include these characters in a command-line argument, they need to be escaped by inserting a backslash prior to the character. Here is an example:

    python3 prog3.py \; \* \\

To insert a backslash, one escapes it with a backslash. The output from this command is:

    command-line arguments:
        ['prog3.py', ';', '*', '\\']

Although it looks as if there are two backslashes in the last token, there is but a single backslash. Python uses two backslashes to indicate a single backslash.
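
A quick way to check this claim is to measure the string. The sketch below uses an illustrative variable named token (not part of prog3.py) that holds the same value as the last command-line argument above.

```python
token = '\\'          # written with two backslashes, but holds a single character

print(len(token))     # the length of the string
print(token)          # print displays the string's actual contents
print(repr(token))    # repr shows the escaped form, as seen when sys.argv is printed
```

The length reported is one, which confirms that the doubled backslash is only a matter of how Python displays the string.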

Counting the command line arguments

The number of command-line arguments (including the program file name) can be found by using the len (length) function. If we modify prog3.py's main function to be:

    def main():
        print("command-line argument count:",len(sys.argv))
        print("command-line arguments:")
        print("    ",sys.argv)
        return 0

and enter the following command at the system prompt:

    python3 prog3.py 123 123.4 True hello world

we get this output:

    command-line argument count: 6
    command-line arguments:
         ['prog3.py', '123', '123.4', 'True', 'hello', 'world']

As expected, we see that there are six command-line arguments, including the program file name.

Accessing individual arguments

As in most programming languages, and as with Python arrays in general, the individual tokens in sys.argv are accessed with zero-based indexing. To access the program file name, we would use the expression:

    sys.argv[0]

To access the first command-line argument after the program file name, we would use the expression:

    sys.argv[1]

Let's now modify prog3.py so that it prints out each argument individually. We will use a construct called a loop to do this. You will learn about looping later, but for now, the statement starting with for generates all the legal indices for sys.argv, from zero to the length of sys.argv minus one. Each index, stored in the variable i, is then used to print the argument stored at that location in sys.argv:

    def main():
        print("command-line argument count:",len(sys.argv))
        print("command-line arguments:")
        for i in range(0,len(sys.argv),1):
            print("   ",i,":",sys.argv[i])
        return 0

Given this command:

    python3 prog3.py 123 123.4 True hello world

we get the following output:

    command-line argument count: 6
    command-line arguments:
        0 : prog3.py
        1 : 123
        2 : 123.4
        3 : True
        4 : hello
        5 : world

This code works no matter how many command line arguments are sent. The superior student will ascertain that this is true.

What command-line arguments are

The command-line arguments are stored as strings. Therefore, you must use the int, float, or eval functions if you wish to convert any of the command-line arguments to integers, real numbers, or Booleans, as examples.
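
A minimal sketch of such conversions follows. The function name convert_args and the list sample_argv are hypothetical; the list stands in for sys.argv so that the example runs without a command line. The Boolean is obtained by comparing against the string "True", which is an alternative to eval; note that bool("False") would yield True, since any nonempty string counts as true.

```python
def convert_args(argv):
    # argv[0] is the program file name; the real arguments start at index 1
    count = int(argv[1])            # '123'   -> 123
    ratio = float(argv[2])          # '123.4' -> 123.4
    flag = (argv[3] == "True")      # 'True'  -> True, anything else -> False
    return count, ratio, flag

# A stand-in for sys.argv (assumed values, matching the earlier examples)
sample_argv = ['prog3.py', '123', '123.4', 'True']

count, ratio, flag = convert_args(sample_argv)
print(type(count), type(ratio), type(flag))
```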

Reading from files

The third way to get data to a program is to read the data that has been previously stored in a file.

Python uses a file pointer system in reading from a file. To read from a file, the first step is to obtain a pointer to the file. This is known as opening a file. The file pointer will always point to the first unread character in a file. When a file is first opened, the file pointer points to the first character in the file.

Reading files using file pointer methods

A Python file pointer has a number of associated functions for reading in parts or all of a file. Suppose we wish to read from a file named data. We first obtain a file pointer by opening the file like this:

    fp = open("data","r")

The open function takes two arguments, the name of the file and the kind of file pointer to return. We store the file pointer in a variable named fp (a variable name commonly used to hold a file pointer). In this case, we wish for a reading file pointer, so we pass the string "r". We can also open a file for writing; more on that in the next section.

Next, we can read the entire file into a single string with the read method:

    text = fp.read()

After this statement is evaluated, the variable text would point to a string containing every character in the file data. We call read a method, rather than a function (which it is), to indicate that it is a function that belongs to a file pointer object, which fp is. You will learn about objects in a later class, but the "dot" operator is a clue that the thing to the left of the dot is an object and the thing to the right is a method (or simple variable) that belongs to the object.

When we are done reading a file, we close it:

    fp.close()

Instead of reading all the file in at once using the read method, it is sometimes useful to read a file one line at a time. One uses the readline method for this task, but the readline method is not very useful until we learn about loops in a later chapter.
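
Even without a loop, a single call to readline shows how the file pointer advances. The sketch below first creates a small throwaway file in a temporary directory so that it is self-contained; the path, the file contents, and the variable names are all illustrative.

```python
import os
import tempfile

# Create a small file to read from (assumed contents, for illustration only).
path = os.path.join(tempfile.mkdtemp(), "data")
fp = open(path, "w")
fp.write("first line\nsecond line\nthird line\n")
fp.close()

fp = open(path, "r")
line1 = fp.readline()   # reads up to and including the first newline
rest = fp.read()        # reads everything that has not yet been read
fp.close()

print(repr(line1))
print(repr(rest))
```

After the call to readline, the file pointer points just past the first newline, so the subsequent read returns only the remaining two lines.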

Using a scanner

A scanner is a reading subsystem that allows you to read whitespace-delimited tokens from a file. Whitespace are those characters in a text file that are generally not visible: spaces, tabs, and newlines. For example, this sentence:

    I am a sentence!

is composed of four whitespace-delimited tokens: I, am, a, and sentence!.
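
Python's built-in split method, called with no arguments, breaks a string at runs of whitespace, so it can serve as a rough illustration of what whitespace-delimited means. This is only a sketch of the idea, not a description of how a scanner is implemented; the extra spaces, tab, and newline in the sample string are added to show that all of them act as delimiters.

```python
sentence = "I am   a\tsentence!\n"   # spaces, a tab, and a newline as whitespace

tokens = sentence.split()            # split on any run of whitespace
print(tokens)
print(len(tokens))
```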

Typically, a scanner is used just like a file pointer, but is far more flexible. Suppose we wish to read from a file named data. We would open the file with a scanner like this:

    s = Scanner("data")

The Scanner function takes a single argument, the name of the file, and returns a scanner object, which can be used to read the tokens in the file.

Suppose the file data contained:

    True 3 -4.4
    "hello there"
    ack!

The file contains six whitespace-delimited tokens: True, 3, -4.4, "hello, followed by there", and ack!. We can use the scanner to read each of the tokens using the readtoken method.

    from scanner import *

    def main():
        s = Scanner("data")
        b = s.readtoken()
        i = s.readtoken()
        f = s.readtoken()
        str1 = s.readtoken()
        str2 = s.readtoken()
        t = s.readtoken()
        print("The type of",b,"is",type(b))
        print("The type of",i,"is",type(i))
        print("The type of",f,"is",type(f))
        print("The type of",str1,"is",type(str1))
        print("The type of",str2,"is",type(str2))
        print("The type of",t,"is",type(t))
        s.close()
        return 0

    main()

To run this program, you will first need to get a scanner for Python. You can obtain a scanner by issuing this command:

    wget troll.cs.ua.edu/cs150/book/scanner.py

The scanner.py file needs to reside in the same directory as the program that imports it.

The program, as written, runs fine, yielding the following output:

    The type of True is <class 'str'>
    The type of 3 is <class 'str'>
    The type of -4.4 is <class 'str'>
    The type of "hello is <class 'str'>
    The type of there" is <class 'str'>
    The type of ack! is <class 'str'>

The type function tells us what kind of value is passed to it. We can see that every token is read in as a string, from the <class 'str'> return value from the type function.

Here is a revised version that takes advantage of the full power of the scanner; the program reads in the objects as they appear, a Boolean, integer, float, string, and token:

    from scanner import *

    def main():
        s = Scanner("data")
        b = s.readbool()
        i = s.readint()
        f = s.readfloat()
        str = s.readstring()
        t = s.readtoken()
        print("The type of",b,"is",type(b))
        print("The type of",i,"is",type(i))
        print("The type of",f,"is",type(f))
        print("The type of",str,"is",type(str))
        print("The type of",t,"is",type(t))
        s.close()
        return 0

    main()

The output of the revised program is:

    The type of True is <class 'bool'>
    The type of 3 is <class 'int'>
    The type of -4.4 is <class 'float'>
    The type of "hello there" is <class 'str'>
    The type of ack! is <class 'str'>

The methods readbool, readint, and readfloat convert the tokens they read into the appropriate type. Thus, readbool returns a Boolean, not a string, readint returns an integer, and so on. The readstring method will read a string delimited by double quotes as a single token. Note that the double quotes are considered part of the string; to remove them, one can use the following technique:

    str = s.readstring()
    str = str[1:-1]         # a slice, all but the first and last characters

If any of the reading methods fails (for example, when trying to read an integer when there is no integer at that point), it returns the empty string. As a final note, always remember to close a scanner when you are done with it, as in:

    s.close()

Output

Once a program has processed its input, it needs to make its output known, either by displaying results to the user or by storing the results in a file.

Writing to the console

One uses the print function to display text on the console, for the benefit of the user of the program. The print function is variadic, which means it can take a variable number of arguments. The print function has lots of options, but we will be interested in only two, sep and end. The sep (for separator) option specifies what is printed between the arguments sent to be printed. If the sep option is missing, a space is printed between the values received by print:

    >>> print(1,2,3)
    1 2 3
    >>>

If we wish to use commas as the separator, we would do this:

    >>> print(1,2,3,sep=",")
    1,2,3
    >>>

If we wish to have no separator, we bind sep to an empty string:

    >>> print(1,2,3,sep="")
    123
    >>>

The end option specifies what is printed after the arguments are printed. If you don't supply an end option, a newline is printed by default. This call to print prints an exclamation point and then a newline at the end:

    >>> print(1,2,3,end="!\n")
    1 2 3!
    >>>

If you don't want anything printed at the end, bind end to an empty string:

    >>> print(1,2,3,end="")
    1 2 3>>>

Notice how the Python prompt ends up on the same line as the values printed.

You can combine the sep and end options.
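
For instance, here is a minimal sketch that separates the values with a comma and a space and finishes the line with an exclamation point. The function name show_combined is just for illustration.

```python
def show_combined():
    # sep is printed between the values; end is printed after the last value
    print(1, 2, 3, sep=", ", end="!\n")

show_combined()    # displays: 1, 2, 3!
```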

Printing quote characters

Suppose I have the string:

    str = "Hello"

If I print my string:

    print(str)

The output looks like this:

    Hello

Notice the double quotes are not printed. But suppose I wish to print quotes around my string, so that the output looks like:

    "Hello"

To do this, the print statement becomes:

    print("\"",str,"\"",sep="")

If you need a refresher on what the string "\"" means, please see the chapter on literals. The superior student will ponder on the necessity of sep="".

Writing to a file

Python also requires a file pointer to write to a file. The open function is again used to obtain a file pointer, but this time we desire a writing file pointer, so we send the string "w" as the second argument to open:

    fp = open("data.save","w")

Now the variable fp points to a writing file object, which has a method named write. The only argument write takes is a string. That string is then written to the file. Here is a function that copies the text from one file into another:

    def copyFile(inFile,outFile):
        # "in" is a reserved word in Python, so other variable names are used
        src = open(inFile,"r")
        dst = open(outFile,"w")
        text = src.read()
        dst.write(text)
        src.close()
        dst.close()
        return "done"

Opening a file in order to write to it has the effect of emptying the file of its contents as soon as it is opened. The following code deletes the contents of a file (which is different from deleting the file):

    # delete the contents
    fp = open(fileName,"w")
    fp.close()

We can get rid of the variable fp by simply treating the call to open as the object open returns, as in:

    # delete the contents
    open(fileName,"w").close()

If you wish to start writing to a file, but save what was there previously, call the open function to obtain an appending file pointer:

    fp = open(fileName,"a")

Subsequent writes to fp will append text to what is already there.
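
The difference between the "w" and "a" modes can be sketched as follows. The file lives in a temporary directory so that the example does not disturb any existing file; the path, the text written, and the variable names are illustrative.

```python
import os
import tempfile

file_name = os.path.join(tempfile.mkdtemp(), "data.save")

fp = open(file_name, "w")      # writing mode: the file starts out empty
fp.write("first\n")
fp.close()

fp = open(file_name, "a")      # appending mode: existing contents are kept
fp.write("second\n")
fp.close()

fp = open(file_name, "r")
after_append = fp.read()
fp.close()

open(file_name, "w").close()   # reopening for writing empties the file again

fp = open(file_name, "r")
after_truncate = fp.read()
fp.close()

print(repr(after_append))
print(repr(after_truncate))
```

The first read sees both lines, while the second read, performed after the file was reopened in writing mode, sees an empty file.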

lusth@cs.ua.edu

