Warning: the HTML version of this document is generated from
Latex and may contain translation errors. In
particular, some mathematical expressions are not translated correctly.
Preface
By Jeff Elkner
This book owes its existence to the collaboration made possible by the
Internet and the free software movement. Its three authors ...

We think this book is a testament to the benefits and future possibilities of this kind of collaboration, the framework for which has been put in place by Richard Stallman and the Free Software Foundation.

How and why I came to use Python

In 1999, the College Board's Advanced Placement (AP) Computer Science exam was given in C++ for the first time. As in many high schools throughout the country, the decision to change languages had a direct impact on the computer science curriculum at Yorktown High School in Arlington, Virginia, where I teach. Up to this point, Pascal was the language of instruction in both our first-year and AP courses. In keeping with past practice of giving students two years of exposure to the same language, we made the decision to switch to C++ in the first-year course for the 1997-98 school year so that we would be in step with the College Board's change for the AP course the following year.

Two years later, I was convinced that C++ was a poor choice to use for introducing students to computer science. While it is certainly a very powerful programming language, it is also an extremely difficult language to learn and teach. I found myself constantly fighting with C++'s difficult syntax and multiple ways of doing things, and I was losing many students unnecessarily as a result.

Convinced there had to be a better language choice for our first-year class, I went looking for an alternative to C++. I needed a language that would run on the machines in our Linux lab as well as on the Windows and Macintosh platforms most students have at home. I wanted it to be free and available electronically, so that students could use it at home regardless of their income. I wanted a language that was used by professional programmers, and one that had an active developer community around it. It had to support both procedural and object-oriented programming.
And most importantly, it had to be easy to learn and teach.

When I investigated the choices with these goals in mind, Python stood out as the best candidate for the job. I asked one of Yorktown's talented students, Matt Ahrens, to give Python a try. In two months he not only learned the language but wrote an application called pyTicket that enabled our staff to report technology problems via the Web. I knew that Matt could not have finished an application of that scale in so short a time in C++, and this accomplishment, combined with Matt's positive assessment of Python, suggested that Python was the solution I was looking for.

Finding a textbook

Having decided to use Python in both of my introductory computer science classes the following year, the most pressing problem was the lack of an available textbook. Free content came to the rescue. Earlier in the year, Richard Stallman had introduced me to Allen Downey. Both of us had written to Richard expressing an interest in developing free educational content. Allen had already written a first-year computer science textbook, How to Think Like a Computer Scientist. When I read this book, I knew immediately that I wanted to use it in my class. It was the clearest and most helpful computer science text I had seen. It emphasized the processes of thought involved in programming rather than the features of a particular language. Reading it immediately made me a better teacher.

How to Think Like a Computer Scientist was not just an excellent book, but it had been released under a GNU public license, which meant it could be used freely and modified to meet the needs of its user. Once I decided to use Python, it occurred to me that I could translate Allen's original Java version of the book into the new language.
While I would not have been able to write a textbook on my own, having Allen's book to work from made it possible for me to do so, at the same time demonstrating that the cooperative development model used so well in software could also work for educational content.

Working on this book for the last two years has been rewarding for both my students and me, and my students played a big part in the process. Since I could make instant changes whenever someone found a spelling error or difficult passage, I encouraged them to look for mistakes in the book by giving them a bonus point each time they made a suggestion that resulted in a change in the text. This had the double benefit of encouraging them to read the text more carefully and of getting the text thoroughly reviewed by its most important critics, students using it to learn computer science.

For the second half of the book on object-oriented programming, I knew that someone with more real programming experience than I had would be needed to do it right. The book sat in an unfinished state for the better part of a year until the free software community once again provided the needed means for its completion.

I received an email from Chris Meyers expressing interest in the book. Chris is a professional programmer who started teaching a programming course last year using Python at Lane Community College in Eugene, Oregon. The prospect of teaching the course had led Chris to the book, and he started helping out with it immediately. By the end of the school year he had created a companion project on our website at http://www.ibiblio.org/obp called Python for Fun and was working with some of my most advanced students as a master teacher, guiding them beyond where I could take them.

Introducing programming with Python

The process of translating and using How to Think Like a Computer Scientist for the past two years has confirmed Python's suitability for teaching beginning students.
Python greatly simplifies programming examples and makes important programming ideas easier to teach. The first example from the text illustrates this point. It is the traditional "hello, world" program, which in the C++ version of the book looks like this:
#include <iostream.h>
void main()
{
cout << "Hello, world." << endl;
}
In the Python version it becomes:
print "Hello, World!"
Even though this is a trivial example, the advantages of Python stand
out. Yorktown's Computer Science I course has no prerequisites, so
many of the students seeing this example are looking at their first
program. Some of them are undoubtedly a little nervous, having heard
that computer programming is difficult to learn. The C++ version has
always forced me to choose between two unsatisfying options: either to explain ...

Comparing the explanatory text of the program in each version of the book further illustrates what this means to the beginning student. There are thirteen paragraphs of explanation of "Hello, world!" in the C++ version; in the Python version, there are only two. More importantly, the missing eleven paragraphs do not deal with the "big ideas" in computer programming but with the minutiae of C++ syntax. I found this same thing happening throughout the book. Whole paragraphs simply disappear from the Python version of the text because Python's much clearer syntax renders them unnecessary.

Using a very high-level language like Python allows a teacher to postpone talking about low-level details of the machine until students have the background that they need to better make sense of the details. It thus creates the ability to put "first things first" pedagogically.

One of the best examples of this is the way in which Python handles variables. In C++ a variable is a name for a place that holds a thing. Variables have to be declared with types at least in part because the size of the place to which they refer needs to be predetermined. Thus, the idea of a variable is bound up with the hardware of the machine. The powerful and fundamental concept of a variable is already difficult enough for beginning students (in both computer science and algebra). Bytes and addresses do not help the matter. In Python a variable is a name that refers to a thing. This is a far more intuitive concept for beginning students and is much closer to the meaning of "variable" that they learned in their math courses. I had much less difficulty teaching variables this year than I did in the past, and I spent less time helping students with problems using them.
Another example of how Python aids in the teaching and learning of
programming is in its syntax for functions. My students have always
had a great deal of difficulty understanding functions. The main
problem centers around the difference between a function definition
and a function call, and the related distinction between a parameter
and an argument. Python comes to the rescue with syntax that is
nothing short of beautiful. Function definitions begin with the
keyword def.

Using Python has improved the effectiveness of our computer science program for all students. I see a higher general level of success and a lower level of frustration than I experienced during the two years I taught C++. I move faster with better results. More students leave the course with the ability to create meaningful programs and with the positive attitude toward the experience of programming that this engenders.

Building a community

I have received email from all over the globe from people using this book to learn or to teach programming. A user community has begun to emerge, and many people have been contributing to the project by sending in materials for the companion website at http://www.thinkpython.com.

With the publication of the book in print form, I expect the growth in the user community to continue and accelerate. The emergence of this user community and the possibility it suggests for similar collaboration among educators have been the most exciting parts of working on this project for me. By working together, we can increase the quality of materials available for our use and save valuable time. I invite you to join our community and look forward to hearing from you. Please write to the authors at feedback@thinkpython.com.
Jeffrey Elkner
Chapter 19
Queues

This chapter presents two ADTs: the Queue and the Priority Queue. In real life, a queue is a line of customers waiting for service of some kind. In most cases, the first customer in line is the next customer to be served. There are exceptions, though. At airports, customers whose flights are leaving soon are sometimes taken from the middle of the queue. At supermarkets, a polite customer might let someone with only a few items go first.

The rule that determines who goes next is called the queueing policy. The simplest queueing policy is called FIFO, for "first-in-first-out." The most general queueing policy is priority queueing, in which each customer is assigned a priority and the customer with the highest priority goes first, regardless of the order of arrival. We say this is the most general policy because the priority can be based on anything: what time a flight leaves; how many groceries the customer has; or how important the customer is. Of course, not all queueing policies are "fair," but fairness is in the eye of the beholder.

The Queue ADT and the Priority Queue ADT have the same set of operations. The difference is in the semantics of the operations: a queue uses the FIFO policy, and a priority queue (as the name suggests) uses the priority queueing policy.

19.1 The Queue ADT

The Queue ADT is defined by the following operations:

__init__: Initialize a new empty queue.
insert: Add a new item to the queue.
remove: Remove and return an item from the queue. The item that is returned is the first one that was added.
isEmpty: Check whether the queue is empty.
19.2 Linked Queue

The first implementation of the Queue ADT we will look at is called a linked queue because it is made up of linked Node objects. Here is the class definition:

class Queue:
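The body of the class definition did not survive in this copy. The following is a minimal sketch of a linked queue, written in Python 3 and consistent with the surrounding description. The Node class (with cargo and next attributes) and the exact method bodies are assumptions, not the book's listing.

```python
class Node:
    """One link in the chain: a cargo value and a reference to the next node."""
    def __init__(self, cargo=None, next=None):
        self.cargo = cargo
        self.next = next


class Queue:
    def __init__(self):
        self.length = 0
        self.head = None

    def isEmpty(self):
        return self.length == 0

    def insert(self, cargo):
        node = Node(cargo)
        if self.head is None:
            # Empty queue: the new node becomes the head.
            self.head = node
        else:
            # Walk to the last node, the one whose next is None.
            last = self.head
            while last.next is not None:
                last = last.next
            last.next = node
        self.length += 1

    def remove(self):
        # Assumes the queue is not empty.
        cargo = self.head.cargo
        self.head = self.head.next
        self.length -= 1
        return cargo
```

Inserting 1, 2, 3 and then removing three times yields the items in the same order, which is the FIFO policy.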
The methods isEmpty and remove are identical to the LinkedList methods isEmpty and removeFirst. The insert method is new and a bit more complicated.

We want to insert new items at the end of the list. If the queue is empty, we just set head to refer to the new node. Otherwise, we traverse the list to the last node and tack the new node on the end. We can identify the last node because its next attribute is None.

There are two invariants for a properly formed Queue object. The value of length should be the number of nodes in the queue, and the last node should have next equal to None. Convince yourself that this method preserves both invariants.

19.3 Performance characteristics
Normally when we invoke a method, we are not concerned with the details of its implementation. But there is one "detail" we might want to know: the performance characteristics of the method.

First look at remove. There are no loops or function calls here, suggesting that the runtime of this method is the same every time. Such a method is called a constant time operation. In reality, the method might be slightly faster when the list is empty since it skips the body of the conditional, but that difference is not significant.

The performance of insert is very different. In the general case, we have to traverse the list to find the last element. This traversal takes time proportional to the length of the list. Since the runtime is a linear function of the length, this method is called linear time. Compared to constant time, that's very bad.

19.4 Improved Linked Queue

We would like an implementation of the Queue ADT that can perform all operations in constant time. One way to do that is to modify the Queue class so that it maintains a reference to both the first and the last node, as shown in the figure:
The ImprovedQueue implementation looks like this: class ImprovedQueue:
So far, the only change is the attribute last. It is used in insert and remove methods: class ImprovedQueue:
Since last keeps track of the last node, we don't have to search for it. As a result, this method is constant time. There is a price to pay for that speed. We have to add a special case to remove to set last to None when the last node is removed: class ImprovedQueue:
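The three ImprovedQueue listings are missing from this copy. A Python 3 sketch that matches the description is given here; the Node class and the method bodies are assumptions. The key points are that insert appends through last without any traversal, and remove resets last to None when the queue becomes empty.

```python
class Node:
    def __init__(self, cargo=None, next=None):
        self.cargo = cargo
        self.next = next


class ImprovedQueue:
    def __init__(self):
        self.length = 0
        self.head = None
        self.last = None   # reference to the final node, or None if empty

    def isEmpty(self):
        return self.length == 0

    def insert(self, cargo):
        node = Node(cargo)
        if self.length == 0:
            # Empty queue: the new node is both first and last.
            self.head = self.last = node
        else:
            # Attach after the current last node, then advance last.
            self.last.next = node
            self.last = node
        self.length += 1

    def remove(self):
        # Assumes the queue is not empty.
        cargo = self.head.cargo
        self.head = self.head.next
        self.length -= 1
        if self.length == 0:
            # Special case: the node we removed was also the last one.
            self.last = None
        return cargo
```

Neither method contains a loop, so each runs in constant time regardless of the queue length.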
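This implementation is more complicated than the Linked Queue implementation, and it is more difficult to demonstrate that it is correct. The advantage is that we have achieved the goal: both insert and remove are constant time operations.

As an exercise, write an implementation of the Queue ADT using a Python list. Compare the performance of this implementation to the ImprovedQueue for a range of queue lengths.

19.5 Priority queue

The Priority Queue ADT has the same interface as the Queue ADT, but different semantics. Again, the interface is:

__init__: Initialize a new empty queue.
insert: Add a new item to the queue.
remove: Remove and return an item from the queue. The item that is returned is the one with the highest priority.
isEmpty: Check whether the queue is empty.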
The semantic difference is that the item that is removed from the queue is not necessarily the first one that was added. Rather, it is the item in the queue that has the highest priority. What the priorities are and how they compare to each other are not specified by the Priority Queue implementation. It depends on which items are in the queue. For example, if the items in the queue have names, we might choose them in alphabetical order. If they are bowling scores, we might go from highest to lowest, but if they are golf scores, we would go from lowest to highest. As long as we can compare the items in the queue, we can find and remove the one with the highest priority. This implementation of Priority Queue has as an attribute a Python list that contains the items in the queue. class PriorityQueue:
The initialization method, isEmpty, and insert are all veneers on list operations. The only interesting method is remove: class PriorityQueue:
At the beginning of each iteration, maxi holds the index of the biggest item (highest priority) we have seen so far. Each time through the loop, the program compares the i-eth item to the champion. If the new item is bigger, the value of maxi is set to i. When the for statement completes, maxi is the index of the biggest item. This item is removed from the list and returned. Let's test the implementation: >>> q = PriorityQueue()
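The PriorityQueue listings and the rest of the test session are missing from this copy. The loop described above can be sketched as follows in Python 3; the attribute name items and the test values are assumptions.

```python
class PriorityQueue:
    def __init__(self):
        self.items = []

    def isEmpty(self):
        return self.items == []

    def insert(self, item):
        self.items.append(item)

    def remove(self):
        # maxi is the index of the biggest item seen so far.
        maxi = 0
        for i in range(1, len(self.items)):
            if self.items[i] > self.items[maxi]:
                maxi = i
        item = self.items[maxi]
        del self.items[maxi]
        return item


q = PriorityQueue()
for value in [11, 12, 14, 13]:
    q.insert(value)

removed = []
while not q.isEmpty():
    removed.append(q.remove())
print(removed)   # [14, 13, 12, 11]
```

Each call to remove scans the whole list, so removal is linear time while insertion is constant time.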
If the queue contains simple numbers or strings, they are removed in numerical or alphabetical order, from highest to lowest. Python can find the biggest integer or string because it can compare them using the built-in comparison operators.

If the queue contains an object type, it has to provide a __cmp__ method. When remove uses the > operator to compare items, it invokes the __cmp__ for one of the items and passes the other as an argument. As long as the __cmp__ method works correctly, the Priority Queue will work.

19.6 The Golfer class

As an example of an object with an unusual definition of priority, let's implement a class called Golfer that keeps track of the names and scores of golfers. As usual, we start by defining __init__ and __str__:

class Golfer:
__str__ uses the format operator to put the names and scores in neat columns. Next we define a version of __cmp__ where the lowest score gets highest priority. As always, __cmp__ returns 1 if self is "greater than" other, -1 if self is "less than" other, and 0 if they are equal. class Golfer:
Now we are ready to test the priority queue with the Golfer class: >>> tiger = Golfer("Tiger Woods", 61)
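The Golfer listings and the test session are missing from this copy. The sketch below is a Python 3 adaptation, not the book's code: Python 3 no longer calls __cmp__, so the ordering is expressed with __eq__ and __lt__ plus functools.total_ordering, which derives the remaining comparison operators. The column width in __str__ and the two extra golfers are made up for illustration.

```python
from functools import total_ordering


@total_ordering
class Golfer:
    def __init__(self, name, score):
        self.name = name
        self.score = score

    def __str__(self):
        # Left-justify the name in a fixed-width column (width is an assumption).
        return "%-16s: %d" % (self.name, self.score)

    def __eq__(self, other):
        return self.score == other.score

    def __lt__(self, other):
        # Less is more: a HIGHER score means LOWER priority.
        return self.score > other.score


tiger = Golfer("Tiger Woods", 61)
golfers = [tiger, Golfer("Golfer B", 72), Golfer("Golfer C", 21)]

# Highest priority (lowest score) first.
ranked = sorted(golfers, reverse=True)
for g in ranked:
    print(g)
```

Because > is defined through total_ordering, a priority queue whose remove method compares items with > would hand back the golfer with the lowest score first.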
As an exercise, write an implementation of the Priority Queue ADT using a linked list. You should keep the list sorted so that removal is a constant time operation. Compare the performance of this implementation with the Python list implementation.

19.7 Glossary
Foreword
By David Beazley

As an educator, researcher, and book author, I am delighted to see the completion of this book. Python is a fun and extremely easy-to-use programming language that has steadily gained in popularity over the last few years. Developed over ten years ago by Guido van Rossum, Python's simple syntax and overall feel are largely derived from ABC, a teaching language that was developed in the 1980s. However, Python was also created to solve real problems and it borrows a wide variety of features from programming languages such as C++, Java, Modula-3, and Scheme. Because of this, one of Python's most remarkable features is its broad appeal to professional software developers, scientists, researchers, artists, and educators.
Despite Python's appeal to many different communities, you may still wonder "why Python?" or "why teach programming with Python?" Answering these questions is no simple task.

When I teach computer science courses, I want to cover important concepts in addition to making the material interesting and engaging to students. Unfortunately, there is a tendency for introductory programming courses to focus far too much attention on mathematical abstraction and for students to become frustrated with annoying problems related to low-level details of syntax, compilation, and the enforcement of seemingly arcane rules. Although such abstraction and formalism are important to professional software engineers and students who plan to continue their study of computer science, taking such an approach in an introductory course mostly succeeds in making computer science boring. When I teach a course, I don't want to have a room of uninspired students. I would much rather see them trying to solve interesting problems by exploring different ideas, taking unconventional approaches, breaking the rules, and learning from their mistakes. In doing so, I don't want to waste half of the semester trying to sort out obscure syntax problems, unintelligible compiler error messages, or the several hundred ways that a program might generate a general protection fault.
One of the reasons why I like Python is that it provides a really nice
balance between the practical and the conceptual. Since Python is
interpreted, beginners can pick up the language and start doing
neat things almost immediately without getting lost in the problems of
compilation and linking. Furthermore, Python comes with a large
library of modules that can be used to do all sorts of tasks ranging
from web-programming to graphics. Having such a practical focus is a
great way to engage students and it allows them to complete
significant projects. However, Python can also serve as an excellent
foundation for introducing important computer science concepts. Since
Python fully supports procedures and classes, students can be
gradually introduced to topics such as procedural abstraction, data
structures, and object-oriented programming
In reading Jeffrey's preface, I am struck by his comments that Python
allowed him to see a "higher level of success and a lower level of
frustration" and that he was able to "move faster with better
results." Although these comments refer to his introductory course, I
sometimes use Python for these exact same reasons in advanced graduate
level computer science courses at the University of Chicago. In these
courses, I am constantly faced with the daunting task of covering a
lot of difficult course material in a blistering nine week quarter.
Although it is certainly possible for me to inflict a lot of pain and
suffering by using a language like C++, I have often found this
approach to be counterproductive.

Although Python is still a young and evolving language, I believe that it has a bright future in education. This book is an important step in that direction.
David Beazley
Chapter 11
Files and exceptions

While a program is running, its data is in memory. When the program ends, or the computer shuts down, data in memory disappears. To store data permanently, you have to put it in a file. Files are usually stored on a hard drive, floppy drive, or CD-ROM.

When there are a large number of files, they are often organized into directories (also called "folders"). Each file is identified by a unique name, or a combination of a file name and a directory name.

By reading and writing files, programs can exchange information with each other and generate printable formats like PDF.

Working with files is a lot like working with books. To use a book, you have to open it. When you're done, you have to close it. While the book is open, you can either write in it or read from it. In either case, you know where you are in the book. Most of the time, you read the whole book in its natural order, but you can also skip around.

All of this applies to files as well. To open a file, you specify its name and indicate whether you want to read or write.

Opening a file creates a file object. In this example, the variable f refers to the new file object.

>>> f = open("test.dat","w")
The open function takes two arguments. The first is the name of the file, and the second is the mode. Mode "w" means that we are opening the file for writing. If there is no file named test.dat, it will be created. If there already is one, it will be replaced by the file we are writing. When we print the file object, we see the name of the file, the mode, and the location of the object. To put data in the file we invoke the write method on the file object: >>> f.write("Now is the time")
Closing the file tells the system that we are done writing and makes the file available for reading: >>> f.close()
Now we can open the file again, this time for reading, and read the contents into a string. This time, the mode argument is "r" for reading: >>> f = open("test.dat","r")
If we try to open a file that doesn't exist, we get an error: >>> f = open("test.cat","r")
Not surprisingly, the read method reads data from the file. With no arguments, it reads the entire contents of the file: >>> text = f.read()
There is no space between "time" and "to" because we did not write a space between the strings. read can also take an argument that indicates how many characters to read: >>> f = open("test.dat","r")
If not enough characters are left in the file, read returns the remaining characters. When we get to the end of the file, read returns the empty string: >>> print f.read(1000006)
The following function copies a file, reading and writing up to fifty characters at a time. The first argument is the name of the original file; the second is the name of the new file: def copyFile(oldFile, newFile):
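The body of copyFile is missing from this copy. A Python 3 sketch that follows the description (a loop that reads up to fifty characters at a time and stops at the empty string) might look like this. The temporary directory and file names in the usage part are assumptions made so the example can run anywhere.

```python
import os
import tempfile


def copyFile(oldFile, newFile):
    f1 = open(oldFile, "r")
    f2 = open(newFile, "w")
    while True:
        text = f1.read(50)      # at most fifty characters per pass
        if text == "":
            break               # empty string means end of file
        f2.write(text)
    f1.close()
    f2.close()


# Usage: copy a 150-character file and read the copy back.
workdir = tempfile.mkdtemp()
src = os.path.join(workdir, "test.dat")
dst = os.path.join(workdir, "copy.dat")

f = open(src, "w")
f.write("Now is the time" * 10)
f.close()

copyFile(src, dst)

f = open(dst, "r")
copied = f.read()
f.close()
```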
The break statement is new. Executing it breaks out of the loop; the flow of execution moves to the first statement after the loop.

In this example, the while loop is infinite because the value True is always true. The only way to get out of the loop is to execute break, which happens when text is the empty string, which happens when we get to the end of the file.

11.1 Text files

A text file is a file that contains printable characters and whitespace, organized into lines separated by newline characters. Since Python is specifically designed to process text files, it provides methods that make the job easy.

To demonstrate, we'll create a text file with three lines of text separated by newlines:

>>> f = open("test.dat","w")
The readline method reads all the characters up to and including the next newline character: >>> f = open("test.dat","r")
readlines returns all of the remaining lines as a list of strings: >>> print f.readlines()
In this case, the output is in list format, which means that the strings appear with quotation marks and the newline character appears as the escape sequence \012.

At the end of the file, readline returns the empty string and readlines returns the empty list:

>>> print f.readline()
The following is an example of a line-processing program. filterFile makes a copy of oldFile, omitting any lines that begin with #: def filterFile(oldFile, newFile):
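The body of filterFile is missing from this copy. The sketch below, in Python 3, follows the description that comes with it: read one line at a time, stop at the empty string, skip lines whose first character is #, and copy everything else. The sample file contents and temporary paths are assumptions.

```python
import os
import tempfile


def filterFile(oldFile, newFile):
    f1 = open(oldFile, "r")
    f2 = open(newFile, "w")
    while True:
        text = f1.readline()
        if text == "":
            break               # end of file
        if text[0] == "#":
            continue            # skip comment lines, go back to the top
        f2.write(text)
    f1.close()
    f2.close()


# Usage: two of the four lines start with a hash mark.
workdir = tempfile.mkdtemp()
src = os.path.join(workdir, "source.txt")
dst = os.path.join(workdir, "filtered.txt")

f = open(src, "w")
f.write("# a comment\nkeep me\n#another comment\nkeep me too\n")
f.close()

filterFile(src, dst)

f = open(dst, "r")
filtered = f.read()
f.close()
```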
The continue statement ends the current iteration of the loop, but continues looping. The flow of execution moves to the top of the loop, checks the condition, and proceeds accordingly.

Thus, if text is the empty string, the loop exits. If the first character of text is a hash mark, the flow of execution goes to the top of the loop. Only if both conditions fail do we copy text into the new file.

11.2 Writing variables

The argument of write has to be a string, so if we want to put other values in a file, we have to convert them to strings first. The easiest way to do that is with the str function:

>>> x = 52
An alternative is to use the format operator %. When applied to integers, % is the modulus operator. But when the first operand is a string, % is the format operator. The first operand is the format string, and the second operand is a tuple of expressions. The result is a string that contains the values of the expressions, formatted according to the format string. As a simple example, the format sequence "%d" means that the first expression in the tuple should be formatted as an integer. Here the letter d stands for "decimal": >>> cars = 52
The result is the string '52', which is not to be confused with the integer value 52. A format sequence can appear anywhere in the format string, so we can embed a value in a sentence: >>> cars = 52
The format sequence "%f" formats the next item in the tuple as a floating-point number, and "%s" formats the next item as a string: >>> "In %d days we made %f million %s." % (34,6.1,'dollars')
By default, the floating-point format prints six decimal places. The number of expressions in the tuple has to match the number of format sequences in the string. Also, the types of the expressions have to match the format sequences: >>> "%d %d %d" % (1,2)
In the first example, there aren't enough expressions; in the second, the expression is the wrong type. For more control over the format of numbers, we can specify the number of digits as part of the format sequence: >>> "%6d" % 62
The number after the percent sign is the minimum number of spaces the number will take up. If the value provided takes fewer digits, leading spaces are added. If the number of spaces is negative, trailing spaces are added: >>> "%-6d" % 62
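The results of these two expressions are missing from this copy. A short check, using only the format operator described above, shows the padding on each side; the variable names are just for illustration.

```python
right = "%6d" % 62    # minimum width 6, padded with leading spaces
left = "%-6d" % 62    # negative width, padded with trailing spaces

print(repr(right))    # '    62'
print(repr(left))     # '62    '
```

Both results are six characters long; only the side on which the spaces are added differs.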
For floating-point numbers, we can also specify the number of digits after the decimal point: >>> "%12.2f" % 6.1
In this example, the result takes up twelve spaces and includes two digits after the decimal. This format is useful for printing dollar amounts with the decimal points aligned. For example, imagine a dictionary that contains student names as keys and hourly wages as values. Here is a function that prints the contents of the dictionary as a formatted report: def report (wages) :
To test this function, we'll create a small dictionary and print the contents: >>> wages = {'mary': 6.23, 'joe': 5.45, 'joshua': 4.25}
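The body of report and its output are missing from this copy. The sketch below is a Python 3 version under stated assumptions: the format string "%-20s %12.2f" is inferred from the limits mentioned in the text (names of up to twenty characters, wages under one billion dollars), and formatReport is a hypothetical helper that returns the lines so they can be inspected as well as printed.

```python
def formatReport(wages):
    # Hypothetical helper: one formatted line per student, sorted by name.
    lines = []
    for student in sorted(wages.keys()):
        lines.append("%-20s %12.2f" % (student, wages[student]))
    return lines


def report(wages):
    for line in formatReport(wages):
        print(line)


wages = {'mary': 6.23, 'joe': 5.45, 'joshua': 4.25}
report(wages)
lines = formatReport(wages)
```

Every line comes out 33 characters wide (20 for the name, one space, 12 for the wage), which is what keeps the decimal points aligned.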
By controlling the width of each value, we guarantee that the columns will line up, as long as the names contain fewer than twenty-one characters and the wages are less than one billion dollars an hour.

11.3 Directories

When you create a new file by opening it and writing, the new file goes in the current directory (wherever you were when you ran the program). Similarly, when you open a file for reading, Python looks for it in the current directory.

If you want to open a file somewhere else, you have to specify the path to the file, which is the name of the directory (or folder) where the file is located:

>>> f = open("/usr/share/dict/words","r")
This example opens a file named words that resides in a directory named dict, which resides in share, which resides in usr, which resides in the top-level directory of the system, called /.

You cannot use / as part of a filename; it is reserved as a delimiter between directory and filenames.

The file /usr/share/dict/words contains a list of words in alphabetical order, of which the first is the name of a Danish university.

11.4 Pickling

In order to put values into a file, you have to convert them to strings. You have already seen how to do that with str:

>>> f.write (str(12.3))
The problem is that when you read the value back, you get a string. The original type information has been lost. In fact, you can't even tell where one value ends and the next begins: >>> f.readline()
The solution is pickling, so called because it "preserves" data structures. The pickle module contains the necessary commands. To use it, import pickle and then open the file in the usual way: >>> import pickle
To store a data structure, use the dump method and then close the file in the usual way: >>> pickle.dump(12.3, f)
Then we can open the file for reading and load the data structures we dumped: >>> f = open("test.pck","r")
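The pickling session is incomplete in this copy. A full round trip can be sketched as follows in Python 3. One difference from the book's Python 2 session is an assumption worth flagging: in Python 3, pickle works on bytes, so the file is opened with "wb" and "rb" instead of "w" and "r". The temporary path and the second dumped value are also made up for the example.

```python
import os
import pickle
import tempfile

path = os.path.join(tempfile.mkdtemp(), "test.pck")

# Dump two values of different types into the same file.
f = open(path, "wb")        # binary mode is required by pickle in Python 3
pickle.dump(12.3, f)
pickle.dump([1, 2, 3], f)
f.close()

# Load them back in the same order.
f = open(path, "rb")
x = pickle.load(f)
y = pickle.load(f)
f.close()

print(x, type(x))
print(y, type(y))
```

Unlike the str approach, the values come back as a float and a list, and the boundary between them is preserved.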
Each time we invoke load, we get a single value from the file, complete with its original type.

11.5 Exceptions

Whenever a runtime error occurs, it creates an exception. Usually, the program stops and Python prints an error message.

For example, dividing by zero creates an exception:

>>> print 55/0
So does accessing a nonexistent list item: >>> a = []
Or accessing a key that isn't in the dictionary: >>> b = {}
Or trying to open a nonexistent file: >>> f = open("Idontexist", "r")
In each case, the error message has two parts: the type of error before the colon, and specifics about the error after the colon. Normally Python also prints a traceback of where the program was, but we have omitted that from the examples. Sometimes we want to execute an operation that could cause an exception, but we don't want the program to stop. We can handle the exception using the try and except statements. For example, we might prompt the user for the name of a file and then try to open it. If the file doesn't exist, we don't want the program to crash; we want to handle the exception: filename = raw_input('Enter a file name: ')
The try statement executes the statements in the first block. If no exceptions occur, it ignores the except statement. If an exception of type IOError occurs, it executes the statements in the except branch and then continues. We can encapsulate this capability in a function: exists takes a filename and returns true if the file exists, false if it doesn't: def exists(filename):
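One way the exists function described above might look is sketched below. It is written for Python 3, where IOError still works as an alias of OSError; the body is a guess at the truncated listing, not a copy of it.

```python
def exists(filename):
    """Return True if the file can be opened for reading, False otherwise."""
    try:
        f = open(filename)
        f.close()
        return True
    except IOError:
        # Opening failed (for example, the file does not exist).
        return False

print(exists("Idontexist"))
```

Note that only IOError is caught; any other kind of exception would still stop the program.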
You can use multiple except blocks to handle different kinds of exceptions. The Python Reference Manual has the details. If your program detects an error condition, you can make it raise an exception. Here is an example that gets input from the user and checks for the value 17. Assuming that 17 is not valid input for some reason, we raise an exception. def inputNumber () :
The raise statement takes two arguments: the exception type and specific information about the error. ValueError is one of the exception types Python provides for a variety of occasions. Other examples include TypeError, KeyError, and my favorite, NotImplementedError. If the function that called inputNumber handles the error, then the program can continue; otherwise, Python prints the error message and exits: >>> inputNumber ()
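The interaction between raising and handling can be sketched without keyboard input. The functions checkNumber and safeCheck below are hypothetical helpers, not part of the book. The sketch uses the call form raise ValueError(...), which works in both Python 2 and Python 3, whereas the two-argument form described above is Python 2 only.

```python
def checkNumber(x):
    """Return x unchanged, but reject 17 (hypothetical helper)."""
    if x == 17:
        raise ValueError('17 is a bad number')
    return x

def safeCheck(x):
    """Call checkNumber and handle the ValueError so execution continues."""
    try:
        return checkNumber(x)
    except ValueError as e:
        print('Rejected:', e)
        return None

print(safeCheck(5))
print(safeCheck(17))
```

If checkNumber(17) is called directly, with no try statement around it, the exception propagates and Python prints the error message.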
The error message includes the exception type and the additional information you provided. As an exercise, write a function that uses inputNumber to input a number from the keyboard and that handles the ValueError exception.
11.6 Glossary
Chapter 18
Stacks
18.1 Abstract data types
The data types you have seen so far are all concrete, in the sense that we have completely specified how they are implemented. For example, the Card class represents a card using two integers. As we discussed at the time, that is not the only way to represent a card; there are many alternative implementations. An abstract data type, or ADT, specifies a set of operations (or methods) and the semantics of the operations (what they do), but it does not specify the implementation of the operations. That's what makes it abstract. Why is that useful?
When we talk about ADTs, we often distinguish the code that uses the ADT, called the client code, from the code that implements the ADT, called the provider code.
18.2 The Stack ADT
In this chapter, we will look at one common ADT, the stack. A stack is a collection, meaning that it is a data structure that contains multiple elements. Other collections we have seen include dictionaries and lists. An ADT is defined by the operations that can be performed on it, which is called an interface. The interface for a stack consists of these operations:
__init__: Initialize a new empty stack.
push: Add a new item to the stack.
pop: Remove and return an item from the stack. The item that is returned is always the last one that was added.
isEmpty: Check whether the stack is empty.
A stack is sometimes called a "last in, first out" or LIFO data structure, because the last item added is the first to be removed.
18.3 Implementing stacks with Python lists
The list operations that Python provides are similar to the operations that define a stack. The interface isn't exactly what it is supposed to be, but we can write code to translate from the Stack ADT to the built-in operations. This code is called an implementation of the Stack ADT. In general, an implementation is a set of methods that satisfy the syntactic and semantic requirements of an interface. Here is an implementation of the Stack ADT that uses a Python list:
class Stack :
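A list-based implementation of this kind can be sketched as follows. The method bodies are inferred from the description in this section, so treat them as one plausible reading rather than the original listing.

```python
class Stack:
    def __init__(self):
        self.items = []            # a new stack starts out empty

    def push(self, item):
        self.items.append(item)    # add to the end of the list

    def pop(self):
        return self.items.pop()    # remove and return the last item

    def isEmpty(self):
        return self.items == []    # compare with the empty list

s = Stack()
s.push(1)
s.push("two")
print(s.pop())
print(s.isEmpty())
```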
A Stack object contains an attribute named items that is a list of items in the stack. The initialization method sets items to the empty list. To push a new item onto the stack, push appends it onto items. To pop an item off the stack, pop uses the list method of the same name to remove and return the last item on the list. Finally, to check if the stack is empty, isEmpty compares items to the empty list. An implementation like this, in which the methods consist of simple invocations of existing methods, is called a veneer. In real life, veneer is a thin coating of good quality wood used in furniture-making to hide lower quality wood underneath. Computer scientists use this metaphor to describe a small piece of code that hides the details of an implementation and provides a simpler, or more standard, interface.
18.4 Pushing and popping
A stack is a generic data structure, which means that we can add any type of item to it. The following example pushes two integers and a string onto the stack:
>>> s = Stack()
We can use isEmpty and pop to remove and print all of the items on the stack: while not s.isEmpty() :
The output is + 45 54. In other words, we just used a stack to print the items backward! Granted, it's not the standard format for printing a list, but by using a stack, it was remarkably easy to do. You should compare this bit of code to the implementation of printBackward in Section 17.4. There is a natural parallel between the recursive version of printBackward and the stack algorithm here. The difference is that printBackward uses the runtime stack to keep track of the nodes while it traverses the list, and then prints them on the way back from the recursion. The stack algorithm does the same thing, except that it uses a Stack object instead of the runtime stack.
18.5 Using a stack to evaluate postfix
In most programming languages, mathematical expressions are written with the operator between the two operands, as in 1+2. This format is called infix. An alternative used by some calculators is called postfix. In postfix, the operator follows the operands, as in 1 2 +. The reason postfix is sometimes useful is that there is a natural way to evaluate a postfix expression using a stack:
Starting at the beginning of the expression, get one term (operator or operand) at a time.
If the term is an operand, push it on the stack.
If the term is an operator, pop two operands off the stack, perform the operation on them, and push the result back on the stack.
When you get to the end of the expression, there should be exactly one operand left on the stack. That operand is the result.
As an exercise, apply this algorithm to the expression 1 2 + 3 *.
This example demonstrates one of the advantages of postfix: there is no need to use parentheses to control the order of operations. To get the same result in infix, we would have to write (1 + 2) * 3. As an exercise, write a postfix expression that is equivalent to 1 + 2 * 3.
18.6 Parsing
To implement the previous algorithm, we need to be able to traverse a string and break it into operands and operators. This process is an example of parsing, and the results, the individual chunks of the string, are called tokens.
Python provides a split method in both the string and re (regular expression) modules. The function string.split splits a string into a list using a single character as a delimiter. For example:
>>> import string
In this case, the delimiter is the space character, so the string is split at each space. The function re.split is more powerful, allowing us to provide a regular expression instead of a delimiter. A regular expression is a way of specifying a set of strings. For example, [A-Za-z] is the set of all letters and [0-9] is the set of all digits. The ^ operator negates a set, so [^0-9] is the set of every character that is not a digit, which is exactly the set we want to use to split up postfix expressions:
>>> import re
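Both kinds of splitting can be sketched in a few lines. This is written for Python 3, where the string.split function no longer exists and the equivalent str method is used instead. The input strings are assumptions chosen to match the surrounding description.

```python
import re

# Splitting on a single delimiter character (here, a space).
words = "Now is the time".split(" ")
print(words)

# Splitting on any character that is not a digit. The parentheses in the
# pattern keep the delimiters (the operators) in the resulting list.
tokens = re.split("([^0-9])", "123+456*/")
print(tokens)
```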
Notice that the order of the arguments is different from string.split; the delimiter comes before the string. The resulting list includes the operands 123 and 456 and the operators * and /. It also includes two empty strings that are inserted as "phantom operands" whenever an operator appears without a number before or after it.
18.7 Evaluating postfix
To evaluate a postfix expression, we will use the parser from the previous section and the algorithm from the section before that. To keep things simple, we'll start with an evaluator that only implements the operators + and *:
def evalPostfix(expr):
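An evaluator along these lines might look like the sketch below. To stay self-contained it uses a plain Python list as the stack (append for push, pop for pop) instead of the Stack class, which is an assumption of this sketch. Like the version described here, it does no error checking.

```python
import re

def evalPostfix(expr):
    tokenList = re.split("([^0-9])", expr)
    stack = []                        # assumption: a plain list used as the stack
    for token in tokenList:
        if token == '' or token == ' ':
            continue                  # skip spaces and empty strings
        if token == '+':
            total = stack.pop() + stack.pop()
            stack.append(total)
        elif token == '*':
            product = stack.pop() * stack.pop()
            stack.append(product)
        else:
            stack.append(int(token))  # anything else is assumed to be an operand
    return stack.pop()

print(evalPostfix("56 47 + 2 *"))
```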
The first condition takes care of spaces and empty strings. The next two conditions handle operators. We assume, for now, that anything else must be an operand. Of course, it would be better to check for erroneous input and report an error message, but we'll get to that later. Let's test it by evaluating the postfix form of (56+47)*2: >>> print evalPostfix ("56 47 + 2 *")
18.8 Clients and providers
One of the fundamental goals of an ADT is to separate the interests of the provider, who writes the code that implements the ADT, and the client, who uses the ADT. The provider only has to worry about whether the implementation is correct, in accord with the specification of the ADT, and not how it will be used.
Conversely, the client assumes that the implementation of the ADT is correct and doesn't worry about the details. When you are using one of Python's built-in types, you have the luxury of thinking exclusively as a client. Of course, when you implement an ADT, you also have to write client code to test it. In that case, you play both roles, which can be confusing. You should make some effort to keep track of which role you are playing at any moment.
18.9 Glossary
Chapter 16
Inheritance
16.1 Inheritance
The language feature most often associated with object-oriented programming is inheritance. Inheritance is the ability to define a new class that is a modified version of an existing class. The primary advantage of this feature is that you can add new methods to a class without modifying the existing class. It is called "inheritance" because the new class inherits all of the methods of the existing class. Extending this metaphor, the existing class is sometimes called the parent class. The new class may be called the child class or sometimes "subclass."
Inheritance is a powerful feature. Some programs that would be complicated without inheritance can be written concisely and simply with it. Also, inheritance can facilitate code reuse, since you can customize the behavior of parent classes without having to modify them. In some cases, the inheritance structure reflects the natural structure of the problem, which makes the program easier to understand.
On the other hand, inheritance can make programs difficult to read. When a method is invoked, it is sometimes not clear where to find its definition. The relevant code may be scattered among several modules. Also, many of the things that can be done using inheritance can be done as elegantly (or more so) without it. If the natural structure of the problem does not lend itself to inheritance, this style of programming can do more harm than good.
In this chapter we will demonstrate the use of inheritance as part of a program that plays the card game Old Maid. One of our goals is to write code that could be reused to implement other card games.
16.2 A hand of cards
For almost any card game, we need to represent a hand of cards. A hand is similar to a deck, of course. Both are made up of a set of cards, and both require operations like adding and removing cards. Also, we might like the ability to shuffle both decks and hands. A hand is also different from a deck.
Depending on the game being played, we might want to perform some operations on hands that don't make sense for a deck. For example, in poker we might classify a hand (straight, flush, etc.) or compare it with another hand. In bridge, we might want to compute a score for a hand in order to make a bid. This situation suggests the use of inheritance. If Hand is a subclass of Deck, it will have all the methods of Deck, and new methods can be added. In the class definition, the name of the parent class appears in parentheses: class Hand(Deck):
This statement indicates that the new Hand class inherits from the existing Deck class. The Hand constructor initializes the attributes for the hand, which are name and cards. The string name identifies this hand, probably by the name of the player that holds it. The name is an optional parameter with the empty string as a default value. cards is the list of cards in the hand, initialized to the empty list: class Hand(Deck):
For just about any card game, it is necessary to add and remove cards from the deck. Removing cards is already taken care of, since Hand inherits removeCard from Deck. But we have to write addCard: class Hand(Deck):
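Putting the pieces of this section together, a Hand class might be sketched as follows. The Deck class here is a hypothetical stand-in with only the one method needed, and plain strings stand in for Card objects; neither is the book's actual code.

```python
class Deck:
    """Hypothetical stand-in for the Deck class from the previous chapter."""
    def __init__(self):
        self.cards = []

    def removeCard(self, card):
        if card in self.cards:
            self.cards.remove(card)
            return True
        return False

class Hand(Deck):
    def __init__(self, name=""):
        self.cards = []               # the hand starts out empty
        self.name = name              # optional; defaults to the empty string

    def addCard(self, card):
        self.cards.append(card)       # add the new card to the end of the list

hand = Hand("frank")
hand.addCard("2 of Spades")           # a string stands in for a Card object
hand.addCard("3 of Spades")
hand.removeCard("2 of Spades")        # removeCard is inherited from Deck
print(hand.name, hand.cards)
```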
Again, the ellipsis indicates that we have omitted other methods. The list append method adds the new card to the end of the list of cards.
16.3 Dealing cards
Now that we have a Hand class, we want to deal cards from the Deck into hands. It is not immediately obvious whether this method should go in the Hand class or in the Deck class, but since it operates on a single deck and (possibly) several hands, it is more natural to put it in Deck. deal should be fairly general, since different games will have different requirements. We may want to deal out the entire deck at once or add one card to each hand. deal takes three parameters: the deck, a list (or tuple) of hands, and the total number of cards to deal. If there are not enough cards in the deck, the method deals out all of the cards and stops:
class Deck :
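The round-robin deal can be sketched as below. To keep the example self-contained, this hypothetical Deck holds plain integers instead of Card objects, each hand is a plain list, and the default of 999 is an arbitrary choice of "large number".

```python
class Deck:
    """Hypothetical stand-in: cards are plain integers here."""
    def __init__(self, cards):
        self.cards = list(cards)

    def isEmpty(self):
        return len(self.cards) == 0

    def popCard(self):
        return self.cards.pop()        # remove and return the last card

    def deal(self, hands, nCards=999):
        nHands = len(hands)
        for i in range(nCards):
            if self.isEmpty():
                break                  # stop when the deck runs out
            card = self.popCard()
            hand = hands[i % nHands]   # round robin: whose turn is next?
            hand.append(card)          # each hand is a plain list in this sketch

deck = Deck(range(7))
hands = [[], [], []]
deck.deal(hands)
print(hands)
```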
The last parameter, nCards, is optional; the default is a large number, which effectively means that all of the cards in the deck will get dealt. The loop variable i goes from 0 to nCards-1. Each time through the loop, a card is removed from the deck using the list method pop, which removes and returns the last item in the list. The modulus operator (%) allows us to deal cards in a round robin (one card at a time to each hand). When i is equal to the number of hands in the list, the expression i % nHands wraps around to the beginning of the list (index 0).
16.4 Printing a Hand
To print the contents of a hand, we can take advantage of the printDeck and __str__ methods inherited from Deck. For example:
>>> deck = Deck()
It's not a great hand, but it has the makings of a straight flush. Although it is convenient to inherit the existing methods, there is additional information in a Hand object we might want to include when we print one. To do that, we can provide a __str__ method in the Hand class that overrides the one in the Deck class:
class Hand(Deck):
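Overriding __str__ while still reusing the parent's version might look like the sketch below. The Deck class is a hypothetical stand-in whose __str__ simply lists one card per line, and strings stand in for Card objects.

```python
class Deck:
    """Hypothetical stand-in for the Deck class from the previous chapter."""
    def __init__(self):
        self.cards = []

    def isEmpty(self):
        return len(self.cards) == 0

    def __str__(self):
        s = ""
        for card in self.cards:
            s = s + str(card) + "\n"   # one card per line
        return s

class Hand(Deck):
    def __init__(self, name=""):
        self.cards = []
        self.name = name

    def __str__(self):
        s = "Hand " + self.name
        if self.isEmpty():
            return s + " is empty\n"
        else:
            # Invoke the parent's __str__ explicitly, passing self.
            return s + " contains\n" + Deck.__str__(self)

hand = Hand("frank")
print(hand)
hand.cards.append("2 of Spades")
print(hand)
```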
Initially, s is a string that identifies the hand. If the hand is empty, the program appends the words is empty and returns the result. Otherwise, the program appends the word contains and the string representation of the Deck, computed by invoking the __str__ method in the Deck class on self. It may seem odd to send self, which refers to the current Hand, to a Deck method, until you remember that a Hand is a kind of Deck. Hand objects can do everything Deck objects can, so it is legal to send a Hand to a Deck method. In general, it is always legal to use an instance of a subclass in place of an instance of a parent class.
16.5 The CardGame class
The CardGame class takes care of some basic chores common to all games, such as creating the deck and shuffling it:
class CardGame:
This is the first case we have seen where the initialization method performs a significant computation, beyond initializing attributes. To implement specific games, we can inherit from CardGame and add features for the new game. As an example, we'll write a simulation of Old Maid.
The object of Old Maid is to get rid of cards in your hand. You do this by matching cards by rank and color. For example, the 4 of Clubs matches the 4 of Spades since both suits are black. The Jack of Hearts matches the Jack of Diamonds since both are red. To begin the game, the Queen of Clubs is removed from the deck so that the Queen of Spades has no match. The fifty-one remaining cards are dealt to the players in a round robin. After the deal, all players match and discard as many cards as possible. When no more matches can be made, play begins. In turn, each player picks a card (without looking) from the closest neighbor to the left who still has cards. If the chosen card matches a card in the player's hand, the pair is removed. Otherwise, the card is added to the player's hand. Eventually all possible matches are made, leaving only the Queen of Spades in the loser's hand.
In our computer simulation of the game, the computer plays all hands. Unfortunately, some nuances of the real game are lost. In a real game, the player with the Old Maid goes to some effort to get their neighbor to pick that card, by displaying it a little more prominently, or perhaps failing to display it more prominently, or even failing to fail to display that card more prominently. The computer simply picks a neighbor's card at random.
16.6 OldMaidHand class
A hand for playing Old Maid requires some abilities beyond the general abilities of a Hand. We will define a new class, OldMaidHand, that inherits from Hand and provides an additional method called removeMatches:
class OldMaidHand(Hand):
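The matching logic can be sketched in isolation. The Card class below is a hypothetical stand-in that supports only comparison, and OldMaidHand is shown without its parent class so that the example is self-contained; in the book it inherits from Hand.

```python
class Card:
    """Hypothetical stand-in: suits 0 to 3 are Clubs, Diamonds, Hearts, Spades."""
    def __init__(self, suit, rank):
        self.suit = suit
        self.rank = rank

    def __eq__(self, other):
        return (self.suit, self.rank) == (other.suit, other.rank)

class OldMaidHand:
    """Standalone sketch; the book's version inherits from Hand."""
    def __init__(self, cards):
        self.cards = list(cards)

    def removeMatches(self):
        count = 0
        originalCards = self.cards[:]          # traverse a copy of the list
        for card in originalCards:
            # Same rank, other suit of the same color.
            match = Card(3 - card.suit, card.rank)
            if match in self.cards:
                self.cards.remove(card)
                self.cards.remove(match)
                count = count + 1
        return count

hand = OldMaidHand([Card(0, 4), Card(3, 4), Card(2, 11)])
print(hand.removeMatches(), len(hand.cards))
```

Once a pair has been removed, the second card of the pair no longer finds its match in self.cards, so each pair is counted only once.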
We start by making a copy of the list of cards, so that we can traverse the copy while removing cards from the original. Since self.cards is modified in the loop, we don't want to use it to control the traversal. Python can get quite confused if it is traversing a list that is changing! For each card in the hand, we figure out what the matching card is and go looking for it. The match card has the same rank and the other suit of the same color. The expression 3 - card.suit turns a Club (suit 0) into a Spade (suit 3) and a Diamond (suit 1) into a Heart (suit 2). You should satisfy yourself that the opposite operations also work. If the match card is also in the hand, both cards are removed. The following example demonstrates how to use removeMatches: >>> game = CardGame()
Notice that there is no __init__ method for the OldMaidHand class. We inherit it from Hand.
16.7 OldMaidGame class
Now we can turn our attention to the game itself. OldMaidGame is a subclass of CardGame with a new method called play that takes a list of players as an argument. Since __init__ is inherited from CardGame, a new OldMaidGame object contains a new shuffled deck:
class OldMaidGame(CardGame):
Some of the steps of the game have been separated into methods. removeAllMatches traverses the list of hands and invokes removeMatches on each: class OldMaidGame(CardGame):
As an exercise, write printHands which traverses self.hands and prints each hand. count is an accumulator that adds up the number of matches in each hand and returns the total. When the total number of matches reaches twenty-five, fifty cards have been removed from the hands, which means that only one card is left and the game is over. The variable turn keeps track of which player's turn it is. It starts at 0 and increases by one each time; when it reaches numHands, the modulus operator wraps it back around to 0. The method playOneTurn takes an argument that indicates whose turn it is. The return value is the number of matches made during this turn: class OldMaidGame(CardGame):
If a player's hand is empty, that player is out of the game, so he or she does nothing and returns 0. Otherwise, a turn consists of finding the first player on the left that has cards, taking one card from the neighbor, and checking for matches. Before returning, the cards in the hand are shuffled so that the next player's choice is random. The method findNeighbor starts with the player to the immediate left and continues around the circle until it finds a player that still has cards: class OldMaidGame(CardGame):
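The search around the circle can be sketched as a standalone function. In the book findNeighbor is a method of OldMaidGame that reads self.hands; here the list of hands is passed in as a parameter and each hand is a plain list, both of which are assumptions of this sketch.

```python
def findNeighbor(hands, i):
    """Return the index of the first hand after i that still has cards."""
    numHands = len(hands)
    for step in range(1, numHands):
        neighbor = (i + step) % numHands   # wrap around the circle
        if len(hands[neighbor]) > 0:
            return neighbor
    return None                            # every other hand is empty

hands = [["Qs"], [], [], ["10h", "Jd"]]
print(findNeighbor(hands, 0))
```

Because step stops at numHands - 1, the search never returns the player's own index.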
If findNeighbor ever went all the way around the circle without finding cards, it would return None and cause an error elsewhere in the program. Fortunately, we can prove that that will never happen (as long as the end of the game is detected correctly). We have omitted the printHands method. You can write that one yourself. The following output is from a truncated form of the game where only the top fifteen cards (tens and higher) were dealt to three players. With this small deck, play stops after seven matches instead of twenty-five. >>> import cards
16.8 Glossary
Chapter 6
Iteration
6.1 Multiple assignment
As you may have discovered, it is legal to make more than one assignment to the same variable. A new assignment makes an existing variable refer to a new value (and stop referring to the old value).
bruce = 5
The output of this program is 5 7, because the first time bruce is printed, his value is 5, and the second time, his value is 7. The comma at the end of the first print statement suppresses the newline after the output, which is why both outputs appear on the same line. Here is what multiple assignment looks like in a state diagram:
With multiple assignment it is especially important to distinguish between an assignment operation and a statement of equality. Because Python uses the equal sign (=) for assignment, it is tempting to interpret a statement like a = b as a statement of equality. It is not! First, equality is symmetric and assignment is not. For example, in mathematics, if a = 7 then 7 = a. But in Python, the statement a = 7 is legal and 7 = a is not. Furthermore, in mathematics, a statement of equality is always true. If a = b now, then a will always equal b. In Python, an assignment statement can make two variables equal, but they don't have to stay that way:
a = 5
The third line changes the value of a but does not change the value of b, so they are no longer equal. (In some programming languages, a different symbol is used for assignment, such as <- or :=, to avoid confusion.) Although multiple assignment is frequently helpful, you should use it with caution. If the values of variables change frequently, it can make the code difficult to read and debug.
6.2 The while statement
Computers are often used to automate repetitive tasks. Repeating identical or similar tasks without making errors is something that computers do well and people do poorly. We have seen two programs, nLines and countdown, that use recursion to perform repetition, which is also called iteration. Because iteration is so common, Python provides several language features to make it easier. The first feature we are going to look at is the while statement. Here is what countdown looks like with a while statement:
def countdown(n):
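A while-based countdown can be sketched as follows. It is written with the Python 3 print function; the book's listing uses the Python 2 print statement.

```python
def countdown(n):
    while n > 0:
        print(n)        # display the current value
        n = n - 1       # then reduce it by 1
    print("Blastoff!")

countdown(3)
```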
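Since we removed the recursive call, this function is not recursive. You can almost read the while statement as if it were English. It means, "While n is greater than 0, continue displaying the value of n and then reducing the value of n by 1. When you get to 0, display the word Blastoff!" More formally, here is the flow of execution for a while statement:
1. Evaluate the condition, determining whether it is true or false.
2. If the condition is false, exit the while statement and continue execution at the next statement.
3. If the condition is true, execute each of the statements in the body and then go back to step 1.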
The body consists of all of the statements below the header with the same indentation. This type of flow is called a loop because the third step loops back around to the top. Notice that if the condition is false the first time through the loop, the statements inside the loop are never executed. The body of the loop should change the value of one or more variables so that eventually the condition becomes false and the loop terminates. Otherwise the loop will repeat forever, which is called an infinite loop. An endless source of amusement for computer scientists is the observation that the directions on shampoo, "Lather, rinse, repeat," are an infinite loop. In the case of countdown, we can prove that the loop terminates because we know that the value of n is finite, and we can see that the value of n gets smaller each time through the loop, so eventually we have to get to 0. In other cases, it is not so easy to tell: def sequence(n):
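The loop in question can be sketched as below. This is a variant, not the book's listing: it collects the values in a list and returns them instead of printing them, and it uses the Python 3 integer division operator // so that n stays an integer.

```python
def sequence(n):
    """Return the values visited, starting at n and ending at 1."""
    values = [n]
    while n != 1:
        if n % 2 == 0:        # n is even
            n = n // 2
        else:                 # n is odd
            n = n * 3 + 1
        values.append(n)
    return values

print(sequence(3))
```

The function assumes a positive integer argument; for 0 or a negative number the condition n != 1 never becomes false.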
The condition for this loop is n != 1, so the loop will continue until n is 1, which will make the condition false. Each time through the loop, the program outputs the value of n and then checks whether it is even or odd. If it is even, the value of n is divided by 2. If it is odd, the value is replaced by n*3+1. For example, if the starting value (the argument passed to sequence) is 3, the resulting sequence is 3, 10, 5, 16, 8, 4, 2, 1. Since n sometimes increases and sometimes decreases, there is no obvious proof that n will ever reach 1, or that the program terminates. For some particular values of n, we can prove termination. For example, if the starting value is a power of two, then the value of n will be even each time through the loop until it reaches 1. The previous example ends with such a sequence, starting with 16. Particular values aside, the interesting question is whether we can prove that this program terminates for all positive values of n. So far, no one has been able to prove it or disprove it! As an exercise, rewrite the function nLines from Section 4.9 using iteration instead of recursion.
6.3 Tables
One of the things loops are good for is generating tabular data. Before computers were readily available, people had to calculate logarithms, sines and cosines, and other mathematical functions by hand. To make that easier, mathematics books contained long tables listing the values of these functions. Creating the tables was slow and boring, and they tended to be full of errors. When computers appeared on the scene, one of the initial reactions was, "This is great! We can use the computers to generate the tables, so there will be no errors." That turned out to be true (mostly) but shortsighted. Soon thereafter, computers and calculators were so pervasive that the tables became obsolete. Well, almost. For some operations, computers use tables of values to get an approximate answer and then perform computations to improve the approximation.
In some cases, there have been errors in the underlying tables, most famously in the table the Intel Pentium used to perform floating-point division. Although a log table is not as useful as it once was, it still makes a good example of iteration. The following program outputs a sequence of values in the left column and their logarithms in the right column: x = 1.0
The string '\t' represents a tab character. As characters and strings are displayed on the screen, an invisible marker called the cursor keeps track of where the next character will go. After a print statement, the cursor normally goes to the beginning of the next line. The tab character shifts the cursor to the right until it reaches one of the tab stops. Tabs are useful for making columns of text line up, as in the output of the previous program: 1.0 0.0
If these values seem odd, remember that the log function uses base e. Since powers of two are so important in computer science, we often want to find logarithms with respect to base 2. To do that, we can use the following formula:
$$\log_2 x = \frac{\log_e x}{\log_e 2}$$
Changing the output statement to: print x, '\t', math.log(x)/math.log(2.0)
yields: 1.0 0.0
We can see that 1, 2, 4, and 8 are powers of two because their logarithms base 2 are round numbers. If we wanted to find the logarithms of other powers of two, we could modify the program like this: x = 1.0
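A loop that walks through the powers of two and prints their base-2 logarithms might be sketched like this. It is written for Python 3, and the upper bound of 100.0 is an assumption chosen to keep the output short.

```python
import math

x = 1.0
while x < 100.0:
    # Change of base: log2(x) = ln(x) / ln(2)
    log2 = math.log(x) / math.log(2.0)
    print(x, '\t', log2)
    x = x * 2.0          # multiply instead of add: 1, 2, 4, 8, ...
```

Because floating-point division is approximate, a printed logarithm may in principle differ from a whole number in the last digit.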
Now instead of adding something to x each time through the loop, which yields an arithmetic sequence, we multiply x by something, yielding a geometric sequence. The result is: 1.0 0.0
Because of the tab characters between the columns, the position of the second column does not depend on the number of digits in the first column. Logarithm tables may not be useful any more, but for computer scientists, knowing the powers of two is! As an exercise, modify this program so that it outputs the powers of two up to 65,536 (that's $2^{16}$). Print it out and memorize it. The backslash character in '\t' indicates the beginning of an escape sequence. Escape sequences are used to represent invisible characters like tabs and newlines. The sequence \n represents a newline. An escape sequence can appear anywhere in a string; in the example, the tab escape sequence is the only thing in the string. How do you think you represent a backslash in a string? As an exercise, write a single string that
6.4 Two-dimensional tables
A two-dimensional table is a table where you read the value at the intersection of a row and a column. A multiplication table is a good example. Let's say you want to print a multiplication table for the values from 1 to 6. A good way to start is to write a loop that prints the multiples of 2, all on one line:
i = 1
The first line initializes a variable named i, which acts as a counter or loop variable. As the loop executes, the value of i increases from 1 to 6. When i is 7, the loop terminates. Each time through the loop, it displays the value of 2*i, followed by three spaces. Again, the comma in the print statement suppresses the newline. After the loop completes, the second print statement starts a new line. The output of the program is: 2 4 6 8 10 12
So far, so good. The next step is to encapsulate and generalize.
6.5 Encapsulation and generalization
Encapsulation is the process of wrapping a piece of code in a function, allowing you to take advantage of all the things functions are good for. You have seen two examples of encapsulation: printParity in Section 4.5; and isDivisible in Section 5.4. Generalization means taking something specific, such as printing the multiples of 2, and making it more general, such as printing the multiples of any integer. This function encapsulates the previous loop and generalizes it to print multiples of n:
def printMultiples(n):
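The encapsulated and generalized loop can be sketched as follows. It is written for Python 3, where the end argument of print takes the place of the trailing comma in the book's print statements; separating the values with a tab is an assumption of this sketch.

```python
def printMultiples(n):
    i = 1
    while i <= 6:
        print(n * i, end='\t')   # stay on the same line after each value
        i = i + 1
    print()                      # finish the line

printMultiples(3)
```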
To encapsulate, all we had to do was add the first line, which declares the name of the function and the parameter list. To generalize, all we had to do was replace the value 2 with the parameter n. If we call this function with the argument 2, we get the same output as before. With the argument 3, the output is: 3 6 9 12 15 18
With the argument 4, the output is: 4 8 12 16 20 24
By now you can probably guess how to print a multiplication table: by calling printMultiples repeatedly with different arguments. In fact, we can use another loop:
i = 1
Notice how similar this loop is to the one inside printMultiples. All we did was replace the print statement with a function call. The output of this program is a multiplication table: 1 2 3 4 5 6
6.6 More encapsulation
To demonstrate encapsulation again, let's take the code from the end of Section 6.5 and wrap it up in a function:
def printMultTable():
This process is a common development plan. We develop code by writing lines of code outside any function, or typing them in to the interpreter. When we get the code working, we extract it and wrap it up in a function. This development plan is particularly useful if you don't know, when you start writing, how to divide the program into functions. This approach lets you design as you go along.
6.7 Local variables
You might be wondering how we can use the same variable, i, in both printMultiples and printMultTable. Doesn't it cause problems when one of the functions changes the value of the variable? The answer is no, because the i in printMultiples and the i in printMultTable are not the same variable. Variables created inside a function definition are local; you can't access a local variable from outside its "home" function. That means you are free to have multiple variables with the same name as long as they are not in the same function. The stack diagram for this program shows that the two variables named i are not the same variable. They can refer to different values, and changing one does not affect the other.
The value of i in printMultTable goes from 1 to 6. In the diagram it happens to be 3. The next time through the loop it will be 4. Each time through the loop, printMultTable calls printMultiples with the current value of i as an argument. That value gets assigned to the parameter n. Inside printMultiples, the value of i goes from 1 to 6. In the diagram, it happens to be 2. Changing this variable has no effect on the value of i in printMultTable. It is common and perfectly legal to have different local variables with the same name. In particular, names like i and j are used frequently as loop variables. If you avoid using them in one function just because you used them somewhere else, you will probably make the program harder to read.
6.8 More generalization
As another example of generalization, imagine you wanted a program that would print a multiplication table of any size, not just the six-by-six table. You could add a parameter to printMultTable:
def printMultTable(high):
We replaced the value 6 with the parameter high. If we call printMultTable with the argument 7, it displays: 1 2 3 4 5 6
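This is fine, except that we probably want the table to be square, with the same number of rows and columns. To do that, we add another parameter to printMultiples to specify how many columns the table should have. Just to be annoying, we call this parameter high, demonstrating that different functions can have parameters with the same name (just like local variables). Here's the whole program: def printMultiples(n, high):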
Notice that when we added a new parameter, we had to change the first line of the function (the function heading), and we also had to change the place where the function is called in printMultTable. As expected, this program generates a square seven-by-seven table: 1 2 3 4 5 6 7
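A minimal sketch of the whole program described above might look like this. It assumes Python 3 (the book's own listings use the Python 2 print statement), and the tab-separated layout is also an assumption.

```python
def printMultiples(n, high):
    """Print the first `high` multiples of n on one line."""
    i = 1
    while i <= high:
        print(n * i, end="\t")   # stay on the same line
        i = i + 1
    print()                      # end the row


def printMultTable(high):
    """Print a square table with `high` rows and `high` columns."""
    i = 1
    while i <= high:
        printMultiples(i, high)  # the same `high` controls the columns
        i = i + 1


printMultTable(7)
```

Each function has its own local i, so the loop in printMultiples does not disturb the loop in printMultTable.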
When you generalize a function appropriately, you often get a program with capabilities you didn't plan. For example, you might notice that, because ab = ba, all the entries in the table appear twice. You could save ink by printing only half the table. To do that, you only have to change one line of printMultTable. Change printMultiples(i, high)
to printMultiples(i, i)
and you get 1
As an exercise, trace the execution of this version of printMultTable and figure out how it works.
6.9 Functions
A few times now, we have mentioned "all the things functions are good for." By now, you might be wondering what exactly those things are. Here are some of them:
6.10 Glossary
Chapter 10 Dictionaries
The compound types you have learned about (strings, lists, and tuples) use integers as indices. Dictionaries are similar to other compound types except that they can use any immutable type as an index. As an example, we will create a dictionary to translate English words into Spanish. For this dictionary, the indices are strings. One way to create a dictionary is to start with the empty dictionary and add elements. The empty dictionary is denoted {}: >>> eng2sp = {}
The first assignment creates a dictionary named eng2sp; the other assignments add new elements to the dictionary. We can print the current value of the dictionary in the usual way: >>> print eng2sp
The elements of a dictionary appear in a comma-separated list. Each entry contains an index and a value separated by a colon. In a dictionary, the indices are called keys, so the elements are called key-value pairs. Another way to create a dictionary is to provide a list of key-value pairs using the same syntax as the previous output: >>> eng2sp = {'one': 'uno', 'two': 'dos', 'three': 'tres'}
If we print the value of eng2sp again, we get a surprise: >>> print eng2sp
The key-value pairs are not in order! Fortunately, there is no reason to care about the order, since the elements of a dictionary are never indexed with integer indices. Instead, we use the keys to look up the corresponding values: >>> print eng2sp['two']
The key 'two' yields the value 'dos' even though it appears in the third key-value pair. 10.1 Dictionary operationsThe del statement removes a key-value pair from a dictionary. For example, the following dictionary contains the names of various fruits and the number of each fruit in stock: >>> inventory = {'apples': 430, 'bananas': 312, 'oranges': 525,
If someone buys all of the pears, we can remove the entry from the dictionary: >>> del inventory['pears']
Or if we're expecting more pears soon, we might just change the value associated with pears: >>> inventory['pears'] = 0
The len function also works on dictionaries; it returns the number of key-value pairs: >>> len(inventory)
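The three operations in this section can be sketched together as follows. This is a minimal sketch assuming Python 3; the stock count for pears is a placeholder, since the listing above does not show it.

```python
# The pears count is a placeholder value chosen for illustration.
inventory = {'apples': 430, 'bananas': 312, 'oranges': 525, 'pears': 100}

del inventory['pears']           # remove the key-value pair entirely
print('pears' in inventory)      # False

inventory['pears'] = 0           # or keep the key and record zero stock
print(len(inventory))            # number of key-value pairs: 4
```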
10.2 Dictionary methods
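A method is similar to a function: it takes arguments and returns a value, but the syntax is different. For example, the keys method returns a list of the keys that appear in a dictionary: >>> eng2sp.keys()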
This form of dot notation specifies the name of the function, keys, and the name of the object to apply the function to, eng2sp. The parentheses indicate that this method has no parameters. A method call is called an invocation; in this case, we would say that we are invoking keys on the object eng2sp. The values method is similar; it returns a list of the values in the dictionary: >>> eng2sp.values()
The items method returns both, in the form of a list of tuples, one for each key-value pair: >>> eng2sp.items()
The syntax provides useful type information. The square brackets indicate that this is a list. The parentheses indicate that the elements of the list are tuples. If a method takes an argument, it uses the same syntax as a function call. For example, the method has_key takes a key and returns true (1) if the key appears in the dictionary: >>> eng2sp.has_key('one')
If you try to call a method without specifying an object, you get an error. In this case, the error message is not very helpful: >>> has_key('one')
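For readers on a current interpreter, here is a hedged sketch of the same method calls in Python 3. Two differences are assumptions about your environment rather than part of the text above: keys, values, and items return views instead of lists, and has_key no longer exists, so the in operator is used instead.

```python
eng2sp = {'one': 'uno', 'two': 'dos', 'three': 'tres'}

# Wrap the views in list() to see their contents.
print(list(eng2sp.keys()))
print(list(eng2sp.values()))
print(list(eng2sp.items()))      # a list of (key, value) tuples

# The in operator plays the role of has_key.
print('one' in eng2sp)           # True
print('deux' in eng2sp)          # False
```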
10.3 Aliasing and copyingBecause dictionaries are mutable, you need to be aware of aliasing. Whenever two variables refer to the same object, changes to one affect the other. If you want to modify a dictionary and keep a copy of the original, use the copy method. For example, opposites is a dictionary that contains pairs of opposites: >>> opposites = {'up': 'down', 'right': 'wrong', 'true': 'false'}
alias and opposites refer to the same object; copy refers to a fresh copy of the same dictionary. If we modify alias, opposites is also changed: >>> alias['right'] = 'left'
If we modify copy, opposites is unchanged: >>> copy['right'] = 'privilege'
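The difference between an alias and a copy can be checked with a short sketch (Python 3 print syntax is assumed):

```python
opposites = {'up': 'down', 'right': 'wrong', 'true': 'false'}
alias = opposites             # a second name for the same object
copy = opposites.copy()       # a new dictionary with the same pairs

alias['right'] = 'left'
print(opposites['right'])     # left: the change shows through both names

copy['right'] = 'privilege'
print(opposites['right'])     # left: the copy is independent
```

The copy method makes a shallow copy, which is enough here because the values are immutable strings.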
10.4 Sparse matrices
In Section 8.14, we used a list of lists to represent a matrix. That is a good choice for a matrix with mostly nonzero values, but consider a sparse matrix like this one:
The list representation contains a lot of zeroes: matrix = [ [0,0,0,1,0],
An alternative is to use a dictionary. For the keys, we can use tuples that contain the row and column numbers. Here is the dictionary representation of the same matrix: matrix = {(0,3): 1, (2, 1): 2, (4, 3): 3}
We only need three key-value pairs, one for each nonzero element of the matrix. Each key is a tuple, and each value is an integer. To access an element of the matrix, we could use the [] operator: matrix[0,3]
Notice that the syntax for the dictionary representation is not the same as the syntax for the nested list representation. Instead of two integer indices, we use one index, which is a tuple of integers. There is one problem. If we specify an element that is zero, we get an error, because there is no entry in the dictionary with that key: >>> matrix[1,3]
The get method solves this problem: >>> matrix.get((0,3), 0)
The first argument is the key; the second argument is the value get should return if the key is not in the dictionary: >>> matrix.get((1,3), 0)
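The lookups above can be collected into one small sketch, assuming Python 3 print syntax:

```python
matrix = {(0, 3): 1, (2, 1): 2, (4, 3): 3}

print(matrix[0, 3])              # 1: the key is the tuple (0, 3)
print(matrix.get((0, 3), 0))     # 1: key present, stored value returned
print(matrix.get((1, 3), 0))     # 0: key absent, default returned

try:
    matrix[1, 3]                 # no entry for a zero element
except KeyError as err:
    print('KeyError:', err)
```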
get definitely improves the semantics of accessing a sparse matrix. Shame about the syntax. 10.5 HintsIf you played around with the fibonacci function from Section 5.7, you might have noticed that the bigger the argument you provide, the longer the function takes to run. Furthermore, the run time increases very quickly. On one of our machines, fibonacci(20) finishes instantly, fibonacci(30) takes about a second, and fibonacci(40) takes roughly forever. To understand why, consider this call graph for fibonacci with n=4:
A call graph shows a set of function frames, with lines connecting each frame to the frames of the functions it calls. At the top of the graph, fibonacci with n=4 calls fibonacci with n=3 and n=2. In turn, fibonacci with n=3 calls fibonacci with n=2 and n=1. And so on. Count how many times fibonacci(0) and fibonacci(1) are called. This is an inefficient solution to the problem, and it gets far worse as the argument gets bigger. A good solution is to keep track of values that have already been computed by storing them in a dictionary. A previously computed value that is stored for later use is called a hint. Here is an implementation of fibonacci using hints: previous = {0:1, 1:1}
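A minimal sketch of a hint-based fibonacci follows. It assumes Python 3 and uses the in operator for the dictionary lookup; the name newValue is chosen for illustration.

```python
previous = {0: 1, 1: 1}    # the Fibonacci numbers we already know


def fibonacci(n):
    if n in previous:                  # hint available: no recursion needed
        return previous[n]
    newValue = fibonacci(n - 1) + fibonacci(n - 2)
    previous[n] = newValue             # store the hint before returning
    return newValue


print(fibonacci(50))   # 20365011074
```

In Python 3 there is a single integer type, so the result prints without a trailing L.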
The dictionary named previous keeps track of the Fibonacci numbers we already know. We start with only two pairs: 0 maps to 1; and 1 maps to 1. Whenever fibonacci is called, it checks the dictionary to determine if it contains the result. If it's there, the function can return immediately without making any more recursive calls. If not, it has to compute the new value. The new value is added to the dictionary before the function returns. Using this version of fibonacci, our machines can compute fibonacci(40) in an eyeblink. But when we try to compute fibonacci(50), we see the following: >>> fibonacci(50)
The L at the end of the result indicates that the answer (20,365,011,074) is too big to fit into a Python integer. Python has automatically converted the result to a long integer.
10.6 Long integers
Python provides a type called long that can handle any size integer. There are two ways to create a long value. One is to write an integer with a capital L at the end: >>> type(1L)
The other is to use the long function to convert a value to a long. long can accept any numerical type and even strings of digits: >>> long(1)
All of the math operations work on longs, so in general any code that works with integers will also work with long integers. Any time the result of a computation is too big to be represented with an integer, Python detects the overflow and returns the result as a long integer. For example: >>> 1000 * 1000
In the first case the result has type int; in the second case it is long. 10.7 Counting lettersIn Chapter 7, we wrote a function that counted the number of occurrences of a letter in a string. A more general version of this problem is to form a histogram of the letters in the string, that is, how many times each letter appears. Such a histogram might be useful for compressing a text file. Because different letters appear with different frequencies, we can compress a file by using shorter codes for common letters and longer codes for letters that appear less frequently. Dictionaries provide an elegant way to generate a histogram: >>> letterCounts = {}
We start with an empty dictionary. For each letter in the string, we find the current count (possibly zero) and increment it. At the end, the dictionary contains pairs of letters and their frequencies. It might be more appealing to display the histogram in alphabetical order. We can do that with the items and sort methods: >>> letterItems = letterCounts.items()
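Here is a hedged sketch of the histogram and the sorted display. It assumes Python 3, where items returns a view that must be converted to a list before sorting; the sample string is chosen for illustration.

```python
letterCounts = {}
for letter in "Mississippi":           # sample string, an assumption
    # get returns the current count, or 0 if the letter is new
    letterCounts[letter] = letterCounts.get(letter, 0) + 1

letterItems = list(letterCounts.items())   # items() is a view in Python 3
letterItems.sort()                         # sort the (letter, count) tuples
print(letterItems)
```

Uppercase letters sort before lowercase ones, so 'M' comes first.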
You have seen the items method before, but sort is the first method you have encountered that applies to lists. There are several other list methods, including append, extend, and reverse. Consult the Python documentation for details.
10.8 Glossary
Appendix D Recommendations for further reading
So where do you go from here? There are many directions to pursue, extending your knowledge of Python specifically and computer science in general. The examples in this book have been deliberately simple, but they may not have shown off Python's most exciting capabilities. Here is a sampling of extensions to Python and suggestions for projects that use them.
Another popular platform is wxPython, which is essentially a Python veneer over wxWindows, a C++ package which in turn implements windows using native interfaces on Windows and Unix (including Linux) platforms. The windows and controls under wxPython tend to have a more native look and feel than those of Tkinter and are somewhat simpler to program. Any type of GUI programming will lead you into event-driven programming, where the user and not the programmer determines the flow of execution. This style of programming takes some getting used to, sometimes forcing you to rethink the whole structure of a program.
Python-related web sites and books
Here are the authors' recommendations for Python resources on the web:
And here are some books that contain more material on the Python language:
Recommended general computer science books
The following suggestions for further reading include many of the authors' favorite books. They deal with good programming practices and computer science in general.
Chapter 7 Strings
7.1 A compound data type
So far we have seen three types: int, float, and string. Strings are qualitatively different from the other two because they are made up of smaller pieces: characters. Types that comprise smaller pieces are called compound data types. Depending on what we are doing, we may want to treat a compound data type as a single thing, or we may want to access its parts. This ambiguity is useful. The bracket operator selects a single character from a string. >>> fruit = "banana"
The expression fruit[1] selects character number 1 from fruit. The variable letter refers to the result. When we display letter, we get a surprise: a
The first letter of "banana" is not a. Unless you are a computer scientist. In that case you should think of the expression in brackets as an offset from the beginning of the string, and the offset of the first letter is zero. So b is the 0th letter ("zero-eth") of "banana", a is the 1th letter ("one-eth"), and n is the 2th ("two-eth") letter. To get the first letter of a string, you just put 0, or any expression with the value 0, in the brackets: >>> letter = fruit[0]
The expression in brackets is called an index. An index specifies a member of an ordered set, in this case the set of characters in the string. The index indicates which one you want, hence the name. It can be any integer expression. 7.2 LengthThe len function returns the number of characters in a string: >>> fruit = "banana"
To get the last letter of a string, you might be tempted to try something like this: length = len(fruit)
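That won't work. It causes the runtime error IndexError: string index out of range. The reason is that there is no 6th letter in "banana". Since we started counting at zero, the six letters are numbered 0 to 5. To get the last character, we have to subtract 1 from length: length = len(fruit)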
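Both attempts can be compared in a short sketch, assuming Python 3 print syntax:

```python
fruit = "banana"
length = len(fruit)              # 6

try:
    last = fruit[length]         # there is no character at index 6
except IndexError as err:
    print("IndexError:", err)

last = fruit[length - 1]         # valid indices run from 0 to length-1
print(last)                      # a
```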
Alternatively, we can use negative indices, which count backward from the end of the string. The expression fruit[-1] yields the last letter, fruit[-2] yields the second to last, and so on. 7.3 Traversal and the for loopA lot of computations involve processing a string one character at a time. Often they start at the beginning, select each character in turn, do something to it, and continue until the end. This pattern of processing is called a traversal. One way to encode a traversal is with a while statement: index = 0
This loop traverses the string and displays each letter on a line by itself. The loop condition is index < len(fruit), so when index is equal to the length of the string, the condition is false, and the body of the loop is not executed. The last character accessed is the one with the index len(fruit)-1, which is the last character in the string. As an exercise, write a function that takes a string as an argument and outputs the letters backward, one per line.
Using an index to traverse a set of values is so common that Python provides an alternative, simpler syntax, the for loop: for char in fruit:
Each time through the loop, the next character in the string is assigned to the variable char. The loop continues until no characters are left. The following example shows how to use concatenation and a for loop to generate an abecedarian series. "Abecedarian" refers to a series or list in which the elements appear in alphabetical order. For example, in Robert McCloskey's book Make Way for Ducklings, the names of the ducklings are Jack, Kack, Lack, Mack, Nack, Ouack, Pack, and Quack. This loop outputs these names in order: prefixes = "JKLMNOPQ"
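A sketch of such a loop, assuming Python 3 print syntax and a variable named suffix that holds the shared ending "ack":

```python
prefixes = "JKLMNOPQ"
suffix = "ack"                   # the ending shared by all the names

for letter in prefixes:
    print(letter + suffix)       # concatenate one prefix letter with the suffix
```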
The output of this program is: Jack
Of course, that's not quite right because "Ouack" and "Quack" are misspelled. As an exercise, modify the program to fix this error. 7.4 String slicesA segment of a string is called a slice. Selecting a slice is similar to selecting a character: >>> s = "Peter, Paul, and Mary"
The operator [n:m] returns the part of the string from the "n-eth" character to the "m-eth" character, including the first but excluding the last. This behavior is counterintuitive; it makes more sense if you imagine the indices pointing between the characters, as in the following diagram:
If you omit the first index (before the colon), the slice starts at the beginning of the string. If you omit the second index, the slice goes to the end of the string. Thus: >>> fruit = "banana"
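The slice rules can be checked with the two strings used in this section (Python 3 print syntax is assumed):

```python
s = "Peter, Paul, and Mary"
print(s[0:5])       # Peter
print(s[7:11])      # Paul
print(s[17:21])     # Mary

fruit = "banana"
print(fruit[:3])    # ban  (from the start up to, but not including, index 3)
print(fruit[3:])    # ana  (from index 3 to the end)
```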
7.5 String comparison
The comparison operators work on strings. To see if two strings are equal: if word == "banana":
Other comparison operations are useful for putting words in alphabetical order: if word < "banana":
You should be aware, though, that Python does not handle upper- and lowercase letters the same way that people do. All the uppercase letters come before all the lowercase letters. As a result: Your word, Zebra, comes before banana.
A common way to address this problem is to convert strings to a standard format, such as all lowercase, before performing the comparison. A more difficult problem is making the program realize that zebras are not fruit. 7.6 Strings are immutableIt is tempting to use the [] operator on the left side of an assignment, with the intention of changing a character in a string. For example: greeting = "Hello, world!"
Instead of producing the output Jello, world!, this code produces the runtime error TypeError: object doesn't support item assignment.
Strings are immutable, which means you can't change an existing string. The best you can do is create a new string that is a variation on the original: greeting = "Hello, world!"
The solution here is to concatenate a new first letter onto a slice of greeting. This operation has no effect on the original string. 7.7 A find functionWhat does the following function do? def find(str, ch):
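A minimal sketch of a find function with this heading follows. The parameter name str comes from the heading; it shadows Python's built-in str inside the function, which is harmless here.

```python
def find(str, ch):
    index = 0
    while index < len(str):
        if str[index] == ch:
            return index         # found it: leave the loop immediately
        index = index + 1
    return -1                    # the loop finished without a match


print(find("banana", "n"))       # 2
print(find("banana", "z"))       # -1
```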
In a sense, find is the opposite of the [] operator. Instead of taking an index and extracting the corresponding character, it takes a character and finds the index where that character appears. If the character is not found, the function returns -1. This is the first example we have seen of a return statement inside a loop. If str[index] == ch, the function returns immediately, breaking out of the loop prematurely. If the character doesn't appear in the string, then the program exits the loop normally and returns -1. This pattern of computation is sometimes called a "eureka" traversal because as soon as we find what we are looking for, we can cry "Eureka!" and stop looking. As an exercise, modify the find function so that it has a third parameter, the index in the string where it should start looking. 7.8 Looping and countingThe following program counts the number of times the letter a appears in a string: fruit = "banana"
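A sketch of the counting loop, assuming Python 3 print syntax:

```python
fruit = "banana"
count = 0
for char in fruit:
    if char == 'a':
        count = count + 1        # one more a found
print(count)                     # 3
```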
This program demonstrates another pattern of computation called a counter. The variable count is initialized to 0 and then incremented each time an a is found. (To increment is to increase by one; it is the opposite of decrement, and unrelated to "excrement," which is a noun.) When the loop exits, count contains the result: the total number of a's. As an exercise, encapsulate this code in a function named countLetters, and generalize it so that it accepts the string and the letter as arguments. As a second exercise, rewrite this function so that instead of traversing the string, it uses the three-parameter version of find from the previous section.
7.9 The string module
The string module contains useful functions that manipulate strings. As usual, we have to import the module before we can use it: >>> import string
The string module includes a function named find that does the same thing as the function we wrote. To call it we have to specify the name of the module and the name of the function using dot notation. >>> fruit = "banana"
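This example demonstrates one of the benefits of modules: they help avoid collisions between the names of module functions and user-defined functions. By using dot notation we can specify which version of find we want. Actually, string.find is more general than our version. First, it can find substrings, not just characters: >>> string.find("banana", "na")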
Also, it takes an additional argument that specifies the index it should start at: >>> string.find("banana", "na", 3)
Or it can take two additional arguments that specify a range of indices: >>> string.find("bob", "b", 1, 2)
In this example, the search fails because the letter b does not appear in the index range from 1 to 2 (not including 2). 7.10 Character classificationIt is often helpful to examine a character and test whether it is upper- or lowercase, or whether it is a character or a digit. The string module provides several constants that are useful for these purposes. The string string.lowercase contains all of the letters that the system considers to be lowercase. Similarly, string.uppercase contains all of the uppercase letters. Try the following and see what you get: >>> print string.lowercase
We can use these constants and find to classify characters. For example, if find(lowercase, ch) returns a value other than -1, then ch must be lowercase: def isLower(ch):
Alternatively, we can take advantage of the in operator, which determines whether a character appears in a string: def isLower(ch):
As yet another alternative, we can use the comparison operator: def isLower(ch):
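The three approaches can be sketched side by side. This sketch assumes Python 3, where the constant is named string.ascii_lowercase and find is a string method rather than a module function. The three function names are hypothetical, chosen so that all versions can coexist, and each expects a single character.

```python
import string


def isLowerFind(ch):
    # search the lowercase letters; -1 means "not found"
    return string.ascii_lowercase.find(ch) != -1


def isLowerIn(ch):
    # the in operator tests membership directly
    return ch in string.ascii_lowercase


def isLowerCompare(ch):
    # lowercase letters fall between 'a' and 'z' in character order
    return 'a' <= ch <= 'z'


for ch in "qQ7":
    print(ch, isLowerFind(ch), isLowerIn(ch), isLowerCompare(ch))
```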
If ch is between a and z, it must be a lowercase letter. As an exercise, discuss which version of isLower you think will be fastest. Can you think of other reasons besides speed to prefer one or the other? Another constant defined in the string module may surprise you when you print it: >>> print string.whitespace
Whitespace characters move the cursor without printing anything. They create the white space between visible characters (at least on white paper). The constant string.whitespace contains all the whitespace characters, including space, tab (\t), and newline (\n). There are other useful functions in the string module, but this book isn't intended to be a reference manual. On the other hand, the Python Library Reference is. Along with a wealth of other documentation, it's available from the Python website, www.python.org.
7.11 Glossary
Chapter 20 Trees
Like linked lists, trees are made up of nodes. A common kind of tree is a binary tree, in which each node contains references to two other nodes (possibly null). These references are referred to as the left and right subtrees. Like list nodes, tree nodes also contain cargo. A state diagram for a tree looks like this:
To avoid cluttering up the picture, we often omit the Nones. The top of the tree (the node tree refers to) is called the root. In keeping with the tree metaphor, the other nodes are called branches and the nodes at the tips with null references are called leaves. It may seem odd that we draw the picture with the root at the top and the leaves at the bottom, but that is not the strangest thing.
To make things worse, computer scientists mix in another metaphor: the family tree. The top node is sometimes called a parent and the nodes it refers to are its children. Finally, there is a geometric vocabulary for talking about trees. We already mentioned left and right, but there is also "up" (toward the parent/root) and "down" (toward the children/leaves). Also, all of the nodes that are the same distance from the root comprise a level of the tree. We probably don't need three metaphors for talking about trees, but there they are. Like linked lists, trees are recursive data structures because they are defined recursively. A tree is either the empty tree, represented by None, or a node that contains cargo and references to two trees.
20.1 Building trees
The process of assembling a tree is similar to the process of assembling a linked list. Each constructor invocation builds a single node. class Tree:
The cargo can be any type, but the arguments for left and right should be tree nodes. left and right are optional; the default value is None. To print a node, we just print the cargo. One way to build a tree is from the bottom up. Allocate the child nodes first: left = Tree(2)
Then create the parent node and link it to the children: tree = Tree(1, left, right)
We can write this code more concisely by nesting constructor invocations: >>> tree = Tree(1, Tree(2), Tree(3))
Either way, the result is the tree at the beginning of the chapter.
20.2 Traversing trees
Any time you see a new data structure, your first question should be, "How do I traverse it?" The most natural way to traverse a tree is recursively. For example, if the tree contains integers as cargo, this function returns their sum: def total(tree):
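A minimal sketch of the node class from Section 20.1 together with a recursive total follows. It assumes Python 3; the attribute names cargo, left, and right follow the text, and the method bodies are our reconstruction rather than the original listing.

```python
class Tree:
    def __init__(self, cargo, left=None, right=None):
        self.cargo = cargo
        self.left = left          # left subtree, or None
        self.right = right        # right subtree, or None

    def __str__(self):
        return str(self.cargo)    # printing a node prints its cargo


def total(tree):
    if tree is None:              # the empty tree contributes nothing
        return 0
    return total(tree.left) + total(tree.right) + tree.cargo


tree = Tree(1, Tree(2), Tree(3))
print(total(tree))                # 6
```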
The base case is the empty tree, which contains no cargo, so the sum is 0. The recursive step makes two recursive calls to find the sum of the child trees. When the recursive calls complete, we add the cargo of the parent and return the total. 20.3 Expression treesA tree is a natural way to represent the structure of an expression. Unlike other notations, it can represent the computation unambiguously. For example, the infix expression 1 + 2 * 3 is ambiguous unless we know that the multiplication happens before the addition. This expression tree represents the same computation:
The nodes of an expression tree can be operands like 1 and 2 or operators like + and *. Operands are leaf nodes; operator nodes contain references to their operands. (All of these operators are binary, meaning they have exactly two operands.) We can build this tree like this: >>> tree = Tree('+', Tree(1), Tree('*', Tree(2), Tree(3)))
Looking at the figure, there is no question what the order of operations is; the multiplication happens first in order to compute the second operand of the addition. Expression trees have many uses. The example in this chapter uses trees to translate expressions to postfix, prefix, and infix. Similar trees are used inside compilers to parse, optimize, and translate programs. 20.4 Tree traversalWe can traverse an expression tree and print the contents like this: def printTree(tree):
In other words, to print a tree, first print the contents of the root, then print the entire left subtree, and then print the entire right subtree. This way of traversing a tree is called a preorder, because the contents of the root appear before the contents of the children. For the previous example, the output is: >>> tree = Tree('+', Tree(1), Tree('*', Tree(2), Tree(3)))
This format is different from both postfix and infix; it is another notation called prefix, in which the operators appear before their operands. You might suspect that if you traverse the tree in a different order, you will get the expression in a different notation. For example, if you print the subtrees first and then the root node, you get: def printTreePostorder(tree):
The result, 1 2 3 * +, is in postfix! This order of traversal is called postorder. Finally, to traverse a tree inorder, you print the left tree, then the root, and then the right tree: def printTreeInorder(tree):
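The three traversals differ only in where the root is printed relative to the two recursive calls. A hedged sketch, assuming Python 3 and a minimal Tree class like the one in Section 20.1:

```python
class Tree:
    def __init__(self, cargo, left=None, right=None):
        self.cargo = cargo
        self.left = left
        self.right = right


def printTree(tree):                  # preorder: root, left, right
    if tree is None:
        return
    print(tree.cargo, end=" ")
    printTree(tree.left)
    printTree(tree.right)


def printTreePostorder(tree):         # postorder: left, right, root
    if tree is None:
        return
    printTreePostorder(tree.left)
    printTreePostorder(tree.right)
    print(tree.cargo, end=" ")


def printTreeInorder(tree):           # inorder: left, root, right
    if tree is None:
        return
    printTreeInorder(tree.left)
    print(tree.cargo, end=" ")
    printTreeInorder(tree.right)


tree = Tree('+', Tree(1), Tree('*', Tree(2), Tree(3)))
printTree(tree)              # + 1 * 2 3
print()
printTreePostorder(tree)     # 1 2 3 * +
print()
printTreeInorder(tree)       # 1 + 2 * 3
print()
```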
The result is 1 + 2 * 3, which is the expression in infix. To be fair, we should point out that we have omitted an important complication. Sometimes when we write an expression in infix, we have to use parentheses to preserve the order of operations. So an inorder traversal is not quite sufficient to generate an infix expression. Nevertheless, with a few improvements, the expression tree and the three recursive traversals provide a general way to translate expressions from one format to another. As an exercise, modify printTreeInorder so that it puts parentheses around every operator and pair of operands. Is the output correct and unambiguous? Are the parentheses always necessary? If we do an inorder traversal and keep track of what level in the tree we are on, we can generate a graphical representation of a tree: def printTreeIndented(tree, level=0):
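A sketch of an indented printer with this heading follows, assuming Python 3. Visiting the right subtree before the left is an assumption on our part, chosen so that the output resembles the tree diagram when read sideways.

```python
class Tree:
    def __init__(self, cargo, left=None, right=None):
        self.cargo = cargo
        self.left = left
        self.right = right


def printTreeIndented(tree, level=0):
    if tree is None:
        return
    printTreeIndented(tree.right, level + 1)    # right subtree on top
    print("  " * level + str(tree.cargo))       # two spaces per level
    printTreeIndented(tree.left, level + 1)


tree = Tree('+', Tree(1), Tree('*', Tree(2), Tree(3)))
printTreeIndented(tree)
```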
The parameter level keeps track of where we are in the tree. By default, it is initially 0. Each time we make a recursive call, we pass level+1 because the child's level is always one greater than the parent's. Each item is indented by two spaces per level. The result for the example tree is: >>> printTreeIndented(tree)
If you look at the output sideways, you see a simplified version of the original figure. 20.5 Building an expression treeIn this section, we parse infix expressions and build the corresponding expression trees. For example, the expression (3+7)*9 yields the following tree:
Notice that we have simplified the diagram by leaving out the names of the attributes. The parser we will write handles expressions that include numbers, parentheses, and the operators + and *. We assume that the input string has already been tokenized into a Python list. The token list for (3+7)*9 is: ['(', 3, '+', 7, ')', '*', 9, 'end']
The end token is useful for preventing the parser from reading past the end of the list. As an exercise, write a function that takes an expression string and returns a token list. The first function we'll write is getToken, which takes a token list and an expected token as arguments. It compares the expected token to the first token on the list: if they match, it removes the token from the list and returns true; otherwise, it returns false: def getToken(tokenList, expected):
Since tokenList refers to a mutable object, the changes made here are visible to any other variable that refers to the same object. The next function, getNumber, handles operands. If the next token in tokenList is a number, getNumber removes it and returns a leaf node containing the number; otherwise, it returns None. def getNumber(tokenList):
Before continuing, we should test getNumber in isolation. We assign a list of numbers to tokenList, extract the first, print the result, and print what remains of the token list: >>> tokenList = [9, 11, 'end']
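Here is a minimal sketch of getToken and this first version of getNumber, assuming Python 3. Using isinstance to decide whether a token is a number is our assumption, and the Tree class is a minimal stand-in for the one in Section 20.1.

```python
class Tree:
    def __init__(self, cargo, left=None, right=None):
        self.cargo = cargo
        self.left = left
        self.right = right


def getToken(tokenList, expected):
    if tokenList[0] == expected:
        del tokenList[0]          # consume the token
        return True
    return False


def getNumber(tokenList):
    x = tokenList[0]
    if not isinstance(x, int):    # not a number: leave the list untouched
        return None
    del tokenList[0]
    return Tree(x, None, None)    # a leaf node holding the number


tokenList = [9, 11, 'end']
x = getNumber(tokenList)
print(x.cargo)                    # 9
print(tokenList)                  # [11, 'end']
```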
The next function we need is getProduct, which builds an expression tree for products. A simple product has two numbers as operands, like 3 * 7. Here is a version of getProduct that handles simple products. def getProduct(tokenList):
Assuming that getNumber succeeds and returns a singleton tree, we assign the first operand to a. If the next token is *, we get the second number and build an expression tree with a, b, and the operator. If the next token is anything else, then we just return the leaf node with a. Here are two examples: >>> tokenList = [9, '*', 11, 'end']
>>> tokenList = [9, '+', 11, 'end']
The second example implies that we consider a single operand to be a kind of product. This definition of "product" is counterintuitive, but it turns out to be useful.
Now we have to deal with compound products, like 3 * 5 *
With a small change in getProduct, we can handle an arbitrarily long product: def getProduct(tokenList):
In other words, a product can be either a singleton or a tree with * at the root, a number on the left, and a product on the right. This kind of recursive definition should be starting to feel familiar. Let's test the new version with a compound product: >>> tokenList = [2, '*', 3, '*', 5 , '*', 7, 'end']
Next we will add the ability to parse sums. Again, we use a slightly counterintuitive definition of "sum." For us, a sum can be a tree with + at the root, a product on the left, and a sum on the right. Or, a sum can be just a product. If you are willing to play along with this definition, it has a nice property: we can represent any expression (without parentheses) as a sum of products. This property is the basis of our parsing algorithm. getSum tries to build a tree with a product on the left and a sum on the right. But if it doesn't find a +, it just builds a product. def getSum(tokenList):
Let's test it with 9 * 11 + 5 * 7: >>> tokenList = [9, '*', 11, '+', 5, '*', 7, 'end']
We are almost done, but we still have to handle parentheses. Anywhere in an expression where there can be a number, there can also be an entire sum enclosed in parentheses. We just need to modify getNumber to handle subexpressions: def getNumber(tokenList):
Let's test this code with 9 * (11 + 5) * 7: >>> tokenList = [9, '*', '(', 11, '+', 5, ')', '*', 7, 'end']
The parser handled the parentheses correctly; the addition happens before the multiplication. In the final version of the program, it would be a good idea to give getNumber a name more descriptive of its new role.
20.6 Handling errors
Throughout the parser, we've been assuming that expressions are well-formed. For example, when we reach the end of a subexpression, we assume that the next character is a close parenthesis. If there is an error and the next character is something else, we should deal with it. def getNumber(tokenList):
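Here is one way the whole parser might look once getNumber accepts parenthesized sums and reports a missing close parenthesis. This is a sketch under assumptions: Tree and getToken are minimal stand-ins for the versions defined earlier in the chapter, and the error message text is illustrative.

```python
class Tree:
    # Assumed minimal node type: cargo plus optional left and right children.
    def __init__(self, cargo, left=None, right=None):
        self.cargo, self.left, self.right = cargo, left, right

def getToken(tokenList, expected):
    # Assumed helper: consume the first token only if it matches `expected`.
    if tokenList[0] == expected:
        del tokenList[0]
        return True
    return False

def getNumber(tokenList):
    if getToken(tokenList, '('):
        x = getSum(tokenList)              # parse the subexpression
        if not getToken(tokenList, ')'):
            raise ValueError('missing parenthesis')
        return x
    x = tokenList[0]
    if not isinstance(x, int):
        return None
    del tokenList[0]
    return Tree(x)

def getProduct(tokenList):
    a = getNumber(tokenList)
    if getToken(tokenList, '*'):
        return Tree('*', a, getProduct(tokenList))
    return a

def getSum(tokenList):
    a = getProduct(tokenList)
    if getToken(tokenList, '+'):
        return Tree('+', a, getSum(tokenList))
    return a

# Well-formed input: 9 * (11 + 5) * 7
tree = getSum([9, '*', '(', 11, '+', 5, ')', '*', 7, 'end'])

# Malformed input: the close parenthesis is missing.
try:
    getSum([9, '*', '(', 11, '+', 5, '*', 7, 'end'])
    message = None
except ValueError as err:
    message = str(err)
```

The caller decides what to do with the exception; here it simply records the message.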
The raise statement raises an exception, in this case a ValueError. If the function that called getNumber, or one of the other functions in the traceback, handles the exception, then the program can continue. Otherwise, Python will print an error message and quit. As an exercise, find other places in these functions where errors can occur and add appropriate raise statements. Test your code with improperly formed expressions.
20.7 The animal tree
In this section, we develop a small program that uses a tree to represent a knowledge base. The program interacts with the user to create a tree of questions and animal names. Here is a sample run: Are you thinking of an animal? y
Here is the tree this dialog builds:
At the beginning of each round, the program starts at the top of the tree and asks the first question. Depending on the answer, it moves to the left or right child and continues until it gets to a leaf node. At that point, it makes a guess. If the guess is not correct, it asks the user for the name of the new animal and a question that distinguishes the (bad) guess from the new animal. Then it adds a node to the tree with the new question and the new animal. Here is the code: def animal():
The function yes is a helper; it prints a prompt and then takes input from the user. If the response begins with y or Y, the function returns true: def yes(ques):
The condition of the outer loop is True, which means the loop continues until the break statement executes, which happens when the user is not thinking of an animal. The inner while loop walks the tree from top to bottom, guided by the user's responses. When a new node is added to the tree, the new question replaces the cargo, and the two children are the new animal and the original cargo. One shortcoming of the program is that when it exits, it forgets everything you carefully taught it! As an exercise, think of various ways you might save the knowledge tree in a file. Implement the one you think is easiest.
20.8 Glossary
Chapter 14
Classes and methods
14.1 Object-oriented features
Python is an object-oriented programming language, which means that it provides features that support object-oriented programming. It is not easy to define object-oriented programming, but we have already seen some of its characteristics:
For example, the Time class defined in Chapter 13 corresponds to the way people record the time of day, and the functions we defined correspond to the kinds of things people do with times. Similarly, the Point and Rectangle classes correspond to the mathematical concepts of a point and a rectangle. So far, we have not taken advantage of the features Python provides to support object-oriented programming. Strictly speaking, these features are not necessary. For the most part, they provide an alternative syntax for things we have already done, but in many cases, the alternative is more concise and more accurately conveys the structure of the program. For example, in the Time program, there is no obvious connection between the class definition and the function definitions that follow. With some examination, it is apparent that every function takes at least one Time object as an argument. This observation is the motivation for methods. We have already seen some methods, such as keys and values, which were invoked on dictionaries. Each method is associated with a class and is intended to be invoked on instances of that class. Methods are just like functions, with two differences:
In the next few sections, we will take the functions from the previous two chapters and transform them into methods. This transformation is purely mechanical; you can do it simply by following a sequence of steps. If you are comfortable converting from one form to another, you will be able to choose the best form for whatever you are doing.
14.2 printTime
In Chapter 13, we defined a class named Time and you wrote a function named printTime, which should have looked something like this: class Time:
To call this function, we passed a Time object as an argument: >>> currentTime = Time()
To make printTime a method, all we have to do is move the function definition inside the class definition. Notice the change in indentation. class Time:
Now we can invoke printTime using dot notation. >>> currentTime.printTime()
As usual, the object on which the method is invoked appears before the dot and the name of the method appears after the dot. The object on which the method is invoked is assigned to the first parameter, so in this case currentTime is assigned to the parameter time. By convention, the first parameter of a method is called self. The reason for this is a little convoluted, but it is based on a useful metaphor. The syntax for a function call, printTime(currentTime), suggests that the function is the active agent. It says something like, "Hey printTime! Here's an object for you to print." In object-oriented programming, the objects are the active agents. An invocation like currentTime.printTime() says "Hey currentTime! Please print yourself!" This change in perspective might be more polite, but it is not obvious that it is useful. In the examples we have seen so far, it may not be. But sometimes shifting responsibility from the functions onto the objects makes it possible to write more versatile functions, and makes it easier to maintain and reuse code.
14.3 Another example
Let's convert increment (from Section 13.3) to a method. To save space, we will leave out previously defined methods, but you should keep them in your version: class Time:
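The transformation is purely mechanical: we move the function definition into the class definition and change the name of the first parameter to self. Now we can invoke increment as a method. currentTime.increment(500)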
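As a sketch of what the converted method might look like: the body of increment from Section 13.3 is not reproduced in this chapter, so the carry logic below is an assumption about a typical version, and the attributes are assigned by hand because the initialization method has not been introduced yet.

```python
class Time:
    def increment(self, seconds):
        # Add the seconds, then carry overflow into minutes and hours.
        self.seconds = seconds + self.seconds
        while self.seconds >= 60:
            self.seconds = self.seconds - 60
            self.minutes = self.minutes + 1
        while self.minutes >= 60:
            self.minutes = self.minutes - 60
            self.hours = self.hours + 1

currentTime = Time()
currentTime.hours = 9
currentTime.minutes = 14
currentTime.seconds = 30
currentTime.increment(500)   # currentTime is passed as self, 500 as seconds
```

After the call, currentTime represents 9:22:50.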
Again, the object on which the method is invoked gets assigned to the first parameter, self. The second parameter, seconds, gets the value 500. As an exercise, convert convertToSeconds (from Section 13.5) to a method in the Time class.
14.4 A more complicated example
The after function is slightly more complicated because it operates on two Time objects, not just one. We can only convert one of the parameters to self; the other stays the same: class Time:
We invoke this method on one object and pass the other as an argument: if doneTime.after(currentTime):
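A minimal sketch of such a two-object method follows. Comparing tuples is only one possible way to write the body, not necessarily the version from Chapter 13, and makeTime is a hypothetical helper used here to build test objects.

```python
class Time:
    def after(self, time2):
        # Tuples compare element by element: hours, then minutes, then seconds.
        mine = (self.hours, self.minutes, self.seconds)
        theirs = (time2.hours, time2.minutes, time2.seconds)
        return mine > theirs

def makeTime(hours, minutes, seconds):
    # Hypothetical helper: build a Time by assigning its attributes.
    t = Time()
    t.hours, t.minutes, t.seconds = hours, minutes, seconds
    return t

doneTime = makeTime(11, 0, 0)
currentTime = makeTime(9, 14, 30)
isDone = doneTime.after(currentTime)      # doneTime is self, currentTime is time2
isEarly = currentTime.after(doneTime)
```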
You can almost read the invocation like English: "If the done-time is after the current-time, then..."
14.5 Optional arguments
We have seen built-in functions that take a variable number of arguments. For example, string.find can take two, three, or four arguments. It is possible to write user-defined functions with optional argument lists. For example, we can upgrade our own version of find to do the same thing as string.find. This is the original version from Section 7.7: def find(str, ch):
This is the new and improved version: def find(str, ch, start=0):
The third parameter, start, is optional because a default value, 0, is provided. If we invoke find with only two arguments, it uses the default value and starts from the beginning of the string: >>> find("apple", "p")
If we provide a third argument, it overrides the default: >>> find("apple", "p", 2)
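The improved find might be written like this. The loop body is an assumption based on the description of the function (the original listing from Section 7.7 is not reproduced here); the parameter names follow the text.

```python
def find(str, ch, start=0):
    # `str` shadows the built-in only inside this function, as in the text.
    index = start
    while index < len(str):
        if str[index] == ch:
            return index
        index = index + 1
    return -1          # not found

first = find("apple", "p")        # default start of 0
second = find("apple", "p", 2)    # start looking at index 2
missing = find("apple", "p", 3)   # no "p" at or after index 3
```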
As an exercise, add a fourth parameter, end, that specifies where to stop looking.
14.6 The initialization method
The initialization method is a special method that is invoked when an object is created. The name of this method is __init__ (two underscore characters, followed by init, and then two more underscores). An initialization method for the Time class looks like this: class Time:
There is no conflict between the attribute self.hours and the parameter hours. Dot notation specifies which variable we are referring to. When we invoke the Time constructor, the arguments we provide are passed along to __init__: >>> currentTime = Time(9, 14, 30)
Because the arguments are optional, we can omit them: >>> currentTime = Time()
Or provide only the first: >>> currentTime = Time(9)
Or the first two: >>> currentTime = Time(9, 14)
Finally, we can make assignments to a subset of the parameters by naming them explicitly: >>> currentTime = Time(seconds = 30, hours = 9)
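The calls above can be tried against a sketch like the following, which assumes that each parameter of __init__ defaults to 0.

```python
class Time:
    def __init__(self, hours=0, minutes=0, seconds=0):
        # self.hours is the attribute; hours is the parameter.
        self.hours = hours
        self.minutes = minutes
        self.seconds = seconds

full = Time(9, 14, 30)
empty = Time()                       # all three defaults are used
firstOnly = Time(9)
named = Time(seconds=30, hours=9)    # minutes keeps its default
```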
14.7 Points revisited
Let's rewrite the Point class from Section 12.1 in a more object-oriented style: class Point:
The initialization method takes x and y values as optional parameters; the default for either parameter is 0. The next method, __str__, returns a string representation of a Point object. If a class provides a method named __str__, it overrides the default behavior of the Python built-in str function. >>> p = Point(3, 4)
Printing a Point object implicitly invokes __str__ on the object, so defining __str__ also changes the behavior of print: >>> p = Point(3, 4)
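A minimal sketch of a Point class with both methods; the exact string format is an assumption chosen to look like a coordinate pair.

```python
class Point:
    def __init__(self, x=0, y=0):
        self.x = x
        self.y = y

    def __str__(self):
        # Called by str() and, implicitly, when the object is printed.
        return '(' + str(self.x) + ', ' + str(self.y) + ')'

p = Point(3, 4)
text = str(p)
origin = str(Point())
```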
When we write a new class, we almost always start by writing __init__, which makes it easier to instantiate objects, and __str__, which is almost always useful for debugging.
14.8 Operator overloading
Some languages make it possible to change the definition of the built-in operators when they are applied to user-defined types. This feature is called operator overloading. It is especially useful when defining new mathematical types. For example, to override the addition operator +, we provide a method named __add__: class Point:
As usual, the first parameter is the object on which the method is invoked. The second parameter is conveniently named other to distinguish it from self. To add two Points, we create and return a new Point that contains the sum of the x coordinates and the sum of the y coordinates. Now, when we apply the + operator to Point objects, Python invokes __add__: >>> p1 = Point(3, 4)
The expression p1 + p2 is equivalent to p1.__add__(p2), but obviously more elegant. As an exercise, add a method __sub__(self, other) that overloads the subtraction operator, and try it out. There are several ways to override the behavior of the multiplication operator: by defining a method named __mul__, or __rmul__, or both. If the left operand of * is a Point, Python invokes __mul__, which assumes that the other operand is also a Point. It computes the dot product of the two points, defined according to the rules of linear algebra: def __mul__(self, other):
If the left operand of * is a primitive type and the right operand is a Point, Python invokes __rmul__, which performs scalar multiplication: def __rmul__(self, other):
The result is a new Point whose coordinates are a multiple of the original coordinates. If other is a type that cannot be multiplied by a floating-point number, then __rmul__ will yield an error. This example demonstrates both kinds of multiplication: >>> p1 = Point(3, 4)
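Putting the three operator methods from this section together gives a sketch like the one below. The method bodies are reconstructed from the descriptions in the text (coordinate-wise sum, dot product, scalar multiple), so treat them as one plausible version.

```python
class Point:
    def __init__(self, x=0, y=0):
        self.x = x
        self.y = y

    def __add__(self, other):
        # Point + Point: add the coordinates pairwise.
        return Point(self.x + other.x, self.y + other.y)

    def __mul__(self, other):
        # Point * Point: dot product, assuming other is also a Point.
        return self.x * other.x + self.y * other.y

    def __rmul__(self, other):
        # number * Point: scalar multiplication.
        return Point(other * self.x, other * self.y)

p1 = Point(3, 4)
p2 = Point(5, 7)
total = p1 + p2      # invokes p1.__add__(p2)
dot = p1 * p2        # invokes p1.__mul__(p2)
scaled = 2 * p2      # invokes p2.__rmul__(2)

failure = None
try:
    p2 * 2           # invokes p2.__mul__(2), which looks for an x attribute on 2
except AttributeError as err:
    failure = str(err)
```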
What happens if we try to evaluate p2 * 2? Since the first operand is a Point, Python invokes __mul__ with 2 as the second argument. Inside __mul__, the program tries to access the x coordinate of other, which fails because an integer has no attribute named x: >>> print p2 * 2
Unfortunately, the error message is a bit opaque. This example demonstrates some of the difficulties of object-oriented programming. Sometimes it is hard enough just to figure out what code is running. For a more complete example of operator overloading, see Appendix B.
14.9 Polymorphism
Most of the methods we have written only work for a specific type. When you create a new object, you write methods that operate on that type. But there are certain operations that you will want to apply to many types, such as the arithmetic operations in the previous sections. If many types support the same set of operations, you can write functions that work on any of those types. For example, the multadd operation (which is common in linear algebra) takes three arguments; it multiplies the first two and then adds the third. We can write it in Python like this: def multadd (x, y, z):
This function will work for any values of x and y that can be multiplied and for any value of z that can be added to the product. We can invoke it with numeric values: >>> multadd (3, 2, 1)
Or with Points: >>> p1 = Point(3, 4)
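To see multadd work on both numbers and Points, here is a self-contained sketch. The Point class repeats the operator methods from Section 14.8 as reconstructed from their descriptions, so its details are assumptions.

```python
class Point:
    def __init__(self, x=0, y=0):
        self.x, self.y = x, y

    def __add__(self, other):
        return Point(self.x + other.x, self.y + other.y)

    def __mul__(self, other):
        return self.x * other.x + self.y * other.y    # dot product

    def __rmul__(self, other):
        return Point(other * self.x, other * self.y)  # scalar multiple

def multadd(x, y, z):
    # Works for any types that support * and + in this combination.
    return x * y + z

p1 = Point(3, 4)
p2 = Point(5, 7)
numeric = multadd(3, 2, 1)       # plain arithmetic
scaledSum = multadd(2, p1, p2)   # scalar * Point, then Point + Point
dotPlus = multadd(p1, p2, 1)     # dot product, then number + number
```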
In the first case, the Point is multiplied by a scalar and then added to another Point. In the second case, the dot product yields a numeric value, so the third argument also has to be a numeric value. A function like this that can take arguments with different types is called polymorphic. As another example, consider the function frontAndBack, which prints a list twice, forward and backward: def frontAndBack(front):
Because the reverse method is a modifier, we make a copy of the list before reversing it. That way, this function doesn't modify the list it gets as an argument. Here's an example that applies frontAndBack to a list: >>> myList = [1, 2, 3, 4]
Of course, we intended to apply this function to lists, so it is not surprising that it works. What would be surprising is if we could apply it to a Point. To determine whether a function can be applied to a new type, we apply the fundamental rule of polymorphism: If all of the operations inside the function can be applied to the type, the function can be applied to the type. The operations in the function include copy, reverse, and print. copy works on any object, and we have already written a __str__ method for Points, so all we need is a reverse method in the Point class: def reverse(self):
Then we can pass Points to frontAndBack: >>> p = Point(3, 4)
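The following sketch shows frontAndBack applied to both a list and a Point. It assumes that reversing a Point means swapping its coordinates and that the copy is made with copy.copy; print is called as a function so the code also runs under Python 3.

```python
import copy

class Point:
    def __init__(self, x=0, y=0):
        self.x, self.y = x, y

    def __str__(self):
        return '(' + str(self.x) + ', ' + str(self.y) + ')'

    def reverse(self):
        # Assumed meaning of "reverse" for a Point: swap the coordinates.
        self.x, self.y = self.y, self.x

def frontAndBack(front):
    back = copy.copy(front)   # copy first, because reverse is a modifier
    back.reverse()
    print(str(front) + str(back))

myList = [1, 2, 3, 4]
frontAndBack(myList)          # prints [1, 2, 3, 4][4, 3, 2, 1]

p = Point(3, 4)
frontAndBack(p)               # prints (3, 4)(4, 3)
```

Neither argument is modified, because only the copy is reversed.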
The best kind of polymorphism is the unintentional kind, where you discover that a function you have already written can be applied to a type for which you never planned.
14.10 Glossary
Chapter 8
Lists
A list is an ordered set of values, where each value is
identified by an index. The values that make up a list are
called its elements. Lists are similar to strings, which are
ordered sets of characters, except that the elements of a list can
have any type. Lists and strings, and other things that behave like ordered sets, are called sequences.
8.1 List values
There are several ways to create a new list; the simplest is to enclose the elements in square brackets ([ and ]): [10, 20, 30, 40]
The first example is a list of four integers. The second is a list of three strings. The elements of a list don't have to be the same type. The following list contains a string, a float, an integer, and (mirabile dictu) another list: ["hello", 2.0, 5, [10, 20]]
A list within another list is said to be nested. Lists that contain consecutive integers are common, so Python provides a simple way to create them: >>> range(1,5)
The range function takes two arguments and returns a list that contains all the integers from the first to the second, including the first but not including the second! There are two other forms of range. With a single argument, it creates a list that starts at 0: >>> range(10)
If there is a third argument, it specifies the space between successive values, which is called the step size. This example counts from 1 to 10 by steps of 2: >>> range(1, 10, 2)
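The three forms of range can be checked with the sketch below. In the Python 2 used by this book, range returns a list directly; under Python 3 it returns a lazy range object, so the example wraps each call in list() to get the same values in either version.

```python
# Two arguments: from the first up to, but not including, the second.
firstToFour = list(range(1, 5))

# One argument: start at 0.
zeroToNine = list(range(10))

# Three arguments: the third is the step size.
oddNumbers = list(range(1, 10, 2))
```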
Finally, there is a special list that contains no elements. It is called the empty list, and it is denoted []. With all these ways to create lists, it would be disappointing if we couldn't assign list values to variables or pass lists as arguments to functions. We can. vocabulary = ["ameliorate", "castigate", "defenestrate"]
8.2 Accessing elements
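The syntax for accessing the elements of a list is the same as the
syntax for accessing the characters of a string: the bracket operator ([]). The expression inside the brackets specifies the index. Remember that the indices start at 0: print numbers[0]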
The bracket operator can appear anywhere in an expression. When it appears on the left side of an assignment, it changes one of the elements in the list, so the one-eth element of numbers, which used to be 123, is now 5. Any integer expression can be used as an index: >>> numbers[3-2]
If you try to read or write an element that does not exist, you get a runtime error: >>> numbers[2] = 5
If an index has a negative value, it counts backward from the end of the list: >>> numbers[-1]
numbers[-1] is the last element of the list, numbers[-2] is the second to last, and numbers[-3] doesn't exist. It is common to use a loop variable as a list index. horsemen = ["war", "famine", "pestilence", "death"]
This while loop counts from 0 to 4. When the loop variable i is 4, the condition fails and the loop terminates. So the body of the loop is only executed when i is 0, 1, 2, and 3. Each time through the loop, the variable i is used as an index into the list, printing the i-eth element. This pattern of computation is called a list traversal.
8.3 List length
The function len returns the length of a list. It is a good idea to use this value as the upper bound of a loop instead of a constant. That way, if the size of the list changes, you won't have to go through the program changing all the loops; they will work correctly for any size list: horsemen = ["war", "famine", "pestilence", "death"]
The last time the body of the loop is executed, i is len(horsemen) - 1, which is the index of the last element. When i is equal to len(horsemen), the condition fails and the body is not executed, which is a good thing, because len(horsemen) is not a legal index. Although a list can contain another list, the nested list still counts as a single element. The length of this list is four: ['spam!', 1, ['Brie', 'Roquefort', 'Pol le Veq'], [1, 2, 3]]
As an exercise, write a loop that traverses the previous list and prints the length of each element. What happens if you send an integer to len?
8.4 List membership
in is a boolean operator that tests membership in a sequence. We used it in Section 7.10 with strings, but it also works with lists and other sequences: >>> horsemen = ['war', 'famine', 'pestilence', 'death']
Since "pestilence" is a member of the horsemen list, the in operator returns true. Since "debauchery" is not in the list, in returns false. We can use the not in combination with in to test whether an element is not a member of a list: >>> 'debauchery' not in horsemen
8.5 Lists and for loops
The for loop we saw in Section 7.3 also works with lists. The generalized syntax of a for loop is: for VARIABLE in LIST:
This statement is equivalent to: i = 0
The for loop is more concise because we can eliminate the loop variable, i. Here is the previous loop written with a for loop. for horseman in horsemen:
It almost reads like English: "For (every) horseman in (the list of) horsemen, print (the name of the) horseman." Any list expression can be used in a for loop: for number in range(20):
The first example prints all the even numbers between zero and nineteen. The second example expresses enthusiasm for various fruits.
8.6 List operations
The + operator concatenates lists: >>> a = [1, 2, 3]
Similarly, the * operator repeats a list a given number of times: >>> [0] * 4
The first example repeats [0] four times. The second example repeats the list [1, 2, 3] three times.
8.7 List slices
The slice operations we saw in Section 7.4 also work on lists: >>> list = ['a', 'b', 'c', 'd', 'e', 'f']
If you omit the first index, the slice starts at the beginning. If you omit the second, the slice goes to the end. So if you omit both, the slice is really a copy of the whole list. >>> list[:]
8.8 Lists are mutable
Unlike strings, lists are mutable, which means we can change their elements. Using the bracket operator on the left side of an assignment, we can update one of the elements: >>> fruit = ["banana", "apple", "quince"]
With the slice operator we can update several elements at once: >>> list = ['a', 'b', 'c', 'd', 'e', 'f']
We can also remove elements from a list by assigning the empty list to them: >>> list = ['a', 'b', 'c', 'd', 'e', 'f']
And we can add elements to a list by squeezing them into an empty slice at the desired location: >>> list = ['a', 'd', 'f']
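The three uses of slice assignment described in this section can be sketched as follows. The replacement values 'x' and 'y' are illustrative, and the variable is named letters rather than list to avoid hiding the built-in list type.

```python
# Update several elements at once.
letters = ['a', 'b', 'c', 'd', 'e', 'f']
letters[1:3] = ['x', 'y']
updated = letters[:]

# Remove elements by assigning the empty list to a slice.
letters = ['a', 'b', 'c', 'd', 'e', 'f']
letters[1:3] = []
shortened = letters[:]

# Insert elements by assigning to an empty slice.
letters = ['a', 'd', 'f']
letters[1:1] = ['b', 'c']
letters[4:4] = ['e']
```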
8.9 List deletion
Using slices to delete list elements can be awkward, and therefore error-prone. Python provides an alternative that is more readable. del removes an element from a list: >>> a = ['one', 'two', 'three']
As you might expect, del handles negative indices and causes a runtime error if the index is out of range. You can use a slice as an index for del: >>> list = ['a', 'b', 'c', 'd', 'e', 'f']
As usual, slices select all the elements up to, but not including, the second index.
8.10 Objects and values
If we execute these assignment statements, a = "banana"
we know that a and b will refer to a string with the letters "banana". But we can't tell whether they point to the same string. There are two possible states:
In one case, a and b refer to two different things that
have the same value. In the second case, they refer to the same
thing. These "things" have names: they are called objects. Every object has a unique identifier, which we can obtain with the id function. By printing the identifier of a and b, we can tell whether they refer to the same object. >>> id(a)
In fact, we get the same identifier twice, which means that Python only created one string, and both a and b refer to it. Interestingly, lists behave differently. When we create two lists, we get two objects: >>> a = [1, 2, 3]
So the state diagram looks like this:
a and b have the same value but do not refer to the same object.
8.11 Aliasing
Since variables refer to objects, if we assign one variable to another, both variables refer to the same object: >>> a = [1, 2, 3]
In this case, the state diagram looks like this:
Because the same list has two different names, a and b, we say that it is aliased. Changes made with one alias affect the other: >>> b[0] = 5
Although this behavior can be useful, it is sometimes unexpected or undesirable. In general, it is safer to avoid aliasing when you are working with mutable objects. Of course, for immutable objects, there's no problem. That's why Python is free to alias strings when it sees an opportunity to economize.
8.12 Cloning lists
If we want to modify a list and also keep a copy of the original, we need to be able to make a copy of the list itself, not just the reference. This process is sometimes called cloning, to avoid the ambiguity of the word "copy." The easiest way to clone a list is to use the slice operator: >>> a = [1, 2, 3]
Taking any slice of a creates a new list. In this case the slice happens to consist of the whole list. Now we are free to make changes to b without worrying about a: >>> b[0] = 5
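The difference between an alias and a clone can be checked directly with the is operator, which tests whether two names refer to the same object. The variable names alias and clone are illustrative.

```python
a = [1, 2, 3]

alias = a          # a second name for the same list
clone = a[:]       # a new list with the same elements

alias[0] = 5       # visible through a, because alias and a are one object
clone[1] = 9       # affects only the clone

sameObject = alias is a
differentObject = clone is not a
```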
As an exercise, draw a state diagram for a and b before and after this change.
8.13 List parameters
Passing a list as an argument actually passes a reference to the list, not a copy of the list. For example, the function head takes a list as an argument and returns the first element: def head(list):
Here's how it is used: >>> numbers = [1, 2, 3]
The parameter list and the variable numbers are aliases for the same object. The state diagram looks like this:
Since the list object is shared by two frames, we drew it between them. If a function modifies a list parameter, the caller sees the change. For example, deleteHead removes the first element from a list: def deleteHead(list):
Here's how deleteHead is used: >>> numbers = [1, 2, 3]
If a function returns a list, it returns a reference to the list. For example, tail returns a list that contains all but the first element of the given list: def tail(list):
Here's how tail is used: >>> numbers = [1, 2, 3]
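The three functions in this section are short enough to sketch in full. The bodies are inferred from the descriptions, and the parameter is named list to match the text, which hides the built-in list type only inside each function.

```python
def head(list):
    return list[0]      # reads the shared list, changes nothing

def deleteHead(list):
    del list[0]         # modifies the caller's list

def tail(list):
    return list[1:]     # the slice is a new list

numbers = [1, 2, 3]
first = head(numbers)

deleteHead(numbers)
afterDelete = numbers[:]      # snapshot of the modified list

numbers = [1, 2, 3]
rest = tail(numbers)
rest[0] = 99                  # changing rest does not touch numbers
```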
Because the return value was created with the slice operator, it is a new list. Creating rest, and any subsequent changes to rest, have no effect on numbers.
8.14 Nested lists
A nested list is a list that appears as an element in another list. In this list, the three-eth element is a nested list: >>> list = ["hello", 2.0, 5, [10, 20]]
If we print list[3], we get [10, 20]. To extract an element from the nested list, we can proceed in two steps: >>> elt = list[3]
Or we can combine them: >>> list[3][1]
Bracket operators evaluate from left to right, so this expression gets the three-eth element of list and extracts the one-eth element from it.
8.15 Matrices
Nested lists are often used to represent matrices. For example, the matrix:
might be represented as: >>> matrix = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
matrix is a list with three elements, where each element is a row of the matrix. We can select an entire row from the matrix in the usual way: >>> matrix[1]
Or we can extract a single element from the matrix using the double-index form: >>> matrix[1][1]
The first index selects the row, and the second index selects the column. Although this way of representing matrices is common, it is not the only possibility. A small variation is to use a list of columns instead of a list of rows. Later we will see a more radical alternative using a dictionary.
8.16 Strings and lists
Two of the most useful functions in the string module involve lists of strings. The split function breaks a string into a list of words. By default, any number of whitespace characters is considered a word boundary: >>> import string
An optional argument called a delimiter can be used to specify which characters to use as word boundaries. The following example uses the string ai as the delimiter: >>> string.split(song, 'ai')
Notice that the delimiter doesn't appear in the list. The join function is the inverse of split. It takes a list of strings and concatenates the elements with a space between each pair: >>> list = ['The', 'rain', 'in', 'Spain...']
Like split, join takes an optional delimiter that is inserted between elements: >>> string.join(list, '_')
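The text uses the split and join functions of the Python 2 string module, which were removed from that module in Python 3. The sketch below uses the equivalent string methods instead, which are available in both versions; the value of song is an assumption based on the list shown above.

```python
song = "The rain in Spain..."

words = song.split()            # split on whitespace
pieces = song.split('ai')       # split on the delimiter 'ai'

spaced = ' '.join(words)        # join with a space between elements
underscored = '_'.join(words)   # join with an explicit delimiter
```

Note that with the method form of join, the delimiter is the object on which the method is invoked, and the list is the argument.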
As an exercise, describe the relationship between string.join(string.split(song)) and song. Are they the same for all strings? When would they be different?
8.17 Glossary