Chapter 5  Functions

5.1  Name Collisions

Remember that all of your scripts run in the same workspace, so if one script changes the value of a variable, all your other scripts see the change. With a small number of simple scripts, that’s not a problem, but eventually the interactions between scripts become unmanageable.

For example, the following (increasingly familiar) script computes the sum of the first n terms in a geometric sequence, but it also has the side-effect of assigning values to A1, total, i and a.

A1 = 1;
total = 0;
for i=1:10
    a = A1 * 0.5^(i-1);
    total = total + a;
end
ans = total

If you were using any of those variable names before calling this script, you might be surprised to find, after running the script, that their values had changed. If you have two scripts that use the same variable names, you might find that they work separately and then break when you try to combine them. This kind of interaction is called a name collision.
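To make the collision concrete, here is a minimal sketch. The first assignment stands in for work you did earlier; the value 250 is made up for illustration. The rest is the body of the script above, unchanged.

```matlab
% Hypothetical earlier work: total holds a value you care about.
total = 250;

% Body of the geometric-series script, pasted in unchanged.
A1 = 1;
total = 0;
for i = 1:10
    a = A1 * 0.5^(i-1);
    total = total + a;
end

% The script silently overwrote total; this displays 1.9980, not 250.
disp(total)
```

Nothing warns you that the earlier value of total was lost, which is why this kind of bug can be hard to notice.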

As the number of scripts you write increases, and they get longer and more complex, name collisions become more of a problem. Avoiding this problem is one of the motivations for functions.

5.2  Functions

A function is like a script, except

  • Each function has its own workspace, so any variables defined inside a function only exist while the function is running, and don’t interfere with variables in other workspaces, even if they have the same name.
  • Function inputs and outputs are defined carefully to avoid unexpected interactions.

To define a new function, you create an M-file with the name you want, and put a function definition in it. For example, to create a function named myfunc, create an M-file named myfunc.m and put the following definition into it.

function res = myfunc(x)
    s = sin(x)
    c = cos(x)
    res = abs(s) + abs(c)
end

The first word of the file has to be the word function, because that’s how MATLAB tells the difference between a script and a function file.

A function definition is a compound statement. The first line is called the signature of the function; it defines the inputs and outputs of the function. In this case the input variable is named x. When this function is called, the argument provided by the user will be assigned to x.

The output variable is named res, which is short for “result.” You can call the output variable whatever you want, but as a convention, I like to call it res. Usually the last thing a function does is assign a value to the output variable.

Once you have defined a new function, you call it the same way you call built-in MATLAB functions. If you call the function as a statement, MATLAB puts the result into ans:

>> myfunc(1)

s = 0.84147098480790

c = 0.54030230586814

res = 1.38177329067604

ans = 1.38177329067604

But it is more common (and better style) to assign the result to a variable:

>> y = myfunc(1)

s = 0.84147098480790

c = 0.54030230586814

res = 1.38177329067604

y = 1.38177329067604

While you are debugging a new function, you might want to display intermediate results like this, but once it is working, you will want to add semi-colons to make it a silent function. Most built-in functions are silent; they compute a result, but they don’t display anything (except sometimes warning messages).

Each function has its own workspace, which is created when the function starts and destroyed when the function ends. If you try to access (read or write) the variables defined inside a function, you will find that they don’t exist.

>> clear
>> y = myfunc(1);
>> who

Your variables are: y

>> s
??? Undefined function or variable 's'.

The only value from the function that you can access is the result, which in this case is assigned to y.

If you have variables named s or c in your workspace before you call myfunc, they will still be there when the function completes.

>> s = 1;
>> c = 1;
>> y = myfunc(1);
>> s, c

s = 1

c = 1

So inside a function you can use whatever variable names you want without worrying about collisions.

5.3  Documentation

At the beginning of every function file, you should include a comment that explains what the function does.

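function res = myfunc (x)
% res = myfunc (x)
% Compute the Manhattan distance from the origin to the
% point on the unit circle with angle (x) in radians.

    s = sin(x);
    c = cos(x);
    res = abs(s) + abs(c);
end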

When you ask for help, MATLAB prints the comment you provide.

>> help myfunc
  res = myfunc (x)
  Compute the Manhattan distance from the origin to the
  point on the unit circle with angle (x) in radians.

There are lots of conventions about what should be included in these comments. Among other things, it is a good idea to include

  • The signature of the function, which includes the name of the function, the input variable(s) and the output variable(s).
  • A clear, concise, abstract description of what the function does. An abstract description is one that leaves out the details of how the function works, and includes only information that someone using the function needs to know. You can put additional comments inside the function that explain the details.
  • An explanation of what the input variables mean; for example, in this case it is important to note that x is considered to be an angle in radians.
  • Any preconditions and postconditions.
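As an illustration of these conventions, here is one way a fuller help comment for myfunc might look. The precondition and postcondition lines are my own additions for the sake of the example, not part of the original function; the bound in the postcondition follows from the fact that abs(sin(x)) + abs(cos(x)) always lies between 1 and sqrt(2).

```matlab
function res = myfunc(x)
% res = myfunc(x)
% Compute the Manhattan distance from the origin to the point on the
% unit circle with angle x.
%
% Input:         x, an angle in radians.
% Precondition:  x is a finite real scalar (assumed for this sketch).
% Postcondition: 1 <= res <= sqrt(2), up to floating-point rounding.

    % Details of how it works belong here, below the help text.
    s = sin(x);
    c = cos(x);
    res = abs(s) + abs(c);
end
```

The blank line after the comment block ends the help text, so help myfunc prints the description and conditions but not the implementation comment.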

5.4  Function names

There are three “gotchas” that come up when you start naming functions. The first is that the “real” name of your function is determined by the file name, not by the name you put in the function signature. As a matter of style, you should make sure that they are always the same, but if you make a mistake, or if you change the name of a function, it is easy to get confused.

In the spirit of making errors on purpose, change the name of the function in myfunc.m to something_else, and then run it again.

If this is what you put in myfunc.m:

function res = something_else (x)
    s = sin(x);
    c = cos(x);
    res = abs(s) + abs(c);
end

Then here’s what you’ll get:

>> y = myfunc(1);
>> y = something_else(1);
??? Undefined command/function 'something_else'.

The second gotcha is that the name of the file can’t have spaces. For example, if you write a function and name the file my func.m, which the MATLAB editor will happily allow you to do, and then try to run it, you get:

>> y = my func(1)
??? y = my func(1)
           |
Error: Unexpected MATLAB expression.

The third gotcha is that your function names can collide with built-in MATLAB functions. For example, if you create an M-file named sum.m, and then call sum, MATLAB might call your new function, not the built-in version! Which one actually gets called depends on the order of the directories in the search path, and (in some cases) on the arguments. As an example, put the following code in a file named sum.m:

function res = sum(x)
    res = 7;
end

And then try this:

>> sum(1:3)

ans = 6

>> sum

ans = 7

In the first case MATLAB used the built-in function; in the second case it ran your function! This kind of interaction can be very confusing. Before you create a new function, check to see if there is already a MATLAB function with the same name. If there is, choose another name!
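One way to do that check from the Command Window is the built-in exist function, which returns 0 when MATLAB cannot find anything with the given name and a nonzero code otherwise. The following is a small sketch; the name my_triangle_helper is hypothetical, chosen only because it is unlikely to be in use.

```matlab
% exist returns 0 for an unused name and a nonzero code (for example,
% 2 for a file on the path or 5 for a built-in function) otherwise.
taken = exist('sum');                  % nonzero: sum already exists
free  = exist('my_triangle_helper');   % hypothetical name; 0 if unused

if taken ~= 0
    disp('sum is already in use; choose another name')
end
if free == 0
    disp('my_triangle_helper looks safe to use')
end
```

If a name is taken, typing which followed by the name shows where the existing definition lives.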

5.5  Multiple input variables

Functions can, and often do, take more than one input variable. For example, the following function takes two input variables, a and b:

function res = hypotenuse(a, b)
    res = sqrt(a^2 + b^2);
end

If you remember the Pythagorean Theorem, you probably figured out that this function computes the length of the hypotenuse of a right triangle if the lengths of the adjacent sides are a and b. (There is a MATLAB function called hypot that does the same thing.)

If we call it from the Command Window with arguments 3 and 4, we can confirm that the length of the third side is 5.

>> c = hypotenuse(3, 4)

c = 5

The arguments you provide are assigned to the input variables in order, so in this case 3 is assigned to a and 4 is assigned to b. MATLAB checks that you provide the right number of arguments; if you provide too few, you get

>> c = hypotenuse(3)
??? Input argument "b" is undefined.

Error in ==> hypotenuse at 2
    res = sqrt(a^2 + b^2);

This error message is confusing, because it suggests that the problem is in hypotenuse rather than in the function call. Keep that in mind when you are debugging.

If you provide too many arguments, you get

>> c = hypotenuse(3, 4, 5)
??? Error using ==> hypotenuse
Too many input arguments.

That is a better message.

5.6  Logical functions

In Section 4.4 we used logical operators to compare values. MATLAB also provides logical functions that check for certain conditions and return logical values: 1 for “true” and 0 for “false”.

For example, isprime checks to see whether a number is prime.

>> isprime(17)

ans = 1

>> isprime(21)

ans = 0

The functions isscalar and isvector check whether a value is a scalar or vector; if both are false, you can assume it is a matrix (at least for now).
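Here is a short sketch of those two checks applied to a scalar, a vector and a matrix. The variable names and values are arbitrary.

```matlab
x = 5;             % a scalar
v = [1 2 3];       % a row vector
M = [1 2; 3 4];    % a 2-by-2 matrix

isscalar(x)        % returns 1 (true)
isvector(v)        % returns 1 (true)
isscalar(M)        % returns 0 (false)
isvector(M)        % returns 0 (false), so M must be a matrix
```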

To check whether a value you have computed is an integer, you might be tempted to use isinteger. But that would be wrong, so very wrong. isinteger checks whether a value belongs to one of the integer types (a topic we have not discussed); it doesn’t check whether a floating-point value happens to be integral.

>> c = hypotenuse(3, 4)

c = 5

>> isinteger(c)

ans = 0

To do that, we have to write our own logical function, which we’ll call isintegral:

function res = isintegral(x)
    if round(x) == x
        res = 1;
    else
        res = 0;
    end
end

This function is good enough for most applications, but remember that floating-point values are only approximately right; in some cases the approximation is an integer but the actual value is not.
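To see how that can happen, here is a sketch that applies the same test, round(x) == x, to two values chosen for illustration. The second case shows the opposite failure, which is not mentioned above but comes from the same cause: a value that is an integer on paper but not after rounding error.

```matlab
% Case 1: the exact value 10^16 + 0.5 is not an integer, but the nearest
% floating-point number is 1e16, so the test reports true.
x = 1e16 + 0.5;
disp(round(x) == x)     % displays 1

% Case 2: sqrt(2)^2 is exactly 2 on paper, but the computed value is
% slightly larger than 2, so the test reports false.
y = sqrt(2)^2;
disp(round(y) == y)     % displays 0
```

Neither result means isintegral is broken; it can only test the value MATLAB actually stored.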

5.7  An incremental development example

Let’s say that we want to write a program to search for “Pythagorean triples:” sets of integral values, like 3, 4 and 5, that are the lengths of the sides of a right triangle. In other words, we would like to find integral values a, b and c such that $a^2 + b^2 = c^2$.

Here are the steps we will follow to develop the program incrementally.

  • Write a script named find_triples and start with a simple statement like x=5.
  • Write a loop that enumerates values of a from 1 to 3, and displays them.
  • Write a nested loop that enumerates values of b from 1 to 4, and displays them.
  • Inside the loop, call hypotenuse to compute c and display it.
  • Use isintegral to check whether c is an integral value.
  • Use an if statement to print only the triples a, b and c that pass the test.
  • Transform the script into a function.
  • Generalize the function to take input variables that specify the range to search.

So the first draft of this program is x=5, which might seem silly, but if you start simple and add a little bit at a time, you will avoid a lot of debugging.

Here’s the second draft:

for a=1:3
    a
end

At each step, the program is testable: it produces output (or another visible effect) that you can check.

5.8  Nested loops

The third draft contains a nested loop:

for a=1:3
    a
    for b=1:4
        b
    end
end

The inner loop gets executed 3 times, once for each value of a, so here’s what the output looks like (I adjusted the spacing to make the structure clear):

>> find_triples

a = 1   b = 1
        b = 2
        b = 3
        b = 4

a = 2   b = 1
        b = 2
        b = 3
        b = 4

a = 3   b = 1
        b = 2
        b = 3
        b = 4

The next step is to compute c for each pair of values a and b.

for a=1:3
    for b=1:4
        c = hypotenuse(a, b);
        [a, b, c]
    end
end

To display the values of a, b and c, I am using a feature we haven’t seen before. The bracket operator creates a new matrix which, when it is displayed, shows the three values on one line:

>> find_triples

ans = 1.0000 1.0000 1.4142
ans = 1.0000 2.0000 2.2361
ans = 1.0000 3.0000 3.1623
ans = 1.0000 4.0000 4.1231
ans = 2.0000 1.0000 2.2361
ans = 2.0000 2.0000 2.8284
ans = 2.0000 3.0000 3.6056
ans = 2.0000 4.0000 4.4721
ans = 3.0000 1.0000 3.1623
ans = 3.0000 2.0000 3.6056
ans = 3.0000 3.0000 4.2426
ans = 3 4 5

Sharp-eyed readers will notice that we are wasting some effort here. After checking a=1 and b=2, there is no point in checking a=2 and b=1. We can eliminate the extra work by adjusting the range of the second loop:

for a=1:3
    for b=a:4
        c = hypotenuse(a, b);
        [a, b, c]
    end
end

If you are following along, run this version to make sure it has the expected effect.

5.9  Conditions and flags

The next step is to check for integral values of c. This loop calls isintegral and prints the resulting logical value.

for a=1:3
    for b=a:4
        c = hypotenuse(a, b);
        flag = isintegral(c);
        [c, flag]
    end
end

By not displaying a and b I made it easy to scan the output to make sure that the values of c and flag look right.

>> find_triples

ans = 1.4142 0
ans = 2.2361 0
ans = 3.1623 0
ans = 4.1231 0
ans = 2.8284 0
ans = 3.6056 0
ans = 4.4721 0
ans = 4.2426 0
ans = 5 1

I chose the ranges for a and b to be small (so the amount of output is manageable), but to contain at least one Pythagorean triple. A constant challenge of debugging is to generate enough output to demonstrate that the code is working (or not) without being overwhelmed.

The next step is to use flag to display only the successful triples:

for a=1:3
    for b=a:4
        c = hypotenuse(a, b);
        flag = isintegral(c);
        if flag
            [a, b, c]
        end
    end
end

Now the output is elegant and simple:

>> find_triples

ans = 3 4 5

5.10  Encapsulation and generalization

As a script, this program has the side-effect of assigning values to a, b, c and flag, which would make it hard to use if any of those names were in use. By wrapping the code in a function, we can avoid name collisions; this process is called encapsulation because it isolates this program from the workspace.

In order to put the code we have written inside a function, we indent the whole thing. MATLAB does not require the indentation, but it makes the structure easier to read. The MATLAB editor provides a shortcut for doing that, the Increase Indent command under the Text menu. Just don’t forget to unselect the text before you start typing!

The first draft of the function takes no input variables:

function res = find_triples ()
    for a=1:3
        for b=a:4
            c = hypotenuse(a, b);
            flag = isintegral(c);
            if flag
                [a, b, c]
            end
        end
    end
end

The empty parentheses in the signature are not strictly necessary, but they make it apparent that there are no input variables. Similarly, when I call the new function, I like to use parentheses to remind me that it is a function, not a script:

>> find_triples()

The output variable isn’t strictly necessary, either; it never gets assigned a value. But I put it there as a matter of habit, and also so my function signatures all have the same structure.

The next step is to generalize this function by adding input variables. The natural generalization is to replace the constant values 3 and 4 with a variable so we can search an arbitrarily large range of values.

function res = find_triples (n)
    for a=1:n
        for b=a:n
            c = hypotenuse(a, b);
            flag = isintegral(c);
            if flag
                [a, b, c]
            end
        end
    end
end

Here are the results for the range from 1 to 15:

>> find_triples(15)

ans = 3 4 5
ans = 5 12 13
ans = 6 8 10
ans = 8 15 17
ans = 9 12 15

Some of these are more interesting than others. The triples 5,12,13 and 8,15,17 are “new,” but the others are just multiples of the 3,4,5 triangle we already knew.
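The versions of find_triples in this section display their results and never assign the output variable. As a sketch of an alternative, the following function returns the triples as the rows of a matrix and displays nothing. The name collect_triples is hypothetical, and the computations from hypotenuse and isintegral are written inline so the example is self-contained.

```matlab
function res = collect_triples(n)
% res = collect_triples(n)
% Return the Pythagorean triples with 1 <= a <= b <= n as the rows of a
% matrix, without displaying anything.

    res = zeros(0, 3);                  % no rows yet, three columns
    for a = 1:n
        for b = a:n
            c = sqrt(a^2 + b^2);        % same computation as hypotenuse
            if round(c) == c            % same test as isintegral
                res = [res; a, b, c];   % append the triple as a new row
            end
        end
    end
end
```

With this version, t = collect_triples(15) stores the five triples shown above in t, where the caller can display or process them.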

5.11  A misstep

When you change the signature of a function, you have to change all the places that call the function, too. For example, suppose I decided to add a third input variable to hypotenuse:

function res = hypotenuse(a, b, d)
    res = (a.^d + b.^d) ^ (1/d);
end

When d is 2, this does the same thing it did before. There is no practical reason to generalize the function in this way; it’s just an example. Now when you run find_triples, you get:

>> find_triples(20)
??? Input argument "d" is undefined.

Error in ==> hypotenuse at 2
    res = (a.^d + b.^d) ^ (1/d);

Error in ==> find_triples at 7
    c = hypotenuse(a, b);

So that makes it pretty easy to find the error. This is an example of a development technique that is sometimes useful: rather than search the program for all the places that use hypotenuse, you can run the program and use the error messages to guide you.

But this technique is risky, especially if the error messages make suggestions about what to change. If you do what you’re told, you might make the error message go away, but that doesn’t mean the program will do the right thing. MATLAB doesn’t know what the program is supposed to do, but you should.

And that brings us to the Eighth Theorem of debugging:

Error messages sometimes tell you what’s wrong, but they seldom tell you what to do (and when they try, they’re usually wrong).
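In this example the error message says only that d is undefined; deciding what value to pass is up to you. Since the three-input version of hypotenuse matches the old one when d is 2, one reasonable repair is to change the call in find_triples to hypotenuse(a, b, 2). Before trusting that, you might check the claim directly. The following sketch compares the two function bodies for one pair of inputs; the values 3 and 4 and the tolerance are arbitrary choices.

```matlab
a = 3;
b = 4;
d = 2;

old_result = sqrt(a^2 + b^2);          % body of the original hypotenuse
new_result = (a.^d + b.^d) ^ (1/d);    % body of the new version

% Compare with a tolerance, since floating-point results are approximate.
same = abs(old_result - new_result) < 1e-12
```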

5.12  continue

As one final improvement, let’s modify the function so that it only displays the “lowest” of each Pythagorean triple, and not the multiples.

The simplest way to eliminate the multiples is to check whether a and b share a common factor. If they do, then dividing both by the common factor yields a smaller, similar triangle that has already been checked.

MATLAB provides a gcd function that computes the greatest common divisor of two numbers. If the result is greater than 1, then a and b share a common factor and we can use the continue statement to skip to the next pair:

function res = find_triples (n)
    for a=1:n
        for b=a:n
            if gcd(a,b) > 1
                continue
            end
            c = hypotenuse(a, b);
            if isintegral(c)
                [a, b, c]
            end
        end
    end
end

continue causes the program to end the current iteration immediately (without executing the rest of the body), jump to the top of the loop, and “continue” with the next iteration.

In this case, since there are two loops, it might not be obvious which loop to jump to, but the rule is to jump to the innermost loop that contains the continue statement (which is what we wanted).
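A small experiment can confirm which loop continue affects. In this sketch, which is separate from find_triples, the variable count is a made-up counter that records how many inner iterations run to completion.

```matlab
count = 0;
for a = 1:3
    for b = 1:3
        if b == 2
            continue            % skip the rest of this inner iteration
        end
        count = count + 1;      % reached only when b is 1 or 3
    end
end
disp(count)                     % displays 6
```

If continue jumped to the outer loop instead, b would never reach 3 and count would be 3; the result 6 shows that only the inner loop is affected.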

I also simplified the program slightly by eliminating flag and using isintegral as the condition of the if statement.

Here are the results with n=40:

>> find_triples(40)

ans = 3 4 5
ans = 5 12 13
ans = 7 24 25
ans = 8 15 17
ans = 9 40 41
ans = 12 35 37
ans = 20 21 29

There is an interesting connection between Fibonacci numbers and Pythagorean triples. If F is a Fibonacci sequence, then

$(F_n F_{n+3},\ 2 F_{n+1} F_{n+2},\ F_{n+1}^2 + F_{n+2}^2)$

is a Pythagorean triple for all n ≥ 1.

Exercise 1   Write a function named fib_triple that takes an input variable n, uses fibonacci2 to compute the first n Fibonacci numbers, and then checks whether this formula produces a Pythagorean triple for each number in the sequence.

5.13  Mechanism and leap of faith

Let’s review the sequence of steps that occur when you call a function:

  1. Before the function starts running, MATLAB creates a new workspace for it.
  2. MATLAB evaluates each of the arguments and assigns the resulting values, in order, to the input variables (which live in the new workspace).
  3. The body of the code executes. Somewhere in the body (often the last line) a value gets assigned to the output variable.
  4. The function’s workspace is destroyed; the only thing that remains is the value of the output variable and any side effects the function had (like displaying values or creating a figure).
  5. The program resumes from where it left off. The value of the function call is the value of the output variable.
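The steps above can be traced in a short example. This is a sketch, with comments marking where each step happens; it uses the hypotenuse function from earlier in the chapter, and the call to exist is only there to show that the output variable does not survive in the caller's workspace.

```matlab
% Script part: call the function with expressions as arguments.
c = hypotenuse(1 + 2, 2 * 2);   % step 2: the arguments evaluate to 3 and 4

disp(c)                         % step 5: the call's value is the value of res
disp(exist('res', 'var'))       % step 4: res itself is gone, so this shows 0

% Local function, placed at the end of the script file (supported in
% R2016b and later). Saving it as hypotenuse.m works the same way.
function res = hypotenuse(a, b)
    % step 1: this body runs in a fresh workspace holding only a and b
    res = sqrt(a^2 + b^2);      % step 3: assign the output variable
end
```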

When you are reading a program and you come to a function call, there are two ways to interpret it:

  • You can think about the mechanism I just described, and follow the execution of the program into the function and back, or
  • You can take the “leap of faith”: assume that the function works correctly, and go on to the next statement after the function call.

When you use built-in functions, it is natural to take the leap of faith, in part because you expect that most MATLAB functions work, and in part because you don’t generally have access to the code in the body of the function.

But when you start writing your own functions, you will probably find yourself following the “flow of execution.” This can be useful while you are learning, but as you gain experience, you should get more comfortable with the idea of writing a function, testing it to make sure it works, and then forgetting about the details of how it works.

Forgetting about details is called abstraction; in the context of functions, abstraction means forgetting about how a function works, and just assuming (after appropriate testing) that it works.

5.14  Glossary

side-effect:
An effect, like modifying the workspace, that is not the primary purpose of a script.
name collision:
The scenario where two scripts that use the same variable name interfere with each other.
input variable:
A variable in a function that gets its value, when the function is called, from one of the arguments.
output variable:
A variable in a function that is used to return a value from the function to the caller.
signature:
The first line of a function definition, which specifies the names of the function, the input variables and the output variables.
silent function:
A function that doesn’t display anything or generate a figure, or have any other side-effects.
logical function:
A function that returns a logical value (1 for “true” or 0 for “false”).
encapsulation:
The process of wrapping part of a program in a function in order to limit interactions (including name collisions) between the function and the rest of the program.
generalization:
Making a function more versatile by replacing specific values with input variables.
abstraction:
The process of ignoring the details of how a function works in order to focus on a simpler model of what the function does.

5.15  Exercises

Exercise 2   Take any of the scripts you have written so far, encapsulate the code in an appropriately-named function, and generalize the function by adding one or more input variables.

Make the function silent and then call it from the Command Window and confirm that you can display the output value.
