Every language has three mechanisms for combining simple ideas to form more complex ideas. Primitive expressions represent the simplest entities the language is concerned with, means of combination build compound elements from simpler ones, and means of abstraction allow naming and manipulating compound elements as units.
In programming there are two kinds of elements: procedures and data (really they are not so distinct). Data is "stuff", procedures are descriptions for manipulating the data.
This chapter deals with simple numerical data to focus on the rules for building processes. Later chapters use the same rules to build procedures to manipulate compound data as well.
At a Lisp interpreter, you type an expression, and the interpreter displays the result of evaluating the expression. e.g. if you type a number (a simple expression), the interpreter will print back the number.
Expressions can also be formed by a list of expressions within parentheses, called combinations, e.g. (+ 2.7 10)
. The leftmost element is the operator, the other elements are the operands. The value is obtained by applying the procedure specified by the operator to the arguments, the values of the operands.
Note that Lisp uses prefix notation, which is distinctive and differs from usual mathematical notation. An advantage is that it can accommodate arbitrary numbers of arguments. Another advantage is that it allows nesting.
We can make complex nested expressions more readable by pretty-printing them, such that operands are aligned vertically.
The interpreter always operates in the same cycle: it reads an expression from the terminal, evaluates the expression, and prints the result. Sometimes called a read-eval-print loop (repl). Note that you don't have to explicitly instruct the interpreter to print the result.
A name identifies a variable whose value is an object. Scheme uses define
to name things. define
is the language's simplest means of abstraction.
The memory that keeps track of the name-value pairs is the environment (or global environment, as there can be multiple).
Goal of chapter: isolate issues about thinking procedurally. Consider, in evaluating combinations the interpreter itself is following a procedure.
To evaluate a combination: 1) evaluate the subexpressions, 2) apply the procedure (the operator) to the arguments (the operands).
This example demonstrates recursion, it includes as a step the need to invoke the rule itself. In particular, tree accumulation, a general kind of process that "percolates values upwards".
Take care of primitive cases by stipulating: the values of numerals are the numbers they name, the values of built-in operators are the machine instruction sequences that carry out the corresponding operations, and the values of other names are the objects associated with those things in the environment.
Key point: the role of environment in determining meaning. An expression like (+ x 1)
doesn't mean anything without specifying the environment that provides a meaning for x
or even +
.
Exceptions to the above evaluation rule are special forms, e.g. define
. Special forms have their own evaluation rules, and constitute the syntax of the language. Lisp has a very simple syntax.
Procedure definitions, a powerful abstraction technique by which a compound operation can be given a name and referred to as a unit. e.g. (define (square x) (* x x))
.
The general form of a procedure definition is (define (<name> <formal parameters>) <body>)
.
Compound procedures are used exactly the same as primitive procedures. One could not tell by looking at their use whether a procedure is built-in to the interpreter or a compound procedure.
To evaluate combinations with compound procedures, the interpreter follows the same basic process as for primitive procedures.
"To apply a compound procedure to arguments, evaluate the body of the procedure with each formal parameter replaced by the corresponding argument."
Called the substitution model for procedure application. A model that determines the "meaning" of procedure application. But two important points:
First, substitution helps think about procedure application, but doesn't provide a description of what the interpreter really does.
Second, the book will present a sequence of increasingly elaborate models of how interpreters work, culminating in a full implementation. Substitution model is first of these.
In the model above, the interpreter first evaluates the operator and operands and then applies the resulting procedure to the resulting arguments. An alternative model would not evaluate operands until their values were needed. It would first substitute operand expressions for parameters until it obtained an expression with only primitive operators, and then perform the evaluation.
This alternative, "fully expand and then reduce", in contrast to "evaluate the arguments and then apply" (that the interpreter actually uses), is called normal-order evaluation as opposed to applicative-order evaluation.
It can be proven that, for the types of procedures in the first two chapters of the book, normal-order and applicative-order yield the same results (although see Exercise 1.5 for an "invalid" value).
Lisp uses applicative-order, partly because of efficiency, and more significantly because normal-order is much more complicated to deal with when we leave the realm of procedures that can be evaluated by substitution. But normal order is still a valuable tool.
Lisp has a special form for notating a case analysis (e.g. absolute value), called cond
(stands for "conditional"). The general form is:
(cond (<p_1> <e_1>)
(<p_2> <e_2>)
...
(<p_n> e_n>))
The expressions (<p> <e>)
are called clauses. The first expression in the pair is a predicate, evaluted as either true or false.
First predicate is evaluated first, if false, second predicate is evaluated, and so on until one is true. At which point the corresponding consequent expression is the returned value. If no predicates are true, value is undefined. Can also use else
as a special symbol to be used in place of the final predicate.
Another special form if
is a restricted conditional that can be used when there are exactly two cases.
(if <predicate> <consequent> <alternative>)
In addition to primitive predicates like <
, =
, >
, there are logic composition operations:
(and <e_1> ... <e_n>)
(or <e_1> ... <e_n>)
(not <e>)
Note that and
and or
are special forms, not procedures, because the subexpressions are not necessarily all evaluated. not
is an ordinary procedure.
Below is a sequence of expressions. What is the result printed by the interpreter in response to each expression? Assume that the sequence is to be evaluated in the order in which it is presented.
10
(+ 5 3 4)
(- 9 1)
(/ 6 2)
(+ (* 2 4) (- 4 6))
(define a 3)
(define b (+ a 1))
(+ a b (* a b))
(= a b)
(if (and (> b a) (< b (* a b)))
b
a)
(cond ((= a 4) 6)
((= b 4) (+ 6 7 a))
(else 25))
(+ 2 (if (> b a) b a))
(* (cond ((> a b) a)
((< a b) b)
(else -1))
(+ a 1))
The result of running each line is indicated with a comment.
#lang sicp
10 ; 10
(+ 5 3 4) ; 12
(- 9 1) ; 8
(/ 6 2) ; 3
(+ (* 2 4) (- 4 6)) ; 6
(define a 3) ; <#undef>
(define b (+ a 1)) ; <#undef>
(+ a b (* a b)) ; 19
(= a b) ; #f
(if (and (> b a) (< b (* a b)))
b
a) ; 4
(cond ((= a 4) 6)
((= b 4) (+ 6 7 a))
(else 25)) ; 16
(+ 2 (if (> b a) b a)) ; 6
(* (cond ((> a b) a)
((< a b) b)
(else -1))
(+ a 1)) ; 16
Translate the following expression into prefix form:
(5 + 4 + (2 - (3 - (6 + 4/5)))) / (3 (6 - 2) (2 - 7))
(/ (+ 5 4 (- 2 (- 3 (+ 6 (/ 4 5))))) (* 3 (- 6 2) (- 2 7)))
Define a procedure that takes three numbers as arguments and returns the sum of the squares of the two larger numbers.
(define (f x y z)
(define (square x)
(* x x ))
(define (sum-of-squares x y)
(+ (square x) (square y)))
(define (is-smallest n)
(and (<= n x) (<= n y) (<= n z)))
(cond ((is-smallest x) (sum-of-squares y z))
((is-smallest y) (sum-of-squares x z))
(else (sum-of-squares x y))))
; Test:
(f 5 8 4) ; => 89
Observe that our model of evaluation allows for combinations whose operators are compound expressions. Use this observation to describe the behavior of the following procedure:
(define (a-plus-abs-b a b)
((if (> b 0) + -) a b))
The procedure prints the sum of a and b if b is positive, otherwise the difference.
(a-plus-abs-b 5 -10 ) ; => 15
Ben Bitdiddle has invented a test to determine whether the interpreter he is faced with is using applicative-order evaluation or normal-order evaluation. He defines the following two procedures:
(define (p) (p))
(define (test x y)
(if (= x 0) 0 y))
Then he evaluates the expression
(test 0 (p))
What behavior will Ben observe with an interpreter that uses applicative-order evaluation? What behavior will he observe with an interpreter that uses normal-order evaluation? Explain your answer. (Assume that the evaluation rule for the special form if
is the same whether the interpreter is using normal or applicative order: The predicate expression is evaluated first, and the result determines whether to evaluate the consequent or the alternative expression.)
Using applicative order, (p) will be replaced by (p) in an infinite loop. Using normal order the result 0 will be observed. The reason being that with normal order (p) will never be executed since the if will be true. With applicative order (p) will be evaluated over and over and over.
Procedures are similar to ordinary mathematical functions. But an important difference is procedures must be effective.
Consider computing square roots. Can define the square-root function as
sqrt(x) = the y such that y >= 0 and y^2 = x
This is a legitimate mathematical function but does not describe a procedure. It doesn't help to rephrase it in pseudo-Lisp:
(define (sqrt x)
(the y (and (>= y 0)
(= (square y) x))))
Contrast between function and procedure is reflection of distinction between describing properties of things and describing how to do things. The difference between declarative knowledge and imperative knowledge. Math is generally concerned with declarative (what is), computer science is concerned with imperative (how to).
Most common way to compute square roots is Newton's method of successive approximations. If we have a guess y
, can perform a simple manipulation to get a better guess by averaging y
with x/y
.
To formalize in terms of procedures, start with a value for the radicand, and a value for the guess. If the guess is "good enough", we are done; if not, repeat with improved guess.
(define (square x) (* x x))
(define (sqrt-iter guess x)
(if (good-enough? guess x)
guess
(sqrt-iter (improve guess x) x)))
(define (improve guess x)
(average guess (/ x guess)))
(define (average x y)
(/ (+ x y) 2))
(define (good-enough? guess x)
(< (abs (- (square guess) x)) 0.001))
(define (sqrt x)
(sqrt-iter 1.0 x))
Above good-enough?
test is not great, but gets us started. The above also demonstartes the language introduced so far is enough to write any purely numerical program that could be written in e.g. C. Surprising since we have not introduced any iterative constructs.
Alyssa P. Hacker doesn’t see why if
needs to be provided as a special form. “Why can’t I just define it as an ordinary procedure in terms of cond
?” she asks. Alyssa’s friend Eva Lu Ator claims this can indeed be done, and she defines a new version of if
:
(define (new-if predicate then-clause else-clause)
(cond (predicate then-clause)
(else else-clause)))
Eva demonstrates the program for Alyssa:
(new-if (= 2 3) 0 5) ; => 5
(new-if (= 1 1) 0 5) ; => 0
Delighted, Alyssa uses new-if
to rewrite the square-root program:
(define (sqrt-iter guess x)
(new-if (good-enough? guess x)
guess
(sqrt-iter (improve guess x) x)))
What happens when Alyssa attempts to use this to compute square roots? Explain.
If you are evaluating e.g. (sqrt-iter 1 4), then the key step is this one:
(if (good-enough? 1 4) 1 (sqrt-iter (improve 1 4) 1))
(new-if (good-enough? 1 4) 1 (sqrt-iter (improve 1 4) 1))
In the first case since we are using the 'if' special form we will evaluate the predicate first and only the expression (sqrt-iter) if the predicate is false. In the second case there's no special form so we will evaluate expression-out leading to infinite calls to sqrt-iter.
The good-enough?
test used in computing square roots will not be very effective for finding the square roots of very small numbers. Also, in real computers, arithmetic operations are almost always performed with limited precision. This makes our test inadequate for very large numbers. Explain these statements, with examples showing how the test fails for small and large numbers. An alternative strategy for implementing good-enough?
is to watch how guess
changes from one iteration to the next and to stop when the change is a very small fraction of the guess. Design a square-root procedure that uses this kind of end test. Does this work better for small and large numbers?
The test won't be good for small numbers because our "good enough" error of 0.001 becomes a larger and larger fraction of the number we are trying to find. e.g. (sqrt 4) is accurate to 7 decimal places but (sqrt 0.04) is only accurate to 3.
I have less of a handle on where things break down for large number but e.g. (sqrt 123456789876) never returns. I suspect this is because good-enough? is never true because we don't have enough percision at large numbers to get within 0.001 of the answer (something something floating point arithmetic).
Here are some updated definitions that look the change in the guess. I think this would work better for small numbers.
(define (sqrt-iter last-guess guess x)
(cond ((good-enough? last-guess guess x) guess)
(else (sqrt-iter guess (improve guess x) x))))
(define (good-enough? last-guess guess x)
(= (/ (abs (- last-guess guess)) guess) 0.001))
Actually I think we can just compare last-guess to guess and stop when they are equal.
(define (good-enough? last-guess guess)
(= last-guess guess))
Note that these can take a while to run.
Newton’s method for cube roots is based on the fact that if y
is an approximation to the cube root of x
,then a better approximation is given by the value
(x/y^2 + 2y) / 3
Use this formula to implement a cube-root procedure analogous to the square-root procedure. (In Section 1.3.4 we will see how to implement Newton’s method in general as an abstraction of these square-root and cube-root procedures.)
Et Voila. The equality check for good enough doesn't work here, but I'm trying to learn lisp not math, so I'm moving on.
(define (cbrt x)
(define (cbrt-iter y-1 y)
(define (good-enough?)
(< (/ (abs (- y-1 y)) y) 0.0001))
(define (improve y)
(/ (+ (/ x (square y)) (* 2 y)) 3))
(if (good-enough?)
y
(cbrt-iter y (improve y))))
(cbrt-iter 100 1 x))
(cbrt 27) ; => ~3
sqrt
is first example of a process defined by a set of mutually defined procedures. Note that the definition is recursive, defined in terms of itself.
Also observe that computing square roots breaks up naturally into a number of subproblems (whether a guess is good enough, how to improve a guess, etc). Each accomplished by a separate procedure.
It is crucial that each procedure accomplishes an identifiable task that can be used as a module elsewhere, as a black-box. In good-enough?
we are not concerned with how square
computes its result, just that it works. As far as good-enough?
is concerned, square
is not a procedure but a procedural abstraction. Any procedure that computes a square would be equally good.
A detail of implementation that should not matter to the user of a procedure: implementer's choice of names for the formal parameters.
This principle seems to be self evident but has profound consequences. First, the parameter names of the procedure must be local to the body. Called a bound variable, say that the procedure definition binds its formal parameters. The meaning of the procedure is unchanged if a bound variable is consistently renamed throughout the definition.
If a variable is not bound, it is free. The set of expressions for which a binding defines a name is called the scope.
The square-root program illustrates another way we'd like to control the use of names. The problem with the existing structure is a user only cars about the sqrt
procedure. The others (sqrt-iter
, good-enough?
, improve
) only clutter up. Problem is more severe in larger programs.
We allow a procedure to have internal definitions that are local to that procedure:
(define (sqrt x)
(define (good-enough? guess x)
(< (abs (- (square guess) x)) 0.001))
(define (improve guess x) (average guess (/ x guess)))
(define (sqrt-iter guess x)
(if (good-enough? guess x)
guess
(sqrt-iter (improve guess x) x)))
(sqrt-iter 1.0 x))
(sqrt 64) ; => 8
This nesting of definition is called block structure. However, in addition to internalizing the definitions of the procedures, we can simplify them. It is not necessary to pass x explicitly to each procedure, instead we can allow it to be a free variable in the internal definitions. This discipline is called lexical scoping.
(define (sqrt x)
(define (good-enough? guess)
(< (abs (- (square guess) x)) 0.001))
(define (improve guess)
(average guess (/ x guess)))
(define (sqrt-iter guess)
(if (good-enough? guess)
guess
(sqrt-iter (improve guess))))
(sqrt-iter 1.0))
(sqrt 64) ; => 8
Idea of block structure originated in Algol 60. Now appears in most advanced programming languages.