Abstraction

In addition to using local functions we can use abstraction as a method of improving the structure of our code. Usually, abstraction is accomplished after the main body of the code is done; in other words, it is commonly a refactoring technique.

The method of abstraction consists of taking highly repetitive code and refacting out the identical parts. The result is a single, shared helper function - called abstract because it is more general or less detailed than the original code.

Some other advantages of abstraction is that it can be used to make programs smaller or it can separate knowledge domains more clearly.

Basic abstraction method

  1. Identify two or more highly repetitive expressions
  2. Introduce a new function: a) around one copy of repetitive code b) with a more general name c) add a parameter for varying position d) use parameter in varying position
  3. replace specific expressions: a) with calls to abstract function b) pass varying value

Reference to HtDF recipe

When applying abstraction to a function, we are generally moving backwards through the HtDF recipe. Remember that abstraction is generally a refactoring technique, and therefore is generally used after the program is working. Thus, for abstraction the steps are the following:

  1. Test and debug until correct
  2. Code the function body
  3. Template and inventory
  4. Define examples, wrap each in check-expect
  5. Signature, purpose, and stub

From examples

The easiest way to design abstract function is to use already existing code that is already very repetitive. Considr the following, for example:

;; ListOfString -> Boolean
;; produce true if los includes "UBC"
 
(define (contains-ubc? los)
  (cond [(empty? los) false]
        [else
         (if (string=? (first los) "UBC")
             true
             (contains-ubc? (rest los)))]))
 
;; ListOfString -> Boolean
;; produce true if los includes "McGill"
 
(define (contains-mcgill? los)
  (cond [(empty? los) false]
        [else
         (if (string=? (first los) "McGill")
             true
             (contains-mcgill? (rest los)))]))

In the two functions above the code is exactly the same, except for the specific string that we are looking for, “UBC” or “McGill”. The function can be easily abstracted (and the code shortened) by using simple parameterization. The design is to simply replace each specific string with a parameter that is passed to the function definition:

(define (contains? s los)                 ; Add s parameter
  (cond [(empty? los) false]
        [else
         (if (string=? (first los) s)     ; Replace specific with param
             true
             (contains s (rest los)))]))
 
(define (contains-ubc? los) (contains? "UBC" los))
(define (contains-mcgill? los) (contains? "McGill" los))

We can see that by using paramters we can largely cut the amount of code that is written, making the program easier to read. Additionally, we can keep the original function names and reserv the functionality of the program by simply remapping the original functions - contains-ubc? los, for example - to our new abstract function in the body.

Note that the process can be applied to varying function calls inside the body of a bigger function. For example, consider the following functions where the first calculates the square of a list of numbers whie the second calculates the square root:

(define (squares lon)
  (cond [(empty? lon) empty]
        [else
         (cons (sqr (first lon))
               (squares (rest lon)))]))
 
(define (square-roots lon)
  (cond [(empty? lon) empty]
        [else
         (cons (sqrt (first lon))
               (square-roots (rest lon)))]))

The abstraction process for these functions is the same basic methodology. The fact that the varying position is a function call itself is of little importance; we can usually pass functions as parameters to other functions:

(define (map2 fn lon)           ; Add fn parameter for func call
    (cond [(empty? lon) empty]
    [else
        (cons (fn (first lon))  ; Replace specific call with variabl
        (map2 fn (rest lon)))])); Add extra parameter to recursive call
 
;; Replace specific expressions with calls to abstract function
(define (squares lon) (map2 sqr lon))
(define (square-roots lon) (map2 sqrt lon))

This definition of map2, where another function is passed as a parameter, is known as a “higher order function”. A higher order function is one that has a function passed as a parameter, and it can:

  1. consume one or more functions
  2. produce a function

Unit tests and purposes

Unit tests

When writing check-expects for abstract functions, we can abstract the unit tests themselves. This works by simply taking the existing examples and tests and renaming the tested function to th abstract version, as well as passing whatever paramters are needed. For example, the original tests might look lik this:

(check-expect (contains-ubc? empty) false)
(check-expect (contains-ubc? (cons "McGill" empty)) false)
(check-expect (contains-ubc? (cons "UBC" empty)) true)
(check-expect (contains-ubc? (cons "McGill" (cons "UBC" empty))) true)

Which we can then abstract to the following. Note that in doing so we would take the tests from all abstracted functions and then delete redundant tests:

(check-expect (contains? "UBC" empty) false)
(check-expect (contains? "UBC" (cons "McGill" empty)) false)
(check-expect (contains? "UBC" (cons "UBC" empty)) true)
(check-expect (contains? "UBC" (cons "McGill" (cons "UBC" empty))) true)

Purposes

Sometimes the purpose of the abstract function can be abstracted from the original purposes, but now always. Oftentimes the purpose becomes the most difficult part of abstraction; in the case of contains? the purpose is straight forward: produce true if ListOfString contains String.

In the case of map2, however, the abstract purpose is much more difficult. This is because the function can map any aritrary function to a list of numbers; as a result, the purpose would have to either be arbitrary large or far more abstract.

An appropriate purpose for map2, then, looks something like this:

;; given fn and (list n0 n1 ...) produce (list (fn n0) (fn n1) ...)

In general, finding appropriate stubs for abstract functions such as this will take considerably more effort.

Signatures

To write the signature of a function we must do “manual type-inference”. That is to say, we derive the signature from the function itself by observing the function itself.

(define (contains? s los)
    (cond [(empty? los) false]
          [else
            (if (string=? (first los) s)
                true
                (contains? s (rest los)))]))

Considering the function above, the first step is to notice a few things:

  1. How many arguments
  2. The type of the arguments themselves
  3. The output

With this, we want to start filling in blanks on a signature line:

___ ___ -> ___

Looking at what the function produces - from the outputs of the cond - we see that it is either true or false. Thus we can infer that the product of the function is a boolean:

___ ___ -> Boolean

We can then look at what the parameters are. Consider s, for example. From the function body we can tell that s is passed into the recursive call, which in itself is inconclusive. However, it is also used in the if line, where we have (string=? (first los) s). Because we know that (string=?) must take a string as an argument, we can infer that s is a string.

String ___ -> Boolean

Finally, for the second argument we can see in the function body that we use both (first los) and (rest los), indicating that los must be a list. Additionally, we know los must be a list of strings, given that (first los) is always passed as an argument to (string=?), just as s was. As a result, we can write:

String (listof String) -> Boolean

Signatures for higher order functions

Another case is one in which we have a higher order function, such as map:

(define (map2 fn lon)
    (cond [(empty? lon) empty]
          [else
            (cons (fn (first lon))
                  (map2 fn (rest lon)))]))

The process of creating a signature for this type of function is the same. However, the signature itself must be of a more abstract nature. For example, we know that the first argument is fn, which itself takes (first lon) as an argument. In a higher order function, the type of a function parameter is that function’s signature wrapped in parenthesis.

(___ -> ___) ___ -> ___

Given that map2 produces either empty or (cons ...) we can tell that its product is a list. Likewise, we know that the second argument is a list, as the body of map2 uses (first lon) and (rest lon). Thus:

(___ -> ___) (listof ___) -> (listof ___)

Because the elements of the list argument passed to map2 is only used when passed directly into fn, map2 itself never really sees the content of lon. In other words, only fn operates on (first lon); map2 itself never does.

As a consequence, whatever lon is made up of, it must be something that fn can accept. To express this, we can use type parameters in the signature. By convention, these parameters are uppercase letters (X, Y, Z, A, B, C, etc.).

Notice, too, that the output of map2 works in the same way as the input; the list uses (fn (first lon)) and then a recursive call. As a result, the type of the output list must be of the same type as the output of fn itself:

(X -> Y) (listof X) -> (listof Y)