Previous Page Next Page

Chapter 13
THE SMALLTALK LANGUAGE

Smalltalk is easy to learn and use because of its simple syntax and semantics, and few concepts. The concepts object, class, message, and method form the basis of programming in Smalltalk. The methodology for using Smalltalk consists of:

Objects

Objects are self-describing data structures. Every object is an instance of a class. The class defines the structure and behavior of all of its instances. The class of an object is determined by sending it the message class. This message is understood by all objects.

Objects are protected data structures. The data stored inside of an object is accessible only through messages. Objects can also be shared.

The word self in a method refers to the receiver object of the message that invokes the method.

Variables

All Smalltalk variables are containers for objects. A variable contains a single object pointer.

The variable name can be used in an expression to refer to the object whose pointer it contains. A variable may contain different object pointers at different times. The object pointer contained in a variable changes when an assignment expression is evaluated. An assignment makes a copy of a pointer to an object, not a copy of the object itself.

Variables are either private or shared. Private variables are accessible only to a single object. Shared variables are accessible to multiple objects. A private variable has a lower-case first letter, while a shared variable has an upper-case first letter.

There are three kinds of variables:

Instance Variables

Each object maintains its own internal state. The private memory of an object consists of its individually accessible components called instance variables. Instance variables are similar to fields of a record structure in other languages. Instance variables either have a name or are referred to with an integer index. Named instance variables are accessed by using their name. Indexed instance variables are accessed and changed only through messages (usually using at: and at:put: messages with integer indices). Each member of a class has its own separate instance variables.

Instance variables have a type--they contain either pointers or bytes. All instance variables for objects belonging to the same class are the same type. Most objects' instance variables contain pointers. The pointers refer to objects. If an object contains bytes, then its instance variables contain eight-bit values representing elementary data values.

Classes may specify both named and indexed instance variables for their member objects. The number and names of named instance variables are fixed for all members of the class. The number of instance variables may differ among members of the same class. For example, #(l 2 3) and #('up' 'down') are both objects of class Array, but they have different numbers of indexed instance variables, 3 and 2 respectively. A class with indexed instance variables creates new members with a message that specifies the number of indexed instance variables to create (usually the message new: with an integer argument). Many objects return their number of indexed instance variables in response to the message size.

Only the instance variables of the receiver of the message that invoked the method can be referred to by name.

Temporary Variables

Temporary variables include method arguments and method temporaries, and temporaries contained in block arguments.

Method arguments are assigned the associated message arguments for the message which caused method invocation. Method temporaries are initialized to nil upon method invocation. Block arguments are assigned the associated message arguments for the value: message at the time the block is activated. When a block is invoked while its containing method is still active, the block and the containing method share the same temporary variables.

Shared Variables

Shared variables are defined in Dictionaries called pools. Different kinds of shared variables are defined in different kinds of pools. All shared variable names start with an upper-case letter. The variable name and the variable value are bound together into an object that is an instance of class Association. This association object is placed in the pool.

The System Dictionary Smalltalk is a pool which contains all the global variables.

Global variables are accessible from every object.

Class variables for each class are implicitly collected into a pool for the class. Class variables are defined as part of the class specification. Class variables are accessible only to the class, subclasses, instances of the class, and instances of the subclasses.

Pool variables are contained in named Pool Dictionaries that you explicitly construct. Pool Dictionaries are global variables. To make pool variables accessible to a class and its instances, you must modify the class specification.

Classes

Classes are the program modules of Smalltalk because they describe data structures (objects), algorithms (methods), and external interfaces (message protocols). Classes provide complete capabilities to solve a particular problem.

Every object is an instance of some class. All objects which are instances of a class are similar because they have the same structure (i.e., the same instance variables), the same messages to which they respond, and the same available methods.

Classes are also objects contained in global variables which are maintained in the System Dictionary Smalltalk. As such, class names begin with a capital letter. This allows classes to be referred to in an expression.

The Class Hierarchy

Classes form a hierarchy, consisting of a root class, called Object, and many subclasses. Each class inherits the functionality of all its superclasses in the hierarchy. Class Object provides the common behavior for all objects. It includes methods for printing an object symbolically, for testing the class of an object, and for making a copy of an object. Each subclass builds on its superclasses by adding its own methods and instance variables to complete the implementation of the subclass's behavior.

Here is the complete Smalltalk/V class hierarchy.

Many of the key classes of Smalltalk are discussed in Chapter 14. A comprehensive listing of the Smalltalk specific classes is found in Part 4. For details on host operating system specific classes, especially DynamicDataExchange, WinHandle, WinInfo and WinStructure and their subclasses, see Appendix 2: Windows Classes and API Calls.

Inheritance

A class inherits all of its superclasses' instance variables, class variables, and methods. Inheritance of class variables allows the methods of a class to refer to the class variables defined in its superclasses.

Inheritance of instance variables allows the methods of a class to refer to the instance variables defined in its superclasses, but it also means that superclass instance variables are included in objects which are instances of the class.

Determining what method to perform starts with two pieces of information: the message selector and the class of the receiver of the message. The available methods for the class of the receiver are examined to see if there is a method which matches the message selector. If a match is found, that method is performed. If not, the superclass of the class of the receiver is used, and the check for a method matching the selector is performed again.

Checking for a matching method and advancing to the superclass is repeated until the method is found or until the end of the superclass chain is reached. In the latter case, a programming error occurs and a message which describes the error is sent to the receiver of the original message.

There is a special syntax form for a receiver, super, which changes the initial class used for age lookup. The word super has two implications.

The major purpose of a message to super is to be able to use a method in a superclass which is redefined in a subclass.

Class Messages

Messages to class objects are used for creating instances of the class and for initializing class variables. The most common messages for creating new instances are new and new:. Some .classes define their own messages for creating instances.

Like all objects, classes know to which messages they can respond. For other objects, the methods available are determined by the object's class. Class objects, too, belong to a "class," called a metaclass, which determines the messages to which the class can respond.

There are two important classes relating to metaclasses:

Every metaclass has exactly one instance: the class of which it is the metaclass.

Specifying a New Class

In order for you to add a new class, you first choose a superclass on which you will build. Make the new class a subclass of the chosen superclass, then add the instance variables and methods necessary to complete the new class's functionality. Classes are normally specified using a Class Hierarchy Browser.

Classes are defined by sending a message to the new or modified class's superclass with class specification information as arguments. The class information that can be specified is the following:

The message which specifies a class is sent to its superclass. There are three class definition messages. They are as follows:

ä   subclass: subclassSymbol
instanceVariableNames:
instanceVariableNameString
classVariableNames: classVariableNameString 
      poolDictionaries: poolDictionaryNameString
ä   variableSubclass: subclassSymbol
instanceVariableNames:
instanceVariableNameString
classVariableNames: classVariableNameString 
      poolDictionaries: poolDictionaryNameString
ä   variableByteSubclass: subclassSymbol
classVariableNames: classVariableNameString 
      poolDictionaries: poolDictionaryNameString

The first two messages define classes whose member objects contain pointers. The first message specifies objects with named instance variables (zero or more of them). The second message specifies objects with both named and indexed instance variables.

The third message defines classes whose member objects contain bytes. Objects with bytes contain only indexed instance variables, so there is no instance variable name string argument. Objects with bytes define elementary data values such as strings of characters.

Messages and Methods

All processing in a Smalltalk system involves sending messages to objects. Messages are the language of interaction which you use in order to express your computing requirements to objects. Messages request services from an object in terms of its variableByteSubclass interface.

Methods are the algorithms which are performed by an object in response to receiving a variableByteSubclass. Methods represent the internal details of the implementation of an object.

Protocol definitions for a class always have two parts--class methods and instance methods.

Class methods implement the messages sent to the class. The receiver of a class message is always the class object, not an instance of the class. All classes are global variables and can be referred to by their names.

Instance methods implement messages sent to instances of the class. The receiver of an instance message is always an object that is an instance of the class.

A method contains a sequence of Smalltalk expressions. There are four types of expressions:

   #variableByteSubclass    #(l  2  4  16)    'magic'    10    16rlF
   Smalltalk    x    replacementCollection
bag add:  stream next
100 factorial
   array at: index + 10 put: Bag new
   [:x :y | x name < y name ]

The beginning of a method defines its name, arguments, and any temporary variables that it uses.

Sending a message involves:

The following sections present the syntax of methods and messages both informally with examples and more precisely using a syntax metalanguage. The metalanguage definition appears in Appendix 1. If you find that the informal presentation is sufficient, you can skip over the syntax rules. A complete syntax summary and cross reference are also presented in Appendix 1.

The Syntax of Variable Names and Literals

Variable names and literals are the elemental building blocks used in higher-level syntax forms in Smalltalk.

Variable Names

A variable name identifies a variable in an object. A variable is a container for an object pointer. A variable name is a sequence of letters and digits, beginning with a letter. Example variable names are:

Display       aString      elements    x2

Variable names beginning with an upper case letter represent shared variables, while those beginning with a lower-case letter represent private variables. The rules for variable names are:

<rule>    variableName = identifier.
<rule>    identifier = letter { letter | digit }

Literals

A literal defines an object of class Number, String, Character, Symbol, or Array. Examples are given below where each of the possible literal forms is defined. The syntax rule for literals is:

<rule>    literal = number | string | characterConstant | symbolConstant
               | arrayConstant.

Numbers

Numbers are objects of class Float, Fraction, or Integer. If a number contains a decimal point, it is an object of class Float. If it contains a negative exponent and no decimal point, it belongs to class Fraction. All other numbers belong to class Integer. If the number includes r, the digits preceding r define the number radix. In this case, capital letters are used to represent digit values greater than 9, with A = 10, B = 1 1, etc. Example numbers are:

15       16rFF         3.1416    le-3      -100

The rules for numbers are:

<rule>       number = [digits "r"] ["-"] bigDigits ["."bigDigits] ["e" ["-"] digits].
<rule>       digits = digit {digit}.
<rule>       Digit = "0" | "l" | "2" | "3" | "4" | "5" | "6" | "7" | "8" | "9".
<rule>       bigDigits = bigDigit {bigDigit}.
<rule>          bigDigit = digit | capitalLetter.

Strings

A string is a sequence of characters enclosed in single quotes. It is an object of class String which is a sequence of objects of class Character that can be indexed. Strings are not necessarily constant; their characters may be changed by sending a message to the string. Paired quotes within a string reduce to a single quote in the resultant string object. Example strings are:

'hello'   ' '     'isn''t'    ' " comment in string" '

The rules for strings are:

<rule>    string =  " ' " {character |  " ' ' " | ' " ' } " ' " .
<rule>    character = letter | digit | selectorCharacter | "[" | "]" | "(" | ")"

| "^" | ";" | "$" | "#" | ":" | "." | "{" | "}" | " ".

<rule>    selectorCharacter =  "," | "+" | "/" | "\" | "*" | "~" | ">" | "<" | "=" 
               | "@" | "%" | "|" | "&" | "?" | "!" .

Comments

A comment is a sequence of characters enclosed in double quotes. A comment is ignored anywhere within a method, except when occurring within a string. Example comments are:

"Answer the size of the receiver"   "goodBye"

The rule for comments is:

<rule>    comment = ' " ' {character | " ' "} ' " '.

Character Constants

A character constant is an object of class Character. A character constant appears as a dollar sign followed by any character. Example character constants are:

$$    $a  $'  $ $.

The rule for character constants is:

<rule>    characterConstant = "$" character | "$" " ' " | "$" ' " '.

Symbols

A symbol is an object of class Symbol, a sequence of objects of class Character which can be indexed. Symbols differ from strings in that their characters may not be changed. A symbol constant identifies the associated symbol object. The form of a symbol constant is a number sign, #, followed by the characters of the symbol. Example symbol constants are:

#+   #asOrderedCollection   #at:put: #==

The rules for symbols and symbol constants are:

<rule>       symbolConstant = "#" symbol.
<rule>       symbol = unarySelector | binarySelector | keyword {keyword}.
<rule>       unarySelector = identifier.
<rule>       binarySelector = selectorCharacter [selectorCharacter] | "-".
<rule>       keyword = identifier ":".

Arrays

An array is an object of class Array which may be indexed by an integer from one through the size of the array. An array is a series of literals enclosed in parentheses. An array constant identifies the associated array object. It consists of an array preceded by a number sign. Example array constants and arrays are:

#('red' 'blue' 'green')
#(yes no)
#(l 'two' three $4 (5))

The rules for arrays and array constants are:

<rule>       arrayConstant = "#" array.
<rule>       array = "(" {number | string | symbol | array |
            characterConstant} ")". 

Expression Syntax

The actions in a method are specified by a series of expressions separated by periods. A period is optional after the last expression of the series. Each expression computes a single object as its result. The expression may also include assignment of its result to one or more variables.

The final expression in an expression series may be preceded by a caret (^). The caret means that method execution terminates and answers the object computed by the expression.

The rules for expressions and expression series are:

<rule>       expressionSeries = {expression "."} [ ["^"] expression].
<rule>       expression = (variableName ":="} (primary |
   messageExpression {";" cascadeMessage} ). 
<rule>       primary = variableName | literal | block | " (" expression ")".

A message expression is a request to an object (the receiver of the message) to perform a computation and return an object as the answer. There are three kinds of message expressions: unary, binary, and keyword (n-ary). Each has a different precedence and a different syntax for its selector, the name of the message.

A unary expression sends a series of unary messages which are evaluated from left to right. A unary message has no arguments.

A binary expression sends a series of binary messages which are evaluated from left to right. A binary message has a single argument following the binary selector. The traditional arithmetic operators are implemented in Smalltalk using binary expressions. This gives all arithmetic operators the same precedence. Parentheses may be used to specify other than left-to-right evaluation.

A keyword expression sends a single keyword message with one or more arguments. The arguments to a keyword message are evaluated from left to right.

The selector of a keyword message is the concatenation of all the keywords in the message.

Unary expressions have highest precedence, followed by binary and then keyword. Parentheses may be used to specify a different evaluation order.

Cascaded messages are a series of messages to the same receiver. Each message after the first is preceded by a semicolon.

The rules for message expressions are:

<rule>       messageExpression = unaryExpression | binaryExpression |
         keywordExpression.  
<rule>       cascadeMessage = unaryMessage | binaryMessage |
         keywordMessage.  
<rule>       unaryExpression = primaryMessage {unaryMessage}.
<rule>       binaryExpression = (unaryExpression | primary)
         binaryMessage {binaryMessage}.
<rule>       keywordExpression = (binaryExpression | primary)
         keywordMessage. 
<rule>       unaryMessage = unarySelector.
<rule>       binaryMessage = binarySelector (unaryExpression | primary).
<rule>       keywordMessage = keyword (binaryExpression | primary) |
               {keyword (binaryExpression | primary) }.

Blocks

A block is a part of a method enclosed in square brackets. It is an object describing executable code. Blocks may be nested.

A block may have arguments. These are specified between the left bracket and vertical bar by preceding each block argument variable name with a colon.

The result of block execution is the final expression in the block. A block with no arguments is executed by sending it the message value.

A block with one argument is executed by sending it the message value:. The argument to the value: message is assigned to the block argument upon block execution.

A two-argument block is executed by sending it the message value:value:. The value:value: arguments are assigned to the block arguments.

A block may contain an expression preceded by a caret (^). Evaluation of such an expression causes termination of execution for both the block and the method in which the block appears.

Blocks are the basis for control structures in Smalltalk. Since control structures conform to keyword message syntax, control structures have no special syntax.

The rule for blocks is:

<rule>       block = "["[{":" variableName} " | "] expressionSeries "]".

Method Syntax

A complete method specification includes a message pattern, optional primitive number, optional temporaries, and an expression series. The message pattern specifies how to send a message to request method execution. It includes the method selector and the variable names used to refer to arguments within the method.

The rules for method syntax are:

<rule>       method = messagePattern [PrimitiveNumber] [temporaries]
expressionSeries.
<rule>       messagePattern = unarySelector | binarySelector variableName
| keyword variableName {keyword variableName}.
<rule>       primitiveNumber = "<" "Primitive:" number ">".
<rule>       temporaries = " |" {variableName} "|".

Control Structures

Control structures are invoked by sending messages with blocks as arguments. Three forms, with several variations, are predefined in the Smalltalk language. You may define additional forms in Smalltalk using these predefined ones.

Conditional Execution

The following predefined conditional execution messages are available:

ifTrue: aBlock
ifFalse: aBlock
ifTrue: trueBlock ifFalse: falseBlock
ifFalse: falseBlock ifTrue: trueBlock

In all cases, the receiver expression must be of class Boolean and the arguments must be blocks with no arguments. The ifTrue: argument block (if present) is sent the message value, if and only if, the receiver has the value true. The ifFalse: argument block (if present) is sent the message value, if and only if, the receiver has the value false. The answer of the conditional messages is the last expression in the executed block or nil if no block is executed.

Iterative Execution

The following predefined iterative execution messages are available:

whileTrue: aBlock
whileFalse: aBlock

Both the receiver and argument of these messages must be no-argument blocks. For whileTrue:, the receiver block is sent the message value. If it answers true, the argument block is sent the message value. The iteration continues until the answer of the first block evaluation is false. For whileFalse:, the sequence is the same but the iteration continues until the answer of the first block evaluation is true. The answer of both whileTrue: and whileFalse: is always nil.

Short Circuit Boolean Evaluation

The following predefined Boolean operators are available:

and: aBlock
or: aBlock

The receiver of each of these messages must be of class Boolean and the argument must be a block. For and:, if the receiver is true, the block is sent the message value, and the answer of the message is the last block expression. If, however, the receiver of the and: message is false, the answer is false, and the block is not evaluated. For or:, if the receiver is false, the block is sent the message value, and the answer is the last block expression. If, however, the receiver of the or: message is true, the answer is true, and the block is not evaluated.

Previous Page Next Page