Programming

Data Types

Integer

An integer is a data type which stores a whole number, e.g. 2147483647. An integer is stored in memory as a binary number.

An integer is typically stored as 4 bytes of memory (a byte being 8 bits, so it is stored as 32 bits of memory each with value 1 or 0.)

An integer may be signed or unsigned.

If an integer is signed, this means that it has a 'sign bit', indicating whether the integer is positive or negative. For example, in the case of a signed 4-byte integer, there is 1 bit representing the sign and 31 bits of data. (Signed integers are stored using two's complement notation, which makes mathematical operations work more easily on binary numbers.)

If an integer is unsigned, it does not have a sign bit; therefore it can only represent positive numbers (and 0).

Float

A 'float', or floating point, is a representation of a rational number, e.g. 0.5.

A floating point number is so named as it is stored using a similar method to the representation of numbers in scientific notation in mathematics. That is to say, the location of the decimal point is defined separately to the value of the number.

Scientific notation involves two main components: a number itself and an exponent, typically a power of 10. Here are two examples:

A floating point number is stored similarly. Modern floating point numbers typically involve three main components:

  • The sign, which behaves the same as the sign in an integer
  • The significand (also called mantissa or coefficient), which is the 'main' part of the number, performing the same role as in the above examples
  • The exponent, analogous to the or in the above examples. Note that only the exponent itself is stored; for example, in the first example above, one would only store the number , not .

Boolean

A boolean data type simply represents a true / false value.

Character

A character is stored as an integer, where each integer is mapped onto a character.

Two formats of representing characters are as follows:

ASCII: Stands for American Standard Code for Information Interchange. Every character is represented as an integer between 0 and 127, typically using 1 byte of memory. A full table of the mappings between integers and characters as per ASCII may be found at http://www.asciitable.com/.

Unicode: This is more modern than ASCII, able to represent many more characters, including characters from different languages than English. Typical implementations of Unicode use up to 4 bytes per character. UTF-8 is the most commonly used character encoding by websites, and the first 128 'code points' (integer-character mappings) are the same as ASCII. Therefore an ASCII string may be interpreted as UTF-8.

String

A string is represented as a list / array of characters.

Typically a string consists of a number of characters stored sequentially, with the character NULL (stored as 0) used to indicate the end of the string.

Array

An array is a data type containing a number of elements of the same data type in sequence. For example, an array may contain a number of integers, a number of floats, or even a number of other arrays, so long as each of these have the same data type.

Record

A record is similar to an array, but differs on some basic points:

  • Each element, or field, of a record is named.
  • Different fields may have different data types. (The same field, however, must always have the same data type.)

Here is a basic example of a record type, represented as a table:

Person
Name (type String) Age (type Integer)

One example of a record is as follows:

A (type Person)
Bob 30

Note that any variable of type Person must have its name as a string, and its age as an integer.

Code and Compilers

There are three main types of code.

  • Source Code: Human-readable code written at a high level for compilation / interpretation
  • Byte Code: Source code which has been compiled to an intermediary between source code and executable code. Byte code is fed into a VM which is then converted to code usable on the hardware. Not platform-specific. The only mainstream language that uses byte code is Java.
  • Executable Code: Source code which has been compiled into an executable for use on a particular machine; not platform-specific

There are two main methods of converting a program for execution.

  • Compiler: A compiler takes source code and generates executable code from it. Executing a compiled file is faster than interpreting a file.
  • Interpreter: An interpreter reads through a file line-by-line and converts and executes the lines sequentially. It will not pick up as many errors in code as a compiler will.

Programming Concepts

Constants

A constant is an unmodifiable value in a program. A constant would be used when, for example, one wanted to terminate a program after a certain number of lines of input had been read.

A constant may have any type.

Variables

A variable is a value in a program that may be modified. A variable must have a fixed data type.

Variables may have one of a number of 'scopes':

  • Local: A local variable only exists in the 'scope' of the function or block it is defined in. A local variable may be used if a variable only needs to be used within a specific function, and is of no relevance to the remainder of the program. It cannot be accessed outside the function.
  • Global: A global variable exists throughout an entire program. An example is a variable storing the cumulative sum of numbers being read from the input.
  • Parameters: Information about parameters may be found below.

Naming conventions

Variables may be named using any of the following naming conventions:

  • Snake case: Variables are named in all lowercase with an underscore as a word separator, e.g. current_speed.
  • Uppercase: This is typically reserved for constants, or sometimes calling builtin functions. The variable (or constant) is written in all uppercase, e.g. PI.
  • Camel case: Variables are written with no separator between words, and each word or phrase is capitalised, except possibly the first. For example, personDetails or CurrentFile. A subset of camel case, Pascal Case, requires every word or phrase to be capitalised, including the first; so personDetails is not in pascal case.

Control structures

A control structure is a block which analyses the value of a variable, and chooses a direction for the program to run based on the outcome of that analysis.

The following are examples of control structures.

Sequence

Operations are performed one after the other.

Selection

The value of an expression is evaluated. If that expression is true, a certain block is executed. Otherwise, another block is executed.

There are a few types of selection.

  • One-way: A block is executed if and only if an expression is true; otherwise nothing occurs.
  • Two-way: A block is executed if an expression is true; otherwise, another block is executed.
  • Multi-way: There are multiple expressions to be evaluated, each with a different outcome. Two examples of multi-way selection control structures are case (or switch) statements, and nested if statements.

Iteration

A block is performed some number of times, or until an expression evaluates to false.

There are a few types of iteration.

  • Test first: A block is executed while a particular expression evaluates to true. The testing is done at the start of the loop, i.e. the block may not execute at all.
  • Test last: A block is executed; then, if a particular expression evaluates to true, it is executed again. This is the same as a while loop but the test is done at the end of the loop, so the block must execute at least once.
  • Fixed: A block is executed some fixed number of times, possibly defined by a variable.

Stubs

A stub is a small program that exists as a substitute for a larger, more complex program; essentially, it is 'dummy' code that is used to stand in for some other functionality that may, for example, be implemented later.

Statements

A statement is a small element of code, expressing that some action should be performed.

A statement can be generically described as 'simple' or 'compound'. Examples of simple statements include function calls, return statements and variable assignment. Examples of compound statements include the 'selection' and 'iteration' control structures.

A statement may contain other statements. For example, most iterative control structures will contain another statement.

Modularisation

Modularisation is a concept used for reducing a system's complexity. It essentially involves subdividing a system into a number of distinct 'modules', which interact with one another. These may function independently of one another, but the system as a whole requires all modules to be functioning.

Essentially, it involves subdividing a complex problem (the entire system) into a number of simpler problems.

Modularisation makes code more readable as each 'chunk' of code is much smaller; in addition it makes debugging easier as unit testing may be performed on individual modules.

Modularisation also makes it much easier to reuse code.

Functions

A function is a block of code that performs a task, possibly given inputs from the 'caller' (though this is not necessary.)

A function takes any number of 'parameters', which are inputs to the function. The function then performs some operation, and may 'return' some value to the program that called the function.

Variables instantiated within functions have local scope.

Difference between functions and methods

A function can be considered a special type of module returning a single value through its name; a method (also known as a module) does not return a value (though of course it can modify parameters which were passed to it by reference).

Scope and Lifetime

The scope of an identifier (an identifier can usually be taken to be a variable or constant) is normally the function, or block, it was defined in. (Global variables were defined within the scope of the entire program, which can be considered to be a block.) Local and global scope were discussed above, with regards to variables. (Constants almost always have global scope, unless a constant is specific to some function or block.)

The 'lifetime' of an identifier is defined by the point at which it is destroyed, as it is no longer needed. The lifetime of a global variable is the lifetime of the program itself; they are only destroyed when the program finishes executing. The lifetime of a local variable is the lifetime of its function or block.

Parameters

Parameters are variables which are given to a function for processing.

Parameters may be passed using one of two methods:

  • Value: The variable itself is copied when it is passed to the function. Modifications to the variable within the function have no effect on the variable which was originally passed to the function; the copy is entirely detached from the original. This may require large amounts of memory if the variable is large.
  • Reference: A pointer to the variable is given to the function. Modifications to the variable within the function modify the variable which was passed to the function; the variable given to the function is the same as the one which was originally passed to it. This requires less memory than passing by value (unless, of course, the variable is very small).

Errors

An error is a mistake in a program that causes it to function in some manner other than that which was expected.

There are three main types of errors.

  • Syntax errors: These occur when a program has been written in such a way that it uses keywords that have not been defined. For example, a syntax error would occur if any keyword or statement was spelt incorrectly in the program (expect, of course, those in strings).
  • Runtime errors: These occur when a program is syntactically correct but an undefined operation is performed, such as attempting to use a variable that has not been defined in an expression, or dividing by zero.
  • Logical errors: These are the hardest to fix, and occur when a programmer made an error in writing the program that caused it to work in a way that is unintended, but is distinct from a syntax or runtime error. An example is iterating too few or too many times in a fixed iteration control structure (i.e. a for loop).

Pseudocode

This section will focus on describing examples of pseudocode syntax to use in the WACE examination.

Initial steps

Every program's main block should begin with the keyword MODULE, followed by the module name (Main) and end with the keyword END, again followed by Main.

For example:

MODULE Main

     ...

END Main

(Of course one would normally write END after they had written the rest of the program, as it is difficult to predict how much space some pseudocode will require before writing it.)

Each statement is typically terminated by a newline.

There may be other module or function definitions prior to the start of the Main module's definition.

Input / Output

Input and output are performed by calling the functions INPUT and OUTPUT, respectively. Since these functions are built in, it is usually better to name them in all uppercase, although it is unlikely you will lose marks if you do not do so.

The function INPUT takes as an argument a variable name (which is passed by reference); it then outputs a line of input to this variable.

MODULE Main
     INPUT(test)
     OUTPUT("Hello World!")
END Main

Variables

A variable name may be written in camelCase or PascalCase

MODULE Main
     Num ← 123
     OUTPUT(Num)
END Main

Variable assignment is done using a left arrow: ←

Operators

The standard mathematical operators +, -, × and ∕ (representing ÷) are used to represent their respective mathematical operations.

Note that the operation + is defined for strings, or combinations of strings and numbers.

MODULE Main
     INPUT(Colour)
     Input(N)
     OUTPUT("Your favourite colour is " + Colour + ".")
     OUTPUT(N × 2 + " is more than your favourite number.")
END Main

Comparison

Unlike in most programming languages, where variable comparison uses two equals signs, pseudocode only needs to use one (since variable assignment is not an equals sign).

You may also use < and > to represent less than and more than.

Control structures

All of the control structures are given by example in the following code sample. Every loop has the same function, and the same output.

MODULE Main
     i ← 0
     WHILE (i < 10)
          IF (i%2 = 0) THEN
               OUTPUT(i)
          ELSE
               OUTPUT("ODD")
          END IF
     END WHILE

     FOR i ← 0 TO 10 DO
          IF (i%2 = 0) THEN
               OUTPUT(i)
          END IF

          IF (i%2 = 1) THEN
               OUTPUT("ODD")
          END IF
     END FOR

     k ← 0
     REPEAT
          k ← k+1
          CASE (i%2) OF
               0: OUTPUT(i)
               1: OUTPUT("ODD")
          END CASE
     UNTIL (k = 10)
END Main

Functions

If one wants to use a function that is heavily implementation-dependent, such as a function that gets the length of a string, you do not have to write the function body; just make sure that the name is descriptive enough for the marker to know what you're attempting to do.

MODULE Main
     OUTPUT(Length("Test"))
END Main

In the code sample above, the function Length has been pulled out of thin air; this is acceptable because the internal storage format of a string is not necessarily known, and most programming languages do indeed have a function to the effect of Length.

User-Defined Functions

To create a user-defined function, use the keyword FUNCTION:

FUNCTION GetAverage(TotalCost, NumberItems)
     GetAverage ← TotalCost / NumberItems
END GetAverage

MODULE Main
     INPUT(TotalCost)
     INPUT(NumberItems)
     OUTPUT(GetAverage(TotalCost, NumberItems))
END Main

Note that there is no explicit return statement, like in many programming languages; instead, the function name is assumed to be the return type.

Function names should generally be PascalCase.

There generally should not be input or output statements in functions.

Modules

To write a new module, use the MODULE keyword. (The Main module is automatically defined as the 'entry point' of the program, i.e. where it begins executing from.) A module operates similarly to a function, but there are two main differences:

  • It does not return a value; any values it may need to return must be returned through reference parameters (or the module may be changed into a function.)
  • It is called like a function, but prepended with the keyword CALL.

An example of a module is demonstrated below. (Other than, of course, the Main module.) The example below asks the user how many numbers they want to input; then, it reads that any numbers from the input, and outputs their sum.

MODULE Add(NumInputs)
     Total ← 0
     FOR (i ← 0; i < NumInputs; i ← i+1)
          INPUT(Current)
          Total ← Total + Current
     END FOR
     OUTPUT(Total)
END Add

MODULE Main
     INPUT(NumInputs)
     CALL Add(NumInputs)
END Main

Module names should generally be PascalCase.

Trace Tables

A trace table is a table which is used to test the logic of an algorithm (i.e. find and eliminate logical errors).

Below is an example of a code sample and corresponding trace table. (This is not a part of a full modularised code sample, so there is no MODULE Main statement.)

Lines that do not explicitly perform any operations (e.g. lines 3 and 6) are omitted. Every line is on a new row of the trace table.

The data input is as follows:

2, 3, 6, 5, 7, 999.

1 Largest ← 0
2 INPUT(Number)
3 REPEAT
4   IF Number > Largest THEN
5     Largest ← Number
6   END IF
7   INPUT(Number)
8 UNTIL (Number = 999)
9 OUTPUT("The largest number is "+Largest)

(The numbers on the left margin are not part of the pseudocode, they are line numbers.)

Obviously if there are multiple repeat or if statements, the corresponding columns in the trace table would be replaced with something more descriptive, e.g. Number > Largest.

Line Largest Number Repeat If Output
1 0
2 2
4 T
5 2
7 3
8 F
4 T
5 3
7 6
8 F
4 T
5 6
7 5
8 F
4 F
7 7
8 F
4 T
5 7
7 999
8 T
9 The largest number is 7

Note that only the variables which have been modified since their last assignment are included in the trace table.

Condensed Trace Table

A condensed trace table is a more compact method of achieving essentially the same goal.

The procedure for writing a condensed trace table is essentially the same as for writing a normal trace table. However, the line number column is removed.

Essentially, one steps through the program. Every time a variable is assigned, or a comparison made, the result is written according to the following rules, executed in order:

  1. If the space for the variable in the last row is empty, write it there.
  2. Create a new row and write the variable.

Therefore, one can have multiple lines of execution on the same line, but every time a new row is created, all new variables or other expressions are written on that new row.

Here is an example for the code sample above.

Largest Number Repeat If Output
0 2 T
2 3 F T
3 6 F T
6 5 F F
7 7 F T
999 T The largest number is 7

Data Validation

Data validation involves ensuring inputs are within constraints set by the program.

  • Range Checking: This involves ensuring that values are within a given range, e.g. ages should not be negative.
  • Type Checking: This involves ensuring that variables are the correct type, e.g. ages should not be strings.

results matching ""

    No results matching ""