Bitter Compiler

Bitter is a demonstration language for my compiler design class.  

Description of the Bitter language:

Bitter


Bitter Semantics

    Bitter is a bit string manipulation language.  A bit string is a sequence of '!' and '.' characters.  Bitter can concatenate two strings or complement a string.  For example, the Bitter statements
        _x = !.! + ...  |
        _y = -(_x + .)  |
set the variable _x to the string "!.!" concatenated with "...", which is the value "!.!...", and the variable _y to the complement of "!.!....", which is ".!.!!!!".  The '+' operator concatenates two strings, and the '-' operator complements a string by flipping every '!' bit to a '.' bit and vice versa.  The parentheses in the second statement are necessary because '-' takes precedence over '+'; without them, only _x would be complemented before the '.' is appended.
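The two operators can be modeled in a few lines of Python.  This is only an illustrative sketch and not part of the compiler: the function names concat and complement are made up here, and bit strings are represented as ordinary Python strings of '!' and '.' characters.

```python
def concat(a: str, b: str) -> str:
    """Model of the Bitter '+' operator: join two bit strings."""
    return a + b


def complement(s: str) -> str:
    """Model of the Bitter '-' operator: flip '!' to '.' and '.' to '!'."""
    return s.translate(str.maketrans("!.", ".!"))


# The two example statements from the text.
x = concat("!.!", "...")
y = complement(concat(x, "."))
print(x)  # !.!...
print(y)  # .!.!!!!
```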
    A bit string can be at most 32 bits long.  It is occasionally desirable to clear a variable (set it to the bit string of length 0).  This can be done with a statement like
        _clear(_x, _y, _z)   |
which sets the variables _x, _y, and _z all to the empty string.
    I/O in Bitter is performed using the special variables _in and _out.  Assigning to _out has the side effect of printing the new value at the console.  Referencing _in has the side effect of reading a new value into the variable before the reference is processed.

Bitter Syntax

*  The notation '\|' is used to indicate that the character '|' is a token, rather than part of the BNF syntax.

BTM Semantics

    The Bitter Target Machine (BTM) has two data types: a long word of 32 bits and a quad word of 64 bits.  The quad word at address t consists of the long words at t and t+4.  A Bitter string is stored in the quad word at t by storing its bit pattern in the long word at t and its length in the long word at t+4.
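As a rough illustration of this layout, the following Python sketch packs a bit string into a (pattern, length) pair of long words.  The text does not say how the pattern is aligned within the long word or which character maps to a 1 bit; the sketch assumes '!' is 1, '.' is 0, and the pattern is right-aligned with the first character in the highest used bit.  The names pack and unpack are hypothetical.

```python
from typing import Tuple

MAX_BITS = 32  # a long word holds 32 bits


def pack(s: str) -> Tuple[int, int]:
    """Return (pattern, length): the long words at t and t+4."""
    if len(s) > MAX_BITS:
        raise ValueError("bit string longer than 32 bits")
    pattern = 0
    for ch in s:
        # Assumed mapping: '!' is a 1 bit, anything else ('.') is a 0 bit.
        pattern = (pattern << 1) | (1 if ch == "!" else 0)
    return pattern, len(s)


def unpack(pattern: int, length: int) -> str:
    """Rebuild the bit string from its pattern and length."""
    bits = []
    for i in range(length - 1, -1, -1):
        bits.append("!" if (pattern >> i) & 1 else ".")
    return "".join(bits)


print(pack("!.!..."))           # (40, 6)
print(unpack(*pack("!.!...")))  # !.!...
```

The length word is needed because leading '.' bits would otherwise be indistinguishable from unused bits of the pattern.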
    The memory of the BTM consists of a register file and a separate instruction memory.  The t register holds the result of each expression.  Other registers hold intermediate results.  A binary operator evaluates its first operand, retrieves the result from the t register, and saves it in a temporary register.  It then evaluates its second operand and takes that result from the t register.  Finally, it operates on the two operands and stores its result in the t register, where other operators can use it.
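The evaluation order for a binary operator can be traced in Python for the '+' operator.  This is a sketch under stated assumptions, not the BTM itself: the register file is modeled as a dictionary of (pattern, length) pairs, '!' is taken to be a 1 bit, patterns are assumed to be right-aligned, and eval_const and eval_concat are invented names.

```python
import itertools

MASK32 = 0xFFFFFFFF           # long words are 32 bits wide
regs = {"t": (0, 0)}          # register file: name -> (pattern, length)
_temp_ids = itertools.count(1)


def eval_const(s):
    """Evaluate a string constant: leave its (pattern, length) in t."""
    digits = s.replace("!", "1").replace(".", "0")
    regs["t"] = (int(digits or "0", 2), len(s))


def eval_concat(first, second):
    """Evaluate first + second using a fresh temporary register."""
    first()                            # first operand's result is now in t
    tmp = "t%d" % next(_temp_ids)      # fresh name, so nesting is safe
    regs[tmp] = regs["t"]              # save it in the temporary register
    second()                           # second operand's result is now in t
    p1, n1 = regs[tmp]
    p2, n2 = regs["t"]
    # Operate on the two operands and leave the result in t.
    regs["t"] = (((p1 << n2) | p2) & MASK32, n1 + n2)


# !.! + ...
eval_concat(lambda: eval_const("!.!"), lambda: eval_const("..."))
print(regs["t"])  # (40, 6)
```

A fresh temporary is taken for every operator so that a nested concatenation in the second operand cannot overwrite the saved first operand.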
    Input and output in Bitter are accomplished by calling the library routines input and output.

BTM Syntax

    Each instruction to the BTM is either a triplet, consisting of a verb, a source operand, and a destination operand, or a pair, consisting of a verb and a destination operand.

In BTM assembly, comments begin with a semicolon and end at the end of the line.  Constants can be written either in decimal, such as d$\32, or in binary, such as b$\100000 (the same value).

BTM Code Generation
 
The following code templates specify the code written by the compiler.  For each construct, the syntactic source construct is given first, followed by the resulting target code.
 

Program 
    _start statements _stop
     
    declareq t
    declareq _in 
    declareq _out 
    statements
Assignment 
    id = exp |

    declareq id
    movq t id
     

    if id = _out then call output id

Concatenation 
    exp1 + exp2
        
    exp1 
    declareq t1 
    movq t t1 
    exp2 

    shiftl t+4 t1
    declarel t2 
    movl d$\32 t2 
    subl t+4 t2 
    shiftl t2
    unshiftl t2
    orl t1
    addl t1+4 t+4

Complement 
    - exp
     
    compl t
Variable Reference 
    id
        

    movq id

    if id = _in then call input id

String Constant 
    str
     
    movl b$str.value
    movl d$str.length t+4
Clear 
    _clear( ..., id, ...) |
        
    declareq id
    clrq id
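
The templates above can be read as a recursive walk over the expression tree.  The Python sketch below shows one way such a walk might look; it is not the class compiler.  Several things are assumptions: the tuple-based tree format, the function names, the mapping of '!' to 1 and '.' to 0, the choice to write constants in the d$\32 and b$\100000 style, and the placement of operand code before "compl t" and before the assignment lines (the Complement and Assignment templates do not show where the operand code goes).  The input call is emitted in the order the Variable Reference template lists it.

```python
import itertools

_temps = itertools.count(1)  # fresh temporary names: t1, t2, ...


def new_temp():
    return "t%d" % next(_temps)


def gen_exp(node):
    """Return the BTM lines for an expression tree.

    Nodes are tuples: ("str", "!.!"), ("id", "_x"),
    ("neg", exp) or ("cat", exp1, exp2).
    """
    kind = node[0]
    if kind == "str":                      # String Constant
        bits = node[1].replace("!", "1").replace(".", "0")  # assumed mapping
        return ["movl b$\\" + bits, "movl d$\\%d t+4" % len(bits)]
    if kind == "id":                       # Variable Reference
        code = ["movq " + node[1]]
        if node[1] == "_in":
            code.append("call input _in")  # order as listed in the template
        return code
    if kind == "neg":                      # Complement
        return gen_exp(node[1]) + ["compl t"]
    if kind == "cat":                      # Concatenation
        t1, t2 = new_temp(), new_temp()
        return (gen_exp(node[1])
                + ["declareq " + t1, "movq t " + t1]
                + gen_exp(node[2])
                + ["shiftl t+4 " + t1, "declarel " + t2,
                   "movl d$\\32 " + t2, "subl t+4 " + t2,
                   "shiftl " + t2, "unshiftl " + t2,
                   "orl " + t1, "addl " + t1 + "+4 t+4"])
    raise ValueError("unknown node kind: %r" % (kind,))


def gen_assign(name, exp):
    """Return the BTM lines for the statement: name = exp |"""
    code = gen_exp(exp) + ["declareq " + name, "movq t " + name]
    if name == "_out":
        code.append("call output _out")
    return code


# _y = -(_x + .)  |
program = gen_assign("_y", ("neg", ("cat", ("id", "_x"), ("str", "."))))
print("\n".join(program))
```

Each call to gen_exp for a concatenation takes its own pair of temporaries, so nested '+' operators do not reuse t1 and t2.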