brainfucked.py again. Or: How not to optimise

After slaving over a warm laptop (while getting lost on the train), I re-wrote my bf interpreter to use a kind of psudo-bytecode to optimise jumps, and big sets of increments and movement operations. Below is the new code.

#!/usr/bin/python

"""
brainfucked.py a simple (and hopefully easy to understand)
Brainfuck interpreter written in Python.

Official page:   http://www.muppetlabs.com/~breadbox/bf/
Wikipedia:       http://en.wikipedia.org/wiki/Brainfuck
Program archive: .


Copyright (C) 2007  Matthew Davey

This program is free software; you can redistribute it and/or
modify it under the terms of the GNU General Public License
as published by the Free Software Foundation; either version 2
of the License, or (at your option) any later version.

This program is distributed in the hope that it will be useful,
but WITHOUT ANY WARRANTY; without even the implied warranty of
MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
GNU General Public License for more details.

You should have received a copy of the GNU General Public License
along with this program; if not, write to the Free Software
Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA  
02110-1301, USA.
"""


from time import time
import sys
import re
import os


class BytecodeInstruction:

    OP_MOV        = 1  
    OP_INC        = 2
    OP_LOOP_START = 3
    OP_LOOP_END   = 4
    OP_INPUT      = 5
    OP_OUTPUT     = 6

    def __init__(self, opcode, value):
        self.opcode = opcode
        self.value  = value

    def __str__(self):
        if self.opcode == self.OP_MOV:
            return "MOVE %d" % (self.value,)
        elif self.opcode == self.OP_INC:
            return "INCREMENT %d" % (self.value,)
        elif self.opcode == self.OP_LOOP_START:
            return "OPEN LOOP (END: %d)" % (self.value,)
        elif self.opcode == self.OP_LOOP_END:
            return "CLOSE LOOP (START: %d)" % (self.value,)
        elif self.opcode == self.OP_INPUT:
            return "INPUT"
        elif self.opcode == self.OP_OUTPUT:
            return "OUTPUT"


class BytecodeException(Exception):
    pass


class BrainfuckedInterpreter:

    def compile_to_bytecode(self, program):
        """
        Turn the passed brainfuck program into our pretend bytecode  

        TODO: Check for NO_OPs  (OP_MOV, 0)  or  (OP_INC, 0)
        TODO: Try to find some patterns that can be expressed simplier
        """        

        # Remove all non-instructions
        program = re.findall("[[\]<>+-.,]", program)

        # Holds the generated bytecode
        bytecode = []

        # Prime the byte code to simplify the checks when compiling the first
        # instruction.  NOTE:  This should be remove at the end if it's a NOOP
        bytecode.append(BytecodeInstruction(BytecodeInstruction.OP_MOV, 0))

        # A stack for open brackets.  Holds the bytecode location, no program_data
        bracket_stack  = []

        for instruction in program:

            if instruction in ['+', '-']:

                # If the last instruction wasn't a OP_INC, then create a new
                # OP_INC instruction initilised to '0'
                if bytecode[-1].opcode != BytecodeInstruction.OP_INC:
                    bytecode.append(BytecodeInstruction(BytecodeInstruction.OP_INC, 0))

                if instruction == '+':
                    bytecode[-1].value += 1
                else:
                    bytecode[-1].value -= 1


            elif instruction in ['>', '<']:

                # Just like above, if the previous instruction was the same
                # type, then just change the amoutn moved                
                if bytecode[-1].opcode != BytecodeInstruction.OP_MOV:
                    bytecode.append(BytecodeInstruction(BytecodeInstruction.OP_MOV, 0))

                if instruction == '>':
                    bytecode[-1].value += 1
                else:
                    bytecode[-1].value -= 1


            elif instruction == ',':
                bytecode.append(BytecodeInstruction(BytecodeInstruction.OP_INPUT, False))


            elif instruction == '.':
                bytecode.append(BytecodeInstruction(BytecodeInstruction.OP_OUTPUT, False))


            elif instruction == '[':

                # We don't know where to jump to yet, so just store False
                bytecode.append(BytecodeInstruction(BytecodeInstruction.OP_LOOP_START, False))

                # Use a stack to store our current location so we can match
                # up brackets correctly
                bracket_stack.append(len(bytecode)-1)


            elif instruction == ']':

                # The location (bytecode, not program) of the open bracket
                try:
                    bracket_location = bracket_stack.pop()
                except IndexError:
                    raise BytecodeException("Unmatched ']' encoutered")

                # Add the closeing bracket here, and point it to the location
                # of the opening bracket               
                bytecode.append(BytecodeInstruction(BytecodeInstruction.OP_LOOP_END, bracket_location))

                # Now, we go back to the open bracket and fill in the location
                # that is should jump too when the cell = 0
                bytecode[bracket_location].value = len(bytecode) - 1


        # Make sure all the brackets have been matched
        if len(bracket_stack) != 0:
            raise BytecodeException("Unmatched '[' encoutered")

        return bytecode



    def run_bytecode(self, bytecode, use_stdout = False, debug = False):

        # Program output
        output = ''

        # Out Tape/Memory
        memory         = [0]
        memory_pointer = 0

        # How we are doing processing the bytecode
        bytecode_length  = len(bytecode)
        bytecode_pointer = 0


        while bytecode_pointer < bytecode_length:

            instruction = bytecode[bytecode_pointer]

            if debug:
                print instruction


            if instruction.opcode == BytecodeInstruction.OP_INC:
                memory[memory_pointer] += instruction.value

            elif instruction.opcode == BytecodeInstruction.OP_MOV:

                # Can't move before cell: 0
                if memory_pointer + instruction.value < 0:
                    raise Exception("Tried to move before the start of the memory block")

                # Are we going past the end of the list?  Then we need to expand it
                if memory_pointer + instruction.value > len(memory)-1:

                    # Wow, talk about naive :)
                    # FIXME later
                    for i in range(0, instruction.value):
                        memory.append(0)

                memory_pointer += instruction.value

            elif instruction.opcode == BytecodeInstruction.OP_LOOP_START:
                if memory[memory_pointer] == 0:
                    bytecode_pointer = instruction.value


            elif instruction.opcode == BytecodeInstruction.OP_LOOP_END:
                if memory[memory_pointer] != 0:
                    bytecode_pointer = instruction.value


            elif instruction.opcode == BytecodeInstruction.OP_INPUT:
                char = sys.stdin.read(1)

                if char == '':
                    memory[memory_pointer] = 0
                else:
                    # Remember to turn character 'A' into it's ASCII number
                    memory[memory_pointer] = ord(char)            


            elif instruction.opcode == BytecodeInstruction.OP_OUTPUT:     
                output += chr(memory[memory_pointer])

                if use_stdout:
                    sys.stdout.write(chr(memory[memory_pointer]))            


            if debug:
                print memory


            bytecode_pointer += 1


        return output




if __name__ == '__main__':

    usage = "Usage: %s <filename> [enable_timer: True | False]" % (sys.argv[0],)
    timer = False

    if len(sys.argv) not in [2,3] or sys.argv[1] == 'help':
        print usage
        sys.exit(1)


    if not os.path.exists(sys.argv[1]):
        print usage
        print "File not found"
        sys.exit(1)


    try:
        program = open(sys.argv[1]).read()
    except Exception, e:
        print usage
        print "Unable to open file: %s" % (e.__str__(),)
        sys.exit(1)


    if len(sys.argv) == 3:
        if sys.argv[2] not in ['True', 'False']:
            print usage
            print "Second argument must be 'True' or 'False'"
            sys.exit(1)

        if sys.argv[2] == 'True':
            timer = True


    if timer:
        start_time = time()


    bf = BrainfuckedInterpreter()

    bytecode = bf.compile_to_bytecode(program)

    if timer:
        compile_time = time()

    bf.run_bytecode(bytecode, use_stdout=True)
    # print bf.run_bytecode(bytecode, use_stdout=False)


    if timer:
        print
        print "Elapsed time: %0.2f  Compile time: %0.2f" % (time() - start_time, compile_time - start_time)

Now, proof that all my work paid off…

matthewd@wintermute:~/brainfucked$ time echo 50 | python brainfucked_old.py prime.bf
Primes up to: 2 3 5 7 11 13 17 19 23 29 31 37 41 43 47

real    0m13.706s
user    0m13.653s
sys     0m0.028s
matthewd@wintermute:~/brainfucked$ time echo 50 | python brainfucked.py prime.bf
Primes up to: 2 3 5 7 11 13 17 19 23 29 31 37 41 43 47

real    0m16.305s
user    0m16.293s
sys     0m0.000s

I can assure you, that’s not what I expected either.

‘Optimsing’ this program has actually taught me a valuable lesson: ‘Never assume you know what needs fixing or speeding up’. I added more complexity to this program trying to improve it’s speed without once profiling or even adding a single extra time() statement. I blindly assumed I knew what the problem was, and dived straight into fixing it without another thought.

Still, it will make it easier to implement a Ook!, whitespace, or other Brainfuck derivatives now :-)

Posted on

Real editors

I’m neither a Vim or Emacs user, but this post makes me wonder why not.

ktr73 posted the Vim equiv. to reddit

map \( s()<Esc>P
map \[ s[]<Esc>P

Posted on

Rewriting from Scratch

Joel on Software: Things You Should Never Do, Part I

Since that article was originally posted, I’ve read it three or four times, and each time I get more out of it. When working on a non-trivial program there is a real desire to throw away the code and ‘do-it-right’, one rarely wonders just how hard would it be to fixed the program, one problem at a time. Now that I’ve finally got my chance to ‘do-it-right’, I feel that there’s a distinct possibility that it will cause more problems than it will ever solve.

Posted on

brainfucked.py

Code:

#!/usr/bin/python

"""
brainfucked.py a simple (and hopefully easy to understand) Brainfuck interrupter
written in Python.

Official page:   http://www.muppetlabs.com/~breadbox/bf/
Wikipedia:       http://en.wikipedia.org/wiki/Brainfuck
Program archive: http://esoteric.sange.fi/brainfuck/bf-source/prog/

Only one program confirmed not working:
 * Ben Olmstead 99 Bottles of Beer.  Appears to be caused by non-wrapping 8bit
   cells (after 91 bottles, cell becomes -1 and then enters infinte loop)


Copyright (C) 2007  Matthew Davey  <buzzard AT project DASH 2501 DOT net>

This program is free software; you can redistribute it and/or
modify it under the terms of the GNU General Public License
as published by the Free Software Foundation; either version 2
of the License, or (at your option) any later version.

This program is distributed in the hope that it will be useful,
but WITHOUT ANY WARRANTY; without even the implied warranty of
MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
GNU General Public License for more details.

You should have received a copy of the GNU General Public License
along with this program; if not, write to the Free Software
Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA  02110-1301, USA.
"""


from time import time
import sys
import re
import os


"""
Find matching '[' and ']' brackets.  Use a FIFO stack to keep brackets matched,
and a dictionary to map both left->right and right->left.
"""
def find_matching_brackets(program_data):

    stack = []
    brackets = {}

    program_length = len(program_data)
    program_pointer = 0

    while program_pointer < program_length:

        if program_data[program_pointer] == '[':
            stack.append(program_pointer)

        elif program_data[program_pointer] == ']':
            if len(stack) == 0:
                raise Exception("Unmatched brackets, extra closing ']'")

            # Pull the location of the last open bracket from the stack
            val = stack.pop()

            # Creating a mapping left->right as well as right->left
            brackets[val] = program_pointer;
            brackets[program_pointer] = val;

        program_pointer += 1

    if len(stack) > 0:
        raise Exception("Unmatched brackets, too many opening '['")

    return brackets



"""
Runs a Brainfuck program, passed as a string.  All comments are striped out at
start.

 * use_stdout flag indicates is we should print after each '.'
 * strict_eight_bit_cell indicates if we should enforce each cells range to be 0
   to 255 inclusive with no wrapping

Function returns outputted characters as a string (even if use_stdout is True)
"""
def run_program(program, use_stdout = False, strict_eight_bit_cell = False):

    # Remove all non-instructions
    program = re.findall("[[\]<>+-.,]", program)

    # Contains all printed characters
    output = ''

    # Points to the location of the current instruction
    instruction_pointer = 0
    program_length = len(program)

    # Memory or Tape.  Initialised to a single cell with the value 0
    tape = [0]

    # Pointer to the 'head' running over the tape
    tape_pointer = 0

    # Returns a dictionary of matching left->right and right->left brackets
    brackets = find_matching_brackets(program)

    # Until end of the program
    while instruction_pointer < program_length:

        current_operator = program[instruction_pointer]


        # Move to next cell
        if current_operator == '>':

            tape_pointer += 1

            # We are going passed the end of the tape
            if tape_pointer > len(tape)-1:
                tape.append(0)


        # Move to previous cell
        elif current_operator == '<':

            # Test for leaving the start of the tape
            if tape_pointer <= 0:
                raise Exception("Tried to move left 1 cell at position 0")

            tape_pointer -= 1


        # Decrement current cell
        elif current_operator == '+':
            tape[tape_pointer] += 1

            if strict_eight_bit_cell and tape[tape_pointer] > 0xff:
                raise Exception("Tried to increment past 255")


        # Increment current cell
        elif current_operator == '-':
            tape[tape_pointer] -= 1

            if strict_eight_bit_cell and tape[tape_pointer] < 0:
                raise Exception("Tried to decrement below 0")


        # Open bracket  
        elif current_operator == '[':
            if tape[tape_pointer] == 0:
                # We know the left bracket, so jump to the right one
                instruction_pointer = brackets[instruction_pointer]


        # Close bracket
        elif current_operator == ']':
            if tape[tape_pointer] != 0:
                # We know the right bracket, so jump to the left one
                instruction_pointer = brackets[instruction_pointer]


        # Read a single character, set current cell to '0' on EOF
        elif current_operator == ',':
            char = sys.stdin.read(1)

            if char == '':
                tape[tape_pointer] = 0
            else:
                # Remember to turn character 'A' into it's ASCII number
                tape[tape_pointer] = ord(char)


        # Record outputted character, and if use_stdout is True, output the
        # character as well (to stdout)
        elif current_operator == '.':
            output += chr(tape[tape_pointer])

            if use_stdout:
                sys.stdout.write(chr(tape[tape_pointer]))


        instruction_pointer += 1

    return output



if __name__ == '__main__':

    usage = "Usage: %s <filename> [enable_timer: True | False]" % (sys.argv[0],)
    timer = False

    if len(sys.argv) not in [2,3] or sys.argv[1] == 'help':
        print usage
        sys.exit(1)


    if not os.path.exists(sys.argv[1]):
        print usage
        print "File not found"
        sys.exit(1)


    try:
        program = open(sys.argv[1]).read()
    except Exception, e:
        print usage
        print "Unable to open file: %s" % (e.__str__(),)
        sys.exit(1)


    if len(sys.argv) == 3:
        if sys.argv[2] not in ['True', 'False']:
            print usage
            print "Second argument must be 'True' or 'False'"
            sys.exit(1)

        if sys.argv[2] == 'True':
            timer = True


    if timer:
        start_time = time()


    run_program(program, use_stdout = True, strict_eight_bit_cell = False)


    if timer:
        print
        print "Elapsed time: %0.2f" % (time() - start_time,)

Example output:

D:\code>brainfucked.py triangle.bf True
                                *
                               * *
                              *   *
                             * * * *
                            *       *
                           * *     * *
                          *   *   *   *
                         * * * * * * * *
                        *               *
                       * *             * *
                      *   *           *   *
                     * * * *         * * * *
                    *       *       *       *
                   * *     * *     * *     * *
                  *   *   *   *   *   *   *   *
                 * * * * * * * * * * * * * * * *
                *                               *
               * *                             * *
              *   *                           *   *
             * * * *                         * * * *
            *       *                       *       *
           * *     * *                     * *     * *
          *   *   *   *                   *   *   *   *
         * * * * * * * *                 * * * * * * * *
        *               *               *               *
       * *             * *             * *             * *
      *   *           *   *           *   *           *   *
     * * * *         * * * *         * * * *         * * * *
    *       *       *       *       *       *       *       *
   * *     * *     * *     * *     * *     * *     * *     * *
  *   *   *   *   *   *   *   *   *   *   *   *   *   *   *   *
 * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * *

Elapsed time: 0.13

Posted on

WWDC07

Just for the record; my Leopard predictions:

  • ZFS will be default file system
  • Time Machine will thus become a free feature due to copy-on-write

Edit: Why ZFS is cool [PDF]

Posted on

My weekend timesink

I hope everybody’s weekend was just as productive as my own:

300 Tailoring

Posted on

Lightsabers are cool

Posted on

What’s wrong with this code

Ten points if you can figure out why the below code doesn’t work as expected

$s = "$123,456.78";

if(preg_match("#^\$([,0-9]+)\.\d\d$#", $s, $matches) == 1)
{
    echo "It works!\n";
}

Yep, because I was using double quotes. PHP thought that ‘\$’ meant the literal ‘$’ (as opposed to a variable substitution) not ‘\$’ which was what I wanted to pass to preg_match.

I should have either escaped the back slash (’\\$‘) or just used single quotes (as they don’t support variable substitution).

Posted on

“This is not a rootkit...”

matthewd@carbonara:~$ telnet 10.10.42.45 9090
Trying 10.10.42.45...
Connected to 10.10.42.45.
Escape character is '^]'.


This is not a rootkit or other backdoor, it's a BitTorrent
client. Really. Why should you be worried, can't you read this
reassuring message? Now just listen to this social engi, er, I mean,
completely truthful statement, and go about your business. Your box is
safe and completely impregnable, the marketing hype for your OS even
says so. You can believe everything you read. Now move along, nothing
to see here.Connection closed by foreign host.
matthewd@carbonara:~$

Posted on

Future Food

There is a revolution happening in the farm fields and on the dinner tables of America — a revolution that is transforming the very nature … all » of the food we eat.

THE FUTURE OF FOOD offers an in-depth investigation into the disturbing truth behind the unlabeled, patented, genetically engineered foods that have quietly filled U.S. grocery store shelves for the past decade.

Watch @ Google Video

Posted on