class: center, middle

# Lexical Analysis - Part 2

_CMPU 331 - Compilers_

---

# Update

* Assignment 1 is posted on the calendar
* The lexer project instructions are on the calendar and in GitHub

---

# Finite-state machines

Remember finite-state machines from CMPU 240?

* Regular expressions are powerful, but can be slow (and memory-hungry)
* One way of optimizing a lexer is to transform it into a DFA

---

# Lexer Project

* Accept the assignment in GitHub
* Who will be working in pairs?
* Clone the repository to your machine
* Who will be working on their own laptop?
* Do you have git and Python 3 installed already?
* Add token definitions with regular expressions
* Commit your changes, and push them back to GitHub

---

class: center, middle

# Python Flash Cards

---

# Python Regular Expressions

* **`.`** matches any character (except a newline, by default)
* **`*`** zero or more repetitions
* **`+`** one or more repetitions
* **`\d`** a digit
* **`[a-z]`** a character class
* **`\s`** a whitespace character (space, tab, newline, ...)
* **`\t`** a tab
* **`\n`** a newline
* **`|`** alternation, like **`a|b`**
* **`(...)`** a group, like **`(a|b)+`** (spaces inside a pattern are matched literally)

---

# Python Regular Expressions

Characters with special meaning in regular expressions need to be escaped to match them literally:

* **`\+`** a literal plus character
* **`\*`** a literal asterisk
* **`\(`** a literal open parenthesis
* **`\)`** a literal close parenthesis
* **`\[`** a literal open square bracket
* **`\]`** a literal close square bracket
* **`\{`** a literal open curly bracket
* **`\}`** a literal close curly bracket

---

class: center, middle

# Tour of SLY
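
---

# Regular Expressions as Token Definitions

The flash cards can be tried out before touching SLY. What follows is a minimal sketch that uses only Python's standard `re` module. The token names (`NUMBER`, `ID`, `PLUS`, `TIMES`, `LPAREN`, `RPAREN`, `SKIP`) and the `tokenize` helper are hypothetical, picked for illustration; they are not taken from the lexer project or from SLY.

```python
import re

# Hypothetical token definitions: (name, regular expression) pairs.
# Note the escaped metacharacters: \+ \* \( \)
TOKEN_SPEC = [
    ("NUMBER", r"\d+"),       # one or more digits
    ("ID", r"[a-z]+"),        # one or more lowercase letters
    ("PLUS", r"\+"),          # a literal plus
    ("TIMES", r"\*"),         # a literal asterisk
    ("LPAREN", r"\("),        # a literal open parenthesis
    ("RPAREN", r"\)"),        # a literal close parenthesis
    ("SKIP", r"[ \t\n]+"),    # spaces, tabs, newlines (discarded)
]

# Combine the definitions into one pattern using alternation (|)
# and named groups, so a match reports which token it found.
MASTER = re.compile(
    "|".join(f"(?P<{name}>{pattern})" for name, pattern in TOKEN_SPEC)
)


def tokenize(text):
    """Return a list of (token name, matched text) pairs."""
    tokens = []
    pos = 0
    while pos < len(text):
        match = MASTER.match(text, pos)
        if match is None:
            raise ValueError(
                f"unexpected character {text[pos]!r} at position {pos}"
            )
        if match.lastgroup != "SKIP":
            tokens.append((match.lastgroup, match.group()))
        pos = match.end()
    return tokens


print(tokenize("(a + 12) * b"))
```

Removing the backslash from `r"\+"` makes `re.compile` raise an error, because an unescaped `+` is a repetition operator with nothing before it to repeat.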