1 Abstraction
Computational thinking centers on designing abstract models of problems, applying computational steps and efficient algorithms to solve problems—a concept that not only serves computer science (CS), but gradually permeates science and everyday life.
Abstraction is at the heart of computational thinking and the subject of this article.
Abstraction has always been an important concept in computer science, and the emphasis on computational thinking highlights its importance when teaching computing to a broad audience.
In computer science, abstractions need not be rooted in physical reality, so we find useful abstractions everywhere. Consider quantum mechanics, for example: it has a derived computational abstraction called "quantum circuits", which starts from physical concepts and has catalyzed both programming languages for simulation and theoretical algorithms that exploit its unique capabilities, algorithms expected one day to run on large quantum machines.
An "abstraction" in computer science typically includes the following:
A data model contains one or more types of data and possible relationships between the data. For example, an undirected graph can be described as consisting of nodes and edges, with each edge connecting two nodes.
A programming language for manipulating that data. This might be a traditional programming language, or it might offer only a few specific operations. The language always has a formal semantics: a specification of how programs affect the data.
Each abstraction therefore lets us design algorithms: programs that manipulate data in specific ways.
Our goal is to design "good" abstractions, and there are several measures of their quality. The ease of designing solutions using the abstraction is one important metric; for example, we will discuss in Section 3.1 how the relational model led to a surge in database usage. The resulting algorithms also have performance metrics of their own, such as running time on serial or parallel machines. Likewise, we favor abstractions whose operations can be implemented efficiently. Finally, some abstractions provide an easy way to measure the efficiency of an algorithm (for a traditional programming language, for instance, we can estimate an upper bound on the running time of a program), while other abstractions require us to first implement an algorithm, at least approximately, at a lower level before we can discuss its efficiency.
1.1 Compilation
Some levels of abstraction are too high to provide meaningful performance metrics. Therefore, operations of a high-level abstraction may need to be implemented at a lower level.
In fact, there may be many levels of abstraction, each progressively closer to the machine itself. As shown in Figure 1, the operations of a high-level abstraction (Abstraction 1) can be implemented by a lower-level abstraction (Abstraction 2), which in turn can be implemented by a still lower-level abstraction (not shown in the figure). There is an interesting sequence of abstraction levels that takes us from high-level programs to machine instructions, physical hardware, logic gates, transistors, and finally electrons. However, we focus only on the higher layers.
Figure 1. Abstraction layer and algorithm layer
Algorithms that use the operations of Abstraction 1 are compiled into algorithms in the lower-level Abstraction 2. In this article we use the term compiler in a general sense: not just a conventional compiler for the programming languages highlighted in the Dragon Book, but any algorithm that converts a program in one abstraction into a program in another, presumably lower-level, abstraction. In some cases the compilation process is simple: each operation at the higher level is replaced by one or more specific operations at the lower level. In other cases, especially when compiling from a traditional language (such as C) to a machine-level language, the translation algorithms are very complex. In still other cases, such as when the high-level abstraction uses powerful algebraic operations (for example, linear or relational algebra), optimization is critical, because a naive compilation often produces an algorithm that takes several orders of magnitude more time than the algorithm produced by an optimizing compilation.
Abstraction 2 may be close enough to the machine level to have meaningful performance metrics. If so, Abstraction 1 can inherit those metrics, providing a useful notion of quality for algorithms written in Abstraction 1. But a high-level abstraction can often be implemented by several different lower-level abstractions. Each lower-level abstraction may provide a completely different notion of running time or of some other measure, and therefore imply a different notion of algorithmic goodness at the higher level.
1.2 A taxonomy of abstractions
We can identify at least four different kinds of abstractions, classified according to their primary purpose. In the discussions that form the body of this article, we give examples of each and explore their interactions.
2 Compiler abstraction
Modern compilers refine the translation process into stages, each of which transforms one representation of the source program into another semantically equivalent representation, usually at a lower level of abstraction. Stages in a compiler typically include lexical analysis, syntactic analysis, semantic analysis, intermediate code generation, code optimization, and object code generation. A symbol table shared by all stages is used to collect and provide information about various structures in the source program. The first four stages are usually called the front end of the compiler, and the last two stages are called the back end.
Advances in compiler implementation involve many important abstractions. We will discuss three such abstractions in detail: regular expressions, context-free grammars, and flow graphs. The first two are declarative abstractions with interesting optimization stories. The third, while not declarative, also poses interesting implementation challenges.
2.1 Regular expressions and lexical analysis
Lexical analysis is the first phase of the compiler. It reads the source program as a sequence of characters and maps it to a sequence of symbols, called tokens, which is passed on to the next phase, the parser.
Example 2.1 If the source program contains the statement Fahrenheit = Celsius * 1.8 + 32, the lexical analyzer can map it to a sequence of seven tokens: <id> <=> <id> <*> <real> <+> <int>. Here <id> is the token for any program variable or identifier; the operators =, *, and + are tokens by themselves; and the two constants are converted to the tokens <real> and <int>, respectively.
A major advance in compiler construction was the creation of lexical-analyzer generators: programs such as Lex that take a description of the tokens as input and generate a program that breaks the source program into tokens and returns the sequence of tokens corresponding to the source program. The abstraction that makes Lex possible is the regular expression. Systems such as Lex that use the regular-expression abstraction add a number of useful shorthands that make regular expressions easier to write, but they do not change the set of string languages that can be defined in this abstraction.
Example 2.2 In some programming languages, the set of strings that are legal identifiers can be defined as follows:
letter = [a-zA-Z]
digit = [0-9]
id = letter(letter+digit)*
In this shorthand, an expression like a-z represents the union of the single-character strings whose ASCII codes lie between those of a and z. Written with only the original three regular-expression operators, the definition of letter would be:
a+b+…+z+A+B+…+Z
digit is defined similarly, and the set of strings for the token <id> is then defined as a letter followed by zero or more letters and/or digits.
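To make the shorthand concrete, here is a minimal sketch (ours, not part of the original article or of Lex) of the same identifier definition written with Python's re module, which uses | for union instead of +:

import re

# The Lex-style definitions above, rewritten as Python regular expressions.
LETTER = r"[a-zA-Z]"
DIGIT = r"[0-9]"
ID = rf"{LETTER}({LETTER}|{DIGIT})*"

print(re.fullmatch(ID, "Fahrenheit") is not None)   # True: a legal identifier
print(re.fullmatch(ID, "9lives") is not None)       # False: cannot start with a digit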
2.1.1 Before the Lex Program: Bibliographic Search
As is well understood from theoretical studies, the regular-expression abstraction can be compiled into one of several implementing abstractions, such as deterministic or nondeterministic finite automata (DFAs and NFAs). However, when it came to solving practical problems, some technical breakthroughs were still required.
Bell Labs took an interesting step in an early attempt to automate bibliographic search: the titles of the entire Bell Labs library were kept on magnetic tape, and software was developed that took a list of keywords and found the documents whose titles contained one of those keywords. However, when given a long list of keywords, the search was slow, because it made one pass over the tape for each keyword.
The Aho-Corasick algorithm improved on this. Instead of searching for each keyword individually, the list of keywords is treated as a single regular expression denoting the set of all strings in which any of the keywords occurs, that is:
.* (k1 + k2 + … + kn) .*   (where k1, k2, …, kn are the keywords)
Here the dot is an extended operator standing for "any character". This expression is converted into a deterministic finite automaton. Now, no matter how many keywords are involved, a single pass over the tape suffices: each title is examined once by the finite automaton to see whether any keyword appears in it.
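The following small Python sketch illustrates the same one-pass idea (it is only an illustration: Python's re engine is a backtracking matcher rather than a true DFA, and the keyword list and titles here are made up):

import re

# Combine all keywords into a single alternation so that each title is
# scanned once, rather than once per keyword.
keywords = ["automata", "compiler", "parsing"]
pattern = re.compile("|".join(re.escape(k) for k in keywords), re.IGNORECASE)

titles = [
    "A survey of parsing techniques",
    "Notes on switching circuits",
    "Compiler construction for digital computers",
]
print([t for t in titles if pattern.search(t)])   # titles containing at least one keyword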
2.1.2 Design of the lexical-analyzer generator
Essentially, a lexical-analyzer generator such as Lex works on the same idea as that embodied in Section 2.1.1. Regular expressions are written for each token, and the union operator is applied to those expressions. The resulting expression is converted into a deterministic finite automaton, which reads characters until it has found a prefix of the remaining input that matches a token; it then removes the characters read from the input, appends the token to the output stream, and repeats the process.
There are some additional considerations, because unlike keywords in bibliographic search, tokens can interact in complex ways. For example, the character string while looks like an identifier, but it is actually a keyword used for control flow in a program. Therefore, when it sees this sequence of characters, the lexical analyzer must return the token <while>, not <id>. In Lex, the order in which the regular expressions are listed in the input file breaks ambiguities like this one, so all that is required is to list keywords before identifiers, ensuring that keywords are properly recognized and not treated as identifiers. Another problem is that one token can be a prefix of another. For example, if the next input character after < is =, we do not want to recognize < as a token by itself; we want to recognize <= as the token. To avoid errors like this, the lexical analyzer is designed to keep reading as long as what it has seen so far can still be extended into a valid token accepted by the finite automaton.
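As a rough illustration of these two rules, here is a toy tokenizer sketch in Python (the token set is hypothetical; Python's alternation picks the first alternative that matches at a position, so the keyword pattern is guarded with \b to emulate the longest-match rule):

import re

TOKEN_SPEC = [
    ("WHILE",  r"while\b"),            # keyword listed before ID
    ("ID",     r"[a-zA-Z][a-zA-Z0-9]*"),
    ("REAL",   r"[0-9]+\.[0-9]+"),     # listed before INT so 1.8 is one token
    ("INT",    r"[0-9]+"),
    ("LE",     r"<="),                 # listed before LT so <= is one token
    ("LT",     r"<"),
    ("ASSIGN", r"="),
    ("OP",     r"[+*]"),
    ("SKIP",   r"\s+"),
]
MASTER = re.compile("|".join(f"(?P<{name}>{rx})" for name, rx in TOKEN_SPEC))

def tokenize(text):
    for m in MASTER.finditer(text):
        if m.lastgroup != "SKIP":
            yield (m.lastgroup, m.group())

print(list(tokenize("while x <= 32")))
# [('WHILE', 'while'), ('ID', 'x'), ('LE', '<='), ('INT', '32')]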
2.1.3 Lazy Evaluation of DFA
There is another optimization that uses the regular-expression abstraction to improve the running time of an algorithm: lazy evaluation.
You are probably familiar with the standard way of converting regular expressions into deterministic finite automata. The regular expression is first converted into a nondeterministic finite automaton by the McNaughton-Yamada algorithm. This conversion is simple and, if the regular expression has length n, generates an NFA with at most 2n states. The difficulty comes in converting the NFA into a DFA, which requires the Rabin-Scott subset construction. In the worst case, this construction can turn an NFA with 2n states into a DFA with an exponential number of states (on the order of 2^(2n)), which is impractical. In practice, however, the worst case rarely occurs.
However, in other applications of regular expressions, near-worst cases can and do occur. grep is one of the earliest UNIX commands; its name stands for "get regular expression and print". The command takes a string and determines whether it contains a substring in the language of a given regular expression. The simplest implementation converts the regular expression into an NFA, then into a DFA, and lets the DFA read the string. When the DFA is large, converting the NFA into the DFA can take far more time than scanning the string itself.
However, when a regular expression is used to scan a string only once, there are more efficient ways to implement a command such as grep. One of Ken Thompson's earliest research papers showed that instead of converting a small NFA into a large DFA, it is more efficient to simulate the NFA directly. That is, as the NFA reads the string, it is, after each character, in some set of states. We simply keep track of this set of NFA states after each character and, when the next character is read, construct the set of states reachable on that character from the previous set.
Greater efficiency can be achieved with a lazy NFA-to-DFA conversion. That is, we read the input string one character at a time and tabulate the sets of NFA states that actually result from the prefixes read so far. These sets of NFA states correspond to DFA states, so we build only the part of the DFA transition table needed to process this particular input string. If the DFA for the given regular expression is not too large, most or all of it will have been constructed before we finish processing the string, and we gain the benefit of using the DFA directly rather than reconstructing the set of NFA states after each character. But if the DFA is larger than the string, most of the DFA will never be constructed, so we get the best of both cases. This improvement was implemented in a version of grep called egrep.
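A minimal sketch of the lazy construction follows (the NFA is hand-written for the expression (a|b)*abb and is purely illustrative; a real egrep would build it from the regular expression):

# DFA states are frozensets of NFA states; a transition is built only when the
# input actually requires it, and memoized in `dfa` for reuse.
nfa = {                          # NFA state -> {symbol: set of next states}
    0: {"a": {0, 1}, "b": {0}},
    1: {"b": {2}},
    2: {"b": {3}},
    3: {},
}
start, accepting = frozenset({0}), {3}
dfa = {}                         # (frozenset of NFA states, symbol) -> frozenset

def step(states, ch):
    key = (states, ch)
    if key not in dfa:           # construct this DFA transition lazily
        dfa[key] = frozenset(set().union(*(nfa[s].get(ch, set()) for s in states)))
    return dfa[key]

def matches(text):
    states = start
    for ch in text:
        states = step(states, ch)
    return bool(states & accepting)

print(matches("ababb"), matches("abab"))        # True False
print(len(dfa), "DFA transitions were actually constructed")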
Figure 2. Syntax tree for the expression a + b * c
2.2 Context-free grammars and parsing
In the second phase of the compiler, the syntax analyzer, or "parser", maps the sequence of tokens produced by the lexical analyzer into a tree representation that makes the grammatical structure of the token sequence explicit. A typical representation is the syntax tree, in which each internal node represents some construct and the children of that node represent the components of that construct.
Example 2.3 The parser can map the token sequence a+b*c into the syntax tree shown in Figure 2. Here, E represents an expression. The operands a, b, and c are themselves expressions. But b*c is also an expression, consisting of the operator token * and two expressions b and c. At the root we see another expression that uses the operator + and two operand expressions a and b*c.
It is important to obey many conventions about operator precedence. In general, multiplication takes precedence over addition, which is why the syntax tree multiplies b by c before adding a, rather than adding a and b first.
The syntax-tree structure required for a given programming language is usually defined by a declarative abstraction, the context-free grammar (CFG), and we expect the reader to be familiar with this concept. A CFG is a collection of rules, called productions, that give ways of constructing various syntactic categories, such as expressions or statements, from other syntactic categories and terminals (the tokens produced by the lexical analyzer). For example, if E represents the syntactic category of well-formed expressions of the language, then we might find a production such as E → E + E, which says that one way to construct an expression is to place a plus sign between two smaller expressions.
2.2.1 LR(k) parsing
In the 1960s there were a series of proposals for how to construct efficient parsers from CFGs. It was recognized that for typical programming languages, provided the grammar has certain properties, it is possible to build the syntax tree during a single left-to-right scan of the program, without backtracking.
Some decisions are tricky. For example, when processing the expression a+b*c, after reading only a+b you must decide whether to combine the expressions a and b with the plus sign into a larger expression. If you look ahead one token and see *, you know that it would be incorrect to combine a and b; instead you must proceed and eventually combine b with c. Only then is it correct to combine a with the expression b*c.
This method of parsing is called "shift-reduce parsing". As the input is scanned, each step decides whether to shift the next input token by pushing it onto a stack, or to reduce the symbols at the top of the stack. When reducing, the symbols reduced must form the right side of some production of the CFG. These symbols are popped off the stack and replaced by the symbol on the left side of that production. In addition, a syntax-tree node is created for the symbol on the left side of the production; its children are the roots of the trees corresponding to the symbols just popped from the stack. If a token is popped, its tree is a single node; if a syntactic category is popped, its tree is the tree previously constructed for that symbol on the stack.
Don Knuth proposed LR(k) parsing, which applies to the most general class of grammars that can be parsed in a single left-to-right scan of the input using the shift-reduce paradigm, deciding each action correctly by looking at most k symbols ahead on the input. This work seemed to settle how parsers should be built. However, not every CFG, not even one for every typical programming language, satisfies the conditions necessary to be an LR(k) grammar for any k. While common programming languages do seem to have LR(1) grammars, i.e., grammars that permit shift-reduce parsing with only one symbol of lookahead on the input, such grammars are fairly complicated to design and often have an order of magnitude more syntactic categories than intuition would suggest.
2.2.2 The Yacc parser generator
So, after Knuth's paper, there were several attempts to find ways to parse in the LR(1) manner while making it work for simpler CFGs. We were inspired by a Princeton graduate student, Al Korenjak, whose thesis was about methods of compressing LR(1) parsers. We realized that for typical languages, it is possible to start with a non-LR(1) grammar and still build a left-to-right shift-reduce parser for it. When the grammar is not LR(1), there will be situations in which the parser cannot, in general, decide between shifting and reducing, or between reducing by two different productions. But in practice we can resolve such ambiguities by considering operator precedence and looking one token ahead in the input.
Example 2.4 Consider the situation mentioned in Example 2.3. After processing the a+b of the input a+b*c, the top of the stack holds E+E, where both a and b have already been reduced to expressions. There is a production E → E + E, which would let us reduce E+E to a single E and construct a parse-tree node labeled E whose children correspond to the subexpressions E, +, and E. But * has higher precedence than +, and we see * as the next input symbol, which means it is correct to shift * onto the stack. Later, we also shift c and then reduce c to an expression E. At that point the top of the stack holds E+E*E. We correctly reduce the top three symbols to E, leaving E+E. Now it is correct to reduce these symbols to E, since there is no more input (or the remaining input is not part of the expression, such as a semicolon ending the statement). In this way we generate the syntax tree shown in Figure 2.
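The following toy Python sketch mimics this behavior for expressions over identifiers, + and *, using one token of lookahead and operator precedence to choose between shifting and reducing (it is a simplified illustration, not the algorithm Yacc actually uses):

PREC = {"+": 1, "*": 2}

def parse(tokens):
    tokens = tokens + ["$"]                 # end-of-input marker
    stack, i = [], 0
    while True:
        la = tokens[i]                      # one-token lookahead
        # Reduce by E -> E op E while the operator on the stack has
        # precedence at least that of the lookahead token.
        while (len(stack) >= 3 and stack[-2] in PREC
               and (la not in PREC or PREC[stack[-2]] >= PREC[la])):
            right = stack.pop(); op = stack.pop(); left = stack.pop()
            stack.append((op, left, right))
        if la == "$":
            break
        # Shift: operands become leaf expressions, operators are pushed as-is.
        stack.append(la if la in PREC else ("id", la))
        i += 1
    return stack[0]

print(parse(["a", "+", "b", "*", "c"]))
# ('+', ('id', 'a'), ('*', ('id', 'b'), ('id', 'c')))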
Our Bell Labs colleague Steve Johnson took this idea and implemented a parser generator called Yacc. To resolve ambiguities between shift and reduce actions, or between reductions by two different productions, Yacc uses the order in which the productions appear: when either of two productions could be reduced, the one listed first is preferred. To resolve conflicts between shifting and reducing, the operator that appears first in the Yacc input file is assumed to take precedence.
Yacc quickly became an important tool in the implementation of compilers, not only compilers for traditional programming languages, but also compilers for many more limited-use “niche languages”. Together with Lex, Yacc provides an easy way to experiment with the design of syntactic structures in new languages. These two tools are used throughout a semester of compiler courses in academia, where students design and implement a new domain-specific programming language.
4 Quantum computing
Recently, there has been a lot of interest in quantum computing and quantum programming languages around the world. Quantum computing is particularly interesting because the computational models in quantum programming languages are very different from those in classical programming languages.
The story begins with quantum mechanics, a fundamental theory of physics developed in the early 20th century that describes natural physical properties on the scale of atoms and subatomic particles. We will introduce the basic assumptions of quantum mechanics, from which all the laws of quantum mechanics can be derived. From these assumptions, we can derive the abstraction of quantum circuits, one of the fundamental computational models of quantum programming languages.
4.1 Assumptions of quantum mechanics
Complex linear algebra and Hilbert spaces (complex vector spaces with inner products) are often used to state the assumptions of quantum mechanics. Nielsen and Chuang's book, Quantum Computation and Quantum Information (10th Anniversary Edition), is an important reference for learning the subject. First, let us review some basic definitions from complex linear algebra that are used in the assumptions. It is helpful to think of operators as complex matrices acting on vectors. The Hermitian conjugate of a matrix U is written U†; it is the conjugate transpose of U, that is, the transpose of U with each entry replaced by its complex conjugate.
The concept of unitary operators is at the heart of quantum mechanics. An operator U is unitary if UU† = I, where I is the identity operator. This means that the effect of every unitary transformation is reversible: we can reconstruct the input from the output. If U = U†, the operator U is said to be Hermitian; a Hermitian operator is self-adjoint.
Assumption 1: The state space of an isolated physical system can be modeled by a Hilbert space. The state of the system is completely described by unit vectors in the state space.
Assumption 1 allows us to define a qubit as a unit vector in a two-dimensional state space. A qubit is the quantum-computing analog of a bit (0 or 1) in classical computing. If the vectors |0⟩ and |1⟩ are used as an orthonormal basis for the two-dimensional Hilbert space, then any state vector |ψ⟩ in this space can be written as |ψ⟩ = α|0⟩ + β|1⟩, where α and β are complex numbers. Because |ψ⟩ is a unit vector, |α|² + |β|² = 1.
A qubit α|0⟩ + β|1⟩ exhibits an inherent phenomenon of quantum mechanics called superposition. Unlike a bit in classical computing, which is always either 0 or 1, when α and β are both nonzero we cannot say that the qubit is definitely in the state |0⟩ or definitely in the state |1⟩; we can only say that it is some combination of the two.
Assumption 2: The evolution of the state of a closed quantum system from one moment to another can be described by unitary operators.
There is an equivalent way of stating Assumption 2 using the Schrödinger equation. However, we consider only the unitary formulation here, because it leads naturally to the computational model of quantum circuits.
Assumption 3: To obtain information from a closed quantum system, we can measure the system. A measurement returns one of a set of possible outcomes, each with some probability, and the probabilities of the possible outcomes sum to 1. Measurement changes the state of the quantum system.
We will not go into the details of Assumption 3 (formally, measurements are described by Hermitian operators). For our purposes it is enough to treat the measurement of a single qubit |ψ⟩ = α|0⟩ + β|1⟩ in the computational basis as returning the result 0 with probability |α|² and the result 1 with probability |β|². Recall that, because |ψ⟩ is a unit vector, |α|² + |β|² = 1. The measurement collapses the state vector onto one of the two basis vectors of the two-dimensional Hilbert space. We note that Heisenberg's famous uncertainty principle of quantum mechanics can be derived from the rules of complex linear algebra and Assumptions 1-3.
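A minimal numerical sketch of this measurement rule (using Python with NumPy; the amplitudes are arbitrary examples):

import numpy as np

ket0 = np.array([1, 0], dtype=complex)
ket1 = np.array([0, 1], dtype=complex)

alpha, beta = 0.6, 0.8j                         # any amplitudes with |alpha|^2 + |beta|^2 = 1
psi = alpha * ket0 + beta * ket1
assert np.isclose(np.linalg.norm(psi), 1.0)     # Assumption 1: a unit vector

probs = np.abs(psi) ** 2                        # [|alpha|^2, |beta|^2] = [0.36, 0.64]
outcome = np.random.choice([0, 1], p=probs)     # Assumption 3: probabilistic result
print(probs, "measured:", outcome)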
The fourth hypothesis shows how the dimensionality of the state space of composite physical systems grows when we combine physical systems.
Assumption 4: The state space of a composite physical system is the tensor product of the state spaces that make up the physical system.
Assumption 4 says that if we add a single qubit to a physical system, the dimension of its state space doubles. Therefore, if we combine n single-qubit systems by taking the tensor product of their state spaces, we obtain a composite system whose state space has dimension 2^n. This exponential growth of the state space is what makes it difficult to simulate the behavior of large quantum systems on classical computers.
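The dimension growth is easy to see numerically (a small sketch using NumPy's kron for the tensor product):

import numpy as np

ket0 = np.array([1, 0], dtype=complex)
state = ket0
for n in range(1, 6):
    print(f"{n} qubit(s): state vector of dimension {state.size}")   # 2, 4, 8, 16, 32
    state = np.kron(state, ket0)        # adding one qubit doubles the dimension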
4.2 Quantum circuits
Starting from the four assumptions of quantum mechanics, we can derive a computational model called the "quantum circuit", which is the fundamental abstraction underlying quantum programming languages. Quantum circuits are built from quantum gates and wires. They are similar to Boolean circuits in classical computing, but with several important differences. For analysis, it is helpful to think of a quantum gate as a unitary matrix of complex numbers, and of its output as the vector obtained by applying that matrix to the input vector.
1) Single qubit gate
A single-qubit gate has one wire leading into the gate and one wire leading out of it. The input wire feeds a qubit |ψ⟩ into the gate. The gate applies a unitary transformation U to the incoming qubit |ψ⟩ and places the output qubit U|ψ⟩ on the output wire.
In a classical Boolean circuit, there is only one non-trivial unit logic gate, the Boolean NOT gate. In quantum circuits, any unitary transformation in a two-dimensional complex Hilbert space can be a single-qubit quantum gate. Two important single-qubit gates are introduced here.
Example 4.1 The quantum NOT gate, usually denoted X, maps the qubit α|0⟩ + β|1⟩ to the qubit β|0⟩ + α|1⟩. In effect, it swaps the two coefficients of the vector representing the qubit in the two-dimensional Hilbert space. Note that X|0⟩ = |1⟩ and X|1⟩ = |0⟩.
The quantum NOT gate X can be represented by the 2×2 matrix:

X = [ 0  1 ]
    [ 1  0 ]
Example 4.2 The quantum Hadamard gate, denoted H, maps the qubit α|0⟩ + β|1⟩ to the qubit:

α(|0⟩ + |1⟩)/√2 + β(|0⟩ - |1⟩)/√2
Note the identity operator HH = I.
The quantum Hadamard gate H can be represented by the 2×2 matrix:

H = (1/√2) [ 1   1 ]
           [ 1  -1 ]
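A small NumPy sketch of these two gates, checking the identities mentioned above:

import numpy as np

ket0 = np.array([1, 0], dtype=complex)
ket1 = np.array([0, 1], dtype=complex)
X = np.array([[0, 1], [1, 0]], dtype=complex)
H = np.array([[1, 1], [1, -1]], dtype=complex) / np.sqrt(2)

assert np.allclose(X @ ket0, ket1)       # the quantum NOT gate flips |0> to |1>
assert np.allclose(H @ H, np.eye(2))     # HH = I
print(H @ ket0)                          # (|0> + |1>)/sqrt(2): an equal superposition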
There are many other useful single-qubit quantum gates.
2) Multi-qubit gates
A multi-qubit quantum gate on n qubits has n input wires leading into the gate and n output wires emanating from it. The gate is specified by a unitary operator U, which can be represented by a 2^n × 2^n complex matrix whose rows and columns are orthonormal.
Example 4.3 The controlled NOT gate (CNOT for short) is a very useful two-qubit gate. It has two input wires and two output wires; one is called the control wire and the other the target wire. The gate acts as follows: if the qubit entering on the control wire is |0⟩, the qubit on the target wire passes through unchanged; if the incoming control qubit is |1⟩, the target qubit is flipped. In both cases the control qubit itself is unchanged. If we write |xy⟩ as shorthand for |x⟩ ⊗ |y⟩ (the tensor product of qubits |x⟩ and |y⟩), then the action of the CNOT gate on the basis states can be described as follows:
|00⟩ → |00⟩,  |01⟩ → |01⟩,  |10⟩ → |11⟩,  |11⟩ → |10⟩
3) Circuit
Quantum circuits are the fundamental computational model of quantum computing and quantum programming languages. A quantum circuit is an acyclic graph composed of wires, quantum gates, and measurement gates. Because quantum circuits are acyclic, there are no loops or feedback. Since logical OR is not a unitary operation, there is no fan-in, where wires are joined together. Also, in quantum mechanics it is impossible to copy an unknown quantum state (the no-cloning theorem), so fan-out is impossible as well.
A measurement gate takes a single wire carrying a qubit α|0⟩ + β|1⟩ as input and produces a probabilistic classical bit as output: 0 with probability |α|² or 1 with probability |β|².
We conclude our discussion of quantum circuits with an example that illustrates an unusual property of quantum computing: entanglement.
Figure 5. Quantum circuit generating the EPR state from the input |00⟩
Example 4.4 As shown in Figure 5, consider a quantum circuit with two input wires x and y. The x wire is connected to a Hadamard gate, and the output of the Hadamard gate becomes the control wire of a CNOT gate. The y wire is the target wire of the CNOT gate. We call this the EPR quantum circuit, after the initials of Einstein, Podolsky, and Rosen, who pointed out the strange properties of its output state. The following is the circuit's transformation of the four basis values of the two input qubits |xy⟩:
|00⟩ → (|00⟩ + |11⟩)/√2
|01⟩ → (|01⟩ + |10⟩)/√2
|10⟩ → (|00⟩ - |11⟩)/√2
|11⟩ → (|01⟩ - |10⟩)/√2
The operation of a quantum circuit can be described as a sequence of state vectors that represent the state of the quantum system after applying each level of gates. For Figure 5, the state vectors obtained at each stage are summarized as follows:
1) Before the H gate: |00⟩
2) After the H gate and before the CNOT gate: (|00⟩ + |10⟩)/√2
3) After the CNOT gate: (|00⟩ + |11⟩)/√2
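These three steps can be reproduced with a few lines of NumPy (the circuit acts on the two-qubit state vector as the matrix CNOT · (H ⊗ I)):

import numpy as np

ket0 = np.array([1, 0], dtype=complex)
H = np.array([[1, 1], [1, -1]], dtype=complex) / np.sqrt(2)
I2 = np.eye(2, dtype=complex)
CNOT = np.array([[1, 0, 0, 0],
                 [0, 1, 0, 0],
                 [0, 0, 0, 1],
                 [0, 0, 1, 0]], dtype=complex)

state = np.kron(ket0, ket0)              # 1) before the H gate: |00>
state = np.kron(H, I2) @ state           # 2) after H: (|00> + |10>)/sqrt(2)
state = CNOT @ state                     # 3) after CNOT: (|00> + |11>)/sqrt(2)
print(state.real.round(3))               # [0.707 0. 0. 0.707]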
A state of a composite quantum system that cannot be written as the tensor product of states of its component systems is called an entangled state. The EPR output state above is entangled: there are no two single-qubit states |ψ⟩ and |φ⟩ such that

(|00⟩ + |11⟩)/√2 = |ψ⟩ ⊗ |φ⟩
The role of entanglement in quantum computing is crucial, but the physics of entanglement remains a mystery to physicists. In fact, Einstein called it "spooky action at a distance."
4.3 Quantum Algorithms
Quantum computing devices are likely to be used as auxiliary devices controlled by classical computers. Quantum computer programs are often expressed as a mixture of classical computing and quantum algorithms. Quantum algorithms are often presented as quantum circuits with the following structure:
1) At the beginning of the circuit, all input qubits are set to a specific state, usually |0⟩.
2) The circuit puts the qubits into a superposition of states.
3) The circuit acts on this superposition through a unitary gate.
4) Measurement gates measure the qubits of the circuit, returning classical bits (0s and 1s) as output to the controlling classical computer.
Quantum computing took a leap forward in 1994, when Peter Shor of Bell Labs published an algorithm for factoring n-bit integers on a hybrid classical/quantum computer whose running time grows only polynomially in n (roughly O(n² log n log log n)). Even today, no known algorithm can factor integers on a classical computer in polynomial time.
Shor used classical number theory to reduce the integer factorization problem to the order-finding problem. The order-finding problem is: given positive integers x and N, where x < N and x is relatively prime to N, find the smallest positive integer r such that x^r ≡ 1 (mod N). The integer r is called the order of x modulo N. For example, the order of 5 modulo 21 is 6, because 6 is the smallest positive integer such that 5^6 ≡ 1 (mod 21).
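For contrast, here is an obvious classical brute-force sketch of order finding; it can take time exponential in the number of bits of N, which is exactly the cost Shor's quantum algorithm avoids:

from math import gcd

def order(x, N):
    # Least positive r with x**r = 1 (mod N); assumes gcd(x, N) == 1.
    assert gcd(x, N) == 1
    r, y = 1, x % N
    while y != 1:
        y = (y * x) % N
        r += 1
    return r

print(order(5, 21))   # 6, matching the example above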
Shor designed a quantum algorithm to solve the ordering problem with a polynomial number of quantum gates. There is currently no known algorithm that can solve the ordering problem on a classical computer in polynomial time.
Quantum algorithms often use special techniques not found in traditional computer algorithms. For example, Shor’s algorithm uses the quantum Fourier transform as part of its ordering calculation.
5 Future directions
Abstraction has had a considerable impact on many areas of computer science, and there are many more stories about abstractions than we have been able to tell here. Below are some directions that may interest theoretical researchers and that also have practical implications.
5.1 Quantum future
Quantum computing is still a nascent field. While quantum circuits can approximate any unitary operator to any desired precision, today's gate-based quantum computers have only on the order of 50 to 100 usable qubits. Furthermore, only a handful of practical quantum algorithms are known, so much work remains in both the hardware and the algorithmic domains of quantum computing to overcome these limitations.
In theory, many questions also remain unanswered. For example, if we could prove that integers cannot be factored on a classical computer in polynomial time, then we would have a proof that quantum computers can solve some problems faster than classical computers. This is just one of many deep theoretical questions that remain unresolved. The reader may wish to consult Aaronson for a list of algorithmic challenges in the quantum abstraction.
Researchers have developed several full-stack programming languages for quantum computing. As a doctoral student at Columbia University, Krysta Svore showed that the compiler architecture discussed in Section 2 can be combined with error correction in a layered software architecture for quantum computing design tools. After graduating she joined Microsoft Research, where she and her colleagues went on to develop the Q# quantum programming language, part of the Microsoft Quantum Development Kit.
5.2 Abstraction of computer systems and hardware
The success of map-reduce and other high-level abstractions for a particular type of computing platform (in this case, a computing cluster) suggests that other platforms may have similar abstractions. For example, there is currently a lot of interest in serverless computing, where data is only kept in a file system and the computation is done by renting one or more servers for a short period of time.
On a smaller scale, special-purpose hardware is a growing trend and is likely to play an increasingly important role in accelerating the execution of important algorithms on large-scale data. You may have heard of graphics processing units (GPUs) and field-programmable gate arrays (FPGAs). Plasticine, another chip architecture designed for high communication bandwidth and parallelism, may be available soon. High-level abstractions that match these architectures would be valuable, because algorithms written in those abstractions could then be compiled into efficient implementations on one or more of these chip types.
5.3 A taxonomy of abstractions
Over the years, powerful abstractions related to programming-language processing have been invented, helping to transform compiler design from an art into a science. But the last word has not yet been written. It would be of great benefit to expand the basic taxonomy we sketched in Section 1.2 to cover more of the programming-language and compiler field, and indeed more of computer science. Abstractions related to continuously running systems, such as operating systems, networks, and the Internet, are natural candidates.
Beyond classifying abstractions, we would like to understand what makes one abstraction more useful than another. For example, we mentioned in Section 3.1 how the relational model is naturally a declarative abstraction, while earlier database models were not well suited to languages such as SQL, and how this set the stage for the surge in database programming. Similarly, regular expressions seem very well suited to describing programming-language tokens and other interesting sets of strings, while equivalent notations, such as Chomsky's type-3 grammars (a special case of CFGs), have never found much use in such applications. It is natural to ask why this is so.
An interesting new area is the use of machine learning to create software applications from data, rather than from source programs written in a programming language. In a sense, machine learning is a way of creating software that does not involve traditional compilation at all. Abstractions that can guide the effective creation of powerful machine-learning applications would be of great benefit.