[smlnj] Diff of /sml/trunk/src/ml-yacc/doc/mlyacc.tex

revision 250, Sat Apr 17 18:57:03 1999 UTC revision 627, Tue Apr 25 20:40:35 2000 UTC
# Line 1  Line 1 
1  \documentstyle{article}  \documentstyle{article}
2  \title{                        ML-Yacc User's Manual \\  \title{                        ML-Yacc User's Manual \\
3                                 Version 2.3                                 Version 2.4
4        }        }
5  \author{                David R. Tarditi$^1$\\  \author{                David R. Tarditi$^1$\\
6                          Andrew W. Appel$^2$\\                          Andrew W. Appel$^2$\\
7  \\  \\
8  $^1$School of Computer Science \\  $^1$Microsoft Research \\
 Carnegie Mellon University \\  
 Pittsburgh, PA 15213 \\  
9  \\  \\
10  $^2$Department of Computer Science \\  $^2$Department of Computer Science \\
11      Princeton University \\      Princeton University \\
12      Princeton, NJ 08544      Princeton, NJ 08544
13  }  }
14  \date{October 6, 1994}  \date{April 24, 2000}
15    
16  \begin{document}  \begin{document}
17  \maketitle  \maketitle
# Line 29  Line 27 
27    
28  New in this version:  Improved error correction directive \verb|%change|  New in this version:  Improved error correction directive \verb|%change|
29  that allows multi-token insertions, deletions, substitutions.  that allows multi-token insertions, deletions, substitutions.
30    Explanation of how to build a parser (Section 5) and the Calc example
31    (Section 7) revised for SML/NJ Version 110 and the use of CM.
32    
33  \newpage  \newpage
34  \tableofcontents  \tableofcontents
# Line 139  Line 139 
139    
140  \subsection{Error Recovery}  \subsection{Error Recovery}
141    
142  The error recovery algorithm is able to accurately recover  The error recovery algorithm is able to accurately recover from many
143  from many single token syntax errors.  single token syntax errors.  It tries to make a single token
144  It tries to make a single token correction at the token in the  correction at the token in the input stream at which the syntax error
145  input stream at which the syntax error was detected and any of  was detected and any of the 15 tokens\footnote{An arbitrary number
146  the 15 tokens\footnote{An arbitrary number chosen because numbers  chosen because numbers above this do not seem to improve error
147  above this do not seem to improve error correction much.} before that token.  The algorithm  correction much.} before that token.  The algorithm checks corrections
148  checks corrections before the point of error detection because  before the point of error detection because a syntax error is often
149  a syntax error is often not detected until several tokens beyond  not detected until several tokens beyond the token which caused the
150  the token which caused the error.\footnote{An LR parser detects a syntax error  error.\footnote{An LR parser detects a syntax error as soon as
151  as soon as possible, but this does not necessarily mean that the  possible, but this does not necessarily mean that the token at which
152  token at which the error was detected caused the error.}  the error was detected caused the error.}
153    
154  The algorithm works by trying corrections at each  The algorithm works by trying corrections at each
155  of the 16 tokens up to and including the token at which the  of the 16 tokens up to and including the token at which the
# Line 333  Line 333 
333  \begin{quote}  \begin{quote}
334  \raggedright  \raggedright
335  \tt  \tt
336                  \%arg \%eop \%header \%token\_sig\_info \%keyword                  \%arg \%eop \%header \%keyword \%left \%name \%nodefault
                 \%left \%name \%nodefault  
337                  \%nonassoc \%nonterm \%noshift \%pos \%prec \%prefer                  \%nonassoc \%nonterm \%noshift \%pos \%prec \%prefer
338                  \%pure \%right \%start \%subst \%term \%value \%verbose                  \%pure \%right \%start \%subst \%term \%value \%verbose
339  \end{quote}  \end{quote}
# Line 366  Line 365 
365  \mbox{cmd} & ::= & \mbox{{\tt \%arg} (Any-ML-pattern) {\tt :} ML-type} \\  \mbox{cmd} & ::= & \mbox{{\tt \%arg} (Any-ML-pattern) {\tt :} ML-type} \\
366   & | & \mbox{{\tt \%eop} symbol-list} \\   & | & \mbox{{\tt \%eop} symbol-list} \\
367   & | & \mbox{{\tt \%header} code} \\   & | & \mbox{{\tt \%header} code} \\
  & | & \mbox{{\tt \%token\_sig\_info} code} \\  
368   & | & \mbox{{\tt \%keyword} symbol-list} \\   & | & \mbox{{\tt \%keyword} symbol-list} \\
369   & | & \mbox{{\tt \%left} symbol-list} \\   & | & \mbox{{\tt \%left} symbol-list} \\
370   & | & \mbox{{\tt \%name} identifier} \\   & | & \mbox{{\tt \%name} identifier} \\
# Line 490  Line 488 
488  while evaluating a correction to a syntax error otherwise.  This may  while evaluating a correction to a syntax error otherwise.  This may
489  confuse the lexer.  confuse the lexer.
490  \item[{\tt \%header}]  \item[{\tt \%header}]
 \begin{samepage}  
491  You may define code to head the functor \{parser name\}LrValsFun here.  This  You may define code to head the functor \{parser name\}LrValsFun here.  This
492  may be useful for adding additional parameter structures to the functor.  may be useful for adding additional parameter structures to the functor.
493  The functor must be parameterized by the Token structure, so  The functor must be parameterized by the Token structure, so
494  the declaration should always have the form:  the declaration should always have the form:
495  \begin{tt}  \begin{quote}
496  \begin{verbatim}  \begin{verbatim}
497  %header (functor {parser name}LrValsFun(  %header (functor {parser name}LrValsFun(
498                                  structure Token : TOKEN                                  structure Token : TOKEN
499                                         ...)                                         ...)
500          )          )
501  \end{verbatim}  \end{verbatim}
502  \end{tt}  \end{quote}
503  \end{samepage}  
 \item[{\tt \%token\_sig\_info}]  
 \begin{samepage}  
 You may add specifications to the signature \{parser name\}\_TOKENS  
 here.  This may be useful in adding structures or types that  
 subsequently occur as arguments to various tokens.  The declaration  
 should be of the form:  
 \begin{tt}  
 \begin{verbatim}  
 %token_sig_info (type mytype  
                  structure MyStructure : MyStructure_sig  
                  ...  
                 )  
 \end{verbatim}  
 \end{tt}  
 Any specification you put here must be satisfied by the code you give  
 in the user declarations.%  
 \end{samepage}  
504  \item[{\tt \%left},{\tt \%right},{\tt \%nonassoc}]  \item[{\tt \%left},{\tt \%right},{\tt \%nonassoc}]
505  You should list the precedence declarations in order of increasing (tighter-binding)  You should list the precedence declarations in order of increasing (tighter-binding)
506   precedence.  Each precedence declaration consists   precedence.  Each precedence declaration consists
# Line 566  Line 546 
546  \end{enumerate}  \end{enumerate}
547  \end{description}  \end{description}
548    
549  \subsection{Declarations for improving error-recovery.}  \subsection{Declarations for improving error-recovery}
550    
551  These optional declarations improve error-recovery:  These optional declarations improve error-recovery:
552    
# Line 652  Line 632 
632  The value is ignored if the nonterminal has no value, but is still  The value is ignored if the nonterminal has no value, but is still
633  evaluated for side-effects.  evaluated for side-effects.
634    
635  \section{Producing files with ML-Yacc.}  \section{Producing files with ML-Yacc}
636    
637  ML-Yacc may be used from the interactive system or built as a  ML-Yacc may be used from the interactive system or built as a
638  stand-alone program which may be run from the Unix command line.  stand-alone program which may be run from the Unix command line.
# Line 747  Line 727 
727  declaration, but you must follow special instructions for tying the parser  declaration, but you must follow special instructions for tying the parser
728  and lexer together.  and lexer together.
729    
730  \section{Creating the parser.}  \section{Creating the parser}
731  \label{create-parser}  \label{create-parser}
732  Let the name of the specification file be denoted by \{spec name\} and  Let the name of the grammar specification file be denoted by
733  the parser name in the specification be \{n\}.  \{grammar\} and the name of the lexer specification file be
734  To construct a parser, do the following:  denoted by \{lexer\} (e.g. in our calculator example these would
735    stand for calc.grm and calc.lex, respectively).
736    Let the parser name in the specification be represented by \{n\}
737    (e.g. Calc in our calculator example).
738    
739    To construct a parser, do the following:
740  \begin{enumerate}  \begin{enumerate}
741    \item In the appropriate CM description file (e.g. for your main
742  \item Run ML-Yacc on the specification file for a grammar.  program or one of its subgroups or libraries), include the lines:
743  \item Run ML-Lex to create the lexical analyzer.  \begin{quote}
744  \item Load the file base.sml from the ML-Yacc directory.  This file contains  \begin{verbatim}
745  the common modules.  If you have already loaded this file, you do not need  ml-yacc-lib.cm
746  to load it again.  {lexer}
747  \item Load the file \{spec name\}.sig produced by ML-Yacc.  {grammar}
748  \item Load the file produced by ML-Lex.  \end{verbatim}
749  \item Load the file \{spec name\}.sml produced by ML-Yacc.  \end{quote}
750    This will cause ML-Yacc to be run on \{grammar\}, producing source files
751    \{grammar\}.sig and \{grammar\}.sml, and ML-Lex to be run on
752    \{lexer\}, producing a source file \{lexer\}.sml.  Then these files
753    will be compiled after loading the necessary signatures and modules
754    from the ML-Yacc library as specified by {\tt ml-yacc-lib.cm}.
755  \item Apply functors to create the parser:  \item Apply functors to create the parser:
   
 \end{enumerate}  
756  \begin{quote}  \begin{quote}
 \tt  
757  \begin{verbatim}  \begin{verbatim}
758     structure {n}LrVals =     structure {n}LrVals =
759          {n}LrValsFun(structure Token = LrParser.Token)          {n}LrValsFun(structure Token = LrParser.Token)
760     structure {n}Lex =     structure {n}Lex =
761          {n}LexFun(structure Tokens =    {n}LexFun(structure Tokens = {n}LrVals.Tokens)
                              {n}LrVals.Tokens)  
762     structure {n}Parser=     structure {n}Parser=
763           Join(structure ParserData = {n}LrVals.ParserData           Join(structure ParserData = {n}LrVals.ParserData
764                structure Lex={n}Lex                structure Lex={n}Lex
# Line 781  Line 766 
766  \end{verbatim}  \end{verbatim}
767  \end{quote}  \end{quote}
768  If the lexer was created using the {\tt \%arg} declaration in ML-Lex,  If the lexer was created using the {\tt \%arg} declaration in ML-Lex,
769  the last step  the definition of \{n\}Parser must be changed to use another functor
770  must be changed to use another functor called JoinWithArg:  called JoinWithArg:
771  \begin{quote}  \begin{quote}
 \tt  
772  \begin{verbatim}  \begin{verbatim}
773       structure {n}Parser=       structure {n}Parser=
774           JoinWithArg           JoinWithArg
# Line 793  Line 777 
777                structure LrParser=LrParser)                structure LrParser=LrParser)
778  \end{verbatim}  \end{verbatim}
779  \end{quote}  \end{quote}
780    \end{enumerate}
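
As an illustration of step 1 in the new revision, a CM description file for the calculator example might look roughly like the sketch below. The grouping syntax is the old-style CM form used with SML/NJ 110, and {\tt calc.sml} is an assumed name for the file holding the functor applications; neither is taken from this diff. Later CM versions are understood to refer to the library as {\tt \$/ml-yacc-lib.cm} instead.

```text
Group is
    ml-yacc-lib.cm
    calc.lex
    calc.grm
    calc.sml
```

CM selects ML-Lex or ML-Yacc for a member from its {\tt .lex} or {\tt .grm} suffix, which is why the specification files are listed directly rather than their generated outputs.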
781    
782  The following outline summarizes this process:  The following outline summarizes this process:
783  \begin{quote}  \begin{quote}
 \tt  
784  \begin{verbatim}  \begin{verbatim}
785  (* available at top level *)  (* available at top level *)
786    
# Line 847  Line 831 
831          (* code in header section of specification *)          (* code in header section of specification *)
832    
833              structure Header = ...              structure Header = ...
834              type svalue      type svalue = ...
835              type result      type result = ...
836              type pos      type pos = ...
837              structure Actions = ...              structure Actions = ...
838              structure EC = ...              structure EC = ...
839              val table = ...              val table = ...
840            end            end
841    
842          structure Tokens : {n}_TOKENS =          structure Tokens : {n}_TOKENS =
843             struct             struct
844               structure Token = ParserData.Token               structure Token = ParserData.Token
845               type svalue      type svalue = ...
846               fun PLUS(p1,p2) = ...               fun PLUS(p1,p2) = ...
847               fun INTLIT(i,p1,p2) = ...               fun INTLIT(i,p1,p2) = ...
848            end            end
849    
850     end     end
851    
852  (* to be done by the user: *)  (* to be done by the user: *)
853    
854  structure {n}LrVals =  structure {n}LrVals =
855           {n}LrValsFun(structure Token = LrParser.Token)           {n}LrValsFun(structure Token = LrParser.Token)
856    
857  structure {n}Lex =  structure {n}Lex =
858           {n}LexFun(structure Tokens =    {n}LexFun(structure Tokens = {n}LrVals.Tokens)
859                              {n}LrVals.Tokens)  
860  structure {n}Parser =  structure {n}Parser =
861             Join(structure Lex = {n}Lex             Join(structure Lex = {n}Lex
862                  ParserData = {n}LrVals.ParserData         structure ParserData = {n}LrVals.ParserData
863                  structure LrParser = LrParser)                  structure LrParser = LrParser)
864  \end{verbatim}  \end{verbatim}
865  \end{quote}  \end{quote}
# Line 880  Line 867 
867  \section{Using the parser}  \section{Using the parser}
868  \subsection{Parser Structure Signatures}  \subsection{Parser Structure Signatures}
869  The final structure created will have the signature PARSER:  The final structure created will have the signature PARSER:
870  \begin{tt}  \begin{quote}
871  \begin{verbatim}  \begin{verbatim}
872  signature PARSER =  signature PARSER =
873      sig      sig
# Line 895  Line 882 
882    
883          val makeLexer : (int -> string) ->          val makeLexer : (int -> string) ->
884                           (svalue,pos) Token.token Stream.stream                           (svalue,pos) Token.token Stream.stream
   
885          val parse :          val parse :
886              int * ((svalue,pos) Token.token Stream.stream) *              int * ((svalue,pos) Token.token Stream.stream) *
887              (string * pos * pos -> unit) * arg ->              (string * pos * pos -> unit) * arg ->
# Line 905  Line 891 
891                bool                bool
892       end       end
893  \end{verbatim}  \end{verbatim}
894  \end{tt}  \end{quote}
895  or the signature ARG\_PARSER if you used {\tt \%arg} to create the lexer.  or the signature ARG\_PARSER if you used {\tt \%arg} to create the lexer.
896  The signature ARG\_PARSER differs from PARSER in that it  The signature ARG\_PARSER differs from PARSER in that it
897  has an additional type {\tt lexarg} and a different type  has an additional type {\tt lexarg} and a different type
898  for {\tt makeLexer}:  for {\tt makeLexer}:
899  \begin{tt}  \begin{quote}
900  \begin{verbatim}  \begin{verbatim}
901      type lexarg      type lexarg
902      val makeLexer : (int -> string)  -> lexarg ->      val makeLexer : (int -> string)  -> lexarg ->
903                           (svalue,pos) token stream                           (svalue,pos) token stream
904  \end{verbatim}  \end{verbatim}
905  \end{tt}  \end{quote}
906    
907  The signature STREAM is:  The signature STREAM (providing lazy streams) is:
908  \begin{tt}  \begin{quote}
909  \begin{verbatim}  \begin{verbatim}
 (* STREAM: signature for a lazy stream.*)  
   
910  signature STREAM =  signature STREAM =
911   sig type 'a stream  sig
912       val streamify : (unit -> '_a) -> '_a stream    type 'a stream
913       val cons : '_a * '_a stream -> '_a stream    val streamify : (unit -> 'a) -> 'a stream
914       val get : '_a stream -> '_a * '_a stream    val cons : 'a * 'a stream -> 'a stream
915      val get : 'a stream -> 'a * 'a stream
916   end   end
917  \end{verbatim}  \end{verbatim}
918  \end{tt}  \end{quote}
919    
920  \subsection{Using the parser structure}  \subsection{Using the parser structure}
921    
# Line 994  Line 979 
979    See the directory examples for examples of parsers constructed using    See the directory examples for examples of parsers constructed using
980  ML-Yacc.  Here is a small sample parser and lexer for an interactive  ML-Yacc.  Here is a small sample parser and lexer for an interactive
981  calculator, from the directory examples/calc, along with code for  calculator, from the directory examples/calc, along with code for
982  creating a parsing function.  The calculator reads one or  creating a parsing function.  The calculator reads one or more
983  more expressions from the standard input, evaluates the expression, and  expressions from the standard input, evaluates the expressions, and
984  prints its value.  Expressions should be separated by semicolons, and may  prints their values.  Expressions should be separated by semicolons,
985  also be ended by using an end-of-file.  This shows  and may also be ended by using an end-of-file.  This shows how to
986  how to construct an interactive parser which reads a top-level declaration  construct an interactive parser which reads a top-level declaration
987  and processes the declaration before reading the next top-level  and processes the declaration before reading the next top-level
988  declaration.  declaration.
989    
# Line 1101  Line 1086 
1086  \end{tt}  \end{tt}
1087  \subsection{Top-level code}  \subsection{Top-level code}
1088    
1089  You must follow instructions one through six in Section~\ref{create-parser}  You must follow the instructions in Section~\ref{create-parser}
1090  to create the parser and lexer functors and load them.  After you have  to create the parser and lexer functors and load them.  After you have
1091  done this, you must then apply the functors to produce the {\tt CalcParser}  done this, you must then apply the functors to produce the {\tt CalcParser}
1092  structure.  The code for doing this is shown below.  structure.  The code for doing this is shown below.
1093    \begin{quote}
1094  \begin{verbatim}  \begin{verbatim}
1095  structure CalcLrVals =  structure CalcLrVals =
1096      CalcLrValsFun(structure Token = LrParser.Token)      CalcLrValsFun(structure Token = LrParser.Token)
1097    
1098  structure CalcLex =  structure CalcLex =
1099      CalcLexFun(structure Tokens = CalcLrVals.Tokens);      CalcLexFun(structure Tokens = CalcLrVals.Tokens);
1100    
1101  structure CalcParser =  structure CalcParser =
1102       Join(structure LrParser = LrParser       Join(structure LrParser = LrParser
1103            structure ParserData = CalcLrVals.ParserData            structure ParserData = CalcLrVals.ParserData
1104            structure Lex = CalcLex)            structure Lex = CalcLex)
1105  \end{verbatim}  \end{verbatim}
1106    \end{quote}
1107    
1108  Now we need a function which, given a lexer, invokes the parser.  The  Now we need a function which, given a lexer, invokes the parser.  The
1109  function {\tt invoke} does this.  function {\tt invoke} does this.
1110    
1111    \begin{quote}
1112  \begin{verbatim}  \begin{verbatim}
1113  val invoke = fn lexstream =>  fun invoke lexstream =
1114    let val print_error = fn (s,i:int,_) =>      let fun print_error (s,i:int,_) =
1115         output(std_out,"Error, line " ^              TextIO.output(TextIO.stdOut,
1116                        (makestring i) ^ ", " ^ s ^ "\n")                            "Error, line " ^ (Int.toString i) ^ ", " ^ s ^ "\n")
1117    in CalcParser.parse(0,lexstream,print_error,())    in CalcParser.parse(0,lexstream,print_error,())
1118    end    end
1119  \end{verbatim}  \end{verbatim}
1120    \end{quote}
 We also need a function which reads a line of input from the terminal  
 \footnote{Standard ML of New Jersey has a function input\_line in its  
 built-in environment that also does this.}:  
 \begin{verbatim}  
   val input_line = fn f =>  
     let fun loop result =  
           let val c = input (f,1)  
               val result = c :: result  
           in if String.size c = 0 orelse c = "\n" then  
                 String.implode (rev result)  
              else loop result  
           end  
     in loop nil  
     end  
 \end{verbatim}  
1121    
1122  Finally, we need a function which can read one or more expressions from  Finally, we need a function which can read one or more expressions from
1123  the standard input.  The function {\tt parse}, shown below, does this.  the standard input.  The function {\tt parse}, shown below, does this.
1124  It runs the calculator on the standard input and terminates  It runs the calculator on the standard input and terminates
1125  when an end-of-file is encountered.  when an end-of-file is encountered.
1126    
1127    \begin{quote}
1128  \begin{verbatim}  \begin{verbatim}
1129  val parse = fn () =>  fun parse () =
1130      let val lexer = CalcParser.makeLexer (fn _ => input_line std_in)      let val lexer = CalcParser.makeLexer
1131                          (fn _ => TextIO.inputLine TextIO.stdIn)
1132          val dummyEOF = CalcLrVals.Tokens.EOF(0,0)          val dummyEOF = CalcLrVals.Tokens.EOF(0,0)
1133          val dummySEMI = CalcLrVals.Tokens.SEMI(0,0)          val dummySEMI = CalcLrVals.Tokens.SEMI(0,0)
1134          fun loop lexer =          fun loop lexer =
1135             let val (result,lexer) = invoke lexer             let val (result,lexer) = invoke lexer
1136                 val (nextToken,lexer) = CalcParser.Stream.get lexer                 val (nextToken,lexer) = CalcParser.Stream.get lexer
1137                 val _ = case result               in case result
1138                    of SOME r =>                    of SOME r =>
1139                        output(std_out,                        TextIO.output(TextIO.stdOut,
1140                           "result = " ^ (makestring r) ^ "\n")                               "result = " ^ (Int.toString r) ^ "\n")
1141                     | NONE => ()                     | NONE => ();
1142             in if CalcParser.sameToken(nextToken,dummyEOF) then ()                  if CalcParser.sameToken(nextToken,dummyEOF) then ()
1143                else loop lexer                else loop lexer
1144             end             end
1145       in loop lexer       in loop lexer
1146       end       end
1147  \end{verbatim}  \end{verbatim}
1148    \end{quote}
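
Since {\tt makeLexer} takes a reader of type {\tt int -> string}, the same parser can be fed from a string instead of the standard input, which is convenient for testing. The sketch below is an assumption-laden illustration: {\tt stringReader} is a hypothetical helper, not part of ML-Yacc or the calc example, and it relies on the ML-Lex convention that a reader signals end-of-file by returning the empty string.

```sml
(* Hypothetical helper: build an (int -> string) reader that hands
   back the whole string on the first call and "" (end-of-file)
   on every later call.  The requested character count is ignored. *)
fun stringReader (s : string) : int -> string =
  let val pending = ref (SOME s)
  in fn _ =>
       (case !pending
          of SOME text => (pending := NONE; text)
           | NONE => "")
  end

(* Small check of the reader on its own. *)
val reader = stringReader "1+2;"
val first  = reader 1024
val second = reader 1024
```

With the structures defined above, {\tt CalcParser.makeLexer (stringReader "1+2;")} would then produce a token stream that can be passed to {\tt invoke}.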
1149    
1150  \section{Signatures}  \section{Signatures}
1151    
# Line 1177  Line 1155 
1155    
1156  \subsection{Parsing structure signatures}  \subsection{Parsing structure signatures}
1157    
1158  \begin{tt}  \begin{quote}
1159  \begin{verbatim}  \begin{verbatim}
1160  (* STREAM: signature for a lazy stream.*)  (* STREAM: signature for a lazy stream.*)
1161    
1162  signature STREAM =  signature STREAM =
1163   sig type 'a stream  sig
1164       val streamify : (unit -> '_a) -> '_a stream    type 'a stream
1165       val cons : '_a * '_a stream -> '_a stream    val streamify : (unit -> 'a) -> 'a stream
1166       val get : '_a stream -> '_a * '_a stream    val cons : 'a * 'a stream -> 'a stream
1167      val get : 'a stream -> 'a * 'a stream
1168   end   end
1169    
1170  (* LR_TABLE: signature for an LR Table.*)  (* LR_TABLE: signature for an LR Table.*)
1171    
1172  signature LR_TABLE =  signature LR_TABLE =
1173   sig   sig
1174     datatype ('a,'b) pairlist = EMPTY    datatype ('a,'b) pairlist
1175        = EMPTY
1176                               | PAIR of 'a * 'b * ('a,'b) pairlist                               | PAIR of 'a * 'b * ('a,'b) pairlist
1177     datatype state = STATE of int     datatype state = STATE of int
1178     datatype term = T of int     datatype term = T of int
# Line 1207  Line 1187 
1187     val numRules : table -> int     val numRules : table -> int
1188     val describeActions : table -> state ->     val describeActions : table -> state ->
1189                       (term,action) pairlist * action                       (term,action) pairlist * action
1190     val describeGoto : table -> state -> (nonterm,state) pairlist    val describeGoto : table -> state ->
1191                           (nonterm,state) pairlist
1192     val action : table -> state * term -> action     val action : table -> state * term -> action
1193     val goto : table -> state * nonterm -> state     val goto : table -> state * nonterm -> state
1194     val initialState : table -> state     val initialState : table -> state
# Line 1244  Line 1225 
1225    
1226        val parse:        val parse:
1227         {table : LrTable.table,         {table : LrTable.table,
1228          lexer : ('_b,'_c) Token.token Stream.stream,          lexer : ('b,'c) Token.token Stream.stream,
1229          arg: 'arg,          arg: 'arg,
1230          saction : int *          saction : int *
1231                   '_c *                   'c *
1232                   (LrTable.state * ('_b * '_c * '_c)) list *                   (LrTable.state * ('b * 'c * 'c)) list *
1233                   'arg ->                   'arg ->
1234                    LrTable.nonterm *                    LrTable.nonterm *
1235                    ('_b * '_c * '_c) *                    ('b * 'c * 'c) *
1236                    ((LrTable.state *('_b * '_c * '_c)) list),                    ((LrTable.state *('b * 'c * 'c)) list),
1237          void : '_b,          void : 'b,
1238          ec : {is_keyword : LrTable.term -> bool,          ec : {is_keyword : LrTable.term -> bool,
1239                noShift : LrTable.term -> bool,                noShift : LrTable.term -> bool,
1240                preferred_subst:LrTable.term -> LrTable.term list,                preferred_subst:LrTable.term -> LrTable.term list,
1241                preferred_insert : LrTable.term -> bool,                preferred_insert : LrTable.term -> bool,
1242                errtermvalue : LrTable.term -> '_b,               errtermvalue : LrTable.term -> 'b,
1243                showTerminal : LrTable.term -> string,                showTerminal : LrTable.term -> string,
1244                terms: LrTable.term list,                terms: LrTable.term list,
1245                error : string * '_c * '_c -> unit               error : string * 'c * 'c -> unit
1246                },                },
1247          lookahead : int  (* max amount of lookahead used in *)          lookahead : int (* max amount of lookahead used in
1248                           (* error correction *)                           * error correction *)
1249         } -> '_b * (('_b,'_c) Token.token Stream.stream)         } -> 'b * (('b,'c) Token.token Stream.stream)
1250    end    end
1251  \end{verbatim}  \end{verbatim}
1252  \end{tt}  \end{quote}
1253    
1254  \subsection{Lexers}  \subsection{Lexers}
1255    
1256  Lexers for use with ML-Yacc's output must match one of these signatures.  Lexers for use with ML-Yacc's output must match one of these signatures.
1257    
1258  \begin{tt}  \begin{quote}
1259  \begin{verbatim}  \begin{verbatim}
1260  signature LEXER =  signature LEXER =
1261     sig     sig
# Line 1285  Line 1266 
1266                  type svalue                  type svalue
1267             end             end
1268          val makeLexer : (int -> string) -> unit ->          val makeLexer : (int -> string) -> unit ->
1269            (UserDeclarations.svalue,         (UserDeclarations.svalue, UserDeclarations.pos)
1270             UserDeclarations.pos) UserDeclarations.token         UserDeclarations.token
1271     end     end
1272    
1273  (* ARG_LEXER: the %arg option of ML-Lex allows users to produce  (* ARG_LEXER: the %arg option of ML-Lex allows users to
1274     lexers which also take an argument before yielding a function     produce lexers which also take an argument before
1275     from unit to a token.     yielding a function from unit to a token.
1276  *)  *)
1277    
1278  signature ARG_LEXER =  signature ARG_LEXER =
# Line 1303  Line 1284 
1284          type svalue          type svalue
1285          type arg          type arg
1286        end        end
1287       val makeLexer : (int -> string) -> UserDeclarations.arg ->    val makeLexer :
1288          (int -> string) ->
1289          UserDeclarations.arg ->
1290            unit ->            unit ->
1291              (UserDeclarations.svalue,         (UserDeclarations.svalue, UserDeclarations.pos)
1292               UserDeclarations.pos) UserDeclarations.token         UserDeclarations.token
1293     end     end
1294  \end{verbatim}  \end{verbatim}
1295  \end{tt}  \end{quote}
1296    
1297  \subsection{Signatures for the functor produced by ML-Yacc}  \subsection{Signatures for the functor produced by ML-Yacc}
1298    
1299  The following signature is used in signatures generated by  The following signature is used in signatures generated by
1300  ML-Yacc:  ML-Yacc:
1301  \begin{tt}  \begin{quote}
1302  \begin{verbatim}  \begin{verbatim}
1303  (* PARSER_DATA: the signature of ParserData structures in  (* PARSER_DATA: the signature of ParserData structures in
1304     {n}LrValsFun functor produced by ML-Yacc. All such structures     {n}LrValsFun functor produced by ML-Yacc. All such
1305     match this signature.*)     structures match this signature. *)
1306    
1307  signature PARSER_DATA =  signature PARSER_DATA =
1308     sig     sig
# Line 1362  Line 1345 
1345          val table : LrTable.table          val table : LrTable.table
1346      end      end
1347  \end{verbatim}  \end{verbatim}
1348  \end{tt}  \end{quote}
1349    
1350  ML-Yacc generates these two signatures:  ML-Yacc generates these two signatures:
1351  \begin{tt}  \begin{quote}
1352  \begin{verbatim}  \begin{verbatim}
1353  (* printed out in .sig file created by parser generator: *)  (* printed out in .sig file created by parser generator: *)
1354    
# Line 1384  Line 1367 
1367       sharing type ParserData.svalue = Tokens.svalue       sharing type ParserData.svalue = Tokens.svalue
1368    end    end
1369  \end{verbatim}  \end{verbatim}
1370  \end{tt}  \end{quote}
1371  \subsection{User parser signatures}  \subsection{User parser signatures}
1372    
1373  Parsers created by applying the Join functor will match this signature:  Parsers created by applying the Join functor will match this signature:
1374  \begin{tt}  \begin{quote}
1375  \begin{verbatim}  \begin{verbatim}
1376  signature PARSER =  signature PARSER =
1377      sig      sig
# Line 1413  Line 1396 
1396                bool                bool
1397       end       end
1398  \end{verbatim}  \end{verbatim}
1399  \end{tt}  \end{quote}
1400  Parsers created by applying the JoinWithArg functor will match this  Parsers created by applying the JoinWithArg functor will match this
1401  signature:  signature:
1402  \begin{tt}  \begin{quote}
1403  \begin{verbatim}  \begin{verbatim}
1404  signature ARG_PARSER =  signature ARG_PARSER =
1405    sig    sig
# Line 1442  Line 1425 
1425               bool               bool
1426    end    end
1427  \end{verbatim}  \end{verbatim}
1428  \end{tt}  \end{quote}
1429    
1430  \section{Sharing constraints}  \section{Sharing constraints}
1431    
# Line 1450  Line 1433 
1433  you have not created a lexer which takes an argument, and  you have not created a lexer which takes an argument, and
1434  you have followed the directions given earlier for creating the parser, you  you have followed the directions given earlier for creating the parser, you
1435  will have the following structures with the following signatures:  will have the following structures with the following signatures:
1436  \begin{tt}  \begin{quote}
1437  \begin{verbatim}  \begin{verbatim}
1438  (* always present *)  (* always present *)
1439    
# Line 1472  Line 1455 
1455  structure Lex : LEXER  structure Lex : LEXER
1456  structure {n}Parser : PARSER  structure {n}Parser : PARSER
1457  \end{verbatim}  \end{verbatim}
1458  \end{tt}  \end{quote}
1459    
1460  The following sharing constraints will exist:  The following sharing constraints will exist:
1461  \begin{tt}  \begin{quote}
1462  \begin{verbatim}  \begin{verbatim}
1463  sharing {n}Parser.Token = LrParser.Token =  sharing {n}Parser.Token = LrParser.Token =
1464            {n}LrVals.ParserData.Token            {n}LrVals.ParserData.Token
# Line 1495  Line 1478 
1478  sharing {n}LrVals.LrTable = LrParser.LrTable  sharing {n}LrVals.LrTable = LrParser.LrTable
1479    
1480  \end{verbatim}  \end{verbatim}
1481  \end{tt}  \end{quote}
1482    
1483  If you used a lexer which takes an argument, then you will  If you used a lexer which takes an argument, then you will
1484  have:  have:
1485  \begin{tt}  \begin{quote}
1486  \begin{verbatim}  \begin{verbatim}
1487  structure Lex : ARG_LEXER  structure Lex : ARG_LEXER
1488  structure {n}Parser : ARG_PARSER  structure {n}Parser : ARG_PARSER
# Line 1508  Line 1491 
1491    
1492  sharing type {n}Parser.lexarg = Lex.UserDeclarations.arg  sharing type {n}Parser.lexarg = Lex.UserDeclarations.arg
1493  \end{verbatim}  \end{verbatim}
1494  \end{tt}  \end{quote}
1495    
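As a rough sketch of where the structures and sharing constraints listed
in this section come from, the following glue code assumes a grammar
named Calc, so that \{n\} is Calc. The functor names
\verb|CalcLrValsFun| and \verb|CalcLexFun| are assumed to be those
emitted by ML-Yacc and ML-Lex for that grammar. The fragment is not
self-contained: it compiles only together with the ML-Yacc library and
the generated files.
\begin{quote}
\begin{verbatim}
(* Assumed names: CalcLrValsFun and CalcLexFun come from the
   generated files; LrParser and Join come from the ML-Yacc library. *)

(* The generated tables and token constructors share
   LrParser's Token structure. *)
structure CalcLrVals =
  CalcLrValsFun(structure Token = LrParser.Token)

(* The lexer is built over the Tokens structure of the parser. *)
structure CalcLex =
  CalcLexFun(structure Tokens = CalcLrVals.Tokens)

(* Join ties the pieces together; the result matches PARSER. *)
structure CalcParser =
  Join(structure LrParser = LrParser
       structure ParserData = CalcLrVals.ParserData
       structure Lex = CalcLex)
\end{verbatim}
\end{quote}
Because \verb|LrParser.Token| is passed in as the \verb|Token| argument,
the constraint relating \verb|{n}LrVals.ParserData.Token| to
\verb|LrParser.Token| holds by construction. For a lexer that takes an
argument, \verb|JoinWithArg| would be applied in place of \verb|Join|.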
1496  \section{Hints}  \section{Hints}
1497  \subsection{Multiple start symbols}  \subsection{Multiple start symbols}
# Line 1522  Line 1505 
1505  Assuming that you have followed the naming conventions used before,  Assuming that you have followed the naming conventions used before,
1506  create the lexer using the makeLexer function in the \{n\}Parser structure.  create the lexer using the makeLexer function in the \{n\}Parser structure.
1507  Then, place the dummy token on the front of the lexer:  Then, place the dummy token on the front of the lexer:
1508  \begin{tt}  \begin{quote}
1509  \begin{verbatim}  \begin{verbatim}
1510  val dummyLexer =  val dummyLexer =
1511      {n}Parser.Stream.cons      {n}Parser.Stream.cons
# Line 1530  Line 1513 
1513                   ({dummy lineno},{dummy lineno}),                   ({dummy lineno},{dummy lineno}),
1514          lexer)          lexer)
1515  \end{verbatim}  \end{verbatim}
1516  \end{tt}  \end{quote}
1517     You have to pass a Tokens structure to the lexer.  This Tokens structure     You have to pass a Tokens structure to the lexer.  This Tokens structure
1518  contains functions which construct tokens from values and line numbers.  contains functions which construct tokens from values and line numbers.
1519  So to create your dummy token just apply the appropriate token constructor  So to create your dummy token just apply the appropriate token constructor
# Line 1562  Line 1545 
1545    
1546  If you wish to encapsulate the code necessary to invoke the  If you wish to encapsulate the code necessary to invoke the
1547  parser, your functor will generally have the form:  parser, your functor will generally have the form:
1548  \begin{tt}  \begin{quote}
1549  \begin{verbatim}  \begin{verbatim}
1550  functor Encapsulate(  functor Encapsulate(
1551       structure Parser : PARSER       structure Parser : PARSER
# Line 1577  Line 1560 
1560          ...          ...
1561    end    end
1562  \end{verbatim}  \end{verbatim}
1563  \end{tt}  \end{quote}
1564    
1565  The signature INTERFACE, defined below, is a possible signature for  The signature INTERFACE, defined below, is a possible signature for
1566  a structure  a structure
# Line 1587  Line 1570 
1570  because  because
1571  these types will be abstract types inside the body of your  these types will be abstract types inside the body of your
1572  functor.  functor.
1573  \begin{tt}  \begin{quote}
1574  \begin{verbatim}  \begin{verbatim}
1575  signature INTERFACE =  signature INTERFACE =
1576    sig    sig
# Line 1601  Line 1584 
1584       val nothing : arg       val nothing : arg
1585    end    end
1586  \end{verbatim}  \end{verbatim}
1587  \end{tt}  \end{quote}
1588    
1589  The directory example/fol contains a sample parser in which  The directory example/fol contains a sample parser in which
1590  the code for tying together the lexer and parser has been  the code for tying together the lexer and parser has been
