[smlnj] View of /sml/trunk/src/smlnj-lib/Doc/HTML/intro.html

Revision 137
Mon Sep 7 21:09:17 1998 UTC (23 years, 4 months ago) by monnier
File size: 3087 byte(s)
Initial revision
<!-- HTML/intro.html -->

<!-- COPYRIGHT (c) 1998 Bell Labs, Lucent Technologies. -->

<BODY><A NAME="top"></A>
<H1 align=CENTER>SML/NJ Library Manual</H1>

<H2><A NAME="section:0">Introduction</A></H2>
This is a regular expressions library. It decouples the surface syntax used to specify regular expressions (the front end) from the engine that implements the matcher (the back end). An abstract syntax is used to communicate between the front end and the back end of the system.
Given a structure <CODE>S1</CODE> describing a surface syntax and a structure <CODE>S2</CODE> describing a matching engine, a regular expression package can be defined by applying the functor <CODE>RegExpFn</CODE>: 
<PRE>RegExpFn (structure P=S1  structure E=S2) : REGEXP</PRE>
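As an illustration, the functor application might look like the sketch below. The structure names <CODE>AwkSyntax</CODE> and <CODE>BackTrackEngine</CODE> are taken from recent SML/NJ distributions and are assumptions here, as is the library path in the comment; this page does not name any particular front end or engine.
<PRE>
(* Assumption: the regexp library has been loaded first, for example
   with  CM.make "$/regexp-lib.cm";  in a recent SML/NJ session. *)
structure RE = RegExpFn (
    structure P = AwkSyntax          (* front end: AWK-style surface syntax *)
    structure E = BackTrackEngine)   (* back end: backtracking matcher *)
</PRE>
Swapping in a different engine structure should change only the matching strategy, since both sides communicate through the shared abstract syntax.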
To match a regular expression, one first needs to compile its representation in the surface syntax. The type of a compiled regular expression is given in the <CODE>REGEXP</CODE> signature as:
<PRE>type regexp</PRE>
Once a regular expression has been compiled, three functions are provided to perform the matching: <CODE>find</CODE>, <CODE>prefix</CODE> and <CODE>match</CODE>. These functions operate on readers as defined in the <CODE>StringCvt</CODE> structure of the Basis Library. A reader of type <CODE>('a,'b) reader</CODE> is a function <CODE>'b -&gt; ('a * 'b) option</CODE> that takes a stream of type <CODE>'b</CODE> and returns an element of type <CODE>'a</CODE> paired with the remainder of the stream, or <CODE>NONE</CODE> if the end of the stream is reached.
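To make the reader idea concrete, here is a small sketch that uses only the Basis Library. The name <CODE>listReader</CODE> is made up for illustration; <CODE>Substring.getc</CODE> is a standard reader over substrings.
<PRE>
(* A hand-written reader whose "stream" is an ordinary list.
   Its type, 'a list -&gt; ('a * 'a list) option, is an instance of
   ('a, 'a list) StringCvt.reader. *)
fun listReader [] = NONE
  | listReader (x :: rest) = SOME (x, rest)

(* Substring.getc is a ready-made character reader over substrings. *)
val getc : (char, substring) StringCvt.reader = Substring.getc

(* On a non-empty stream the reader yields the first character and
   the remaining substring; on an empty stream it yields NONE. *)
val first = getc (Substring.full "abc")
val atEnd = getc (Substring.full "")
</PRE>
Because the matching functions are parameterized by a reader, the same compiled regular expression can be run over strings, lists, or any other stream for which such a function can be written.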
The function <CODE>find</CODE> returns a reader that searches a stream and attempts to match the given regular expression. The function <CODE>prefix</CODE> returns a reader that attempts to match the regular expression at the current position in the stream only. The function <CODE>match</CODE> takes a list pairing regular expressions with functions and returns a reader that attempts to match one of the regular expressions at the current position in the stream. The function paired with the matched regular expression is invoked on the matching information.
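The three functions might be used as sketched below. This assumes the <CODE>REGEXP</CODE> signature of recent SML/NJ releases, in which <CODE>compileString</CODE> compiles a pattern from a string and <CODE>match</CODE> takes pattern strings paired with functions; the revision described here may differ in those details. The structure names <CODE>AwkSyntax</CODE> and <CODE>BackTrackEngine</CODE> are likewise assumptions.
<PRE>
structure RE = RegExpFn (structure P = AwkSyntax
                         structure E = BackTrackEngine)

val digits = RE.compileString "[0-9]+"
val input  = Substring.full "abc123def"

(* find searches forward through the stream for a match. *)
val found = RE.find digits Substring.getc input

(* prefix tries only the current position; the stream here starts
   with letters, so a digit pattern should not match. *)
val atStart = RE.prefix digits Substring.getc input

(* match tries several patterns at the current position and applies
   the function paired with the one that matched. *)
val classify =
    RE.match [("[0-9]+", fn _ =&gt; "number"),
              ("[a-z]+", fn _ =&gt; "word")]
             Substring.getc
val kind = classify input
</PRE>
Each call returns an option: <CODE>NONE</CODE> when nothing matches, otherwise a result paired with the remainder of the stream.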
Once a match is found, it is returned as a <CODE>match_tree</CODE> datatype. This is a hierarchical structure describing the matches of the various subexpressions appearing in the matched regular expression. A match for an expression is a record containing the position of the match and its length. The root of the structure always describes the outermost match (the whole string matched by the regular expression).
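A sketch of reading the outermost match out of the tree follows. It assumes the <CODE>MatchTree.root</CODE> function and the <CODE>{pos, len}</CODE> record field names found in recent SML/NJ releases; neither is spelled out on this page, and <CODE>firstNumber</CODE> is a hypothetical helper.
<PRE>
structure RE = RegExpFn (structure P = AwkSyntax
                         structure E = BackTrackEngine)

val digits = RE.compileString "[0-9]+"

(* Return the text of the first run of digits in s, if there is one. *)
fun firstNumber (s : string) : string option =
    case RE.find digits Substring.getc (Substring.full s) of
        NONE =&gt; NONE
      | SOME (tree, _) =&gt;
          let
            (* The root describes the whole match: pos is the stream
               at the start of the match, len is the match length. *)
            val {pos, len} = MatchTree.root tree
          in
            SOME (Substring.string (Substring.slice (pos, 0, SOME len)))
          end
</PRE>
Note that <CODE>pos</CODE> is a stream value rather than an integer offset, so the matched text is recovered by taking <CODE>len</CODE> elements from it.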

<B>[ <A HREF="#top">Top</A>
   | <A HREF="regexp-lib-part.html">Parent</A>
   | <A HREF="toc.html">Contents</A>
   | <A HREF="index-all.html">Index</A>
   | <A HREF="smlnj-lib.html">Root</A>
   ]</B><BR>
<I>Last Modified June 1, 1998</I><BR>
<I>Comments to <A HREF="mailto:riccardo@research.bell-labs.com">Riccardo Pucella.</A></I><BR>

Copyright &copy; 1998 Bell Labs, Lucent Technologies <BR>