Lex (software): Difference between revisions

From Wikipedia, the free encyclopedia
Latest revision as of 20:05, 10 February 2024

Lex
Original author(s): Mike Lesk, Eric Schmidt
Initial release: 1975
Written in: C
Operating system: Unix, Unix-like, Plan 9
Platform: Cross-platform
Type: Command
License: MIT License (Plan 9)

Lex is a computer program that generates lexical analyzers ("scanners" or "lexers").[1][2] It is commonly used with the yacc parser generator and is the standard lexical analyzer generator on many Unix and Unix-like systems. An equivalent tool is specified as part of the POSIX standard.[3]

Lex reads an input stream specifying the lexical analyzer and writes source code which implements the lexical analyzer in the C programming language.

In addition to C, some old versions of Lex could generate a lexer in Ratfor.[4]

History

Lex was originally written by Mike Lesk and Eric Schmidt[5] and described in 1975.[6][7] In the following years, Lex became the standard lexical analyzer generator on many Unix and Unix-like systems. In 1983, Lex was one of several UNIX tools available for Charles River Data Systems' UNOS operating system under Bell Laboratories license.[8] Although originally distributed as proprietary software, some versions of Lex are now open-source. Open-source versions of Lex, based on the original proprietary code, are distributed with open-source operating systems such as OpenSolaris and Plan 9 from Bell Labs. One popular open-source version of Lex, called flex, or the "fast lexical analyzer", is not derived from the proprietary code.

Structure of a Lex file

The structure of a Lex file is intentionally similar to that of a yacc file: files are divided into three sections, separated by lines that contain only two percent signs, as follows:

  • The definitions section defines macros and imports header files written in C. It is also possible to write any C code here, which will be copied verbatim into the generated source file.
  • The rules section associates regular expression patterns with C statements. When the lexer sees text in the input matching a given pattern, it will execute the associated C code.
  • The C code section contains C statements and functions that are copied verbatim to the generated source file. These typically include code called by the rules in the rules section. In large programs it is more convenient to place this code in a separate file that is linked in at build time.

Example of a Lex file

The following is an example Lex file for the flex version of Lex. It recognizes strings of digits (non-negative integers) in the input and prints each one.

/*** Definition section ***/

%{
/* C code to be copied verbatim */
#include <stdio.h>
%}

%%
    /*** Rules section ***/

    /* [0-9]+ matches a string of one or more digits */
[0-9]+  {
            /* yytext is a string containing the matched text. */
            printf("Saw an integer: %s\n", yytext);
        }

.|\n    {   /* Ignore all other characters. */   }

%%
/*** C Code section ***/

int main(void)
{
    /* Call the lexer, then quit. */
    yylex();
    return 0;
}

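When flex processes this file, it generates a C source file, lex.yy.c. Compiling that file and linking it with the flex support library (commonly -lfl, which supplies the default yywrap that this example does not define) yields an executable that finds and prints each run of digits. For example, given the input: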

abc123z.!&*2gj6

the program will print:

Saw an integer: 123
Saw an integer: 2
Saw an integer: 6
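
For comparison, the behavior of the scanner above can be approximated by hand in plain C. The following is a minimal sketch, not the code that flex emits: the function name scan_integers and its out parameter are assumptions made for illustration. It shows the two rules at work, with a maximal run of digits treated as one match and every other character skipped.

```c
#include <assert.h>
#include <ctype.h>
#include <stdio.h>

/* Hypothetical helper (not part of Lex or flex): scans `input`, prints every
   maximal run of digits to `out`, and returns the number of runs found. */
size_t scan_integers(const char *input, FILE *out)
{
    size_t count = 0;
    const char *p = input;

    assert(input != NULL && out != NULL);

    while (*p != '\0') {
        if (isdigit((unsigned char)*p)) {
            /* Corresponds to the [0-9]+ rule: consume the longest digit run. */
            const char *start = p;
            while (isdigit((unsigned char)*p))
                p++;
            fprintf(out, "Saw an integer: %.*s\n", (int)(p - start), start);
            count++;
        } else {
            /* Corresponds to the .|\n rule: ignore any other character. */
            p++;
        }
    }
    return count;
}
```

Calling scan_integers("abc123z.!&*2gj6", stdout) prints the same three lines as the generated program and returns 3. Unlike the generated scanner, this sketch works on a string in memory rather than on an input stream.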

Using Lex with other programming tools

Using Lex with parser generators

Lex, like other lexical analyzer generators, limits rules to those that can be described by regular expressions. Regular languages are the simplest class in the Chomsky hierarchy and are exactly the languages that finite-state automata recognize, so a Lex-generated scanner can be implemented as a finite-state automaton. To recognize more complex languages, Lex is often used with parser generators such as Yacc or Bison, which use a formal grammar to parse an input stream.

It is typically preferable to have a parser, one generated by Yacc for instance, accept a stream of tokens (a "token-stream") as input, rather than having to process a stream of characters (a "character-stream") directly. Lex is often used to produce such a token-stream.

Scannerless parsing refers to parsing the input character-stream directly, without a distinct lexer.

Lex and make

make is a utility that can be used to build and maintain programs involving Lex. It assumes that a file with the extension .l is a Lex source file. The make internal macro LFLAGS specifies the Lex options that make passes automatically when it invokes Lex.[9]

References

  1. Levine, John R.; Mason, Tony; Brown, Doug (1992). lex & yacc (2 ed.). O'Reilly. pp. 1–2. ISBN 1-56592-000-7.
  2. Levine, John (August 2009). flex & bison. O'Reilly Media. p. 304. ISBN 978-0-596-15597-1.
  3. The Open Group Base Specifications Issue 7, 2018 edition § Shell & Utilities § Utilities § lex
  4. Levine, John R.; Mason, Tony; Brown, Doug (1992). lex & yacc. O'Reilly. ISBN 9781565920002.
  5. Lesk, M.E.; Schmidt, E. "Lex – A Lexical Analyzer Generator". Archived from the original on 2012-07-28. Retrieved August 16, 2010.
  6. Lesk, M.E.; Schmidt, E. (July 21, 1975). "Lex – A Lexical Analyzer Generator" (PDF). UNIX TIME-SHARING SYSTEM: UNIX PROGRAMMER'S MANUAL, Seventh Edition, Volume 2B. bell-labs.com. Retrieved Dec 20, 2011.
  7. Lesk, M.E. (October 1975). "Lex – A Lexical Analyzer Generator". Comp. Sci. Tech. Rep. No. 39. Murray Hill, New Jersey: Bell Laboratories.
  8. The Insider's Guide To The Universe (PDF). Charles River Data Systems, Inc. 1983. p. 13.
  9. "make". The Open Group Base Specifications (6). The IEEE and The Open Group. 2004. IEEE Std 1003.1, 2004 Edition.