--- title: Documentation --- Documentation ============= Jison takes a context-free grammar as input and outputs a JavaScript file capable of parsing the language described by that grammar. You can then use the generated script to parse inputs and accept, reject, or perform actions based on the input. If you're familiar with Bison or Yacc, or other clones, you're almost ready to roll. * [Installation](#installation) * [Usage from the command line](#usage-from-the-command-line) * [Usage from a CommonJS Module](#usage-from-a-commonjs-module) * [Using the Generated Parser](#using-the-generated-parser) * [Using the Parser from the Web](#using-the-parser-from-the-web) * [The Concepts of Jison](#the-concepts-of-jison) * [Specifying a Language](#specifying-a-language) * [Lexical Analysis](#lexical-analysis) * [Tracking Locations](#tracking-locations) * [Custom Scanners](#custom-scanners) * [Sharing Scope](#sharing-scope) * [Parsing algorithms](#parsing-algorithms) * [Real world examples](#real-world-examples) * [Contributors](#contributors) * [License](#license) Installation ------------ Jison can be installed for [Node](http://nodejs.org) using [`npm`](http://github.com/isaacs/npm/) Using npm: npm install jison -g Usage from the command line ----------------------- Clone the github repository for examples: git clone git://github.com/zaach/jison.git cd jison/examples Now you're ready to generate some parsers: jison calculator.jison This will generate `calculator.js` in your current working directory. This script can be used to parse an input file, like so: echo "2^32 / 1024" > testcalc node calculator.js testcalc This will print out `4194304`. Usage from a CommonJS Module -------------------------- You can generate parsers programatically from JavaScript as well. Assuming Jison is in your commonjs environment's load path: // mygenerator.js var Parser = require("jison").Parser; var grammar = { "lex": { "rules": [ ["\\s+", "/* skip whitespace */"], ["[a-f0-9]+", "return 'HEX';"] ] }, "bnf": { "hex_strings" :[ "hex_strings HEX", "HEX" ] } }; var parser = new Parser(grammar); // generate source, ready to be written to disk var parserSource = parser.generate(); // you can also use the parser directly from memory parser.parse("adfe34bc e82a"); // returns true parser.parse("adfe34bc zxg"); // throws lexical error Using the Generated Parser -------------------------- Once you have generated the parser and saved it, you no longer need Jison or any other dependencies. As demonstrated before, the parser can be used from the command line: narwhal calculator.js testcalc Though, more ideally, the parser will be a dependency of another module. You can require it from another module like so: // mymodule.js var parser = require("./calculator").parser; function exec (input) { return parser.parse(input); } var twenty = exec("4 * 5"); Or more succinctly: // mymodule.js function exec (input) { return require("./calculator").parse(input); } var twenty = exec("4 * 5"); Using the Parser from the Web ---------------------------- The generated parser script may be included in a web page without any need for a CommonJS loading environment. It's as simple as pointing to it via a script tag: When you generate the parser, you can specify the variable name it will be declared as: // mygenerator.js var parserSource = generator.generate({moduleName: "calc"}); // then write parserSource to a file called, say, calc.js Whatever `moduleName` you specified will be the the variable you can access the parser from in your web page: The moduleName you specify can also include a namespace, e.g: // mygenerator.js var parserSource = parser.generate({moduleName: "myCalculator.parser"}); And could be used like so: Or something like that -- you get the picture. A demo of the calculator script used in a web page is [here](/jison/demos/calc/). The Concepts of Jison --------------------- Until the [Bison guide](http://dinosaur.compilertools.net/bison/bison_4.html#SEC7) is properly ported for Jison, you can refer to it for the major concepts, which are equivalent (except for the bits about static typing of semantic values, and other obvious C artifacts.) Other helpful sections: * [Bison Grammar Files](http://dinosaur.compilertools.net/bison/bison_6.html#SEC34) * [The Bison Parser Algorithm](http://dinosaur.compilertools.net/bison/bison_8.html#SEC68) * [Error Recovery](http://dinosaur.compilertools.net/bison/bison_9.html#SEC81) (alpha support, at this point) Specifying a Language --------------------- The process of parsing a language involves two phases: **lexical analysis** (tokenizing) and **parsing**, which the Lex/Yacc and Flex/Bison combinations are famous for. Jison lets you specify a parser much like you would using Bison/Flex, with separate files for tokenization rules and for the language grammar, or with the tokenization rules embedded in the main grammar. For example, here is the grammar for the calculator parser: /* description: Parses end executes mathematical expressions. */ /* lexical grammar */ %lex %% \s+ /* skip whitespace */ [0-9]+("."[0-9]+)?\b return 'NUMBER'; "*" return '*'; "/" return '/'; "-" return '-'; "+" return '+'; "^" return '^'; "(" return '('; ")" return ')'; "PI" return 'PI'; "E" return 'E'; <> return 'EOF'; /lex /* operator associations and precedence */ %left '+' '-' %left '*' '/' %left '^' %left UMINUS %start expressions %% /* language grammar */ expressions : e EOF {print($1); return $1;} ; e : e '+' e {$$ = $1+$3;} | e '-' e {$$ = $1-$3;} | e '*' e {$$ = $1*$3;} | e '/' e {$$ = $1/$3;} | e '^' e {$$ = Math.pow($1, $3);} | '-' e %prec UMINUS {$$ = -$2;} | '(' e ')' {$$ = $2;} | NUMBER {$$ = Number(yytext);} | E {$$ = Math.E;} | PI {$$ = Math.PI;} ; which compiles down to this JSON representation used directly by Jison: { "lex": { "rules": [ ["\\s+", "/* skip whitespace */"], ["[0-9]+(?:\\.[0-9]+)?\\b", "return 'NUMBER';"], ["\\*", "return '*';"], ["\\/", "return '/';"], ["-", "return '-';"], ["\\+", "return '+';"], ["\\^", "return '^';"], ["\\(", "return '(';"], ["\\)", "return ')';"], ["PI\\b", "return 'PI';"], ["E\\b", "return 'E';"], ["$", "return 'EOF';"] ] }, "operators": [ ["left", "+", "-"], ["left", "*", "/"], ["left", "^"], ["left", "UMINUS"] ], "bnf": { "expressions" :[[ "e EOF", "print($1); return $1;" ]], "e" :[[ "e + e", "$$ = $1 + $3;" ], [ "e - e", "$$ = $1 - $3;" ], [ "e * e", "$$ = $1 * $3;" ], [ "e / e", "$$ = $1 / $3;" ], [ "e ^ e", "$$ = Math.pow($1, $3);" ], [ "- e", "$$ = -$2;", {"prec": "UMINUS"} ], [ "( e )", "$$ = $2;" ], [ "NUMBER", "$$ = Number(yytext);" ], [ "E", "$$ = Math.E;" ], [ "PI", "$$ = Math.PI;" ]] } } Jison accepts both the Bison/Flex style format, or the raw JSON format, e.g: narwhal bin/jison examples/calculator.jison or narwhal bin/jison examples/calculator.json When the lexical grammar resides in its own (.jisonlex) file, use that as the second argument to Jison, e.g.: narwhal bin/jison examples/classy.jison examples/classy.jisonlex More examples can be found in the [`examples/`](http://github.com/zaach/jison/tree/master/examples/) and [`tests/parser/`](http://github.com/zaach/jison/tree/master/tests/parser/) directories. Lexical Analysis ---------------- Jison includes a rather rudimentary scanner generator, though **any module that supports the basic scanner API could be used** in its place. The format of the [input file](http://dinosaur.compilertools.net/flex/flex_6.html#SEC6) (including macro support) and the style of the [pattern matchers](http://dinosaur.compilertools.net/flex/flex_7.html#SEC7) are modeled after Flex. Several [metacharacters have been added](https://github.com/zaach/jison/wiki/DeviationsFromBison), but there is also one minor inconvenience compared to Flex patterns, namely exact string patterns must be placed in quotes e.g.: Bad: [0-9]+zomg print(yytext) Good: [0-9]+"zomg" print(yytext); Actions that span multiple lines should be surrounded by braces: [0-9]+"zomg" %{ print(yytext); return 'ZOMG'; %} A recently added feature are **[start conditions](http://dinosaur.compilertools.net/flex/flex_11.html)**, which allow certain rules to only match in certain states. If the lexer is not in that state, then the rule is ignored. The lexer starts in the `INITIAL` state, but can move to new states specified by you. Read that link for the run-down. An example below shows where Jison differs, namely `this.begin('state')` instead of `BEGIN(STATE)` for changing states within an action: %s expect %% expect-floats this.begin('expect'); [0-9]+"."[0-9]+ { console.log( "found a float, = " + yytext ); } \n %{ /* that's the end of the line, so * we need another "expect-number" * before we'll recognize any more * numbers */ this.begin('INITIAL'); %} [0-9]+ console.log( "found an integer, = " + yytext ); "." console.log( "found a dot" ); Additionaly, use `this.popState()` within an action to revert to the previous state. Tracking Locations ------------------ Jison's lexical analyzer will track line number and column number information for each token and make them available within parser actions. The API is identical to [Bison's](http://www.gnu.org/software/bison/manual/html_node/Actions-and-Locations.html#Actions-and-Locations). Custom Scanners --------------- You don't have to use the builtin Jison lexical scanner. An object with a `lex` and a `setInput` function would suffice, e.g.: parser.lexer = {lex: function () { return 'NIL'; }, setInput: function (str) {} } This lexer would simply return `NIL` tokens *ad infinitum*. *TODO: more examples* Sharing Scope ------------ In Bison, code is expected to be lexically defined within the scope of the semantic actions. E.g., chunks of code may be included in the generated parser source, which are available from semantic actions. Jison is more modular. Instead of pulling code into the generated module, the generated module is expected to be required and used by other modules. This means that if you want to expose functionality to the semantic actions, you can't rely on it being available through lexical scoping. Instead, the parser has a `yy` property which is exposed to actions as the `yy` free variable. Any functionality attached to this property is available in both lexical and semantic actions through the `yy` free variable. An example from orderly.js: var parser = require("./orderly/parse").parser; // set parser's shared scope parser.yy = require("./orderly/scope"); // returns the JSON object var parse = exports.parse = function (input) { return parser.parse(input); }; ... The `scope` module contains logic for building data structures, which is used within the semantic actions. *TODO: More on this.* Parsing algorithms ------------------ Like Bison, Jison can recognize languages described by LALR(1) grammars, though it also has modes for LR(0), SLR(1), and LR(1). It also has a special mode for generating LL(1) parse tables (requested by my professor,) and could be extended to generate a recursive descent parser for LL(k) languages in the future. But, for now, Jison is geared toward bottom-up parsing. **LR(1) mode is currently not practical for use with anything other than toy grammars, but that is entirely a consequence of the algorithm used, and may change in the future.* Projects using Jison ------------------ View them on the [wiki](https://github.com/zaach/jison/wiki/ProjectsUsingJison), or add your own. Contributors ------------ via [github](http://github.com/zaach/jison/contributors) License ------- > Copyright (c) 2009 Zachary Carter > > Permission is hereby granted, free of > charge, to any person obtaining a > copy of this software and associated > documentation files (the "Software"), > to deal in the Software without > restriction, including without > limitation the rights to use, copy, > modify, merge, publish, distribute, > sublicense, and/or sell copies of the > Software, and to permit persons to > whom the Software is furnished to do > so, subject to the following > conditions: > > The above copyright notice and this > permission notice shall be included > in all copies or substantial portions > of the Software. > > THE SOFTWARE IS PROVIDED "AS IS", > WITHOUT WARRANTY OF ANY KIND, EXPRESS > OR IMPLIED, INCLUDING BUT NOT LIMITED > TO THE WARRANTIES OF MERCHANTABILITY, > FITNESS FOR A PARTICULAR PURPOSE AND > NONINFRINGEMENT. IN NO EVENT SHALL THE > AUTHORS OR COPYRIGHT HOLDERS BE > LIABLE FOR ANY CLAIM, DAMAGES OR OTHER > LIABILITY, WHETHER IN AN ACTION OF > CONTRACT, TORT OR OTHERWISE, ARISING > FROM, OUT OF OR IN CONNECTION WITH THE > SOFTWARE OR THE USE OR OTHER DEALINGS > IN THE SOFTWARE. [1]: http://dinosaur.compilertools.net/bison/bison_4.html [2]: http://github.com/280north/narwhal