[ACCEPTED]-Lexer written in Javascript?-lexer

Accepted answer
Score: 28

Something like http://jscc.phorward-software.com/, maybe?

JS/CC is the first available parser development system for JavaScript and ECMAScript-derivates. It has been developed, both, with the intention of building a productive compiler development system and with the intention of creating an easy-to-use academic environment for people interested in how parse table generation is done general in bottom-up parsing.

The platform-independent software unions both: A regular expression-based lexical analyzer generator matching individual tokens from the input character stream and a LALR(1) parser generator, computing the parse tables for a given context-free grammar specification and building a stand-alone, working parser. The context-free grammar fed to JS/CC is defined in a Backus-Naur-Form-based meta language, and allows the insertion of individual semantic code to be evaluated on a rule's reduction.

JS/CC itself has been entirely written in ECMAScript so it can be executed in many different ways: as platform-independent, browser-based JavaScript embedded on a Website, as a Windows Script Host Application, as a compiled JScript.NET executable, as a Mozilla/Rhino or Mozilla/Spidermonkey interpreted application, or a V8 shell script on Windows, *nix, Linux and Mac OSX. However, for productive execution, it is recommended to use the command-line versions. These versions are capable of assembling a complete compiler from a JS/CC parser specification, which is then stored to a .js JavaScript source file.
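
To give a feel for what "a regular expression-based lexical analyzer matching individual tokens from the input character stream" means in practice, here is a rough hand-written sketch. It is not JS/CC output and does not use JS/CC's specification format; the rule table and the `tokenize` name are made up for illustration.

```javascript
// Hypothetical token rules for illustration. Order matters: the first rule that matches wins.
// The "y" (sticky) flag makes each regex match only at rule.re.lastIndex.
const RULES = [
  { type: "ws",     re: /\s+/y, skip: true },
  { type: "number", re: /\d+(?:\.\d+)?/y },
  { type: "ident",  re: /[A-Za-z_]\w*/y },
  { type: "op",     re: /[-+*\/()=]/y },
];

function tokenize(input) {
  const tokens = [];
  let pos = 0;
  while (pos < input.length) {
    let matched = false;
    for (const rule of RULES) {
      rule.re.lastIndex = pos; // try this rule exactly at the current position
      const m = rule.re.exec(input);
      if (m) {
        if (!rule.skip) tokens.push({ type: rule.type, text: m[0], pos });
        pos += m[0].length;
        matched = true;
        break;
      }
    }
    if (!matched) {
      throw new SyntaxError(`Unexpected character '${input[pos]}' at offset ${pos}`);
    }
  }
  return tokens;
}
```

Calling `tokenize("x = 3.5 + y1")` yields five tokens (ident, op, number, op, ident), each carrying its text and start offset. A generator such as JS/CC produces something conceptually similar from a specification, rather than having you write the loop yourself.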

Score: 15

If you want to build JavaScript parsers and code generators, check out the MetaII implementation in JavaScript.

A MetaII Compiler tutorial walks you through building a completely self-contained compiler system that can translate itself and other languages:

MetaII Compiler Tutorial

This is all based on an amazing little 10-page technical paper by Val Schorre: META II: A Syntax-Oriented Compiler Writing Language from honest-to-god 1964. The MetaII compiler's complete self-description is about 30 lines! I learned how to build compilers from this back in 1970. There's a mind-blowing moment when you finally grok how the compiler can regenerate itself....

The tutorial explains MetaII, how it works, and implements MetaII compiling MetaII into JavaScript. You can easily modify this compiler to parse other languages, and produce different JavaScript.

I know the website author from my college days, but have nothing to do with the website.

Score: 9

Jison is probably the best and most active lexer & parser generator out there for JavaScript. It mimics Bison and Yacc.

Jison: http://zaach.github.io/jison/

If you want just a lightweight lexer (~100 sloc) you can take a look at Lexed.js: https://github.com/tantaman/lexed.js

Score: 6

For simple parsing tasks I'm quite fond of using a variant of Pratt's Top Down Operator Precedence parser. While Pratt wrote the original paper using an old Lisp dialect, the same concepts can easily be used in most any language. In fact, Douglas Crockford wrote an excellent article on Top Down Operator Precedence parsing in JavaScript, which might be just what you need.
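
For the curious, the core of the technique fits in a few dozen lines. The following is a minimal sketch of the idea, not Crockford's code: it evaluates arithmetic directly instead of building a tree, and the binding-power numbers and the `evaluate` name are arbitrary choices for illustration.

```javascript
// Infix operators with illustrative binding powers (higher binds tighter).
const INFIX = new Map([
  ["+", { bp: 10, apply: (a, b) => a + b }],
  ["-", { bp: 10, apply: (a, b) => a - b }],
  ["*", { bp: 20, apply: (a, b) => a * b }],
  ["/", { bp: 20, apply: (a, b) => a / b }],
]);

function evaluate(source) {
  // Crude tokenizer for the sketch: characters outside this pattern are silently ignored.
  const tokens = source.match(/\d+(?:\.\d+)?|[-+*\/()]/g) || [];
  let i = 0;
  const peek = () => tokens[i];
  const next = () => tokens[i++];

  // Parse an expression, consuming only operators that bind tighter than rbp.
  function expression(rbp) {
    let left = prefix(next());
    for (let op = INFIX.get(peek()); op && op.bp > rbp; op = INFIX.get(peek())) {
      next(); // consume the operator
      left = op.apply(left, expression(op.bp)); // passing op.bp gives left associativity
    }
    return left;
  }

  // Handle tokens that can start an expression: "(", unary "-", or a number.
  function prefix(token) {
    if (token === undefined) throw new SyntaxError("Unexpected end of input");
    if (token === "(") {
      const value = expression(0);
      if (next() !== ")") throw new SyntaxError("Expected ')'");
      return value;
    }
    if (token === "-") return -expression(30); // unary minus binds tighter than * and /
    const n = Number(token);
    if (Number.isNaN(n)) throw new SyntaxError(`Unexpected token '${token}'`);
    return n;
  }

  const result = expression(0);
  if (i < tokens.length) throw new SyntaxError(`Unexpected token '${peek()}'`);
  return result;
}
```

With this, `evaluate("1 + 2 * 3")` gives 7 and `evaluate("10 - 4 - 3")` gives 3. Adding an operator is one more entry in the table, which is most of the appeal for small parsing jobs.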

Score: 6

Here is an example of a parser for a "pseudo" natural language of instructions, which was implemented in pure JavaScript with Chevrotain Parsing DSL:

https://github.com/SAP/chevrotain/blob/master/examples/parser/inheritance/inheritance.js

This example even includes support for multiple natural languages (English & German) using grammar inheritance.

Chevrotain falls under the category of "libraries out there for parsing that are 100% javascript" as it performs no code generation. Using Chevrotain is similar to "hand crafting" a recursive descent parser, only without most of the headache such as:

  • Lookahead function creation (deciding which alternative to take)
  • Automatic Error Recovery.
  • Left recursion detection
  • Ambiguity Detection.
  • Position information.
  • ...

as Chevrotain handles that automatically.
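
For comparison, this is roughly what the "hand crafted" route looks like. The sketch below deliberately does not use Chevrotain's API; it is a plain recursive descent parser for a made-up grammar of nested number lists, showing the lookahead decision and the position bookkeeping you end up writing yourself. All names in it are hypothetical.

```javascript
// Hypothetical grammar: value := number | list ; list := "[" (value ("," value)*)? "]"
function parseValue(text) {
  let pos = 0;

  const isDigit = (c) => c >= "0" && c <= "9";
  const skipSpaces = () => { while (text[pos] === " ") pos++; };
  const fail = (expected) => {
    throw new SyntaxError(`Expected ${expected} at offset ${pos}`);
  };

  function value() {
    skipSpaces();
    // Hand-written lookahead: one character decides which alternative to take.
    if (text[pos] === "[") return list();
    if (isDigit(text[pos])) return number();
    return fail("number or '['");
  }

  function number() {
    const start = pos;
    while (isDigit(text[pos])) pos++;
    // Position information has to be recorded manually on every node.
    return { type: "number", value: Number(text.slice(start, pos)), start, end: pos };
  }

  function list() {
    const start = pos;
    pos++; // consume "["
    const items = [];
    skipSpaces();
    if (text[pos] !== "]") {
      items.push(value());
      skipSpaces();
      while (text[pos] === ",") {
        pos++; // consume ","
        items.push(value());
        skipSpaces();
      }
    }
    if (text[pos] !== "]") fail("',' or ']'");
    pos++; // consume "]"
    return { type: "list", items, start, end: pos };
  }

  const result = value();
  skipSpaces();
  if (pos < text.length) fail("end of input");
  return result;
}
```

Note that this sketch simply throws on the first error; error recovery, ambiguity detection and left recursion detection would all be extra work on top.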

Score: 2

Depending on the design of the 'set of instructions', you may be able to use JavaScript's built-in eval function, which parses JavaScript source; you may be able to write a simple translator to convert the instructions to JavaScript code.

By the way, be very careful about XSS holes.
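
A rough sketch of the translator idea, under heavy assumptions: the three-instruction language below (`set`, `add`, `emit`) is invented for the example, and the generated code is run through the `Function` constructor, a close relative of eval, rather than eval itself. Each line must match a strict pattern before anything is generated, which is one way to keep arbitrary input from reaching the evaluator.

```javascript
// Hypothetical instruction set: "set <name> <int>", "add <name> <int>", "emit <name>".
// Names are restricted to lowercase letters and numbers are re-serialised via Number(),
// so only text produced by these templates ever reaches the generated JavaScript.
const TRANSLATORS = [
  [/^set ([a-z]+) (-?\d+)$/, (_, name, n) => `vars.${name} = ${Number(n)};`],
  [/^add ([a-z]+) (-?\d+)$/, (_, name, n) => `vars.${name} = (vars.${name} || 0) + (${Number(n)});`],
  [/^emit ([a-z]+)$/, (_, name) => `out.push(vars.${name});`],
];

function translate(program) {
  return program
    .split("\n")
    .map((line) => line.trim())
    .filter((line) => line !== "")
    .map((line, index) => {
      for (const [pattern, emit] of TRANSLATORS) {
        const m = pattern.exec(line);
        if (m) return emit(...m);
      }
      // Anything that is not a known instruction is rejected, never passed through.
      throw new SyntaxError(`Unknown instruction #${index + 1}: ${line}`);
    })
    .join("\n");
}

function run(program) {
  const body = [
    '"use strict";',
    "const vars = Object.create(null);", // no inherited properties to collide with
    "const out = [];",
    translate(program),
    "return out;",
  ].join("\n");
  return new Function(body)();
}
```

Here `run("set x 5\nadd x 3\nemit x")` returns `[8]`, while a line such as `alert(1)` is rejected by `translate` before any code is evaluated.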

Score: 2

If you want a lexer and nothing but a lexer then take a look at this: https://github.com/aaditmshah/lexer

It's a pure JavaScript lexer with lots of powerful features written in just a few lines of code.

Score: 1

If you're really looking for just a lexer, try prettify.

Score: 0

I was looking for something similar that wouldn't have any security holes and I came across two resources. They don't parse the script, but actually run it in a "safe" environment - something you can't guarantee when using the eval function. So, I don't know if it's exactly what you are looking for but take a look:

  1. jsandbox - Javascript sandbox
  2. Google Caja - virtual iframe.
