Principled Parsing for Indentation-Sensitive Languages: Revisiting Landin's Offside Rule

Michael D. Adams

Status: Published at POPL 2013

Abstract

Several popular languages, such as Haskell, Python, and F#, use the indentation and layout of code as part of their syntax. Because context-free grammars cannot express the rules of indentation, parsers for these languages currently use ad hoc techniques to handle layout. These techniques tend to be low-level and operational in nature and forgo the advantages of more declarative specifications like context-free grammars. For example, they are often coded by hand instead of being generated by a parser generator.

This paper presents a simple extension to context-free grammars that can express these layout rules, and derives GLR and LR(k) algorithms for parsing these grammars. These grammars are easy to write and can be parsed efficiently. Examples for several languages are presented, as are benchmarks showing the practical efficiency of these algorithms.

Keywords

Parsing, Indentation, Offside rule

Citation

Michael D. Adams. Principled parsing for indentation-sensitive languages: Revisiting Landin’s offside rule. In Proceedings of the 40th annual ACM SIGPLAN-SIGACT symposium on Principles of programming languages, POPL ’13, pages 511–522. ACM, New York, NY, USA, 2013. ISBN 978-1-4503-1832-7. doi: 10.1145/2429069.2429129.

BibTeX Entry

@inproceedings{adams2012layout,
  author = {Adams, Michael D.},
  title = {Principled Parsing for Indentation-Sensitive Languages: Revisiting {L}andin's Offside Rule},
  booktitle = {Proceedings of the 40th annual ACM SIGPLAN-SIGACT symposium on Principles of programming languages},
  pages = {511--522},
  year = {2013},
  series = {POPL~'13},
  address = {New York, NY, USA},
  publisher = {ACM},
  isbn = {978-1-4503-1832-7},
  doi = {10.1145/2429069.2429129},
}

Copyright Notice

© ACM, 2013. This is the author’s version of the work. It is posted here by permission of ACM for your personal use. Not for redistribution. The definitive version was published in Proceedings of the 40th annual ACM SIGPLAN-SIGACT symposium on Principles of programming languages, (2013). http://doi.acm.org/10.1145/2429069.2429129.