Indentation-sensitive parsing for Parsec

Michael D. Adams and Ömer S. Ağacan

Status: Published at Haskell 2014

Note: This paper has an erratum under “Annotation” below.

Abstract

Several popular languages including Haskell and Python use the indentation and layout of code as an essential part of their syntax. In the past, implementations of these languages used ad hoc techniques to implement layout. Recent work has shown that a simple extension to context-free grammars can replace these ad hoc techniques and provide both formal foundations and efficient parsing algorithms for indentation sensitivity.

However, that previous work is limited to bottom-up, LR(k) parsing, and many combinator-based parsing frameworks including Parsec use top-down algorithms that are outside its scope. This paper remedies this by showing how to add indentation sensitivity to parsing frameworks like Parsec. It explores both the formal semantics of and efficient algorithms for indentation sensitivity. It derives a Parsec-based library for indentation-sensitive parsing and presents benchmarks on a real-world language that show its efficiency and practicality.

Keywords

indentation sensitivity; layout; offside rule; parsec; parsing

Annotation

Erratum: In the “Non-Terminal” clause of Figure 6, there is a typo. The “m” in the result should be replaced with “f”.

Citation

Michael D. Adams and Ömer S. Ağacan. Indentation-sensitive parsing for Parsec. In Proceedings of the 2014 ACM SIGPLAN Symposium on Haskell, Haskell ’14, pages 121–132. ACM, New York, NY, USA, September 2014. ISBN 978-1-4503-3041-1. doi: 10.1145/2633357.2633369.

BibTeX Entry

@inproceedings{adams2014layout2,
  author = {Adams, Michael D. and A{\u{g}}acan, {\"O}mer S.},
  title = {Indentation-sensitive parsing for {P}arsec},
  booktitle = {Proceedings of the 2014 ACM SIGPLAN Symposium on Haskell},
  pages = {121--132},
  year = {2014},
  series = {Haskell '14},
  address = {New York, NY, USA},
  month = sep,
  publisher = {ACM},
  isbn = {978-1-4503-3041-1},
  doi = {10.1145/2633357.2633369},
  annote = {**Erratum:** In the "Non-Terminal" clause of Figure 6, there is a typo. The "m" in the result should be replaced with "f".},
}

Copyright Notice

© ACM, 2014. This is the author’s version of the work. It is posted here by permission of ACM for your personal use. Not for redistribution. The definitive version was published in Proceedings of the 2014 ACM SIGPLAN Symposium on Haskell, (2014). http://doi.acm.org/10.1145/2633357.2633369.