Learning Treetop
Sadly, Treetop's documentation sucks. A lot. And the examples on the website aren't helpful. I found that DZone has a pretty large collection of Treetop grammars:
Treetop grammars
Custom Methods for Treetop Syntax Nodes
This is a significant weakness in the design of Treetop.
I (as maintainer) didn't want to slow it down further by
passing yet another argument to every SyntaxNode,
and break any custom SyntaxNode classes folk have
written. These constructors get the "input" object, a Range
that selects part of that input, and optionally an array
of child SyntaxNodes. They should have received the
Parser itself instead of the input as a member.
So instead, for my own use (some years back), I made
a custom proxy for the "input" and attached my Context
to it. You might get away with doing something similar:
https://github.com/cjheath/activefacts-cql/blob/master/lib/activefacts/cql/parser.rb#L203-L249
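The proxy idea can be sketched in plain Ruby. This is only an illustration, not the code from the linked file: InputProxy and ParseContext are hypothetical names, and whether a given Treetop version accepts a non-String input object is an assumption you would need to check against your version.

```ruby
require 'delegate'

# Hypothetical context object that custom syntax nodes want to reach.
ParseContext = Struct.new(:vocabulary)

# Wraps the input string and forwards every String method to it,
# while carrying an extra context attribute along.
class InputProxy < SimpleDelegator
  attr_reader :context

  def initialize(input, context)
    super(input)
    @context = context
  end
end

input = InputProxy.new('John Smith', ParseContext.new('names'))
puts input[0, 4]              # forwarded to the wrapped String
puts input.context.vocabulary # extra data reached through the proxy
```

Since the node constructors receive the "input" object, a custom node handed such a proxy could read input.context without any change to the constructor arguments.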
Ruby Treetop how to include everything that does not match the grammar
It's a common idiom in PEG grammars to repeatedly match any character (.) that isn't part of a rule (!body). Something like this:
rule bodies
  ((!body .)* body)+ (!body .)*
end
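For comparison, the same "any character that does not start a body" idea can be written with a negative lookahead in a plain Ruby regex. This swaps Treetop for Regexp purely for illustration; the BODY pattern (a brace-delimited chunk) and the split_bodies helper are assumptions, not part of the original grammar.

```ruby
# Hypothetical stand-in for the grammar's `body` rule: a brace-delimited chunk.
BODY = /\{[^}]*\}/

# Regex counterpart of (!body .)+ : one or more characters, each checked
# with a negative lookahead so that none of them starts a body.
NOT_BODY = /(?:(?!#{BODY.source}).)+/m

# Splits the text into tagged pieces: bodies and everything in between.
def split_bodies(text)
  text.scan(/(#{BODY.source})|(#{NOT_BODY.source})/m).map do |body, other|
    body ? [:body, body] : [:other, other]
  end
end

p split_bodies('a {x} b {y} c')
```

The text between bodies is kept rather than discarded, which is the point of the idiom: nothing in the input goes unmatched.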
Simplest treetop grammar is returning a parse error, just learning
AFAIK, Treetop starts parsing with the first rule in your grammar (the rule word, in your case!). Now, if your input is 'John Smith' (i.e.: word, s, word), it stops parsing after matching the rule word for the first time, and produces an error when it encounters the first s, since word does not match s.
You need to add a rule to the top of your grammar that describes an entire name: that is a word, followed by a space followed by a word, etc.
grammar FullName
  rule name
    word (s word)* {
      def value
        text_value
      end
    }
  end

  rule word
    [^\s]+ {
      def value
        text_value
      end
    }
  end

  rule s
    [\s]+ {
      def value
        text_value
      end
    }
  end
end
A quick test with the script:
#!/usr/bin/env ruby
require 'rubygems'
require 'treetop'
require 'polyglot'
require 'FullName'
parser = FullNameParser.new
name = parser.parse('John Smith').value
print name
will print:
John Smith
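The difference between the two grammars can be mimicked with Ruby's StringScanner. This is a rough sketch rather than how Treetop works internally; it assumes, as the answer describes, that a parse only succeeds when the first rule accounts for the entire input. The helper names word_only and full_name are made up for the example.

```ruby
require 'strscan'

WORD            = /[^\s]+/
SPACE_THEN_WORD = /[\s]+[^\s]+/

# Mimics a grammar whose first rule is just `word`.
def word_only(input)
  scanner = StringScanner.new(input)
  return false unless scanner.scan(WORD)
  scanner.eos? # leftover input counts as a failed parse
end

# Mimics a grammar whose first rule is `word (s word)*`.
def full_name(input)
  scanner = StringScanner.new(input)
  return false unless scanner.scan(WORD)
  loop { break unless scanner.scan(SPACE_THEN_WORD) }
  scanner.eos?
end

puts word_only('John Smith') # prints false
puts full_name('John Smith') # prints true
```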
Treetop Grammar does not recognize /
I found this issue: https://github.com/nathansobo/treetop/issues/25, and it appears to have answered my question.
My grammar did not contain a top level rule that would allow an opening or closing tag, therefore the second possibility was not even considered:
grammar BBCode
  rule document
    (open_tag / close_tag)
  end

  rule open_tag
    ("[" tag_name "]")
  end

  rule tag_name
    [a-zA-Z\*]+
  end

  rule close_tag
    ("[/" tag_name "]")
  end
end
Rule's order does matter in TreeTop?
I think I just figured out what was wrong! There should be a top rule that includes the other rules, and it has to be placed first in the grammar:
grammar Fortran
  rule statement
    ( id / integer )* {
      def content
        elements.map { |e| e.content }
      end
    }
  end

  rule id
    [a-zA-Z] [a-zA-Z0-9]* {
      def content
        [:id, text_value]
      end
    }
  end

  rule integer
    [1-9] [0-9]* {
      def content
        [:integer, text_value]
      end
    }
  end
end
parser = FortranParser.new
ast = parser.parse('1')
Then the result of ast.content is
[[:integer, "1"]]
Writing Treetop rule to parse input in any order
Here's an example of parsing in any order. The only trouble is you would have to handle duplicates by hand since Treetop doesn't have a rule for unordered-non-repeating elements.
rule top
  ((gender / age_under) ' '?)*
end

rule gender
  'women' / 'men'
end

rule age_under
  'under ' age
end

rule age
  [0-9]+
end
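Handling duplicates by hand could look roughly like the following check, run before or after parsing. It is a plain-Ruby stand-in that re-scans the input with a regex instead of walking the Treetop parse tree; TOKEN and duplicate_categories are hypothetical names chosen for this sketch.

```ruby
# Mirrors the terminals of the gender and age_under rules above.
TOKEN = /women|men|under [0-9]+/

# Returns the categories that occur more than once in the input.
def duplicate_categories(input)
  categories = input.scan(TOKEN).map do |token|
    token.start_with?('under') ? :age_under : :gender
  end
  categories.group_by(&:itself).select { |_, hits| hits.size > 1 }.keys
end

p duplicate_categories('women under 30')     # no category repeats
p duplicate_categories('men under 30 women') # gender appears twice
```

An empty result means every category was given at most once; anything else can be reported back to the user as a conflict.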
Treetop ignore grammar rules
Like Jörg mentioned, you need to use your comma and space rules in the grammar. I built a simple example of what I think you're trying to accomplish below. It should match "100", "1,000", "1,000,000", etc.
If you look at the numeric rule, first I test for an optional minus sign ('-'?), then I test for one to three digits, then I test for zero or more combinations of a comma and three digits.
require 'treetop'
Treetop.load_from_string DATA.read
parser = PovParser.new
p parser.parse('1,000,000')
__END__
grammar Pov
  rule numeric
    '-'? digit 1..3 (comma space* (digit 3..3))*
  end

  rule digit
    [0-9]
  end

  rule comma
    ','
  end

  rule space
    [\s]
  end
end
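As a quick way to sanity-check which strings the numeric rule is meant to accept, the same shape can be written as an anchored Ruby regex. This is an approximation for experimenting, not a replacement for the grammar; NUMERIC and numeric? are names invented for this sketch.

```ruby
# Optional minus sign, one to three digits, then zero or more groups of
# a comma, optional whitespace, and exactly three digits.
NUMERIC = /\A-?[0-9]{1,3}(?:,\s*[0-9]{3})*\z/

def numeric?(text)
  NUMERIC.match?(text)
end

%w[100 1,000 1,000,000 1000 1,00].each do |sample|
  puts "#{sample}: #{numeric?(sample)}"
end
```

The \A and \z anchors play the role of requiring the whole input to be consumed, so "1000" is rejected: only three digits may appear before the first comma.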