Differences in Parse from Rebol 2 to Rebol 3

Accept, Break

Rebol 2: Break existed in later versions of Rebol 2, but I did not learn it at the time, so I include it here. Accept did not exist in Rebol 2.

Rebol 3: Break the match rule immediately, accepting everything matched up to that point. Always returns success.

>> parse [x][some [break] skip]
== true

And

The input will not be advanced after testing the match pattern.

Rebol 2:

>> parse [(x)] [p: any-block! :p paren! :p into [word!] :p skip]
== true

Rebol 3:

>> parse [(x)] [and any-block! and paren! and into [word!] skip]
== true

Any

The match pattern will be tested until it fails, but the rule succeeds even if no match was made.

Rebol 2: Any was not sensitive to whether the input had advanced, so infinite loops were possible if care was not taken.

Rebol 3: Any is sensitive to whether the input has advanced.

Compare with While.

Copy var None

Rebol 2: A copy with NONE would set var to None.

Rebol 3: Var is set to an empty string.

Change, Insert, Remove

Rebol 3: Manipulate the input series.

Change and Remove - Input matched by the match pattern is modified.

Insert and Change - The supplied value is inserted into the series. The value is not evaluated further; it is inserted as is. With the Only keyword, a block or paren is inserted as a single value rather than having its contents spliced in.

>> parse b: [x] [skip insert (y z)]
== true
>> b
== [x y z]
>> parse b: [x] [skip insert only (y z)]
== true
>> b
== [x (y z)]
>> parse b: [x] [change word! [y z]]
== true
>> b
== [y z]
>> parse b: [x] [change word! only [y z]]
== true
>> b
== [[y z]]
>> parse b: [x y z] [remove 2 word! skip]
== true
>> b
== [z]

It is important to note that when the value parameter supplied to Change or Insert is a word! then the word's value is inserted:

>> the-answer: 42
== 42
>> parse s: [x] [change word! the-answer]
== true
>> s
== [42]
>> parse s: [x] [change word! only now]
== true
>> s
== [make native! [[
        "Returns date and time."
        /year "Returns year only"
        /month "Returns month only"
...

Do

Rebol 3: Do evaluates the next expression within the input (like DO/next). The result, as a block, is then tested by the match pattern argument (like doing an Into on the result of the evaluation).

>> parse [now][do date!]
== true
>> parse [now 1 / 0] [do [result: date!] 3 skip]
== true
>> result
== [13-May-2013/22:43:47+10:00]

Another example:

>> parse [now/date now/time][return 2 do [?? skip]]
skip: [29-Jul-2013]
skip: [14:40:22]
== [now/date now/time]

Interestingly, when None is given as the match pattern argument to Do, a test (NONE?) is performed. Normally None as a match pattern in Parse is equivalent to a no-op:

>> parse [x][none 'x] ; Normally None is a no-op.
== true

>> parse [(none)] [do none] ; Here none is equivalent to a test for None
== true

Fail

Cause the rule to fail.

Rebol 2: An impossible match pattern was used to force the rule to fail.

parse [x][end skip | 'x]

Rebol 3: Fail makes the rule fail immediately.

parse [x] [ fail | 'x]

If

Rebol 3: A guard condition. The result of the expression determines if the current rule continues or fails.

>> parse [x] [if (1 = 1) skip]
== true
>> parse [x] [if (1 = 0) skip]
== false

Into

Rebol 2: Into enters block! types only; if the value is not a block, Into fails.

>> parse [{string}][into ['w]]
== false

Rebol 3: Into will enter series types including string! with success.

>> parse [{string}][into ['w]]
** Script error: PARSE - invalid rule or usage of rule: 'w
** Where: parse
** Near: parse ["string"] [into ['w]]

The error shown above is a result of parse being asked to match a character of the string with 'w.

So to port Into from Rebol 2 to Rebol 3, explicitly test for any-block! before the Into.

>> parse [{string}][and any-block! into ['w]]
== false

Not

Rebol 2: Required complex dynamic rules to simulate Not. For example:

>> parse [x][(guard: none) opt ['y (guard: [end skip])] guard skip]
== true

Rebol 3: Inverts the match status of the next match pattern.

NOT does not consume input, which makes it like AND in that respect, so you can use it as a guard for subsequent rules.

>> parse [x][not 'y skip]
== true
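
The non-consuming behaviour of AND and NOT can be modelled outside Rebol. The following is only a rough Python sketch under stated assumptions: rules are modelled as functions from a position to a new position (or None on failure), and the names word, skip, and_, not_ and parse are hypothetical helpers invented for this illustration. It is not how Rebol implements Parse.

```python
# Toy model of lookahead rules; this is NOT how Rebol implements Parse.
# Assumption: a rule is a function (data, pos) -> new position, or None on failure.

def word(name):
    """Match one specific item (stands in for a lit-word such as 'x)."""
    def rule(data, pos):
        return pos + 1 if pos < len(data) and data[pos] == name else None
    return rule

def skip(data, pos):
    """Match any single item."""
    return pos + 1 if pos < len(data) else None

def and_(rule):
    """Succeed where rule succeeds, but leave the position unchanged."""
    def lookahead(data, pos):
        return pos if rule(data, pos) is not None else None
    return lookahead

def not_(rule):
    """Succeed where rule fails, again without moving the position."""
    def lookahead(data, pos):
        return pos if rule(data, pos) is None else None
    return lookahead

def parse(data, rules):
    """Run the rules in sequence; succeed only if all input is consumed."""
    pos = 0
    for rule in rules:
        pos = rule(data, pos)
        if pos is None:
            return False
    return pos == len(data)

print(parse(["x"], [not_(word("y")), skip]))  # models [not 'y skip]
print(parse(["x"], [and_(word("x")), skip]))  # models [and 'x skip]
print(parse(["x"], [not_(word("x")), skip]))  # the guard rejects the input
```

Because both wrappers return the original position, the rule that follows them (skip here) still sees the same input, which is what makes them usable as guards.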

Quote

Rebol 3: Matches the argument exactly as it is except for paren!:

>> parse [x] [quote x]
== true
>> parse ['x] [quote 'x]
== true
>> parse [[x]] [quote [x]]
== true

Parens are evaluated:

>> parse [x][thru [quote (print now 'x)]]
16-May-2013/14:38:25+10:00
== true

Reject

Rebol 3: Break the match repetition loop, rejecting everything matched during the loop.

>> parse [x x x ] [ any [word! | integer! reject] | p:]
== true
>> parse [x x x 1] [ any [word! | integer! reject] | p:]
== false
>> p
== [x x x 1]

Return

Rebol 3: Cause parse to quit rule processing and return a value immediately.

Return can take a match pattern as an argument and return the matched input as the return value of Parse.

>> parse [x 9 y][any [word! | return integer!]]
== [9]

Return can also take a paren expression and return the result of the evaluation as the return value of Parse.

>> parse [x][return (now)]
== 15-May-2013/17:13:04+10:00

Then

Rebol 3: Used within a set of alternate rules to disable the next alternate rule.

As soon as THEN is encountered the next alternate rule, the one after the current one, is disabled temporarily. Rule matching continues after the THEN. Further alternates are unaffected. I found it helps to mentally translate THEN to "Disable Alternate" while learning this keyword.

>> parse [x] [  ?? 'x THEN fail | ?? word! | ?? skip ]
'x: [x]
skip: [x]
== true

Commenting the above example:

parse [x] [
    ?? 'x ; This is matched, but...
    THEN ; This disables the next alternate rule (starting at the next |)
    fail ; This rule fails, so the next enabled rule will be tried.
    | ?? word! ; This rule has been disabled by the THEN
    | ?? skip ; This alternate is unaffected, so it will match the 'x
]

In the following example notice how the third alternate is matched the second time around the Any loop, because that's the only time the first alternate matches and activates the Then:

>> parse "abcd" [any [?? "bc" THEN fail | skip (print 1) | skip (print 2)]]
"bc": "abcd"
1
"bc": "bcd"
2
"bc": "cd"
1
"bc": "d"
1
"bc": ""
== true

In the following example, when the second alternate is skipped by the Then, there are no further alternates to try and the Any completes with failure:

; by rgchris
>> parse "abcd" [any [?? "bc" THEN fail | skip]]
"bc": "abcd"
"bc": "bcd"
== false
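
The "Disable Alternate" reading of THEN can be sketched as a small Python model. This is a hedged illustration only: it assumes THEN simply marks the next alternate as skipped, as described above, and it borrows the observation made elsewhere in this note that Any stops when the input does not advance. Every name in it (THEN, literal, skip, fail, emit, match_alternates, any_loop, run) is hypothetical and invented for the sketch.

```python
# Toy model of THEN as "disable the next alternate"; NOT Rebol's implementation.
# Assumption: a rule is a function (text, pos) -> new position, or None on failure.

THEN = object()  # sentinel standing in for the THEN keyword

def literal(s):
    """Match the string s at the current position."""
    def rule(text, pos):
        return pos + len(s) if text.startswith(s, pos) else None
    return rule

def skip(text, pos):
    """Match any single character."""
    return pos + 1 if pos < len(text) else None

def fail(text, pos):
    """Always fail."""
    return None

def emit(label, log):
    """Record a label without consuming input (stands in for (print 1))."""
    def rule(text, pos):
        log.append(label)
        return pos
    return rule

def match_alternates(alternates, text, pos):
    """Try each alternate from pos; reaching THEN disables the following one."""
    skip_next = False
    for alt in alternates:
        if skip_next:          # this alternate was disabled by a THEN
            skip_next = False
            continue
        cur = pos
        for step in alt:
            if step is THEN:
                skip_next = True
                continue
            cur = step(text, cur)
            if cur is None:
                break          # this alternate failed, move on
        else:
            return cur         # every step matched
    return None

def any_loop(alternates, text, pos):
    """Repeat the alternates until they fail or stop advancing."""
    while True:
        new_pos = match_alternates(alternates, text, pos)
        if new_pos is None or new_pos == pos:
            return pos
        pos = new_pos

def run(text, alternates):
    """Succeed only if the loop consumed the whole input."""
    return any_loop(alternates, text, 0) == len(text)

log = []
three = [[literal("bc"), THEN, fail], [skip, emit(1, log)], [skip, emit(2, log)]]
print(run("abcd", three), log)

two = [[literal("bc"), THEN, fail], [skip]]
print(run("abcd", two))
```

With three alternates the model records the same 1, 2, 1, 1 sequence as the transcript above. With two alternates it stops at "bcd" and reports failure, because the only remaining alternate has been disabled.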

To, Thru

Rebol 2: To and Thru are limited to simple match patterns.

Rebol 3: To and Thru can take a rule argument now.

; by rgchris
>> parse [a b] [thru ['b | 'a] 'b]
== true

However, not every arbitrary rule is supported (bug?).

While

Rebol 3: Matches zero or more repetitions of a pattern, like Any does. But unlike Any, While is insensitive to whether the input has advanced. This makes it useful with input modification commands.

>> c: 0 parse [x] [while [if (c: c + 1 ?? c < 3)] 'x]
c: 1
c: 2
c: 3
== true

Any is sensitive to whether the input position has changed:

>> parse s: [1 2 x ][any [remove integer!] 'x]
== false
>> s
== [2 x]

While is not sensitive to whether the input position has changed:

>> parse s: [1 2 x ][while [remove integer!] 'x]
== true
>> s
== [x]
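
The two transcripts above can be reproduced with a small Python model of the loop. This is a sketch under assumptions, not Rebol's implementation: it assumes the only difference between Any and While is the "did the position advance?" check, and the names remove_integer, repeat and run are hypothetical.

```python
# Toy model of the Any / While difference; NOT Rebol's actual implementation.
# Assumption: a rule is a function (data, pos) -> new position, or None on failure.

def remove_integer(data, pos):
    """Model of [remove integer!]: delete an integer at pos; the position stays put."""
    if pos < len(data) and isinstance(data[pos], int):
        del data[pos]
        return pos
    return None

def repeat(rule, data, pos, require_progress):
    """Apply rule until it fails. With require_progress (the Any case), also
    stop as soon as an iteration succeeds without advancing the position."""
    while True:
        new_pos = rule(data, pos)
        if new_pos is None:
            return pos
        if require_progress and new_pos == pos:
            return pos
        pos = new_pos

def run(data, require_progress):
    """Model of [any-or-while [remove integer!] 'x], then the end-of-input check."""
    pos = repeat(remove_integer, data, 0, require_progress)
    matched_x = pos < len(data) and data[pos] == "x"
    return matched_x and pos + 1 == len(data)

s = [1, 2, "x"]
print(run(s, require_progress=True), s)   # the Any case

s = [1, 2, "x"]
print(run(s, require_progress=False), s)  # the While case
```

In the Any case the loop stops after the first removal because the position did not move, leaving an integer in front of the x. The While case keeps going until the rule itself fails. The flip side is that a While over a rule that always succeeds without advancing never terminates in this model.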

??

Rebol 3: Debugging output showing next rule and input position.

>> parse [x y z ][skip ?? skip]
skip: [y z]
== false

Parsing path!

Rebol 2 and Rebol 3 return the series reference differently when parsing paths. For example, given:

parse [a/b] [into [x:]]

Rebol 2:

>> x
== [a b]

Rebol 3:

>> x
== a/b

Oddities/Bugs

Accept is the same as Break and seems to pair with Reject, but it is not documented.

The following does not work (saphiron-view version: 2.101.0.3.1 build: 22-Feb-2013/11:09:25), which I think is a bug:

>> parse [[x]] [thru [quote [x]]]
** Script error: PARSE - invalid rule or usage of rule: [x]
** Where: parse
** Near: parse [[x]] [thru [quote [x]]]

The rules that To and Thru take are not general (at this time at least). Perhaps it is due to this bug:

>> parse [a b] [thru [?? end]]
** Script error: PARSE - invalid rule or usage of rule: ??
** Where: parse
** Near: parse [a b] [thru [?? end]]

Return with Do as an argument is odd:

>> parse [now] [return do date!]
== [now]

At the moment, Do appears to have a bug when Quote is used:

>> parse [(to lit-word! 'x)][do quote 'x] ; Bugged?
== false
>> parse [(to lit-word! 'x)][do [quote 'x]]
== true

And here:

>> parse [1][do [p: none]] ; Bugged?
== true
>> p
== [1]
>> parse [1][do none]
== false

I think the new Parse grammar and its implementation should be thought through more.