In language/regexes§
See primary documentation in context for Longest alternation: |
In short, in regex branches separated by |, the longest token match wins, independent of the textual ordering in the regex. However, what | really does is more than that. It does not decide which branch wins after finishing the whole match, but follows the longest-token matching (LTM) strategy.
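A minimal sketch of what "independent of the textual ordering" means in practice. It contrasts | with the sequential alternation operator ||, which is not covered in this section and is brought in here only for comparison; || tries branches left to right and takes the first one that matches.

```raku
# Longest-token alternation: the longer branch wins even though it is written second.
say "abc" ~~ / a | ab /;    # OUTPUT: «「ab」»

# Sequential alternation: the first branch that matches wins.
say "abc" ~~ / a || ab /;   # OUTPUT: «「a」»
```

Swapping the order of the branches changes the result of the || version but not of the | version.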
Briefly, what | does is this:
First, select the branch which has the longest declarative prefix.
say "abc" ~~ /ab | a.* /; # OUTPUT: «「abc」»
say "abc" ~~ /ab | a {} .* /; # OUTPUT: «「ab」»
say "if else" ~~ / if | if <.ws> else /; # OUTPUT: «「if」»
say "if else" ~~ / if | if \s+ else /; # OUTPUT: «「if else」»
As shown above, a.* is a declarative prefix, whereas in a {} .* the prefix terminates at {}, so its declarative prefix is just a. Non-declarative atoms terminate the declarative prefix. This matters if you use | in a rule: a rule automatically enables :s, and the implicit <.ws> then terminates the declarative prefix.
If it's a tie, select the match with the highest specificity.
say "abc" ~~ /a. | ab /; # OUTPUT: «「ab」»
When two alternatives match at the same length, the tie is broken by specificity. That is, ab, as an exact match, counts as closer than a., which uses character classes.
If it's still a tie, use additional tie-breakers.
say "abc" ~~ /a\w | a. /; # OUTPUT: «「ab」»
If the tie breaker above doesn't work, then the textually earlier alternative takes precedence.
For more details, see the LTM strategy.
Quoted lists are LTM matches§
Using a quoted list in a regex is equivalent to specifying the longest-match alternation of the list's elements. So, the following match:
say 'food' ~~ /< f fo foo food >/; # OUTPUT: «「food」»
is equivalent to:
say 'food' ~~ / f | fo | foo | food /; # OUTPUT: «「food」»
Note that the space after the first < is significant here: <food> calls the named rule food while < food > and < food> specify quoted lists with a single element, 'food'.
If the first branch is an empty string, it is ignored. This allows you to format your regexes consistently:
/
    | f
    | fo
    | foo
    | food
/
Arrays can also be interpolated into a regex to achieve the same effect:
my @increasingly-edible = <f fo foo food>;
say 'food' ~~ /@increasingly-edible/; # OUTPUT: «「food」»
This is documented further under Regex Interpolation, below.
In language/signatures§
See primary documentation in context for Capture parameters
Prefixing a parameter with a vertical bar | makes the parameter a Capture, using up all the remaining positional and named arguments.
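As a rough illustration of "using up all the remaining positional and named arguments", the sketch below defines a hypothetical helper, count-args (not part of Raku or of this documentation), that reports how many arguments of each kind ended up in the Capture.

```raku
# Hypothetical helper: |c swallows every argument, positional and named alike.
sub count-args(|c) {
    # c.list holds the positional part of the Capture, c.hash the named part.
    "{c.list.elems} positional, {c.hash.elems} named"
}

say count-args(1, 2, :x);            # OUTPUT: «2 positional, 1 named»
say count-args("a", :y(3), :z(4));   # OUTPUT: «1 positional, 2 named»
say count-args();                    # OUTPUT: «0 positional, 0 named»
```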
This is often used in proto definitions (like proto foo (|) {*}) to indicate that the routine's multi definitions can have any type constraints. See proto for an example.
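A small sketch of the proto pattern just described. The routine name describe and its multi candidates are made up for illustration; the point is only that a proto with a bare | places no constraints of its own, so each multi is free to declare its own signature.

```raku
# The proto accepts anything; dispatch is decided by the multi signatures.
proto describe(|) {*}

multi describe(Int $n)              { "Int: $n" }
multi describe(Str $s)              { "Str: $s" }
multi describe(Int $n, Str :$unit!) { "$n $unit" }

say describe(42);               # OUTPUT: «Int: 42»
say describe("answer");         # OUTPUT: «Str: answer»
say describe(7, unit => "km");  # OUTPUT: «7 km»
```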
If bound to a variable, arguments can be forwarded as a whole using the slip operator |.
sub a(Int $i, Str $s) { say $i.^name ~ ' ' ~ $s.^name }
sub b(|c) { say c.^name; a(|c) }
b(42, "answer");
# OUTPUT: «Capture␤Int Str»