Range Operators

Binary ".." is the range operator, which is really two different operators depending on the context. In list context, it returns a list of values counting (up by ones) from the left value to the right value. If the left value is greater than the right value then it returns the empty list. The range operator is useful for writing foreach (1..10) loops and for doing slice operations on arrays. In the current implementation, no temporary array is created when the range operator is used as the expression in foreach loops, but older versions of Perl might burn a lot of memory when you write something like this:

   for (1 .. 1_000_000) {
       # code
   }

The range operator also works on strings, using the magical auto-increment; see below.
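As a quick illustration of the list-context rules described above, here is a minimal sketch. The variable names are made up for this example, and the `reverse` idiom for counting down is a common workaround rather than something the range operator does itself.

```perl
use strict;
use warnings;

# Counting up by ones from the left value to the right value.
my @up = (3 .. 7);            # 3, 4, 5, 6, 7

# Left value greater than right value: the empty list, not a countdown.
my @none = (7 .. 3);

# To count down, reverse an ascending range instead.
my @down = reverse(3 .. 7);   # 7, 6, 5, 4, 3

print "up:   @up\n";
print "none: ", scalar(@none), " elements\n";
print "down: @down\n";
```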

In scalar context, ".." returns a boolean value. The operator is bistable, like a flip-flop, and emulates the line-range (comma) operator of sed, awk, and various editors. Each ".." operator maintains its own boolean state, even across calls to a subroutine that contains it. It is false as long as its left operand is false. Once the left operand is true, the range operator stays true until the right operand is true, AFTER which the range operator becomes false again. It doesn't become false until the next time the range operator is evaluated. It can test the right operand and become false on the same evaluation on which it became true (as in awk), but it still returns true once. If you don't want it to test the right operand until the next evaluation, as in sed, just use three dots ("...") instead of two. In all other regards, "..." behaves just like ".." does.

The right operand is not evaluated while the operator is in the "false" state, and the left operand is not evaluated while the operator is in the "true" state. The precedence is a little lower than || and &&. The value returned is either the empty string for false, or a sequence number (beginning with 1) for true. The sequence number is reset for each range encountered. The final sequence number in a range has the string "E0" appended to it, which doesn't affect its numeric value, but gives you something to search for if you want to exclude the endpoint. You can exclude the beginning point by waiting for the sequence number to be greater than 1.
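The sequence numbers and the "E0" marker can be seen in a small sketch like the following. The sample data and the `BEGIN`/`END` marker strings are invented for illustration. Because the operands here are comparisons rather than constants, they are not compared against `$.`.

```perl
use strict;
use warnings;

my @lines = qw(alpha BEGIN one two END omega);
my @seq;      # raw value returned by the flip-flop for each line
my @inner;    # lines strictly between the two endpoints

for my $line (@lines) {
    # Scalar assignment puts ".." in scalar (flip-flop) context.
    my $n = ($line eq 'BEGIN') .. ($line eq 'END');
    push @seq, $n;

    # Skip the start (sequence number 1) and the end (ends in "E0").
    push @inner, $line if $n && $n > 1 && $n !~ /E0\z/;
}

print join(",", map { "[$_]" } @seq), "\n";
print "inner: @inner\n";
```

Under these assumptions the flip-flop yields the empty string outside the range, 1, 2, 3 inside it, and "4E0" on the closing line, so `@inner` ends up holding only `one` and `two`.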

If either operand of scalar ".." is a constant expression, that operand is considered true if it is equal (==) to the current input line number (the $. variable).

To be pedantic, the comparison is actually int(EXPR) == int(EXPR), but that is only an issue if you use a floating point expression; when implicitly using $. as described in the previous paragraph, the comparison is int(EXPR) == int($.), which is only an issue when $. is set to a floating point value and you are not reading from a file. Furthermore, "span" .. "spat" or 2.18 .. 3.14 will not do what you want in scalar context because each operand is evaluated using its integer representation.
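To see the implicit comparison against $. in action without needing a real file, one option is to read from an in-memory filehandle. This is only a sketch; the six-line sample text and the variable names are assumptions made for the example.

```perl
use strict;
use warnings;

# Build six lines of sample input and read them through an
# in-memory filehandle so that $. is maintained as for a file.
my $text = join '', map { "line $_\n" } 1 .. 6;
open my $fh, '<', \$text or die "cannot open in-memory file: $!";

my @picked;
while (my $line = <$fh>) {
    chomp $line;
    # Constant operands: shorthand for ($. == 2) .. ($. == 4).
    push @picked, $line if 2 .. 4;
}
close $fh;

print "$_\n" for @picked;
```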

Examples:

As a scalar operator:

if (101 .. 200) { print; } # print 2nd hundred lines, short for
                           #  if ($. == 101 .. $. == 200) { print; }

next LINE if (1 .. /^$/);  # skip header lines, short for
                           #   next LINE if ($. == 1 .. /^$/);
                           # (typically in a loop labeled LINE)

s/^/> / if (/^$/ .. eof());  # quote body

# parse mail messages
while (<>) {
    $in_header =   1  .. /^$/;
    $in_body   = /^$/ .. eof;
    if ($in_header) {
        # do something
    } else { # in body
        # do something else
    }
} continue {
    close ARGV if eof;             # reset $. each file
}

Here's a simple example to illustrate the difference between the two range operators:

@lines = ("   - Foo",
          "01 - Bar",
          "1  - Baz",
          "   - Quux");

foreach (@lines) {
    if (/0/ .. /1/) {
        print "$_\n";
    }
}

This program will print only the line containing "Bar". If the range operator is changed to "...", it will also print the "Baz" line.

And now some examples as a list operator:

for (101 .. 200) { print }      # print $_ 100 times
@foo = @foo[0 .. $#foo];        # an expensive no-op
@foo = @foo[$#foo-4 .. $#foo];  # slice last 5 items

The range operator (in list context) makes use of the magical auto-increment algorithm if the operands are strings. You can say

@alphabet = ("A" .. "Z");

to get all normal letters of the English alphabet, or

$hexdigit = (0 .. 9, "a" .. "f")[$num & 15];

to get a hexadecimal digit, or

@z2 = ("01" .. "31");
print $z2[$mday];

to get dates with leading zeros.

If the final value specified is not in the sequence that the magical increment would produce, the sequence goes until the next value would be longer than the final value specified.
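The rule above is easier to see with two short ranges. This is a sketch with made-up variable names: the first range ends on a value the magical increment actually reaches, and the second ends on one it never produces.

```perl
use strict;
use warnings;

# "ab" is reached by magical increment from "y", so the range ends there.
my @in_seq = ("y" .. "ab");       # y, z, aa, ab

# "E" is never produced when incrementing from "a" (case is preserved),
# so the range runs until the next value ("aa") would be longer than "E".
my @not_in_seq = ("a" .. "E");    # a through z

print "in sequence:     @in_seq\n";
print "not in sequence: ", scalar(@not_in_seq), " values, last is $not_in_seq[-1]\n";
```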

If the initial value specified isn't part of a magical increment sequence (that is, if it isn't a non-empty string matching /^[a-zA-Z]*[0-9]*\z/), only the initial value will be returned. So the following will only return an alpha:

use charnames "greek";
my @greek_small =  ("\N{alpha}" .. "\N{omega}");

To get the 25 traditional lowercase Greek letters, including both sigmas, you could use this instead:

use charnames "greek";
my @greek_small =  map { chr } ( ord("\N{alpha}") 
                                    ..
                                 ord("\N{omega}") 
                               );

However, because there are many more lowercase Greek characters than just those, to match lowercase Greek characters in a regular expression, you could use the pattern /(?:(?=\p{Greek})\p{Lower})+/ (or the experimental feature /(?[ \p{Greek} & \p{Lower} ])+/).
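A minimal sketch of the lookahead pattern in use follows. The sample string is an assumption made for this example and is written with \x{...} escapes so the source stays ASCII; it holds Latin "abc", the Greek small letters alpha, beta, gamma, and the Greek capitals Delta and Epsilon.

```perl
use strict;
use warnings;

# Latin letters, then Greek small alpha-beta-gamma, then Greek capitals.
my $text = "abc \x{3b1}\x{3b2}\x{3b3} \x{394}\x{395}";

# Capture each run of characters that are both Greek and lowercase.
my @runs = $text =~ /((?:(?=\p{Greek})\p{Lower})+)/g;

# Encode output as UTF-8 so printing the Greek letters does not warn.
binmode STDOUT, ':encoding(UTF-8)';
print scalar(@runs), " run(s): @runs\n";
```

Only the lowercase Greek run should be captured; the Latin letters fail the \p{Greek} lookahead and the capitals fail \p{Lower}.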

Because each operand is evaluated in integer form, 2.18 .. 3.14 will return two elements in list context.

@list = (2.18 .. 3.14); # same as @list = (2 .. 3);