Perl regular expressions and position of match

Dear all,
I would be very grateful if you could help me with the meaning of the following:

>Perl saves the positions of matches in the special arrays @- and @+
>The variables $-[0] and $+[0] are the start and end of the entire match
>The rest hold the starts and ends of the memories (brackets):

Example:

               
               3       10 14 16   20
 $line = "   CDS    4815..5888";
 $line =~ m/CDS\s+(\d+)\.\.(\d+)/;
 print " starts: @- 
 ends:  @+ 
";
 starts: 3 10 16 
 ends:  20 14 20

a) I do not understand the numbering of the starting and ending positions (how is it counted?)
b) What do $-[0] and $+[0] hold in this example?
c) In this example @- and @+ do not save the positions of matches (as mentioned at the beginning of the
paragraph) but hold the starts and ends of the memories.

I look forward to hearing from you,
mariaig

a. Positions in strings are counted from 0.
b. $-[0] is 3 and $+[0] is 20
c. Why do you think they are not the positions of matches?
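
To make points a and b concrete, here is a minimal sketch that reuses the string and pattern from the question. The names @starts and @ends and the output labels are mine, chosen for illustration only; the script copies @- and @+ and prints them side by side for each group.

```perl
use strict;
use warnings;

my $line = "   CDS    4815..5888";
my (@starts, @ends);

if ($line =~ m/CDS\s+(\d+)\.\.(\d+)/) {
    # Copy the offsets right away; the next successful match overwrites them.
    @starts = @-;
    @ends   = @+;
}

# Index 0 is the whole match; 1 and 2 are the two bracketed groups.
for my $i (0 .. $#starts) {
    printf "group %d: start %d, end %d\n", $i, $starts[$i], $ends[$i];
}
```

With this string, group 0 should report start 3 and end 20, which are the values of $-[0] and $+[0].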

Dear ken_yap,
thank you for your reply,

>a. Positions in strings are counted from 0.
>b. $-[0] is 3 and $+[0] is 20

I understand a and b very well.
However, how can position 14 be after three digits from position 10? How can 16 be after three spaces from 14?

           3       10 14 16   20

$line = "   CDS    4815..5888";
$line =~ m/CDS\s+(\d+)\.\.(\d+)/;

(ignore c )

I look forward to hearing from you,
thank you again,
mariaig

Let’s look at your string with the positions underneath (I omit the tens digit).

   CDS    4815..5888
012345678901234567890

So the whole match is from position 0 to position 20. The end of a match is the position just after its last character; this is the key insight you need.

The first capturing match is from 10 to 14.

The second capturing match is from 16 to 20.
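
Because the end offset is the position after the last matched character, end minus start is the length of the matched text. A small sketch under that reading, reusing the string and pattern from the question (the variable names $whole, $first and $second are mine):

```perl
use strict;
use warnings;

my $line = "   CDS    4815..5888";
my ($whole, $first, $second);

if ($line =~ m/CDS\s+(\d+)\.\.(\d+)/) {
    # substr takes (string, start, length); length is end minus start.
    $whole  = substr($line, $-[0], $+[0] - $-[0]);
    $first  = substr($line, $-[1], $+[1] - $-[1]);
    $second = substr($line, $-[2], $+[2] - $-[2]);

    printf "whole:  <%s>\n", $whole;    # same text as $&
    printf "first:  <%s>\n", $first;    # same text as $1
    printf "second: <%s>\n", $second;   # same text as $2
}
```

If the end offset pointed at the last character instead of one past it, each substr call would need an extra + 1 on the length.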

Incidentally, most people are not interested in the positions of the matches; they just want to extract the matched text. You can get the matches without knowing the positions.
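
For example, a match in list context returns the text captured by the brackets directly. This is only a sketch with the string from the question; the names $from and $to are mine.

```perl
use strict;
use warnings;

my $line = "   CDS    4815..5888";

# In list context a successful match returns the captured groups.
my ($from, $to) = $line =~ m/CDS\s+(\d+)\.\.(\d+)/;

if (defined $from) {
    print "from $from to $to\n";
}
```

The same values are also available as $1 and $2 right after a successful match.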

I just noticed a small error in my previous post. The whole match is from position 3 to 20. Your misunderstanding was counting the positions from the letter C. The positions are actually counted from the beginning of the string, i.e. from the first blank.
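
A quick way to see that the leading blanks count is to run the same pattern against the string with and without them. This is a sketch only; the array @whole_starts is a name I made up to collect the results.

```perl
use strict;
use warnings;

my @whole_starts;

# Same text, first without and then with the three leading blanks.
for my $s ("CDS    4815..5888", "   CDS    4815..5888") {
    if ($s =~ m/CDS\s+(\d+)\.\.(\d+)/) {
        push @whole_starts, $-[0];
        printf "whole match starts at %d\n", $-[0];
    }
}
```

The first string should report 0 and the second 3, since each leading blank shifts every offset by one.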

Dear Ken_yap,
thank you very much for your reply,
I think there must be a small error in the counting, but it is not important. I think the most important point of the code is that @+ and @- capture the starts and ends of the positions where the pattern matches the string.
Thank you very much for your replies, and moreover for coming back to your previous post.
mariaig

No, there is no error in the behaviour of the Perl code. Imagine a language that didn’t behave as specified in the documentation. It would drive programmers mad (both in the meanings of crazy and angry). There may be an error in your data though.
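
One way to check the data is to print the string over a ruler of positions, so that you can see exactly how many blanks it contains. A minimal sketch, assuming the string is the one from the question; $ruler is a name I chose for illustration.

```perl
use strict;
use warnings;

my $line = "   CDS    4815..5888";

# Units digit of every position, plus one extra for the end offset.
my $ruler = '';
$ruler .= $_ % 10 for 0 .. length($line);

print "$line\n$ruler\n";
printf "length: %d\n", length($line);
```

If the string in your real data has a different number of blanks, or tabs instead of blanks, the ruler and the length will show it.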