Perl regex question

How can I match a group of characters without matching its boundaries?
Something like the following:


my $text = "-probando regex-";
$text =~ m/-(\w|\s)+(?=-)/g;
print "$&\n";

I want to match only the text between “-” and “-”.
The output for the code above is:


-probando regex

but it matches the first “-”. I use the expression “(?=-)” to require the final “-” in the regex without including it in the matched string. I thought using it at the beginning would work too, but it does not.
The following code matches nothing.


$text =~ m/(?=-)(\w|\s)+(?=-)/g;

Help please.

Hi
You could treat the - as a word?


$text =~ m/\w(\w+|\s)+(?=-)/g;


Cheers Malcolm °¿° (Linux Counter #276890)

This works for me to just get ‘probando regex’ back:

my $text = "-probando regex-";
$text =~ m/-([\s\w]+)/g;
print "$1\n";

Good luck.


Thank you very much for the quick answer!
I actually did try those two ways (with minor variations), but there should be a more elegant way to do it. I’m looking for that particular approach because the real string is much more complicated than this one. I am improving an INI parser (that I wrote some time ago) with more sophisticated regular expressions :)
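
A note on the lookahead attempt from the first post: `(?=-)` at the start of the pattern only checks that the next character is “-” without consuming it, so `(\w|\s)+` then has to match that same “-” and fails. A lookbehind, `(?<=-)`, checks the character before the match instead. The following is only a minimal sketch, assuming the text between the dashes contains nothing but word characters and whitespace:

```perl
use strict;
use warnings;

my $text = "-probando regex-";

# (?<=-) requires a "-" immediately before the match, (?=-) requires one
# immediately after it; neither dash becomes part of $&.
my $found = '';
if ($text =~ m/(?<=-)[\w\s]+(?=-)/) {
    $found = $&;
    print "$found\n";    # prints "probando regex"
}
```

The character class `[\w\s]` is used here in place of the alternation `(\w|\s)`; both accept the same characters, but the class avoids an unnecessary capture group.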

What exactly are you trying to do, then? Depending on your needs there
may be really simple ways to reach your ends, but otherwise there are probably
a half-dozen ways to solve this particular question.

Good luck.



my $blockname = 'GENERAL';
# Match the "[GENERAL]" header and everything up to the next "[".
$whole_text =~ m/\[\Q$blockname\E\][^\[]+/;
return $&;

The code above matches the block specified by the block name.


[GENERAL]
port = "7777";
dbserver = "192.168.10.95";
[OTHER]
other = "text";

Running the previous code on the text above produces the following output:


[GENERAL]
port = "7777";
dbserver = "192.168.10.95";

I want to avoid explicitly removing the block name from the matched string. In other words, I don’t want to match the string “[GENERAL]”.
If there is a solution, I’ll use it to do the same thing with “<text>”, and match only <text>.
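
One possible way to leave the header out of the match is `\K` (available from Perl 5.10), which drops everything matched so far from `$&`. This is only a sketch under some assumptions: `block_body` is a hypothetical helper name, the header sits alone on its own line, and no value inside the block contains a “[”.

```perl
use strict;
use warnings;

my $whole_text = <<'END';
[GENERAL]
port = "7777";
dbserver = "192.168.10.95";
[OTHER]
other = "text";
END

# Hypothetical helper: returns the body of the named block without its header.
sub block_body {
    my ($text, $blockname) = @_;
    # The header must match, but \K discards it from $&.
    # [^\[]* then runs up to the next "[" (assumed to start the next block).
    if ($text =~ m/^\[\Q$blockname\E\]\n\K[^\[]*/m) {
        return $&;
    }
    return;
}

print block_body($whole_text, 'GENERAL');
```

A capture group around `[^\[]*` read back through `$1` would give the same result on older Perl versions.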

Well, a couple things come to mind. I suppose you’re breaking apart the
key/value pairs with something like split. That would make it very
simple, at least. As far as removing the quotes, there are really easy
ways to do it with a few different functions.

With that said is this a new application? Is there a requirement for
the quotes around the value? Can you use instead an XML-based file and
use the parsers built into Perl to handle all of that for you?

Good luck.


Yes, the KVP issue was fairly easy to solve, but not trivial, because using split was too restrictive for the value part, e.g.:


@array = split('=', 'key = "value";');

but what if


@array = split('=', 'key = "value = sub value";');

then the split breaks the value apart at the second “=”.
Doing something like this


my @array = split(/=\s*(?=")/, 'key = "value = sub value"');

is ugly but works.
I can’t use XML because it is hard to modify and I want to keep compatibility with the .conf files located in the /etc/ directory. The final script will run as a POSIX/Linux daemon.
This is a new version of a previous app that I wrote.
The quotes are necessary because some directives will contain quotes in their values, and others will hold complex data such as large SQL queries.
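
For the key/value split discussed above, one option that may be less ugly is the third argument of `split` (the LIMIT), which stops splitting after the given number of fields, so any later “=” stays inside the value. A minimal sketch; `parse_kv` is a hypothetical helper name, and it assumes each line looks like `key = "value";` as in the examples:

```perl
use strict;
use warnings;

# Hypothetical helper: splits one config line into a key and an unquoted value.
sub parse_kv {
    my ($line) = @_;
    # LIMIT of 2: split only at the first "=", keep the rest as the value.
    my ($key, $value) = split /\s*=\s*/, $line, 2;
    # Strip the surrounding quotes and the optional trailing ";".
    $value =~ s/^"(.*)";?\s*$/$1/s;
    return ($key, $value);
}

my ($key, $value) = parse_kv('key = "value = sub value";');
print "$key => $value\n";    # prints "key => value = sub value"
```

Quotes inside the value survive, since only the outermost pair is removed.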