[pmwiki-devel] regex question
Hans
design5 at softflow.co.uk
Fri Aug 28 06:04:59 CDT 2009
Thursday, August 27, 2009, 8:57:28 PM, Peter wrote:
> if (($bracket_end = strpos($line, "@]")) != false) {
> if (($bracket_start = strpos($line, "[@")) == false ||
> $bracket_start > $bracket_end)
> $line = '[@'.$line;
> (and the same for [=...=])
> I believe that strpos() is MUCH faster than preg_match and
> additionally (perhaps esp in this case) would be much easier to
> understand and so easier to maintain in the future...
Thanks Peter!
I condensed your code a bit and added a check for
unclosed brackets, like this:
//fix orphaned @],[@,=],[=
foreach(array("@","=") as $x) {
  $a = strpos($row,'['.$x);
  $b = strpos($row,$x.']');
  if ($b!=0 && ($a==0 || $a>$b)) $row = '['.$x.$row;
  else if ($a!=0 && ($b==0 || $a>$b)) $row .= $x.']';
}
Still, it is more readable than the regex patterns:
if (preg_match("/\\[([=@])(?!.*\\1\\])/", $row, $m)) $row .= $m[1]."]";
if (preg_match("/^(?:\\[[^@]|[^[])*[@]\\]/", $row)) $row = "[@".$row;
if (preg_match("/^(?:\\[[^=]|[^[])*[=]\\]/", $row)) $row = "[=".$row;
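One possible caveat with the strpos() version above: strpos() returns 0 for a match at the very start of the string and false for no match, and the loose tests !=0 and ==0 cannot tell these apart. A row that begins with [@ or [= would then look as if it had no opener. The following is a minimal sketch of the same logic with strict comparisons; the function name fix_orphaned_brackets() is hypothetical, since the post patches $row inline.

```php
<?php
// Hypothetical helper (the name is an assumption, not from TextExtract).
// Same logic as the strpos() loop in the post, but with strict
// comparisons so that offset 0 is not confused with "not found".
function fix_orphaned_brackets($row) {
    foreach (array('@', '=') as $x) {
        $a = strpos($row, '[' . $x);   // offset of first opener, or false
        $b = strpos($row, $x . ']');   // offset of first closer, or false
        if ($b !== false && ($a === false || $a > $b)) {
            // a closer appears with no opener before it
            $row = '[' . $x . $row;
        } elseif ($a !== false && $b === false) {
            // an opener is never closed
            $row .= $x . ']';
        }
    }
    return $row;
}

// fix_orphaned_brackets('[@ code @]')   returns '[@ code @]' (unchanged)
// fix_orphaned_brackets('code @] tail') returns '[@code @] tail'
// fix_orphaned_brackets('head [@ code') returns 'head [@ code@]'
```

With the loose comparisons, the first example would get a second [@ prepended, because the opener at offset 0 compares equal to 0.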
I checked for speed, and found no real difference between the two
approaches.
Searching all PmWiki.* pages for 'input|output' got me 215 results on
34 pages from 103 pages searched, and it took just a second.
Most of the time goes into opening and reading the pages, and into
MarkupToHTML formatting, not into processing the text lines.
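The speed comparison described above could be reproduced with a rough timing harness along these lines. This is only a sketch: fix_strpos(), fix_regex(), time_it() and the sample rows are illustrative assumptions, not TextExtract code. The regex patterns are copied from this post, and the strpos() variant uses strict comparisons so that a match at offset 0 is not mistaken for no match.

```php
<?php
// Illustrative names and sample data only; none of this is TextExtract code.

function fix_strpos($row) {
    foreach (array('@', '=') as $x) {
        $a = strpos($row, '[' . $x);
        $b = strpos($row, $x . ']');
        // strict checks: strpos() gives false for "not found", 0 for offset 0
        if ($b !== false && ($a === false || $a > $b)) $row = '[' . $x . $row;
        elseif ($a !== false && $b === false) $row .= $x . ']';
    }
    return $row;
}

function fix_regex($row) {
    // the three patterns from the post, unchanged
    if (preg_match("/\\[([=@])(?!.*\\1\\])/", $row, $m)) $row .= $m[1]."]";
    if (preg_match("/^(?:\\[[^@]|[^[])*[@]\\]/", $row)) $row = "[@".$row;
    if (preg_match("/^(?:\\[[^=]|[^[])*[=]\\]/", $row)) $row = "[=".$row;
    return $row;
}

// Run $fn over every row, $rounds times, and return elapsed seconds.
function time_it($fn, $rows, $rounds) {
    $start = microtime(true);
    for ($i = 0; $i < $rounds; $i++) {
        foreach ($rows as $row) {
            $fn($row);
        }
    }
    return microtime(true) - $start;
}

$rows = array(
    'plain text without any markers',
    'input @] orphaned closer',
    'orphaned opener [= output',
    'balanced [@ input|output @] row',
);

printf("strpos: %.4f s\n", time_it('fix_strpos', $rows, 10000));
printf("regex:  %.4f s\n", time_it('fix_regex', $rows, 10000));
```

A micro-benchmark like this only times the per-line fix-up, which is the part reported here as not dominating the total search time.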
I should be finished soon with this overhaul of TextExtract.
Just need to run more random tests to find defects.
Hans