[Pmwiki-users] Smart Quotes
John Rankin
john.rankin at affinity.co.nz
Wed May 28 16:34:24 CDT 2003
An algorithm emerges, blinking. JR
--
In that case, perhaps the best thing to do is to take care of the &lsquo;s
first (which are generally paired and do not stand alone), then convert
anything that remains to an &rsquo;, unless it's part of an HTML
tag attribute (finding this last part might be a bit tricky). For this
last part, maybe something like:
$InlineReplacements["/'([^>]*(<|\$))/"] = "'&rsquo;\$1'";
which says to replace any single quote that is not followed by a '>'
(the closing part of an HTML tag) up to a '<' or the end of the string--
i.e., quotes that aren't part of an HTML tag. There might need to be
some global modifiers added to this pattern to deal with multiple
quotes outside of tags.
Just a thought. Interesting problem.
Pm
--
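The quoted idea can be tried outside PmWiki. The following is only a sketch in Python rather than PmWiki's PHP, and the names OUTSIDE_TAG_QUOTE and curl_remaining_singles are made up for illustration. It swaps the consuming group for a lookahead, which is one way to get the "global" behaviour mentioned above: every single quote that is not followed by a '>' before the next '<' or the end of the string is replaced in a single pass.

```python
import re

# Hypothetical names, not PmWiki code. The lookahead (?=...) checks the same
# condition as the quoted pattern without consuming the text after the quote,
# so several quotes outside tags can all be replaced in one pass.
OUTSIDE_TAG_QUOTE = re.compile(r"'(?=[^>]*(?:<|\Z))")


def curl_remaining_singles(html):
    """Turn single quotes outside HTML tags into &rsquo; entities."""
    return OUTSIDE_TAG_QUOTE.sub("&rsquo;", html)


if __name__ == "__main__":
    print(curl_remaining_singles("<a href='x'>don't</a> it's"))
```

The quotes around the attribute value are left alone because a '>' follows them before any '<'. This assumes the earlier pass has already dealt with the paired &lsquo;s, as the quoted message suggests.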
Step 1
Pass over any pattern of the form "/<.*?>/"
(I'm still working on this, having first concentrated on...)
Step 2
Smarten patterns of the form "/.?['\"]+/"
Quotes have 2 properties: handedness (left or right) and type (single or
double) -- in fact the algorithm works just as well if, for example,
handedness is up or down.
The problem is to decide the handedness and type of a quote mark and
return &[l|r][s|d]quo;.
If the . character is null, a space or possibly an =, handedness starts
left; otherwise it's right.
If the . character is a backtick, set it to null (this takes care of '90s
written as `'90s).
smartstring = character
for each quote mark:
    if the i'th quote mark is the same type as the (i-1)'th quote mark,
    flip the handedness
    (2 successive singles come out as left right,
    a double followed by a single comes out as left left)
    smartstring .= &handednesstypequo;
return smartstring
There are boundary issues to take care of, such as paragraphs that start
with multiple quote marks, and a stripslashes, but step 2 works correctly
for all the test cases I could think of.
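For anyone who wants to experiment with step 2, here is a rough Python sketch of the rules as described above. It is not the PmWiki code: the names QUOTE_RUN, OPENERS and smarten are invented for illustration, and the pattern uses [^'"\n]? in place of .? (an assumption on my part) so that the leading character can never itself be a quote mark. It follows the stated rules literally, so edge cases may come out differently from the real implementation.

```python
import re

# Hypothetical names. Group 1 is the optional character before a run of
# quote marks, group 2 is the run itself.
QUOTE_RUN = re.compile(r"""([^'"\n]?)(['"]+)""")

# Leading characters after which handedness starts left:
# nothing at all, a space, or (possibly) an equals sign.
OPENERS = {"", " ", "="}


def _smarten_run(match):
    before, quotes = match.group(1), match.group(2)
    hand = "l" if before in OPENERS else "r"
    if before == "`":
        before = ""  # the backtick only forces a right quote, then vanishes
    out = [before]
    prev_kind = None
    for q in quotes:
        kind = "s" if q == "'" else "d"
        if kind == prev_kind:
            # same type as the previous quote mark: flip the handedness
            hand = "r" if hand == "l" else "l"
        out.append(f"&{hand}{kind}quo;")
        prev_kind = kind
    return "".join(out)


def smarten(text):
    """Replace each run of straight quotes with HTML quote entities."""
    return QUOTE_RUN.sub(_smarten_run, text)


if __name__ == "__main__":
    for sample in ("'hello'", 'He said "hi"', "don't", "`'90s"):
        print(sample, "->", smarten(sample))
```

The sketch does nothing about HTML tags or stripslashes; it only covers the handedness and type decision for each run of quote marks.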
Your approach has the benefit of starting with quote pairs, so you set
handedness at a global level, then tidy up the leftovers, which I can see
has a number of advantages. One possible defect with the algorithm I wrote
is that spacequotespace or spacequoteendofline comes out as an orphaned
right quote. Maybe it should leave it dumb?
There are issues with simply not smartening ='...' because = '...' is
valid HTML (I think, although pmwiki always writes ='...') and what if I
write ='This is Not a Pipe' in the body of a page?
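One possible way around the ='...' question is to make step 1 split the text on tags first, so that attribute quotes never reach the smartening step while ='...' in body text still does. The Python sketch below only illustrates that idea; TAG and smarten_outside_tags are hypothetical names, and the smartening function is passed in as a parameter (the demo uses a trivial stand-in).

```python
import re

# Hypothetical names. Splitting on a pattern with one capturing group makes
# re.split return alternating pieces: text, tag, text, tag, ..., text.
TAG = re.compile(r"(<.*?>)", re.DOTALL)


def smarten_outside_tags(html, smarten):
    """Apply smarten() to text between tags and pass the tags through."""
    parts = TAG.split(html)
    # Odd indexes hold the tags themselves; even indexes hold body text.
    return "".join(p if i % 2 else smarten(p) for i, p in enumerate(parts))


if __name__ == "__main__":
    # Trivial stand-in for the real smartening step.
    def curl(s):
        return s.replace("'", "&rsquo;")

    print(smarten_outside_tags("<a href='x'>it's</a>", curl))
    print(smarten_outside_tags("='This is Not a Pipe'", curl))
```

A '>' inside an attribute value would end the tag match early, so this is only as good as the "/<.*?>/" pattern from step 1.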
Back to my day job while I mull step 1. Interestingly, Safari doesn't seem
to mind HTML entities for quote marks on attributes, but I can't imagine
all browsers are so forgiving.
JR