Regex for PHP to find all self-closing tags
I've got a system that uses a DomDocumentFragment which is created based on markup from a database or another area of the system (ie other XHTML code).
One such tag that may be included is:
<div class="clear"></div>
Before the string is added to the DomDocumentFragment, the content is correct: the div has an explicit closing tag.
However, the DomDocumentFragment transforms this into:
<div class="clear"/>
This does not display correctly in browsers due to the incorrect closing of the tag.
So my thought is to post-process the XML string that the DomDocument returns (which includes the incorrect div structure shown above) and transform self-closing tags back to their correct structure, i.e. turn <div class="clear"/> back into <div class="clear"></div>.
But I'm having trouble with the pattern for preg_match to find these tags - I've seen some patterns that return all tags (ie find all tags), but not just those that are self closing.
I've tried something along the lines of this, but my head gets a little confused with regex (and I start over-complicating things)
/<div(["\d\w\s])\/>/
The aim is for a pattern to match <div ..../>, where the "...." could be any valid XHTML attributes.
Any suggestions or pointers to put me back on track?
Limit the problem domain: you need to change <div class="clear"/> to <div class="clear"></div>, so search for the former and replace it with the latter using a straightforward find-and-replace operation. It should be faster, and it will definitely be safer.
Whatever you do, do not try to parse HTML with a regular expression (which is what you're doing by building a regex that can detect a <div> with arbitrary attributes).
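A minimal sketch of that literal find-and-replace, using PHP's str_replace. The sample string and the variable names $xml and $fixed are made up for illustration; in the real system $xml would be whatever the DomDocument serializer returned.

```php
<?php
// Illustrative stand-in for the string returned by the XML serializer.
$xml = '<p>Intro</p><div class="clear"/><p>More</p>';

// Literal replacement: only this exact self-closed div is touched,
// so no pattern matching against arbitrary markup is involved.
$fixed = str_replace('<div class="clear"/>', '<div class="clear"></div>', $xml);

echo $fixed, PHP_EOL;
// prints: <p>Intro</p><div class="clear"></div><p>More</p>
```

This only catches the one spelling of the tag given in the search string; a div with different attributes, or different whitespace, would pass through unchanged.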
Putting <div></div> into a DomDocumentFragment doesn't actually change it into <div/>; it changes it into A-DOM-Element-Node-with-name-"div"-and-no-content.
It's only when the DomDocumentFragment is serialized that either <div></div> or <div/> is created. In other words, the problem lies not with the DomDocumentFragment, but with the serialization process that you are using.
PHP is not my language, so I can't be much more help, but I would be looking for an HTML-compatible serializer for your DomDocumentFragment, rather than try to patch the output after serialization.
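For what it's worth, PHP's DOM extension appears to offer two ways to get such output directly from the serializer: DOMDocument::saveHTML(), and DOMDocument::saveXML() with the LIBXML_NOEMPTYTAG option. The sketch below serializes the same empty div node three ways. The wrapper element name "body" and the variable names are assumptions for illustration, not part of the original system.

```php
<?php
$doc = new DOMDocument();

// Hypothetical wrapper element so the fragment has somewhere to live.
$root = $doc->createElement('body');
$doc->appendChild($root);

// Build the fragment from markup, as the question describes.
$fragment = $doc->createDocumentFragment();
$fragment->appendXML('<div class="clear"></div>');
$root->appendChild($fragment);

// The node itself is just an empty element named "div".
$div = $root->firstChild;

// Default XML serialization collapses the empty element.
$asXml = $doc->saveXML($div);
echo $asXml, PHP_EOL;        // <div class="clear"/>

// LIBXML_NOEMPTYTAG asks the XML serializer to expand empty elements.
$asXmlExpanded = $doc->saveXML($div, LIBXML_NOEMPTYTAG);
echo $asXmlExpanded, PHP_EOL; // <div class="clear"></div>

// The HTML serializer writes an explicit closing tag for a div.
$asHtml = $doc->saveHTML($div);
echo $asHtml, PHP_EOL;       // <div class="clear"></div>
```

One caveat: LIBXML_NOEMPTYTAG expands every empty element, so a <br/> would come out as <br></br>, which browsers may not treat kindly. The HTML serializer distinguishes void elements such as br from ordinary ones such as div, so it may be the better fit when the output is sent to a browser as HTML.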