Replacing elements with lxml.html
I'm fairly new to lxml and HTML Parsers as a whole. I was wondering if there is a way to replace an element within a tree with another element...
For example I have:
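    import lxml.html
    from pygments import highlight
    from pygments.lexers import guess_lexer
    from pygments.formatters import HtmlFormatter

    body = """<code> def function(arg): print arg </code> Blah blah blah <code> int main() { return 0; } </code> """
    doc = lxml.html.fromstring(body)
    codeblocks = doc.cssselect('code')
    for block in codeblocks:
        lexer = guess_lexer(block.text_content())
        hilited = highlight(block.text_content(), lexer, HtmlFormatter())
        doc.replace(block, hilited)  # raises TypeError: hilited is a str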
I want to do something along those lines, but this results in a "TypeError" because "hilited" isn't an lxml.etree._Element.
Is this feasible?
Regards,
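Regarding lxml:
In doc.replace(block, hilited), block is an lxml Element object while hilited is a plain string, so lxml cannot substitute one for the other.
There are two ways around this:
    block.text = hilited
or
    body = body.replace(block.text, hilited)
Note that the first stores the markup as text, so it is escaped when the tree is serialized, and the second edits the original string rather than the parsed tree.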
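If the goal is to swap the element itself inside the tree, one option is to parse the highlighted string into an element first and then call replace() on the parent. The following is a minimal sketch, assuming Pygments is the source of guess_lexer, highlight and HtmlFormatter, and using xpath() instead of cssselect() only to avoid the extra cssselect dependency. The names source, new_block and result are illustrative.

```python
import lxml.html
from pygments import highlight
from pygments.formatters import HtmlFormatter
from pygments.lexers import guess_lexer

body = """<code> def function(arg): print arg </code> Blah blah blah <code> int main() { return 0; } </code> """

doc = lxml.html.fromstring(body)

# xpath() returns a plain list, so the tree can be modified while looping.
for block in doc.xpath('.//code'):
    source = block.text_content()
    hilited = highlight(source, guess_lexer(source), HtmlFormatter())

    # Parse the highlighted markup into an element. With default options,
    # Pygments wraps its output in a single <div class="highlight">.
    new_block = lxml.html.fragment_fromstring(hilited)

    # The tail (the text that follows </code>) is stored on the old element,
    # so copy it over to keep "Blah blah blah" in the document.
    new_block.tail = block.tail

    # replace() has to be called on the direct parent of the old element.
    block.getparent().replace(block, new_block)

result = lxml.html.tostring(doc, encoding='unicode')
print(result)
```

Calling block.getparent().replace(...) rather than doc.replace(...) keeps the loop working when a code element is nested deeper than the root.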
If you are new to Python HTML parsers, you could try BeautifulSoup, an HTML/XML parser that makes it easy to modify the parse tree.
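For comparison, here is a rough sketch of how the replacement from the question might look with BeautifulSoup. It assumes the bs4 package with its built-in "html.parser" backend and Pygments for the highlighting; the variable names are illustrative and not taken from the original posts.

```python
from bs4 import BeautifulSoup
from pygments import highlight
from pygments.formatters import HtmlFormatter
from pygments.lexers import guess_lexer

body = """<code> def function(arg): print arg </code> Blah blah blah <code> int main() { return 0; } </code> """

soup = BeautifulSoup(body, "html.parser")

# find_all() returns a list-like result, so replacing tags in the loop is safe.
for block in soup.find_all("code"):
    source = block.get_text()
    hilited = highlight(source, guess_lexer(source), HtmlFormatter())

    # Parse the Pygments output and take its outer <div class="highlight">.
    new_block = BeautifulSoup(hilited, "html.parser").div

    # replace_with() swaps the tag in place; the text between the two
    # code blocks is a separate node and is left untouched.
    block.replace_with(new_block)

result = str(soup)
print(result)
```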