Delphi, MSXML: how to retrieve node XML without the document namespace?

2018-07-02 16:21:40

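I need to do some parsing and information retrieval from an XML document. The document is bound with XML data binding and then parsed for specific elements. Once I have isolated the elements I need to break down, I take each one in turn (call it E_parent) and try to identify the position of each non-text child element (E_child) within the entire XML text of E_parent, and then do some manipulation or other.

The problem I am having is that the XML document's namespace is added to the XML of child elements when they are accessed individually.

As an example, say the original document looks like this: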

<?xml version="1.0" encoding="windows-1252"?>
<RootNode xml:lang="en" xmlns="urn:blah:names:blahblah">
<E_parent>Some text <E_child>child text</E_child> more parent text</E_parent>
</RootNode>

When I try to access the XML of the E_parent or E_child element by doing something like this:

xmlParent := parentNode.XML;

I get:

<E_parent xmlns="urn:blah:names:blahblah">Some text <E_child>child text</E_child> more parent text</E_parent>

The same thing happens if I try to access the XML of E_child; I get:

<E_child xmlns="urn:blah:names:blahblah">child text</E_child>

This is a problem when I try to do a text search on the parent element, because the "real" text does not contain that namespace declaration:

Some text <E_child>child text</E_child> more parent text

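So far I have handled this by finding and removing the unwanted namespace attribute in the string, but that is inefficient and ugly ;o) So my question is: how can I retrieve the XML of various nodes from the bound XML document without the document namespace being added to the tags?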
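For reference, the string-based workaround described above might look roughly like the following sketch. StripNamespaceDecl is a hypothetical helper name, and the code assumes the declaration is serialized exactly as a single space followed by xmlns="..." in double quotes, which is how it appears in the output shown earlier.

```pascal
uses
  SysUtils;

// Hypothetical helper: removes every occurrence of the default-namespace
// declaration for NamespaceUri from a node's serialized XML.
// Assumption: the declaration is written as ' xmlns="<uri>"' (one leading
// space, double quotes). Other serializations would not be matched.
function StripNamespaceDecl(const NodeXml, NamespaceUri: string): string;
begin
  Result := StringReplace(NodeXml, ' xmlns="' + NamespaceUri + '"', '',
    [rfReplaceAll]);
end;
```

It would be called as StripNamespaceDecl(parentNode.XML, 'urn:blah:names:blahblah'). It is still a text-level patch, so it shares the drawbacks mentioned in the question.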

=========

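Thanks Remy, that was obvious: I just need to start with a blank string and build it up, rather than starting from the inner XML!

Note that this is a better workaround for this particular case than the one I had, but it is not quite what I was after. Getting an element's XML without the namespace would still be useful for other things (logging, for example), where I want the exact XML of the node as it appears in the original document.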

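Use the DOM to process the content of E_parent. Rather than retrieving E_parent's XML and then searching it for the E_child tag, use the DOM to determine the plain text that precedes the E_child node (plain text has its own child nodes). The length of that plain text tells you the exact text position of E_child without needing to retrieve E_parent's XML. E_parent will have multiple plain-text child nodes, one in the relevant position for each section of untagged text.

In other words, given the XML you showed, the DOM structure would look like this: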

RootNode
|
-- E_parent
   |
   |- "Some text "
   |
   |- E_child
   |  |
   |  -- "child text"
   |
   -- " more parent text"
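The walk over the child nodes described above could be sketched as follows. This is only an illustration under some assumptions: the nodes are Delphi IXMLNode instances (unit XMLIntf, or Xml.XMLIntf in newer Delphi versions), ChildTextOffset is a hypothetical name, and Child was obtained from Parent.ChildNodes so that comparing the two interface references is meaningful.

```pascal
uses
  XMLIntf; // Xml.XMLIntf in Delphi versions that use unit scope names

// Hypothetical helper: returns the 1-based character offset at which Child
// starts within Parent's plain text, counting only the text and CDATA nodes
// that are direct children of Parent and come before Child.
// Returns 0 if Child is not a direct child of Parent.
function ChildTextOffset(const Parent, Child: IXMLNode): Integer;
var
  I, Offset: Integer;
  Node: IXMLNode;
begin
  Result := 0;
  Offset := 0;
  for I := 0 to Parent.ChildNodes.Count - 1 do
  begin
    Node := Parent.ChildNodes[I];
    if Node = Child then
    begin
      // Everything accumulated so far precedes the child element
      Result := Offset + 1;
      Exit;
    end;
    if Node.NodeType in [ntText, ntCData] then
      Inc(Offset, Length(Node.Text));
  end;
end;
```

Text inside earlier sibling elements is deliberately not counted here. Whether it should be depends on how the offset is used afterwards.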

Another approach is to use XPath to navigate your XML.

Given the sample XML:

<?xml version="1.0" encoding="windows-1252"?>
<RootNode xml:lang="en" xmlns="urn:blah:names:blahblah">
<E_parent>Some text <E_child>child text</E_child> more parent text</E_parent>
</RootNode>

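You can navigate directly to your E_child elements using the MSXML parser and a little XPath. First, you will need to make your own copy of the MSXML2_TLB unit. You can then access the E_child nodes with Delphi code that looks something like this: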

uses MSXMLDOM,MSXML2_TLB;

procedure Sample;
var
  doc: IXMLDOMDocument2;
  root: IXMLDomElement;
  nodes: IXMLDOMNodeList;
  node: IXMLDOMNode;
begin

  doc := CoDOMDocument60.Create;
  doc.async := false;
  // Use same namespace as the default namespace here
  doc.setProperty('SelectionNamespaces', 'xmlns:t="urn:blah:names:blahblah"');
  doc.setProperty('SelectionLanguage', 'XPath');
  doc.loadXML(XmlSource.Text);

  root := doc.documentElement;
  nodes := root.selectNodes('//t:E_child');

  // Now the nodes list contains all E_child nodes
  // Process them here
  // ...
end;

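The key point is that you assign a specific prefix to the document's default namespace for use in the XPath query. //t:E_child is the actual XPath expression that finds the E_child elements.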
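As a rough illustration of the processing step, the sketch below collects the text of each matched node. It assumes the same imported MSXML2_TLB unit as the code above; CollectChildTexts is a hypothetical name, and COM is assumed to be initialized already (as it normally is in a VCL application).

```pascal
uses
  SysUtils, Classes, MSXML2_TLB; // MSXML2_TLB: your imported type library unit

// Hypothetical helper: adds the text of every E_child element found in
// XmlText to Results. The prefix "t" is bound to the document's default
// namespace purely for the benefit of the XPath query.
procedure CollectChildTexts(const XmlText: WideString; Results: TStrings);
var
  doc: IXMLDOMDocument2;
  nodes: IXMLDOMNodeList;
  i: Integer;
begin
  doc := CoDOMDocument60.Create;
  doc.async := False;
  doc.setProperty('SelectionNamespaces', 'xmlns:t="urn:blah:names:blahblah"');
  // loadXML returns False on a parse error; surface the parser's reason
  if not doc.loadXML(XmlText) then
    raise Exception.Create(doc.parseError.reason);
  nodes := doc.selectNodes('//t:E_child');
  for i := 0 to nodes.length - 1 do
    Results.Add(nodes.item[i].text);
end;
```

Reading the text property sidesteps the namespace question entirely, because only the element's content is returned, not its serialized tag.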

Using your code, you can then use Pos / PosEx to find the start and end of the E_child element.

var
  cStart, cEnd: Integer;
  ChildName, ChildText: string;
begin
  // ... other code (PosEx requires StrUtils in the uses clause)
  xmlParent := parentNode.XML;
  // Pos and PosEx are case-sensitive, so match the element name exactly
  ChildName := 'E_child';
  // Find starting position of child tag
  cStart := Pos('<' + ChildName, xmlParent);
  // You now have the opening <
  cEnd   := PosEx('</' + ChildName, xmlParent, cStart);
  // You now have the final < of the child.
  // Add the length of the child's name + the closing >
  Inc(cEnd, Length('</' + ChildName + '>'));
  // Grab the entire child XML
  ChildText := System.Copy(xmlParent, cStart, cEnd - cStart);
  // Do whatever you want with the child. For instance,
  // remove the original text.
  System.Delete(xmlParent, cStart, cEnd - cStart);
  // Replace it with new text
  System.Insert(NewChildText, xmlParent, cStart);
end;
