javascript - Regular expression for extracting XML tag

I have some XML from which I want to extract element names using a JavaScript regular expression. An example of the XML is shown below:

<rules><and><gt propName="Unit" value="5" type="System.Int32"/><or><startsWith propName="DeviceType"/></or></and></rules>

I’m having problems extracting just the XML element names “gt” and “startsWith”. For example, with the following expression:

<(.+?)\s

I get:

“<rules><and><gt”

rather than just “gt”.

Can anyone supply the correct expression?

Asked Sep 20, 2010 at 11:35 by Retrocoder; edited Jul 9, 2011 at 4:33 by Brad Mace

You shouldn't use a regex, but <([^> ]+) will probably do :) – jensgram, Sep 20, 2010 at 11:44

4 Answers

Regex is a poor tool for parsing XML. You can easily parse the XML in JavaScript. A library like jQuery makes this task especially easy, for example:

var xml = '<rules><and><gt propName="Unit" value="5" type="System.Int32"/><or><startsWith propName="DeviceType"/></or></and></rules>';
// $.parseXML (jQuery 1.5+) parses the string as XML rather than as HTML
var gt = $($.parseXML(xml)).find('gt');
var t = gt.attr('type'); // "System.Int32"

Well, \s matches whitespace. So you actually tell the regex engine to:

<(.+?)\s
^^    ^
||    \ until you find a whitespace
|\ lazily slurp in anything, including > and further <
\ as long as it starts with an opening pointy bracket
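
To see this concretely, here is a minimal sketch that runs the original pattern against the sample XML from the question. The variable names are only illustrative.

```javascript
var xml = '<rules><and><gt propName="Unit" value="5" type="System.Int32"/>' +
    '<or><startsWith propName="DeviceType"/></or></and></rules>';

// The match starts at the first "<" and the lazy group keeps growing
// until the first whitespace character, which only appears after "<gt".
var lazyMatch = /<(.+?)\s/.exec(xml);

console.log(JSON.stringify(lazyMatch[0])); // "<rules><and><gt " (whole match, with the trailing space)
console.log(JSON.stringify(lazyMatch[1])); // "rules><and><gt" (capture group 1)
```

Making the quantifier lazy does not help here, because laziness only affects where the match ends, not where it starts.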

You could, for example, use:

<([^\s>]+)

but you should always consider this.
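
As a rough sketch of how a negated character class could be applied to the sample from the question, the snippet below collects tag names with a global regex. The helper `collectNames` and both patterns are assumptions made for illustration, not part of the answer above. They additionally exclude `/` so that closing tags such as `</or>` are skipped.

```javascript
var xml = '<rules><and><gt propName="Unit" value="5" type="System.Int32"/>' +
    '<or><startsWith propName="DeviceType"/></or></and></rules>';

// Hypothetical helper: run a global regex over the source and collect capture group 1.
function collectNames(pattern, source) {
    if (!pattern.global) {
        throw new Error("pattern needs the g flag, otherwise exec() never advances");
    }
    var names = [];
    var match;
    pattern.lastIndex = 0; // start from the beginning even if the regex was used before
    while ((match = pattern.exec(source)) !== null) {
        names.push(match[1]);
    }
    return names;
}

// Names of all opening tags; "/" is excluded so closing tags are not captured.
var allNames = collectNames(/<([^\s>\/]+)/g, xml);

// Names of tags that are followed by whitespace, i.e. tags that carry attributes.
var withAttributes = collectNames(/<([^\s>\/]+)\s/g, xml);

console.log(allNames);       // [ 'rules', 'and', 'gt', 'or', 'startsWith' ]
console.log(withAttributes); // [ 'gt', 'startsWith' ]
```

This works for the sample string only. Comments, CDATA sections, or processing instructions containing `<` would still produce false matches, which is the usual argument for a real parser.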

Don't use a regex for this kind of thing. Instead, use the DOM processing functions (assuming the XML is part of the current document), such as:

var gtElements = document.getElementsByTagName('gt');
var startsWithElements = document.getElementsByTagName('startsWith');

The most robust method would be to use the browser's built-in XML parser and standard DOM methods for extracting the elements you want:

var parseXml;

if (window.DOMParser) {
    // Standards-based browsers
    parseXml = function(xmlStr) {
        return ( new window.DOMParser() ).parseFromString(xmlStr, "text/xml");
    };
} else if (typeof window.ActiveXObject != "undefined" &&
        new window.ActiveXObject("Microsoft.XMLDOM")) {
    // Older versions of Internet Explorer
    parseXml = function(xmlStr) {
        var xmlDoc = new window.ActiveXObject("Microsoft.XMLDOM");
        xmlDoc.async = false; // boolean, not the string "false"
        xmlDoc.loadXML(xmlStr);
        return xmlDoc;
    };
} else {
    // No XML parser available
    parseXml = function() { return null; };
}

var xmlStr = '<rules><and>' +
    '<gt propName="Unit" value="5" type="System.Int32"/><or>' + 
    '<startsWith propName="DeviceType"/></or></and></rules>';

var xmlDoc = parseXml(xmlStr);
if (xmlDoc) {
    var gt = xmlDoc.getElementsByTagName("gt")[0];
    alert( gt.getAttribute("propName") );
}
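
If the goal is the element names themselves, as in the question, the same DOM approach can list them without any regex. The following is a sketch only: the helper name `namesOfElementsWithAttributes` is made up for illustration, and the usage part assumes a browser that provides `DOMParser`.

```javascript
// Hypothetical helper: names of all elements that carry at least one attribute.
function namesOfElementsWithAttributes(xmlDoc) {
    var all = xmlDoc.getElementsByTagName("*"); // every element, in document order
    var names = [];
    for (var i = 0; i < all.length; i++) {
        if (all[i].attributes.length > 0) {
            names.push(all[i].nodeName);
        }
    }
    return names;
}

// Browser-only usage; skipped where DOMParser is unavailable (e.g. plain Node.js).
if (typeof DOMParser !== "undefined") {
    var sampleDoc = new DOMParser().parseFromString(
        '<rules><and><gt propName="Unit" value="5" type="System.Int32"/>' +
        '<or><startsWith propName="DeviceType"/></or></and></rules>',
        "text/xml");
    console.log(namesOfElementsWithAttributes(sampleDoc)); // gt and startsWith for this sample
}
```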
