
I know that a parser would be best suited for this, but in my current situation it has to be plain JavaScript.

I have a regex to find the closing body tag of an html doc.

var closing_body_tag = /(<\/body>)/i;

However, this fails when the source has more than one set of body tags. So I was thinking about going with something like this:

var last_closing_body_tag = /(<\/body>)$/gmi;

This works when multiple tags are found, but for some reason it fails on some sources with just one set of tags.

Am I making a mistake that would cause mixed results for single-tag cases?

Yes, I understand that more than one body tag is incorrect; however, we have to handle all bad source.

asked Apr 24, 2015 at 15:06 by Adam, edited Apr 24, 2015 at 15:16

7 And why would you have more than one body tag ? – adeneo Commented Apr 24, 2015 at 15:08
1 Just curious. Why do you need to find the closing body tag? What are you going to do with that? – hindmost Commented Apr 24, 2015 at 15:09
3 You don't need jQuery for parsing HTML. – Ram Commented Apr 24, 2015 at 15:09
1 @Adam You don't need Regexp for that. Use DOM manipulation methods instead – hindmost Commented Apr 24, 2015 at 15:11
1 document.body.appendChild inserts an element right before the closing tag. A regex does not ? – adeneo Commented Apr 24, 2015 at 15:13


4 Answers

You can use this regex:

  /<\/body>(?![\s\S]*<\/body>[\s\S]*$)/i

(?![\s\S]*<\/body>[\s\S]*$) is a negative lookahead that ensures no further closing body tag follows before the end of the string.


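Sample code for adding a tag before the last closing body tag:

var re = /<\/body>(?![\s\S]*<\/body>[\s\S]*$)/i;
var str = '<html>\n<body>\n</body>\n</html>\n<html>\n<body>\n</body>\n</html>';
// $& re-inserts the matched </body>, so the new tag is added instead of replacing it
var subst = '<tag/>$&';
var result = str.replace(re, subst);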
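To check that the lookahead picks the same tag in both the single-tag and the multi-tag case, a minimal sketch follows. The helper name `indexOfLastClosingBody` and the sample strings are made up for illustration; the regex is the one from this answer.

```javascript
// Hypothetical helper: index of the last closing body tag, or -1 if there is none.
function indexOfLastClosingBody(html) {
  // The negative lookahead rejects any </body> that is followed by another one.
  var re = /<\/body>(?![\s\S]*<\/body>[\s\S]*$)/i;
  var match = re.exec(html);
  return match ? match.index : -1;
}

var single = '<html><body><p>one</p></body></html>';
var doubled = '<html><body></body></html>\n<html><BODY></BODY></html>';

// One set of tags: the only </body> is also the last one.
console.log(indexOfLastClosingBody(single) === single.indexOf('</body>'));
// Two sets of tags: only the second (upper-case) closing tag is reported.
console.log(indexOfLastClosingBody(doubled) === doubled.lastIndexOf('</BODY>'));
// No body tag at all.
console.log(indexOfLastClosingBody('<p>no body</p>'));
```

Because the pattern has no g flag, `exec` always starts from the beginning of the string, so the helper can be called repeatedly without resetting `lastIndex`.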

RegExp

As I suggested in the comments, use:

/^[\S\s]+(<\/body>)/i

How

This greedily matches all text up to the last </body>; the i flag makes it case-insensitive. It works no matter how many body tags you have:

</body>
</BODY>
</BoDY>
</body><!--This one's selected-->

Since you said you are using JavaScript, it can be used like this:

yourString.match(/^[\S\s]+(<\/body>)/i)[1];

.match returns the capture groups as long as the regex does not have the g flag. To explain this RegExp further:

Explanation

^ matches at the beginning of the whole string, because there is no m flag

[\S\s]+ matches every character, newlines included, up to the part that follows. The + can be replaced by a *

(<\/body>) captures the closing body tag that follows the greedy part (that is, the last one) as group 1

i makes the match case-insensitive (remove it if you want the match to be case-sensitive)
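The explanation above can be turned into an insertion routine. This is only a sketch: the helper name `insertBeforeLastBody`, the sample source, and the inserted script tag are assumptions for illustration. It uses the * variant mentioned above and adds a second capture group for the text before the tag.

```javascript
// Hypothetical helper: inserts `snippet` directly before the last </body>.
function insertBeforeLastBody(html, snippet) {
  // Group 1 greedily takes everything up to the last </body>; group 2 is the tag itself.
  var re = /^([\S\s]*)(<\/body>)/i;
  // A replacer function keeps "$" sequences in the snippet from being read
  // as replacement patterns.
  return html.replace(re, function (whole, before, tag) {
    return before + snippet + tag;
  });
}

var source = '<body>a</body>\n<BODY>b</BODY>\n</html>';
console.log(insertBeforeLastBody(source, '<script src="x.js"></script>'));
```

If the source contains no closing body tag, `replace` finds nothing and the string comes back unchanged, so the caller may want to compare input and output to detect that case.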

JavaScript appendChild

If you have multiple body tags, you can still add an element right before the closing tag.

var elem = document.createElement('div');
elem.setAttribute('id', 'mydiv');
elem.innerHTML = 'Foo';

Now, elem can be added in multiple ways:

1:

window.document.body.appendChild(elem);

2:

var body_elems = document.getElementsByTagName('body');
body_elems[body_elems.length - 1].appendChild(elem);

Use

/(.|[\r\n])*(<\/body>)/mi

as a regexp. The tag is in capture group 2 ($2).

This relies on greedy matching. Note that the 'any char' symbol (.) does not match newlines or carriage returns, so they have to be listed explicitly. The m flag only changes where ^ and $ match, so it has no effect on this pattern.
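A small sketch of what the m flag actually does may help here, since it also bears on the pattern from the question. The sample strings are made up; the g flag is left off the question's pattern so that `test` keeps no state between calls.

```javascript
// The pattern from the question (without g): "$" with the m flag matches at the end of any line.
var endOfLine = /(<\/body>)$/im;

// The tag is the last thing on its line, so "$" matches at the line break.
console.log(endOfLine.test('<body>\n</body>\n</html>'));   // true
// The tag is followed by more text on the same line, so "$" cannot match.
console.log(endOfLine.test('<body></body></html>'));       // false
// A single trailing space after the tag is enough to break the match.
console.log(endOfLine.test('<body>\n</body> \n</html>'));  // false

// The greedy pattern from this answer contains neither ^ nor $,
// so it behaves the same with and without the m flag.
var sample = '<body>\n</body>\n<body>\n</BODY>\n</html>';
var withM = /(.|[\r\n])*(<\/body>)/mi.exec(sample);
var withoutM = /(.|[\r\n])*(<\/body>)/i.exec(sample);
console.log(withM[2], withoutM[2]);          // </BODY> </BODY>
console.log(withM.index === withoutM.index); // true
```

So a pattern ending in `$` with the m flag depends on how the source happens to be laid out across lines, not on how many body tags it contains.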

The regex to match the last body tag is fairly simple:

/[\s\S]*(<\/body>)/i

What this does is match as many characters as possible (more specifically, any whitespace or anything that's not whitespace) before </body>.

The i flag means that it'll match any case for </body>, so anything like:

</body>
</BODY>
</BodY>

Will all match.

I used [\s\S] instead of . because . matches everything except newline characters, which probably isn't what you want. \s matches all whitespace (spaces, tabs, every kind of newline) and \S is equivalent to [^\s], so it matches everything that isn't whitespace. Together, these match every possible character. I'd imagine a similar thing is possible with [\w\W], [\d\D], etc., but [\s\S] is my preference.
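The difference is easy to check directly. The snippet below is a minimal sketch with a made-up sample string; it compares . with the complementary character classes and shows what happens to the greedy pattern when . is used instead.

```javascript
var newline = '\n';
console.log(/^.$/.test(newline));       // false: "." does not match a newline
console.log(/^[\s\S]$/.test(newline));  // true
console.log(/^[\w\W]$/.test(newline));  // true
console.log(/^[\d\D]$/.test(newline));  // true

var html = '<body>\n</body>\n<body>\n</BODY>';
// With ".", the greedy part cannot cross line breaks, so the first tag is found.
console.log(/.*(<\/body>)/i.exec(html)[1]);       // </body>
// With [\s\S], the greedy part spans the whole string, so the last tag is found.
console.log(/[\s\S]*(<\/body>)/i.exec(html)[1]);  // </BODY>
```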

