javascript - A regex to remove id, style, class attributes from HTML tags in JS

I have an HTML string in JavaScript and want to use a regex to remove the id, style and class attributes from its tags. For example, I have:

New York City.<div style="padding:20px" id="upp" class="upper"><div style="background:#F2F2F2; color:black; font-size:90%; padding:10px 10px; width:500px;">This message is.</div></div>

I want this string to become:

New York City.<div><div>This message is.</div></div>

asked Sep 10, 2012 at 22:22 by Jimmy Page, edited Sep 10, 2012 at 22:24

/me is casting a link to the legendary don't-parse-html-with-regex answer... – zerkms, Sep 10, 2012 at 22:26
How about removeAttribute(), maybe? – David Thomas, Sep 10, 2012 at 22:26
Convert it to a DOM element and use the appropriate tools to manipulate it. That's a far more stable solution. – You, Sep 10, 2012 at 22:31

8 Answers

Instead of parsing the HTML using regular expressions, which is a bad idea, you could take advantage of the DOM functionality that is available in all browsers. We need to be able to walk the DOM tree first:

var walk_the_DOM = function walk(node, func) {
    func(node);
    node = node.firstChild;
    while (node) {
        walk(node, func);
        node = node.nextSibling;
    }
};
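
To see the order in which the walker visits nodes without needing a browser, here is a small sketch that runs the same traversal over plain objects. The fakeNode helper and the node names are hypothetical stand-ins, not part of the answer; they expose only the firstChild and nextSibling properties that the walker reads.

```javascript
// Same traversal as walk_the_DOM above, repeated so this sketch runs standalone.
function walk(node, func) {
  func(node);
  node = node.firstChild;
  while (node) {
    walk(node, func);
    node = node.nextSibling;
  }
}

// Hypothetical helper: a plain object with firstChild/nextSibling links.
function fakeNode(name, children = []) {
  const node = { name, firstChild: children[0] || null, nextSibling: null };
  children.forEach((child, i) => { child.nextSibling = children[i + 1] || null; });
  return node;
}

// Mirrors the question's markup: wrapper > [text, div > [div > [text]]]
const tree = fakeNode('wrapper', [
  fakeNode('#text "New York City."'),
  fakeNode('outer div', [
    fakeNode('inner div', [fakeNode('#text "This message is."')]),
  ]),
]);

// Pre-order, depth-first: each parent is visited before its children.
const visited = [];
walk(tree, node => visited.push(node.name));
console.log(visited.join(' -> '));

// Starting from the first child visits only that node and its descendants,
// never its siblings. Here that is a single text node.
const fromFirstChild = [];
walk(tree.firstChild, node => fromFirstChild.push(node.name));
console.log(fromFirstChild.length); // 1
```

The second call shows why the starting node matters: the walker follows nextSibling only for the children of the node it was given, so the element passed in should be the container of everything you want to clean.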

Now parse the string and manipulate the DOM:

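var wrapper = document.createElement('div');
wrapper.innerHTML = '<!-- your HTML here -->';
// Start at the wrapper itself: starting at wrapper.firstChild would skip
// that node's siblings (in the question's example it is a text node).
walk_the_DOM(wrapper, function(element) {
    // Text and comment nodes have no removeAttribute method.
    if (element.removeAttribute) {
        element.removeAttribute('id');
        element.removeAttribute('style');
        element.removeAttribute('class');
    }
});
var result = wrapper.innerHTML;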

If you are willing to remove everything but the tag name from the div tags:

string = string.replace(/<(div)[^>]+>/ig, '<$1>');

Because of the i flag this also matches upper-case tags, and $1 keeps the name as written, so upper-case HTML returns <DIV>.
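
As a quick check, the snippet below applies that pattern to the string from the question, then shows the upper-case behaviour and one limitation. The variable names and the extra test strings are made up for illustration.

```javascript
const sample =
  'New York City.<div style="padding:20px" id="upp" class="upper">' +
  '<div style="background:#F2F2F2; color:black; font-size:90%; padding:10px 10px; width:500px;">' +
  'This message is.</div></div>';

// The pattern from the answer: keep the tag name, drop everything up to the next ">".
const stripped = sample.replace(/<(div)[^>]+>/ig, '<$1>');
console.log(stripped); // New York City.<div><div>This message is.</div></div>

// $1 echoes the name as written, so upper-case input stays upper-case.
const upper = '<DIV CLASS="x">a</DIV>'.replace(/<(div)[^>]+>/ig, '<$1>');
console.log(upper); // <DIV>a</DIV>

// A replacer function can lower-case the opening tag instead
// (the closing tag is not matched by this pattern, so it is left alone).
const lowered = '<DIV CLASS="x">a</DIV>'.replace(
  /<(div)[^>]+>/ig,
  (match, name) => '<' + name.toLowerCase() + '>'
);
console.log(lowered); // <div>a</DIV>

// Limitation: a ">" inside a quoted attribute value ends the match early.
const broken = '<div title="a>b">x</div>'.replace(/<(div)[^>]+>/ig, '<$1>');
console.log(broken); // <div>b">x</div>
```

The last case is the kind of input that makes regex-based HTML editing fragile: the pattern has no notion of quoted values, so it stops at the first ">" it sees.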

If you just want to remove the attributes, then regex is the wrong tool. I'd suggest, instead:

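// Removes every attribute from elem, not only id, style and class.
function stripAttributes(elem) {
    if (!elem) {
        return false;
    }
    else {
        // elem.attributes is a live collection: each removal shrinks it,
        // so keep removing the first entry until none are left.
        var attrs = elem.attributes;
        while (attrs.length) {
            elem.removeAttribute(attrs[0].name);
        }
    }
}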

var div = document.getElementById('test');

stripAttributes(div);

I used this:

var html = 'New York City.<div style="padding:20px" id="upp" class="upper"><div style="background:#F2F2F2; color:black; font-size:90%; padding:10px 10px; width:500px;">This message is.</div></div>';

// Removes every double-quoted attribute whose name is not listed in attrs.
function clear_attr(str, attrs) {
    var tagRe = /<\s*(\w+).*?>/g;      // an opening tag
    var attrRe = /\s*(\w+)="[^"]*"/g;  // name="value"
    return str.replace(tagRe, function(tag) {
        return tag.replace(attrRe, function(attr, name) {
            // Keep the attribute only if its name is in the list.
            return attrs.indexOf(name) >= 0 ? attr : '';
        });
    });
}
clear_attr(html, []); // an empty list removes all attributes

Use a regular expression. It is fast at run time and quick to write. Note that this one removes every attribute from every tag:

htmlCode = htmlCode.replace(/<([^ >]+)[^>]*>/ig, '<$1>');
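
The pattern above drops every attribute, including ones such as href that you may want to keep. A narrower sketch that removes only the named attributes could look like the following. The function stripNamedAttributes is hypothetical, it assumes the names contain no regex metacharacters, and like any regex approach it can be fooled by a ">" or an attribute-like string inside a quoted value.

```javascript
// Hypothetical helper: removes only the named attributes from opening tags.
function stripNamedAttributes(html, names) {
  // Matches: whitespace, one of the names, "=", then a double-quoted,
  // single-quoted, or unquoted value.
  const attr = new RegExp(
    '\\s+(?:' + names.join('|') + ')\\s*=\\s*(?:"[^"]*"|\'[^\']*\'|[^\\s>]+)',
    'gi'
  );
  // Only rewrite text that looks like an opening tag.
  return html.replace(/<[a-z][^>]*>/gi, tag => tag.replace(attr, ''));
}

const input =
  'New York City.<div style="padding:20px" id="upp" class="upper">' +
  '<div style="background:#F2F2F2; color:black; font-size:90%; padding:10px 10px; width:500px;">' +
  'This message is.</div></div>';

const cleaned = stripNamedAttributes(input, ['id', 'style', 'class']);
console.log(cleaned); // New York City.<div><div>This message is.</div></div>

// Other attributes survive, including ones whose name merely ends in "id".
const link = stripNamedAttributes(
  '<a href="x" class="y" data-id="1">t</a>',
  ['id', 'style', 'class']
);
console.log(link); // <a href="x" data-id="1">t</a>

// For comparison, the strip-everything pattern also loses href.
const all = '<a href="x" class="y">t</a>'.replace(/<([^ >]+)[^>]*>/ig, '<$1>');
console.log(all); // <a>t</a>
```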

Trying to parse HTML with regexes will cause problems. This answer may be helpful in explaining them. If you are using jQuery, you may be able to do something like this:

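var transformedHtml = $('<div>').html(html).find("*").removeAttr("id").removeAttr("style").removeAttr("class").end().html();

Wrapping the string in a div lets find("*") reach the top-level elements too, and calling html() on the wrapper returns the cleaned markup without needing an outerHTML plugin.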

If you don't want to use jQuery, it will be trickier. These questions may have some helpful answers on how to convert the string to a collection of DOM elements: "Converting HTML string into DOM elements?" and "Creating a new DOM element from an HTML string using built-in DOM methods or prototype". You may be able to loop through the elements and remove the attributes using the built-in removeAttribute method. I don't have the time or motivation to figure out all the details for you.

A plain script solution would be something like:

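function removeProperties(markup) {
  var div = document.createElement('div');
  div.innerHTML = markup;
  var el, els = div.getElementsByTagName('*');

  for (var i = 0, iLen = els.length; i < iLen; i++) {
    el = els[i];
    // removeAttribute drops the attribute entirely; assigning '' to
    // el.id or el.className would leave id="" and class="" in the markup.
    el.removeAttribute('id');
    el.removeAttribute('style');
    el.removeAttribute('class');
  }
  // Return the cleaned markup; alternatively, move div's children into
  // the document with someElement.appendChild(div.firstChild).
  return div.innerHTML;
}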

A more general solution would get the property names as extra arguments, or say a space separated string, then iterate over the names to remove them.
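
One way that more general version might look is sketched below, taking the names as a space-separated string. The function name removeNamedAttributes and the fakeElement stand-in are hypothetical; the stand-in exists only so the sketch also runs outside a browser. In a page you would pass a real collection such as div.getElementsByTagName('*').

```javascript
// Hypothetical helper: `elements` is any array-like of objects that expose
// removeAttribute(), e.g. div.getElementsByTagName('*') in a browser.
function removeNamedAttributes(elements, names) {
  // Split on whitespace and ignore empty entries from stray spaces.
  const list = names.split(/\s+/).filter(Boolean);
  for (let i = 0; i < elements.length; i++) {
    for (const name of list) {
      elements[i].removeAttribute(name);
    }
  }
}

// Stand-in for a DOM element: stores attributes in a plain object.
function fakeElement(attrs) {
  return {
    attrs: { ...attrs },
    removeAttribute(name) { delete this.attrs[name]; },
  };
}

const els = [
  fakeElement({ style: 'padding:20px', id: 'upp', class: 'upper' }),
  fakeElement({ style: 'width:500px;', title: 'kept' }),
];

removeNamedAttributes(els, 'id style class');
console.log(els.map(el => Object.keys(el.attrs))); // [ [], [ 'title' ] ]
```

Calling removeAttribute for a name the element does not have is harmless in the DOM as well, so the helper does not need to check first.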

I don't know about RegEx, but I sure as hell know about jQuery.

Convert the given HTML string into a DOM element, parse it, and return its contents.

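function cleanStyles(html) {
    // Parse the string inside a detached wrapper element.
    var temp = $(document.createElement('div'));
    temp.html(html);

    // This removes only style; with jQuery 1.7+ a space-separated list
    // such as 'id style class' removes all three attributes.
    temp.find('*').removeAttr('style');
    return temp.html();
}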

