I want to write a regex that matches the characters a-z except for e, n, and p. I can write:

[a-df-moq-z]

I'm just wondering if there's a way to write something like ([a-z except ^enp]), just to make it easier to see which characters are excluded.

Asked May 28, 2013 at 13:15 by neodymium

3 Answers

You can use negative lookahead like this:

(?![enp])[a-z]

Live Demo: http://www.rubular.com/r/1LnJswio3F

Explanation:

  • (?![enp]) is a negative lookahead that fails the match when the character at the next position is e, n, or p.
  • [a-z] then consumes one character in the range a-z, so the whole pattern matches any lowercase letter except e, n, and p.
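As a quick check of the explanation above, here is a minimal JavaScript sketch. The helper name `allowedLetters` and the sample inputs are made up for illustration; only the two patterns come from this thread.

```javascript
// Pattern from the answer, with the g flag so match() returns every hit.
const singleLetter = /(?![enp])[a-z]/g;

// Hypothetical helper: collect every character the pattern matches.
function allowedLetters(text) {
  // match() returns null when nothing matches, so fall back to an empty array.
  return text.match(singleLetter) || [];
}

console.log(allowedLetters("pineapple").join("")); // "ial"

// Compare against the hand-written ranges from the question.
const alphabet = "abcdefghijklmnopqrstuvwxyz";
const byLookahead = allowedLetters(alphabet).join("");
const byRanges = (alphabet.match(/[a-df-moq-z]/g) || []).join("");
console.log(byLookahead === byRanges); // true
```

Both forms select the same 23 letters; the lookahead version just names the excluded characters directly.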

There are a few ways to do this, depending on which regex flavor you're using. @anubhava's solution (the negative lookahead above) is the most portable, because it works in any flavor that supports lookaheads.

If you want to match a whole word or a whole string, you need to wrap that regex in a group, so that a quantifier applies to the lookahead and the character class together as one atom:

/\b(?:(?![enp])[a-z])+\b/

/^(?:(?![enp])[a-z])+$/
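The following JavaScript sketch tries both grouped patterns and shows what goes wrong without the group. The sample strings and variable names are illustrative assumptions, not part of the original answer.

```javascript
// Patterns copied from the answer above.
const wholeWord = /\b(?:(?![enp])[a-z])+\b/g;
const wholeString = /^(?:(?![enp])[a-z])+$/;

console.log(wholeString.test("radio")); // true: no e, n, or p anywhere
console.log(wholeString.test("radon")); // false: the n is rejected

// Only words made entirely of allowed letters are returned.
console.log("radio open dial".match(wholeWord)); // [ 'radio', 'dial' ]

// Without the group, the lookahead only guards the first character,
// so a forbidden letter later in the string slips through.
const ungrouped = /^(?![enp])[a-z]+$/;
console.log(ungrouped.test("radon")); // true
```

The last line is the reason for the non-capturing group: the `+` has to repeat the lookahead along with `[a-z]`.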

Another possibility is to scan the whole word/string to make sure it doesn't contain any of the unwanted characters, and then match it in the usual way:

/\b(?!\w*[enp])[a-z]+\b/

/^(?!\w*[enp])[a-z]+$/
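Here is a small JavaScript sketch of the scan-first variant, using the same illustrative sample strings as one might use for the grouped patterns. The variable names are assumptions.

```javascript
// Patterns copied from the answer above: one lookahead scans the rest of
// the word for e, n, or p before [a-z]+ is allowed to match.
const scanWord = /\b(?!\w*[enp])[a-z]+\b/g;
const scanString = /^(?!\w*[enp])[a-z]+$/;

console.log(scanString.test("radio")); // true
console.log(scanString.test("radon")); // false

// "open" is rejected at its first letter because the scan finds p, e, and n.
console.log("radio open dial".match(scanWord)); // [ 'radio', 'dial' ]
```

The lookahead runs once per attempted match rather than once per character, at the cost of listing the allowed and excluded characters in two separate places.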

It's all pretty hackish, but in JavaScript it's what you're stuck with. Some of the other flavors provide tools specifically for this purpose, like set intersection (Java, Ruby 1.9.x):

[a-z&&[^enp]]

...or set subtraction (.NET):

[a-z-[enp]]

The Unicode Consortium has gone hog wild with all this set arithmetic stuff, but as far as I know no real-world regex flavor has come anywhere near implementing all of its proposals.

You can use [^[^a-z]enp], which works but is a bit confusing to read.

The inner [^a-z] together with enp defines a class that includes every character that is not a-z, plus e, n, and p. Negating that class gives a class that matches a-z except e, n, and p.

You can try it here http://www.rubular.com/r/VEZNFgxgfI

Update: It does not seem to work in JavaScript (tested in Chrome). Ruby and PCRE should work.
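A likely explanation for the JavaScript result, assuming the classic syntax without the `v` flag: there are no nested character classes, so the first `]` closes the class and the rest is read as literal text. The sketch below illustrates that reading; the test strings are made up.

```javascript
// In classic JavaScript syntax this is parsed as the class [^[^a-z]
// (any character that is not "[", "^", or a-z) followed by the literal "enp]".
const nested = /[^[^a-z]enp]/;

console.log(nested.test("a"));     // false: a single letter cannot match
console.log(nested.test("x"));     // false
console.log(nested.test("1enp]")); // true: "1" followed by the literal "enp]"
```

So the pattern is still valid in JavaScript, it just means something different from the nested-class reading.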
