Regex to match words with hyphens and/or apostrophes
Use this pattern:
(?=\S*['-])([a-zA-Z'-]+)
(?= # Look-Ahead
\S # <not a whitespace character>
* # (zero or more)(greedy)
['-] # Character in ['-] Character Class
) # End of Look-Ahead
( # Capturing Group (1)
[a-zA-Z'-] # Character in [a-zA-Z'-] Character Class
+ # (one or more)(greedy)
) # End of Capturing Group (1)
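As a quick check of the pattern above, here is a minimal JavaScript sketch. The helper name findMarkedWords and the sample sentence are made up for illustration, and the g flag is added so that every match is collected rather than only the first.

```javascript
// Hypothetical helper: the lookahead requires a ' or - somewhere before the
// next whitespace, then group 1 captures a run of letters, ' and -.
function findMarkedWords(text) {
  const pattern = /(?=\S*['-])([a-zA-Z'-]+)/g;
  // matchAll yields one match object per hit; index 1 is capturing group 1.
  return Array.from(text.matchAll(pattern), (m) => m[1]);
}

console.log(findMarkedWords("It's a well-known fact"));
// → ["It's", "well-known"]
```

Plain words such as "a" and "fact" are skipped because the lookahead finds no apostrophe or hyphen in them.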
Match words with hyphens and apostrophes
Your \w+(?:'|\-\w+)? starts matching with a word character (\w), so "words" that start with ' are not matched, although the requirements call for them.
In general, you can match words with and without hyphens with
\w+(?:-\w+)*
In the current scenario, you may put \w and ' into a character class and use
'?\w[\w']*(?:-\w+)*'?
If a "word" can contain only one hyphen, replace the * after (?:-\w+) with the ? quantifier.
Breakdown:
'? - optional apostrophe
\w - a word character
[\w']* - 0+ word characters or apostrophes
(?:-\w+)* - 0+ sequences of:
  - - a hyphen
  \w+ - 1+ word characters
'? - optional apostrophe
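The breakdown above can be tried out with a short JavaScript sketch. The helper name extractWords and the sample sentence are assumptions for illustration; the g flag is added to return every word instead of only the first.

```javascript
// Optional leading apostrophe, a word character, more word characters or
// apostrophes, any number of "-word" parts, and an optional trailing apostrophe.
const wordPattern = /'?\w[\w']*(?:-\w+)*'?/g;

// Hypothetical helper: returns all matches, or an empty array if there are none.
function extractWords(text) {
  return text.match(wordPattern) || [];
}

console.log(extractWords("'tis a well-known fact, isn't it"));
// → ["'tis", "a", "well-known", "fact", "isn't", "it"]
```

Note that the comma after "fact" is left out of the match because it is neither a word character, an apostrophe, nor a hyphen followed by word characters.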
How to match a pattern with a hyphen or apostrophe
Your regex ^[a-zA-Z]['][-]$ matches exactly one letter followed by ' and then -, i.e. only something like a'-.
You need to add quantifiers and an optional group (* allows 0 or more occurrences), e.g.
^[a-zA-Z]+(?:['-][a-zA-Z]+)*$
         ^^^^^^^^^^^^^^^^^^^
The pattern anchors the whole match (it must match the entire string). It matches 1 or more letters ([a-zA-Z]+) and then 0 or more occurrences of a ' or - (thanks to ['-]), each followed by 1+ letters.
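To see the anchored pattern at work, here is a small JavaScript sketch. The function name isValidWord and the sample inputs are illustrative assumptions, not part of the original question.

```javascript
// Letters, optionally followed by groups of (' or -) plus more letters.
const anchoredPattern = /^[a-zA-Z]+(?:['-][a-zA-Z]+)*$/;

// Hypothetical helper: true only when the whole string fits the pattern.
function isValidWord(s) {
  return anchoredPattern.test(s);
}

for (const sample of ["O'Neil", "Jean-Luc", "a'-", "-abc", "abc-", "a--b"]) {
  console.log(sample, isValidWord(sample));
}
```

The first two samples pass. The remaining four fail because every ' or - must be followed by at least one letter and the string must start with a letter.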
Regex to allow only alphabetical characters, hyphens, apostrophes and period
Remove the * quantifier from the leading letter class so that a letter is required at the beginning, and require a letter at the end as well:
^[a-zA-Z](?:[ '.\-a-zA-Z]*[a-zA-Z])?$
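A minimal JavaScript sketch of how this pattern behaves, assuming it is used to validate a whole input string. The function name isValidName and the sample values are hypothetical.

```javascript
// A letter first; optionally any mix of space, ', ., - and letters,
// as long as the string also ends with a letter.
const namePattern = /^[a-zA-Z](?:[ '.\-a-zA-Z]*[a-zA-Z])?$/;

// Hypothetical helper name, used only for this illustration.
function isValidName(s) {
  return namePattern.test(s);
}

for (const sample of ["A", "St. John-O'Neil", "abc.", ".abc", "abc1"]) {
  console.log(JSON.stringify(sample), isValidName(sample));
}
```

A single letter is accepted because the group after the first letter is optional. Inputs that start or end with punctuation, or that contain digits, are rejected.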
Regex match hyphenated word with hyphen-less query
My solution to scenarios like this is always to introduce content processing and query processing.
Content processing is easier when you use the push model via the SDK, but you can achieve the same by creating a shadow copy of your table in which the content is manipulated for indexing purposes. Your original table stays intact, and you maintain a duplicate table where the text is processed.
Query processing is something you should use regardless. In its simplest form, you clean the end users' input before you use it in a query. Additional steps can handle special characters such as a hyphen: escape it, strip it, or do whatever else your requirements call for.
EXAMPLE
I have to support searches for ordering codes that may contain hyphens or other special characters. The maintainers of our ordering codes may define ordering codes in an inconsistent format. Customers visiting our sites are just as inconsistent.
The requirement is that ABC-123-DE_F-4.56G should match any of
- ABC-123-DE_F-4.56G
- ABC123-DE_F-4.56G
- ABC_123_DE_F_4_56G
- ABC.123.DE.F.4.56G
- ABC 123 DEF 456 G
- ABC123DEF456G
I solve this using my suggested approach above. I use content processing to generate a version of the ordering code without any special characters (using a simple regex). Then, I use query processing to transform the end user's input into an OR-query, like:
<verbatim-user-input-cleaned> OR OrderingCodeVariation:<verbatim-user-input-without-special-chars>
So, if the user entered ABC.123.DE.F.4.56G I would effectively search for
ABC.123.DE.F.4.56G OR OrderingCodeVariation:ABC123DEF456G
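The two processing steps can be sketched in JavaScript as follows. This is only an illustration under stated assumptions: the function names stripSpecialChars and buildQuery are invented, "cleaning" the input is reduced to trimming whitespace, and "special characters" is taken to mean anything that is not an ASCII letter or digit. A real search backend may also require escaping of the verbatim part.

```javascript
// Content processing (assumed rule): drop everything except ASCII letters
// and digits to get the stripped variation of an ordering code.
function stripSpecialChars(code) {
  return code.replace(/[^A-Za-z0-9]/g, "");
}

// Query processing: OR the cleaned input with a lookup on the stripped
// variation field. Trimming stands in for whatever cleaning you need.
function buildQuery(userInput) {
  const cleaned = userInput.trim();
  return `${cleaned} OR OrderingCodeVariation:${stripSpecialChars(cleaned)}`;
}

console.log(buildQuery("ABC.123.DE.F.4.56G"));
// → ABC.123.DE.F.4.56G OR OrderingCodeVariation:ABC123DEF456G
```

Because the indexed variation and the query term go through the same stripping rule, differently punctuated spellings of one ordering code collapse to the same value.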
JavaScript regular expression for word boundaries, tolerating in-word hyphens and apostrophes
You can organize your word-boundary characters into two groups.
- Characters that cannot be alone.
- Characters that can be alone.
A regex that works with your example would be:
[\s.,'-]{2,}|[\s.]
Now all that's left is to keep adding all non-word characters into those two groups until it fits all of your needs. So you might start adding symbols and more punctuation to those character classes.
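As a rough illustration of the two-group idea, the regex can be passed to String.prototype.split. The helper name splitWords and the sample sentence are assumptions, since the original example text is not shown here.

```javascript
// First alternative: two or more boundary characters in a row (these
// characters do not split a word when they appear alone).
// Second alternative: characters that split a word even when alone.
const boundary = /[\s.,'-]{2,}|[\s.]/;

// Hypothetical helper: split on boundaries and drop empty pieces, which
// can appear when the text starts or ends with a boundary.
function splitWords(text) {
  return text.split(boundary).filter((w) => w.length > 0);
}

console.log(splitWords("It's a well-known fact, isn't it. Say 'hello' - now"));
// → ["It's", "a", "well-known", "fact", "isn't", "it", "Say", "hello", "now"]
```

The apostrophes in "It's" and "isn't" and the hyphen in "well-known" survive because each stands alone between letters, while the quotes around "hello" are removed because each sits next to a space.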
Regular expression for alphabet, underscore, hyphen, apostrophe only
Your regex is wrong. Try this:
/^[0-9A-Za-z_@'-]+$/
OR
/^[\w@'-]+$/
The hyphen needs to be at the first or last position inside a character class to avoid escaping it. Also, if an empty string isn't allowed, use + (1 or more) instead of * (0 or more).
Explanation:
^ assert position at start of the string
[\w@'-]+ match a single character present in the list below
Quantifier: Between one and unlimited times, as many times as possible
\w match any word character [a-zA-Z0-9_]
@'- a single character in the list @'- literally
$ assert position at end of the string
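A short JavaScript sketch to confirm that the two variants accept the same ASCII inputs. The function name isAllowed and the sample strings are illustrative assumptions.

```javascript
// Both patterns accept one or more of: letters, digits, _, @, ' and -.
const explicitPattern = /^[0-9A-Za-z_@'-]+$/;
const shorthandPattern = /^[\w@'-]+$/;

// Hypothetical helper based on the shorthand variant.
function isAllowed(s) {
  return shorthandPattern.test(s);
}

for (const sample of ["user_name@o'neil-1", "bad name", "a.b", ""]) {
  // Print the sample, the shorthand result, and the explicit result.
  console.log(JSON.stringify(sample), isAllowed(sample), explicitPattern.test(sample));
}
```

Only the first sample passes. The others fail because of the space, the period, and the + quantifier rejecting the empty string.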