PHP Strip Punctuation

PHP strip punctuation

# to keep letters & numbers
$s = preg_replace('/[^a-z0-9]+/i', '_', $s); # or...
$s = preg_replace('/[^a-z\d]+/i', '_', $s);

# to keep letters only
$s = preg_replace('/[^a-z]+/i', '_', $s);

# to keep letters, numbers & underscore
$s = preg_replace('/[^\w]+/', '_', $s);

# same as third example; suggested by @tchrist; ^\w = \W
$s = preg_replace('/\W+/', '_', $s);

For the string

$s = "Hello, is StackOverflow a helpful website!? Yes!";

result (for all examples) is

Hello_is_StackOverflow_a_helpful_website_Yes_
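
To check the claim that all the examples give the same result, here is a small sketch that runs three of the patterns above on the sample string. The `$patterns` and `$results` arrays and the final `trim()` call are additions for illustration, not part of the original answer.

```php
<?php
$s = "Hello, is StackOverflow a helpful website!? Yes!";

// Labels are only for display; the patterns are the ones shown above.
$patterns = [
    'letters and digits'          => '/[^a-z0-9]+/i',
    'letters only'                => '/[^a-z]+/i',
    'letters, digits, underscore' => '/\W+/',
];

$results = [];
foreach ($patterns as $label => $pattern) {
    // Each run of unwanted characters collapses into a single underscore.
    $results[$label] = preg_replace($pattern, '_', $s);
    echo $label, ': ', $results[$label], "\n";
}

// Optional extra step (an assumption about what you might want):
// drop the underscore left behind by the final "!".
$trimmed = trim($results['letters only'], '_');
echo $trimmed, "\n";
```

The results only coincide because the sample string contains no digits or underscores; on other input the three patterns can differ.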

Enjoy!

how to strip punctuation in php

Since you need to match some Unicode characters (such as €), it would be sensible to use a regular expression. The pattern \p{P} matches any known punctuation, and the negative lookahead assertion keeps your desired special characters from being removed:

 $text = preg_replace("/(?![.=$'€%-])\p{P}/u", "", $text);
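
A quick way to see what the lookahead protects is to run the pattern on a made-up string. The sample text below is hypothetical, and the pattern is written in single quotes (with the apostrophe escaped) purely so PHP does no variable interpolation.

```php
<?php
// Hypothetical input; the source file is assumed to be saved as UTF-8.
$text = "Total: 5.99€ (20% off), isn't that great?!";

// (?![.=$'€%-]) refuses to match when the next character is one of the
// protected ones; \p{P} then matches any remaining Unicode punctuation.
$clean = preg_replace('/(?![.=$\'€%-])\p{P}/u', '', $text);

echo $clean, "\n"; // Total 5.99€ 20% off isn't that great
```

The colon, parentheses, comma, question mark and exclamation mark are removed, while the period, percent sign and apostrophe survive.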

PHP - Remove all punctuation from the start and end of the string

You can use

$new = preg_replace('/^[^\p{L}0-9]+|[^\p{L}0-9]+\z/u', '', $str);

The regex matches

  • ^[^\p{L}0-9]+ - any one or more chars other than Unicode letters and ASCII digits at the start of string
  • | - or
  • [^\p{L}0-9]+\z - any one or more chars other than Unicode letters and ASCII digits at the end of string.
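
As a minimal sketch of the two alternatives at work, the sample string below (made up for this example) has punctuation at both ends and in the middle. Only the ends are touched.

```php
<?php
// Hypothetical input; the source file is assumed to be saved as UTF-8.
$str = '«¡Héllo, wörld 42!»...';

// First alternative strips the leading run, second strips the trailing run.
$new = preg_replace('/^[^\p{L}0-9]+|[^\p{L}0-9]+\z/u', '', $str);

echo $new, "\n"; // Héllo, wörld 42
```

The comma inside the string stays because neither alternative can match away from the start or end.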


PHP strip punctuation keep apostrophe

If you want to support all Unicode punctuation characters as well then use this regex:

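$str = preg_replace("#((?!')\pP)+#u", '', $str);

This regex matches runs of characters from the Unicode punctuation class \pP, and the negative lookahead (?!') makes it skip the apostrophe. The u modifier tells PCRE to treat the pattern and the subject as UTF-8, which is needed for punctuation outside ASCII.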
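
Here is a short sketch of the apostrophe-preserving pattern on a made-up string. It assumes the u modifier is present so that multi-byte punctuation such as « and » is recognised.

```php
<?php
// Hypothetical input; the source file is assumed to be saved as UTF-8.
$str = "Don't stop, it's «fine»!";

// Every punctuation character except the ASCII apostrophe is removed.
$out = preg_replace('#((?!\')\pP)+#u', '', $str);

echo $out, "\n"; // Don't stop it's fine
```

Only the ASCII apostrophe (') is protected. A typographic apostrophe (’) is itself Unicode punctuation and would be removed unless it is added to the lookahead.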

PHP preg_replace: remove punctuation from beginning and end of string

I wouldn't use a regex, probably something like...

$str = trim($str, '"\'');

Where the second argument is what you define as punctuation.

Assuming what you really meant was to strip out leading and trailing stuff which isn't a letter (note that \PL also matches digits, so those are stripped too), I'd go with...

$str = preg_replace('/^\PL+|\PL+\z/u', '', $str);
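
To compare the two approaches side by side, here is a rough sketch with made-up inputs. The regex variant below uses a + on both alternatives and the u modifier, so that whole runs of non-letters are removed and multi-byte characters are handled.

```php
<?php
// trim() with an explicit character list: only the listed characters
// are removed, and only from the two ends of the string.
$quoted  = '"\'Hello, world\'"';
$trimmed = trim($quoted, '"\'');
echo $trimmed, "\n"; // Hello, world

// Regex variant: remove any run of non-letters (\PL) at either end.
// Hypothetical input; the source file is assumed to be saved as UTF-8.
$spanish  = '...¿Qué tal?...';
$stripped = preg_replace('/^\PL+|\PL+\z/u', '', $spanish);
echo $stripped, "\n"; // Qué tal
```

trim() is the simpler choice when you know exactly which characters to remove. The regex is more convenient when "punctuation" means anything that is not a letter.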

Strip all non-alphanumeric, spaces and punctuation symbols from a string

preg_replace("/[^a-zA-Z0-9\s\p{P}]/u", "", $str);

Example:

php > echo preg_replace("/[^a-zA-Z0-9\s\p{P}]/u", "", "⟺f✆oo☃. ba⟗r!");
foo. bar!

\p{P} matches all Unicode punctuation characters (see Unicode character properties); the u modifier makes PCRE read the subject as UTF-8. If you only want to allow specific punctuation, list those characters in the negated character class instead. For example:

preg_replace("/[^a-zA-Z0-9\s.?!]/", "", $str);
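
If the set of allowed punctuation changes from call to call, one option is to build the character class from a string. The sketch below uses a hypothetical helper named keepOnly(), which is not from the original answer. It relies on preg_quote() to escape characters that are special inside a pattern.

```php
<?php
// Hypothetical helper: keep ASCII letters, digits, whitespace and only the
// punctuation characters listed in $allowed; remove everything else.
function keepOnly(string $str, string $allowed): string
{
    // preg_quote() escapes regex metacharacters (including "-" and "]"),
    // so the list is safe to embed inside the character class.
    $class = preg_quote($allowed, '/');
    return preg_replace('/[^a-zA-Z0-9\s' . $class . ']/', '', $str);
}

$kept = keepOnly("Hi, there; what's up?!", '.?!');
echo $kept, "\n"; // Hi there whats up?!
```

The comma, semicolon and apostrophe are dropped because they are not in the allowed list, while ? and ! are kept.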

How to remove all punctuation in a string just get the words separated by spaces in PHP

Match each run of characters that are neither digits nor letters and replace it with a single space:

# string(41) "This is a demo String Need to format this"
$str = trim( preg_replace( "/[^0-9a-z]+/i", " ", $str ) );

Demo: http://codepad.org/hXu6skTc


/ # Denotes start of pattern
[ # Denotes start of character class
^ # Not, or negative
0-9 # Numbers 0 through 9 (or "not a number" because of ^)
a-z # Letters a through z (or "not a letter or number" because of ^ and 0-9)
] # Denotes end of character class
+ # Matches 1 or more instances of the character class match
/ # Denotes end of pattern
i # Case-insensitive, a-z also means A-Z
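
Putting the breakdown above to work, here is a sketch with a made-up input string. It also shows two ways to get the individual words: explode() on the cleaned string, or preg_split() applied directly with the same pattern.

```php
<?php
// Hypothetical input with runs of punctuation between the words.
$str = "This is a demo String!!! Need to -- format this...";

// Replace each run of non-alphanumerics with one space, then trim the ends.
$clean = trim(preg_replace('/[^0-9a-z]+/i', ' ', $str));
var_dump($clean); // string(41) "This is a demo String Need to format this"

// Option 1: split the cleaned string on single spaces.
$words = explode(' ', $clean);

// Option 2: split the raw string on the same pattern, skipping empty pieces.
$wordsDirect = preg_split('/[^0-9a-z]+/i', $str, -1, PREG_SPLIT_NO_EMPTY);
```

Both options give the same nine words here. preg_split() avoids the intermediate string if only the word list is needed.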

What is the best way to remove punctuation marks, symbols, diacritics, special characters?

Depending on how greedy you'd like to be, you could do something like:

$pg_url = preg_replace("/[^a-zA-Z 0-9]+/", " ", $pg_url);

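This will replace each run of characters that isn't an ASCII letter, digit or space with a single space. Letters with diacritics fall outside a-zA-Z, so they are replaced as well.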
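
Because the space is part of the allowed set, existing spaces are kept and the replacement can leave several spaces in a row. The sketch below, using a made-up input, adds two follow-up steps (collapsing repeated spaces and trimming) that are an assumption about the desired output rather than part of the original answer.

```php
<?php
// Hypothetical input.
$pg_url = "Hello,   World! (v2.0)";

// Step 1: the pattern from above; punctuation runs become single spaces.
$step1 = preg_replace("/[^a-zA-Z 0-9]+/", " ", $pg_url);

// Step 2 (added here): collapse runs of spaces and trim both ends.
$step2 = trim(preg_replace('/ {2,}/', ' ', $step1));

echo $step2, "\n"; // Hello World v2 0
```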


