Python Replace Single Quotes Except Apostrophes
What you really need in order to properly replace only the starting and ending ' is a regex.
To match them, use ^' for a starting ' (opensingle) and '$ for an ending ' (closesingle).
Unfortunately, the replace method does not support regexes, so you should use re.sub instead.
Below is an example program that prints your desired output (in Python 3):
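import re

# Avoid naming the variable "str", which would shadow the built-in type.
text = "don't 'George ma'am end.' didn't.' 'Won't"
words = text.split(" ")
for word in words:
    # Replace a quote only at the very start or very end of each word.
    word = re.sub(r"^'", '<opensingle>\n', word)
    word = re.sub(r"'$", '\n<closesingle>', word)
    word = word.replace('.', '\n<period>')
    word = word.replace(',', '\n<comma>')
    print(word)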
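To illustrate why re.sub is needed here, the following minimal sketch compares the two calls on a single word. The sample word and the variable names literal and anchored are assumptions chosen for illustration.

```python
import re

word = "'George"

# str.replace treats "^'" as two literal characters, so nothing matches.
literal = word.replace("^'", "<opensingle>")

# re.sub interprets ^ as "start of string", so only the leading quote is replaced.
anchored = re.sub(r"^'", "<opensingle>", word)

print(literal)   # 'George
print(anchored)  # <opensingle>George
```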
Replace the single quote (') character from a string
As for how to represent a single apostrophe as a string in Python, you can simply surround it with double quotes ("'") or you can escape it inside single quotes ('\'').
To remove apostrophes from a string, a simple approach is to just replace the apostrophe character with an empty string:
>>> "didn't".replace("'", "")
'didnt'
Replace single quotes with double with exclusion of some elements
First attempt
You can also use this regex:
(?:(?<!\w)'((?:.|\n)+?'?)'(?!\w))
This regex matches the whole sentence/word together with both quotation marks, at the beginning and at the end, and also captures the quoted content in group 1, so you can replace the matched part with "\1".
(?<!\w) - negative lookbehind asserting that no word character precedes the quote, to exclude words like "you'll", etc., but to allow the regex to match quotations after characters like \n, :, ;, . or -, etc. The assumption that there will always be whitespace before a quotation is risky.
' - single quotation mark,
((?:.|\n)+?'?) - group 1, wrapping a non-capturing group: one or more of any character or newline (to match multiline sentences) with a lazy quantifier (to avoid matching from the first to the last single quotation mark), followed by an optional single quotation mark, in case there are two in a row,
'(?!\w) - single quote not followed by a word character, to exclude text like "i'm", "you're", etc., where the quotation mark is between letters.
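A minimal sketch of how this pattern might be applied in Python with re.sub, replacing each match with its group 1 wrapped in double quotes. The helper name requote and the sample strings are assumptions made for illustration and are not part of the original answer.

```python
import re

# First-attempt pattern from the answer; group 1 holds the quoted content.
QUOTED = re.compile(r"(?:(?<!\w)'((?:.|\n)+?'?)'(?!\w))")

def requote(text):
    """Hypothetical helper: swap the outer single quotes for double quotes."""
    return QUOTED.sub(r'"\1"', text)

print(requote("He said 'I'm home' and left."))  # He said "I'm home" and left.
print(requote("don't say 'hello'"))             # don't say "hello"
```

The apostrophes inside I'm and don't are left alone because each has a word character on at least one side that the lookarounds reject.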
The s' case
However, it still has a problem with sentences where an apostrophe occurs after a word ending in s, like: 'the classes' hours'. I think it is impossible to distinguish with a regex whether an s followed by ' should be treated as the end of a quotation or as an s with an apostrophe. But I figured out a kind of limited workaround for this problem, with the regex:
(?:(?<!\w)'((?:.|\n)+?'?)(?:(?<!s)'(?!\w)|(?<=s)'(?!([^']|\w'\w)+'(?!\w))))
with an additional alternative for the s' cases: (?<!s)'(?!\w)|(?<=s)'(?!([^']|\w'\w)+'(?!\w))
where:
(?<!s)'(?!\w) - if there is no s before the ', match as in the regex above (first attempt),
(?<=s)'(?!([^']|\w'\w)+'(?!\w)) - if there is an s before the ', end the match on this ' only if there is no other ' followed by a non-word character in the following text, before the end or before another ' (but only a ' preceded by a letter other than s, or the opening of the next quotation). The \w'\w is there to include in such a match any ' that sits between letters, as in i'm, etc.
This regex should match wrongly only if there are a couple of s' cases in a row. Still, it is far from a perfect solution.
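The workaround can be tried out in Python roughly as follows. This is only a sketch: the helper name requote_s and the test sentences are assumptions, and the pattern is copied from the answer without further hardening.

```python
import re

# Workaround pattern from the answer: a quote after "s" closes the quotation
# only when no later closing quote follows within the same quoted text.
S_AWARE = re.compile(
    r"(?:(?<!\w)'((?:.|\n)+?'?)"
    r"(?:(?<!s)'(?!\w)|(?<=s)'(?!([^']|\w'\w)+'(?!\w))))"
)

def requote_s(text):
    """Hypothetical helper: replace outer single quotes, tolerating one s' inside."""
    return S_AWARE.sub(r'"\1"', text)

print(requote_s("'the classes' hours'"))  # "the classes' hours"
print(requote_s("'the boss' said 'no'"))  # "the boss" said "no"
```

In the first sentence the quote after classes is kept as an apostrophe because a later closing quote exists; in the second, the quote after boss closes the quotation because the next quote opens a new one.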
Flaws of \w
Also, when using \w there is always a chance that ' will occur after a symbol, or after a character that is a letter but not in [a-zA-Z_0-9], like some local-language character, and then it will be treated as the beginning of a quotation. This could be avoided by replacing (?<!\w) and (?!\w) with (?<!\p{L}) and (?!\p{L}), or with something like (?<=^|[,.?!)\s]), etc., a positive lookbehind for characters which can occur in a sentence before a quotation. However, such a list could be quite long.
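Two Python-specific caveats apply when trying this: the standard re module does not accept \p{L} (the third-party regex package does), and re requires fixed-width lookbehinds, so (?<=^|[,.?!)\s]) would be rejected as written. Also, in Python 3, \w on str patterns is already Unicode-aware unless re.ASCII is set. The sketch below is therefore an assumption-laden illustration: it uses re.ASCII to reproduce the flaw described above, and [^\W\d_] as an approximate stdlib stand-in for "any letter". The names LETTER, ascii_pattern and letter_pattern are hypothetical.

```python
import re

# Approximate "letter" class: a word character that is neither a digit nor "_".
LETTER = r"[^\W\d_]"

# ASCII-only \w, which reproduces the flaw: "é" does not count as a word character.
ascii_pattern = re.compile(r"(?<!\w)'((?:.|\n)+?'?)'(?!\w)", re.ASCII)

# Same shape as the first attempt, with the lookarounds testing for letters instead.
letter_pattern = re.compile(rf"(?<!{LETTER})'((?:.|\n)+?'?)'(?!{LETTER})")

text = "José's 'car'"
ascii_result = ascii_pattern.sub(r'"\1"', text)
letter_result = letter_pattern.sub(r'"\1"', text)

print(ascii_result)   # José"s 'car"
print(letter_result)  # José's "car"
```

Note that the letter class treats digits and underscores as non-letters, so a quote next to a digit behaves differently than with \w.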
Replace single quotes in a string but not escaped single quotes
You need a negative lookbehind, not a negative lookahead ("no backslash before a quote"):
import re
result = '''{'key1': 4, 'key2': 'I\\'m home'}'''
# Replace every ' that is not immediately preceded by a backslash.
print(re.sub(r"(?<!\\)'", '"', result))
# {"key1": 4, "key2": "I\'m home"}
Removing single quotes if they aren't in the middle of a word
Split the string, use strip() on each word to remove its leading and trailing quotes, then join it all back together.
>>> s = "'here is some stuff 'now there are quotes' now there's not'"
>>> print(' '.join(w.strip("'") for w in s.split()).lower())
here is some stuff now there are quotes now there's not
How to find all occurrences of a single quote not within a word with python regex
Try the following regex.
Regex: (?<![a-zA-Z])'|'(?![a-zA-Z])
and replace with "
Explanation:
(?<![a-zA-Z])' matches an apostrophe not preceded by a letter.
'(?![a-zA-Z]) matches an apostrophe not followed by a letter.
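As a quick check, the regex can be run through re.sub in Python. This is a minimal sketch; the sample sentence is borrowed from the strip() example earlier on this page, and the variable names are assumptions.

```python
import re

# A quote counts as "outside a word" if a letter is missing on either side.
OUTER_QUOTE = re.compile(r"(?<![a-zA-Z])'|'(?![a-zA-Z])")

s = "'here is some stuff 'now there are quotes' now there's not'"
converted = OUTER_QUOTE.sub('"', s)

print(converted)
# "here is some stuff "now there are quotes" now there's not"
```

The apostrophe in there's survives because it has a letter on both sides, so neither alternative matches it.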