Removing character in list of strings
Try this:
lst = ["aaaa8", "bb8", "ccc8", "dddddd8"]
print([s.strip('8') for s in lst])  # remove 8s only from the start and end of each string
print([s.replace('8', '') for s in lst])  # remove every 8, wherever it occurs
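The difference between the two calls only shows up when the character also occurs inside a string. A minimal sketch, using a made-up list (`samples` is not from the question) and adding `rstrip` for the trailing-only case:

```python
# Hypothetical sample data: the 8 appears at the ends and in the middle.
samples = ["8aa8aa8", "bb8", "8cc"]

stripped = [s.strip('8') for s in samples]        # leading and trailing 8s only
right_only = [s.rstrip('8') for s in samples]     # trailing 8s only
replaced = [s.replace('8', '') for s in samples]  # every 8

print(stripped)    # ['aa8aa', 'bb', 'cc']
print(right_only)  # ['8aa8aa', 'bb', '8cc']
print(replaced)    # ['aaaa', 'bb', 'cc']
```

If the character is only ever a suffix, as in the list above, `rstrip` is probably the safest choice because it cannot touch anything else.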
Removing a character from a string in a list of lists
Strings are immutable, so item.replace('*', '') returns a new string with the characters replaced; it does not (and cannot) modify the original in place. You can enumerate over your lists and assign the returned string back to the list.
Example:
for lst in testList:
    for j, item in enumerate(lst):
        lst[j] = item.replace('*', '')
You can also do this easily with a list comprehension -
testList = [[item.replace('*', '') for item in lst] for lst in testList]
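To see why the assignment matters, here is a small sketch. The contents of `testList` are invented for illustration, since the original answer does not show them:

```python
# Hypothetical input; the original answer does not define testList.
testList = [["a*b", "*c"], ["d", "e**"]]

# Calling replace() and discarding its return value changes nothing.
for lst in testList:
    for item in lst:
        item.replace('*', '')
unchanged = [list(lst) for lst in testList]  # snapshot: still contains '*'

# Assigning the result back by index updates the inner lists in place.
for lst in testList:
    for j, item in enumerate(lst):
        lst[j] = item.replace('*', '')

print(unchanged)  # [['a*b', '*c'], ['d', 'e**']]
print(testList)   # [['ab', 'c'], ['d', 'e']]
```

The in-place loop keeps the same inner list objects, which matters if other variables refer to them; the list comprehension builds new lists instead.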
How to delete characters from an element array by conditions in python?
How about this:
ids = [
    ["MedGen:100,OMIM:1,Orpha:D23", "na", "na"],
    ["na", "OMIM:2,MedGen:20,Orpha:D33", "MedGen:500", "na", "na"],
    ["OMIM:22,Orpha:D36,MedGen:34"]
]
import re
for i in ids:
    for j in range(len(i)):
        result = re.findall(r"MedGen\s*:\s*\d+", i[j])
        if result:
            # keep only the first MedGen entry; elements without one are left alone
            i[j] = result[0]
print(ids)
All you have to do is iterate over every value in each inner list and use re.findall to check whether it contains a match for MedGen:\d+. If it does, replace the element with the match; if not, leave it as it is.
Regex
A quick summary of what r"MedGen\s*:\s*\d+" means: you're searching for MedGen, followed by zero or more whitespace characters (\s*), followed by a colon, followed by zero or more whitespace characters, followed by one or more digits (\d+). If something like this is found, result contains the first match at index 0, and we set that element's value to the match itself. If it is not found, we keep the element as it is.
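Since only the first match is ever used, the same idea can also be sketched with `re.search`, which stops at the first match. The helper name `keep_medgen` and the constant `MEDGEN` are made up for this example; this variant builds a new nested list rather than editing `ids` in place:

```python
import re

# Same pattern as above, compiled once (MEDGEN is a hypothetical name).
MEDGEN = re.compile(r"MedGen\s*:\s*\d+")

def keep_medgen(value):
    """Return the first MedGen entry found in value, or value itself if none."""
    match = MEDGEN.search(value)
    return match.group(0) if match else value

ids = [
    ["MedGen:100,OMIM:1,Orpha:D23", "na", "na"],
    ["na", "OMIM:2,MedGen:20,Orpha:D33", "MedGen:500", "na", "na"],
    ["OMIM:22,Orpha:D36,MedGen:34"],
]

cleaned = [[keep_medgen(v) for v in row] for row in ids]
print(cleaned)
# [['MedGen:100', 'na', 'na'], ['na', 'MedGen:20', 'MedGen:500', 'na', 'na'], ['MedGen:34']]
```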
How to remove a character from every string in a list, based on the position where a specific character occurs in the first member of said list?
Solution using zip()
>>> shortened = [*zip(*[t for t in zip(*list_strings) if t[0] != "-"])]
>>> shortened
[('A', 'C', 'T', 'G'), ('A', 'C', 'T', 'A'), ('A', 'G', 'G', 'A'), ('A', 'G', 'G', 'G')]
>>>
>>> new_strings = ["".join(t) for t in shortened]
>>> new_strings
['ACTG', 'ACTA', 'AGGA', 'AGGG']
So, there are plenty of ways to do this, but this particular method zips the gene strings together and filters out the tuples which start with a "-". Think of stacking the four gene strings on top of each other: zip() takes the "columns" of that stack:
>>> [*zip(*list_strings)]
[('A', 'A', 'A', 'A'), ('-', 'T', 'T', 'T'), ('C', 'C', 'G', 'G'), ('-', 'G', 'C', 'C'), ('T', 'T', 'G', 'G'), ('G', 'A', 'A', 'G'), ('-', 'G', 'T', 'T'), ('-', 'C', 'C', 'C')]
After removing the tuples that start with "-", the tuples are zipped back together the other way (think now of taking these tuples and stacking them vertically, then in the same way as before, zip() takes the columns of that stack). Finally, "".join() turns the tuples of characters into strings.
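The three steps (transpose, filter, transpose back) can be sketched as one small function. The name `drop_gap_columns` and its `gap` parameter are invented for this example:

```python
def drop_gap_columns(strings, gap="-"):
    """Remove every position where the first string has the gap character."""
    columns = zip(*strings)                            # one tuple per position
    kept = [col for col in columns if col[0] != gap]   # col[0] comes from the first string
    rows = zip(*kept)                                  # transpose back to one tuple per string
    return ["".join(row) for row in rows]

list_strings = ["A-C-TG--", "ATCGTAGC", "ATGCGATC", "ATGCGGTC"]
print(drop_gap_columns(list_strings))  # ['ACTG', 'ACTA', 'AGGA', 'AGGG']
```

Note that zip() stops at the shortest input, so this sketch assumes all strings have the same length.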
"What am I doing wrong?"
To answer the question "what am I doing wrong?", I've added print statements to your code. Try running this and interpreting the output:
list_strings=["A-C-TG--","ATCGTAGC","ATGCGATC","ATGCGGTC"]
new_list_strings=[]
positions=[i for i, letter in enumerate(list_strings[0]) if letter == "-"]
for string in list_strings:
    print(f"string: {string}")
    for i in range(len(string)):
        print(f" i: {i}")
        for pos in positions:
            print(f" pos: {pos}")
            if i==pos:
                string2=string[:i]+string[i+1:]
                print(f" match! string2 result: {string2}")
    new_list_strings.append(string2)
    print()
Notice that for each string, multiple string2 objects are created, each one built from the original string with only a single character removed.
Solution using a plain-Jane accumulator pattern
The barebones accumulator pattern does work for this problem:
list_strings = ["A-C-TG--","ATCGTAGC","ATGCGATC","ATGCGGTC"]
positions = [i for i, letter in enumerate(list_strings[0]) if letter == "-"]
new_list_strings = []
for string in list_strings:
    new_str = ""
    # enumerate() is needed to get (index, character) pairs
    for idx, char in enumerate(string):
        if idx not in positions:
            new_str += char
    new_list_strings.append(new_str)
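The same accumulator idea can be written more compactly. A sketch, assuming the positions are stored in a set (the name `gap_positions` is made up here) so that each membership test is a constant-time lookup:

```python
list_strings = ["A-C-TG--", "ATCGTAGC", "ATGCGATC", "ATGCGGTC"]

# Indices of "-" in the first string, kept in a set for fast lookups.
gap_positions = {i for i, ch in enumerate(list_strings[0]) if ch == "-"}

# Rebuild each string from the characters whose index is not a gap position.
new_list_strings = [
    "".join(ch for i, ch in enumerate(s) if i not in gap_positions)
    for s in list_strings
]
print(new_list_strings)  # ['ACTG', 'ACTA', 'AGGA', 'AGGG']
```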
Remove specific characters from String List - Python
This can be done much more simply by reading the file directly and filtering out the unwanted characters while storing its content in a variable.
For example, here is a file 'file1.txt' with the content:
Hello how are you? Very good!
Then we can do the following:
def main():
    characters = '!?¿-.:;'
    with open('file1.txt') as f:
        aux = ''.join(c for c in f.read() if c not in characters)
    # print(aux) # Hello how are you Very good
As we can see, aux is the file's content without the unwanted characters, and it can easily be edited to match the desired output format.
For example, if we want a list of words, we can do this:
def main():
    characters = '!?¿-.:;'
    with open('file1.txt') as f:
        aux = ''.join(c for c in f.read() if c not in characters)
    aux = aux.split()
    # print(aux) # ['Hello', 'how', 'are', 'you', 'Very', 'good']
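As an alternative to filtering character by character, `str.translate` with a deletion table does the same job in one call. This is a sketch of a different technique, not the method used above; the `text` literal stands in for `f.read()` so the example runs without a file:

```python
characters = '!?¿-.:;'

# With three arguments, str.maketrans treats the third as characters to delete.
table = str.maketrans('', '', characters)

text = "Hello how are you? Very good!"  # stands in for f.read()
aux = text.translate(table)
words = aux.split()

print(aux)    # Hello how are you Very good
print(words)  # ['Hello', 'how', 'are', 'you', 'Very', 'good']
```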
How to remove characters from a string after a certain point within a list?
You can use a list comprehension:
original = ['0.04112243,0.04112243,right,4.11%', '0.12733313,0.05733313,right,12.73%', '0.09203131,0.02203131,right,9.2%']
new = [s[:10] for s in original]
Output:
['0.04112243', '0.12733313', '0.09203131']
You can also be a bit more flexible if you want to keep everything before the first comma:
new = [s.partition(',')[0] for s in original]
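The fixed slice only works while the first field is exactly 10 characters long. A short sketch with invented rows (`rows` is not from the question) shows how the approaches differ once the lengths vary or a comma is missing:

```python
# Hypothetical rows: a 10-character field, a short field, and no comma at all.
rows = ['0.04112243,0.04112243,right,4.11%', '0.5,0.05,right,50%', 'no-comma-here']

by_slice = [s[:10] for s in rows]                   # fixed width, ignores commas
by_partition = [s.partition(',')[0] for s in rows]  # text before the first comma
by_split = [s.split(',', 1)[0] for s in rows]       # same result via split

print(by_slice)      # ['0.04112243', '0.5,0.05,r', 'no-comma-h']
print(by_partition)  # ['0.04112243', '0.5', 'no-comma-here']
print(by_split)      # ['0.04112243', '0.5', 'no-comma-here']
```

When a string contains no comma, both partition and split return the whole string unchanged.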