How to Delete a Character in an Item in a List (Python)

Removing character in list of strings

Try this:

lst = ["aaaa8", "bb8", "ccc8", "dddddd8"]
print([s.strip('8') for s in lst])  # removes 8s from both ends of each string
print([s.replace('8', '') for s in lst])  # removes every 8, wherever it occurs
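
The two calls only differ when the character also appears somewhere other than the end of a string. A minimal sketch of that difference, using made-up sample data (the `samples` list is an assumption for illustration, not part of the original question):

```python
# Hypothetical sample data: the 8 appears at the start, middle, and end.
samples = ["8aa8aa8", "bb8", "8cc"]

# strip() only trims matching characters from both ends.
stripped = [s.strip("8") for s in samples]

# rstrip() trims matching characters from the right end only.
right_stripped = [s.rstrip("8") for s in samples]

# replace() removes every occurrence, wherever it is.
replaced = [s.replace("8", "") for s in samples]

print(stripped)        # ['aa8aa', 'bb', 'cc']
print(right_stripped)  # ['8aa8aa', 'bb', '8cc']
print(replaced)        # ['aaaa', 'bb', 'cc']
```

If only a trailing character should go, rstrip() is probably the safest of the three.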

Removing a character from a string in a list of lists

Strings are immutable, so item.replace('*', '') returns a new string with the characters replaced; it cannot change the original string in place. You can enumerate over your lists and assign the returned string back to the list -

Example -

for lst in testList:
    for j, item in enumerate(lst):
        # write the cleaned string back into the same slot
        lst[j] = item.replace('*', '')

You can also do this easily with a list comprehension -

testList = [[item.replace('*', '') for item in lst] for lst in testList]
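
The two approaches are not quite interchangeable: the loop writes into the existing inner lists, while the comprehension builds new lists and rebinds the name. A small sketch of the difference, assuming a made-up `testList` (the question's data is not shown) and a hypothetical second reference called `alias`:

```python
# Hypothetical input; the original question's testList is not shown.
testList = [["a*b", "*c"], ["d", "e**"]]
alias = testList  # a second reference to the same outer list

# The comprehension builds brand-new lists and rebinds the name testList.
testList = [[item.replace("*", "") for item in lst] for lst in testList]

print(testList)  # [['ab', 'c'], ['d', 'e']]
print(alias)     # [['a*b', '*c'], ['d', 'e**']]  (still the old data)

# The enumerate loop instead writes into the existing inner lists.
for lst in alias:
    for j, item in enumerate(lst):
        lst[j] = item.replace("*", "")

print(alias)     # [['ab', 'c'], ['d', 'e']]
```

This only matters if other parts of your code hold a reference to the original list.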

How to delete characters from an element array by conditions in python?

How about this:

import re

ids = [
    ["MedGen:100,OMIM:1,Orpha:D23", "na", "na"],
    ["na", "OMIM:2,MedGen:20,Orpha:D33", "MedGen:500", "na", "na"],
    ["OMIM:22,Orpha:D36,MedGen:34"]
]

for i in ids:
    for j in range(len(i)):
        result = re.findall(r"MedGen\s*:\s*\d+", i[j])
        if result:
            # keep only the first MedGen id found in this element
            i[j] = result[0]

print(ids)

All you have to do is iterate over every value in the nested lists and use re.findall to check whether it contains a MedGen id (MedGen, a colon, then one or more digits). If it does, replace the element with the match; if not, keep it as it is.

Regex

A quick summary of what r"MedGen\s*:\s*\d+" means - you're searching for MedGen, followed by zero or more whitespace characters (\s*), followed by a colon, followed by zero or more whitespace characters, followed by one or more digits (\d+). If something like this is found, the result list will contain the first match at index 0. Then we can set that element's value to the match itself.

If not found, we keep the element as it is.
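
The same idea can be sketched with re.search and a small helper, which avoids indexing by position. The names `MEDGEN`, `keep_medgen`, and `cleaned` are made up for this illustration; the pattern and data are the ones from the answer:

```python
import re

# Same pattern as in the answer, compiled once for reuse.
MEDGEN = re.compile(r"MedGen\s*:\s*\d+")

def keep_medgen(value):
    """Return the first MedGen id in value, or value unchanged if none."""
    match = MEDGEN.search(value)
    return match.group(0) if match else value

ids = [
    ["MedGen:100,OMIM:1,Orpha:D23", "na", "na"],
    ["na", "OMIM:2,MedGen:20,Orpha:D33", "MedGen:500", "na", "na"],
    ["OMIM:22,Orpha:D36,MedGen:34"],
]

# Build a cleaned copy instead of modifying ids in place.
cleaned = [[keep_medgen(v) for v in row] for row in ids]
print(cleaned)
# [['MedGen:100', 'na', 'na'], ['na', 'MedGen:20', 'MedGen:500', 'na', 'na'], ['MedGen:34']]
```

Note that the returned text is the match exactly as it appeared, so any whitespace around the colon is kept.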

How to remove a character from every string in a list, based on the position where a specific character occurs in the first member of said list?

Solution using zip()

>>> shortened = [*zip(*[t for t in zip(*list_strings) if t[0] != "-"])]
>>> shortened
[('A', 'C', 'T', 'G'), ('A', 'C', 'T', 'A'), ('A', 'G', 'G', 'A'), ('A', 'G', 'G', 'G')]
>>>
>>> new_strings = ["".join(t) for t in shortened]
>>> new_strings
['ACTG', 'ACTA', 'AGGA', 'AGGG']

So, there are plenty of ways to do this, but this particular method zips the gene strings together and filters out the tuples which start with a "-". Think of stacking the four gene strings on top of each other: zip() takes the "columns" of that stack:

>>> [*zip(*list_strings)]
[('A', 'A', 'A', 'A'), ('-', 'T', 'T', 'T'), ('C', 'C', 'G', 'G'), ('-', 'G', 'C', 'C'), ('T', 'T', 'G', 'G'), ('G', 'A', 'A', 'G'), ('-', 'G', 'T', 'T'), ('-', 'C', 'C', 'C')]

After removing the tuples that start with "-", the tuples are zipped back together the other way (think now of taking these tuples and stacking them vertically, then in the same way as before, zip() takes the columns of that stack). Finally, "".join() turns the tuples of characters into strings.
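
The transpose, filter, transpose-back steps can be wrapped in a function. This is only a sketch: `drop_gap_columns` and its `gap` parameter are made-up names, and it assumes all strings have the same length (zip() silently truncates to the shortest one).

```python
def drop_gap_columns(strings, gap="-"):
    """Drop every column where the first string has the gap character."""
    # zip(*strings) yields one tuple per column.
    kept = [col for col in zip(*strings) if col[0] != gap]
    # Zipping the kept columns transposes them back into rows.
    return ["".join(row) for row in zip(*kept)]

list_strings = ["A-C-TG--", "ATCGTAGC", "ATGCGATC", "ATGCGGTC"]
print(drop_gap_columns(list_strings))  # ['ACTG', 'ACTA', 'AGGA', 'AGGG']
```

One edge case to be aware of: if every column is dropped, the result is an empty list rather than a list of empty strings.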

"What am I doing wrong?"

To answer the question "what am I doing wrong?", I've added print statements to your code. Try running this and interpreting the output:

list_strings=["A-C-TG--","ATCGTAGC","ATGCGATC","ATGCGGTC"]
new_list_strings=[]
positions=[i for i, letter in enumerate(list_strings[0]) if letter == "-"]

for string in list_strings:
    print(f"string: {string}")
    for i in range(len(string)):
        print(f" i: {i}")
        for pos in positions:
            print(f" pos: {pos}")
            if i==pos:
                string2=string[:i]+string[i+1:]
                print(f" match! string2 result: {string2}")
    new_list_strings.append(string2)
    print()

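Notice that for each string, multiple string2 objects are created. Each one is built from the original string with only a single character removed, so the removals never accumulate.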

Solution using a plain-Jane accumulator pattern

The barebones accumulator pattern does work for this problem:

list_strings = ["A-C-TG--","ATCGTAGC","ATGCGATC","ATGCGGTC"]
positions = [i for i, letter in enumerate(list_strings[0]) if letter == "-"]

new_list_strings = []
for string in list_strings:
    new_str = ""
    # enumerate() supplies the index alongside each character
    for idx, char in enumerate(string):
        if idx not in positions:
            new_str += char
    new_list_strings.append(new_str)
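
The accumulator can also be condensed into a comprehension. A sketch under the same assumptions as above; `gap_positions` is a made-up name, and storing the positions in a set is a choice made here (not part of the original answer) so that each membership check is cheap:

```python
list_strings = ["A-C-TG--", "ATCGTAGC", "ATGCGATC", "ATGCGGTC"]

# A set makes each "is this index a gap?" check O(1) on average.
gap_positions = {i for i, letter in enumerate(list_strings[0]) if letter == "-"}

# Keep only the characters whose index is not a gap position.
new_list_strings = [
    "".join(char for idx, char in enumerate(string) if idx not in gap_positions)
    for string in list_strings
]
print(new_list_strings)  # ['ACTG', 'ACTA', 'AGGA', 'AGGG']
```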

Remove specific characters from String List - Python

This can be done much more simply by reading the file directly and storing its content in a variable, filtering out the unwanted characters along the way.

For example, here is the 'file1.txt' file with the content:

Hello how are you? Very good!

Then we can do the following:

def main():
    characters = '!?¿-.:;'

    with open('file1.txt') as f:
        # keep every character that is not in the unwanted set
        aux = ''.join(c for c in f.read() if c not in characters)

    # print(aux) # Hello how are you Very good

As we can see, aux holds the file's content without the unwanted characters, and it can easily be reshaped into whatever output format you need.

For example, if we want a list of words, we can do this:

def main():
    characters = '!?¿-.:;'

    with open('file1.txt') as f:
        aux = ''.join(c for c in f.read() if c not in characters)
        # split on whitespace to get a list of words
        aux = aux.split()

    # print(aux) # ['Hello', 'how', 'are', 'you', 'Very', 'good']
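
An alternative to the generator expression is str.translate, which deletes characters in a single pass. This is a sketch rather than part of the original answer; the `text` variable stands in for the result of f.read() so the example runs without a file:

```python
# Characters to drop, as in the examples above.
characters = "!?¿-.:;"

# str.maketrans with a third argument maps each listed character to None,
# which tells translate() to delete it.
table = str.maketrans("", "", characters)

text = "Hello how are you? Very good!"  # stands in for f.read()
aux = text.translate(table)

print(aux)          # Hello how are you Very good
print(aux.split())  # ['Hello', 'how', 'are', 'you', 'Very', 'good']
```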

How to remove characters from a string after a certain point within a list?

You can use a list comprehension:

original = ['0.04112243,0.04112243,right,4.11%', '0.12733313,0.05733313,right,12.73%', '0.09203131,0.02203131,right,9.2%']

new = [s[:10] for s in original]

Output:

['0.04112243', '0.12733313', '0.09203131']

You can also be a bit more flexible if you want to keep everything before the first comma:

new = [s.partition(',')[0] for s in original]
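
The fixed-width slice only works while every first field is exactly 10 characters long. A short sketch of where the two approaches diverge, using made-up rows (the `rows` list is an assumption for illustration):

```python
# Hypothetical rows: first fields of different lengths, and one without a comma.
rows = ["0.5,left,50%", "0.04112243,right,4.11%", "no-comma-here"]

# Fixed-width slice: always the first 10 characters.
by_slice = [s[:10] for s in rows]

# partition() splits at the first comma; with no comma it returns the whole string.
by_partition = [s.partition(",")[0] for s in rows]

print(by_slice)      # ['0.5,left,5', '0.04112243', 'no-comma-h']
print(by_partition)  # ['0.5', '0.04112243', 'no-comma-here']
```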

