Iterating each character in a string using Python
As Johannes pointed out,
for c in "string":
    # do something with c
    print(c)
You can iterate over pretty much anything in Python using the for loop construct. For example, open("file.txt") returns a file object (and opens the file); iterating over it iterates over the lines in that file:
with open(filename) as f:
    for line in f:
        # do something with line
        print(line, end="")
If that seems like magic, well, it kind of is, but the idea behind it is really simple.
There's a simple iterator protocol that can be applied to any kind of object to make the for loop work on it.
Simply implement an iterator that defines a __next__() method (spelled next() in Python 2), and implement an __iter__ method on a class to make it iterable. (The __iter__ method, of course, should return an iterator object, that is, an object that defines __next__().)
See official documentation
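As a minimal sketch of the protocol described above, assuming Python 3 (where the method is spelled `__next__`): the class names `Countdown` and `CountdownIterator` are hypothetical and exist only for illustration.

```python
class CountdownIterator:
    """Iterator: produces start, start - 1, ..., 1."""

    def __init__(self, start):
        self.current = start

    def __iter__(self):
        # Iterators conventionally return themselves from __iter__.
        return self

    def __next__(self):
        if self.current <= 0:
            raise StopIteration  # tells the for loop to stop
        value = self.current
        self.current -= 1
        return value


class Countdown:
    """Iterable: hands out a fresh iterator on every __iter__ call."""

    def __init__(self, start):
        self.start = start

    def __iter__(self):
        return CountdownIterator(self.start)


result = [n for n in Countdown(3)]
print(result)  # [3, 2, 1]
```

Keeping the iterable and the iterator separate means the same `Countdown` object can be looped over more than once.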
How do I iterate over every character in a string and multiply it with the place in its string?
enumerate is convenient for getting an integer index and an element for each iteration.
def f(x):
    size = len(x)
    for i, char in enumerate(x):
        num = i + 1  # how many times to repeat this character
        if num == size:  # don't print - after the last character
            ending = ""
        else:
            ending = "-"
        print(num * char, end=ending)

f("string")
Your logic was only a bit off. If we didn't use enumerate and just indexed the string with integers:
def g(x):
    size = len(x)
    for i in range(size):
        num = i + 1  # how many times to repeat this character
        if num == size:  # don't print - after the last character
            ending = ""
        else:
            ending = "-"
        print(num * x[i], end=ending)
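The separator bookkeeping in both versions can also be left to `str.join`. The following is only a sketch of that alternative, not the original answer's method; the function name `h` is hypothetical, and it returns the string instead of printing it piece by piece.

```python
def h(x):
    # Repeat each character by its 1-based position, then join with "-".
    return "-".join(char * i for i, char in enumerate(x, start=1))


print(h("string"))  # s-tt-rrr-iiii-nnnnn-gggggg
```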
Iterate over character in string with index
.index() finds the first index of an element in the given list or string. If the element appears multiple times, only the first occurrence is returned.
If you want to get pairs of elements and their respective indexes, you can use the built-in function enumerate():
for idx, char in enumerate(message):
    print(char, idx)
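To make the difference concrete, here is a small sketch that assumes a sample value for `message` (the string "hello" is chosen only because it contains a repeated character):

```python
message = "hello"

# str.index() always reports the first occurrence, so both "l"s map to 2.
first_positions = [message.index(char) for char in message]

# enumerate() reports the actual position of each character.
actual_positions = [idx for idx, char in enumerate(message)]

print(first_positions)   # [0, 1, 2, 2, 4]
print(actual_positions)  # [0, 1, 2, 3, 4]
```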
how to iterate through a string and change a character in it in python
Assuming this is Python, you can do it by converting the string to a list (strings are immutable) and assigning the uppercased letter back at the same index. For example, to make every third letter uppercase in the string "enter name":
name = "enter name"
name_list = list(name)
for i in range(len(name_list)):
    if i % 3 == 0:
        name_list[i] = name_list[i].upper()
print(''.join(name_list))
Output:
EntEr NamE
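The same result can be built without a mutable intermediate list. This is a sketch of an alternative, using `enumerate` and a generator expression; the variable name `result` is introduced here for illustration.

```python
name = "enter name"

# Uppercase the characters at indexes 0, 3, 6, ... and keep the rest as-is.
result = "".join(
    char.upper() if i % 3 == 0 else char
    for i, char in enumerate(name)
)
print(result)  # EntEr NamE
```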
Speed up iterating over characters in long strings
Here is how I would do it.
Starting from a dataframe:
  symbol sequence  chromosome    start      end strand
0    XYZ  ATACAAG          12  9067664  9067671      -
I would explode the sequence, reindex to have all combinations of A/C/G/T, and keep only the rows where the variant differs from the initial base:
import numpy as np
df2 = df.assign(base=df['sequence'].apply(list)).explode('base').reset_index()
df2 = (df2.reindex(df2.index.repeat(4))
          .assign(variant=np.tile(list('ACGT'), len(df2)))
          .loc[lambda d: d['base'].ne(d['variant'])]
          .assign(var=lambda d: d['base']+'/'+d['variant'])
      )
Intermediate output:
>>> df2.head()
   index symbol sequence  chromosome    start      end strand base variant  var
0      0    XYZ  ATACAAG          12  9067664  9067671      -    A       C  A/C
0      0    XYZ  ATACAAG          12  9067664  9067671      -    A       G  A/G
0      0    XYZ  ATACAAG          12  9067664  9067671      -    A       T  A/T
1      0    XYZ  ATACAAG          12  9067664  9067671      -    T       A  T/A
1      0    XYZ  ATACAAG          12  9067664  9067671      -    T       C  T/C
Then export:
df2[['start', 'end', 'var', 'strand']].to_csv('variants.txt', sep='\t', index=False, header=None)
Example output (first lines):
9067664 9067671 A/C -
9067664 9067671 A/G -
9067664 9067671 A/T -
9067664 9067671 T/A -
9067664 9067671 T/C -
9067664 9067671 T/G -
9067664 9067671 A/C -
9067664 9067671 A/G -
9067664 9067671 A/T -
9067664 9067671 C/A -
Optimization
Now we remove everything that is not needed to keep the size minimal:
df2 = (df.drop(columns=['symbol', 'chromosome'])
         .assign(sequence=df['sequence'].apply(list))
         .explode('sequence').reset_index(drop=True)
      )
df2 = (df2.reindex(df2.index.repeat(4))
          .assign(var=np.tile(list('ACGT'), len(df2)))
          .loc[lambda d: d['sequence'].ne(d['var'])]
          .assign(var=lambda d: d['sequence']+'/'+d['var'])
      )
df2[['start', 'end', 'var', 'strand']].to_csv('variants.txt', sep='\t', index=False, header=None)
Trying to Iterate through strings to add certain characters to new string
You should use the in operator to compare a value against multiple other values:
if s[i] not in 'AEIOUaeiou' or s[i-1] == ' ':
    # if you prefer lists / tuples / sets of characters, you can use those instead:
    # if s[i] not in ['A', 'E', 'I', 'O', ...]
    answer_string += s[i]
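The surrounding loop is not shown above, so the following is only a guess at how the condition might be used. It assumes the goal is to drop vowels unless they start a word; the function name `strip_inner_vowels` and the sample input are hypothetical. It also treats index 0 explicitly, since `s[i-1]` at `i == 0` would otherwise look at the last character.

```python
def strip_inner_vowels(s):
    # Assumption: keep every non-vowel, plus any character that starts a word.
    answer_string = ""
    for i in range(len(s)):
        starts_word = i == 0 or s[i - 1] == ' '
        if s[i] not in 'AEIOUaeiou' or starts_word:
            answer_string += s[i]
    return answer_string


print(strip_inner_vowels("an example input"))  # an exmpl inpt
```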
Iterate over a string 2 (or n) characters at a time in Python
I don't know about cleaner, but there's another alternative:
for (op, code) in zip(s[0::2], s[1::2]):
    print(op, code)
A no-copy version (in Python 3, zip is already lazy, so Python 2's itertools.izip is not needed):
from itertools import islice
for (op, code) in zip(islice(s, 0, None, 2), islice(s, 1, None, 2)):
    print(op, code)
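For the general "n at a time" case from the question title, one possible sketch uses slicing with a step. The helper name `chunks` is hypothetical, and unlike the zip-based versions it keeps a shorter final piece when the length is not a multiple of n.

```python
def chunks(s, n):
    """Yield successive n-character pieces of s; the last may be shorter."""
    for start in range(0, len(s), n):
        yield s[start:start + n]


pieces = list(chunks("abcdefg", 3))
print(pieces)  # ['abc', 'def', 'g']
```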
How to iterate over each position of a character of a string, for a list of strings?
list_strings=["a-a--","-ab-b","a---a","b-b-a","aab-a"]
# if every string in `list_strings` is same length:
out = [v.count('-') for v in zip(*list_strings)]
print(out)
Prints:
[1, 3, 1, 5, 1]
If some strings are different length:
from itertools import zip_longest
out = [v.count('-') for v in zip_longest(*list_strings)]
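A short sketch of the unequal-length case, using a made-up list named `ragged`: `zip_longest` pads the missing positions with `None`, which never matches `'-'`, so short strings simply contribute nothing to the later columns.

```python
from itertools import zip_longest

ragged = ["a-a", "-", "--b-"]

# Missing positions are filled with None, which never equals '-'.
counts = [column.count('-') for column in zip_longest(*ragged)]
print(counts)  # [2, 2, 0, 1]
```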
Python: iterate through string but need to know index of current character
There are two ways to do this. Iterate the index i between 0 and the len of the str:
for i in range(len(word)):
    c = word[i]
Or use Python's enumerate function to do both at once:
for i, c in enumerate(word):
    ...
Behaviour of Python when attempting to iterate over characters in a string
You are telling for to expand each iteration value to assign to three separate variables:
for a,b,c in "cat":
#   ^^^^^ the target for the loop variable, 3 different names
However, iterating over a string produces one single-character string at a time, and you can't assign a single character to three variables:
>>> loopiterable = 'cat'
>>> loopiterable[0] # first element
'c'
>>> a, b, c = loopiterable[0]
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
ValueError: not enough values to unpack (expected 3, got 1)
The error message tells you why this didn't work; you can't take three values out of a string of length 1.
When you put the string into a list, you changed what you loop over. You now have a list with one element, so the loop iterates just once, and the value for the single iteration is the string 'cat'. That string just happens to have 3 characters, so it can be assigned to three variables:
>>> loopiterable = ['cat']
>>> loopiterable[0] # first element
'cat'
>>> a, b, c = loopiterable[0]
>>> a
'c'
>>> b
'a'
>>> c
't'
This still would fail if the contained string has a different number of characters:
>>> for a, b, c in ['cat', 'hamster']:
...     print(a, b, c)
...
c a t
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
ValueError: too many values to unpack (expected 3)
'hamster' is 7 characters, not 3; that's 4 too many.
The correct solution is to just use one variable for the loop target, not 3:
for character in 'cat':
    print(character)
Now you are printing each character separately:
>>> for character in 'cat':
...     print(character)
...
c
a
t
Now, if you wanted to pass all characters of a string to print() as separate arguments, just use * to expand the string to separate arguments to the call:
>>> my_pet = 'cat'
>>> print(*my_pet)
c a t