Iterating Each Character in a String Using Python

Iterating each character in a string using Python

As Johannes pointed out,

for c in "string":
    # do something with c

You can iterate over pretty much anything in Python using the for loop construct. For example, open("file.txt") returns a file object (and opens the file), and iterating over it yields the lines of that file:

with open(filename) as f:
    for line in f:
        # do something with line

If that seems like magic, well it kinda is, but the idea behind it is really simple.

There's a simple iterator protocol that can be applied to any kind of object to make the for loop work on it.

Simply implement an iterator that defines a __next__() method (called next() in Python 2), and implement an __iter__ method on a class to make it iterable. (The __iter__ method, of course, should return an iterator object, that is, an object that defines __next__().)
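
As a minimal sketch of that protocol in Python 3 (the class name `Countdown` is made up for illustration), an object becomes usable in a for loop once its `__iter__` returns something that has a `__next__` method:

```python
class Countdown:
    """Hypothetical example class: counts down from `start` to 1."""

    def __init__(self, start):
        self.current = start

    def __iter__(self):
        # The object acts as its own iterator, so it returns itself.
        return self

    def __next__(self):
        # Raising StopIteration tells the for loop to stop.
        if self.current <= 0:
            raise StopIteration
        value = self.current
        self.current -= 1
        return value


for n in Countdown(3):
    print(n)  # prints 3, then 2, then 1
```

Strings, lists and file objects work in for loops for the same reason: calling iter() on them returns an iterator.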

See official documentation

How do I iterate over every character in a string and multiply it with the place in its string?

enumerate is convenient for getting an integer index and an element for each iteration.

def f(x):
    size = len(x)
    for i, char in enumerate(x):
        num = i+1 # number of characters
        if num == size: # don't print - after last character
            ending = ""
        else:
            ending = "-"
        print(num*char, end = ending)

f("string")

Your logic was only a bit off. If we didn't use enumerate and just indexed a string with integers:

def g(x):
    size = len(x)
    for i in range(size):
        num = i+1 # number of characters
        if num == size: # don't print - after last character
            ending = ""
        else:
            ending = "-"
        print(num*x[i], end = ending)
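
The separator handling can also be left to str.join, which avoids tracking the last character by hand. A possible sketch, assuming the goal is output like s-tt-rrr-... (the function name `h` is just a placeholder):

```python
def h(x):
    # enumerate(x, 1) starts counting at 1, so the first character appears once
    return "-".join(char * i for i, char in enumerate(x, 1))

print(h("string"))  # s-tt-rrr-iiii-nnnnn-gggggg
```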

Iterate over character in string with index

.index() finds the first index of an element in the given sequence (a string or a list). If the element appears multiple times, only the index of the first occurrence is returned.

If you want to get pairs of elements and their respective indexes, you can use the built-in function enumerate():

for idx, char in enumerate(message):
    print(char, idx)
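
A small comparison of the two, using a made-up value for `message`:

```python
message = "banana"  # hypothetical example value

# str.index() reports only the first match
first = message.index("a")

# enumerate() lets you collect every matching position
positions = [idx for idx, char in enumerate(message) if char == "a"]

print(first)      # 1
print(positions)  # [1, 3, 5]
```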

how to iterate through a string and change a character in it in python

Assuming this is Python, you can do it by converting the string to a list (strings are immutable) and assigning the uppercase letter back at the lowercase letter's index. For example, to uppercase every third letter of the string "enter name":

name = "enter name"
name_list = list(name)

for i in range(len(name_list)):
    if i % 3 == 0:
        name_list[i] = name_list[i].upper()

print(''.join(name_list))

Output:

EntEr NamE
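
Because strings are immutable, another option is to build the new string in one pass rather than editing a list in place. A rough sketch that should give the same result (the helper name `upper_every_third` is invented here):

```python
def upper_every_third(text):
    # Uppercase the characters at index 0, 3, 6, ... and keep the rest unchanged
    return "".join(
        char.upper() if i % 3 == 0 else char
        for i, char in enumerate(text)
    )

print(upper_every_third("enter name"))  # EntEr NamE
```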

Speed up iterating over characters in long strings

Here is how I would do it.

Starting from a dataframe:

  symbol sequence  chromosome    start      end strand
0    XYZ  ATACAAG          12  9067664  9067671      -

I would explode the sequence into one row per base, repeat each row for all of A/C/G/T, and keep only the rows where the variant differs from the original base:

import numpy as np

df2 = df.assign(base=df['sequence'].apply(list)).explode('base').reset_index()
df2 = (df2.reindex(df2.index.repeat(4))
          .assign(variant=np.tile(list('ACGT'), len(df2)))
          .loc[lambda d: d['base'].ne(d['variant'])]
          .assign(var=lambda d: d['base']+'/'+d['variant'])
      )

Intermediate output:

>>> df2.head()
   index symbol sequence  chromosome    start      end strand base variant  var
0      0    XYZ  ATACAAG          12  9067664  9067671      -    A       C  A/C
0      0    XYZ  ATACAAG          12  9067664  9067671      -    A       G  A/G
0      0    XYZ  ATACAAG          12  9067664  9067671      -    A       T  A/T
1      0    XYZ  ATACAAG          12  9067664  9067671      -    T       A  T/A
1      0    XYZ  ATACAAG          12  9067664  9067671      -    T       C  T/C

Then export:

df2[['start', 'end', 'var', 'strand']].to_csv('variants.txt', sep='\t', index=False, header=None)

example output (first lines):

9067664 9067671 A/C -
9067664 9067671 A/G -
9067664 9067671 A/T -
9067664 9067671 T/A -
9067664 9067671 T/C -
9067664 9067671 T/G -
9067664 9067671 A/C -
9067664 9067671 A/G -
9067664 9067671 A/T -
9067664 9067671 C/A -
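
For a single sequence, the same idea can be written in plain Python, which may help when checking the pandas output on a small case. This is only an illustrative sketch (the function name `single_base_variants` is invented) and is not meant to be faster than the vectorized version:

```python
def single_base_variants(sequence, alphabet="ACGT"):
    # For every position, pair the original base with each different base
    return [
        f"{base}/{variant}"
        for base in sequence
        for variant in alphabet
        if variant != base
    ]

print(single_base_variants("AT"))
# ['A/C', 'A/G', 'A/T', 'T/A', 'T/C', 'T/G']
```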

optimization

Now we remove everything that is not needed to keep the size minimal:

df2 = (df.drop(columns=['symbol', 'chromosome'])
         .assign(sequence=df['sequence'].apply(list))
         .explode('sequence').reset_index(drop=True)
      )
df2 = (df2.reindex(df2.index.repeat(4))
          .assign(var=np.tile(list('ACGT'), len(df2)))
          .loc[lambda d: d['sequence'].ne(d['var'])]
          .assign(var=lambda d: d['sequence']+'/'+d['var'])
      )
df2[['start', 'end', 'var', 'strand']].to_csv('variants.txt', sep='\t', index=False, header=None)

Trying to Iterate through strings to add certain characters to new string

You should use the in operator to compare a value against several other values:

if s[i] not in 'AEIOUaeiou' or s[i-1] == ' ':
    # if you prefer lists / tuples / sets of characters, you can use those instead:
    # if s[i] not in ['A', 'E', 'I', 'O', ...]
    answer_string += s[i]
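
To put that condition in context, here is a small self-contained sketch. It assumes the goal is to drop vowels except at the start of a word. The function name `strip_inner_vowels` and the explicit `i == 0` check are additions for illustration; without that check, `s[i-1]` at `i == 0` would wrap around to the last character.

```python
VOWELS = "AEIOUaeiou"

def strip_inner_vowels(s):
    answer_string = ""
    for i in range(len(s)):
        # Assumption: a character starts a word if it is first or follows a space
        starts_word = i == 0 or s[i - 1] == " "
        # Keep every non-vowel; keep vowels only at word starts
        if s[i] not in VOWELS or starts_word:
            answer_string += s[i]
    return answer_string

print(strip_inner_vowels("an example of input"))  # an exmpl of inpt
```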

Iterate over a string 2 (or n) characters at a time in Python

I don't know about cleaner, but there's another alternative:

for op, code in zip(s[0::2], s[1::2]):
    print(op, code)

A no-copy version:

# Python 3: the built-in zip is already lazy (it was itertools.izip in Python 2)
from itertools import islice
for op, code in zip(islice(s, 0, None, 2), islice(s, 1, None, 2)):
    print(op, code)
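
For the general case of n characters at a time, slicing in steps of n is one straightforward option. A sketch, with `chunks` as a made-up helper name; note that the last piece is shorter when the length is not a multiple of n:

```python
def chunks(s, n):
    # Yield consecutive slices of length n; the final slice may be shorter
    for start in range(0, len(s), n):
        yield s[start:start + n]

print(list(chunks("abcdefg", 3)))  # ['abc', 'def', 'g']
```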

How to iterate over each position of a character of a string, for a list of strings?

list_strings=["a-a--","-ab-b","a---a","b-b-a","aab-a"]

# if every string in `list_strings` is same length:
out = [v.count('-') for v in zip(*list_strings)]
print(out)

Prints:

[1, 3, 1, 5, 1]

If the strings have different lengths:

from itertools import zip_longest
out = [v.count('-') for v in zip_longest(*list_strings)]
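
To see what happens with unequal lengths: zip_longest pads the shorter strings with None by default, and count('-') simply ignores those entries. A quick check with a made-up input list:

```python
from itertools import zip_longest

uneven = ["a-a--", "-ab", "a---a"]  # hypothetical input with one short string

# Each v is a tuple of the characters at one position; missing ones are None
out = [v.count('-') for v in zip_longest(*uneven)]
print(out)  # [1, 2, 1, 2, 1]
```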

Python: iterate through string but need to know index of current character

There are two ways to do this. Iterate the index i over range(len(word)):

for i in range(len(word)):
    c = word[i]

Or use Python's enumerate function to do both at once:

for i, c in enumerate(word):
    ...

Behaviour of Python when attempting to iterate over characters in a string

You are telling for to expand each iteration value to assign to three separate variables:

for a,b,c in "cat":
#   ^^^^^ the target for the loop variable, 3 different names

However, iterating over a string produces one-character strings, and you can't unpack a single character into three variables:

>>> loopiterable = 'cat'
>>> loopiterable[0] # first element
'c'
>>> a, b, c = loopiterable[0]
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
ValueError: not enough values to unpack (expected 3, got 1)

The error message tells you why this didn't work; you can't take three values out of a string of length 1.

When you put the string into a list, you changed what you loop over. You now have a list with one element, so the loop iterates just once, and the value for the single iteration is the string 'cat'. That string just happens to have 3 characters, so can be assigned to three variables:

>>> loopiterable = ['cat']
>>> loopiterable[0] # first element
'cat'
>>> a, b, c = loopiterable[0]
>>> a
'c'
>>> b
'a'
>>> c
't'

This still would fail if the contained string has a different number of characters:

>>> for a, b, c in ['cat', 'hamster']:
...     print(a, b, c)
...
c a t
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
ValueError: too many values to unpack (expected 3)

'hamster' is 7 characters, not 3; that's 4 too many.

The correct solution is to just use one variable for the loop target, not 3:

for character in 'cat':
    print(character)

Now you are printing each character separately:

>>> for character in 'cat':
...     print(character)
...
c
a
t

Now, if you wanted to pass all characters of a string to print() as separate arguments, just use * to expand the string to separate arguments to the call:

>>> my_pet = 'cat'
>>> print(*my_pet)
c a t

