Python Sum, Why Not Strings

Python tries to discourage you from "summing" strings. You're supposed to join them:

"".join(list_of_strings)

It's a lot faster, and uses much less memory.

A quick benchmark:

$ python -m timeit -s 'import operator; strings = ["a"]*10000' 'r = reduce(operator.add, strings)'
100 loops, best of 3: 8.46 msec per loop
$ python -m timeit -s 'import operator; strings = ["a"]*10000' 'r = "".join(strings)'
1000 loops, best of 3: 296 usec per loop
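The same comparison can be sketched as a script for Python 3, where reduce lives in functools rather than being a built-in. The function names below are illustrative, not part of the original benchmark, and the timings will vary by machine and interpreter version, so no expected numbers are given.

```python
import operator
import timeit
from functools import reduce

strings = ["a"] * 10000


def concat_with_reduce(parts):
    # Repeated +: every step produces a new intermediate string.
    return reduce(operator.add, parts, "")


def concat_with_join(parts):
    # join builds the final string in one go.
    return "".join(parts)


# Time 100 runs of each approach and print the totals.
reduce_time = timeit.timeit(lambda: concat_with_reduce(strings), number=100)
join_time = timeit.timeit(lambda: concat_with_join(strings), number=100)
print(f"reduce: {reduce_time:.4f}s  join: {join_time:.4f}s")
```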

Edit (to answer OP's edit): As to why strings were apparently "singled out", I believe it's simply a matter of optimizing for a common case and enforcing best practice: you can join strings much faster with ''.join, so explicitly forbidding strings in sum() points this out to newbies.

BTW, this restriction has been in place "forever", i.e., since sum() was added as a built-in function (rev. 32347).

Why Python builtin sum() function does not support strings?

Summing strings is very inefficient; summing strings in a loop requires that a new string is created for each two strings being concatenated, only to be destroyed again when the next string is concatenated with that result.

For example, for summing ['foo', 'bar', 'baz', 'spam', 'ham', 'eggs'] you'd create 'foobar', then 'foobarbaz', then 'foobarbazspam', then 'foobarbazspamham', then finally 'foobarbazspamhameggs', discarding all but the last string object.

You'd use the str.join() method instead:

''.join(str_list)

which creates one new string and copies in the contents of the constituent strings.
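To make the cost of repeated concatenation concrete, here is a small sketch that records every intermediate string a loop of + would build for the example list above. The helper name concat_steps is made up for illustration, and counting copied characters is a simplified model of the work involved, not a measurement of what the interpreter does.

```python
def concat_steps(parts):
    """Return every intermediate string that repeated + would build."""
    steps = []
    result = ""
    for part in parts:
        result = result + part  # a brand-new string each time
        steps.append(result)
    return steps


words = ['foo', 'bar', 'baz', 'spam', 'ham', 'eggs']
steps = concat_steps(words)

# Characters written across all intermediate strings vs. the final one.
chars_with_plus = sum(len(s) for s in steps)
chars_with_join = len(''.join(words))
print(steps)
print(chars_with_plus, chars_with_join)
```

In this simplified model the gap grows roughly quadratically with the number of pieces, which is the reason join is preferred.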

Note that sum() uses a default starting value of 0, which is why you get your specific exception message:

>>> 0 + ''
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: unsupported operand type(s) for +: 'int' and 'str'

You can give sum() a different starting value as the second argument; for strings that'll give you a more meaningful error message:

>>> sum(['foo', 'bar'], '')
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: sum() can't sum strings [use ''.join(seq) instead]

The function is otherwise not limited to just numbers; you can use it for any other type that defines __add__ operations, but you have to specify a sensible start value. You could 'sum' lists, for example:

>>> sum([['foo', 'bar'], ['ham', 'spam']], [])
['foo', 'bar', 'ham', 'spam']

but note the [] value for the second (start) argument! This is also just as inefficient as summing strings; the efficient method would be using list(itertools.chain.from_iterable(list_of_lists)).
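A short sketch of both ways to flatten a list of lists, showing that they produce the same result. The variable names are illustrative only.

```python
import itertools

list_of_lists = [['foo', 'bar'], ['ham', 'spam']]

# sum() with an explicit [] start value: works, but builds a new list per step.
flat_with_sum = sum(list_of_lists, [])

# chain.from_iterable walks the inner lists lazily; list() collects them once.
flat_with_chain = list(itertools.chain.from_iterable(list_of_lists))

print(flat_with_sum)
print(flat_with_chain)
```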

Python not summing (add) numbers, just sticking them together

In Python (and a lot of other languages), the + operator serves a dual purpose. It can be used to get the sum of two numbers (number + number), or to concatenate strings (string + string). Concatenate here means join together.

When you use raw_input (input in Python 3), you get back the user's input in the form of a string. Thus, doing fruits + beverages invokes the latter meaning of +, which is string concatenation.

To treat the user's input as a number, simply use the built-in int() function:

all_items = add(int(fruits), int(beverages))

int() here converts both strings to integers. Those numbers are then passed to add(). Keep in mind that unless you check that the user has actually entered a number, invalid input will raise a ValueError.
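One possible way to add that check is sketched below. The add() body is assumed (the question's own definition is not shown here), and parse_int and add_inputs are hypothetical helpers introduced only for this example. Returning None on bad input is one design choice among several; re-prompting the user would be another.

```python
def add(a, b):
    # Assumed definition; the original add() is not shown.
    return a + b


def parse_int(text):
    """Return text as an int, or None if it is not a valid integer."""
    try:
        return int(text)
    except ValueError:
        return None


def add_inputs(fruits, beverages):
    """Convert both inputs to int and add them; None if either is invalid."""
    a = parse_int(fruits)
    b = parse_int(beverages)
    if a is None or b is None:
        return None
    return add(a, b)


print(add("3", "4"))         # strings: concatenation
print(add_inputs("3", "4"))  # integers: addition
print(add_inputs("3", "four"))
```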

Sum ignoring strings in pandas dataframe

If you want to ignore the string values, then this will work:

for col in df.columns:
    df[col] = pd.to_numeric(df[col], errors='coerce')
df['sum'] = df.sum(axis=1)

Sum a python list without the string values

I would do something like this:

a = [1,2,3,4,'']
print(sum(x if not isinstance(x,str) else 0 for x in a))
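A variation on the same idea, sketched as a function that keeps only numeric items instead of replacing strings with 0. The name sum_numbers is hypothetical. Excluding bool is an assumption about what counts as a number here, since True and False are instances of int in Python.

```python
def sum_numbers(items):
    """Sum the int and float items, skipping everything else."""
    return sum(
        x for x in items
        if isinstance(x, (int, float)) and not isinstance(x, bool)
    )


a = [1, 2, 3, 4, '']
print(sum_numbers(a))
```

Unlike the string-only check, this version also skips other non-numeric values such as None or nested lists.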

sum of str and float number by using the type function not working?

You need to convert each value to float before summing:

print(sum(float(i) for i in [var1,var2,var3]))

Alternatively, you can collect the values in a list:

values = [str(2),float(3.0),str(5)]

When the types do not match, you need to convert the values to the same type before adding them. A list comprehension does this in one line:

float_values = [float(i) for i in values]  # list of all values as float
print(sum(float_values))

+ operator

When you use the + operator with strings, it concatenates them.

Ex.

x = '4' + '5'  # result will be '45'

When you use it with integers, it adds the values.

Ex.

x = 4 + 5  # result will be 9

Find the sum and average of the numbers within a string(/sentence), ignoring all the characters

If your numbers are positive integers, you can use the logic below to extract them:

input_str = "1time3 %times4"

numbers = ''.join((ch if ch in '0123456789' else ' ') for ch in input_str)
numbers_list = [int(i) for i in numbers.split()]

print(f"Extracted numbers: {numbers_list}")
print(f"Sum: {sum(numbers_list)}, Average: {sum(numbers_list)/ len(numbers_list)}")
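An alternative sketch uses a regular expression to pull out the digit runs, and guards against input that contains no numbers at all (the version above would divide by zero in that case). The function names are hypothetical, and returning None for the average of an empty input is just one possible convention.

```python
import re


def extract_numbers(text):
    """Return all runs of ASCII digits in text as a list of ints."""
    return [int(match) for match in re.findall(r"[0-9]+", text)]


def sum_and_average(text):
    """Return (sum, average) of the numbers in text; average is None if none found."""
    numbers = extract_numbers(text)
    if not numbers:
        return 0, None
    total = sum(numbers)
    return total, total / len(numbers)


input_str = "1time3 %times4"
total, average = sum_and_average(input_str)
print(f"Extracted numbers: {extract_numbers(input_str)}")
print(f"Sum: {total}, Average: {average}")
```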

sum of list of strings raises TypeError

The sum function takes a second argument - the initial accumulator value. When this is not provided, it is assumed to be 0. Thus, the first addition in your sum(a) is 0 + '0', producing the type error in question.

Instead you want:

a = ['0', 'a']
print(''.join(a)) # '0a'

If you try to use sum on strings, you will get an error saying to use ''.join(seq) instead.
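The three cases can be checked side by side with a short script. It captures the two error messages rather than letting them abort the run. The variable names are illustrative, and the exact message text may differ between Python versions.

```python
a = ['0', 'a']

# No start value: sum() begins with 0, so the first step is 0 + '0'.
try:
    sum(a)
except TypeError as exc:
    no_start_error = str(exc)

# Explicit '' start value: sum() refuses strings and points to join.
try:
    sum(a, '')
except TypeError as exc:
    with_start_error = str(exc)

joined = ''.join(a)

print(no_start_error)
print(with_start_error)
print(joined)
```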


