Why Does '.Sort()' Cause the List to Be 'None' in Python

Why does '.sort()' cause the list to be 'None' in Python?

Simply remove the assignment from

result = result.sort()

leaving just

result.sort()

The sort method works in-place (it modifies the existing list), so it returns None. When you assign its result to the name of the list, you're assigning None. So no assignment is necessary.


But in any case, what you're trying to accomplish can easily (and more efficiently) be written as a one-liner:

max(len(Ancestors(T,x)) for x in OrdLeaves(T))

max operates in linear time, O(n), while sorting is O(nlogn). You also don't need nested list comprehensions, a single generator expression will do.

Why does return list.sort() return None, not the list?

list.sort sorts the list in place, i.e. it doesn't return a new list. Just write

newList.sort()
return newList

Python .sort() on a list of a set returns None

Yes, list.sort method sorts the list in place and returns None. If you want to return the sorted list use sorted method.

>>> lst=[5, 2, 1, 4, 3]
>>> lst.sort()
>>> lst
[1, 2, 3, 4, 5]
>>> lst=[5, 2, 1, 4, 3]
>>> lst=sorted(lst)
>>> lst
[1, 2, 3, 4, 5]
>>>

So you will have to use: common_elements = sorted(list(set.union(*x))) or you can sort in place like this:

common_elements = list(set.union(*x))
common_elements.sort()

Why do I get AttributeError: 'NoneType' object has no attribute 'something'?

NoneType means that instead of an instance of whatever Class or Object you think you're working with, you've actually got None. That usually means that an assignment or function call up above failed or returned an unexpected result.

Sorting list by an attribute that can be None

The ordering comparison operators are stricter about types in Python 3, as described here:

The ordering comparison operators (<, <=, >=, >) raise a TypeError
exception when the operands don’t have a meaningful natural ordering.

Python 2 sorts None before any string (even empty string):

>>> None < None
False

>>> None < "abc"
True

>>> None < ""
True

In Python 3 any attempts at ordering NoneType instances result in an exception:

>>> None < "abc"
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
TypeError: unorderable types: NoneType() < str()

The quickest fix I can think of is to explicitly map None instances into something orderable like "":

my_list_sortable = [(x or "") for x in my_list]

If you want to sort your data while keeping it intact, just give sort a customized key method:

def nonesorter(a):
if not a:
return ""
return a

my_list.sort(key=nonesorter)

decompose() for time series: ValueError: You must specify a period or x must be a pandas object with a DatetimeIndex with a freq not set to None

Having the same ValueError, this is just the result of some testing and little research on my own, without the claim to be complete or professional about it. Please comment or answer whoever finds something wrong.

Of course, your data should be in the right order of the index values, which you would assure with df.sort_index(inplace=True), as you state it in your answer. This is not wrong as such, though the error message is not about the sort order, and I have checked this: the error does not go away in my case when I sort the index of a huge dataset I have at hand. It is true, I also have to sort the df.index, but the decompose() can handle unsorted data as well where items jump here and there in time: then you simply get a lot of blue lines from left to the right and back, until the whole graph is full of it. What is more, usually, the sorting is already in the right order anyway. In my case, sorting does not help fixing the error. Thus I also doubt that index sorting has fixed the error in your case, because: what does the error actually say?

ValueError: You must specify:

  1. [either] a period
  2. or x must be a pandas object with a DatetimeIndex with a freq not set to None

Before all, in case you have a list column so that your time series is nested up to now, see Convert pandas df with data in a "list column" into a time series in long format. Use three columns: [list of data] + [timestamp] + [duration] for details how to unnest a list column. This would be needed for both 1.) and 2.).

Details of 1.: "You must specify [either] a period ..."

Definition of period

"period, int, optional" from https://www.statsmodels.org/stable/generated/statsmodels.tsa.seasonal.seasonal_decompose.html:

Period of the series. Must be used if x is not a pandas object or if
the index of x does not have a frequency. Overrides default
periodicity of x if x is a pandas object with a timeseries index.

The period parameter that is set with an integer means the number of cycles which you expect to be in the data. If you have a df with 1000 rows with a list column in it (call it df_nested), and each list with for example 100 elements, then you will have 100 elements per cycle. It is probably smart taking period = len(df_nested) (= number of cycles) in order to get the best split of seasonality and trend. If your elements per cycle vary over time, other values may be better.
I am not sure about how to rightly set the parameter, therefore the question statsmodels seasonal_decompose(): What is the right “period of the series” in the context of a list column (constant vs. varying number of items) on Cross Validated which is not yet answered.

The "period" parameter of option 1.) has a big advantage over option 2.). Though it uses the time index (DatetimeIndex) for its x-axis, it does not require an item to hit the frequency exactly, in contrast to option 2.). Instead, it just joins together whatever is in a row, with the advantage that you do not need to fill any gaps: the last value of the previous event is just joined with the next value of the following event, whether it is already in the next second or on the next day.

What is the max possible "period" value? In case you have a list column (call the df "df_nested" again), you should first unnest the list column to a normal column. The max period is len(df_unnested)/2.

Example1: 20 items in x (x is the amount of all items of df_unnested) can maximally have a period = 10.

Example2: Having the 20 items and taking period=20 instead, this throws the following error:

ValueError: x must have 2 complete cycles requires 40
observations. x only has 20 observation(s)

Another side-note:
To get rid of the error in question, period = 1 should already take it away, but for time series analysis, "=1" does not reveal anything new, every cycle is just 1 item then, the trend is the same as the original data, the seasonality is 0, and the residuals are always 0.

####

Example borrowed from Convert pandas df with data in a "list column" into a time series in long format. Use three columns: [list of data] + [timestamp] + [duration]

df_test = pd.DataFrame({'timestamp': [1462352000000000000, 1462352100000000000, 1462352200000000000, 1462352300000000000],
'listData': [[1,2,1,9], [2,2,3,0], [1,3,3,0], [1,1,3,9]],
'duration_sec': [3.0, 3.0, 3.0, 3.0]})
tdi = pd.DatetimeIndex(df_test.timestamp)
df_test.set_index(tdi, inplace=True)
df_test.drop(columns='timestamp', inplace=True)
df_test.index.name = 'datetimeindex'

df_test = df_test.explode('listData')
sizes = df_test.groupby(level=0)['listData'].transform('size').sub(1)
duration = df_test['duration_sec'].div(sizes)
df_test.index += pd.to_timedelta(df_test.groupby(level=0).cumcount() * duration, unit='s')

The resulting df_test['listData'] looks as follows:

2016-05-04 08:53:20    1
2016-05-04 08:53:21 2
2016-05-04 08:53:22 1
2016-05-04 08:53:23 9
2016-05-04 08:55:00 2
2016-05-04 08:55:01 2
2016-05-04 08:55:02 3
2016-05-04 08:55:03 0
2016-05-04 08:56:40 1
2016-05-04 08:56:41 3
2016-05-04 08:56:42 3
2016-05-04 08:56:43 0
2016-05-04 08:58:20 1
2016-05-04 08:58:21 1
2016-05-04 08:58:22 3
2016-05-04 08:58:23 9

Now have a look at different period's integer values.

period = 1:

result_add = seasonal_decompose(x=df_test['listData'], model='additive', extrapolate_trend='freq', period=1)
plt.rcParams.update({'figure.figsize': (5,5)})
result_add.plot().suptitle('Additive Decompose', fontsize=22)
plt.show()

Sample Image

period = 2:

result_add = seasonal_decompose(x=df_test['listData'], model='additive', extrapolate_trend='freq', period=2)
plt.rcParams.update({'figure.figsize': (5,5)})
result_add.plot().suptitle('Additive Decompose', fontsize=22)
plt.show()

Sample Image

If you take a quarter of all items as one cycle which is 4 (out of 16 items) here.

period = 4:

result_add = seasonal_decompose(x=df_test['listData'], model='additive', extrapolate_trend='freq', period=int(len(df_test)/4))
plt.rcParams.update({'figure.figsize': (5,5)})
result_add.plot().suptitle('Additive Decompose', fontsize=22)
plt.show()

Sample Image

Or if you take the max possible size of a cycle which is 8 (out of 16 items) here.

period = 8:

result_add = seasonal_decompose(x=df_test['listData'], model='additive', extrapolate_trend='freq', period=int(len(df_test)/2))
plt.rcParams.update({'figure.figsize': (5,5)})
result_add.plot().suptitle('Additive Decompose', fontsize=22)
plt.show()

Sample Image

Have a look at how the y-axes change their scale.

####

You will increase the period integer according to your needs. The max in your case of the question:

sm.tsa.seasonal_decompose(df, model = 'additive', period = int(len(df)/2))

Details of 2.: "... or x must be a pandas object with a DatetimeIndex with a freq not set to None"

To get x to be a DatetimeIndex with a freq not set to None, you need to assign the freq of the DatetimeIndex using .asfreq('?') with ? being your choice among a wide range of offset aliases from https://pandas.pydata.org/pandas-docs/stable/user_guide/timeseries.html#offset-aliases.

In your case, this option 2. is the better suited as you seem to have a list without gaps. Your monthly data then should probably be introduced as "month start frequency" --> "MS" as offset alias:

sm.tsa.seasonal_decompose(df.asfreq('MS'), model = 'additive')

See How to set frequency with pd.to_datetime()? for more details, also about how you would deal with gaps.

If you have data that is highly scattered in time so that you have too many gaps to fill or if gaps in time are nothing important, option 1 of using "period" is probably the better choice.

In my example case of df_test, option 2. is not good. The data is totally scattered in time, and if I take a second as the frequency, you get this:

Output of df_test.asfreq('s') (=frequency in seconds):

2016-05-04 08:53:20      1
2016-05-04 08:53:21 2
2016-05-04 08:53:22 1
2016-05-04 08:53:23 9
2016-05-04 08:53:24 NaN
...
2016-05-04 08:58:19 NaN
2016-05-04 08:58:20 1
2016-05-04 08:58:21 1
2016-05-04 08:58:22 3
2016-05-04 08:58:23 9
Freq: S, Name: listData, Length: 304, dtype: object

You see here that although my data is only 16 rows, introducing a frequency in seconds forces the df to be 304 rows only to reach out from "08:53:20" till "08:58:23", 288 gaps are caused here. What is more, here you have to hit the exact time. If you have 0.1 or even 0.12314 seconds as your real frequency instead, you will not hit most of the items with your index.

Here an example with min as the offset alias, df_test.asfreq('min'):

2016-05-04 08:53:20      1
2016-05-04 08:54:20 NaN
2016-05-04 08:55:20 NaN
2016-05-04 08:56:20 NaN
2016-05-04 08:57:20 NaN
2016-05-04 08:58:20 1

We see that only the first and the last minute are filled at all, the rest is not hit.

Taking the day as as the offset alias, df_test.asfreq('d'):

2016-05-04 08:53:20    1

We see that you get only the first row as the resulting df, since there is only one day covered. It will give you the first item found, the rest is dropped.

The end of it all

Putting together all of this, in your case, take option 2., while in my example case of df_test, option 1 is needed.

Converting a list to a set changes element order

  1. A set is an unordered data structure, so it does not preserve the insertion order.

  2. This depends on your requirements. If you have an normal list, and want to remove some set of elements while preserving the order of the list, you can do this with a list comprehension:

    >>> a = [1, 2, 20, 6, 210]
    >>> b = set([6, 20, 1])
    >>> [x for x in a if x not in b]
    [2, 210]

    If you need a data structure that supports both fast membership tests and preservation of insertion order, you can use the keys of a Python dictionary, which starting from Python 3.7 is guaranteed to preserve the insertion order:

    >>> a = dict.fromkeys([1, 2, 20, 6, 210])
    >>> b = dict.fromkeys([6, 20, 1])
    >>> dict.fromkeys(x for x in a if x not in b)
    {2: None, 210: None}

    b doesn't really need to be ordered here – you could use a set as well. Note that a.keys() - b.keys() returns the set difference as a set, so it won't preserve the insertion order.

    In older versions of Python, you can use collections.OrderedDict instead:

    >>> a = collections.OrderedDict.fromkeys([1, 2, 20, 6, 210])
    >>> b = collections.OrderedDict.fromkeys([6, 20, 1])
    >>> collections.OrderedDict.fromkeys(x for x in a if x not in b)
    OrderedDict([(2, None), (210, None)])

What is the use of assert in Python?

The assert statement exists in almost every programming language. It has two main uses:

  1. It helps detect problems early in your program, where the cause is clear, rather than later when some other operation fails. A type error in Python, for example, can go through several layers of code before actually raising an Exception if not caught early on.

  2. It works as documentation for other developers reading the code, who see the assert and can confidently say that its condition holds from now on.

When you do...

assert condition

... you're telling the program to test that condition, and immediately trigger an error if the condition is false.

In Python, it's roughly equivalent to this:

if not condition:
raise AssertionError()

Try it in the Python shell:

>>> assert True # nothing happens
>>> assert False
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
AssertionError

Assertions can include an optional message, and you can disable them when running the interpreter.

To print a message if the assertion fails:

assert False, "Oh no! This assertion failed!"

Do not use parenthesis to call assert like a function. It is a statement. If you do assert(condition, message) you'll be running the assert with a (condition, message) tuple as first parameter.

As for disabling them, when running python in optimized mode, where __debug__ is False, assert statements will be ignored. Just pass the -O flag:

python -O script.py

See here for the relevant documentation.

TypeError: 'list' object is not callable in python

Seems like you've shadowed the builtin name list, which points at a class, by the same name pointing at an instance of it. Here is an example:

>>> example = list('easyhoss')  # here `list` refers to the builtin class
>>> list = list('abc') # we create a variable `list` referencing an instance of `list`
>>> example = list('easyhoss') # here `list` refers to the instance
Traceback (most recent call last):
File "<string>", line 1, in <module>
TypeError: 'list' object is not callable

I believe this is fairly obvious. Python stores object names (functions and classes are objects, too) in namespaces (which are implemented as dictionaries), hence you can rewrite pretty much any name in any scope. It won't show up as an error of some sort. As you might know, Python emphasizes that "special cases aren't special enough to break the rules". And there are two major rules behind the problem you've faced:

  1. Namespaces. Python supports nested namespaces. Theoretically you can endlessly nest them. As I've already mentioned, they are basically dictionaries of names and references to corresponding objects. Any module you create gets its own "global" namespace, though in fact it's just a local namespace with respect to that particular module.

  2. Scoping. When you reference a name, the Python runtime looks it up in the local namespace (with respect to the reference) and, if such name does not exist, it repeats the attempt in a higher-level namespace. This process continues until there are no higher namespaces left. In that case you get a NameError. Builtin functions and classes reside in a special high-order namespace __builtins__. If you declare a variable named list in your module's global namespace, the interpreter will never search for that name in a higher-level namespace (that is __builtins__). Similarly, suppose you create a variable var inside a function in your module, and another variable var in the module. Then, if you reference var inside the function, you will never get the global var, because there is a var in the local namespace - the interpreter has no need to search it elsewhere.

Here is a simple illustration.

>>> example = list("abc")  # Works fine
>>>
>>> # Creating name "list" in the global namespace of the module
>>> list = list("abc")
>>>
>>> example = list("abc")
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
TypeError: 'list' object is not callable
>>> # Python looks for "list" and finds it in the global namespace,
>>> # but it's not the proper "list".
>>>
>>> # Let's remove "list" from the global namespace
>>> del list
>>> # Since there is no "list" in the global namespace of the module,
>>> # Python goes to a higher-level namespace to find the name.
>>> example = list("abc") # It works.

So, as you see there is nothing special about Python builtins. And your case is a mere example of universal rules. You'd better use an IDE (e.g. a free version of PyCharm, or Atom with Python plugins) that highlights name shadowing to avoid such errors.

You might as well be wondering what is a "callable", in which case you can read this post. list, being a class, is callable. Calling a class triggers instance construction and initialisation. An instance might as well be callable, but list instances are not. If you are even more puzzled by the distinction between classes and instances, then you might want to read the documentation (quite conveniently, the same page covers namespaces and scoping).

If you want to know more about builtins, please read the answer by Christian Dean.

P.S. When you start an interactive Python session, you create a temporary module.



Related Topics



Leave a reply



Submit