Convert string price value to double type in pyspark
First, remove the dots (the thousands separators) from the European-format price string and replace the decimal comma with a dot. Then you can cast the result to double type.
Try this:
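from pyspark.sql.functions import col, regexp_replace

df = spark.createDataFrame([("-3.104,15",), ("-3.104,15",)], ['price_european_format'])
# Strip the thousands dots, turn the decimal comma into a dot, then cast to double
df.withColumn("price_double", regexp_replace(regexp_replace(
    col("price_european_format"), '\\.', ''), ',', '.').cast("double"))\
    .show()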
Gives:
+---------------------+------------+
|price_european_format|price_double|
+---------------------+------------+
|            -3.104,15|    -3104.15|
|            -3.104,15|    -3104.15|
+---------------------+------------+
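As a quick sanity check of the two replacements outside Spark, the same steps can be sketched in plain Python. The helper name european_to_float is hypothetical and not part of the original answer; it only mirrors the logic of the nested regexp_replace calls.

```python
def european_to_float(text: str) -> float:
    """Parse a number such as '-3.104,15' (dot = thousands, comma = decimal)."""
    # Step 1: drop the thousands separators.
    without_thousands = text.replace(".", "")
    # Step 2: turn the decimal comma into a decimal dot.
    with_decimal_dot = without_thousands.replace(",", ".")
    # Step 3: convert, which corresponds to the cast to double in Spark.
    return float(with_decimal_dot)


value = european_to_float("-3.104,15")
print(value)
```

The order matters: if the comma were replaced first, the new decimal dot would be removed together with the thousands separators.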
PySpark: How to transform data from string to date (or integer) in an easy-to-read manner
Just prefix each column name with an index, like this: 01Jan20, 02Feb20, ..., 10Oct20, ... Don't forget the leading zeros; you might need more than one, depending on the number of columns you have.
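A minimal sketch of that renaming in plain Python, assuming the column names are already in the desired order. The function name prefix_with_index and the sample names are hypothetical; the padding width is derived from the number of columns so that the prefixed names sort correctly as strings.

```python
def prefix_with_index(names):
    """Prefix each name with a 1-based, zero-padded position."""
    # Enough digits to hold the largest index, e.g. 2 digits for 10..99 columns.
    width = len(str(len(names)))
    return [f"{i:0{width}d}{name}" for i, name in enumerate(names, start=1)]


# Hypothetical month-style column names, in chronological order.
columns = ["Jan20", "Feb20", "Mar20", "Apr20", "May20",
           "Jun20", "Jul20", "Aug20", "Sep20", "Oct20"]
prefixed = prefix_with_index(columns)
print(prefixed)
```

If the rest of the columns are renamed the same way, the resulting list could then be passed to something like df.toDF(*new_names) to rename the DataFrame columns.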
Converting double datatype column to binary and returning sum of digits in new column PySpark Dataframe
Let us use bin to convert the column values into their binary string representation, then replace the 0's with an empty string and take the length of the resulting string to count the number of 1's:
from pyspark.sql import functions as F
df.select(*[F.length(F.regexp_replace(F.bin(c), '0', '')).alias(c) for c in df.columns])
+-----+-----+-----+-----+-----+-----+
|bit_1|bit_2|bit_3|bit_4|bit_5|bit_6|
+-----+-----+-----+-----+-----+-----+
|    0|    1|    1|    0|    0|    0|
|    3|    0|    1|    1|    0|    0|
|    2|    0|    0|    0|    1|    2|
|    2|    4|    4|    0|    0|    0|
|    2|    0|    2|    0|    0|    0|
|    6|    2|    0|    0|    0|    0|
|    4|    2|    3|    1|    0|    0|
|    2|    3|    4|    3|    5|    4|
+-----+-----+-----+-----+-----+-----+
+-----+-----+-----+-----+-----+-----+
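The counting trick can be checked in plain Python on a single value. This is only a sketch for non-negative integers; the function name count_ones is hypothetical, and Python's built-in bin stands in for the Spark function of the same name.

```python
def count_ones(n: int) -> int:
    """Count the 1 bits of a non-negative integer via its binary string."""
    # bin(5) returns '0b101'; drop the '0b' prefix to keep only the digits.
    binary = bin(n)[2:]
    # Removing every '0' leaves only the '1' characters, so the length is the count.
    return len(binary.replace("0", ""))


for number in (0, 5, 7, 63):
    print(number, count_ones(number))
```

The same result is available more directly as bin(n).count("1"); the replace-and-length form is shown because it matches the column expression step by step.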
PySpark: creating aggregated columns out of a string type column different values
Group by topic and emotion and sum the counts, then group by topic and pivot the outcome on emotion:
from pyspark.sql import functions as F
df.groupby('topic','emotion').agg(F.sum('counts').alias('counts')).groupby('topic').pivot('emotion').agg(F.first('counts')).na.fill(0).show()
+-----+----+---+-------+--------+
|topic|fear|joy|sadness|surprise|
+-----+----+---+-------+--------+
|  dog|   0|  0|      4|      13|
|  cat|   0|  2|      0|       1|
| bird|   3|  0|      0|       0|
+-----+----+---+-------+--------+
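To make the two steps explicit, here is a plain-Python sketch of the same sum-then-pivot logic. The input rows are made up for illustration (the original input is not shown); they were chosen only so that the totals agree with the table above.

```python
from collections import defaultdict

# Hypothetical (topic, emotion, counts) rows; not the original data.
rows = [
    ("dog", "sadness", 4),
    ("dog", "surprise", 6),
    ("dog", "surprise", 7),
    ("cat", "joy", 2),
    ("cat", "surprise", 1),
    ("bird", "fear", 3),
]

# Step 1: group by (topic, emotion) and sum the counts.
totals = defaultdict(int)
for topic, emotion, counts in rows:
    totals[(topic, emotion)] += counts

# Step 2: pivot, so each distinct emotion becomes a column and gaps are filled with 0.
emotions = sorted({emotion for _, emotion, _ in rows})
topics = list(dict.fromkeys(topic for topic, _, _ in rows))
pivoted = {
    topic: {emotion: totals.get((topic, emotion), 0) for emotion in emotions}
    for topic in topics
}

for topic, columns in pivoted.items():
    print(topic, columns)
```

Filling the missing combinations with 0 corresponds to the na.fill(0) call at the end of the Spark chain.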