Spark SQL Converting String to Timestamp

PySpark SQL: Converting a string to a timestamp with a custom format

Try:

select date_format(to_timestamp('072022','MMyyyy'),'yyyy-MM-dd HH:mm:ss.SSS')

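Replace the literal '072022' with your column name. to_timestamp parses the string using the MMyyyy pattern, and date_format then renders the resulting timestamp as a string in the target layout.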
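To see the two steps (parse, then render) in isolation, here is a minimal sketch in plain Python rather than Spark. The function name reformat_month_year is hypothetical, and the strptime/strftime directives are Python's own, not Spark's pattern letters.

```python
from datetime import datetime

def reformat_month_year(value: str) -> str:
    """Parse 'MMyyyy' text and render it as 'yyyy-MM-dd HH:mm:ss.SSS'."""
    # No day or time is present in the input, so they default to day 1 and midnight.
    parsed = datetime.strptime(value, "%m%Y")
    # %f prints six microsecond digits; drop the last three to keep milliseconds.
    return parsed.strftime("%Y-%m-%d %H:%M:%S.%f")[:-3]

print(reformat_month_year("072022"))  # 2022-07-01 00:00:00.000
```

The point of the sketch is only that the input pattern describes the text you have, while the output pattern describes the text you want.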

How to convert a date string to a timestamp in Spark?

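You have not told Spark which values to convert. Use withColumn to add the new date column and point it at the values of the Date column. Because Date holds integers such as 20110813, cast it to a string and parse it with the yyyyMMdd pattern. Casting the integer to TimestampType would treat it as seconds since the epoch and produce a date in 1970.

import org.apache.spark.sql.functions.{col, to_date}
import org.apache.spark.sql.types._
import spark.implicits._ // already in scope in spark-shell; assumes a SparkSession named spark

val df = Seq(20110813, 20090724).toDF("Date")
// Cast the integer to a string, then parse it with the pattern that matches its layout.
val newDf = df.withColumn("to_date", to_date(col("Date").cast(StringType), "yyyyMMdd"))
newDf.show()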
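A quick way to see why an integer like 20110813 has to be parsed as text rather than cast as a number is to try both readings outside Spark. This is a plain-Python sketch, not Spark code; the variable names are illustrative.

```python
from datetime import date, datetime, timezone

raw = 20110813  # an integer that encodes a date as yyyyMMdd

# Parsing the digits as text recovers the intended calendar date.
parsed = datetime.strptime(str(raw), "%Y%m%d").date()

# Reading the same integer as seconds since the epoch lands in 1970.
as_epoch = datetime.fromtimestamp(raw, tz=timezone.utc).date()

print(parsed)    # 2011-08-13
print(as_epoch)  # 1970-08-21
```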

Spark fails to convert String to TIMESTAMP

The value carries nine fractional digits, but the SSS pattern expects three. Trim the six extra trailing zeros, then parse:

df.withColumn("new", to_timestamp($"date".substr(lit(1),length($"date") - 6), "yyyy-MM-dd HH:mm:ss.SSS")).show(false)

The result is:

+-----------------------------+-------------------+
|date                         |new                |
+-----------------------------+-------------------+
|2019-05-07 00:03:53.837000000|2019-05-07 00:03:53|
+-----------------------------+-------------------+

The schema:

root
|-- date: string (nullable = true)
|-- new: timestamp (nullable = true)
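The trimming step can be checked on its own with a small plain-Python sketch (not Spark). It assumes the input always ends in six surplus digits, as in the sample row; Python's %f directive accepts at most six fractional digits, so the untrimmed nine-digit string would not parse here either.

```python
from datetime import datetime

raw = "2019-05-07 00:03:53.837000000"  # nine fractional digits

# Keep everything except the last six characters, leaving millisecond precision.
trimmed = raw[: len(raw) - 6]

parsed = datetime.strptime(trimmed, "%Y-%m-%d %H:%M:%S.%f")

print(trimmed)  # 2019-05-07 00:03:53.837
print(parsed)   # 2019-05-07 00:03:53.837000
```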

How to convert string date into timestamp in pyspark?

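The string 06/21/2021 9:27 AM does not contain a seconds value, so remove :ss from the parse pattern. On Spark 3.x, whose parser is stricter, the single-digit hour may also be rejected by hh; h:mm a accepts both 9 and 09. For example: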

spark.sql("select from_unixtime(unix_timestamp('06/21/2021 9:27 AM', 'MM/dd/yyyy hh:mm a')) ts").show()

+-------------------+
|                 ts|
+-------------------+
|2021-06-21 09:27:00|
+-------------------+
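The rule that the pattern must list exactly the fields present in the text can be demonstrated with a plain-Python sketch (not Spark; the directives are Python's, and the flag name seconds_pattern_ok is illustrative).

```python
from datetime import datetime

raw = "06/21/2021 9:27 AM"

# The pattern lists only the fields present in the text: no seconds.
parsed = datetime.strptime(raw, "%m/%d/%Y %I:%M %p")
print(parsed)  # 2021-06-21 09:27:00

# A pattern that also asks for seconds does not match the input.
try:
    datetime.strptime(raw, "%m/%d/%Y %I:%M:%S %p")
    seconds_pattern_ok = True
except ValueError:
    seconds_pattern_ok = False
print(seconds_pattern_ok)  # False
```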

Spark scala convert string to timestamp (1147880044 - mm/dd/yyyy HH:mm:ss format)

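Use the from_unixtime and date_format functions. In the pattern, MM means month and mm means minute, so the target format is written MM/dd/yyyy HH:mm:ss.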

scala> val df = Seq(("1","296","5.0","1147880044","null"),("1","306","3.5","1147868817","null")).toDF("userId","movieId","rating","ts","ratingtimestamp")
df: org.apache.spark.sql.DataFrame = [userId: string, movieId: string ... 3 more fields]

scala> df.show(false)
+------+-------+------+----------+---------------+
|userId|movieId|rating|ts        |ratingtimestamp|
+------+-------+------+----------+---------------+
|1     |296    |5.0   |1147880044|null           |
|1     |306    |3.5   |1147868817|null           |
+------+-------+------+----------+---------------+
scala> df.withColumn("ratingtimestamp",date_format(from_unixtime($"ts"),"MM/dd/yyyy HH:mm:ss")).show(false)
+------+-------+------+----------+-------------------+
|userId|movieId|rating|ts        |ratingtimestamp    |
+------+-------+------+----------+-------------------+
|1     |296    |5.0   |1147880044|05/17/2006 21:04:04|
|1     |306    |3.5   |1147868817|05/17/2006 17:56:57|
+------+-------+------+----------+-------------------+
scala> df.withColumn("ratingtimestamp",from_unixtime($"ts","MM/dd/yyyy HH:mm:ss")).show(false)
+------+-------+------+----------+-------------------+
|userId|movieId|rating|ts        |ratingtimestamp    |
+------+-------+------+----------+-------------------+
|1     |296    |5.0   |1147880044|05/17/2006 21:04:04|
|1     |306    |3.5   |1147868817|05/17/2006 17:56:57|
+------+-------+------+----------+-------------------+
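from_unixtime renders the epoch value in the session time zone, so the same ts can print differently on another cluster. The plain-Python sketch below (not Spark) reproduces the arithmetic; the helper name format_epoch is hypothetical, and the UTC+05:30 offset is an inference from the numbers in the output above, not something the source states.

```python
from datetime import datetime, timedelta, timezone

def format_epoch(seconds: str, tz: timezone) -> str:
    """Render epoch seconds (given as text) as MM/dd/yyyy HH:mm:ss in tz."""
    moment = datetime.fromtimestamp(int(seconds), tz=tz)
    return moment.strftime("%m/%d/%Y %H:%M:%S")

utc = timezone.utc
plus_0530 = timezone(timedelta(hours=5, minutes=30))  # assumed session offset

print(format_epoch("1147880044", utc))        # 05/17/2006 15:34:04
print(format_epoch("1147880044", plus_0530))  # 05/17/2006 21:04:04
print(format_epoch("1147868817", plus_0530))  # 05/17/2006 17:56:57
```

If your results are shifted by a fixed number of hours compared with the ones shown, the session time zone is the first thing to check.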

