在 Dataframes 中将日期从字符串转换为日期格式

问题描述:

我正在尝试使用 to_date 函数将字符串格式的列转换为日期格式,但它返回的是 Null 值.

I am trying to convert a column which is in String format to Date format using the to_date function but its returning Null values.

df.createOrReplaceTempView("incidents")
spark.sql("select Date from incidents").show()

+----------+
|      Date|
+----------+
|08/26/2016|
|08/26/2016|
|08/26/2016|
|06/14/2016|

spark.sql("select to_date(Date) from incidents").show()

+---------------------------+
|to_date(CAST(Date AS DATE))|
 +---------------------------+
|                       null|
|                       null|
|                       null|
|                       null|

日期列是字符串格式:

 |-- Date: string (nullable = true)

使用 to_date 和 Java SimpleDateFormat.

Use to_date with Java SimpleDateFormat.

TO_DATE(CAST(UNIX_TIMESTAMP(date, 'MM/dd/yyyy') AS TIMESTAMP))

示例:

spark.sql("""
  SELECT TO_DATE(CAST(UNIX_TIMESTAMP('08/26/2016', 'MM/dd/yyyy') AS TIMESTAMP)) AS newdate"""
).show()

+----------+
|        dt|
+----------+
|2016-08-26|
+----------+