I have the below date time in string type. I want to convert that into UTC with offset
> spark = SparkSession.builder.appName("Test").enableHiveSupport().getOrCreate()
> print("Print statement-1")
> schema = StructType([
> StructField("author", StringType(), False),
> StructField("dt", StringType(), False)
> ])
>
> data = [
> ["author1", "2022-07-22T09:25:47.261Z"],
> ["author2", "2022-07-22T09:26:47.291Z"],
> ["author3", "2022-07-22T09:23:47.411Z"],
> ["author4", "2022-07-224T09:25:47.291Z"]
> ]
>
> df = spark.createDataFrame(data, schema)
> ```
I want to convert dt column as UTC with offset.
For example the first row value as 2022-07-22T09:25:47.2610000 +00:00
How to do that in pyspark and sparkSQL.
I can easily do that using regex_replace
bcoz Z is same as +00:00. But I am not sure that regexp_replace is correct of doing the conversion. Is there any method which can do the correct conversion rather than regex_replace?
You are in luck! Python has several modules devoted to handling date and time values, including the datetime and zoneinfo. I encourage you to read the documentation for those modules.
Some code using strptime(), like the following, might help (not tested):
from datetime import datetime as DT
utc_dt = "2022-07-22T09:25:47.261Z".replace("Z","UTC")
dt = DT.strptime(utc_dt, "%Y-%m-%dT%H:%M:%S.%f%Z")
# dt is a datetime instance with tzinfo set to timezone.UTC .
In general, when I am working on a certain kind of data or interaction, my first question is usually, “Does the Python standard library already have a way of handling this?” The place to look is The Python Standard Library — Python 3.10.6 documentation . My second question is, “has someone else written a package to handle this?” The place to look is https://pypi.org/ .
I hope this is helpful. Good luck!
—Jim DeLaHunt, Vancouver, Canada
Sorry, I can’t help you there. I don’t know anything about Spark. I do know that Python 2.7 is very old by now. It is no longer supported. If you have a way to move your project to Python 3 (preferably Python 3.10) you will perhaps save yourself some problems.