ObjectNotFound: (transaction_type.py:String) [], CommandNotFoundException

Please assist. I am getting the above error.

##import required libraries
import pyspark

##create spark session
spark = pyspark.sql.SparkSession \
   .builder \
   .appName("Python Spark SQL basic example") \
   .config('spark.driver.extraClassPath', "C:\Users\Anthony.DESKTOP-ES5HL78\Downloads\sqljdbc_12.2.0.0_enu\sqljdbc_12.2\enu\mssql-jdbc-12.2.0.jre8.jar") \
   .getOrCreate()


##read table from db using spark jdbc
movies_df = spark.read \
   .format("jdbc") \
   .option("url", "jdbc:sqlserver://DESKTOP-ES5HL78/kazang") \
   .option("dbtable", "transaction_type") \
   .option("user", "anthony") \
   .option("password", "Musicbook2023...") \
   .option("driver", "com.microsoft.sqlserve.Driver") \
   .load()

##print the movies_df
print(transaction_type.df.show())


At line:1 char:1
+ transaction_type.py
    + CategoryInfo          : ObjectNotFound: (transaction_type.py:String) [], CommandNotFoundException
    + FullyQualifiedErrorId : CommandNotFoundException



Tip: if you use () around multiline code, you do not need to use \ at the end of each line.

You have a Windows path in which the \ characters need to be escaped.
You can use / instead of \ in the path.
You can double up the \ as \\.
You can use r'dir\name', which does not need you to double the \.

Also check that jdbc is on your PATH.
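To illustrate those three options, here is a minimal sketch (the folder names below are placeholders for illustration, not your real driver path):

```python
# Three equivalent ways to spell the same Windows path in Python.
p_forward = "C:/jars/mssql-jdbc.jar"      # forward slashes also work on Windows
p_doubled = "C:\\jars\\mssql-jdbc.jar"    # each backslash escaped as \\
p_raw     = r"C:\jars\mssql-jdbc.jar"     # raw string: \ is taken literally

# The doubled and raw spellings produce the identical string:
assert p_doubled == p_raw
# Converting the forward slashes gives the same text too:
assert p_forward.replace("/", "\\") == p_raw
```

Any of the three spellings can be passed to `spark.driver.extraClassPath`.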

Would you please correct the code for me (I don’t understand your explanation)? I attached a screenshot of where my JDBC driver is.

##import required libraries
import pyspark

##create spark session
spark = pyspark.sql.SparkSession \
   .builder \
   .appName("Python Spark SQL basic example") \
   .config('spark.driver.extraClassPath', "C:\Users\Anthony.DESKTOP-ES5HL78\Downloads\sqljdbc_12.2.0.0_enu\sqljdbc_12.2\enu\mssql-jdbc-12.2.0.jre8.jar") \
   .getOrCreate()


##read table from db using spark jdbc
movies_df = spark.read \
   .format("jdbc") \
   .option("url", "jdbc:sqlserver://DESKTOP-ES5HL78/kazang") \
   .option("dbtable", "transaction_type") \
   .option("user", "anthony") \
   .option("password", "Musicbook2023...") \
   .option("driver", "com.microsoft.sqlserve.Driver") \
   .load()

##print the movies_df
print(transaction_type.df.show())



My JDBC path: C:\Users\Anthony.DESKTOP-ES5HL78\Downloads\sqljdbc_12.2.0.0_enu\sqljdbc_12.2\enu\mssql-jdbc-12.2.0.jre8.jar
![msjdbc|556x499](upload://976BugmRfJtHo77D6Q7NAoyzL7A.png)

Barry meant that you should do it like this:

##import required libraries
import pyspark

##create spark session
spark = (pyspark.sql.SparkSession
   .builder
   .appName("Python Spark SQL basic example")
   .config('spark.driver.extraClassPath', r"C:\Users\Anthony.DESKTOP-ES5HL78\Downloads\sqljdbc_12.2.0.0_enu\sqljdbc_12.2\enu\mssql-jdbc-12.2.0.jre8.jar") 
   .getOrCreate()
)


##read table from db using spark jdbc
movies_df = (spark.read
   .format("jdbc")
   .option("url", "jdbc:sqlserver://DESKTOP-ES5HL78/kazang")
   .option("dbtable", "transaction_type")
   .option("user", "anthony")
   .option("password", "Musicbook2023...")
   .option("driver", "com.microsoft.sqlserve.Driver")
   .load()
)

##print the movies_df
print(transaction_type.df.show())

I am getting this error [{ "resource": "/C:/Users/Anthony.DESKTOP-ES5HL78/AppData/Local/Programs/Python/Python310/Scripts/transaction_type.py", "owner": "_generated_diagnostic_collection_name_#1", "code": { "value": "reportMissingImports", "target": { "$mid": 1, "external": "https://github.com/microsoft/pyright/blob/main/docs/configuration.md#reportMissingImports", "path": "/microsoft/pyright/blob/main/docs/configuration.md", "scheme": "https", "authority": "github.com", "fragment": "reportMissingImports" } }, "severity": 4, "message": "Import \"pyspark\" could not be resolved", "source": "Pylance", "startLineNumber": 5, "startColumn": 8, "endLineNumber": 5, "endColumn": 15 },{ "resource": "/C:/Users/Anthony.DESKTOP-ES5HL78/AppData/Local/Programs/Python/Python310/Scripts/transaction_type.py", "owner": "_generated_diagnostic_collection_name_#1", "code": { "value": "reportUndefinedVariable", "target": { "$mid": 1, "external": "https://github.com/microsoft/pyright/blob/main/docs/configuration.md#reportUndefinedVariable", "path": "/microsoft/pyright/blob/main/docs/configuration.md", "scheme": "https", "authority": "github.com", "fragment": "reportUndefinedVariable" } }, "severity": 4, "message": "\"transaction_type\" is not defined", "source": "Pylance", "startLineNumber": 28, "startColumn": 7, "endLineNumber": 28, "endColumn": 23 }]
Using this code:

##import required libraries
import pyspark

##create spark session
spark = (pyspark.sql.SparkSession
.builder
.appName("Python Spark SQL basic example")
.config('spark.driver.extraClassPath', r"C:\Users\Anthony.DESKTOP-ES5HL78\Downloads\sqljdbc_12.2.0.0_enu\sqljdbc_12.2\enu\mssql-jdbc-12.2.0.jre8.jar")
.getOrCreate()
)

##read table from db using spark jdbc
transaction_type_df = (spark.read
.format("jdbc")
.option("url", "jdbc:sqlserver://DESKTOP-ES5HL78/kazang")
.option("dbtable", "transaction_type")
.option("user", "anthony")
.option("password", "Musicbook2023...")
.option("driver", "com.microsoft.sqlserver.jdbc.SQLServerDriver', Server ")
.load()
)

##print the movies_df
print(transaction_type.df.show())

I don’t have an answer, but this bit looks wrong to me:

.option("driver", "com.microsoft.sqlserver.jdbc.SQLServerDriver', Server ")

Shouldn’t it be this?

.option("driver", "com.microsoft.sqlserver.jdbc.SQLServerDriver")

I altered my code to:

##import required libraries
import pyspark

##create spark session
spark = (pyspark.sql.SparkSession
.builder
.appName("Python Spark SQL basic example")
.config('spark.driver.extraClassPath', r"C:\Users\Anthony.DESKTOP-ES5HL78\Downloads\sqljdbc_12.2.0.0_enu\sqljdbc_12.2\enu\mssql-jdbc-12.2.0.jre8.jar")
.getOrCreate()
)

##read table from db using spark jdbc
transaction_type_df = (spark.read
.format("jdbc")
.option("url", "jdbc:sqlserver://DESKTOP-ES5HL78/kazang")
.option("dbtable", "transaction_type")
.option("user", "anthony")
.option("password", "Musicbook2023...")
.option("driver", "com.microsoft.sqlserver.jdbc.SQLServerDriver")
.load()
)

##print the transaction_type_df
transaction_type_df.show()

I am getting error: Import "pyspark" could not be resolved