SparkSQL error Table Not Found
Make sure to import the implicits._ from the same SQLContext. Temporary tables are kept in memory and are tied to one specific SQLContext, so a table registered in one context is invisible to every other context.
val sqlContext = new SQLContext(sc)
import sqlContext.implicits._
my_rdd.toDF().registerTempTable("my_rdd")
val my_df = sqlContext.sql("SELECT * from my_rdd LIMIT 5")
my_df.collect().foreach(println)
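The scoping rule above can be illustrated outside Spark. The following is a minimal sketch using Python's sqlite3 (not Spark): two independent in-memory connections play the role of two SQLContexts, and a table registered in one is not visible from the other.

```python
import sqlite3

# Two independent in-memory databases, analogous to two separate SQLContexts.
ctx_a = sqlite3.connect(":memory:")
ctx_b = sqlite3.connect(":memory:")

# Register a table in the first context only.
ctx_a.execute("CREATE TABLE my_rdd (id INTEGER, name TEXT)")
ctx_a.execute("INSERT INTO my_rdd VALUES (1, 'a'), (2, 'b')")

# Querying through the same context works...
rows = ctx_a.execute("SELECT * FROM my_rdd ORDER BY id LIMIT 5").fetchall()
print(rows)  # [(1, 'a'), (2, 'b')]

# ...but the second context raises "no such table", the sqlite
# equivalent of Spark's "Table Not Found".
try:
    ctx_b.execute("SELECT * FROM my_rdd")
except sqlite3.OperationalError as e:
    print(e)  # no such table: my_rdd
```

The same isolation applies in Spark: create the table and run the query through the one SQLContext that registered it.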
spark-sql Table or view not found error
This happens because Spark does not, by default, register any tables from the connection's schemas in the Spark SQL context. You must register the table yourself:
jdbcDF2.createOrReplaceTempView("mytable")
spark.sql("select count(*) from mytable")
Your jdbcDF2 is backed by myschema.mytable in MySQL and will load data from that table when an action is triggered.
Remember that a MySQL table is not the same as a Spark table or view. You are telling Spark to read data from MySQL, but you must register the resulting DataFrame or Dataset as a table or view in the current Spark SQL context or Spark session before you can query it by name.
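The read-then-register pattern can be sketched without Spark. Below is a minimal analogy using Python's sqlite3 (not Spark, and the table name mytable here is just illustrative): one database stands in for the external MySQL source, and a second connection stands in for the Spark session, which knows nothing about the source's tables until the data is explicitly registered.

```python
import sqlite3

# Stand-in for the external MySQL source.
source = sqlite3.connect(":memory:")
source.execute("CREATE TABLE mytable (id INTEGER)")
source.executemany("INSERT INTO mytable VALUES (?)", [(i,) for i in range(5)])

# Stand-in for the Spark session: a separate SQL context that cannot
# see the source's tables until we register the data ourselves
# (the analogue of createOrReplaceTempView).
session = sqlite3.connect(":memory:")
rows = source.execute("SELECT id FROM mytable").fetchall()  # the "read" step
session.execute("CREATE TABLE mytable (id INTEGER)")        # the "register" step
session.executemany("INSERT INTO mytable VALUES (?)", rows)

print(session.execute("SELECT count(*) FROM mytable").fetchone()[0])  # 5
```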
PySpark : org.apache.spark.sql.AnalysisException: Table or view not found
Spark follows the Hive SQL syntax, which does not support subqueries inside join conditions.
So you can rewrite your query like this:
select
  table_a.*, table_b.*
from table_a
left join table_b
  on table_a.id = table_b.fk_a_id
where table_b.id = (select max(id) from table_b
                    where fk_a_id in (select id from table_a))
or
select
  table_a.*, t_b.*
from table_a
left join (
  select * from table_b
  where id = (select max(id) from table_b
              where fk_a_id in (select id from table_a))
) as t_b
  on table_a.id = t_b.fk_a_id
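Both rewrites can be checked on a small dataset. The sketch below uses Python's sqlite3 (not Spark; table contents are made up for illustration) to run the two queries. Note that they are not identical: the first filters out table_a rows with no match, while the second keeps every table_a row because the subquery filter is applied before the left join.

```python
import sqlite3

db = sqlite3.connect(":memory:")
db.executescript("""
CREATE TABLE table_a (id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE table_b (id INTEGER PRIMARY KEY, fk_a_id INTEGER, val TEXT);
INSERT INTO table_a VALUES (1, 'a1'), (2, 'a2');
INSERT INTO table_b VALUES (10, 1, 'old'), (20, 2, 'newest');
""")

# First rewrite: the WHERE clause on table_b.id turns the left join
# into an effective inner join against the max-id row.
q1 = """
SELECT table_a.*, table_b.*
FROM table_a
LEFT JOIN table_b ON table_a.id = table_b.fk_a_id
WHERE table_b.id = (SELECT max(id) FROM table_b
                    WHERE fk_a_id IN (SELECT id FROM table_a))
ORDER BY table_a.id
"""

# Second rewrite: filtering inside the derived table keeps all
# table_a rows, with NULLs where there is no matching max-id row.
q2 = """
SELECT table_a.*, t_b.*
FROM table_a
LEFT JOIN (
  SELECT * FROM table_b
  WHERE id = (SELECT max(id) FROM table_b
              WHERE fk_a_id IN (SELECT id FROM table_a))
) AS t_b ON table_a.id = t_b.fk_a_id
ORDER BY table_a.id
"""

print(db.execute(q1).fetchall())  # [(2, 'a2', 20, 2, 'newest')]
print(db.execute(q2).fetchall())  # [(1, 'a1', None, None, None), (2, 'a2', 20, 2, 'newest')]
```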
Spark Error: Table or view 'users' not found in database 'default';
You are using the APIs incorrectly and in the wrong places.
First, create a SparkSession object:
val spark = SparkSession
  .builder()
  .master("local")
  .appName("SQL-Transformations")
  .getOrCreate()
Then use this spark object to read or write the data you want.
To read a table, use:
val jdbcDF = spark.read
.format("jdbc")
.option("url", "jdbc:postgresql:dbserver")
.option("dbtable", "schema.tablename")
.option("user", "username")
.option("password", "password")
.load()
To write a DataFrame into a table, use:
jdbcDF.write
.format("jdbc")
.option("url", "jdbc:postgresql:dbserver")
.option("dbtable", "schema.tablename")
.option("user", "username")
.option("password", "password")
.save()
And if you want to push down a query while reading the data through Spark JDBC, use:
spark.read.format("jdbc")
.option("url", jdbcUrl)
.option("query", "select c1, c2 from t1") //passing query
.option("user", "username")
.option("password", "password")
.load()
There are a couple of restrictions when using the query option:
1. It is not allowed to specify the dbtable and query options at the same time.
2. It is not allowed to specify the query and partitionColumn options at the same time. When the partitionColumn option is required, the subquery can be specified with the dbtable option instead, and the partition columns can be qualified using the subquery alias provided as part of dbtable.
Once you have read the data into a DataFrame, register it as a temp view, and then you can query it using Spark SQL:
jdbcDF.createOrReplaceTempView("df")
spark.sql("select * from df").show()
pyspark.sql.utils.AnalysisException: Table not found: test_result;
The entire problem was one of syntax:
df.write.format("jdbc")
  .option("driver", "org.postgresql.Driver")
  .option("url", "jdbc:postgresql://*************************ast-1.rds.amazonaws.com/dbname")
  .option("port", "5432")
  .option("dbtable", "public.table_name")
  .option("user", "abc")
  .option("password", "abc")
  .save()