How to convert SQL to Relational Algebra in case of SQL Joins?
Relational algebra for:
Find an
Id
of theEmployee
who has taken everytraining
.
Actually you need division %
operator in relational algebra:
r ÷ s
is used when we wish to express queries with“all”
:Example:
- Which persons have a bank account at
ALL
the banks in the country?- Retrieve the name of employees who work on
ALL
the projects thatJon Smith
works on?
Read also this slid for division operator:
You also need query % operator for your query: "Employee who has taken all training".
First list off all Training codes:
Training (
TrainingCode
, TrainingName, TrainingType, TrainingInstructor)
Primary key is: TrainingCode
:
TC =
∏TrainingCode
(Training)
A pair of employeeID and trainingCode: a employee take the training.
ET =
∏EmployeeId, TrainingCode
(Outcome)
Apply % Division operation which gives you desired employee's codes with trainingCode then apply projection to filter out employee code only.
Result =
∏EmployeeId
(ET % TC)
"Fundamentals of Database Systems" is the book I always keep in my hand.
6.3.4 The DIVISION Operation
The DIVISION operation is defined for convenience for dealing with
queries that involvesuniversal quantification
or theall
condition. Most RDBMS implementation with SQL as the primary query
language do not directly implement division. SQL has round way of
dealing with the type of query using EXISTS, CONTAINS and NOT EXISTS
key words.The general DIVISION operation applied to two relations
T(Y) = R(Z) %
, where
S(X)X ⊆ Z
andY = Z - X
(and henceZ =
); that is
X ∪ YY
is the set of attributes ofR
that are not
attributes ofS
e.g.X = {A}, Z = {A, B} then Y = {B}
,B
attribute is not present in relationS
.
T(Y)
the result of DIVISION is a relation includes a tuplet
if
tupletR
appear in relationR
with
tR[Y] = t
, and with
tR[X] = tS
forevery
tuple in
S
. This means that. for a tuplet
to appear in the resultT
of the DIVISION, the value oft
must be appear inR
in combination with every tuple inS
.
I would also like to add that the set of relational algebra operations {
σ
,
∏
,
⋈
,
Χ
,
-
}
namely Selection, Projection, Join, Cartesian Cross and Minus is a complete set; that is any of the other original relational algebra operation can be expressed as a sequence of operations from this set. Division operation %
can also be expressed in the form of ∏
, ⋈
, and -
operations as follows:
T1 <-- ∏Y(R)
T2 <-- ∏Y((S Χ T1) - R)
T3 <-- T1 - T2
To represent your question using basic relational algebraic operation just replace R by Outcome, S by Training and attribute set Y by EmployeeId.
I hope this help.
Convert SQL to Relational Algebra
... and I suppose you meant D.Fname = 'John' in WHERE clause
Convert SQL Query to Relational Algebra
Your SQL code will result in duplicate columns for CustomerCode and the use of SELECT [ALL]
is likely to result in duplicate rows. Because the result is not a relation, it cannot be expressed in relational algebra.
These problems are easily fixed in SQL:
SELECT DISTINCT *
FROM Customer NATURAL JOIN Appointment
WHERE Appointment.ServerCode IN
(
SELECT ServerCode FROM Appointment WHERE CustomerCode = '102'
)
;
You didn't specify which relational algebra you are intereted in. Date and Darwen proposed an algebra named A, specified an A language named D, and designed a D language named Tutorial D.
Tutorial D uses operators JOIN
for natural join, WHERE
for restriction and MATCHING
for semijoin, The slight complication is the comparison in SQL:
CustomerCode = '102'
The comparison of a CustomerCode
value to a CHAR
value in SQL is possible because of implicit coercion. Tutorial D is stricter -- type safe, if you will -- requiring you to overload the equality operator or, more practically, define a selector operator for CHAR
, which would typically have the same name as the type.
Therefore, the above (revised) SQL may be written in Tutorial D as:
( Customer JOIN Appointment )
MATCHING ( ( Appointment WHERE CustomerCode = CustomerCode ( '102' ) ) { ServerCode } )
Converting a certain SQL query into relational algebra
I don't know. And anybody else is not likely to know either.
RA courses typically limit themselves to the selection, projection and join operators. Aggregations are not typically covered by an RA course. There even isn't any standard approach (that I know of) that the RA takes on aggregations.
What is the operator that your course defines for doing aggregations on relations ? What type of value does that operator produce for its result ? A relation ? Something else ? If something else, how does your course explain doing relational restrictions on that result, given that these result values aren't relations, but restriction works only on relations ?
Algebraically, this case starts with a natural join (produkt x ordre).
[The result of] this natural join is subjected to an aggregation operation. Thus this natural join is to appear where you specify the relational input argument to your aggregation operator. The other needed specs for specifying the aggregation are the output attribute names (total), and the way to compute them (SUM(...)). Those might appear in subscript next to your aggregation operator symbol as "annotations", much like the attribute lists on projection and the restriction condition on restriction. But anything concerning this operator is course-specific, because there isn't any agreed-upon standard notation for aggregations, as far as I know.
Then if your aggregation operator is defined to return a relation, you can specify your aggregation result as the input argument to a restriction with condition "total>=10000".
Library to convert SQL to relational algebra
No, I wouldn't count on it.
SQL is a horrendous language, parsing it is an immense task, and parsing it for the purpose of capturing the original algebraic intent is considered infeasible by just about the whole world, as far as I know.
And then I haven't even begun to mention the various ways in which vendors turn it into something that is actually no less than a completely proprietary language, despite a possible superficial resemblance to what is supposed to be a standard.
And even if such a package existed, what would you do with the output you obtained from it ?
SQL to Relational-Algebra
The first step would be to put the query into words: what is being asked? In this case we want the dates of all concerts where the show at the concert was sung by 'A' and the hall the concert was in was named 'B'. (This step isn't always necessary - in many cases, we could probably generate the RA straight from the SQL - but it's never a bad idea to make sure you understand what it is you're doing).
So let's build your RA statement from the inside out. First, we want to make sure that anytime we look at a show and a concert at the same, we are only looking for the show at that concert, and in this case, the show's id is a field in concert. So your intuition to use natural join/|><|
is correct. I've always learned that unless otherwise specified, natural joins will automatically match on all equivalent column names (though I would check with your teacher's standard first), so in this case, we don't need to specify the join condition that the show id's match. Even if we did, that would be a subscript of the join, and not part of the select statement. Next we want to make sure that when we look at a hall and a show, we only look at shows in that hall. So we want to natural join hall to the previous result (note that order of joining matters - you can't naturally join concert and hall if one hasn't already been joined to show).
The next step is to only get the rows of information we want. In this case, the select/sigma
statements are pretty easy - just specify that the name and hall must be what you want. Finally, we project/pi
only the data element we want from the rows we have retrieved - in this case we will only project the date. After that, we're done. And from your RA so far, I can see that you already have all the operators in the right order, you're just missing the operands!
Note that there are a few other things to consider. You might want to consider using rename/rho
on each table to simplify writing things out. Furthermore, if your RA needs to exactly duplicate the SQL, you'll have to use rename, and also replace your natural joins with cross join/X
and place the join conditions in your select statement.
Converting SQL Query with Aggregate Function to Relational Algebra Expression in Apache Calcite - No match found for function signature
The problem with the code is that new SqlStdOperatorTable()
creates a validator which is not initialized. The correct way to use SqlStdOperatorTable
is to use SqlStdOperatorTable.instance()
.
I found the solution after emailing the dev@calcite.apache.org mailing list. I would like to thank Yuzhao Chen for looking into the question I had and pointing out the problem with my code.
Related Topics
Pivot - SQL - Values from Subquery
How to Query All Rows Within a 5-Mile Radius of My Coordinates
How to Use SQL Server Compact Edition Ce on Mono
How to Use User Defined Table Type Inside Another User Defined Table Type in SQL
Select Top N Records Ordered by X, But Have Results in Reverse Order
Oracle (11.2.0.1.0) - Recursive Cte with a Date Expression
How to Fill Missing Dates and Values in Partitioned Data
How to Replace Multiple Characters in Access SQL
SQL Query Where Date = Today Minus 7 Days
How Does 'In' Clause Works in Oracle
How to Convert SQL to Relational Algebra in Case of SQL Joins
What's the Asymptotic Complexity of Groupby Operation
How to Use a Pg Sequence on a Per Record Label
Credentials Error When Integrating Google Drive With