Google spreadsheet =QUERY join() equivalent function?
Short answer
Google QUERY Language version 0.7 (2016) doesn't include a JOIN (LEFT JOIN) operator but this could be achieved by using an array formula which result could be used as input for the QUERY function or for other uses.
Explanation
Array formulas and the array handling features of Google Sheets make possible to make a JOIN between two simple tables. In order to make easier to read, the proposed formula use named ranges instead of range references.
Named Ranges
- table1 : Sheet1!A1:C3
- table2 : Sheet2!A1:C3
- ID : Sheet1!A1:A3
Formula
=ArrayFormula(
{
table1,
vlookup(ID,table2,COLUMN(Indirect("R1C2:R1C"&COLUMNS(table2),0)),0)
}
)
Remarks:
- Using open ended ranges is possible but this could make the spreadsheet slower.
- To speed up the recalculation time :
- Replace
Indirect("R1C2:R1C"&COLUMNS(table2),0)
by an array of constants from 2 to number of columns of table2. - Remove the empty rows from the spreadsheet
Example
See this sheet for an example
Note
On 2017 Google improved the official help article in English about QUERY, QUERY function. It still doesn't include yet topics like this but could be helpful to understand how it works.
INNER JOIN in Google Spreadsheets
try:
=ARRAYFORMULA({TRIM(FLATTEN(SPLIT(QUERY(REPT(IF($A2:A="",,$A2:A&"♦"), IFNA(VLOOKUP($B2:$B, QUERY({$D2:$D},
"select Col1,count(Col1) where Col1 is not null group by Col1"), 2, 0))),,9^9), "♦"))), IFNA(VLOOKUP(
TRIM(FLATTEN(SPLIT(QUERY(REPT(IF($A2:A="",,$B2:B&"♦"), IFNA(VLOOKUP($B2:$B, QUERY({$D2:$D},
"select Col1,count(Col1) where Col1 is not null group by Col1"), 2, 0))),,9^9), "♦")))&COUNTIFS(
TRIM(FLATTEN(SPLIT(QUERY(REPT(IF($A2:A="",,$A2:A&"♦"), IFNA(VLOOKUP($B2:$B, QUERY({$D2:$D},
"select Col1,count(Col1) where Col1 is not null group by Col1"), 2, 0))),,9^9), "♦"))),
TRIM(FLATTEN(SPLIT(QUERY(REPT(IF($A2:A="",,$A2:A&"♦"), IFNA(VLOOKUP($B2:$B, QUERY({$D2:$D},
"select Col1,count(Col1) where Col1 is not null group by Col1"), 2, 0))),,9^9), "♦"))), ROW(INDIRECT("O1:O"&COUNTA(
TRIM(FLATTEN(SPLIT(QUERY(REPT(IF($A2:A="",,$B2:B&"♦"), IFNA(VLOOKUP($B2:$B, QUERY({$D2:$D},
"select Col1,count(Col1) where Col1 is not null group by Col1"), 2, 0))),,9^9), "♦")))))), "<="&ROW(INDIRECT("O1:O"&COUNTA(
TRIM(FLATTEN(SPLIT(QUERY(REPT(IF($A2:A="",,$B2:B&"♦"), IFNA(VLOOKUP($B2:$B, QUERY({$D2:$D},
"select Col1,count(Col1) where Col1 is not null group by Col1"), 2, 0))),,9^9), "♦"))))))),
{D2:D&COUNTIFS(D2:D, D2:D, ROW(O2:O), "<="&ROW(O2:O)), E2:E}, 2, 0))})
Join tables in google sheet - full join
previous answer is incorrect. use:
=ARRAYFORMULA(QUERY(UNIQUE(IFNA({G2:H10,
VLOOKUP(G2:G10, A2:C10, {2, 3}, 0);
VLOOKUP({A3:A10; G3:G10}, {G3:H10; {A3:A10, IF(A3:A10, )}}, {1, 2}, 0), {B3:C10;
VLOOKUP(G3:G10, {A3:C10; {G3:G10, IF(G3:G10, ), IF(G3:G10, )}}, {2, 3}, 0)}})),
"where Col1 is not null order by Col1", 1))
How to join tables using Query or Vlookup
Sample File
Please try to use basic formula for joining sheets + query to skip some rows.
Basic formula:
=FILTER({Sales!B14:B,Sales!C14:C,Sales!E14:E, vlookup(Sales!E14:E,{People!B2:B,People!D2:D},2,), Sales!F14:F},Sales!A14:A<>"")
The formula you need:
=QUERY(FILTER({Sales!B14:B,Sales!C14:C,Sales!E14:E, IFERROR(vlookup(Sales!E14:E,{People!B2:B,People!D2:D},2,), "do not show"), Sales!F14:F},Sales!A14:A<>""),"select * where Col4 <> 'do not show'")
How it works
It uses iferror(expression, "do not show")
to replace #N/A
error with the text "do not show".
Then it uses query to hide rows with value "do not show":
"select * where Col4 <> 'do not show'"
Basic formula and the question is here
Notes about usage:
Sales!B14:B,Sales!C14:C,Sales!E14:E
are some columns you need to select firstIFERROR(vlookup(Sales!E14:E,{People!B2:B,People!D2:D},2,)
will return column from "People", it is columnPeople!D2:D
. Please note that columnsSales!E14:E
andPeople!B2:B
are both containing a key, in the sample it is e-mail list.Sales!F14:F
is a column to return next.- filter condition
Sales!A14:A<>""
is to skip empty rows from original sheet. You may also add other conditions into a filter. query
part will shorten the report and skip values where keys (emails) do not match. You may add other conditions into a query.
Note. Query language in Google Sheets does not have joins
, and provided solution is a workaround. It shows how to make a join with a help of vlookup
function in Sheets.
- The provided formula gathers rows from sheet "Sales" and makes
vlookup
to sheet "People". You also may want to make vice-versa report: collect data from "People" and 'vlookup` data from "Sales".
"Vice-versa" report:
=QUERY(FILTER({vlookup(People!B2:B ,{Sales!E14:E, Sales!B14:B, Sales!C14:C, Sales!E14:E},{2, 3, 4},), People!B2:B, IFERROR(vlookup(People!B2:B ,{Sales!E14:E, Sales!F14:F},2,),100500)},People!A2:A<>""),"select * where Col5 <> 100500")
note! In this formula used numeric value 100500 to skip values because query works with the single data type (number or text).
Please comment here if you have any questions about how it works.
google sheets query left join one-to-many
OK well here is an inner join to start with:
=ArrayFormula(query(iferror(split(flatten(if(transpose(filter(Table2!B2:B,Table2!B2:B<>""))=filter(Table1!A2:A,Table1!A2:A<>""),filter(Table1!A2:A,Table1!A2:A<>"")&"|"&transpose(filter(Table2!A2:A,Table2!A2:A<>"")),)),"|")),"select Col1,Col2 where Col1 is not null label Col1 '',Col2 ''"))
which builds up a 2D array and fills in the positions where the two sets of data match, then flattens it back into a 1D array and splits it back into two columns.
I think you just have to add the non-matching rows to get a left outer join:
=ArrayFormula({query(iferror(split(flatten(if(transpose(filter(Table2!B2:B,Table2!B2:B<>""))=filter(Table1!A2:A,Table1!A2:A<>""),
filter(Table1!A2:A,Table1!A2:A<>"")&"|"&transpose(filter(Table2!A2:A,Table2!A2:A<>"")),)),"|")),"select Col1,Col2 where Col1 is not null label Col1 '',Col2 ''");
filter(Table1!A2:B,isna(vlookup(Table1!A2:A,Table2!B2:B,1,false)))})
Note
This is a special case where the first table just consists of keys (ID), and you want just the key plus the other column from the second table for rows where the IDs match. It would be straightforward to add more columns separated by a pipe symbol (or any other character of choice), but these would have to be hard-coded: I don't know of any way with this approach to automatically include all columns from both tables.
This is in contrast to the answers here which do automatically combine columns from both tables but don't allow for a one-to-many relationship.
ArrayFormula a Filter in a Join (Google Spreadsheets)
It is generally a bit complicated to apply an aggregating function like CONCATENATE row-by-row.
=ArrayFormula(TRIM(TRANSPOSE(SPLIT(CONCATENATE(REPT(TRANSPOSE(Sheet1!B:B&" ");A:A=TRANSPOSE(Sheet1!A:A))&REPT(" "&CHAR(9);TRANSPOSE(ROW(Sheet1!A:A))=ROWS(Sheet1!A:A)));CHAR(9)))))
(edit: apologies I haven't had the opportunity to test for bugs/typos, will remove this line if you can confirm it works)
Google spreadsheet query formula - join like
You could try this formula in cell E1:
=SORT(UNIQUE(QUERY({A:A;C:C},"where Col1 <> ''")),1,1)
and then this formula in cell F1 and dragged down the column:
=TRANSPOSE(QUERY({A:B;C:D},"select Col2 where Col1='"&E1&"'"))
See this example sheet to see these working: https://goo.gl/sv8ViD
Also, it is possible to create the output in a single cell (see cell Q1) but the formula is not very adaptable
Related Topics
Using a Variable in Openrowset Query
Show a One to Many Relationship as 2 Columns - 1 Unique Row (Id & Comma Separated List)
Is There Common Street Addresses Database Design for All Addresses of the World
Does Anyone Use Right Outer Joins
How to Send Email from Postgresql Trigger
Constraint for Only One Record Marked as Default
Is There a Performance Difference Between Cte , Sub-Query, Temporary Table or Table Variable
Is SQL or Even Tsql Turing Complete
Delete SQL Rows Where Ids Do Not Have a Match from Another Table
Log Record Changes in SQL Server in an Audit Table
Postgres Unique Constraint VS Index
How to Send Email from SQL Server
Is the Like Operator Case-Sensitive with SQL Server
How to Convert Rows to Columns in Oracle
Why Does This SQL Code Give Error 1066 (Not Unique Table/Alias: 'User')