Concat GROUP BY in Vertica SQL
The easiest on the long term is to use one of the official Vertica UDFs to be found on github at https://github.com/vertica/Vertica-Extension-Packages/tree/master/strings_package which provides a group_concat function. The installation procedure is to found in the README, and examples are even provided.
aggregate function to concatenate strings in Vertica
First, you'll need to compile the udx for agg_concatenate
.
-- Shell commands
cd /opt/vertica/sdk/examples/AggregateFunctions/
g++ -D HAVE_LONG_INT_64 -I /opt/vertica/sdk/include -Wall -shared -Wno-unused-value -fPIC -o Concatenate.so Concatenate.cpp /opt/vertica/sdk/include/Vertica.cpp
-- vsql commands
CREATE LIBRARY AggregateFunctionsConcatenate AS '/opt/vertica/sdk/examples/AggregateFunctions/Concatenate.so';
CREATE AGGREGATE FUNCTION agg_concatenate AS LANGUAGE 'C++' NAME 'ConcatenateFactory' LIBRARY AggregateFunctionsConcatenate;
Then you can do a query like:
select id, rtrim(agg_concatenate(name || ', '),', ') ag
from mytable
group by 1
order by 1
Uses rtrim to get rid of the last ', '.
If you need the aggregate to be sorted a certain way, you may need to select/sort in an inline view or with first.
Vertica: how can you concate values by some order?
You can do this with a nasty case
expression. For three columns it is not so bad:
select t.*,
(gr || ',' ||
(case when a not in (le, gr) then a
when b not in (le, br) then b
else c
end) || ',' ||
le
),
((case gr when a then 'a' when b then 'b' else 'c' end) || ',' ||
(case when a not in (gr, le) then 'a'
when b not in (gr, le) then 'b'
else 'c'
end) || ',' ||
(case le when a then 'a' when b then 'b' else 'c' end)
)
from (select t.*, greatest(a, b, c) as gr, least(a, b, c) as le
from t
) t;
This particular version assumes there are no duplicates or NULL
values, although this can be adopted for that purpose.
Concatenate columns in vertica doesn't work
Use single quotes not double quotes:
=> select "foo" || "bar" from dual;
ERROR: column "foo" does not exist
=> select 'foo' || 'bar' from dual;
?column?
----------
foobar
(1 row)
SQL/Vertica - grouping multi-attribute combinations
Normally, you approach this type of problem using least()
and greatest()
, but you have two columns, instead of one. So, let's do it by comparing cities. I am guessing that city
is more unique than country
:
select (case when city1 < city2 then country1 else country2 end) as country1,
(case when city1 < city2 then city1 else city2 end) as city1,
(case when city1 < city2 then country2 else country1 end) as country2,
(case when city1 < city2 then city2 else city1 end) as city2,
count(*)
from dataset
group by (case when city1 < city2 then country1 else country2 end),
(case when city1 < city2 then city1 else city2 end),
(case when city1 < city2 then country2 else country1 end),
(case when city1 < city2 then city2 else city1 end)
Vertica: split string to array and group over to get set of unique values?
Found it:
Assuming you know what is the max number of values per entry (let's assume 6 in this example):
with m as (
select 1 as i union all
select 2 as i union all
select 3 as i union all
select 4 as i union all
select 5 as i union all
select 6 as i),
splitted_data as (
select distinct ID, split_part('values',',',i) as value
from main_table cross join m)
select ID, rtrim(agg_concatenate(value||','),',') as values
from splitted_data
Related Topics
How to Get the Nth Row in a SQL Server Table
How to Insert Random Values into a SQL Server Table
How to Update with Inner Join in Oracle
SQL Primary Key - Is It Necessary
Slick 3.0 - Update Columns in a Table and Return Whole Table Object
Why Does Nvl Always Evaluate 2Nd Parameter
SQL Order by Total Within Group By
Oracle Ora-00979 - "Not a Group by Expression"
SQL Server Audit Logout Creates Huge Number of Reads
How to Use Like Condition in SQL with Numeric Field
MySQL Is Array in Multiple Columns
Why Doesn't Oracle Raise "Ora-00918: Column Ambiguously Defined" for This Query
SQL Identity with Leading Padded Zeros
SQL Server, Combining Like and In