Splitting a Comma-Separated Field in Postgresql and Doing a Union All on All the Resulting Tables

Splitting a comma-separated field in Postgresql and doing a UNION ALL on all the resulting tables

This should give you the output you're looking for:

SELECT 
yourTable.ID,
regexp_split_to_table(yourTable.fruits, E',') AS split_fruits
FROM yourTable

EDIT: Fixed the regex.

Postgres - From array to multiple rows

Use string_to_array() and unnest() with ordinality in a lateral join:

with my_table(street, numbers) as (
values
('broadway', '01, 03'),
('helmet', '01, 03, 07')
)

select street, number, position
from my_table
cross join unnest(string_to_array(numbers, ', ')) with ordinality as u(number, position)

street | number | position
----------+--------+----------
broadway | 01 | 1
broadway | 03 | 2
helmet | 01 | 1
helmet | 03 | 2
helmet | 07 | 3
(5 rows)

Error while using regexp_split_to_table (Amazon Redshift)

A1

E is a prefix for Posix-style escape strings. You don't normally need this in modern Postgres. Only prepend it if you want to interpret special characters in the string. Like E'\n' for a newline char.Details and links to documentation:

  • Insert text with single quotes in PostgreSQL
  • SQL select where column begins with \

E is pointless noise in your query, but it should still work. The answer you are linking to is not very good, I am afraid.

A2

Should work as is. But better without the E.

SELECT id, regexp_split_to_table(fruits, '|') AS split_fruits
FROM tbl;

For simple delimiters, you don't need expensive regular expressions. This is typically faster:

SELECT id, unnest(string_to_array(fruits, '|')) AS split_fruits
FROM tbl;

In Postgres 9.3+ you'd rather use a LATERAL join for set-returning functions:

SELECT t.id, f.split_fruits
FROM tbl t
LEFT JOIN LATERAL unnest(string_to_array(fruits, '|')) AS f(split_fruits)
ON true;

Details:

  • What is the difference between LATERAL and a subquery in PostgreSQL?
  • PostgreSQL unnest() with element number

Amazon Redshift is not Postgres

It only implements a reduced set of features as documented in its manual. In particular, there are no table functions, including the essential functions unnest(), generate_series() or regexp_split_to_table() when working with its "compute nodes" (accessing any tables).

You should go with a normalized table layout to begin with (extra table with one fruit per row).

Or here are some options to create a set of rows in Redshift:

  • How to select multiple rows filled with constants in Amazon Redshift?

This workaround should do it:

  1. Create a table of numbers, with at least as many rows as there can be fruits in your column. Temporary or permanent if you'll keep using it. Say we never have more than 9:

    CREATE TEMP TABLE nr9(i int);
    INSERT INTO nr9(i) VALUES (1),(2),(3),(4),(5),(6),(7),(8),(9);
  2. Join to the number table and use split_part(), which is actually implemented in Redshift:

    SELECT *, split_part(t.fruits, '|', n.i) As fruit
    FROM nr9 n
    JOIN tbl t ON split_part(t.fruits, '|', n.i) <> ''

Voilá.

Convert comma seperated db items to isolated rows

Use string_to_array() in unnest():

with a_table(id, data) as (
values
(1, 'Apple,Orance'),
(2, 'Apple,Grapes'),
(3, 'Juice,Apple,Cucumber')
)

select id, elem
from a_table,
lateral unnest(string_to_array(data, ',')) elem

id | elem
----+----------
1 | Apple
1 | Orance
2 | Apple
2 | Grapes
3 | Juice
3 | Apple
3 | Cucumber
(7 rows)

Use distinct to get distinct elements:

with a_table(id, data) as (
values
(1, 'Apple,Orance'),
(2, 'Apple,Grapes'),
(3, 'Juice,Apple,Cucumber')
)

select distinct on (elem) elem
from a_table,
lateral unnest(string_to_array(data, ',')) elem

elem
----------
Apple
Cucumber
Grapes
Juice
Orance
(5 rows)

Create a UNION ALL query when the columns are in different order

Alas, no. UNION ALL goes by position not by names. However, you can generate the columns:

select string_agg(column_name, ', ')
from information_schema.columns
where table_name = ? and
table_schema = ?;

You can then plug the list into your code.

Concatenate string instead of just replacing it

Storing comma separated values is a huge mistake to begin with. But if you really want to make your life harder than it needs to be, you might want to create a function that merges two comma separated lists:

create function merge_lists(p_one text, p_two text)
returns text
as
$$
select string_agg(item, ',')
from (
select e.item
from unnest(string_to_array(p_one, ',')) as e(item)
where e.item <> '' --< necessary because of the leading , in your data
union
select t.item
from unnest(string_to_array(p_two, ',')) t(item)
where t.item <> ''
) t;
$$
language sql;

If you are using Postgres 14 or later, unnest(string_to_array(..., ',')) can be replace with string_to_table(..., ',')

Then your INSERT statement gets a bit simpler:

INSERT  INTO t00 (userid,a,b)  VALUES (0,1,',fish,eagle')
ON CONFLICT (userid)
DO UPDATE SET
a = EXCLUDED.a,
b = merge_lists(excluded.b, t00.b);

Postgres - Select rownumber without a table but data from a function

This is happening because regexp_split_to_table returns a table and not a single value for each rownumber.

Maybe a different approach will work? For example, this gives me the output you were after:

SELECT  regexp_split_to_table('a,b,c', ',')   as value, 
rank() over(order by regexp_split_to_table('a,b,c', ',')) as rownumber

EDIT: The above will reorder the results which may not be what you want. The following will preserve the ordering:

WITH T as
(
SELECT regexp_split_to_table('d,a,b,c', ',') as value
)

SELECT row_number() over() as rownumber,
value

FROM t

UNNEST Multiple Values?

You can use the unnest(split(col)) strategy, but don't cross-join both columns. Your answer implies an implicit order to the comma-separated values, so you'll need to establish a field (RowNumber below) to indicate this ordering.

with Expanded2 as (
select
tableName.col1,
col2.col2,
row_number() over (partition by col1 order by 1) RowNumber
from
tableName,
unnest(split(col2)) col2
), Expanded3 as (
select
tableName.col1,
col3.col3,
row_number() over (partition by col1 order by 1) RowNumber
from
tableName,
unnest(split(col3)) col3
)
select
Expanded2.col1,
Expanded2.col2,
Expanded3.col3
from
Expanded2
full outer join Expanded3 on
Expanded2.col1 = Expanded3.col1
and Expanded2.RowNumber = Expanded3.RowNumber

I'm not sure how your rdbms handles effectively-null window partitioning. The above works in PostgreSQL. SQL Server would require order by (select null). Ymmv.



Related Topics



Leave a reply



Submit