Redshift: Executing a Dynamic Query from a String

Create dynamic string for IN clause

Thanks for confirming. I am not sure how you hit a character-escaping problem with variables in the Matillion REST API, since the variables are passed in via the POST body.

In any case, here's something that worked for me:

Create an Orchestration Job named EntryPointJob with Job Variables set like below.

[Image: Job Variables settings for EntryPointJob]

Where

  • jv_commasep is the public input, and is expected to contain something like AL,CA
  • pjv_inlist is a private variable which gets updated to contain the same information in SQL-compatible format, like 'AL','CA'

I used a Python3 script to derive pjv_inlist from jv_commasep with this code:

context.updateVariable('pjv_inlist', ','.join(["'" + x.strip() + "'" for x in jv_commasep.split(',')]))

Now, in the Matillion job you can use the pjv_inlist variable in SQL, for example like this: ... WHERE "state" IN (${pjv_inlist})
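The one-liner above wraps each value in quotes but does not escape a single quote inside a value. As a minimal sketch in plain Python (the function name build_in_list is hypothetical, and skipping empty items is my assumption, not something the original script does), the same derivation with escaping could look like this:

```python
def build_in_list(comma_separated):
    """Turn 'AL,CA' into "'AL','CA'" for use inside an SQL IN (...) clause."""
    values = [item.strip() for item in comma_separated.split(",")]
    # Skip empty items (e.g. from a trailing comma) and double any embedded
    # single quote so it cannot terminate the SQL string literal early.
    quoted = ["'" + value.replace("'", "''") + "'" for value in values if value]
    return ",".join(quoted)


print(build_in_list("AL, CA"))      # 'AL','CA'
print(build_in_list("O'Brien,CA"))  # 'O''Brien','CA'
```

Inside the Matillion Python script the result would still be stored with context.updateVariable('pjv_inlist', ...), as in the original line.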

To run the job and pass scalar variables, you first need to create a JSON file like the one below. I named mine RunVariablesContainer.json

{
  "scalarVariables" : {
    "jv_commasep" : "AL,CA"
  }
}

Then you can call Matillion's own REST API to run the Matillion job (with parameters) like this:

curl -k -X POST -u un:pw -H "Content-type: application/json" "https://.../rest/v1/group/name/.../project/name/.../version/name/.../job/name/EntryPointJob/run?environmentName=Demo" --data-binary @RunVariablesContainer.json
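If you would rather build the request from a script than from curl, the following is a rough sketch using only the Python standard library. It constructs the request but does not send it. The function name build_run_request and the placeholder URL are hypothetical; the real URL is the full .../job/name/EntryPointJob/run?environmentName=Demo address from the curl command, which is elided here just as it is above.

```python
import base64
import json
import urllib.request


def build_run_request(run_url, username, password, scalar_variables):
    """Build (but do not send) the POST request that starts a job run."""
    # Same body as RunVariablesContainer.json.
    body = json.dumps({"scalarVariables": scalar_variables}).encode("utf-8")
    # Equivalent of curl's -u un:pw (HTTP basic authentication).
    token = base64.b64encode(f"{username}:{password}".encode("utf-8")).decode("ascii")
    return urllib.request.Request(
        run_url,
        data=body,
        method="POST",
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Basic {token}",
        },
    )


# Placeholder URL for illustration only; substitute your own run URL.
request = build_run_request(
    "https://matillion.example/run", "un", "pw", {"jv_commasep": "AL,CA"}
)
print(request.data.decode("utf-8"))
```

Passing the request to urllib.request.urlopen would send it. Note that curl's -k flag skips TLS certificate verification, which this sketch does not replicate.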

Redshift UDF function with dynamic SQL statement

You need to use a stored procedure to perform dynamic SQL. See "Overview of Stored Procedures in Amazon Redshift"

CREATE PROCEDURE pp_calc(identifier IN varchar(100), table_name IN varchar(100), result OUT float)
AS $$
BEGIN
-- A procedure has no RETURNS clause; the sum is handed back through the OUT parameter.
IF identifier = 'OC' THEN
EXECUTE 'SELECT '||identifier||'_1 + '||identifier||'_2 FROM '||table_name INTO result;
ELSE
-- Same statement as above; replace with the calculation for other identifiers.
EXECUTE 'SELECT '||identifier||'_1 + '||identifier||'_2 FROM '||table_name INTO result;
END IF;
END;
$$ LANGUAGE plpgsql;

See also this previous answer: Redshift: Executing a dynamic query from a string
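Because the procedure concatenates its arguments straight into the statement, it can help to check them before calling it. The following is a small sketch in Python of the statement text that EXECUTE ends up with, plus a conservative name check. The function name build_sum_query and the rule "plain unquoted names only" are my assumptions, not part of the original answer.

```python
import re

# Assumption: column prefixes are plain unquoted names, and table names are
# plain unquoted names that may carry one schema qualifier.
_COLUMN_PREFIX = re.compile(r"[A-Za-z_][A-Za-z0-9_]*")
_TABLE_NAME = re.compile(r"[A-Za-z_][A-Za-z0-9_]*(\.[A-Za-z_][A-Za-z0-9_]*)?")


def build_sum_query(identifier, table_name):
    """Return the SQL text that the dynamic EXECUTE would run."""
    if not _COLUMN_PREFIX.fullmatch(identifier):
        raise ValueError(f"unsafe column prefix: {identifier!r}")
    if not _TABLE_NAME.fullmatch(table_name):
        raise ValueError(f"unsafe table name: {table_name!r}")
    return f"SELECT {identifier}_1 + {identifier}_2 FROM {table_name}"


print(build_sum_query("OC", "public.prices"))
# SELECT OC_1 + OC_2 FROM public.prices
```

The table name public.prices is only an example value.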

How to write to a dynamically created table in a Redshift procedure

Well, when you declare a variable "new_table" and then perform a SELECT ... INTO "new_table", the value is assigned to the variable "new_table" rather than written to a table. You will see this if you return the variable through an OUT parameter.

When you remove the declaration, the statement simply works as Redshift's SELECT INTO syntax and creates a table.

Now to the solution:

Create the table using the CREATE TABLE AS ... syntax.

You also need to pass in the value of the declared variable rather than its name, so build the statement as a string and run it with the EXECUTE command.

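CREATE OR REPLACE PROCEDURE public.ct_tab (vname varchar)
AS $$
DECLARE
tname VARCHAR(50) := 'public.swap_' || vname;
BEGIN
-- Build the CREATE TABLE AS statement as a string so that the value of tname is used as the table name.
EXECUTE 'create table ' || tname || ' as select ''name''';
END;
$$ LANGUAGE plpgsql;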

Now if you call the procedure passing 'abc', a table named "swap_abc" will be created in the public schema.

call public.ct_tab('abc');

Let me know if it helps :)

Issue with passing column name as a parameter to PREPARE in Redshift

This can now be done using stored procedures, without the need for PREPARE. See "Overview of Stored Procedures in Amazon Redshift".

It seems like you are trying to emulate GROUPING SETS or ROLLUP functionality. I have added a UNION ALL to the dynamic SQL to provide this type of output.

For this example stored procedure, both column names are provided as input and a REFCURSOR is declared as output.

CREATE PROCEDURE get_fruit_sum(IN column_1 VARCHAR, IN column_2 VARCHAR, result_set INOUT REFCURSOR) AS $$
BEGIN
OPEN result_set FOR
EXECUTE 'SELECT '|| quote_ident(column_1) ||' , '|| quote_ident(column_2)
|| ' , SUM(fb.user_Count) as user_count '
|| 'FROM dv_product.fruit_basket fb GROUP BY 1,2 '
|| 'UNION ALL '
|| 'SELECT '|| quote_ident(column_1) ||' , ''ALL'''
|| ' , SUM(fb.user_Count) as user_count '
|| 'FROM dv_product.fruit_basket fb GROUP BY 1;'
RETURN;
END;
$$ LANGUAGE plpgsql;

You specify the columns and the output REFCURSOR when calling the procedure. The column names could be retrieved from a table by another stored procedure if needed. Then fetch the output from the REFCURSOR.

BEGIN; 
CALL get_fruit_sum ( 'Banana','Orange','result_set' );
FETCH ALL FROM result_set;
END;
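When a statement is assembled from several string fragments like this, each fragment has to end (or the next one begin) with a space, otherwise keywords run together, for example "GROUP BY 1,2" followed directly by "UNION ALL". One way to check is to assemble the same fragments outside the database and print the result. This is only a sketch in Python: the quote_ident function here is a simplified stand-in that always adds double quotes, whereas the real quote_ident only quotes when needed, and build_fruit_sum_sql is a hypothetical name.

```python
def quote_ident(name):
    """Simplified stand-in for SQL quote_ident: always quote, double any '"'."""
    return '"' + name.replace('"', '""') + '"'


def build_fruit_sum_sql(column_1, column_2):
    """Assemble the statement text from the same fragments as the procedure."""
    return (
        "SELECT " + quote_ident(column_1) + " , " + quote_ident(column_2)
        + " , SUM(fb.user_Count) as user_count "
        + "FROM dv_product.fruit_basket fb GROUP BY 1,2 "  # trailing space matters
        + "UNION ALL "
        + "SELECT " + quote_ident(column_1) + " , 'ALL'"
        + " , SUM(fb.user_Count) as user_count "
        + "FROM dv_product.fruit_basket fb GROUP BY 1;"
    )


sql = build_fruit_sum_sql("Banana", "Orange")
print(sql)
```

Reading the printed statement makes a missing space easy to spot before the procedure is created.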

