Select a sequence of columns: `:` works but not `seq`

In recent versions of data.table, numbers can be used in j to select columns, including expressions such as DT[, 1:2] to select a numeric range of columns. (Note that this syntax does not work in older versions of data.table.)

So why does DT[ , 1:2] work, but DT[ , seq(1:2)] does not? The answer is buried in the source of data.table:::`[.data.table`, which includes these lines:

  if (!missing(j)) {
    jsub = replace_dot_alias(substitute(j))
    root = if (is.call(jsub)) as.character(jsub[[1L]])[1L] else ""
    if (root == ":" ||
        (root %chin% c("-", "!") && is.call(jsub[[2L]]) &&
         jsub[[2L]][[1L]] == "(" && is.call(jsub[[2L]][[2L]]) &&
         jsub[[2L]][[2L]][[1L]] == ":") ||
        (!length(all.vars(jsub)) &&
         root %chin% c("", "c", "paste", "paste0", "-", "!") &&
         missing(by))) {
      with = FALSE
    }

We can see here that data.table automatically sets with = FALSE for you when it detects the `:` function in j. There is no equivalent handling for `seq`, so we have to specify with = FALSE ourselves if we want the seq syntax:

DT[ , seq(1:2), with = FALSE]

Spark: get a column as a sequence for use in a Zeppelin select form

You can try collecting the two columns as (Object, String) tuples and calling toIterable to get the Iterable[(Object, String)] that the select form expects:

val testIter = data.select("file", "id").collect().map(
  x => (x.getAs[Object](0), x.getAs[String](1))
).toIterable
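
You can then pass this to the notebook's dynamic form. A minimal sketch, assuming the snippet above runs in a Zeppelin paragraph where z is the ZeppelinContext:

// Render a dropdown named "file": the first element of each tuple is the
// value returned on selection, the second is the label shown to the user
val selectedFile = z.select("file", testIter)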

How to use DISTINCT while selecting all columns, including a sequence number column?

For two columns, this query is enough:

SELECT name, min(seq_num)
FROM table
GROUP BY name

For more columns, use the ROW_NUMBER analytic function:

SELECT name, col1, col2, ... col500, seq_num
FROM (
  SELECT t.*, ROW_NUMBER() OVER (PARTITION BY name ORDER BY seq_num) AS rn
  FROM table t
)
WHERE rn = 1

Both queries pick a single row per name: the one with the smallest seq_num value.

Scala Spark DataFrame: dataFrame.select multiple columns given a Sequence of column names

val columnNames = Seq("col1", "col2", ..., "coln")

// using the string column names:
val result = dataframe.select(columnNames.head, columnNames.tail: _*)

// or, equivalently, using Column objects
// (requires import org.apache.spark.sql.functions.col):
val result = dataframe.select(columnNames.map(c => col(c)): _*)
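
A self-contained sketch of the first form, with hypothetical data and a local SparkSession, runnable as a script in spark-shell:

import org.apache.spark.sql.SparkSession

val spark = SparkSession.builder().master("local[*]").appName("select-by-seq").getOrCreate()
import spark.implicits._

// Hypothetical frame with three columns, of which we keep the first two
val df = Seq((1, "a", true), (2, "b", false)).toDF("col1", "col2", "col3")
val columnNames = Seq("col1", "col2")

// The head/tail split matches the select(col: String, cols: String*) overload
val result = df.select(columnNames.head, columnNames.tail: _*)
result.show()  // prints only col1 and col2

spark.stop()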

Get table and column owning a sequence

Get the "owning" table and column

ALTER SEQUENCE seqName OWNED BY table.id;

Your ALTER SEQUENCE statement causes an entry in the system catalog pg_depend with the dependency type (deptype) 'a' and a refobjsubid greater than 0, pointing to the attribute number (attnum) in pg_attribute. With that knowledge you can devise a simple query:

SELECT d.refobjid::regclass, a.attname
FROM pg_depend d
JOIN pg_attribute a ON a.attrelid = d.refobjid
AND a.attnum = d.refobjsubid
WHERE d.objid = 'public."seqName"'::regclass -- your sequence here
AND d.refobjsubid > 0
AND d.classid = 'pg_class'::regclass;
  • Double quotes ("") are only needed for otherwise illegal names (mixed case, reserved words, ...).

  • No need to assert that refclassid is of type regclass, since the join to pg_attribute does that automatically.

  • No need to assert that the sequence is a sequence, since schema-qualified object names are unique across the database.

  • No need to join to pg_class or pg_namespace at all.

  • The schema name is only needed to disambiguate or if it's not in the search_path. The same table name (or sequence name, for that matter) can be used in multiple schemas. A cast to the object identifier type regclass observes the current search_path to pick the best match if you omit the schema qualification. If the table is not visible, you get an error message.

  • What's more, a regclass value is displayed as text to the user automatically. (If not, cast to text.) The schema name is prepended automatically where necessary to be unambiguous in your session.

Get the actual "owner" (the role)

To get the role owning a specific sequence, as requested:

SELECT c.relname, u.usename 
FROM pg_class c
JOIN pg_user u ON u.usesysid = c.relowner
WHERE c.oid = '"seqName"'::regclass; -- your sequence here

Run table for all columns in sequence

Excluding the id column, reshape from wide to long using stack, apply table to get the counts (including NAs), transpose so that the column names become rows, then convert the table object to a data frame:

data.frame(rbind(t(table(stack(d[, -1]), useNA = "always"))))
#           X82 X87 X88 NA.
# Col_A_1     1   2   2   0
# Col_A_2     1   3   1   0
# Col_A_3     3   1   0   1
# Col_A_100   1   1   3   0
# NA.         0   0   0   0

Spark DataFrame: how to select columns using Seq[String]

Use:

.select(
  (colsWithoutPlanWeekData.map(c => col(c)) ++ Seq(
    col("bbDemoImpsAttribute.bbDemoImpsAttributes.demoId").as("bbDemoId"),
    col("demoValuesAttribute.demoAttributes.demoId").as("demoId"),
    col("hhDemoAttribute.demoId").as("hhDemoId"))): _*
)

Concatenate the two Seqs before expanding them with : _*, as in the sketch below.
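
A minimal standalone version of the same pattern, with hypothetical column names, assuming df is a DataFrame with a nested payload struct:

import org.apache.spark.sql.Column
import org.apache.spark.sql.functions.col

// Plain column names selected as-is (hypothetical)
val baseCols: Seq[Column] = Seq("a", "b").map(c => col(c))

// Nested fields pulled out under new names (hypothetical)
val extraCols: Seq[Column] = Seq(
  col("payload.demoId").as("demoId"),
  col("payload.hhId").as("hhId")
)

// Concatenate into one Seq[Column], then expand into varargs with : _*
val result = df.select((baseCols ++ extraCols): _*)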

Add a sequence column in a query

I suspect that you are looking to number the rows based on the rate, so use an analytic function like this:

SELECT ref_leger_code, rate, sumbalance, due_date,
       ROW_NUMBER() OVER (PARTITION BY rate ORDER BY due_date ASC) AS sequence
FROM (
  SELECT ref_leger_code, rate, SUM(balance) AS sumbalance,
         TO_CHAR(due_date, 'yyyymm') AS due_date
  FROM tbl_value_temp
  GROUP BY ref_leger_code, rate, TO_CHAR(due_date, 'yyyymm')
);

