Dropping Multiple Partitions in Impala/Hive

Dropping multiple partitions in Impala/Hive

1.

Your syntax is wrong.

In the DROP command the partitions should be separated by commas.

Demo

hive> create table t (i int) partitioned by (p int);
OK

hive> alter table t add partition (p=1) partition(p=2) partition(p=3) partition(p=4) partition(p=5);
OK

hive> show partitions t;
OK
partition
p=1
p=2
p=3
p=4
p=5

hive> alter table t drop if exists partition (p=1),partition (p=2),partition(p=3);
Dropped the partition p=1
Dropped the partition p=2
Dropped the partition p=3
OK

hive> show partitions t;
OK
partition
p=4
p=5

2.

You can drop a range.

Demo

hive> create table t (i int) partitioned by (p int);
OK

hive> alter table t add partition (p=1) partition(p=2) partition(p=3) partition(p=4) partition(p=5);
OK

hive> show partitions t;
OK
partition
p=1
p=2
p=3
p=4
p=5

hive> alter table t drop if exists partition (p<=3);
Dropped the partition p=1
Dropped the partition p=2
Dropped the partition p=3
OK

hive> show partitions t;
OK
partition
p=4
p=5

Dropping a range of partitions in HIVE

I tried this syntax it worked.

ALTER TABLE mytable DROP PARTITION (dates>'2018-04-14',dates<'2018-04-16');

Command output:

    Dropped the partition dates=2018-04-15/country_id=107
Dropped the partition dates=2018-04-15/country_id=110
Dropped the partition dates=2018-04-15/country_id=112
Dropped the partition dates=2018-04-15/country_id=14
Dropped the partition dates=2018-04-15/country_id=157
Dropped the partition dates=2018-04-15/country_id=159
Dropped the partition dates=2018-04-15/country_id=177
Dropped the partition dates=2018-04-15/country_id=208
Dropped the partition dates=2018-04-15/country_id=22
Dropped the partition dates=2018-04-15/country_id=233
Dropped the partition dates=2018-04-15/country_id=234
Dropped the partition dates=2018-04-15/country_id=76
Dropped the partition dates=2018-04-15/country_id=83
OK
Time taken: 0.706 seconds

I am using, Hive 1.2.1000.2.5.5.0-157

How do I drop all partitions at once in hive?

There are multiple options, here is one:

alter table schedule_events drop if exists partition (year<>'');

Hive: Extend ALTER TABLE DROP PARTITION syntax to use all comparators

"... To drop a partition from a Hive table, this works:

ALTER TABLE foo DROP PARTITION(ds = 'date')

...but it should also work to drop all partitions prior to date.

ALTER TABLE foo DROP PARTITION(ds < 'date')
This task is to implement ALTER TABLE DROP PARTITION for all of the
comparators, < > <= >= <> = != instead of just for ="

https://issues.apache.org/jira/browse/HIVE-2908

How to Update/Drop a Hive Partition?

You can update a Hive partition by, for example:

ALTER TABLE logs PARTITION(year = 2012, month = 12, day = 18) 
SET LOCATION 'hdfs://user/darcy/logs/2012/12/18';

This command does not move the old data, nor does it delete the old data. It simply sets the partition to the new location.

To drop a partition, you can do

ALTER TABLE logs DROP IF EXISTS PARTITION(year = 2012, month = 12, day = 18);

Hope it helps!

How to drop rows from partitioned hive table?

Table partitions can be overwritten directly from select from itself + WHERE filter. The scenario is quite simple, you do not need any temporary table. Make backup table if you are not sure what will happen.

  1. If you want to drop entire partitions (not overwrite), execute

    ALTER TABLE TableName DROP IF EXISTS
    PARTITION (<partition spec to be dropped>); --check partition spec to be dropped carefully

Skip this if no partitions to be dropped.


  1. Overwrite other partitions with filtered rows:

    set hive.exec.dynamic.partition=true;
    set hive.exec.dynamic.partition.mode=nonstrict;
    set hive.allow.move.on.s3=true; --If you are on Qubole/S3

    insert overwrite table TableName partition (date, source_key ) --partition spec should match table DDL
    select * from TableName
    where <condition> --condition should be True for rows which NOT be deleted

Your code is rather confusing because you created temporary table using LIKE but using different partition specification and selecting * (same order of columns like in original table). Order of columns shold match exactly, partition columns are the last ones, also in the same order.

How do we drop partitions in hive with regex. Is it possible?

Regexp is not supported unfortunately.

You can use all these comparators < > <= >= <> = != maybe it will help. See usage in this answer: https://stackoverflow.com/a/56646879/2700344

See also this jira Extend ALTER TABLE DROP PARTITION syntax to use all comparators

Also one more jira not implemented yet: Extend ALTER TABLE DROP PARTITION syntax to use multiple conditions

Impala supports LIKE in drop partition:

alter table historical_data drop partition (year < 1995, last_name like 'A%');

Created this Jira for adding regexp, please vote in the Jira if you need it.



Related Topics



Leave a reply



Submit