Dynamic partition and static partition of hive

Keywords: hive

Dynamic partition and static partition of hive

1.1 static partition

If the value of the partition is determined, it is called a static partition.
When adding a partition or loading partition data, the partition name has been specified.

create table if not exists day_part1(
uid int,
uname string
)
partitioned by(year int,month int)
row format delimited fields terminated by '\t';

##Load data specified partition

load data local inpath '/root/Desktop/student.txt' into table day_part1 partition(year=2017,month=04);

##Add partition specify partition name

alter table day_part1 add partition(year=2017,month=1) partition(year=2016,month=12);

Dynamic partition: the value of the partition is uncertain, determined by the input data

2.1 related properties of dynamic partition:
hive.exec.dynamic.partition=true: allow dynamic partition
hive.exec.dynamic.partition.mode=strict: partition mode setting
strict: at least one static partition is required
nostrict: can be all dynamic partition
hive.exec.max.dynamic.partitions=1000: maximum number of dynamic partitions allowed
hive.exec.max.dynamic.partitions.pernode =100: the maximum partition allowed to be created by mapper/reducer on a single node

2.2 dynamic zoning

##Create temporary table

create table if not exists tmp
(uid int,
commentid bigint,
recommentid bigint,
year int,
month int,
day int)
row format delimited fields terminated by '\t';

##Load data

load data local inpath '/root/Desktop/comm' into table tmp;

##Create dynamic partition table

create table if not exists dyp1
(uid int,
commentid bigint,
recommentid bigint)
partitioned by(year int,month int,day int)
row format delimited fields terminated by '\t';

##Strict mode

insert into table dyp1 partition(year=2016,month,day)
select uid,commentid,recommentid,month,day from tmp;

##Non strict mode

##Set non strict mode dynamic partition

set hive.exec.dynamic.partition.mode=nostrict;

##Create dynamic partition table

create table if not exists dyp2
(uid int,
commentid bigint,
recommentid bigint)
partitioned by(year int,month int,day int)
row format delimited fields terminated by '\t';

##Load data for non strict mode dynamic partition

insert into table dyp2 partition(year,month,day)
select uid,commentid,recommentid,year,month,day from tmp;

3. Pay attention to the details of the Division

(1) Do not use dynamic partitions as much as possible, because when dynamic partitions are used, the number of reducers will be allocated to each partition. When the number of partitions is large, the number of reducers will increase, which is a disaster to the server.

(2) The difference between the dynamic partition and the static partition is that the static partition will be created no matter whether there is data or not, and the dynamic partition will be created if there is result set, otherwise it will not be created.

(3) , strict mode of hive dynamic partition and strict mode of hive.mapped.mode provided by hive.

hive provides us with a strict pattern: to prevent users from accidentally submitting malicious hql

hive.mapred.mode=nostrict : strict

If the pattern value is strict, three queries are blocked:
(1) , for partitioned table query, the filter field in where is not a partitioned field.

(2) , Cartesian product join query, join query statement, without on condition or where condition.

(3) . for the order by query, if there is an order by query, there is no limit statement.

13 original articles published, 8 praised, 255 visited
Private letter follow

Posted by noisyscanner on Tue, 14 Jan 2020 21:53:44 -0800