hive表中字段显示为NULL时,HDFS文件中存储为\N

时间:2023-03-08 23:02:13
hive表中字段显示为NULL时,HDFS文件中存储为\N

hive数据落地到hdfs,null会默认用'\N'存储

解决方式1:利用命令(这个我没起效果)

alter table adl_cici_test_fdt set serdeproperties('serialization.null.format' = '');

 解决方式2;建表时直接指定(两种方式)

a、用语句
ROW FORMAT SERDE ‘org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe’
with serdeproperties('serialization.null.format' = '')
实现,注意两者必须一起使用,如
CREATETABLE hive_tb (idint,name STRING)
PARTITIONED BY (`day` string,`type` tinyint COMMENT'0 as bid, 1 as win, 2 as ck',`hour` tinyint)
ROW FORMAT SERDE ‘org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe’
WITH SERDEPROPERTIES (‘field.delim’='/t’,‘escape.delim’='//’,serialization.null.format'='' )
STORED AS TEXTFILE;
b、或者通过ROW FORMAT DELIMITED NULL DEFINED AS ''
CREATETABLE hive_tb (idint,name STRING)
PARTITIONED BY (`day` string,`type` tinyint COMMENT'0 as bid, 1 as win, 2 as ck',`hour` tinyint)
ROW FORMAT DELIMITED
NULL DEFINED AS ''
STORED AS TEXTFILE;
如:
create table vip_info( id string, mobile string)row format delimited fields terminated by '\t' NULL DEFINED AS '' stored as textfile;

解决方式3

insert overwrite directory 'hive_test/vip'
ROW FORMAT SERDE 'org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe'
WITH SERDEPROPERTIES (
'field.delim'='\t',
'serialization.format'= '',
'serialization.null.format'=''
) STORED AS TEXTFILE
select * from user;