-
prepare a local table file named sample.csv having content as below.
35, Amy 17, Ben 5,Chris 10,Don
- create table. -- creating a managed table that "knows" how the flat file are formated.
hive> use testdb; hive> create table test_hive(age int, name string) row format delimited fields terminated by ',';
Note: explicitly specifying the delimited character like "row format delimited fields terminated by ','
" is the key part here as the default delimited character identified by hive is ^A (ASCII code 1). - Loading file. -- load file from local file system into Hive.
hive>use testDB; hive> load data local inpath '/home/hadoopuser/hiveTest/sample.csv' overwrite into table test_hive;
Note: this example is about loading file from linux local file system into Hive that is on HDFS file system. To load file from HDFS file system into Hive, we probably create an external table for it. - Check data. -- select from table.
hive> select * from test_hive; OK 35 Amy 17 Ben 5 Chris 10 Don
Note: You can see that "Amy" and "Ben" are not aligned with rest two rows. It is because hive takes all white spaces at the front of string and behind string as part of column values. So, we need to trim white spaces before and behind strings if white spaces are not expected in hive table.
To determine it. Run below queries to see different results.select * from test_hive where name='Amy'; select * from test_hive where name=' Amy'; select * from test_hive where name=' Amy ';
The above example is an extremely simple one. To check all possible options when a table is created, please check Hive DDL document from Apache and Cloudera.
No comments:
Post a Comment