Apache Hive is a popular data warehouse system built on top of the Hadoop infrastructure, and it is in high demand for data analytics. Nowadays Hive is used in a large share of data analytics jobs. Its query language is very similar to the SQL of a traditional RDBMS, but the objective of Hive is quite different: it is aimed at batch processing of large datasets rather than at transactional workloads.
In this article, I am going to show you an example of one of the collection data types in Hive, the struct, although we have already seen a complete Hive data type tutorial here. Hive supports four collection data types, listed below.
Collection data types in Hive:
- Array: an index-based collection of elements of the same type.
- Struct: an object-like record whose fields can have different types.
- Map: a collection of key-value pairs.
- uniontype: a value that holds exactly one of several declared data types at a time.
Struct data type in Hive:
It is very similar to a Java object or to a struct in the C language. Unlike an array, whose elements all share one type, a struct contains fields of different types, and each field can be accessed with dot (.) notation, for example product.id
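As a loose analogy only (this is Python, not HiveQL), a struct behaves like a record with named, individually typed fields, while an array is a sequence whose elements share one type. The Product type and its values below are hypothetical, chosen to mirror the product.id example.

```python
from typing import List, NamedTuple


class Product(NamedTuple):
    """Hypothetical record: named fields, each with its own type (struct-like)."""
    id: int
    name: str
    price: float


# A struct-like value: fields are reached by name with dot notation.
product = Product(id=1, name="bat", price=49.5)

# An array-like value: every element has the same type and is reached by index.
prices: List[float] = [49.5, 12.0, 7.25]

print(product.id)  # field access by name
print(prices[0])   # element access by position
```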
Sample Cricket Player Dataset:
Let's create a table to hold the struct type:
create table cricket_players(
  id int,
  team string,
  country string,
  player string,
  match_details struct<total_test:int,total_odi:int,debut_dt:string>
)
row format delimited
fields terminated by '\001'
collection items terminated by '\002'
stored as textfile;
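Because the table uses '\001' between columns and '\002' between the members of the struct, each line of the data file has to follow that layout. Here is a minimal Python sketch of how one such line could be built. The helper name encode_row and the sample values are made up for illustration and are not the article's dataset.

```python
FIELD_SEP = "\x01"       # matches: fields terminated by '\001'
COLLECTION_SEP = "\x02"  # matches: collection items terminated by '\002'


def encode_row(id_, team, country, player, total_test, total_odi, debut_dt):
    """Build one text line in the layout the cricket_players table expects."""
    # The three struct members are joined with the collection separator...
    match_details = COLLECTION_SEP.join([str(total_test), str(total_odi), debut_dt])
    # ...and the struct then occupies a single column among the other fields.
    return FIELD_SEP.join([str(id_), team, country, player, match_details])


# Placeholder values only, not real player statistics.
line = encode_row(1, "Team A", "Country A", "Player A", 10, 20, "2001-01-01")
print(repr(line))
```

Writing such lines to a file, one per row, should give a file in the shape this table definition reads.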
Describe command to verify table creation:
hive> desc cricket_players;
OK
id                  int
team                string
country             string
player              string
match_details       struct<total_test:int,total_odi:int,debut_dt:string>
Time taken: 0.083 seconds, Fetched: 5 row(s)
Load data into cricket_players table:
Putting the data on HDFS:
hadoop fs -put cricket_players /
Loading the dataset:
hive> load data inpath '/cricket_players' into table cricket_players;
Select query:
hive> select * from cricket_players;
Let's access some fields of the struct:
hive> select team, country, player, match_details.total_test as total_test, match_details.total_odi as total_odi from cricket_players;
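To make the dot-notation query concrete, the following Python sketch mimics what it selects from a single raw line of the text file. It only illustrates the layout and is not how Hive executes the query. The parse_row helper and the sample line are hypothetical.

```python
from typing import NamedTuple


class MatchDetails(NamedTuple):
    total_test: int
    total_odi: int
    debut_dt: str


class Row(NamedTuple):
    id: int
    team: str
    country: str
    player: str
    match_details: MatchDetails


def parse_row(line: str) -> Row:
    """Split one delimited line into columns, then the struct into its members."""
    id_, team, country, player, details = line.rstrip("\n").split("\x01")
    total_test, total_odi, debut_dt = details.split("\x02")
    return Row(int(id_), team, country, player,
               MatchDetails(int(total_test), int(total_odi), debut_dt))


# Placeholder line, not real player statistics.
sample = "1\x01Team A\x01Country A\x01Player A\x0110\x0220\x022001-01-01\n"
row = parse_row(sample)

# Same projection as the query: three plain columns plus two struct members.
projected = (row.team, row.country, row.player,
             row.match_details.total_test, row.match_details.total_odi)
print(projected)
```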
That's all it takes to use a struct in Hive in a simple way.