
How to skip header rows from a table in Hive?



Suppose that, while processing log files, we find header records at the top of a file, such as:
System=….
Version=…
Sub-version=….
As above, a file may begin with 3 header lines that we do not want to include in our Hive query. To skip them, we can set the table property skip.header.line.count to the number of header lines when creating the table.


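-- The field delimiter is missing in the original statement; ',' is assumed here.
-- skip.header.line.count = 3 makes Hive skip the first 3 lines (the header records).
CREATE EXTERNAL TABLE userdata (
  name STRING,
  job STRING,
  dob STRING,
  id INT,
  salary INT)
ROW FORMAT DELIMITED FIELDS TERMINATED BY ','
STORED AS TEXTFILE
LOCATION '/user/data'
TBLPROPERTIES ("skip.header.line.count"="3");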
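
If the userdata table already exists, the same property can be set without recreating it. The following is a minimal sketch, assuming the userdata table defined above and Hive's ALTER TABLE ... SET TBLPROPERTIES statement; the follow-up query and its LIMIT value are only an illustrative spot-check.

```sql
-- Set the header-skip property on the existing table
-- (assumes the userdata table from the CREATE statement above).
ALTER TABLE userdata SET TBLPROPERTIES ("skip.header.line.count"="3");

-- Spot-check: the returned rows should be data records,
-- not the System/Version/Sub-version header lines.
SELECT * FROM userdata LIMIT 5;
```

Because the table is external, changing this property only affects how Hive reads the files under the table location; the files themselves are not modified.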