Important
This documentation has been retired and might not be updated.
This article shows how to import a Hive table from cloud storage into Azure Databricks using an external table.
Databricks does not recommend using Hive tables for storing or organizing data. This documentation is provided to help you configure a connection to an existing Hive table to migrate or ingest data from an external system.
Issue a SHOW CREATE TABLE <tablename> command on your Hive command line to see the statement that created the table.
hive> SHOW CREATE TABLE wikicc;
OK
CREATE TABLE `wikicc`(
  `country` string,
  `count` int)
ROW FORMAT SERDE
  'org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe'
STORED AS INPUTFORMAT
  'org.apache.hadoop.mapred.TextInputFormat'
OUTPUTFORMAT
  'org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat'
LOCATION
  '<path-to-table>'
TBLPROPERTIES (
  'totalSize'='2335',
  'numRows'='240',
  'rawDataSize'='2095',
  'COLUMN_STATS_ACCURATE'='true',
  'numFiles'='1',
  'transient_lastDdlTime'='1418173653')
If the statement that is returned uses a CREATE TABLE command, copy the statement and replace CREATE TABLE with CREATE EXTERNAL TABLE.
- EXTERNAL ensures that Spark SQL does not delete your data if you drop the table.
- You can omit the TBLPROPERTIES field.

For the wikicc example, drop any existing table definition and re-create it as an external table:
DROP TABLE wikicc
CREATE EXTERNAL TABLE `wikicc`(
  `country` string,
  `count` int)
ROW FORMAT SERDE
  'org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe'
STORED AS INPUTFORMAT
  'org.apache.hadoop.mapred.TextInputFormat'
OUTPUTFORMAT
  'org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat'
LOCATION
  '<path-to-table>'
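To confirm that the new definition is external, you can inspect the table metadata. This is a quick check using the standard DESCRIBE FORMATTED command; the exact output layout varies between Hive and Spark SQL versions, but the table type row should report EXTERNAL.

DESCRIBE FORMATTED wikicc

You can then issue SQL commands on your data: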
SELECT * FROM wikicc
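Because the table is external, dropping it later removes only the metadata; the underlying files at the table location are left in place, and re-running the CREATE EXTERNAL TABLE statement restores access to them. As a further illustration, the queries below are hypothetical follow-ups (the column names come from the wikicc example above, but the filter threshold is made up):

-- `count` is backquoted because it collides with the COUNT function name
SELECT `country`, `count`
FROM wikicc
WHERE `count` > 100
ORDER BY `count` DESC

SELECT COUNT(*) FROM wikicc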