Hive Table

Reads data from or writes data to Hive tables managed by your workspace's Metastore.

note

Set the provider to Hive on the properties page.

Source

Source Parameters

| Parameter | Description | Required | Default |
| --- | --- | --- | --- |
| Database name | Name of the database | True | |
| Table name | Name of the table | True | |
| Provider | Must be set to hive | True | |
| Filter Predicate | Where clause to filter the table | False | (all records) |

Source Example

Generated Code

Without filter predicate

def Source(spark: SparkSession) -> DataFrame:
    return spark.read.table(f"test_db.test_table")

With filter predicate

def Source(spark: SparkSession) -> DataFrame:
    return spark.sql("SELECT * FROM test_db.test_table WHERE col > 10")
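
The two generated variants differ only in whether a WHERE clause is appended. As a rough sketch of that mapping, the hypothetical helper `build_source_query` below (an illustration only, not part of Prophecy or Spark) builds the query string from the three Source parameters and treats a missing or blank filter predicate as "all records":

```python
from typing import Optional


def build_source_query(database: str, table: str,
                       filter_predicate: Optional[str] = None) -> str:
    """Return a SQL query for a Hive table source.

    Hypothetical helper for illustration; not Prophecy's code generator.
    """
    query = f"SELECT * FROM {database}.{table}"
    # A missing or blank predicate means "all records", the documented default.
    if filter_predicate and filter_predicate.strip():
        query += f" WHERE {filter_predicate.strip()}"
    return query


print(build_source_query("test_db", "test_table"))
print(build_source_query("test_db", "test_table", "col > 10"))
```

The returned string could then be passed to `spark.sql(...)`. The helper does no escaping or validation, so it is only suitable for trusted parameter values.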

Target

Target Parameters

| Parameter | Description | Required | Default |
| --- | --- | --- | --- |
| Database name | Name of the database | True | |
| Table name | Name of the table | True | |
| Custom file path | Use custom file path to store underlying files | False | |
| Provider | Must be set to hive | True | |
| Write Mode | How to handle data that already exists in the table (see the write modes listed below) | True | |
| File Format | File format to use when saving data | True | parquet |
| Partition Columns | Columns to partition by | False | (empty) |
| Use insert into | If true, use .insertInto instead of .save when generating code | False | false |

The Prophecy-provided Hive catalog supports the following write modes.

| Write Mode | Description |
| --- | --- |
| overwrite | If data already exists, existing data is expected to be overwritten by the contents of the DataFrame. |
| append | If data already exists, contents of the DataFrame are expected to be appended to existing data. |
| ignore | If data already exists, the save operation is expected not to save the contents of the DataFrame and not to change the existing data. This is similar to a CREATE TABLE IF NOT EXISTS in SQL. |
| error | If data already exists, an exception is expected to be thrown. |
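
To make the four modes concrete, here is a toy model in plain Python that treats a table as a list of rows. It is a sketch of the semantics described above, not Spark or Prophecy code; the function name `apply_write_mode` and the convention that `None` means "no data exists yet" are assumptions made for this illustration.

```python
from typing import List, Optional


def apply_write_mode(existing: Optional[List[dict]], incoming: List[dict],
                     mode: str) -> List[dict]:
    """Toy model: return the table contents after a write.

    `existing` is None when no data exists yet (an assumption of this sketch).
    """
    if mode not in ("overwrite", "append", "ignore", "error"):
        raise ValueError(f"unknown write mode: {mode}")
    if existing is None:
        # Nothing to conflict with: every mode just writes the incoming rows.
        return list(incoming)
    if mode == "overwrite":
        return list(incoming)             # existing rows are replaced
    if mode == "append":
        return existing + list(incoming)  # incoming rows are added
    if mode == "ignore":
        return list(existing)             # the write is skipped
    raise RuntimeError("data already exists")  # mode == "error"


old_rows = [{"id": 1}]
new_rows = [{"id": 2}]
for mode in ("overwrite", "append", "ignore"):
    print(mode, apply_write_mode(old_rows, new_rows, mode))
```

In this model, only `error` fails, and only when data is already present. A first write to a new table succeeds in every mode.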

The Prophecy-provided Hive catalog supports the following file formats when writing.

  1. Parquet
  2. Text file
  3. Avro
  4. ORC
  5. RC file
  6. Sequence file

Target Example

Generated Code

def Target(spark: SparkSession, in0: DataFrame):
    in0.write\
        .format("hive")\
        .option("fileFormat", "parquet")\
        .mode("overwrite")\
        .saveAsTable("test_db.test_table")
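
The example above covers only the file format and write mode. The sketch below shows one way the Partition Columns and Use insert into parameters might map onto standard PySpark `DataFrameWriter` calls (`partitionBy`, `insertInto`). The helper name `write_hive_table` and its signature are hypothetical, and the code Prophecy actually generates for these options may differ.

```python
from typing import Any, Sequence


def write_hive_table(df: Any, database: str, table: str,
                     file_format: str = "parquet",
                     write_mode: str = "overwrite",
                     partition_columns: Sequence[str] = (),
                     use_insert_into: bool = False) -> None:
    """Write `df` (expected to be a pyspark.sql.DataFrame) to a Hive table.

    Hypothetical helper for illustration; not Prophecy's generated code.
    """
    full_name = f"{database}.{table}"
    if use_insert_into:
        # insertInto writes into an existing table and matches columns by
        # position. Spark rejects partitionBy in combination with it, so only
        # the write mode is set here.
        df.write.mode(write_mode).insertInto(full_name)
        return
    writer = (
        df.write.format("hive")
        .option("fileFormat", file_format)
        .mode(write_mode)
    )
    if partition_columns:
        writer = writer.partitionBy(*partition_columns)
    writer.saveAsTable(full_name)
```

With a real DataFrame, a call such as `write_hive_table(in0, "test_db", "test_table", partition_columns=["year"])` would issue the same writer chain as the generated code, plus a `partitionBy("year")` step.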