File Operation

Performs file operations, such as copy and move, on different file systems.

Parameters

| Parameter | Description | Required |
| --- | --- | --- |
| File System | `Local` - for operations on the driver node's file system; `DBFS` - for operations on the Databricks file system | True |
| Operation | Operation to perform: Copy or Move | True |
| Recurse | Boolean for performing the operation recursively. Default is False | False |
| Source Path | Path of the source file/directory. E.g.: `/dbfs/source_file.txt`, `dbfs:/source_file.txt` | True |
| Destination Path | Path of the destination file/directory. E.g.: `/dbfs/target_file.txt`, `dbfs:/target_file.txt` | True |
info

You can also perform operations on DBFS files through the Local file system by providing a path under `/dbfs`.
This works because Databricks uses a FUSE mount to provide local access to files stored in the cloud. A FUSE mount is a secure, virtual filesystem.
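As a concrete sketch of that FUSE mapping (note: the helper name `dbfs_to_local` and the sample paths are illustrative, not part of the gem), a `dbfs:/` URI can be rewritten to its `/dbfs/` local equivalent and then handled with ordinary Python file tools such as `shutil`:

```python
import shutil

def dbfs_to_local(path: str) -> str:
    """Rewrite a dbfs:/ URI to its FUSE-mounted /dbfs/ local path.

    Hypothetical helper for illustration only; not part of the gem.
    """
    if path.startswith("dbfs:/"):
        return "/dbfs/" + path[len("dbfs:/"):].lstrip("/")
    return path  # already a local-style path

local_src = dbfs_to_local("dbfs:/Prophecy/example/source/person.json")
print(local_src)  # /dbfs/Prophecy/example/source/person.json

# On a Databricks cluster the FUSE mount makes this a plain local copy:
# shutil.copy(local_src, dbfs_to_local("dbfs:/Prophecy/example/target/person.json"))
```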

Examples


Copy Single File

Example usage of File Operation - 1

```python
def copy_file(spark: SparkSession):
    from pyspark.dbutils import DBUtils
    DBUtils(spark).fs.cp(
        "dbfs:/Prophecy/example/source/person.json",
        "dbfs:/Prophecy/example/target/person.json",
        recurse=False,
    )
```

Copy All Files From A Directory

Example usage of File Operation - 2

```python
def copy_file(spark: SparkSession):
    from pyspark.dbutils import DBUtils
    DBUtils(spark).fs.cp(
        "dbfs:/Prophecy/example/source/",
        "dbfs:/Prophecy/example/target/",
        recurse=True,
    )
```

Copy Entire Directory

Example usage of File Operation - 3

```python
def copy_file(spark: SparkSession):
    from pyspark.dbutils import DBUtils
    DBUtils(spark).fs.cp(
        "dbfs:/Prophecy/example/source/",
        "dbfs:/Prophecy/example/target/source",
        recurse=True,
    )
```
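The Move operation follows the same shape as the Copy examples above. As a minimal local sketch of move semantics (the file ends up at the destination and no longer exists at the source), here is a standard-library illustration, with the corresponding `dbutils` call shown in a comment (the `dbfs:/` paths there are illustrative, not part of the gem's output):

```python
import shutil
import tempfile
from pathlib import Path

# Local-filesystem sketch of a Move: after the move, the file exists
# only at the destination.
workdir = Path(tempfile.mkdtemp())
src = workdir / "person.json"
dst = workdir / "moved"
dst.mkdir()
src.write_text('{"name": "example"}')

shutil.move(str(src), str(dst / "person.json"))

print((dst / "person.json").exists(), src.exists())  # True False

# On Databricks, the equivalent generated code would be:
# DBUtils(spark).fs.mv(
#     "dbfs:/Prophecy/example/source/person.json",
#     "dbfs:/Prophecy/example/target/person.json",
#     recurse=False,
# )
```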