xml (DataStreamReader)

Loads an XML file stream and returns the result as a DataFrame. If schema is not specified, the input schema is inferred from the data.

Syntax

Parameters

| Parameter | Type | Description |
| --- | --- | --- |
| path | str | Path for the XML input. |
| schema | StructType or str, optional | Schema as a StructType or DDL-formatted string (for example, col0 INT, col1 DOUBLE). |

Returns

DataFrame

Examples

Write a DataFrame to XML and read it back as a stream:

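import tempfile
import time

# Assumes an active SparkSession is available as `spark`.
with tempfile.TemporaryDirectory(prefix="xml") as d:
    # Write a one-row DataFrame as XML, with one <person> element per row.
    spark.createDataFrame(
        [{"age": 100, "name": "Hyukjin Kwon"}]
    ).write.mode("overwrite").option("rowTag", "person").xml(d)
    # Stream the files back with an explicit schema and print rows to the console.
    q = spark.readStream.schema(
        "age INT, name STRING"
    ).xml(d, rowTag="person").writeStream.format("console").start()
    # Let the query run briefly, then stop it.
    time.sleep(3)
    q.stop()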
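
To make the role of rowTag in the example above more concrete, the sketch below imitates the element-to-row mapping with Python's standard library. It is not how Spark parses XML. The sample document, including its ROWS root element, is an assumption about the shape of the written file, and rows_for_tag is a hypothetical helper.

```python
import xml.etree.ElementTree as ET

# Assumed shape of the written file: a root element wrapping one <person>
# element per row. The root name "ROWS" is an assumption for this sketch.
SAMPLE = """<ROWS>
  <person><age>100</age><name>Hyukjin Kwon</name></person>
</ROWS>"""


def rows_for_tag(xml_text, row_tag):
    """Return one dict per element named row_tag, keyed by child tag."""
    root = ET.fromstring(xml_text)
    return [
        {child.tag: child.text for child in elem}
        for elem in root.iter(row_tag)
    ]


print(rows_for_tag(SAMPLE, "person"))
# [{'age': '100', 'name': 'Hyukjin Kwon'}]
```

Values come back as strings here. In the Spark example, the schema passed to the reader determines the column types.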