Runbook

Spark HDFS Connectivity Issue Incident

Overview

A Spark HDFS connectivity incident occurs when Apache Spark cannot reliably reach the Hadoop Distributed File System (HDFS). Spark may be unable to read or write data stored in HDFS, or jobs may run slowly because the connection is degraded. Common causes include network problems, misconfiguration, and software bugs. Because the incident can disrupt every Spark application that depends on HDFS, it needs prompt attention.

Parameters

Debug

Check if Spark is running
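
One way to make this check concrete is to scan the process table for Spark JVMs. The sketch below assumes a Linux host where `ps -eo pid,args` is available and where Spark daemons or drivers run locally (for example in standalone mode). The helper names are illustrative and not part of Spark.

```python
import subprocess

# Substring shared by Spark's JVM entry points (Master, Worker, SparkSubmit, ...).
SPARK_MARKER = "org.apache.spark."


def find_spark_processes(ps_output):
    """Return (pid, command) pairs for lines that look like Spark JVMs."""
    matches = []
    for line in ps_output.splitlines():
        parts = line.strip().split(None, 1)
        # Skip the header row and any line without a command column.
        if len(parts) != 2 or not parts[0].isdigit():
            continue
        pid, command = parts
        if SPARK_MARKER in command:
            matches.append((int(pid), command))
    return matches


def list_spark_processes():
    """Run ps and filter its output; assumes a POSIX-style ps."""
    result = subprocess.run(
        ["ps", "-eo", "pid,args"],
        capture_output=True, text=True, check=True,
    )
    return find_spark_processes(result.stdout)


if __name__ == "__main__":
    processes = list_spark_processes()
    if processes:
        for pid, command in processes:
            print(pid, command[:120])
    else:
        print("No Spark JVM found on this host")
```

An empty result only means no Spark JVM runs on this particular host. When Spark runs on YARN, `yarn application -list` is likely the more useful view.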

Check if HDFS is running
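
A possible way to check HDFS health is to run `hdfs dfsadmin -report` and count the DataNodes it lists. The sketch below assumes the `hdfs` CLI is on the PATH, that the current user is allowed to run the report (it may require HDFS superuser rights), and that the report contains lines of the form `Live datanodes (N):`, which can differ between Hadoop versions. The function names are hypothetical.

```python
import re
import subprocess


def count_datanodes(report):
    """Extract live and dead DataNode counts from `hdfs dfsadmin -report` text.

    Returns a dict; a value is None when the matching line is absent.
    """
    counts = {}
    for state in ("Live", "Dead"):
        match = re.search(rf"^{state} datanodes \((\d+)\):", report, re.MULTILINE)
        counts[state.lower()] = int(match.group(1)) if match else None
    return counts


def hdfs_report():
    """Run the report and return its text, raising if the command fails."""
    result = subprocess.run(
        ["hdfs", "dfsadmin", "-report"],
        capture_output=True, text=True, timeout=60,
    )
    if result.returncode != 0:
        raise RuntimeError(result.stderr.strip() or "hdfs dfsadmin -report failed")
    return result.stdout


if __name__ == "__main__":
    print(count_datanodes(hdfs_report()))
```

A live count of zero, or a command that fails or times out, suggests the NameNode or DataNodes are down rather than a Spark-side problem.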

Check if Spark can connect to HDFS via Hadoop command line

Check if HDFS can be accessed via Hadoop command line
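
The command-line checks above can be sketched as a single helper that lists an HDFS path, once as the current user and once as the account Spark runs under. This is a minimal sketch: the path `/` and the `spark` account name are placeholders, and it assumes `hdfs` and `sudo` are installed on the host where Spark runs.

```python
import subprocess


def build_ls_command(path, run_as=None):
    """Build an `hdfs dfs -ls` command, optionally wrapped in `sudo -n -u <user>`."""
    command = ["hdfs", "dfs", "-ls", path]
    if run_as:
        # -n keeps sudo from blocking on a password prompt.
        command = ["sudo", "-n", "-u", run_as] + command
    return command


def can_list(path, run_as=None, timeout=30):
    """Return (ok, detail); ok is True when the listing exits with status 0."""
    command = build_ls_command(path, run_as)
    try:
        result = subprocess.run(
            command, capture_output=True, text=True, timeout=timeout,
        )
    except FileNotFoundError:
        return False, f"command not found: {command[0]}"
    except subprocess.TimeoutExpired:
        return False, f"no response within {timeout}s"
    detail = "ok" if result.returncode == 0 else result.stderr.strip()
    return result.returncode == 0, detail


if __name__ == "__main__":
    # "/" and the "spark" account are placeholders for this sketch.
    print("current user:", can_list("/"))
    print("spark user:", can_list("/", run_as="spark"))
```

If the listing works for the current user but fails for the Spark account, permissions or that account's environment are the likelier cause. If it fails for both, the problem is more likely in HDFS itself or in the network path to it.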

Check for incorrect configuration of Spark or HDFS settings
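
For the configuration side, one useful data point is the `fs.defaultFS` value that Hadoop clients, including Spark, read from `core-site.xml`. The sketch below parses that file and reports the scheme, host, and port. The fallback directory `/etc/hadoop/conf` is an assumption, and the helper names are hypothetical.

```python
import os
import xml.etree.ElementTree as ET
from urllib.parse import urlparse


def read_property(xml_text, name):
    """Return the value of a Hadoop-style <property> entry, or None if absent."""
    root = ET.fromstring(xml_text)
    for prop in root.iter("property"):
        if prop.findtext("name", default="").strip() == name:
            return prop.findtext("value", default="").strip()
    return None


def describe_default_fs(xml_text):
    """Summarise fs.defaultFS as (scheme, host, port); port may be None."""
    value = read_property(xml_text, "fs.defaultFS")
    if value is None:
        return None
    parsed = urlparse(value)
    return parsed.scheme, parsed.hostname, parsed.port


if __name__ == "__main__":
    # The fallback directory is an assumption; adjust it to the local layout.
    conf_dir = os.environ.get("HADOOP_CONF_DIR", "/etc/hadoop/conf")
    with open(os.path.join(conf_dir, "core-site.xml"), encoding="utf-8") as handle:
        print(describe_default_fs(handle.read()))
```

A missing property is worth noting because Hadoop then falls back to the local `file:///` filesystem. A host or port that differs from the actual NameNode address points to a stale or wrong configuration on the Spark host.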

Repair

Restart the Spark and HDFS services, then check whether the connectivity issue is resolved.
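
A restart could be scripted roughly as follows. This sketch assumes a Spark standalone cluster and an HDFS installation managed through the `sbin` scripts shipped with each project. Clusters managed by systemd, YARN, or a vendor console will need different commands. The fallback install paths are placeholders, and the plan runs in dry-run mode unless told otherwise.

```python
import os
import subprocess


def restart_plan(spark_home, hadoop_home):
    """Return the restart scripts in order: stop Spark, restart HDFS, start Spark."""
    return [
        os.path.join(spark_home, "sbin", "stop-all.sh"),
        os.path.join(hadoop_home, "sbin", "stop-dfs.sh"),
        os.path.join(hadoop_home, "sbin", "start-dfs.sh"),
        os.path.join(spark_home, "sbin", "start-all.sh"),
    ]


def run_plan(scripts, dry_run=True):
    """Print each script; execute it only when dry_run is False."""
    for script in scripts:
        print("would run:" if dry_run else "running:", script)
        if not dry_run:
            subprocess.run([script], check=True)  # stop at the first failure


if __name__ == "__main__":
    plan = restart_plan(
        os.environ.get("SPARK_HOME", "/opt/spark"),    # fallback path is an assumption
        os.environ.get("HADOOP_HOME", "/opt/hadoop"),  # fallback path is an assumption
    )
    run_plan(plan, dry_run=True)
```

Spark is stopped first and started last so that it does not try to reach HDFS while HDFS is down. After the restart, repeating the Debug checks confirms whether connectivity is back.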