The data storage model in Apache Spark is based on RDDs (Resilient Distributed
Datasets). RDDs achieve fault tolerance through lineage: each RDD records the
transformations used to build it from other datasets. If a partition of an RDD
is lost due to a failure, Spark uses the lineage to rebuild only that lost
partition.
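
To make the lineage idea concrete, here is a minimal sketch in plain Python. It is a toy model, not Spark's actual implementation: the `ToyRDD` class and its `from_source`, `lose_partition`, and `computed` members are hypothetical names invented for this illustration. Each dataset keeps a recipe for computing a partition from its parent, so a lost partition can be rebuilt on its own.

```python
class ToyRDD:
    """Toy stand-in for an RDD (hypothetical, not the Spark API)."""

    def __init__(self, num_partitions, compute, parent=None):
        self.num_partitions = num_partitions
        self._compute = compute   # lineage: how to build partition i
        self.parent = parent      # the dataset this one was derived from
        self._cache = {}          # partitions currently held in memory
        self.computed = []        # log of partition indices that were built

    @classmethod
    def from_source(cls, chunks):
        # Base dataset: each chunk stands for one partition of stable input.
        return cls(len(chunks), lambda i: list(chunks[i]))

    def map(self, f):
        # Narrow dependency: child partition i depends only on parent partition i.
        parent = self
        return ToyRDD(
            self.num_partitions,
            lambda i: [f(x) for x in parent.partition(i)],
            parent=self,
        )

    def partition(self, i):
        # Build the partition from its lineage only if it is missing.
        if i not in self._cache:
            self.computed.append(i)
            self._cache[i] = self._compute(i)
        return self._cache[i]

    def lose_partition(self, i):
        # Simulate a failure that drops one partition from memory.
        self._cache.pop(i, None)

    def collect(self):
        return [x for i in range(self.num_partitions) for x in self.partition(i)]


source = ToyRDD.from_source([[1, 2], [3, 4], [5, 6]])
squares = source.map(lambda x: x * x)

first = squares.collect()      # builds all three partitions
squares.computed.clear()       # reset the log before simulating a failure
squares.lose_partition(1)      # "lose" the middle partition
second = squares.collect()     # rebuilds only what is missing
rebuilt = list(squares.computed)

print(first)    # [1, 4, 9, 16, 25, 36]
print(rebuilt)  # [1]
```

After the simulated failure, only partition 1 is recomputed; partitions 0 and 2 are served from memory. In real Spark, the recorded lineage of an RDD can be inspected with `toDebugString`.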