Article
Data & Platform
Apache Spark — Core Concepts and When to Use It
RDDs, DataFrames, Spark SQL, and the use cases where Spark is the right tool.
Spark
Apache Spark
DataFrames
distributed compute
Spark SQL
Article
Data & Platform
Testing Strategy for Data Pipelines
Unit tests, integration tests, data contract tests, and regression testing for pipelines.
testing
data pipeline
dbt
unit tests
integration tests