Running (Py)Spark on Airflow Locally & Processing Good/Bad Data Separately (Batch Mode)
Tech Stack: Docker, Airflow, Spark, S3
May 8, 2021

Validation Layer: An Approach to Schema Validation for Big Data
For numerous reasons, having a validation layer in a data platform is critical, with the end goal of having control over what and how data…
May 16, 2020