Dice is the leading career destination for tech experts at every stage of their careers. Our client, Spadtek Solutions LLC, is seeking the following. Apply via Dice today!
Title: Data Engineer / Data Architect (PySpark | Databricks | Airflow | Data Lakehouse)
Location: San Antonio, TX
Experience: 12+ years
Duration: 6+ months
Visa: H-1B and L2
Position Overview:
We are looking for a highly skilled Data Engineer / Data Architect with strong experience in PySpark, Databricks, Airflow, DBT, and modern Data Lakehouse architectures. The ideal candidate will work on both POC environments and live production projects, supporting on-prem Dell ecosystem setups as well as cloud-based data platforms.
This role involves building scalable data pipelines, optimizing distributed processing, and architecting end-to-end data solutions using Starburst, Databricks, and Airflow.
Key Responsibilities:
Data Engineering & Architecture
Design, build, and optimize data pipelines using PySpark, DBT, and Python.
Architect and implement Data Lakehouse solutions using Databricks.
Work with Starburst engine for distributed SQL query processing.
Develop and maintain Airflow DAGs for orchestration and workflow automation.
Support POC systems and transition them into scalable, production-ready solutions.
Infrastructure & Ecosystem
Work within the on-prem Dell ecosystem, including local servers, storage, and containerized environments.
Manage data ingestion, transformation, and storage across on-prem and cloud-based systems.
Build small-scale POC pipelines using Airflow, containers, and local compute environments.
Project Delivery
Contribute to live project environments, including BW-related data flows (if applicable).
Collaborate with cross-functional teams to define data models, architecture patterns, and best practices.
Ensure data quality, governance, and performance optimization across all pipelines.
Required Skills & Experience:
- 10+ years of experience in Data Engineering / Data Architecture.
- Strong hands-on expertise in:
- PySpark
- Databricks
- Airflow
- DBT
- Python
- Experience working with:
- Starburst / Trino engines
- Data Lakehouse architectures
- On-prem Dell ecosystem (servers, storage, containers)
- Experience building POC systems and scaling them to production.
- Strong understanding of distributed computing, ETL/ELT frameworks, and data modeling.