At the forefront of digital innovation, we're seeking highly skilled professionals to spearhead data engineering efforts. With a focus on harnessing the potential of Google Cloud Platform (GCP), our ideal candidate will drive the development and implementation of cutting-edge data solutions.
Job Description
Our team is responsible for designing and managing comprehensive data pipelines that integrate diverse sources, ensuring seamless data flow across various environments.
- Data Ingestion & Processing: Efficiently collect, process, and transform large datasets using Hadoop clusters, Hive for querying, and PySpark for data transformations.
- Infrastructure Management: Deploy and manage infrastructure on GCP, including network architectures, security implementations, DNS configuration, VPN, and load balancing.
- Core GCP Services Management: Work extensively with services such as Google Kubernetes Engine (GKE), Cloud Run, BigQuery, Compute Engine, and Composer, all managed through Terraform.
- Application Implementation: Develop and implement Python applications for various GCP services, ensuring seamless integration and optimal performance.
- CI/CD Pipeline Integration: Build and manage CI/CD pipelines in GitLab Magenta to automate cloud deployment, testing, and configuration of the data pipelines.
- Comprehensive Security Measures: Implement robust security measures, manage IAM policies, store secrets in Secret Manager, and enforce identity-aware policies.
- Data Integration: Integrate data from sources including CDI Datendrehscheibe (FTP servers), TARDIS APIs, and Google Cloud Storage (GCS).
- Multi-environment Deployment: Create and deploy workloads across Development (DEV), Testing (TEST), and Production (PROD) environments.
- AI Solutions Implementation: Implement AI solutions with Google's Vertex AI for building and deploying machine learning models.
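To give candidates a feel for the ingestion-and-processing work above, here is a minimal sketch of a typical cleanse-and-aggregate step. In production this kind of logic would run as a PySpark job on a Hadoop cluster; plain Python is used here purely for illustration, and all field and function names are hypothetical.

```python
# Illustrative sketch of a cleanse-and-aggregate pipeline stage.
# In production this would be a PySpark job on Hadoop; field names are hypothetical.
from collections import defaultdict

def aggregate_usage(records):
    """Group raw usage records by customer and sum their byte counts,
    skipping malformed rows rather than failing the whole batch."""
    totals = defaultdict(int)
    for rec in records:
        customer = rec.get("customer_id")
        size = rec.get("bytes")
        if customer is None or not isinstance(size, int):
            continue  # drop malformed rows (basic data cleansing)
        totals[customer] += size
    return dict(totals)

raw = [
    {"customer_id": "c1", "bytes": 100},
    {"customer_id": "c2", "bytes": 50},
    {"customer_id": "c1", "bytes": 25},
    {"bytes": 10},  # malformed: missing customer_id, silently skipped
]
print(aggregate_usage(raw))  # → {'c1': 125, 'c2': 50}
```

The same shape translates directly to a PySpark `groupBy(...).agg(...)` over a much larger dataset.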
The successful candidate will possess strong expertise in GCP, including network architectures, security implementations, and management of core GCP services. Proficiency with data processing tools such as Hive and PySpark, and with orchestration tools such as Airflow, is also essential.
We're looking for a highly skilled Data Engineer who can leverage their expertise to drive the success of our organization. If you're passionate about working with cutting-edge technology and have a proven track record of delivering high-quality results, we encourage you to apply.