Databricks Architect - 100% Remote Role

Contract

Role: Databricks Architect
Location: Remote
Duration: 12+ Months

Must-Have Skills:
•       Databricks + AWS
•       Data Modeling & Design
•       PySpark Scripts
•       SQL Knowledge
•       Data Integration
•       Unity Catalog and Security Design
•       Identity federation
•       Auditing and observability (system tables, APIs, external tools)
•       Access control / Governance in UC
•       External locations & storage credentials
•       Personal tokens & service principals
•       Metastore & Unity Catalog concepts
•       Interactive vs production workflows
•       Policies & entitlements
•       Compute types (incl. UC & non-UC, scaling, optimization)

Job Description:
Note: Candidates should have hands-on experience in Databricks + AWS, data
modeling & design, PySpark scripting, SQL, Unity Catalog and security design,
identity federation, auditing and observability (system tables, APIs, external
tools), access control and governance in UC, external locations & storage
credentials, personal access tokens & service principals, metastore & Unity
Catalog concepts, interactive vs. production workflows, policies &
entitlements, and compute types (incl. UC & non-UC, scaling, optimization).

Key Responsibilities:
1. Data Strategy & Architecture Development
•       Define and implement a scalable, cost-effective, high-performance
data architecture aligned with business objectives.
•       Design Lakehouse solutions using Databricks on AWS, Azure, or GCP.
•       Establish best practices for Delta Lake and Lakehouse Architecture.
2. Data Engineering & Integration
•       Architect ETL/ELT pipelines using Databricks Spark, Delta Live Tables
(DLT), and Databricks Workflows.
•       Integrate data from sources such as Oracle Fusion Middleware,
webMethods, MuleSoft, and Informatica.
•       Enable real-time and batch processing using Apache Spark and Delta
Lake.
•       Ensure seamless connectivity with enterprise platforms (Salesforce,
SAP, ERP, CRM).
3. Data Governance, Security & Compliance
•       Implement governance frameworks using Unity Catalog for lineage,
metadata, and access control.
•       Ensure HIPAA, GDPR, and life sciences regulatory compliance.
•       Define and manage RBAC, Databricks SQL security, and access policies.
•       Enable self-service data stewardship and democratization.
4. Performance Optimization & Cost Management
•       Optimize Databricks compute clusters (DBU usage) for cost efficiency.
•       Leverage Photon Engine, Adaptive Query Execution (AQE), and caching
for performance tuning.
•       Monitor workspace health, job efficiency, and cost analytics.
5. AI/ML Enablement & Advanced Analytics
•       Design and manage ML pipelines using Databricks MLflow.
•       Support AI-driven analytics in genomics, drug discovery, and clinical
data.
•       Collaborate with data scientists to deploy and operationalize ML
models.
6. Collaboration & Stakeholder Engagement
•       Align data strategy with business objectives across teams.
•       Engage with platform vendors (Databricks, AWS, Azure, GCP,
Informatica, Oracle, MuleSoft).
•       Lead PoCs, drive Databricks adoption, and provide technical
leadership.
7. Data Democratization & Self-Service Enablement
•       Implement self-service analytics using Databricks SQL and BI tools
(Power BI, Tableau).
•       Foster data literacy and enable data sharing frameworks.
•       Establish robust data cataloging and lineage.
8. Migration & Modernization
•       Lead migration from legacy platforms (Informatica, Oracle, Hadoop) to
Databricks Lakehouse.
•       Design cloud modernization roadmaps ensuring minimal disruption.

Key Skills:
Databricks & Spark:
•       Databricks Lakehouse, Delta Lake, Unity Catalog, Photon Engine.
•       Apache Spark (PySpark, Scala, SQL), Databricks SQL, Delta Live Tables,
Databricks Workflows.
Cloud Platforms:
•       Databricks on AWS (preferred), Azure, or GCP.
•       Cloud storage (S3, ADLS, GCS), VPC, IAM, Private Link.
•       Infrastructure as Code: Terraform, ARM, CloudFormation.
Data Modeling & Architecture:
•       Dimensional modeling, star schema, snowflake schema, Data Vault.
•       Experience with Lakehouse, Data Mesh, and Data Fabric architectures.
•       Data partitioning, indexing, caching, query optimization.
ETL/ELT & Integration:
•       ETL/ELT development with Databricks, Informatica, MuleSoft, and
Apache tools.

To apply for this job, email your details to piyush@empowerprofessionals.com.