Senior Data Engineer
About The Position
About Cellebrite:
Cellebrite’s (Nasdaq: CLBT) mission is to enable its global customers to protect and save lives by enhancing digital investigations and intelligence gathering to accelerate justice in communities around the world. Cellebrite’s AI-powered Digital Investigation Platform enables customers to lawfully access, collect, analyze and share digital evidence in legally sanctioned investigations while preserving data privacy. Thousands of public safety organizations, intelligence agencies and businesses rely on Cellebrite’s digital forensic and investigative solutions—available via cloud, on-premises and hybrid deployments—to close cases faster and safeguard communities. To learn more, visit us at www.cellebrite.com, https://investors.cellebrite.com/investors and find us on social media @Cellebrite.
Position Overview:
We are looking for a Senior Data Engineer to join our Data Platform team and help build a secure, scalable Data Lake powering global search and analytics.
This is a hands-on role combining data engineering and system design, working with large-scale, complex datasets in a production environment.
What You’ll Do & Own:
- Build and maintain end-to-end data pipelines (batch & streaming)
- Design and improve our Data Lake / Lakehouse architecture
- Work with large-scale, complex datasets and optimize processing performance
- Structure data to support search and indexing systems
- Ensure data quality, reliability, and scalability in production
- Take ownership of features—from design to deployment
- Collaborate closely with backend, data, and product teams
Requirements
Responsibilities:
- Design and implement scalable, production-grade data pipelines using Spark and AWS
- Build efficient data ingestion and transformation processes for large-scale datasets
- Enable high-performance data access and querying for analytics and product use cases
- Take end-to-end ownership of data solutions — from design and development to deployment and production support
- Contribute to architecture decisions and continuously improve system design and reliability
- Monitor, troubleshoot, and optimize pipelines for performance, scalability, and cost
Technical Expertise (Must Have):
- 8+ years of experience in data engineering or related fields
- Strong hands-on experience with Spark / PySpark
- Proven experience building and operating data pipelines in production environments
- Solid experience with AWS data services (S3, EMR, Glue, Athena, Lambda, Kinesis)
- Experience working with large-scale, distributed systems
- Strong Python skills
Nice to Have:
- Experience with Data Lakes / Lakehouse architectures (e.g., Iceberg)
- Experience with streaming pipelines, CDC, or real-time processing
- Familiarity with search technologies (Elasticsearch / OpenSearch)
- Experience with Infrastructure as Code (Terraform / CDK)
- Background in high-scale or regulated environments
Mindset:
- Strong ownership mindset, with the ability to independently drive solutions end-to-end
- Ability to review designs and code, provide constructive feedback, and uphold engineering standards
- Strong communication skills, with the ability to clearly explain technical decisions and trade-offs to diverse stakeholders
- Pragmatic and solution-oriented approach to problem-solving
- Comfortable working in a cross-functional, fast-paced environment
What We Offer:
- Competitive compensation and benefits
- Opportunity to work on large-scale, real-world systems
- High ownership and meaningful technical challenges
- Collaborative, experienced team