We are looking for engineers with production experience using SQL and Python in a Linux environment on AWS to join our Data Platform team.
Our team is the horizontal layer supporting business intelligence, optimization, machine learning, external & internal reporting, and data APIs. We process and report on hundreds of millions of events and user attributes per day, gathered from an extremely heterogeneous set of input streams.
Our bread and butter is Python and PostgreSQL, but we also use a range of technologies, including Lambda, SNS, SQS, Redis, and DynamoDB for caching and mapping, as well as Redshift, Kinesis, and Spark for large-dataset munging and ad hoc analytics.
Responsibilities: As A Key Member Of Our Data Platform Team, You Will Be
- Developing new data processing infrastructure
- Adding new reporting features (e.g. low-latency data marts, API reporting services, and other custom reporting solutions)
- Building new Spark pipelines and streaming apps, and scaling out existing ones
- Building out new ingest pipelines
- Working with the data-science team to productionize ML pipelines
- Coordinating data models with other engineering teams
- Working with the DevOps team to increase our monitoring and alerting coverage as needed
- Discovering scaling bottlenecks and overcoming them
- Contributing technical specification documents for data architecture projects
- Contributing to ongoing maintenance of existing infrastructure and investigating issues and failures
Requirements
- 5+ years of Python development experience
- 3+ years of PostgreSQL experience, SQL fluency, and a solid understanding of relational data models
- Extensive experience building and managing ETL pipelines for cloud-based projects, from inception to production
- Experience ensuring a high degree of reliability and data integrity for mission-critical systems
- Prior experience working with Spark or another big data processing platform in high-volume environments (including running those solutions on AWS) is a plus
- Experience with large migration projects (moving data and ETL pipelines from one tech stack to another) is a plus
- Excellent communication skills and a collaborative mindset
- Interest in mentoring junior team members