Site Reliability Engineer - Global Fintech Startup
Crypto.com
2021-12-03 09:02:09
Stratford, California, United States
Job type: full-time
Job industry: Banking & Financial Services
Job description
About Crypto.com
Founded in 2016, Crypto.com today serves over 10 million customers with the world's fastest growing crypto app, along with the Crypto.com Visa Card - the world's largest crypto card program - the Crypto.com Exchange and Crypto.com DeFi Wallet. Recently launched, Crypto.com NFT is the premier platform for collecting and trading NFTs, carefully curated from the worlds of art, design, entertainment and sports. Crypto.com is built on a solid foundation of security, privacy and compliance and is the first cryptocurrency company in the world to have ISO/IEC 27701:2019, CCSS Level 3, ISO27001:2013 and PCI:DSS 3.2.1, Level 1 compliance, and to be independently assessed at Tier 4, the highest level for both NIST Cybersecurity and Privacy Frameworks. With over 2,600 people in offices across the Americas, Europe and Asia, Crypto.com is accelerating the world's transition to cryptocurrency.
Role & Responsibilities:
Be part of the cross-functional Site Reliability team to deliver a system which can serve 100 million active users from around the world. Site Reliability Engineers are empowered to design and implement both technical and non-technical solutions in order to minimize unplanned downtime.
Here are some highlights of the role:
- identify problems and propose technical solutions and processes to solve them
- set up monitoring and improve observability of our Docker-based backend system
- minimize human errors and operational challenges through automation
- define reporting metrics and provide insights to improve system performance and scalability
- work with engineers and system architects to deliver non-functional business requirements
- use third-party monitoring tools, or build whatever tools are needed to get the job done
Requirements:
- 8+ years of working experience in backend development and/or production system maintenance
- working experience in implementing, maintaining and optimizing critical large-scale financial systems
- working experience in leading engineers
- familiar with DevOps
- familiar with autoscaling Docker-based cloud infrastructure
- familiar with application monitoring
- familiar with zero-downtime, scalable distributed system design
- familiar with Linux
- familiar with system networking
- able to work independently and be accountable
- capable of working under pressure
- aggressive in learning and self-improvement
- willing to take criticism and to provide honourable feedback to others
- willing to drive change, both in yourself and in your teammates
Advantage
- deep knowledge of AWS products, resources and resilience setup
- working experience in monitoring and tuning RDBMS performance
- working experience in event-driven microservices system design
- working experience in 24x7 production support operation design
- working experience in stress test setup and execution
- working experience with third-party monitoring platforms (e.g. New Relic, Datadog, AWS CloudWatch)
- working experience in improving Kubernetes observability (e.g. Prometheus, Grafana)
- working experience with message queues and event streams (e.g. RabbitMQ, Kafka)
- working experience in sizing and cost control
- working experience with blue-green and grey deployments
- good at technical writing and communication
Benefits
- We offer an attractive compensation package and the chance to work in the cutting-edge field of fintech.
- Huge responsibilities from Day 1. Be the owner of your own learning curve. The possibilities are limitless and depend on you.
- You get to work in a very dynamic environment and be part of an international team.
- You will be involved in developing brand-new products from scratch and/or developing business in new regions, alongside a talented team.