Responsibilities and Duties
- Evolving our infrastructure platform by building and running large-scale, massively distributed, fault-tolerant systems.
- Working closely with our Product and Development teams to architect and develop first-class infrastructure components.
- Designing and implementing tooling to improve the availability, scalability, observability and latency of our services, which our business-critical internal customers and our external customers use to deploy and operate their own services.
- Administering our infrastructure built on Aliyun, Amazon Web Services and Microsoft Azure.
- Maintaining services once they are live by measuring and monitoring availability, latency and overall system health.
- Scaling systems sustainably through mechanisms like automation, and evolving systems by pushing for changes that improve reliability and velocity.
- Automating key deployment, monitoring, testing, and verification processes.
Desired Skills and Experience
- You have at least 2 years of relevant systems administration experience in large-scale environments
- You're an expert at problem-solving and at analyzing high-load systems
- You have strong knowledge of Aliyun and AWS
- You have experience in networking: VPC, subnets, security groups
- You have experience with compute services: EC2, ECS, Lambda
- You can use developer tools: CodePipeline, CodeBuild, CodeDeploy
- You have knowledge of databases: RDS (MySQL, PostgreSQL, SQL Server), ElastiCache, Elasticsearch
- You have strong knowledge of Terraform for managing Aliyun, AWS or Azure infrastructure
- You have strong knowledge of Linux
- You can speak, read, and write Chinese and English fluently
You’ll be our rockstar if
- You have experience building CI/CD pipelines with GitHub Actions
- You've already worked with configuration management systems (preferably Ansible)
- You've already worked with these technologies: Docker, Kubernetes / Perl, Python, JavaScript, Java / Kafka, Redis
Send your resume to:
lemon.li@parklu.com /
[email protected]