You love Linux, though you may have experience managing other systems too. And you love hardware. You live to build beautifully orchestrated clusters with many machines working together. You feel at home in the Data Center. You've seen firsthand how anything can fail at any time, and you know how to design systems with the necessary fault tolerance. Tough problems excite you and motivate you to go above and beyond. You can decide, for the right reasons, whether to deploy on-premises, choosing and setting up the machines yourself, or to go to AWS/GCP. You look forward to the next software release, to rolling it out and making the new features available. You see the value in optimizing developer workflows. You've got character. If so, we want to talk to you.
You will work closely with a lean team of Software Engineers who write deep, storage-oriented code to handle large-scale infrastructure challenges. You may have already encountered these challenges, so you'll feel at home. You'll be responsible for the company's existing infrastructure and will also be the one executing its large expansion. You'll face some interesting orchestration problems when running at scale, and you will implement testing and integration workflows for the rest of the team. Along with designing our clusters, you will also be responsible for the tooling that our engineers use. We use the latest tools and are strong supporters of open source. As one of our first DevOps hires, you will also help with recruiting, mentoring, and leading the infrastructure team as the company grows.
Our ideal candidate has managed large, production-quality systems with multiple points of failure, contributed to popular open source projects, and developed custom tooling to automate workflows.
- BS in Computer Science or similar. MS/PhD a plus.
- 5+ years of experience in relevant roles.
- Proficient in Linux administration.
- Proficient in setting up physical machines, switches and storage in the Data Center.
- Proficient with configuration management tools: Puppet, Ansible, Salt, Chef.
- Proficient with Python and shell scripting.
- Experience with continuous deployment, live monitoring, dynamic load balancing, and security.
- Experience with continuous integration tools: Buildbot, Travis.
- Experience with Linux package and repository management (Debian- and RedHat-based).
- Experience managing large clusters on the order of hundreds to thousands of machines.
- Competitive compensation package, including a significant equity component.
- Convenient working location with great subway access.
- Breakfast and lunch, plus snacks and beverages throughout the day.
- Sponsorship for top-tier international conferences and seminars.
- Access to a carefully curated Art and Tech library.