The DC/OS Foundations Engineering Team builds the core components that make the Data Center Operating System ecosystem tick. From the initial point of customer experience (the installer) to the glue code that underpins how DC/OS services work together, we get to play with every piece of the system.
This team consists mostly of DevOps engineers with a strong slant towards coding. However, all of us come with a strong operations or site reliability background. We feel at ease talking about the finer points of CAP or the differences between multi-threaded and multi-process architectures.
We don't mind getting into the weeds with a hard-to-diagnose networking issue, and we troubleshoot such problems by drawing on our years of frontline firefighting in web operations. The OSI model is no stranger in our day-to-day work, and on any given day we find ourselves working in almost every layer.
Some of us had experience with Mesos before coming on board at Mesosphere, and some of us didn't. However, having a strong understanding of distributed systems and systems engineering is key to our success. We’ve been solving site reliability engineering problems through code since before SRE and DevOps became terms, and we take pride in creating software that people rely on and that is a joy to use.
At Mesosphere, we pride ourselves on building insanely great products. On the DC/OS Foundations team, we like to think that product begins with us.
- Help architect, build and maintain systems engineering code in different languages (we have Python, Golang, Ruby and others)
- Help write unit and integration tests for the products we build – we pride ourselves on TDD
- Contribute to documentation for both our customers and other engineers
- Make DC/OS the easiest operating system to deploy, manage, and monitor at scale
- Take responsibility for third-party services and the production infrastructure on which DC/OS operates
- Partner with other engineers to design, build, and maintain critical systems
- Ship code that is well-tested and survives at internet scale
- Consistently work to make our software simpler
- Effectively estimate time to implement designs
- Challenge yourself and your peers to always improve
- BS or Master’s degree in Computer Science, related degree, or equivalent experience
- Expert-level knowledge of at least one high-level programming language such as Python, Ruby, or Go
- 3+ years of experience with OOP and infrastructure design
- Designed and operated large-scale infrastructure running on AWS or other cloud providers
- Able to debug, troubleshoot, and resolve complex technical issues reported by customers
- Practical knowledge in network programming and/or programming against HTTP APIs
- Background in system administration, operations or site reliability
- Networking generalist with an understanding of network protocols
- Production experience with service-oriented architectures and distributed systems like Mesos, Kafka, Cassandra, Hadoop, Zookeeper, etc.
- Pragmatic and know how to ship high-value features quickly
- An extremely clear, concise and effective communicator
- Deep knowledge of cgroups and Linux fundamentals
- Know Python internals and libraries like the back of your hand
- Worked with container systems like Docker or rkt in production
- Strong sense of ownership, urgency and drive
- Self-driven and motivated, with a strong work ethic and a passion for problem solving
Mesosphere is dedicated to helping enterprises unlock the next generation of datacenter scale, efficiency and automation with Apache Mesos. Mesosphere’s commercial product is the Mesosphere Datacenter Operating System (DCOS), a new kind of operating system that spans the entire datacenter, pools its resources, and automates IT operations. Backed by Andreessen Horowitz, Data Collective, Fuel Capital and Kleiner Perkins Caufield & Byers, Mesosphere is headquartered in San Francisco with a second office in Hamburg, Germany.
To apply for this job, please visit tinyurl.com.