
Site Reliability Engineer (Application Software)

SpaceX was founded under the belief that a future where humanity is out exploring the stars is fundamentally more exciting than one where we are not. Today SpaceX is actively developing the technologies to make this possible, with the ultimate goal of enabling human life on Mars.

SITE RELIABILITY ENGINEER (APPLICATION SOFTWARE) 

The application software team is the central nervous system of SpaceX – we create mission-critical applications that are used throughout SpaceX to accelerate launch vehicle production and flight, as well as systems that allow Starlink to grow into a fast, reliable, worldwide Internet service.

We are looking for an experienced Site Reliability Engineer to operate and scale custom-built mission-critical software products for engineering, test, and launch. These products are used to deliver the software flying rockets, spacecraft, satellites, and more. Every time a Falcon 9 launches, a Dragon capsule docks with the ISS, or a Starlink satellite connects a new community, the software responsible for it was created with the tools you'll build and maintain.

SpaceX relies on our vehicle software being built quickly and correctly, tested rigorously, and rapidly iterated on. This allows us to pioneer technologies that were science fiction a decade ago; you'll work to ensure that software delivery at SpaceX keeps pace with other engineering efforts, to enable our goal of making humanity multi-planetary. 

Aerospace experience is not required to be successful here; rather, we look for smart, motivated, collaborative engineers who love solving problems and want to make an impact on a super inspiring mission. We are looking for engineers who treat fellow teammates with fairness, respect, and support. You will have full ownership of challenging problems, working with a team of enthusiastic engineers to design and produce solutions that enable SpaceX to move towards our goals at a rapid pace. The success of the missions at SpaceX depends on the software that you and your team produce.

RESPONSIBILITIES: 

  • Deploy, upgrade, operate, maintain, and scale our suite of mission-critical products and services
  • Manage our underlying infrastructure as code and use modern observability tools to tell a complete story of application health 
  • Closely collaborate with software engineers to create highly operable and maintainable products 
  • Engage in and improve the whole software development lifecycle of services -- from inception and design, through deployment, operation, and refinement 
  • Practice sustainable incident response and blameless postmortems 
  • Provide end-user support to vehicle software engineers for products 
  • Participate in the team’s on-call rotation periodically 
  • Focus on performance bottlenecks and performance improvement techniques 

BASIC QUALIFICATIONS: 

  • Bachelor’s degree in computer science, information systems, or an engineering discipline; OR 3+ years of professional experience with site reliability or DevOps without a degree
  • Experience with Linux operating systems

PREFERRED SKILLS AND EXPERIENCE: 

  • 5+ years of DevOps, site reliability engineering, or system administration experience 
  • 3+ years of experience with Python and Python-based development frameworks 
  • Experience with source code and version control tools such as Git or Subversion 
  • Experience with infrastructure as code (IaC) products for automatically managing fleets of servers 
  • Experience with build systems (Make, Bazel/Pants/Buck, Gradle, etc.) and package management tools (pip, npm, etc.) 
  • Experience with both container and virtualization technologies (VirtualBox, KVM, Docker, Kubernetes, vSphere, EC2, GCE) 
  • Experience with Terraform, Ansible, Puppet, or other automation frameworks 
  • Knowledge of TCP/IP networking 
  • Experience with databases and data modeling 
  • Experience with workflow and issue management tools such as JIRA 
  • Ability to work with mission-critical and sensitive systems, with a sense of urgency appropriate to the responsibilities
  • Ability to communicate with customers, peers, management, etc. in both formal and informal situations

ADDITIONAL REQUIREMENTS: 

  • Must be able to work extended hours and weekends as needed 

COMPENSATION AND BENEFITS:

Pay Range:
Site Reliability Engineer/Level I: $120,000.00 - $145,000.00 per year
Site Reliability Engineer/Level II: $140,000.00 - $170,000.00 per year

Your actual level and base salary will be determined on a case-by-case basis and may vary based on the following considerations: job-related knowledge and skills, education, and experience.

Base salary is just one part of your total rewards package at SpaceX. You may also be eligible for long-term incentives, in the form of company stock, stock options, or long-term cash awards, as well as potential discretionary bonuses and the ability to purchase additional stock at a discount through an Employee Stock Purchase Plan. You will also receive access to comprehensive medical, vision, and dental coverage, access to a 401(k) retirement plan, short & long-term disability insurance, life insurance, paid parental leave, and various other discounts and perks. You may also accrue 3 weeks of paid vacation & will be eligible for 10 or more paid holidays per year. Exempt employees are eligible for 5 days of sick leave per year.

ITAR REQUIREMENTS:

  • To conform to U.S. Government export regulations, applicant must be a (i) U.S. citizen or national, (ii) U.S. lawful, permanent resident (aka green card holder), (iii) Refugee under 8 U.S.C. § 1157, or (iv) Asylee under 8 U.S.C. § 1158, or be eligible to obtain the required authorizations from the U.S. Department of State. Learn more about the ITAR here.  

SpaceX is an Equal Opportunity Employer; employment with SpaceX is governed on the basis of merit, competence and qualifications and will not be influenced in any manner by race, color, religion, gender, national origin/ethnicity, veteran status, disability status, age, sexual orientation, gender identity, marital status, mental or physical disability or any other legally protected status.

Applicants wishing to view a copy of SpaceX’s Affirmative Action Plan for veterans and individuals with disabilities, or applicants requiring reasonable accommodation to the application/interview process should notify the Human Resources Department at (310) 363-6000.

SpaceX Glassdoor Company Review: 3.8 stars
CEO of SpaceX: Elon Musk

Average salary estimate: $145,000 per year (est.)
Min: $120,000
Max: $170,000

If an employer mentions a salary or salary range on their job, we display it as an "Employer Estimate". If a job has no salary data, Rise displays an estimate if available.

What You Should Know About Site Reliability Engineer (Application Software), SpaceX

At SpaceX, we believe that a future where humanity explores the stars is thrilling and achievable. As a Site Reliability Engineer (Application Software), you'll play a crucial role in this journey, helping us create and maintain the mission-critical applications that power our operations, from launch vehicle production to worldwide internet service through Starlink. This isn't just about operating software; the tools you build and maintain are used to deliver the software that flies our rockets, spacecraft, and satellites. You'll collaborate with a talented team of engineers who share your passion for problem-solving and making an impact. You'll have the opportunity to take ownership of challenging issues, ensuring that our software is delivered swiftly and efficiently. With your expertise, you'll implement automated solutions, manage infrastructure as code, and monitor and improve the health of our applications using modern observability tools. Engaging directly with our development cycle will equip you to create highly operable and maintainable products. Your role encompasses everything from designing and deploying to maintaining and refining these essential systems. Even if you lack prior aerospace experience, if you are intelligent, motivated, and a team player, we want to hear from you. Join us in our mission to make humanity a multi-planetary species, and let's create the future together at SpaceX.

Frequently Asked Questions (FAQs) for Site Reliability Engineer (Application Software) Role at SpaceX
What does a Site Reliability Engineer do at SpaceX?

A Site Reliability Engineer at SpaceX operates and maintains mission-critical software products essential for engineering, testing, and launch operations. You'll ensure that these applications are reliable, scalable, and efficiently deployed, directly influencing the success of our global initiatives like Falcon 9 and Starlink, which improve our technology and enhance user experiences.

What qualifications do you need for the Site Reliability Engineer position at SpaceX?

To be a Site Reliability Engineer at SpaceX, you need a bachelor’s degree in computer science, information systems, or a related engineering discipline, or equivalent professional experience in site reliability or DevOps. Experience with Linux, Python, and modern infrastructure as code tools is crucial to effectively contribute to our projects.

Is aerospace experience required for the Site Reliability Engineer role at SpaceX?

No, aerospace experience is not a requirement for the Site Reliability Engineer position at SpaceX. We focus on finding smart, collaborative engineers who are eager to solve problems and align with our exciting mission. Your technical skills and positive attitude are what matter most.

What responsibilities does a Site Reliability Engineer have at SpaceX?

As a Site Reliability Engineer at SpaceX, your main responsibilities include deploying and scaling critical software services, collaborating with software engineers, managing infrastructure code, improving software development lifecycles, and practicing solid incident response strategies. You'll directly support the various engineering teams and contribute to SpaceX's operational excellence.

What tools do Site Reliability Engineers at SpaceX use?

Site Reliability Engineers at SpaceX utilize a range of tools including infrastructure as code products like Terraform and Ansible, version control systems like Git, and container technologies such as Docker and Kubernetes. You will also engage with observability tools to ensure the performance and health of our applications.

What is the work culture like for a Site Reliability Engineer at SpaceX?

The work culture for a Site Reliability Engineer at SpaceX is collaborative and innovative. Engineers are encouraged to share ideas and take ownership of complex problems, all within an environment that values respect, support, and the pursuit of a groundbreaking mission in aerospace technology.

What is the compensation range for a Site Reliability Engineer at SpaceX?

The compensation for a Site Reliability Engineer at SpaceX varies based on experience and skills, but you can expect a salary range between $120,000 and $170,000 per year depending on your level. Additional benefits include stock options, medical coverage, and generous vacation time.

Common Interview Questions for Site Reliability Engineer (Application Software)
Can you explain the principles of Site Reliability Engineering?

Site Reliability Engineering emphasizes automating operations and fostering a culture of collaboration between software and operations teams. Highlighting your experience in creating reliable systems, focusing on automation, incident management, and production readiness can showcase your comprehension of these principles.

How do you prioritize issues during site incidents?

During site incidents, I prioritize issues based on their impact on users and overall business objectives. Effectively communicating with the team to assess severity and urgency is crucial. Discuss past experiences where your prioritization led to successful incident resolution.

What tools have you used for monitoring application health?

In my previous roles, I've employed tools like Prometheus, Grafana, and ELK Stack for monitoring application health. I ensure that these tools provide comprehensive observability, allowing us to identify performance bottlenecks and improve our services efficiently.

What is Infrastructure as Code and why is it important?

Infrastructure as Code (IaC) is a management approach where infrastructure setup is managed and provisioned using code. This practice is vital for ensuring consistency, speed, and repeatability in deployments and helps reduce manual errors. Share your experience with tools like Terraform or Ansible to illustrate your understanding.
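
If it helps to make the idea concrete in an interview, the core loop behind most IaC tools can be sketched in a few lines: describe the desired state as data, compare it with what actually exists, and apply only the difference. The sketch below is a toy illustration in plain Python, not how Terraform or Ansible are implemented; `Server`, `plan`, and `apply` are hypothetical names invented for this example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Server:
    """Hypothetical description of one machine (illustration only)."""
    name: str
    cpus: int
    image: str


def plan(desired, actual):
    """Compute the actions needed to turn `actual` into `desired`."""
    desired_by_name = {s.name: s for s in desired}
    actual_by_name = {s.name: s for s in actual}
    actions = []
    for name, server in sorted(desired_by_name.items()):
        if name not in actual_by_name:
            actions.append(("create", server))
        elif actual_by_name[name] != server:
            actions.append(("update", server))
    for name, server in sorted(actual_by_name.items()):
        if name not in desired_by_name:
            actions.append(("delete", server))
    return actions


def apply(actions, actual):
    """Apply planned actions to an in-memory inventory and return the result."""
    by_name = {s.name: s for s in actual}
    for verb, server in actions:
        if verb == "delete":
            by_name.pop(server.name, None)
        else:  # "create" and "update" both set the desired definition
            by_name[server.name] = server
    return sorted(by_name.values(), key=lambda s: s.name)
```

Running `plan` again after `apply` yields no actions, which is the idempotence property that makes code-driven deployments repeatable.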

How do you handle blameless postmortems?

Handling blameless postmortems involves focusing on learning rather than assigning blame. I engage the team in identifying root causes, documenting them, and establishing preventive measures. This cultivates honesty and trust within the team, encouraging everyone to learn from failures.

Can you discuss your experience with containerization?

Certainly! I have extensive experience with Docker, defining microservices architecture, and managing Kubernetes clusters. Discuss specific projects where you improved deployment efficiency or resolved issues through containerization to provide context to your experience.

What strategies do you use for performance optimization?

I typically analyze application performance metrics and leverage profiling tools to pinpoint bottlenecks. Implementing caching strategies, code refactoring, and optimizing database queries have been effective in previous roles. Sharing specific metrics from past projects where you increased performance will strengthen your answer.
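
As one small illustration of the caching strategy mentioned above, Python's standard-library `functools.lru_cache` can memoize an expensive lookup so that repeated requests skip the slow path. This is a minimal sketch under assumptions: `slow_lookup` and `cached_lookup` are hypothetical stand-ins for a real database or service call, and a production cache would also need an invalidation policy.

```python
from functools import lru_cache


def slow_lookup(key: str) -> str:
    """Stand-in for an expensive call such as a database query (hypothetical)."""
    return key.upper()


@lru_cache(maxsize=256)
def cached_lookup(key: str) -> str:
    """Return the result for `key`, reusing earlier results when available."""
    return slow_lookup(key)
```

`cached_lookup.cache_info()` reports hits and misses, which gives you the kind of concrete metric an interviewer will ask about.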

How do you ensure effective collaboration with software engineers?

Effective collaboration begins with open communication and establishing common goals. I regularly participate in development discussions, solicit feedback from engineers, and encourage pair programming to bridge gaps between operations and development.

What experience do you have with version control systems?

I have robust experience with Git for version control, utilizing branches for feature development and merging processes for integration. Properly managing source code allows for seamless collaboration with teams and maintaining code integrity, which is crucial for reliability.

Describe a challenging problem you solved in your previous role as a Site Reliability Engineer.

One challenging problem was recurring downtime during peak usage hours; I identified a performance bottleneck using monitoring tools, implemented autoscaling, and optimized our database queries. This significantly improved our response time and user experience. Highlighting such real-world scenarios can demonstrate your problem-solving skills.


SpaceX, founded by Elon Musk, is an aerospace manufacturer and space transport services company aiming to revolutionize space technology, with the ultimate goal of enabling human life on Mars.

Badges: Future Maker, Office Vibes, Work&Life Balance, Rapid Growth
Culture values: Mission Driven, Social Impact Driven, Passion for Exploration, Reward & Recognition
Salary range: $120,000/yr - $170,000/yr
Employment type: Full-time, on-site
Date posted: November 28, 2024
