Senior Performance Engineer

Palo Alto, CA

What's the best way to achieve the fastest standing quarter mile? More horsepower? Better traction? Adding lightness? As a performance engineer, you will be responsible for establishing and driving the critical performance metrics for MongoDB, from macro benchmarks through workload characterizations. You will work with cutting-edge customers (internal and external) to characterize what is changing in the software, capture and analyze results, and determine how to achieve a set of performance goals (code changes, application changes, etc.). You will determine and recommend changes to the core kernel to improve performance. You need to be hands-on and drive the building of frameworks and workloads, as well as the understanding of customer performance and capacity planning issues.
 
Responsibilities

  • Topology setup: Create a library of large-scale topologies that can be deployed to the cloud
  • Load generation: Create tools to generate loads with a precise mix of operations
  • Create a library of loads to run against target systems
  • Response verification: Create tools to measure the response times, throughputs and correctness of responses during load testing
  • System monitoring: Track the utilization of system resources across time at all nodes involved
  • Stress tests: Build an increasingly complex set of stress tests aimed at verifying that systems behave in a healthy fashion
  • Soak tests: Run heavy loads for a period of weeks in an attempt to uncover and fix longevity issues
  • Functional throughput tests: Find the cost of isolated functions by running load tests and measuring throughput at saturation
  • Compare behavior across builds to provide early warning of degraded performance
  • System performance tests: Measure how performance is impacted as a cluster is expanded to hundreds of nodes
  • White box testing: Instrument key code paths. Monitor performance of these code paths from build to build
  • Provide recommendations on how code should be changed or refactored to improve performance

Requirements

  • Bachelor’s Degree in Computer Science, Math, or Engineering
  • 6+ years of hands-on experience in performance testing, data collection, analysis and workload characterization, bottleneck identification, and capacity planning
  • Experience diagnosing the full stack: application, database, OS, storage, and network layers
  • Experience with large-scale, high-volume, distributed 24x7 systems supporting millions of transactions per second
  • Experience in applying appropriate mathematical modeling techniques to derive and validate projected performance improvements
  • Excellent written and verbal communication skills; not afraid of standing in front of 100+ of your peers to discuss and defend your findings
  • Excellent programming knowledge (C or C++ preferred; Java or Scala acceptable)
  • Excellent scripting skills (Shell, JavaScript, Python or Go)
  • Must be an expert in UNIX/Linux operating systems (Windows a bonus)
  • Must have strong problem-solving skills
  • Should have experience with cloud providers (Amazon EC2, Azure, Joyent, Rackspace, SoftLayer)
  • Should have database experience (SQL or non-relational)
  • Should have experience with or knowledge of hypervisor technologies (KVM, VMware, Xen, or Hyper-V)