Member of Technical Staff - Edge Inference Engineer

5 months ago

Remote, Worldwide +2 more

Senior / Staff+

H1B Sponsor

Responsibilities

Implement and optimize inference kernels for various edge hardware architectures.
Develop quantization strategies to maximize compression while preserving model quality.
Contribute to open-source inference frameworks like llama.cpp.
Profile and optimize end-to-end inference pipelines for low latency.
Collaborate with ML researchers to identify optimization opportunities.

Requirements

5+ years of experience in systems programming with strong C++ proficiency.
Experience in embedded software engineering or resource-constrained systems.
Understanding of ML fundamentals at the linear algebra level.
Familiarity with hardware architecture concepts like cache hierarchies and memory bandwidth.
Prior contributions to inference frameworks are a plus.

Benefits

Competitive base salary with equity in a unicorn-stage company.
100% coverage of medical, dental, and vision premiums for employees and dependents.
401(k) matching up to 4% of base pay.
Unlimited PTO plus company-wide Refill Days throughout the year.

Categories

AI & ML, Embedded