Job Description
Are you ready to architect the backbone of tomorrow's intelligence? Nexus Horizon Labs is seeking a visionary AI Infrastructure Architect to lead the deployment of next-generation neural networks and high-performance computing clusters. Join a team of elite engineers and researchers dedicated to pushing the boundaries of what is possible in 2026 and beyond.
About the Role:
We are building the infrastructure that powers the next wave of generative AI. You will be responsible for designing resilient, scalable, and ultra-low-latency systems capable of handling petabyte-scale data streams. If you thrive in a fast-paced, innovative environment and want to leave a lasting legacy in the tech industry, this is your opportunity.
Why Join Us?
- Work on cutting-edge technology that defines the future.
- Competitive equity package and top-tier compensation.
- Flexible remote-first culture with premium San Francisco amenities.
Responsibilities
- Architect Scalable Systems: Design and implement robust distributed systems to support large-scale machine learning workloads and real-time inference engines.
- Optimize Performance: Lead initiatives to shorten model training time and improve GPU cluster utilization by up to 40%.
- Quantum Readiness: Integrate hybrid cloud and quantum-ready protocols into existing infrastructure to future-proof our architecture.
- Collaborative Engineering: Partner with data scientists and ML researchers to translate theoretical models into production-grade, high-performance software.
- Disaster Recovery: Develop and maintain comprehensive disaster recovery strategies to meet a 99.999% uptime target for critical AI services.
Qualifications
- Education: Bachelor’s or Master’s degree in Computer Science, Electrical Engineering, or a related technical field.
- Experience: 5+ years in software engineering, with a focus on infrastructure, distributed systems, or high-performance computing.
- Tech Stack: Proficiency in Python, Rust, Go, or C++; deep understanding of Kubernetes, Docker, and AWS/Azure ecosystems.
- AI Expertise: Hands-on experience with large language models (LLMs), MLOps pipelines, and deep learning frameworks (PyTorch, TensorFlow).
- Problem Solving: Exceptional ability to troubleshoot complex performance bottlenecks in high-throughput environments.