Data QA Engineer - Contract Role

Sanas

Quality Assurance
Bengaluru, Karnataka, India
Posted on Sep 17, 2025
Sanas.ai is pioneering the future of human communication. Founded by a team of Stanford researchers and entrepreneurs with deep industry experience, Sanas has developed the world’s first real-time speech transformation platform capable of accent translation, noise elimination, speech enhancement, and cross-language communication.
Sanas makes conversations clearer, more inclusive, and more effective, removing barriers that prevent people from being understood, regardless of accent, background noise, or native language.
Since going to market in 2023, Sanas has scaled at an extraordinary pace, growing from $0 to $32M ARR in under two years, with a projected >$50M ARR by the end of 2025. The company recently recorded its first $10M quarter and is on track to achieve $120M in ARR next year. With a SaaS-based model, Sanas serves some of the world’s largest enterprises, including Comcast, UPS, and UHG. Today, Sanas technology is deployed across more than 17 of the Fortune 500, and growth continues to accelerate.
The company’s valuation is on a clear trajectory toward a multi-billion-dollar market capitalization as it continues to expand into new verticals and product categories. With a TAM that spans all human-in-the-loop communications and beyond, Sanas has the potential to impact every industry and every global interaction.
Sanas is revolutionizing the way we communicate with the world’s first real-time algorithm designed to modulate accents, eliminate background noise, and improve speech clarity. Pioneered by seasoned startup founders with a proven track record of creating and steering multiple unicorn companies, our groundbreaking GDP-shifting technology sets a gold standard.
Sanas is a 200-strong team, established in 2020. In this short span, we’ve secured over $100 million in funding. Our innovation has been supported by the industry’s leading investors, including Insight Partners, Google Ventures, Quadrille Capital, General Catalyst, Quiet Capital, and others. Our reputation is further solidified by collaborations with numerous Fortune 100 companies. With Sanas, you’re not just adopting a product; you’re investing in the future of communication.
As a Data QA Engineer, you will validate and curate the audio and transcription datasets used to train and evaluate AI models. You will review real and synthetic audio for issues like clipping, background noise, and transcription errors, and help identify, reproduce, and document data anomalies. Working closely with research and engineering teams, you will support data quality improvements, enhance validation tools, and contribute to scalable QA workflows, all while ensuring high standards of data hygiene and consistency.

Key Responsibilities:

  • Conduct thorough validation of datasets used in model training and evaluation, focusing on transcription accuracy, metadata integrity, and audio quality.
  • Review real customer calls and synthetic audio to detect data anomalies such as clipping, silence, incorrect speaker tags, or transcription mismatches.
  • Reproduce and document data issues that impact model quality, enabling effective debugging and iteration by research teams.
  • Curate, clean, and manage high-quality datasets from a variety of sources including customer calls, synthetic pipelines, and open-source corpora.
  • Annotate and label audio with quality issues such as background noise, gender mismatches, speech overlap, silence, or segmentation errors.
  • Collaborate with research and engineering teams to enhance data validation tools and scale automation within QA workflows.
  • Ensure high standards of data hygiene, consistency, and reproducibility across all Data QA processes.
  • Support data-related workflows such as data mining, extraction, transformation, and manipulation.

Must have qualifications:

  • 2+ years of experience in Data QA, audio/transcription QA, or related quality assurance fields.
  • Exceptional attention to detail with the ability to identify subtle inconsistencies and data quality issues.
  • Hands-on experience with audio inspection tools like Audacity, Praat, or similar platforms.
  • Familiarity with audio quality issues such as clipping, background noise, and channel imbalance, or a strong willingness to learn them.
  • Proficiency in handling structured data using tools like Excel, Google Sheets, CSV/JSON, and basic scripting in Python or Bash.
  • Strong written communication skills for producing clear, actionable QA documentation and feedback.
  • Knowledge of database languages (e.g., SQL) and experience working with DBMS tools like PostgreSQL.
  • Demonstrated ability to collaborate effectively with ML researchers, product managers, and customer-facing teams.
Joining us means contributing to the world’s first real-time speech understanding platform, which is revolutionizing contact centers and enterprises alike.
Our technology empowers agents, transforms customer experiences, and drives measurable growth. But this is just the beginning. You'll be part of a team exploring the vast potential of an increasingly sonic future.