Speech Recognition Intern

Sony Research India

Updated on: 17 April 2025

Additional Details

Website

www.sony.com

Work Location

Work-from-home

Job Type

Internship + FTE

Batch

2025 | 2024

Stream Required

Master's (Research) or Ph.D. in deep learning/machine learning with hands-on experience

Salary

30K-40K/Month (Expected) [Stipend]

Job Description

Sony Research India is seeking a dynamic and motivated Speech Recognition Intern to join our innovative research team. As a Speech Recognition Intern, you will have the opportunity to work on cutting-edge projects in the field of speech recognition technologies. This internship is designed for individuals passionate about advancing their skills and knowledge in speech recognition, speech activity detection, speaker diarization, machine learning, and artificial intelligence.

 

Key Responsibilities:

  • Research and Development: Collaborate with our research team to design, implement, and evaluate state-of-the-art speech recognition models (e.g., Whisper, Wav2Vec2, Conformer).
  • Algorithm Optimization: Optimize existing speech recognition models for better accuracy, noise robustness, and latency.
  • Stay Current: Keep up to date with the latest developments in speech recognition and speaker diarization, and contribute insights to enhance the team's knowledge base.

 

Work Location:

  • Remote

 

Duration of the paid Internship:

  • This paid internship will run for 6 months, starting in the first week of May 2025.
  • Working hours: 9:00 to 18:00 (Monday to Friday)

 

Qualification:

Currently pursuing or completed a Master's (Research) or Ph.D. in deep learning/machine learning, with hands-on experience applying Transformer models to audio/speech.

 

Must Have Skills:

  • Strong programming skills in Python and shell scripting.
  • Familiarity with ASR frameworks (e.g., HuggingFace Transformers, ESPnet, SpeechBrain, Kaldi, or OpenAI Whisper).
  • Hands-on experience with deep learning and machine learning tools (PyTorch, TensorFlow, or librosa).
  • Strong foundation in machine learning and signal processing.

 

Good to Have Skills:

  • Knowledge of deep learning for speech (e.g., CTC, encoder-decoder, attention mechanisms).
  • Prior experience developing ASR for Indian languages and noise-robust ASR.

