

Junior Data Analyst
M2.0 Communications Inc.
- Quezon City, Philippines: #94 Scout Castor, Quezon City, Metro Manila, Philippines
 - Full time
 
Posted 5 months ago (4 June 2025); application deadline is 25 May 2026
Recruiter was hiring 8 hours ago
Job Description
As a Junior Data Analyst, you will play a crucial role in the data preprocessing phase of our project to fine-tune the Whisper model for Taglish and other languages. Your responsibilities will include collecting, organizing, cleaning, and preparing high-quality multilingual data for model training. You will work closely with the machine learning team to ensure that the data meets the necessary standards for effective model training.
Key Responsibilities:
Data Collection and Organization:
- Gather raw audio files in various formats (e.g., MP3, WAV, FLAC) from diverse sources such as interviews, podcasts, and YouTube videos.
 - Organize files into a structured directory hierarchy, ensuring a clear and consistent file naming convention.
 
Audio Preprocessing:
- Convert audio files to the required format (16kHz mono, 16-bit signed integer WAV) using tools like FFmpeg.
 - Transcribe audio files, either manually or through a transcription service, and store text files with corresponding filenames.
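
To make the conversion target concrete, here is a minimal Python sketch that builds an FFmpeg command for 16 kHz mono, 16-bit signed PCM WAV output. The function names and file paths are hypothetical, and the sketch assumes `ffmpeg` is installed and on the PATH; the actual project tooling may differ.

```python
import subprocess
from pathlib import Path


def build_ffmpeg_command(src: Path, dst: Path) -> list[str]:
    """Return an FFmpeg command converting src to 16 kHz mono 16-bit PCM WAV."""
    return [
        "ffmpeg",
        "-y",                 # overwrite the output file if it already exists
        "-i", str(src),       # input file (MP3, WAV, FLAC, ...)
        "-ar", "16000",       # resample to 16 kHz
        "-ac", "1",           # downmix to a single (mono) channel
        "-c:a", "pcm_s16le",  # 16-bit signed little-endian PCM
        str(dst),
    ]


def convert(src: Path, dst: Path) -> None:
    """Run the conversion; raises CalledProcessError if FFmpeg fails."""
    dst.parent.mkdir(parents=True, exist_ok=True)
    subprocess.run(build_ffmpeg_command(src, dst), check=True)
```

Keeping the command builder separate from the call that runs it makes the settings easy to review and test without invoking FFmpeg.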
 
Data Cleaning and Normalization:
- Clean and normalize text data to address spelling variations, punctuation issues, and formatting inconsistencies.
 - Standardize abbreviations and contractions, and remove special characters or unnecessary symbols.
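
As a rough illustration of the cleaning steps above, the following Python sketch normalizes a transcript line. The abbreviation map and the function name are hypothetical placeholders, and whether to lowercase or strip punctuation at all is an assumption here; the target transcription style would be agreed with the machine learning team.

```python
import re
import unicodedata

# Hypothetical abbreviation map; a real project would maintain its own list.
ABBREVIATIONS = {"dr.": "doctor", "approx.": "approximately"}


def normalize_transcript(text: str) -> str:
    """Lowercase, expand known abbreviations, drop symbols, collapse spaces."""
    text = unicodedata.normalize("NFKC", text).lower()
    # Expand abbreviations token by token before punctuation is removed.
    words = [ABBREVIATIONS.get(word, word) for word in text.split()]
    text = " ".join(words)
    # Keep letters, digits, apostrophes, hyphens, and spaces; drop the rest.
    text = re.sub(r"[^\w\s'-]", " ", text)
    return re.sub(r"\s+", " ", text).strip()
```

Because `\w` is Unicode-aware for Python strings, letters such as "ñ" are preserved rather than stripped as special characters.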
 
Data Segmentation and Labeling:
- Split lengthy audio recordings into smaller, manageable segments.
 - Create and maintain a metadata file that maps audio files to their corresponding transcriptions and alignment details.
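
The segmentation and metadata steps could be sketched in Python as below. This is a simplified example under stated assumptions: it cuts at fixed intervals (a real pipeline would more likely cut at pauses or sentence boundaries), the 30-second default reflects the input window length Whisper operates on, and the column names and function names are hypothetical.

```python
import csv
from pathlib import Path


def segment_bounds(duration_s: float, max_len_s: float = 30.0) -> list[tuple[float, float]]:
    """Split [0, duration_s] into consecutive windows of at most max_len_s."""
    if duration_s <= 0 or max_len_s <= 0:
        return []
    bounds = []
    start = 0.0
    while start < duration_s:
        end = min(start + max_len_s, duration_s)
        bounds.append((start, end))
        start = end
    return bounds


def write_metadata(rows: list[dict], path: Path) -> None:
    """Write one row per segment: audio file, start, end, transcript."""
    fields = ["audio_path", "start_s", "end_s", "text"]
    with path.open("w", newline="", encoding="utf-8") as f:
        writer = csv.DictWriter(f, fieldnames=fields)
        writer.writeheader()
        writer.writerows(rows)
```

For example, `segment_bounds(70.0)` yields three windows: 0 to 30, 30 to 60, and 60 to 70 seconds.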
 
Quality Assurance and Validation:
- Conduct thorough quality checks to validate the dataset for accuracy, consistency, and completeness.
 - Identify and resolve issues in the audio and text data, such as misalignments or incorrect transcriptions.
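
Two of the checks implied above can be automated with the Python standard library: confirming that each WAV file matches the target format, and finding audio files without a transcript (or the reverse). This is a sketch only; the directory layout and function names are assumptions, and it does not detect misaligned or incorrect transcriptions, which still need manual or model-assisted review.

```python
import wave
from pathlib import Path


def check_wav_format(path: Path) -> list[str]:
    """Return a list of format problems; empty means the file matches the target."""
    problems = []
    with wave.open(str(path), "rb") as wav:
        if wav.getframerate() != 16000:
            problems.append(f"sample rate {wav.getframerate()} Hz, expected 16000")
        if wav.getnchannels() != 1:
            problems.append(f"{wav.getnchannels()} channels, expected 1")
        if wav.getsampwidth() != 2:  # 2 bytes per sample = 16-bit audio
            problems.append(f"{8 * wav.getsampwidth()}-bit samples, expected 16")
    return problems


def find_unpaired(audio_dir: Path, text_dir: Path) -> tuple[set[str], set[str]]:
    """Return (audio stems lacking a transcript, transcript stems lacking audio)."""
    audio = {p.stem for p in audio_dir.glob("*.wav")}
    text = {p.stem for p in text_dir.glob("*.txt")}
    return audio - text, text - audio
```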
 
Data Analysis and Reporting:
- Use data analysis techniques to evaluate dataset health and completeness.
 - Provide regular reports on data collection progress, challenges, and recommendations for improvements.
 
Collaboration and Communication:
- Work closely with the machine learning team to address any data-related issues.
 - Provide regular updates on data collection and preprocessing progress.
 
Minimum Qualifications
Qualifications:
- Strong Proficiency in Python: Experience with data manipulation, cleaning, and preprocessing using Python libraries such as Pandas, NumPy, and TensorFlow.
 - Data Cleaning and Preprocessing: Proven ability to clean, organize, and preprocess data for machine learning applications.
 - NLP Knowledge: Familiarity with natural language processing techniques, including text normalization and handling multilingual or code-mixed data.
 - SQL Skills: Experience with SQL for data querying and management.
 - Problem-Solving Skills: Ability to identify and solve complex data-related problems with creativity and efficiency.
 - Work Under Pressure: Capable of handling multiple tasks simultaneously and meeting deadlines in a fast-paced environment.
 - Adaptability: Willingness to learn new tools and techniques as needed for the project.
 - Attention to Detail: Meticulous attention to detail to ensure data accuracy and integrity.
 - Communication Skills: Excellent communication skills to collaborate effectively with cross-functional teams.
 
Desired Skills:
- Familiarity with audio processing tools like FFmpeg.
 - Familiarity with transcription tools and alignment software (e.g., Aeneas, Gentle).
 - Knowledge of Taglish language nuances and variations.
 - Experience with version control systems like Git.
 - Familiarity with code-mixing or multilingual NLP techniques.
 
Perks and Benefits
Paid Vacation Leave
Paid Sick Leave
Maternity & Paternity Leave
Job Summary
- Job Level: Entry Level / Junior, Apprentice
- Job Category: IT and Software
- Educational Requirement: Bachelor's degree graduate
- Recruiter response to application: Once in a while
- Office Address: Don Roces Avenue, Quezon City
 