S.S. Tarek

About Me

Hi. I'm an AI/ML and AIOps Engineer, experienced in both R&D and production-grade intelligent systems: from data pipelines and model training to agentic AI and cloud-native deployment.

My work sits at the intersection of research and engineering. I design systems that hold up in production and solve real-world problems. Whether that's an autonomous incident investigation pipeline that posts root cause analysis directly to Slack or a music analysis model deployed inside a Sony library, I focus on the full journey from idea to impact.

Outside engineering, my interests span books, anime, and above all, music. A shade of rock 'n' roll keeps my soul afloat and drives my everyday life.

Experience

Woven by Toyota

via Robert Half

AI/ML & AIOps Engineer

Feb 2026 - Present

Tokyo, Japan

Designed and deployed production-grade AIOps systems on AWS - owning the AI roadmap for cloud and network operations, replacing manual workflows with autonomous AI pipelines.

Built an event-driven RCA pipeline that automatically investigates AWS incidents reported in Slack and posts root cause analysis in real time
Extended the RCA pipeline into a fully agentic system using LangChain and LangGraph with dynamic MCP tool binding, deployed as two production microservices containerized with Docker and orchestrated with Kubernetes
Deployed an AI alert classification system achieving 97% accuracy and 100% recall on Critical alerts, replacing manual review of ~30 daily alerts
Designed an SLO violation forecasting pipeline using Holt-Winters and Monte Carlo confidence scoring - average violation lead-time of 19 days on validation data

Python AWS LangChain LangGraph MCP Docker Kubernetes Terraform FastAPI ChromaDB GitHub Actions
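
The SLO forecasting idea listed above can be illustrated with a minimal sketch. It is not the production pipeline: it assumes an additive Holt-Winters model written in plain Python, a made-up daily error-rate series, illustrative smoothing parameters, and a simple residual-resampling Monte Carlo step that turns the point forecast into a probability of crossing an SLO threshold. The function names `holt_winters_additive` and `violation_probability` are hypothetical.

```python
import random


def holt_winters_additive(series, season_len, alpha=0.3, beta=0.1, gamma=0.2):
    """Fit additive Holt-Winters and return (level, trend, seasonals, residuals)."""
    if len(series) < 2 * season_len:
        raise ValueError("need at least two full seasons of data")
    first, second = series[:season_len], series[season_len:2 * season_len]
    level = sum(first) / season_len
    # Initial trend: average per-step change between the first two seasons.
    trend = (sum(second) - sum(first)) / season_len ** 2
    seasonals = [x - level for x in first]
    residuals = []
    for t in range(season_len, len(series)):
        season = seasonals[t % season_len]
        # One-step-ahead forecast error, recorded before the state is updated.
        residuals.append(series[t] - (level + trend + season))
        prev_level = level
        level = alpha * (series[t] - season) + (1 - alpha) * (level + trend)
        trend = beta * (level - prev_level) + (1 - beta) * trend
        seasonals[t % season_len] = gamma * (series[t] - level) + (1 - gamma) * season
    return level, trend, seasonals, residuals


def violation_probability(series, season_len, threshold, horizon, n_paths=2000, seed=0):
    """Fraction of simulated future paths that exceed `threshold` within `horizon` steps."""
    level, trend, seasonals, residuals = holt_winters_additive(series, season_len)
    rng = random.Random(seed)
    n = len(series)
    hits = 0
    for _ in range(n_paths):
        for h in range(1, horizon + 1):
            point = level + h * trend + seasonals[(n - 1 + h) % season_len]
            # Assumption: noise is drawn independently from the in-sample residuals.
            if point + rng.choice(residuals) > threshold:
                hits += 1
                break
    return hits / n_paths


# Made-up daily error-rate series: upward drift plus a weekend bump.
history = [1.0 + 0.05 * t + (0.2 if t % 7 in (5, 6) else 0.0) for t in range(56)]
score = violation_probability(history, season_len=7, threshold=5.0, horizon=30)
print(f"P(violation within 30 days) = {score:.2f}")
```

Resampling residuals is only one way to attach a confidence score to a forecast; a real system would likely validate the noise model against held-out data.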

Sony Computer Science Laboratories

via Hiperdyne Corporation

AI Engineer - Deep12 Music Analysis AI

Nov 2018 - Apr 2025

Tokyo, Japan

R&D contributor on Deep12, a music analysis AI developed in collaboration with Sony Computer Science Laboratories and deployed as a service in the Sony Music Publishing library.

Designed and trained CNN, LSTM, and Transformer-based models for music search, detection, and prediction tasks - achieving results comparable to published benchmarks
Built a multi-stage NLP pipeline progressing from TF-IDF & Naive Bayes to fine-tuned BERT, improving performance by ~8% over the next-best model
Leveraged AWS Bedrock for synthetic minority-class generation using few-shot prompting with deduplication and topic filtering at scale
Optimized GPU-accelerated data processing pipelines with PyTorch parallelization - improving runtime by 4-5x

Python PyTorch TensorFlow Keras scikit-learn BERT AWS MLflow Docker GitHub Actions

Projects

Case Study

AI-Driven Root Cause Analysis

When an AWS incident report drops in Slack, the system investigates autonomously - with a root cause delivered in the same thread.

Multi-step Lambda pipeline orchestrated by Step Functions - validates, retrieves CloudWatch logs, analyzes with LLM, and posts root cause report to the originating Slack thread
Extended into a fully agentic system with LangChain and LangGraph - stateful orchestration, dynamic MCP tool binding, prompt injection safeguards at the MCP layer
Deployed as two production microservices - FastAPI inference layer and custom MCP tool server, containerized with Docker and orchestrated with Kubernetes
TF-IDF log deduplication reduced LLM input tokens by up to 96% with no loss in analysis quality

Python AWS LangChain LangGraph MCP FastAPI Docker Kubernetes Step Functions
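
As a rough illustration of the TF-IDF log deduplication step, the sketch below drops log lines that are near-duplicates of a line already kept. It is an assumption-laden toy, not the deployed code: the tokenizer (letters only, so IDs and timings are ignored), the smoothed IDF formula, the greedy keep-first strategy, and the 0.9 similarity threshold are all illustrative choices, and the function names are hypothetical.

```python
import math
import re
from collections import Counter


def tokenize(line):
    # Keep alphabetic tokens only, so lines that differ just in IDs,
    # IP addresses, or timings map to the same bag of words.
    return re.findall(r"[a-z]+", line.lower())


def tfidf_vectors(lines):
    """Return one L2-normalized sparse TF-IDF vector (a dict) per line."""
    docs = [Counter(tokenize(line)) for line in lines]
    n = len(docs)
    doc_freq = Counter(tok for doc in docs for tok in doc)
    vectors = []
    for doc in docs:
        vec = {
            tok: tf * (math.log((1 + n) / (1 + doc_freq[tok])) + 1.0)
            for tok, tf in doc.items()
        }
        norm = math.sqrt(sum(w * w for w in vec.values())) or 1.0
        vectors.append({tok: w / norm for tok, w in vec.items()})
    return vectors


def cosine(a, b):
    # Vectors are already normalized, so the dot product is the cosine.
    if len(a) > len(b):
        a, b = b, a
    return sum(w * b.get(tok, 0.0) for tok, w in a.items())


def deduplicate(lines, threshold=0.9):
    """Greedily keep a line only if it is not too similar to any kept line."""
    vectors = tfidf_vectors(lines)
    kept = []
    for i, vec in enumerate(vectors):
        if all(cosine(vec, vectors[j]) < threshold for j in kept):
            kept.append(i)
    return [lines[i] for i in kept]


logs = [
    "ERROR timeout connecting to db host 10.0.0.1 after 3000 ms",
    "ERROR timeout connecting to db host 10.0.0.2 after 3001 ms",
    "INFO health check ok",
    "ERROR timeout connecting to db host 10.0.0.3 after 2999 ms",
]
for line in deduplicate(logs):
    print(line)
```

The greedy pass is quadratic in the number of kept lines, which is acceptable for a single incident's log window but would need an index for larger volumes.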

Case Study

AI-Driven Alert Classification

Replaced manual review of ~30 daily Slack alerts with an automated triage pipeline.

Tiered classification architecture - nearest-neighbor retrieval, regex pattern matching, RAG-based LLM escalation with confidence scoring and a fail-safe default to Critical
97% accuracy, 0.89 macro F1, 100% recall on Critical alerts
Deployed on AWS Lambda with Terraform-provisioned infrastructure and GitHub Actions CI/CD
CloudWatch error-rate alarms with SNS notifications for production pipeline health monitoring

Python AWS Lambda Bedrock ChromaDB OpenSearch Terraform GitHub Actions
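
The tiered architecture above can be sketched as a chain of increasingly expensive checks that falls back to Critical whenever nothing answers with enough confidence. Everything concrete here is an assumption for illustration: the labels, example alerts, regex patterns, and thresholds are made up, `difflib.SequenceMatcher` stands in for embedding-based nearest-neighbor retrieval, and the `llm` argument is a hypothetical callable standing in for the RAG-based LLM step.

```python
import re
from difflib import SequenceMatcher

VALID_LABELS = {"Critical", "Warning", "Info"}

# Hypothetical labeled history used by the nearest-neighbor tier.
LABELED_EXAMPLES = [
    ("RDS CPU utilization above 90% on prod-db", "Critical"),
    ("Certificate expires in 20 days for api.example.com", "Warning"),
]

# Hypothetical patterns used by the regex tier.
PATTERNS = [
    (re.compile(r"completed successfully", re.IGNORECASE), "Info"),
    (re.compile(r"disk usage (?:[89]\d|100)%", re.IGNORECASE), "Warning"),
]


def nearest_neighbor(alert, min_similarity=0.85):
    """Return the label of the most similar known alert, or None if too dissimilar."""
    best_label, best_score = None, 0.0
    for text, label in LABELED_EXAMPLES:
        score = SequenceMatcher(None, alert.lower(), text.lower()).ratio()
        if score > best_score:
            best_label, best_score = label, score
    return best_label if best_score >= min_similarity else None


def regex_match(alert):
    for pattern, label in PATTERNS:
        if pattern.search(alert):
            return label
    return None


def classify(alert, llm=None, min_confidence=0.7):
    """Return (label, tier). Any doubt or failure resolves to Critical."""
    label = nearest_neighbor(alert)
    if label is not None:
        return label, "nearest_neighbor"
    label = regex_match(alert)
    if label is not None:
        return label, "regex"
    if llm is not None:
        try:
            label, confidence = llm(alert)  # expected to return (label, confidence)
            if label in VALID_LABELS and confidence >= min_confidence:
                return label, "llm"
        except Exception:
            pass  # an LLM failure must not let an alert slip through unreviewed
    return "Critical", "fail_safe"


print(classify("RDS CPU utilization above 95% on prod-db"))
print(classify("Unrecognized anomaly in eu-west", llm=lambda alert: ("Info", 0.4)))
```

Defaulting to Critical trades some extra noise for never silently downgrading an alert the system does not understand, which is consistent with prioritizing recall on Critical alerts.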

Case Study

Deep12 - Music Analysis AI

R&D contributor on a music analysis AI deployed inside the Sony Music Publishing library.

Designed and trained CNN, LSTM, and Transformer-based models for audio-based music classification, similarity search, and attribute detection - results comparable to published benchmarks
Built a multi-stage NLP pipeline from TF-IDF & Naive Bayes to fine-tuned BERT - ~8% improvement over the next-best model
Synthetic minority-class generation via AWS Bedrock with few-shot prompting, cosine similarity deduplication, and BERTopic topic filtering
GPU-accelerated PyTorch data processing pipelines - 4-5x runtime improvement

Python PyTorch TensorFlow BERT AWS Bedrock MLflow Docker GitHub Actions

Personal Project

LLM Uncertainty Quantification

Most ML systems tell you what they think. This project asks how confident they really are.

Measured how reliably a model's confidence predicts correctness across math reasoning and factual QA tasks - using 7 token-level and sentence-level signals
AUROC of 0.75 on reasoning tasks and 0.84 on factual QA - uncertainty signals rank errors well above random across both task types
Identified confident ignorance - the model can be highly certain and completely wrong, especially on knowledge-recall tasks
Reproduced entirely on a single consumer GPU using a 4-bit quantized model - no cloud compute required

Python PyTorch Hugging Face Phi-3-Mini GSM8K TriviaQA

View on GitHub
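
To make the evaluation idea concrete, here is a minimal sketch of scoring one token-level signal. It assumes mean token log-probability as the confidence signal (the project uses seven signals; this is only one plausible example) and computes AUROC as the probability that a correct answer receives a higher confidence than an incorrect one. The data is made up and the function names are hypothetical.

```python
def mean_logprob(token_logprobs):
    """Token-level confidence signal: average log-probability of the generated tokens."""
    return sum(token_logprobs) / len(token_logprobs)


def auroc(scores, labels):
    """Probability that a correct answer outranks an incorrect one (ties count half)."""
    correct = [s for s, y in zip(scores, labels) if y]
    wrong = [s for s, y in zip(scores, labels) if not y]
    if not correct or not wrong:
        raise ValueError("need at least one correct and one incorrect answer")
    wins = sum(
        1.0 if c > w else 0.5 if c == w else 0.0
        for c in correct
        for w in wrong
    )
    return wins / (len(correct) * len(wrong))


# Made-up generations: (token log-probabilities, was the answer correct?)
answers = [
    ([-0.1, -0.2, -0.1], True),
    ([-0.4, -0.3], True),
    ([-1.5, -0.9, -1.1], False),
    ([-0.05, -0.1], False),  # confident but wrong
]
scores = [mean_logprob(lp) for lp, _ in answers]
labels = [ok for _, ok in answers]
print(f"AUROC = {auroc(scores, labels):.2f}")
```

An AUROC of 0.5 means the signal ranks errors no better than chance; the last made-up answer shows how a confidently wrong generation pulls the score down.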

Personal Project

PDF RAG API

Upload a PDF. Ask questions. Get grounded answers with exact page citations.

Async ingestion pipeline - long-running processing happens in the background after the request returns, surviving container restarts
Page-level chunking preserves page boundaries, enabling exact citation of the source pages for each answer
Distributed tracing pinpointed the embedding step as the ingestion bottleneck across concurrent requests - not derivable from logs alone
27 passing tests with async route testing and mocked AWS dependencies, CI on every push

Python FastAPI AWS Bedrock S3 Vectors LangChain OpenTelemetry Terraform Docker

View on GitHub
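
Page-level chunking can be sketched as follows. This is a simplified illustration under stated assumptions: page text is supplied as an already-extracted list of strings (PDF parsing, embedding, and retrieval are out of scope), chunks are split on whitespace with a hypothetical `max_chars` limit, and the `Chunk`, `chunk_by_page`, and `cite` names are invented for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Chunk:
    page: int  # 1-based page number the text came from
    text: str


def chunk_by_page(pages, max_chars=500):
    """Split each page into chunks that never cross a page boundary."""
    chunks = []
    for page_number, text in enumerate(pages, start=1):
        current, length = [], 0
        for word in text.split():
            # Flush the current chunk if adding this word would exceed the limit.
            if current and length + 1 + len(word) > max_chars:
                chunks.append(Chunk(page_number, " ".join(current)))
                current, length = [], 0
            length += len(word) + (1 if current else 0)
            current.append(word)
        if current:
            chunks.append(Chunk(page_number, " ".join(current)))
    return chunks


def cite(retrieved_chunks):
    """Format the distinct source pages of the chunks used for an answer."""
    pages = sorted({chunk.page for chunk in retrieved_chunks})
    return ", ".join(f"p. {page}" for page in pages)


pages = ["alpha beta gamma", "", "delta"]
chunks = chunk_by_page(pages, max_chars=10)
print(chunks)
print(cite([chunks[1], chunks[2]]))
```

Because every chunk carries exactly one page number, a citation is just the set of pages behind the retrieved chunks; the cost is that a passage spanning two pages is always split at the boundary.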

Tech Stack

Languages

ML & Deep Learning

Generative AI & NLP

Cloud & Infrastructure

MLOps & APIs

Data Engineering

Contact

Let's build something.