About Me
I am a graduate student in the Biomedical Engineering (BME) Department at Rensselaer Polytechnic Institute, studying multimodal machine learning for medical imaging and beyond. Before that, I was an undergraduate at UC Berkeley, majoring in Data Science with a domain emphasis in Cognition.
I am a team-oriented, research-focused graduate student with data science internship experience solving real-world problems in IT operations, neurotechnology, e-commerce, and healthcare.
I have used statistical and deep-learning-based AI models in NLP, computer vision, and medical imaging.
I am interested in basic exploratory data analysis, particularly to answer research questions.
I am also interested in working with domain experts to learn new topics, analyze their data, and communicate the findings to the relevant business stakeholders.
My M.S. thesis carves out a new space for inference-time fact-checking of generative AI in chest radiology. I built a new fact-checking model, a supervised contrastive multi-label regression network, trained on a large synthetic dataset of real and fake pairings of chest radiology images with report findings. The dataset, called RadCheck, contains 24 million pairs, has been open-sourced on HuggingFace, and is the largest to date for chest X-rays. The model is agnostic to report generators and can auto-correct radiology reports produced by frozen report generator models. The work was recently accepted as a Master's thesis at RPI and has spawned 4 publications appearing in MICCAI, ISBI, NeurIPS, etc.
Please see my CV for details.
Education/Courses Taken
M.S. Biomedical Engineering, Dept. of Biomedical Engineering, Rensselaer Polytechnic Institute (RPI), August 2025.
Published M.S. thesis on Image-Driven Fact-Checking of AI Generated Chest Radiology Reports.
This thesis showed that a discriminative fact-checking neural model can be used at inference time to detect and correct errors in generative AI radiology reports, independent of the report generator. It led to a 15-40% improvement in the quality of generative AI chest radiology reports for clinical workflows.
B.A. Data Science, University of California, Berkeley, May 2022
Studied Data Science with a domain emphasis in Cognition at UC Berkeley.
Awards
Best Video Editing Award, Short Film: EXIST, RPI.
UPAC Student Film Festival, RPI, New York, March 2025.
Runner up, 2020 Undergraduate HCE Essay Prize, UC Berkeley.
For an essay presenting independent, original research on social, cultural, policy, or ethical issues at the intersection of data science, technology, and society.
COSMOS Video Game Cluster
Selected for the competitive California Summer School for Math & Science (COSMOS) at UC Santa Cruz.
Skills
Machine Learning
Large language models (LLMs) and VLMs (e.g. GPT4, Llama3.2, Granite Vision, LLaVA, Idefics2), VLM encoder models (CLIP, OpenCLIP, SigLIP2), language encoder models (Word2Vec, SBERT, BERT, BGE), deep learning computer vision models (U-Net, ResNet101, VGG-16), other generative AI models (diffusion models), segmentation models (U-Net variants)
Data Science:
Statistical ML, Data Science, Data Analysis tools, statistical methods (A/B Testing, Regression), Data Mining, Data ETL, Data Visualization, Libraries (Sklearn, Pandas, Seaborn, Pytorch, Matplotlib, Numpy, NLTK)
Programming Languages:
Python, R, Java, Matlab, SQL and associated IDEs.
DevOps skills
GitHub repository management, issue tracking, pull-requests, pytest QA and automation
Tools, Libraries & Databases
Jupyter Notebook, TensorFlow, Keras, PyTorch, Eclipse, Visual Studio, Sublime, Deepnote, IntelliJ, MySQL, Vim, Vi, Logisim, ITKSnap, 3D Slicer, Adobe Premiere Pro, Unity, Maya
Pandas, Scikit-learn, SciPy, Seaborn, NumPy, NLTK, Gensim, Matplotlib (Pyplot), Tidyverse, SimpleITK, OpenCV.
Cloud Platform Technologies
Docker, Linux, Git, Webservers (Tomcat), Big Data (S3, Hadoop), AWS, Familiarity with DevOps
Other:
Independent research skills, technical presentations, peer reviewing papers for discussion groups, experience with journal submissions and articles.
Technical writing, creative writing, video editing, projects and artistic critiques on Medium
Research/Academic Experience
Fact-checking for Generative AI, RPI Graduate Research May 2023 – May 2025
Independent research on image-driven fact-checking of AI-generated textual reports for chest X-ray imaging under guidance of Dr. Pingkun Yan (RPI) and radiologist Dr. Mannudeep Kalra (Harvard/MGH)
Developed a new supervised contrastive regression network as a phrase-grounded fact-checking model to correct generative AI radiology reports. Developed a dataset of 24 million real/fake image-finding pairs and open-sourced it on HuggingFace. 2 patents filed. Showed a 15-40% quality improvement on AI reports.
Produced 4 refereed papers, at MICCAI 2025, ISBI 2025, MLMI 2023 (published), and NeurIPS 2025 (under review).
IBM Watson (Jan-May'21)
Gained NLP research experience in the IBM Watson NLP team led by Dr. Rama Akkiraju, CTO of Watson AI Ops and mentored by Xiatong Liu, Data Science Manager.
Developed a novel log anomaly classification algorithm combining BERT language modeling of IT logs with supervised contrastive learning.
The resulting algorithm achieved an overall accuracy of 97.32% on a dataset of 10,000 HDFS system logs and outperformed other machine learning algorithms. The resulting paper was later accepted at an IEEE Big Data 2022 workshop.
Work involved using the spaCy NLP library, BERT sentence transformers from Hugging Face on PyTorch, and a supervised contrastive learning model modeled after a NeurIPS 2020 paper.
Academic Development Committee Mentor, Data Science Society (DSS), UC Berkeley (Aug'21-present)
Selected to teach Data Science basics and mentor Berkeley undergraduates on Data Science Capstone research projects.
Facilitated discussion on specific Data Science topics through mini-lectures and curated jupyter notebooks.
Mentored two groups of 5 students on Data Science Capstone projects.
Expanded practical knowledge of EDA, visualization, modeling, machine learning, hypothesis testing.
Lab Assistant/Academic Intern - CS/Data Science, UC Berkeley (Aug'2019-May'2021)
Facilitated students’ introductory Berkeley CS experience through hands-on instruction, tutoring in office hours and CS labs, mediating online course forum discussions (CS61A), providing problem walkthroughs for class projects and bug fixing on Python Jupyter Notebooks (Data 100).
Received recognition from students for assistance with debugging, quick explanations, recapping course topics (data visualization, modelling) in feedback forms. Was able to accommodate more than half of the incoming queue during consultation hours.
Medical Imaging/Computer Vision Machine Learning Intern - IBM Almaden Research Center (June'2014-July'2016, Jan-June'2021)
Worked again recently in the Medical Sieve Radiology Grand Challenge group at IBM Almaden Research Center on developing a new type of deep learning framework called spatially-preserving flattening, for location-sensitive recognition of findings in chest X-rays. Joint work with Neha Srivastava at Stanford. Resulted in a publication at ISBI'22 conference.
Earlier, under a program for middle and high school students to be mentored by researchers at IBM Almaden Research Center, I volunteered as a machine learning intern in this group.
Contributed to several medical imaging AI research projects for automatic detection of pulmonary embolism, cardiac aneurysms, and dilated cardiomyopathy in CT and echocardiography through development of new ideas, and implementing them along with IBM Researchers.
The research resulting from this early mentoring experience was presented at the Synopsys Science Fair (2014-2016) and published at international conferences (AMIA’14, PMID: 25954393; IEEE ISBI’15) at age 14. Patent disclosures were also submitted.
Learned the use of many pre-deep learning statistical machine learning packages, tools such as ImageJ, and coding in Java and Matlab to process medical data in HL7 and DICOM during this experience.
Summer Internships
HyperFine Research (Aug-Sept'21)
Interned as part of a team of professional data scientists to create labeled data collections for deep learning-driven anomaly recognition in brain MRI generated from their portable MRI scanner.
Developed an automated labeling algorithm for brain MRI images from their companion textual reports using language models, NLP, and vocabulary-driven concept extraction. The algorithm extracted 7200 annotations for 600 brain MRIs with 88% precision and 70% recall. Used BERT and Word2Vec models and the spaCy NLP library.
Wrote Python scripts to collate radiologist annotations with the original MRI DICOM files using edit distance-based name matching. Work involved using the Pandas and spaCy libraries.
Developed user interfaces to record ground truth anomaly labels indicated by clinicians in companion MRI reports that led to ten-fold decrease in annotation time.
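To illustrate the edit distance-based name matching mentioned above, here is a minimal sketch in plain Python. It is not the code from the internship: the function names, the example file names, and the `max_distance` threshold are hypothetical and chosen only for illustration.

```python
def edit_distance(a, b):
    """Levenshtein distance computed with a rolling dynamic-programming row."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def best_match(name, candidates, max_distance=3):
    """Return the closest candidate (case-insensitive), or None if none is close enough.

    max_distance is an assumed tolerance, not a value from the original work.
    """
    if not candidates:
        return None
    dist, match = min((edit_distance(name.lower(), c.lower()), c) for c in candidates)
    return match if dist <= max_distance else None


# Hypothetical file names: an annotation entry is matched to its DICOM file.
print(best_match("scan_017_t2.dcm", ["scan_017_T2.dcm", "scan_071_T1.dcm"]))
```

A tolerance like this lets small typos or case differences in annotation names still resolve to the right file, while clearly different names are rejected.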
Xoran Technologies (June-Aug'21)
Developed a 3D anatomical segmentation algorithm for cone beam CT studies. Reconstructed volumes for 9 anatomical structures in head and neck including eyes, maxillary sinus, sphenoid, etc. using U-net-based deep learning architecture trained on 17 CT volumes achieving a Dice coefficient of 0.68. Work used SimpleITK, numpy, Python, Keras, and Tensorflow libraries.
Surveyed several image annotation tools and prepared a report. Trained colleagues on use of ITKSnap and 3D Slicer for manual regional annotations.
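For reference, the Dice coefficient reported for the segmentation work above measures the overlap between a predicted mask and a ground-truth mask. The following is a minimal sketch on flat binary masks, not the evaluation code used in the project; treating two empty masks as a perfect match is an assumed convention.

```python
def dice_coefficient(pred, truth):
    """Dice = 2 * |A ∩ B| / (|A| + |B|) for binary masks given as flat sequences of 0/1."""
    intersection = sum(p * t for p, t in zip(pred, truth))
    total = sum(pred) + sum(truth)
    # Assumed convention: two empty masks count as a perfect match.
    return 1.0 if total == 0 else 2.0 * intersection / total


# Toy example: one of the two predicted voxels overlaps the ground truth.
print(dice_coefficient([1, 1, 0, 0], [1, 0, 1, 0]))  # 0.5
```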
SWAYD (Jan-March'20)
Worked as a computer vision content gathering intern in a team of 4 for the startup. Developed an algorithm in Python for automatically classifying foods/dishes using ImageNet-trained DL models and linking them to their respective restaurants via hashtags and geo-tags in Instagram posts.
Obtained hands-on experience of data preparation, cleansing, processing, algorithms development, APIs/platforms (Postman, ClarifAI, Google Maps API).
Projects
Over the last 7 years, I have done several projects covering data science and general CS areas.
The projects done as part of work experience are proprietary and details are provided in the
attached presentations. For projects done in open source or freelance, GitHub links are provided
where possible for code.
Data Science Projects
Summer Internship Projects
The summer internships at Xoran Tech and Hyperfine involved development of automated annotation tooling to enable deep learning model development.
More details of this work are available in the following reports and under the Experience tab:
IBM Watson AI Ops – Anomaly Detection in IT Logs
While interning with the IBM Watson NLP team, I developed a new approach called ContrastBERT for log anomaly classification using supervised contrastive learning on BERT-encoded log data.
The problem addressed was to recognize which of a set of IT logs were anomalous.
The IT logs came in the form of free text intermixed with identifiers such as block ids but no definite signatures for anomalies.
Previous approaches tried to address this by extracting handcrafted features from event sequences and building classifiers on the features.
My approach started from the observation that both the order and the content of the text sentences carry information, which I modeled using sentence BERT.
I then built a supervised contrastive encoder that differentiates between the BERT encodings of normal and abnormal IT logs. A deep learning classifier was then built using the learned contrastive encoder.
The resulting classifier outperformed existing log anomaly detection methods on a benchmark dataset of 10K HDFS logs, achieving an accuracy of 97.3%.
This work was accepted as a paper at the IEEE Big Data Conference Workshop on Knowledge Discovery in Data Mining on IT Operations, Osaka, Japan, Dec. 2022.
Download related presentation and Github code .
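As a rough illustration of the supervised contrastive idea described above, the sketch below computes a commonly used form of the supervised contrastive loss on a handful of toy embeddings. It is not the ContrastBERT code: the two-dimensional vectors stand in for BERT encodings of log lines, and the function names and the temperature value are assumptions made for this example.

```python
import math


def normalize(v):
    """Scale a vector to unit length so dot products become cosine similarities."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]


def supcon_loss(embeddings, labels, temperature=0.1):
    """Supervised contrastive loss, averaged over anchors that have at least one positive."""
    z = [normalize(v) for v in embeddings]
    n = len(z)
    total, anchors = 0.0, 0
    for i in range(n):
        positives = [j for j in range(n) if j != i and labels[j] == labels[i]]
        if not positives:
            continue
        # Temperature-scaled similarity of the anchor to every other sample.
        sims = {j: sum(a * b for a, b in zip(z[i], z[j])) / temperature
                for j in range(n) if j != i}
        log_denom = math.log(sum(math.exp(s) for s in sims.values()))
        # Average negative log-probability assigned to the positives.
        total += -sum(sims[j] - log_denom for j in positives) / len(positives)
        anchors += 1
    return total / anchors if anchors else 0.0


# Toy stand-ins for encodings of normal (0) and anomalous (1) logs.
toy = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [0.1, 0.9]]
print(supcon_loss(toy, [0, 0, 1, 1]))  # low: same-label samples are close together
print(supcon_loss(toy, [0, 1, 0, 1]))  # higher: same-label samples are far apart
```

Minimizing this loss pulls encodings with the same label together and pushes different labels apart, which is what lets a simple classifier on top of the encoder separate normal from anomalous logs.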
NeuroTech Elective Project - A Deep Learning-based Sleep Stage Analyzer
Developed a deep learning-based algorithm to analyze the sleep stages from EEG signals and study their variance across population.
Adapted a 1D CNN architecture (8 CNN layers, 1 dropout layer) to classify 5-channel EEG signals into 5 sleep stages (Awake, Stage 1, Stage 2, Stage 3, REM).
Achieved a balanced accuracy of 0.76 and a Cohen's kappa score of 0.706 on EEG signals from the Sleep Physionet dataset (30 polysomnographic sleep recordings).
Download related presentation and Github code .
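The two metrics reported above can be computed as in the following minimal sketch. This is not the project's evaluation code, and the labels in the example are made up. Balanced accuracy is the mean of per-class recall, and Cohen's kappa compares observed agreement with the agreement expected by chance.

```python
from collections import Counter


def balanced_accuracy(y_true, y_pred):
    """Mean of per-class recall, so rare sleep stages count as much as common ones."""
    recalls = []
    for c in sorted(set(y_true)):
        idx = [i for i, t in enumerate(y_true) if t == c]
        recalls.append(sum(y_pred[i] == c for i in idx) / len(idx))
    return sum(recalls) / len(recalls)


def cohens_kappa(y_true, y_pred):
    """(observed agreement - chance agreement) / (1 - chance agreement)."""
    n = len(y_true)
    observed = sum(t == p for t, p in zip(y_true, y_pred)) / n
    true_counts, pred_counts = Counter(y_true), Counter(y_pred)
    expected = sum(true_counts[c] * pred_counts[c] for c in true_counts) / (n * n)
    # Note: undefined (division by zero) when chance agreement is exactly 1.
    return (observed - expected) / (1 - expected)


# Made-up labels for two classes.
y_true = [0, 0, 1, 1]
y_pred = [0, 0, 1, 0]
print(balanced_accuracy(y_true, y_pred))  # 0.75
print(cohens_kappa(y_true, y_pred))       # 0.5
```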
Deep Learning Projects on Kaggle
Worked under the mentorship of Humza Iqbal, a data scientist at Securiti.ai, on several Kaggle datasets to build deep learning models for the problems below.
Gained experience on deep learning model building using Keras, Tensorflow, and PyTorch.
Digit recognition using CNNs on MNIST data
ResNet50 classifier on the CIFAR-10 dataset.
U-Net based TGS Salt deposit segmentation. Later used this model as a starting point for 3D segmentation of cone beam CT.
Machine learning-driven Contraceptive Use Prediction
Worked in a three-member team to find optimal predictor variables for the use of contraceptives in a survey dataset gathered for Indonesian women for purposes of family planning rollout measures.
Experimented with logistic regression, decision trees, and random forest with PCA on features.
Implemented using Scikit-learn library. Dealt with data pre-processing, cleansing, and formatting.
Explored standardization of patient data formats such as FHIR for large scale patient record analysis.
Download report and access Github code .
Cal Hacks 6.0 Collegiate Hackathon Project : LateNight
Developed an app as part of a group project that used neighborhood crime data from local county to develop a safety index for the restaurants in neighborhoods in Berkeley.
Involved web scraping, crime record analysis, map visualization.
Programmed in Swift and Python.
CS Projects
Full-fledged CPU design
Developed a full-fledged CPU design for processing a full set of RISC-V instructions using Logisim in the CS61C Computer Architecture course.
Development of GitHub Clone
Developed a full-fledged GitHub clone in Java that implements GitHub's repository management functions in the CS61B Data Structures course.
Simulation of the Enigma Machine
Built Java-based simulator for a generalized version of the Enigma machine used during WWII for encrypting messages & substitution ciphers.
Escape The Tune Game
When I was selected to participate in the competitive California Summer School for Math & Science (COSMOS) at UC Santa Cruz, I designed a video game as part of the video game cluster.
This was a space game in which the player navigates a spaceship in sync with musical rhythms. Implemented in Unity and Maya. Code available on GitHub.
Film Projects
Film Blog Website
Created a personal film blog, critiquing effects such as motion capture, CGI, and cinematography in contemporary movie reviews.
Shared film critiques with a broader audience to foster dialogue. This was featured in a local news magazine .
Publications
R. Mahmood, K. C. L. Wong, D. M. Reyes, N. D’Souza, L. Shi, J. Wu, P. Kaviani, M. Kalra, G. Wang, P. Yan, and T. Syeda-Mahmood, “A Phrase-grounded Fact-checking Model for Automatically Generated Radiology Reports,” in Proc. International Conference on Medical Image Computing and Computer-Assisted Intervention (MICCAI), Daejeon, South Korea, Oct. 2025. The underlying dataset, called RadCheck, of 24 million pairs of images with correct/incorrect radiology findings has been contributed to open source on HuggingFace.
R. Mahmood, P. Yan, D. M. Reyes, G. Wang, M. K. Kalra, P. Kaviani, J. T. Wu, and T. Syeda-Mahmood, “Evaluating Automated Radiology Report Quality through Fine-Grained Phrasal Grounding of Clinical Findings,” in Proc. International Symposium on Biomedical Imaging (ISBI), Houston, TX, April 2025.
R. Mahmood, G. Wang, M. Kalra, P. Yan, “Fact-Checking of AI-Generated Reports,” in Proc. Machine Learning in Medical Imaging (MICCAI Workshop), Vancouver, BC, Canada, October 2023.
R. Mahmood, X. Liu, A. Xu, R. Akkiraju, “ContrastBERT: Supervised Contrastive Learning of BERT-Encoded IT logs for Anomaly Classification,” in Proc. IEEE Big Data Conference Workshop on Knowledge Discovery in Data Mining on IT Operations, Osaka, Japan, Dec. 2022.
N. Shrivastava, R. Mahmood, T. Syeda-Mahmood, “Spatially-preserving flattening in deep learning for location-aware classification,” in Proc. International Symposium on Biomedical Imaging (ISBI), Kolkata, India, March 2022.
R. Mahmood, T. Syeda-Mahmood, “Automatic detection of left ventricular aneurysms in echocardiograms,” in Proc. International Symposium on Biomedical Imaging (ISBI), New York, April 2015. See local copy.
R. Mahmood, T. Syeda-Mahmood, “Automatic detection of dilated cardiomyopathy in cardiac ultrasound videos,” in Proc. American Medical Informatics Association (AMIA) Annual Conference, Washington, D.C., November 2014. See local copy.
Contact