Hi! I am a second-year Master's student majoring in Computer Science at the University of Pennsylvania, where I have been doing research with Prof. Mark Yatskar. In the past, I have also had the opportunity to work with Prof. Ranjay Krishna and Prof. Maneesh Agrawala. I completed my B.Tech in Computer Science at Veermata Jijabai Technological Institute (VJTI). My current research interests lie at the intersection of vision and language, which has been the focus of my past and current research experiences.
As increasingly powerful deep learning models in vision and language emerge, fundamental
questions like “What do the models actually learn?” and “What does the model base its predictions
on?” persist. My long-term research goal is to build a robust and reliable model that
can answer these questions. To that end, I wish to explore two directions:
(i) leveraging human-like intuitions, such as learning compositionally, to boost the robustness of models, and
(ii) building interpretable models so that we can use correctability methods to make sure
they don't rely on biases, thereby facilitating reliability and robustness.
Zixian Ma*, Jerry Hong*, Mustafa Omer Gul*, Mona Gandhi, Irena Gao, Ranjay Krishna
CVPR 2023 [Highlight: 10% of accepted papers / 2.5% of submissions]
Technologies: Computer Vision, Deep Learning
We proposed and implemented two novel methods to improve our outputs for neural style transfer: (i) fine-tuning the model as a classifier for a particular style, and (ii) flattening the layers so that style and content losses can be added inside the blocks. Through network dissection, we compared the best models and positions for the style and content losses. We found that MobileNetV2 with flattened layers and a fine-tuned model gave the best visual results.
[Report] [Presentation] [Drive]
Technologies: ReactJS, MySQL
We created a social cataloging application specifically for poetry lovers. Our aim was to build a platform that lets users explore the world of poetry through an extensive collection of books, series, authors, and reviews. By signing in, users can create and maintain their own virtual library of poetry books, rate them, and receive custom recommendations, based on their past behaviour, for new poetry books they might enjoy.
[Report] [Video] [GitHub]
Domain: Computer Vision, Natural Language Processing
We implemented a multi-modal sarcasm detector using video, audio, and text features from the MUStARD dataset, which draws on TV sitcoms such as Friends and The Big Bang Theory. After training LSTMs with different types of attention mechanisms and analyzing their performance, we found that the best-performing model learns a bias towards labeling data as sarcastic but does very well at detecting non-sarcastic data.
[Report] [Presentation] [Video]
Domain: Natural Language Processing
We implemented a sentiment analysis system for Amazon Food Reviews. We applied text cleaning and feature extraction to the review data, namely removing special characters and numbers, removing stop words, and tokenizing the text. We also examined other forms of text vectorization, including Word2Vec, GloVe, and Bag-of-Words. We developed baseline models using Naive Bayes, Logistic Regression, and XGBoost before moving to LSTM models. Finally, we also evaluated BERT embeddings with an LSTM.
[Report] [GitHub]
Domain: Natural Language Processing
We developed a fake-news detector using a Transformer-based model, BERT, trained on the LIAR dataset. Our analysis showed that the model does well at classifying false statements but poorly at classifying true statements.
[Report] [GitHub]
Technologies: HTML, CSS, ReactJS
Many algorithms in computer science become easier to understand when visualized. I developed a tool to visualize sorting algorithms such as bubble sort, merge sort, and insertion sort, in which every comparison the algorithm makes can be seen. There is also an interactive page that shows every node visited by pathfinding algorithms such as BFS, DFS, and Dijkstra's, where one can add weights and walls to the grid. This makes it easier to compare the algorithms against each other.
[Report] [GitHub]
Technologies: Flask, HTML, CSS, JS
Managing information about students, staff, and the admission process for a hostel is tedious. We created a website that does not require the person handling the system to be very efficient or good at calculations. Key features include managing data on students, staff, the students' representative, the admission process, and the mess, as well as maintaining exit-entry records of students who stay in the hostel, their visitors, and couriers delivered to them.
[GitHub]
Domain: Network Security, Machine Learning
Given the huge amount of traffic on a network, detecting malicious activity using machine learning is difficult because it can hide easily in normal traffic. We created ensembles of various imbalance reduction techniques and compared the techniques against each other. We found that the results varied with the combination of techniques used in the ensemble, and that ensembles of certain techniques showed better results.
[Slides] [Report] [GitHub]