Curriculum Vitae
Education
Master of Science - Computer Science
Jan 2023 - Present
Bilkent University
Ankara, Turkey
Bachelor of Engineering - Mechanical Engineering
2016 - 2021
Middle East Technical University
Ankara, Turkey
Experience
Teaching Assistant
Jan 2023 - Present
Bilkent University
Ankara, Turkey
CS101 Algorithms and Programming I
CS115 Introduction to Programming in Python
CS353 Database Systems
Web Developer
Oct. 2022 - March 2023
Epicflow.io
USA (Remote)
Integrated authentication and payment systems and developed the frontend of the application.
(JavaScript, React, AWS Cognito, AWS Lambda, Stripe)
Mechanical System and Design Engineer
Sept. 2021 - March 2023
ASELSAN
Ankara, Turkey
Designed mechanical parts, selected system components, and planned the layout of those components within the system.
Publications & Preprints
Projects
Synthetic Data Generation for In-Domain Text-to-SQL Translation (SING-SQL)
June 2024 - Present
The project targets real-world enterprise scenarios, where models must specialize in a single database schema and organizations need to evaluate their Text-to-SQL systems on their own databases. Proposed a fully automated two-stage framework for generating high-quality synthetic Text-to-SQL data tailored to any database without SQL logs or manual annotations. The method hierarchically partitions schemas into sub-schemas, synthesizes multi-level SQL queries, and applies LLM-as-a-judge validation, executability checks, and column balancing for high coverage. Models fine-tuned on this data achieved state-of-the-art in-domain performance, outperforming the best baseline of comparable scale by +16.21%.
(Python, PyTorch, Supervised Fine-Tuning (SFT), Low-Rank Adaptation (LoRA), Unsloth, Gemini API)
Sparse & Dense Retrieval Comparison
June 2024 - Present
Developed an information retrieval system to compare sparse and dense (embedding-based) retrieval methods on the multilingual Istella22 dataset. Implemented sparse retrieval using PyLucene and BM25 scoring, and dense retrieval via multilingual transformer-based embeddings indexed with HNSW for approximate nearest neighbor search. Evaluated multiple models and embedding aggregation strategies (first-passage vs. average) under various dataset scales. Additionally, implemented the HDCT framework using fine-tuned BERT for context-aware term weighting. Assessed performance using NDCG, MAP, and MRR, highlighting trade-offs between retrieval quality and computational efficiency.
(Python, Apache Lucene, PyTorch, Supervised Fine-Tuning (SFT))
Text-to-SQL Translation Using LLMs (E-SQL)
June 2024 - Present
Developing an NLP application that uses large language models (LLMs) to translate natural language queries into SQL statements (NL2SQL). This approach significantly lowers the barriers to accessing relational databases by parsing and understanding the structure of natural language to accurately generate the corresponding SQL queries.
(Python, PyTorch, Hugging Face, OpenAI)
Jan 2024 - May 2024
Implemented the Recurrent Residual Convolutional Neural Network based on U-Net (R2U-Net). Unlike the original study, the model is trained and tested on the Oxford-IIIT Pet dataset for the image segmentation task.
(Python, PyTorch)
Jan 2024 - May 2024
Gelgit Travel is a cloud-native platform for managing and purchasing transportation tickets. It was developed to learn and apply fundamental AWS cloud services.
(VPC, EC2, RDS, API Gateway, Lambda, Cognito, SES, Amplify, IAM, WAF, CloudTrail, CloudWatch, React)
Oct 2023 - Jan 2024
An NLP application that employs weak-supervision techniques to identify the relation between natural language query tokens and database entities, treating the keyword mapping problem as a sequence tagging problem solved with a neural network.
(Python, PyTorch, Hugging Face)
Oct 2023 - Jan 2024
A stance detection project that identifies the attitude expressed in a text while the nature of the data changes over time. The adaptive model aims to maintain its accuracy by fine-tuning itself on new data.
(Python, PyTorch, TensorFlow)
May 2022 - June 2022
A website that allows travelers to search for trips and reserve or purchase tickets for their desired transportation method, and that allows travel companies to register and manage their trips.
(Python, Flask, MySQL, Docker)
Teachers-Zone Web Application
Sept 2022
A platform where teachers can showcase their areas of expertise and earn additional income by giving lectures to students.
(Next.js, TailwindCSS, MongoDB, Vercel)
Hasan Alp Caferoglu © 2024
