Hasan Alp Caferoğlu

Curriculum Vitae



Education

Master of Science - Computer Science

Bilkent University

Ankara, Turkey

Jan 2023 - Present

Bachelor of Engineering - Mechanical Engineering

Middle East Technical University

Ankara, Turkey

2016 - 2021

Experience

Teaching Assistant

Bilkent University

Ankara, Turkey

Jan 2023 - Present

CS 101 Algorithms and Programming I

CS115 Introduction to Programming in Python

CS353 Database Systems

Web Developer

Epicflow.io

USA (Remote)

Oct. 2023 - March 2023

Worked on integrating authentication and payment systems. Developed frontend of the application.

(JavaScript, React, AWS Cognito, AWS Lambda, Stripe)

Mechanical System and Design Engineer

ASELSAN

Ankara, Turkey

Sept. 2021 - March 2023

Designed mechanical parts, determined and selected system components, and planned the layout of those components.

Publications & Preprints

  • H. A. Caferoğlu, O. Ulusoy. E-SQL: Direct Schema Linking via Question Enrichment in Text-to-SQL, 2024. (In Submission) [PDF] [Code]
  • H. A. Caferoğlu, M. S. Çelik, O. Ulusoy. SING-SQL: A Synthetic Data Generation Framework for In-Domain Text-to-SQL Translation, 2025. (In Submission) [PDF] [Code]

Projects

Synthetic Data Generation for In-Domain Text-to-SQL Translation (SING-SQL)

June 2024 - Present

The project focuses on real-world enterprise scenarios, where models need to specialize in a single database schema and organizations need to evaluate their Text-to-SQL systems on their own databases. Proposed a fully automated two-stage framework for generating high-quality synthetic Text-to-SQL data tailored to any database, without SQL logs or manual annotations. The method hierarchically partitions schemas into sub-schemas, synthesizes multi-level SQL queries, and applies LLM-as-a-judge validation, executability checks, and column balancing for high coverage. Models fine-tuned on this data achieved state-of-the-art in-domain performance, outperforming the best baseline of comparable scale by +16.21%.

(Python, PyTorch, Supervised Fine-Tuning (SFT), Low-Rank Adaptation (LoRA), Unsloth, Gemini API)

Sparse & Dense Retrieval Comparison

June 2024 - Present

Developed an information retrieval system to compare sparse and dense (embedding-based) retrieval methods on the multilingual Istella22 dataset. Implemented sparse retrieval using PyLucene and BM25 scoring, and dense retrieval via multilingual transformer-based embeddings indexed with HNSW for approximate nearest neighbor search. Evaluated multiple models and embedding aggregation strategies (first-passage vs. average) under various dataset scales. Additionally, implemented the HDCT framework using fine-tuned BERT for context-aware term weighting. Assessed performance using NDCG, MAP, and MRR, highlighting trade-offs between retrieval quality and computational efficiency.

(Python, Apache Lucene, PyTorch, Supervised Fine-Tuning (SFT))

Text-to-SQL Translation Using LLMs (E-SQL)

June 2024 - Present

Developing an NLP application that uses large language models (LLMs) to translate natural language questions into SQL statements (i.e., NL2SQL). This approach lowers the barrier to accessing relational databases by interpreting the structure of a natural language question and generating the corresponding SQL query.

(Python, PyTorch, Hugging Face, OpenAI)

Image Segmentation with R2U-Net

Jan 2024 - May 2024

Implemented the Recurrent Residual Convolutional Neural Network based on U-Net (R2U-Net). Unlike the original study, the model is trained and tested on the Oxford-IIIT Pet dataset for the image segmentation task.

(Python, PyTorch)

GelGit Travel: A Cloud-Native Application

Jan 2024 - May 2024

GelGit Travel is a cloud-native application that showcases a platform for managing and purchasing transportation tickets. It was developed to learn and apply fundamental AWS cloud services.

(VPC, EC2, RDS, API Gateway, Lambda, Cognito, SES, Amplify, IAM, WAF, CloudTrail, CloudWatch, React)

NLIDB Keyword Mapping

Oct 2023 - Jan 2024

An NLP application that employs weak-supervision techniques to identify the relations between natural language query tokens and database entities, treating keyword mapping as a sequence tagging problem solved with a neural network.

(Python, PyTorch, Hugging Face)

Stance Detection - Stream Data

Oct 2023 - Jan 2024

A stance detection project that identifies the attitude expressed in textual content as the underlying data changes. The adaptive model tries to maintain its accuracy over time by fine-tuning itself on new data.

(Python, PyTorch, TensorFlow)

Online Ticket Purchasing Application

May 2022 - June 2022

A website that allows travelers to search for trips and reserve or purchase tickets for their preferred mode of transportation, and that allows travel companies to register and manage their trips.

(Python, Flask, MySQL, Docker)

Teachers-Zone Web Application

Sept 2022

A platform where teachers can showcase their areas of expertise and earn additional income by giving lectures to students.

(Next.js, TailwindCSS, MongoDB, Vercel)



Hasan Alp Caferoğlu © 2024