Midterm Project for STA323
Project overview for the STA323 (Big Data Analysis Software and Application) midterm work.
Project Overview
This project is part of the midterm assignment for STA323 (Big Data Analysis Software and Application). Its customer churn section brings together analytical modeling, report writing, database experimentation, and website presentation in a single course project.
Task Scope
The full scope includes survival analysis on the customer churn dataset, a written report that records the analytical process and results, a MySQL-based Text-to-SQL experiment using LLMs, and the presentation of the project through this personal website.
Survival Analysis
The survival analysis component follows the official Spark workflow and covers data preparation, Kaplan-Meier estimation, Cox proportional hazards modeling, accelerated failure time (AFT) modeling, and customer lifetime value (CLV) estimation. Its goal is to explain retention patterns and identify variables associated with churn timing and customer value.
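To make the Kaplan-Meier step concrete, the following is a minimal pure-Python sketch of the product-limit estimator. It is not the project's Spark code: the function name `kaplan_meier` and the toy `tenure` and `churned` values are assumptions made up for illustration, with `churned = 0` marking a customer who was still active (censored) at the last observation.

```python
from collections import Counter


def kaplan_meier(durations, events):
    """Return [(t, S(t))] at each distinct time with at least one event.

    durations: observed time for each customer (e.g. tenure in months)
    events:    1 if churn was observed at that time, 0 if censored
    """
    churn_counts = Counter(t for t, e in zip(durations, events) if e)
    exit_counts = Counter(durations)  # everyone leaving the risk set at t
    at_risk = len(durations)
    survival = 1.0
    curve = []
    for t in sorted(exit_counts):
        d = churn_counts.get(t, 0)
        if d:
            # Product-limit update: multiply by the share surviving past t.
            survival *= 1 - d / at_risk
            curve.append((t, survival))
        # Churned and censored customers both leave the risk set after t.
        at_risk -= exit_counts[t]
    return curve


# Hypothetical toy data, not taken from the project's dataset.
tenure = [1, 2, 2, 3, 5, 5]
churned = [1, 1, 0, 1, 0, 1]
curve = kaplan_meier(tenure, churned)
```

Censored customers still count in the risk set up to their last observed time, which is what separates this estimate from a naive churn rate.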
Text-to-SQL Experiment
The project also includes a MySQL-based Text-to-SQL experiment with LLMs. This part evaluates how language models handle structured customer churn questions in a database setting and complements the analytical work from a data application perspective.
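One common way to score generated SQL is execution accuracy: run the model's query and a reference query against the same database and compare the returned rows. This page does not say which metric the experiment used, so the sketch below is only an illustration of that idea. It uses Python's built-in `sqlite3` as a stand-in for MySQL, and the `customers` table, its columns, and the example queries are all hypothetical.

```python
import sqlite3


def execution_match(conn, predicted_sql, gold_sql):
    """Return True if both queries yield the same rows, ignoring row order."""
    try:
        predicted_rows = conn.execute(predicted_sql).fetchall()
    except sqlite3.Error:
        # A query that fails to run counts as a miss.
        return False
    gold_rows = conn.execute(gold_sql).fetchall()
    return sorted(predicted_rows, key=repr) == sorted(gold_rows, key=repr)


# Hypothetical schema and rows, used only to exercise the function.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customers (id INTEGER, tenure INTEGER, churn INTEGER)")
conn.executemany(
    "INSERT INTO customers VALUES (?, ?, ?)",
    [(1, 12, 0), (2, 3, 1), (3, 24, 1)],
)

gold = "SELECT COUNT(*) FROM customers WHERE churn = 1"
equivalent = "SELECT SUM(churn) FROM customers"  # different text, same result
wrong = "SELECT COUNT(*) FROM customers"
invalid = "SELECT churned FROM customer"  # no such table

results = [execution_match(conn, q, gold) for q in (equivalent, wrong, invalid)]
```

Comparing results rather than query text gives credit to queries that are written differently but return the same answer, though two queries can also agree by coincidence on a small table.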
Website and Blog
The final results are presented through this PRISM-based website. The dedicated blog article focuses on the survival analysis report, while this project page summarizes the broader scope of the STA323 midterm work.