CLAUDE.md

This file provides guidance to Claude Code (claude.ai/code) when working with code in this repository.

Project Overview

This is a Python data comparison tool built with Python 3.13+. The project is in early development and has a minimal structure.

uv run main.py

The command above launches a web-based GUI at http://localhost:8080

uv run data_comparator.py

The project uses Python 3.13+ with uv for dependency management.

main.py - Main application entry point that launches the web GUI
data_comparator.py - Core comparison logic for KST vs Coordi data analysis
web_gui.py - Flask-based web GUI application
analyze_excel.py - Basic Excel file structure analysis utility
data/ - Directory containing sample data files
- sample-data.xlsx - Sample Excel data file for comparison operations
templates/ - HTML templates for web GUI (auto-generated)
pyproject.toml - Python project configuration and metadata

KST vs Coordi Comparison: Compares data between KST columns (Title KR, Epi.) and Coordi columns (KR title, Chap)
Mismatch Categorization: Identifies KST-only, Coordi-only, and duplicate items
Data Reconciliation: Verifies that the KST and Coordi item counts match once mismatched items are excluded
Web-based GUI: Interactive interface with tabs for different data views
File Upload: Upload Excel files directly through the web interface
Sheet Filtering: Filter results by specific Excel sheets
Real-time Analysis: Live comparison with detailed mismatch reasons
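
The mismatch categorization and reconciliation features above can be sketched roughly as follows. This is a minimal illustration, not the code in data_comparator.py: the function name `reconcile`, the result keys, and the representation of each item as a `(title, episode)` tuple are all assumptions. In this sketch the categories may overlap (an item duplicated on one side and missing on the other lands in both sets).

```python
from collections import Counter


def reconcile(kst_items, coordi_items):
    """Split mismatches out of two item lists and compare what remains.

    Each item is a hypothetical (title, episode) tuple.
    """
    kst_counts = Counter(kst_items)
    coordi_counts = Counter(coordi_items)

    # Items present on one side only.
    kst_only = set(kst_counts) - set(coordi_counts)
    coordi_only = set(coordi_counts) - set(kst_counts)

    # Items repeated within a single dataset.
    duplicates = {
        item
        for counts in (kst_counts, coordi_counts)
        for item, n in counts.items()
        if n > 1
    }

    # Drop every mismatched item, then compare the remaining row counts.
    excluded = kst_only | coordi_only | duplicates
    kst_remaining = [item for item in kst_items if item not in excluded]
    coordi_remaining = [item for item in coordi_items if item not in excluded]

    return {
        "kst_only": sorted(kst_only),
        "coordi_only": sorted(coordi_only),
        "duplicates": sorted(duplicates),
        "kst_remaining": len(kst_remaining),
        "coordi_remaining": len(coordi_remaining),
        "reconciled": len(kst_remaining) == len(coordi_remaining),
    }
```

Once all three mismatch categories are excluded, every remaining item appears exactly once on each side, so the `reconciled` flag works as a sanity check on the categorization rather than as an independent result.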

The tool compares Excel data by:

Sheet-specific analysis only - There is no "All Sheets" mode; each sheet is analyzed independently
Fixed column positions - KST data from columns I & J, Coordi data from columns C & D
Extracting title+episode combinations from both datasets within the selected sheet
Duplicate detection - Only items that appear more than once within the same dataset are marked as duplicates
Mixed duplicate priority - Items that exist in both datasets but are duplicated on one side are classified as mixed duplicates, and that classification takes priority over pure duplicates
Categorizing mismatches and calculating reconciliation
Displaying results with reasons for each discrepancy
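
The steps above can be sketched as follows. This is a hedged illustration under several assumptions, not the actual implementation: the helper names (`column_index`, `extract_pairs`, `classify`) and the category labels are hypothetical, each row is assumed to be a plain list of cell values for one sheet, and column I/C is assumed to hold the title while column J/D holds the episode.

```python
from collections import Counter


def column_index(letter: str) -> int:
    """Convert a spreadsheet column letter such as 'I' to a 0-based index."""
    index = 0
    for char in letter.upper():
        index = index * 26 + (ord(char) - ord("A") + 1)
    return index - 1


# Fixed column positions (assumed order: title column, then episode column).
KST_COLUMNS = (column_index("I"), column_index("J"))     # Title KR, Epi.
COORDI_COLUMNS = (column_index("C"), column_index("D"))  # KR title, Chap


def extract_pairs(rows, columns):
    """Collect (title, episode) pairs from the given columns, skipping blanks."""
    title_col, episode_col = columns
    pairs = []
    for row in rows:
        if len(row) <= max(title_col, episode_col):
            continue  # row is too short to contain both cells
        title, episode = row[title_col], row[episode_col]
        if title is None or episode is None:
            continue
        pairs.append((str(title).strip(), str(episode).strip()))
    return pairs


def classify(kst_pairs, coordi_pairs):
    """Label every pair; mixed duplicates take priority over pure duplicates."""
    kst_counts, coordi_counts = Counter(kst_pairs), Counter(coordi_pairs)
    labels = {}
    for item in kst_counts.keys() | coordi_counts.keys():
        k, c = kst_counts[item], coordi_counts[item]
        duplicated = k > 1 or c > 1
        if k and c and duplicated:
            labels[item] = "mixed_duplicate"   # in both, repeated on one side
        elif duplicated:
            labels[item] = "kst_duplicate" if k > 1 else "coordi_duplicate"
        elif not c:
            labels[item] = "kst_only"
        elif not k:
            labels[item] = "coordi_only"
        else:
            labels[item] = "matched"
    return labels
```

Checking the mixed case before the pure-duplicate case is what gives it priority: a pair that occurs once in KST and twice in Coordi is labeled `mixed_duplicate`, while a pair that occurs twice in KST and never in Coordi is labeled `kst_duplicate`.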

US URGENT: 금수의 영역 - Episode 17, 신결 - Episode 23 (Coordi duplicates), 트윈 가이드 - Episode 31 (mixed duplicate)
TH URGENT: 백라이트 - Episode 53-1x(휴재) (KST duplicate, doesn't appear in Coordi)