Project Overview
The Capital Gains Statement Processor is a web application that converts PDF capital gains statements into structured Excel/CSV files, which simplifies tax preparation and financial analysis.
AI-Powered Extraction
Uses Landing AI for intelligent document parsing and data extraction from complex PDF layouts.
Smart Conversion
Leverages Google Gemini 2.0 Flash for accurate markdown-to-Excel conversion with proper formatting.
Cost Transparency
Real-time cost calculation showing exactly how much each processing step costs.
Technical Architecture
Backend Stack
- FastAPI - Modern, fast web framework for building APIs
- Python 3.11 - Core programming language
- AsyncIO - Asynchronous programming for better performance
- Uvicorn - ASGI server for production deployment
- Jinja2 - Template engine for HTML rendering
Data Processing
- PyPDF2 - PDF parsing and page counting
- Pandas - Data manipulation and CSV generation
- OpenPyXL - Excel file generation and formatting
- TikToken - Accurate token counting for cost calculation
Frontend Stack
- Bootstrap 5.3 - Responsive UI framework
- Vanilla JavaScript - Clean, dependency-free frontend
- Font Awesome - Professional icon library
- CSS3 - Custom styling and animations
AI Services
- Landing AI - Document extraction and OCR
- Google Gemini 2.0 Flash - Language model for data conversion
- Agentic-Doc - Document processing pipeline
Development Journey
Phase 1: Research & Planning
Analyzed the challenge of extracting structured data from PDF capital gains statements. Identified the need for a two-stage approach: PDF → Markdown → Excel to ensure accuracy and maintain data integrity.
Key decisions: FastAPI for backend, modular architecture, AI-first approach
Phase 2: Core Architecture
Built the foundation with FastAPI, implemented file upload system, created modular service architecture. Designed the three-step process: Upload → Extract → Convert.
Challenges: File handling, async processing, error management
Phase 3: AI Integration
Integrated Landing AI for PDF extraction using the agentic-doc library. Implemented Google Gemini 2.0 Flash for intelligent markdown-to-CSV conversion with complex prompt engineering.
Breakthrough: Achieving 95%+ accuracy in data extraction
Phase 4: User Experience
Developed responsive frontend with Bootstrap, implemented real-time progress tracking, added file preview capabilities, and created intuitive three-step workflow.
Focus: Simplicity, visual feedback, error handling
Phase 5: Cost Transparency
Added real-time cost calculation system using tiktoken for accurate token counting. Implemented transparent pricing breakdown showing Landing AI and Gemini costs separately.
Innovation: First-of-its-kind cost transparency in document processing
Processing Pipeline
PDF Upload
- File validation & security checks
- Size and format verification
- Secure temporary storage
- Immediate cost estimation
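The size and format checks in this step could look roughly like the sketch below. It is an illustration only: the function name, the 20 MB limit, and the error messages are assumptions, not the project's actual code. It checks the file extension, the size, and the "%PDF-" header that PDF files begin with.

```python
from pathlib import Path

# Hypothetical limit; the project's real upload limit is not stated.
MAX_UPLOAD_BYTES = 20 * 1024 * 1024
PDF_MAGIC = b"%PDF-"


def validate_pdf_upload(
    filename: str, data: bytes, max_bytes: int = MAX_UPLOAD_BYTES
) -> list[str]:
    """Return a list of problems; an empty list means the upload looks acceptable."""
    problems = []
    if Path(filename).suffix.lower() != ".pdf":
        problems.append("file extension is not .pdf")
    if not data:
        problems.append("file is empty")
    elif len(data) > max_bytes:
        problems.append(f"file exceeds {max_bytes} bytes")
    # Checking the header catches non-PDF files that were simply renamed.
    if data and not data.startswith(PDF_MAGIC):
        problems.append("content does not start with the %PDF- header")
    return problems
```

Returning every problem at once, rather than raising on the first, lets the upload page show the user everything that needs fixing in a single response.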
AI Extraction
- Landing AI document analysis
- Table structure recognition
- Text and data extraction
- Markdown format conversion
Smart Conversion
- Gemini AI processing
- Data structure analysis
- Excel format mapping
- Quality validation
Output Generation
- Excel/CSV file generation
- Format optimization
- Download preparation
- Final cost calculation
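The four stages above can be pictured as a chain of async steps that pass a job dictionary along and report progress after each one. The sketch below is a simplified illustration: every function name and dictionary key is hypothetical, and the extraction and conversion steps are stubs standing in for the Landing AI and Gemini calls.

```python
import asyncio
from typing import Awaitable, Callable

Stage = Callable[[dict], Awaitable[dict]]


async def upload(job: dict) -> dict:
    return {**job, "validated": True}


async def extract(job: dict) -> dict:
    # Stand-in for the Landing AI call, which would return Markdown.
    return {**job, "markdown": "| Date | Gain |"}


async def convert(job: dict) -> dict:
    # Stand-in for the Gemini call, which would return CSV text.
    return {**job, "csv": "Date,Gain\n"}


async def generate_output(job: dict) -> dict:
    return {**job, "download_name": job["filename"].replace(".pdf", ".csv")}


STAGES: list[tuple[str, Stage]] = [
    ("PDF Upload", upload),
    ("AI Extraction", extract),
    ("Smart Conversion", convert),
    ("Output Generation", generate_output),
]


async def run_pipeline(job: dict, stages=STAGES, on_progress=print) -> dict:
    """Run the stages in order, reporting progress after each one."""
    for number, (name, stage) in enumerate(stages, start=1):
        job = await stage(job)
        on_progress(f"{number}/{len(stages)} {name} done")
    return job
```

Calling asyncio.run(run_pipeline({"filename": "statement.pdf"})) runs all four stubs and prints one progress line per stage.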
Key Challenges & Solutions
Challenge: PDF Complexity
Capital gains statements have varied layouts, complex tables, and inconsistent formatting across different brokers and years.
Solution:
Used Landing AI's advanced document understanding to handle layout variations. Implemented fallback mechanisms and manual input options for edge cases.
Challenge: API Timeouts
Large or complex documents sometimes caused AI processing to time out or get stuck at 0% progress.
Solution:
Implemented progressive timeouts (30s → 60s), graceful fallbacks, and helpful user guidance for retry strategies.
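One way to express the progressive timeout idea with asyncio is sketched below. The helper name and its signature are assumptions made for illustration; only the 30s and 60s defaults come from the description above. Each attempt gets a fresh call and a longer time limit, and the helper gives up once the limits are exhausted.

```python
import asyncio
from typing import Awaitable, Callable, Sequence, TypeVar

T = TypeVar("T")


async def run_with_progressive_timeouts(
    make_call: Callable[[], Awaitable[T]],
    timeouts: Sequence[float] = (30.0, 60.0),
) -> T:
    """Try the call once per timeout, allowing more time on each attempt."""
    last_error = None
    for limit in timeouts:
        try:
            # A fresh awaitable is created per attempt; wait_for cancels
            # the pending one when the limit is reached.
            return await asyncio.wait_for(make_call(), timeout=limit)
        except asyncio.TimeoutError as exc:
            last_error = exc
    raise TimeoutError(f"gave up after {len(timeouts)} attempts") from last_error
```

The caller can catch the final TimeoutError to show retry guidance or switch to a fallback path such as manual input.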
Challenge: Data Accuracy
Ensuring extracted financial data maintains perfect accuracy without errors in numbers, dates, or calculations.
Solution:
Implemented multi-stage validation, cross-verification of calculations, and preview capabilities for user verification.
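As an illustration of cross-verifying calculations, the sketch below recomputes each row's gain from its sale value and purchase cost and flags rows where the reported figure disagrees. The column names and the one-cent tolerance are hypothetical, and real statements may need extra terms such as fees or indexation. Decimal is used so that currency amounts are not distorted by binary floating point.

```python
from decimal import Decimal


def find_gain_mismatches(rows, tolerance=Decimal("0.01")):
    """Return indexes of rows whose reported gain differs from sale minus cost."""
    mismatches = []
    for index, row in enumerate(rows):
        # Amounts are parsed from strings so no float rounding is introduced.
        expected = Decimal(row["sale_value"]) - Decimal(row["purchase_cost"])
        if abs(expected - Decimal(row["reported_gain"])) > tolerance:
            mismatches.append(index)
    return mismatches
```

Rows flagged this way could be highlighted in the preview so the user can compare them against the original PDF.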
Challenge: Cost Transparency
Users needed to understand the cost of processing before committing to expensive AI operations.
Solution:
Built comprehensive cost calculator with real-time estimates, token counting, and detailed breakdowns of all charges.
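A cost breakdown of this kind reduces to a per-page charge for extraction plus per-token charges for conversion. The sketch below assumes that pricing shape; the class, the function, and every rate passed to it are placeholders rather than real vendor prices, and the token counts are taken as inputs that a tokenizer such as tiktoken would supply.

```python
from dataclasses import dataclass
from decimal import Decimal


@dataclass(frozen=True)
class Rates:
    # All three rates are placeholders, not real vendor prices.
    extraction_per_page: Decimal
    llm_input_per_million_tokens: Decimal
    llm_output_per_million_tokens: Decimal


def cost_breakdown(
    pages: int, input_tokens: int, output_tokens: int, rates: Rates
) -> dict:
    """Split the estimated charge into extraction, conversion, and total."""
    million = Decimal(1_000_000)
    extraction = rates.extraction_per_page * pages
    conversion = (
        rates.llm_input_per_million_tokens * input_tokens
        + rates.llm_output_per_million_tokens * output_tokens
    ) / million
    return {
        "extraction": extraction,
        "conversion": conversion,
        "total": extraction + conversion,
    }
```

Keeping the two services as separate entries in the result mirrors the separate Landing AI and Gemini lines shown to the user.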
Future Enhancements
Database Integration
Add persistent storage for processed documents, user accounts, and processing history.
Mobile Optimization
Enhanced mobile experience with touch-friendly upload and better responsive design.
Analytics Dashboard
Visual analytics for capital gains trends, tax optimization suggestions, and portfolio insights.
Enhanced Security
End-to-end encryption, secure document handling, and compliance with financial data regulations.
API Integration
Direct integration with tax software, accounting platforms, and financial management tools.
Multi-Region Support
Support for different countries' tax formats and regulatory requirements.
Technology Credits
This application was built with gratitude to the open-source community and AI providers:
Core Technologies
- FastAPI by Sebastián Ramírez
- Bootstrap by the Bootstrap Team
- Font Awesome by Fonticons
- Pandas by the Pandas Development Team
AI Services
- Landing AI - Document Processing
- Google Gemini - Language Processing
- TikToken - Token Counting