ChronoScope: AI-Powered Timeline Builder

Drop a resume, cover letter, or personal document into ChronoScope and watch your career materialize as an interactive timeline. The differentiator: a privacy-first architecture in which documents are parsed and stored locally and only extracted text is sent to the OpenAI API, plus a validation system that assigns confidence scores instead of blindly trusting AI output. Export to TimelineJS for polished presentations or explore relationships in a knowledge graph.

Tags: Document Processing, GenAI, NLP, OpenAI, Plotly, Streamlit, Timeline, Visualization

Categories: Data Products & Interfaces

Why This Matters

There is a gap between “I have documents” and “I can see my journey.” Resumes are optimized for job applications, not reflection. Cover letters highlight specific moments. LinkedIn profiles follow a rigid format. None of them shows the full arc of your professional life: the threads connecting education to career pivots to skill development.

Manual timeline creation is tedious. You could build one by hand, copying dates and descriptions into a spreadsheet. Most people don’t, because the effort doesn’t justify the insight.

AI extraction changes the equation. Feed ChronoScope your documents and it identifies events, dates, locations, and people automatically. But AI can hallucinate—so ChronoScope includes confidence scoring and validation to catch extraction errors before they become timeline artifacts.

How It Works

📄 Upload Document → 🤖 AI Extraction → ✅ Validation → 📊 Visualization → 💾 Export

Extraction Pipeline

Stage	What Happens
Upload	Accept PDF, TXT, or DOCX; detect document type (resume, cover letter, personal statement)
Extract	GPT identifies events with dates, locations, people, and context
Fallback	If the LLM call fails, a rule-based parser extracts structured data
Validate	Quality checks assign confidence scores; flag uncertain extractions
Store	Events saved to local JSON with deduplication
Visualize	Render as Plotly timeline, priority matrix, or network graph
Export	Download as TimelineJS (6 color schemes, 10 fonts) or Excel
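
The Extract and Fallback stages can be sketched as a try/except wrapper. This is a minimal illustration rather than ChronoScope's actual code: `extract_with_fallback`, `rule_based_extract`, and the year-range regex are hypothetical, and the LLM extractor is passed in as a plain callable so the sketch runs without an API key.

```python
import re
from typing import Callable, Dict, List

# Hypothetical pattern: matches lines like "2018-2021 Data Analyst" or
# "2021 - Present Senior Engineer". A real fallback parser would need more rules.
YEAR_RANGE = re.compile(
    r"^(?P<start>\d{4})\s*[-–]\s*(?P<end>\d{4}|[Pp]resent)\s+(?P<title>.+)$"
)


def rule_based_extract(text: str) -> List[Dict[str, str]]:
    """Fallback parser: pull year ranges and titles from resume-style lines."""
    events = []
    for line in text.splitlines():
        match = YEAR_RANGE.match(line.strip())
        if match:
            events.append({
                "title": match.group("title").strip(),
                "start": match.group("start"),
                "end": match.group("end"),
                "method": "rules",
            })
    return events


def extract_with_fallback(
    text: str,
    llm_extract: Callable[[str], List[Dict[str, str]]],
) -> List[Dict[str, str]]:
    """Try the LLM extractor first; on any failure, use the rule-based parser."""
    try:
        events = llm_extract(text)
    except Exception:
        # Broad catch is deliberate in this sketch: network errors, rate limits,
        # and malformed responses all route to the same fallback.
        return rule_based_extract(text)
    for event in events:
        event.setdefault("method", "llm")
    return events
```

Tagging each event with the method that produced it is one way to let the Validate stage treat rule-based results differently from LLM results.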

TimelineEvent data model

```python
from dataclasses import dataclass
from datetime import datetime
from typing import List, Optional


@dataclass
class TimelineEvent:
    id: str                        # Unique identifier
    title: str                     # Event name
    description: str               # What happened
    start_date: datetime           # When it started
    end_date: Optional[datetime]   # When it ended (if known)
    location: Optional[str]        # Where it happened
    category: str                  # Education, Work, Achievement, etc.
    people: List[str]              # People involved
    tags: List[str]                # User-defined tags
    priority: int                  # 1-10 importance rating
    confidence: float              # AI confidence score (0-1)
    source_document: str           # Which document this came from
```
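
To make the `confidence` field concrete, here is one way a completeness-based score could be computed. It is a sketch under stated assumptions: `ExtractedEvent` is a trimmed-down stand-in for `TimelineEvent`, and the penalty weights and the 0.6 review threshold are illustrative values, not the ones ChronoScope uses.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import List, Optional, Tuple


@dataclass
class ExtractedEvent:
    """Trimmed-down stand-in for TimelineEvent, kept small for illustration."""
    title: str
    description: str
    start_date: Optional[datetime]
    end_date: Optional[datetime] = None


def score_event(
    event: ExtractedEvent, threshold: float = 0.6
) -> Tuple[float, List[str]]:
    """Return (confidence, issues). Weights and threshold are assumptions."""
    score = 1.0
    issues: List[str] = []
    if not event.title.strip():
        score -= 0.4
        issues.append("missing title")
    if event.start_date is None:
        score -= 0.4
        issues.append("missing start date")
    elif event.end_date is not None and event.end_date < event.start_date:
        score -= 0.3
        issues.append("end date precedes start date")
    if not event.description.strip():
        score -= 0.1
        issues.append("missing description")
    score = max(0.0, round(score, 2))
    if score < threshold:
        issues.append("flagged for review")
    return score, issues
```

Returning the list of issues alongside the number keeps the score explainable: the UI can show why an event was flagged instead of only showing that it was.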

What Shipped

Core Features

AI-powered extraction from PDF, TXT, and DOCX using OpenAI GPT with context-aware document classification
Dual extraction strategy — LLM primary with rule-based fallback when API fails
Interactive Plotly visualizations — Timeline charts, priority matrices, and temporal distribution views with filtering and tooltips
TimelineJS export with 6 professional color schemes and 10 font combinations
Knowledge graph — NetworkX visualization showing relationships between events, people, and places (JSON-based + optional Neo4j)
User notes — Capture thoughts alongside your timeline with persistent storage
Multi-document processing — Combine events from multiple sources into unified timeline
Quality validation — Confidence scoring and completeness checks
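
As an illustration of the TimelineJS export, the sketch below converts a list of events into the JSON structure TimelineJS reads (a `title` slide plus an `events` array with `start_date`, `end_date`, `text`, and `group`). The function names and the input dictionary keys are assumptions chosen to mirror `TimelineEvent`; color schemes and fonts are not covered here.

```python
import json
from datetime import datetime
from typing import Any, Dict, List


def to_tl_date(value: datetime) -> Dict[str, int]:
    """TimelineJS expects dates as separate year/month/day fields."""
    return {"year": value.year, "month": value.month, "day": value.day}


def to_timelinejs(events: List[Dict[str, Any]], headline: str) -> str:
    """Serialize events (dicts with title/start_date/...) to TimelineJS JSON."""
    slides = []
    for event in sorted(events, key=lambda e: e["start_date"]):
        slide: Dict[str, Any] = {
            "start_date": to_tl_date(event["start_date"]),
            "text": {
                "headline": event["title"],
                "text": event.get("description", ""),
            },
            # "group" places related slides on the same row of the timeline.
            "group": event.get("category", ""),
        }
        if event.get("end_date"):
            slide["end_date"] = to_tl_date(event["end_date"])
        slides.append(slide)
    payload = {"title": {"text": {"headline": headline}}, "events": slides}
    return json.dumps(payload, indent=2)
```

Mapping `category` to `group` is a design choice in this sketch, so that Education and Work events would land on separate rows.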

Privacy Architecture

Parsing and storage happen locally; only the LLM extraction step calls out to OpenAI:

Documents parsed on your machine
OpenAI API calls send only extracted text (not files)
Event data stored in local JSON files
No telemetry, no cloud storage, no data retention

Architecture

Core Components

Component	Purpose
DocumentProcessor	Context-aware extraction with LLM + fallback
TimelineStore	JSON persistence with filtering, deduplication, CRUD
TimelineVisualizer	Plotly charts with interactive tooltips
UserNotesStore	Persistent notes with metadata tracking

Data flow diagram

```
Document Upload
    ↓
DocumentProcessor
    ├── Document type detection (resume, cover letter, personal)
    ├── LLM extraction (OpenAI GPT)
    │       ↓
    │   (If API fails)
    │       ↓
    └── Fallback parser (rule-based)
    ↓
Validation
    ↓
TimelineStore
    ├── Deduplication
    ├── JSON persistence
    └── Filtering/querying
    ↓
TimelineVisualizer
    ↓
Streamlit UI
```

Storage Pattern

JSON-based with backup/restore:

timeline_events.json — Single source of truth for events
user_notes.json — User annotations with metadata
Automatic backup before destructive operations
Error recovery with graceful fallback
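
The storage pattern above could look roughly like the following. This is a hedged sketch, not the real `TimelineStore`: the class name `JsonEventStore`, the `.bak` backup suffix, and the title-plus-start-date deduplication key are all assumptions made for illustration.

```python
import json
import shutil
from pathlib import Path
from typing import Dict, List


class JsonEventStore:
    """Minimal JSON-backed event store with dedup and pre-delete backup."""

    def __init__(self, path: Path) -> None:
        self.path = Path(path)

    def load(self) -> List[Dict]:
        """Return stored events, or an empty list if the file is missing or corrupt."""
        try:
            return json.loads(self.path.read_text(encoding="utf-8"))
        except (FileNotFoundError, json.JSONDecodeError):
            return []

    def add(self, event: Dict) -> bool:
        """Append an event unless one with the same title and start date exists."""
        events = self.load()
        key = (event["title"].strip().lower(), event["start_date"])
        if any((e["title"].strip().lower(), e["start_date"]) == key for e in events):
            return False
        events.append(event)
        self._write(events)
        return True

    def clear(self) -> None:
        """Destructive operation: copy the current file to .bak before wiping."""
        if self.path.exists():
            shutil.copy2(self.path, self.path.with_name(self.path.name + ".bak"))
        self._write([])

    def _write(self, events: List[Dict]) -> None:
        # Write to a temp file and swap it in, so a crash mid-write
        # does not leave a half-written JSON file behind.
        tmp = self.path.with_suffix(".tmp")
        tmp.write_text(json.dumps(events, indent=2), encoding="utf-8")
        tmp.replace(self.path)
```

Dates are stored as ISO strings in this sketch because `json` cannot serialize `datetime` objects directly.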

Implementation Notes

Streamlit Lessons Learned

Building ChronoScope surfaced hard-won lessons about Streamlit’s reactive model. These patterns are now codified in the project’s development guide:

The Golden Rules:

Rule	Why It Matters
Every widget needs an explicit `key`	Without a key, Streamlit derives a widget's identity from its type and parameters; identical widgets collide, and changing a label or default resets state
Check file hash before processing	File uploaders persist across reruns; without deduplication, you get duplicate events
Justify every `st.rerun()`	Each rerun re-executes the entire script; an unconditional rerun loops forever
Cache expensive computations	Code in main body runs on every rerun; use hash-based caching
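
The hash-based caching rule can be sketched in plain Python. The names `cached_extract` and `_cache` are hypothetical; in a Streamlit app the dictionary would more likely live in `st.session_state`, or the function would be wrapped with `st.cache_data`. Neither is shown, so the sketch runs standalone.

```python
import hashlib
from typing import Callable, Dict, List

# Module-level cache keyed by content hash. Assumption: results are small
# enough to keep in memory for the lifetime of the process.
_cache: Dict[str, List[dict]] = {}


def cached_extract(
    text: str, extract: Callable[[str], List[dict]]
) -> List[dict]:
    """Run `extract` once per distinct text; reuse the result afterwards."""
    key = hashlib.sha256(text.encode("utf-8")).hexdigest()
    if key not in _cache:
        _cache[key] = extract(text)
    return _cache[key]
```

Keying on the content hash rather than the file name means a renamed copy of the same document still hits the cache, while an edited document under the same name does not.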

Widget key registry pattern:

# Naming convention: {section}_{purpose}
st.selectbox("Color scheme", options, key="export_color_scheme")
st.date_input("Date range", value, key="filter_date_range")
st.slider("Priority", 1, 10, key="filter_priority")

File deduplication pattern:

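import hashlib

# Assumes `uploaded_files` comes from st.file_uploader(..., accept_multiple_files=True)
if "processed_files" not in st.session_state:
    st.session_state.processed_files = {}

for file in uploaded_files:
    file_hash = hashlib.md5(file.read()).hexdigest()
    file.seek(0)  # Reset pointer after hashing

    if file_hash in st.session_state.processed_files:
        continue  # Skip already-processed file

    process_file(file)
    st.session_state.processed_files[file_hash] = metadata  # e.g. file name, timestamp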

These patterns eliminated a class of bugs where events were reprocessed on every UI interaction.

What’s Next

LLM transparency features — Show extraction reasoning alongside results
Formal test suite — pytest coverage for DocumentProcessor, TimelineStore
Performance optimization — Cache LLM responses to avoid redundant API calls
Enhanced date parsing — Better handling of ambiguous dates and ranges
Mobile-responsive design — Timeline viewing on smaller screens

Use Cases

Scenario	How ChronoScope Helps
Career portfolio	Visualize your professional journey for interviews or self-reflection
Academic timeline	Track education milestones, publications, research evolution
Life story	Document personal achievements and memories in chronological context
Project history	Chart project evolution, deliverables, and team contributions
Team retrospective	Combine individual timelines into collective achievement visualization

Screens & Artifacts

Screenshots from the Streamlit interface:

Main timeline view with interactive Plotly chart
Advanced settings panel with export options
TimelineJS exported presentation

See docs/assets/images/ for interface screenshots.