File Service

Files are the core of The Archive's digital assets. The legacy systems couldn't tell you where a file was in the processing pipeline, whether an upload had actually succeeded, or whether a file deposited two years ago was still intact. For an archive whose mission is long-term preservation, that's not a UX problem; it's an existential one.

Product Designer

Product Owner

Tech Lead / Architect

3-4 Developers

UX Engineer

MVP: Mar 2024 - Mar 2025

Maintenance: Mar 2026 - Current

Overview

The Archive receives and preserves tens of thousands of research datasets deposited by academic researchers, federal agencies, and data publishers. The File Service is the reusable React component library and backend pipeline that handles every file interaction across The Archive's products — upload, virus scanning, metadata extraction, status tracking, and download. It replaced two legacy systems that had become dangerously fragile.

As the product designer on this project, I was responsible for the user-facing experience across all interaction types: deposit workflows for researchers of varying technical sophistication, status visibility for staff, and the download management interfaces for complex restricted-data access.

What Was Actually Broken

The legacy systems had accumulated a specific category of failure: silent errors. Files would disappear from a depositor's workspace without explanation. Uploads would fail, but the user was not notified — file metadata would appear in the deposit workspace as if the upload had succeeded, and the depositor would discover the error only when attempting to publish, sometimes months later. Processing multiple large deposits simultaneously could create backlogs lasting more than 24 hours, with no visibility into why.

The failures weren't random — they had a structural cause. File handling was built into each product individually rather than operating as an independent service. The architecture made it impossible to monitor, upgrade, or scale file processing without touching every product that depended on it.

The File Service solved this at the infrastructure level: a single, modular service accessed by all The Archive products. Files upload directly from the browser to S3 — bypassing intermediate servers that introduced timeout failures. A configurable processing pipeline handles virus scanning, metadata extraction, and storage checks independently, so a failure in one stage doesn't corrupt another. Every stage is logged and visible to both staff and depositors in real time.
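The stage isolation described above can be sketched roughly as follows. This is a minimal illustration, not the File Service's actual code: the stage names come from the text, while `runPipeline`, `Stage`, `FileRecord`, and the example checks are hypothetical stand-ins.

```typescript
// Stage names taken from the text; everything else here is hypothetical.
type StageName = "virusScan" | "metadataExtraction" | "storageCheck";

type StageResult =
  | { stage: StageName; status: "passed" }
  | { stage: StageName; status: "failed"; error: string };

interface FileRecord {
  name: string;
  sizeBytes: number;
}

// A stage is any function that inspects a file and throws on failure.
interface Stage {
  name: StageName;
  run: (file: FileRecord) => void;
}

// Run every stage in isolation: one stage throwing never prevents the
// others from running, and every outcome is appended to the log.
function runPipeline(file: FileRecord, stages: Stage[]): StageResult[] {
  const log: StageResult[] = [];
  for (const stage of stages) {
    try {
      stage.run(file);
      log.push({ stage: stage.name, status: "passed" });
    } catch (err) {
      const message = err instanceof Error ? err.message : String(err);
      log.push({ stage: stage.name, status: "failed", error: message });
    }
  }
  return log;
}

// Illustrative stand-in checks, not real scanners or extractors.
const exampleStages: Stage[] = [
  { name: "virusScan", run: () => {} },
  {
    name: "metadataExtraction",
    run: (file) => {
      if (!file.name.includes(".")) throw new Error("no file extension");
    },
  },
  {
    name: "storageCheck",
    run: (file) => {
      if (file.sizeBytes <= 0) throw new Error("empty file");
    },
  },
];

// The metadata stage fails here, yet the storage check still runs and
// every stage leaves a log entry that a status view could display.
const pipelineLog = runPipeline(
  { name: "survey_data", sizeBytes: 2048 },
  exampleStages,
);
```

Because each outcome is recorded per stage, a status view can show exactly which step failed instead of reporting one opaque error for the whole file.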

Considerations

User Spectrum: Designing for the Full Range

The Archive's depositor population spans an enormous range of technical sophistication and motivation. The design couldn't optimize for one end of that range without failing the other.

                                                                                       
Depositor Profile | What They Need From the System
Novice researcher (e.g. undergraduate) | Reassurance that the upload worked, clear confirmation, minimal cognitive load
Intermediate researcher | Structured workflow, visibility into file organization requirements, feedback on completeness
Principal Investigator (PI) | Audit capability across large study archives, efficiency, detail on demand without clutter
The Archive Staff (curators, support) | Full processing history, ability to re-run ingests, provenance audit trail for compliance

The Two-Axis Display Problem

The core UX insight from design working sessions: file list behavior is governed by two independent axes that must be treated separately.

                                                                   
Axis | What It Controls
Screen size (mobile → desktop) | How much horizontal space is available for metadata columns
Context importance (minimal → full management) | How much cognitive weight the file list is allocated within the surrounding page

My solution: metadata priority tiers that assign display conditions based on both axes together. A field earns its place based on context importance and screen size simultaneously, not independently.

                                                                                       
Tier | Fields | Display Condition
Tier 1: Always visible | File name, file type (icon/pill), upload status | Every context, every screen size
Tier 2: Space permitting | Date added, file size | Desktop, or high-importance context on any screen
Tier 3: Full management only | Inline actions (download, rename, delete), description/notes, uploader/version | Desktop + high-importance context
Tier 4: On demand only | Checksum/hash, full path/URL, processing log | Tap or click to expand in a detail panel
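The tier rules in the table can be expressed as a small function of both axes. The sketch below is one possible encoding, under two assumptions: each axis is collapsed to two values, and "full" context stands in for what the table calls high-importance. The names `visibleTiers`, `ScreenSize`, and `ContextImportance` are illustrative and are not taken from the component library.

```typescript
// Assumption: each axis is reduced to two values for illustration.
type ScreenSize = "mobile" | "desktop";
type ContextImportance = "minimal" | "full";
type Tier = 1 | 2 | 3 | 4;

// Returns the metadata tiers that should be displayed for a file row.
function visibleTiers(
  screen: ScreenSize,
  importance: ContextImportance,
  detailPanelOpen: boolean = false,
): Tier[] {
  const isDesktop = screen === "desktop";
  const isHighImportance = importance === "full";

  // Tier 1: every context, every screen size.
  const tiers: Tier[] = [1];

  // Tier 2: desktop, or high-importance context on any screen.
  if (isDesktop || isHighImportance) tiers.push(2);

  // Tier 3: desktop and high-importance context together.
  if (isDesktop && isHighImportance) tiers.push(3);

  // Tier 4: only when the user expands the detail panel.
  if (detailPanelOpen) tiers.push(4);

  return tiers;
}

// A minimal file list on a phone shows Tier 1 only; a full management
// view on desktop shows Tiers 1 through 3.
const phoneMinimal = visibleTiers("mobile", "minimal");
const desktopFull = visibleTiers("desktop", "full");
```

Keeping the rules in one function means a component only needs to know which tier a field belongs to, rather than carrying its own breakpoint logic.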

Handling Scale: Large Files, Parallel Uploads, Legacy Migration

The File Service had to handle use cases that exposed the limits of the previous systems in concrete terms. One project brought more than 11 TB of video files digitized from 1,500 VHS tapes, a single deposit larger than anything the legacy systems could process. Another organization regularly receives deposits with tens of thousands of files and strict publication deadlines. A third delivers a large quarterly file via SFTP that needs automated ingestion without manual intervention.

Each of these wasn't just a performance problem — it was a UX problem. When a large upload is running for hours, the user needs to know it's working. When a batch of 10,000 files is ingesting in parallel, the status display needs to communicate aggregate progress without becoming a wall of individual status indicators. I designed the progress and status experience for these edge cases as primary scenarios, not exceptions.
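One way to turn thousands of per-file states into a single aggregate indicator is sketched below. The status values, the `summarizeBatch` function, and the label wording are all assumptions made for illustration; the text does not specify the states the File Service actually tracks.

```typescript
// Hypothetical per-file states; the real service may track different ones.
type FileStatus = "queued" | "processing" | "complete" | "failed";

interface BatchSummary {
  total: number;
  counts: Record<FileStatus, number>;
  percentFinished: number; // files that reached a final state, 0-100
  label: string; // one line of text instead of thousands of indicators
}

function summarizeBatch(statuses: FileStatus[]): BatchSummary {
  const counts: Record<FileStatus, number> = {
    queued: 0,
    processing: 0,
    complete: 0,
    failed: 0,
  };
  for (const status of statuses) counts[status] += 1;

  const total = statuses.length;
  // "Finished" means no more work is pending, whether it succeeded or not.
  const finished = counts.complete + counts.failed;
  const percentFinished =
    total === 0 ? 0 : Math.floor((finished / total) * 100);

  // Failures are surfaced in the summary so they are never silent.
  const failureNote = counts.failed > 0 ? `, ${counts.failed} failed` : "";
  const label = `${counts.complete} of ${total} files complete${failureNote}`;

  return { total, counts, percentFinished, label };
}

// Helper that builds n copies of one status for the example below.
const repeatStatus = (status: FileStatus, n: number): FileStatus[] =>
  Array.from({ length: n }, () => status);

// Example: a 10,000-file batch partway through ingestion.
const batchSummary = summarizeBatch([
  ...repeatStatus("complete", 7500),
  ...repeatStatus("failed", 20),
  ...repeatStatus("processing", 480),
  ...repeatStatus("queued", 2000),
]);
// batchSummary.label is "7500 of 10000 files complete, 20 failed"
```

Counting failed files separately from completed ones keeps the aggregate honest: the progress bar can reach 100% while the label still tells the depositor that some files need attention.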

Outcome

The File Service shipped and has been available for production use since March 2025. It replaced two legacy systems and became the shared file infrastructure for all The Archive products.

Takeaway

The File Service taught me that infrastructure design is user experience design. The pipeline behind the upload button is invisible to the user — until it fails. When it fails silently (as the legacy system did), it creates a specific kind of distrust that is very hard to recover from: the user stops believing the system is telling them the truth. Making the pipeline visible — its stages, its progress, its errors — transforms infrastructure from an invisible assumption into a legible, trustworthy contract.
