erDiagram
PATIENT ||--o{ TUMOR : has
TUMOR ||--o{ TREATMENT : receives
TUMOR ||--|| DIAGNOSIS : "classified by"
FACILITY ||--o{ TUMOR : reports
PATIENT {
string patient_id PK
string name
date birth_date
string ssn
string address
}
TUMOR {
string tumor_id PK
string patient_id FK
date diagnosis_date
string primary_site
string histology
string stage
}
8.1 Requirements Analysis & Data Analysis
Raw elicitation findings must be organized, analyzed, and specified in detail. In business analysis, this is Requirements Analysis and Design Definition. In public health, it maps to Data Analysis and Logic Model Development. Both processes transform unstructured input into structured, actionable specifications.
8.1.1 The Dual Framework
| BA Perspective | PH Perspective |
|---|---|
| Requirements Analysis | Data Analysis |
| Requirements Specification | Logic Model / Theory of Change |
| Functional Requirements | Program Activities |
| Non-Functional Requirements | Implementation Characteristics |
| Data Requirements | Case Definitions, Data Dictionaries |
| Business Rules | Clinical Guidelines, Protocols |
8.1.2 Types of Requirements
8.1.2.1 Functional Requirements
BA Definition: What the system must do. Capabilities, features, functions.
PH Equivalent: Program activities, intervention components, service delivery specifications.
Functional Requirement (BA format):
FR-101: The system shall allow users to search for cases by patient name, medical record number, or social security number.
Program Activity (PH format):
Cancer registrars will abstract and code incident cases from hospital pathology reports within 6 months of diagnosis date.
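Both describe “what happens,” but at different levels of specificity: the functional requirement specifies a system capability, while the program activity specifies who performs the work and by when.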
8.1.2.2 Non-Functional Requirements (NFRs)
BA Definition: Quality attributes, constraints, performance characteristics.
PH Equivalent: Implementation characteristics, per the Consolidated Framework for Implementation Research (CFIR).
| NFR Category | BA Focus | PH Focus (CFIR Domain) |
|---|---|---|
| Performance | Response time, throughput | Efficiency of intervention delivery |
| Security | Access control, encryption | HIPAA compliance, trust |
| Scalability | Growth capacity | Outbreak surge response |
| Usability | User interface design | Complexity, ease of adoption |
| Reliability | Uptime, fault tolerance | Service continuity |
| Interoperability | API standards, data exchange | Health information exchange |
NFR (BA format):
NFR-201: The system shall maintain 99.9% uptime during business hours (8 AM to 6 PM Eastern).
Implementation Characteristic (PH format):
The CancerSurv platform must demonstrate high reliability to maintain registrar confidence and ensure continuous data collection, which is especially critical during cancer awareness campaigns, when reporting volumes increase.
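To make NFR-201 verifiable, it helps to translate the percentage into a downtime budget. The sketch below is illustrative only: the function name and the 22-business-day measurement month are assumptions, since the requirement as written does not state which days count or how long the measurement window is.

```python
def downtime_budget_minutes(availability: float, hours_per_day: float, days: int) -> float:
    """Minutes of downtime permitted inside the measured service window."""
    if not 0.0 <= availability <= 1.0:
        raise ValueError("availability must be a fraction between 0 and 1")
    window_minutes = hours_per_day * 60 * days
    return (1.0 - availability) * window_minutes


# NFR-201: 99.9% uptime, 8 AM to 6 PM Eastern (10 hours per day).
# Assumption: 22 business days in the measurement month.
monthly_budget = downtime_budget_minutes(0.999, hours_per_day=10, days=22)
print(f"Allowed downtime: {monthly_budget:.1f} minutes per month")
```

Under these assumptions the budget is roughly 13 minutes per month, which gives testers and operations staff a concrete figure to monitor against.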
8.1.2.3 Data Requirements
Data specifications are central to both domains:
BA Data Model:
- Entity-Relationship diagrams
- Database schemas
- Data dictionaries
- Validation rules
PH Case Definitions:
- Diagnostic criteria
- Inclusion/exclusion criteria
- Coding standards (ICD-O-3, TNM)
- Data quality metrics
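As a rough illustration of how a data dictionary entry and its validation rules fit together, the sketch below models the TUMOR entity with the fields from the entity-relationship diagram. The choice of required fields and the two date rules are assumptions made for the example, not rules taken from NAACCR or from the CancerSurv specification.

```python
from dataclasses import dataclass
from datetime import date


@dataclass
class Tumor:
    """Field names follow the TUMOR entity in the ER diagram."""
    tumor_id: str
    patient_id: str
    diagnosis_date: date
    primary_site: str
    histology: str
    stage: str


def validate_tumor(tumor: Tumor, birth_date: date, today: date) -> list[str]:
    """Return a list of human-readable validation errors (empty if none)."""
    errors = []
    # Assumption: these four fields are mandatory; stage may be blank.
    for name in ("tumor_id", "patient_id", "primary_site", "histology"):
        if not getattr(tumor, name).strip():
            errors.append(f"{name} is required")
    # Illustrative plausibility checks on the diagnosis date.
    if tumor.diagnosis_date < birth_date:
        errors.append("diagnosis_date precedes birth_date")
    if tumor.diagnosis_date > today:
        errors.append("diagnosis_date is in the future")
    return errors


# Hypothetical record with a missing histology and an impossible date.
bad = Tumor("T-1", "P-1", date(1950, 1, 1), "C50.9", "", "")
problems = validate_tumor(bad, birth_date=date(1960, 5, 1), today=date(2024, 6, 1))
```

Returning a list of errors rather than raising on the first one mirrors how edit checks are usually reported to registrars: all problems with a record at once.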
8.1.2.4 Data Architecture Requirements
Modern public health data systems require an architecture that handles data from ingestion through analytics. The medallion architecture provides a framework for specifying data flow requirements across three progressive layers: Bronze (raw), Silver (cleansed), and Gold (curated).
Specifying Requirements by Layer
When documenting data requirements, specify which layer each requirement applies to:
| Requirement Type | Bronze Layer | Silver Layer | Gold Layer |
|---|---|---|---|
| Primary Focus | Completeness, lineage | Accuracy, consistency | Timeliness, usability |
| Data State | Raw, as-received | Cleansed, standardized | Aggregated, analytics-ready |
| Schema | Schema-on-read (flexible) | Enforced schema | Dimensional models |
| Retention | Long-term archive | Medium-term | Purpose-specific |
Bronze Layer Requirements:
- REQ-DATA-001: The system shall ingest HL7 v2.x ADT messages from hospital interfaces within 15 minutes of receipt
- REQ-DATA-002: The system shall preserve original message content with timestamp and source metadata for audit purposes
- REQ-DATA-003: The system shall support ingestion of CSV files from facilities without HL7 capability
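One possible reading of REQ-DATA-002 is an immutable envelope around each raw message. The sketch below is a minimal illustration, not the CancerSurv design: the `BronzeRecord` class, the `ingest` function, the checksum field, and the sample message fragment are all hypothetical.

```python
import hashlib
from dataclasses import dataclass
from datetime import datetime, timezone


@dataclass(frozen=True)
class BronzeRecord:
    """One raw message stored exactly as received, plus audit metadata."""
    payload: bytes         # original bytes, never modified after ingestion
    source: str            # sending facility or interface name
    received_at: datetime  # receipt timestamp in UTC
    sha256: str            # checksum used to show the payload is unaltered


def ingest(payload: bytes, source: str) -> BronzeRecord:
    """Wrap a raw message without parsing or cleansing it."""
    return BronzeRecord(
        payload=payload,
        source=source,
        received_at=datetime.now(timezone.utc),
        sha256=hashlib.sha256(payload).hexdigest(),
    )


# Hypothetical HL7 v2 fragment and facility name, for illustration only.
record = ingest(b"MSH|^~\\&|ADT|HOSP_A|", source="HOSP_A")
```

Because the record is frozen and stores the untouched bytes, any later cleansing has to happen in a new Silver-layer record, which keeps the audit trail intact.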
Silver Layer Requirements:
- REQ-DATA-010: The system shall deduplicate patient records using probabilistic matching (≥95% precision)
- REQ-DATA-011: The system shall map incoming diagnosis codes to ICD-O-3 standard within 24 hours
- REQ-DATA-012: The system shall apply NAACCR edit checks and flag records failing validation
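The precision target in REQ-DATA-010 only becomes testable once the measurement is defined. A minimal sketch, assuming precision is estimated from a manually reviewed sample of proposed matches (the counts below are invented for illustration):

```python
PRECISION_TARGET = 0.95  # threshold stated in REQ-DATA-010


def match_precision(true_matches: int, false_matches: int) -> float:
    """Share of proposed matches that reviewers confirmed as the same patient."""
    reviewed = true_matches + false_matches
    if reviewed == 0:
        raise ValueError("no reviewed match decisions")
    return true_matches / reviewed


# Hypothetical review: 500 proposed matches, 18 judged to be different patients.
precision = match_precision(true_matches=482, false_matches=18)
meets_target = precision >= PRECISION_TARGET
```

Note that precision says nothing about missed duplicates; a requirement that also cares about those would need a recall target as well.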
Gold Layer Requirements:
- REQ-DATA-020: The system shall generate NPCR-compliant annual submission files by January 31
- REQ-DATA-021: The system shall calculate age-adjusted incidence rates by county, updated monthly
- REQ-DATA-022: The system shall provide self-service query access for approved epidemiologists
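REQ-DATA-021 implies a specific calculation. The sketch below uses direct standardization, the usual method for age-adjusted rates: each age group's rate is weighted by that group's share of a standard population. The three age bands, the counts, and the weights are made up for the example; a real implementation would use the age groups and standard population required by the reporting program.

```python
def age_adjusted_rate(strata: list[dict], per: int = 100_000) -> float:
    """Directly standardized rate: sum of (age-specific rate x standard weight)."""
    total_weight = sum(s["std_weight"] for s in strata)
    if abs(total_weight - 1.0) > 1e-9:
        raise ValueError("standard population weights must sum to 1")
    return per * sum(s["cases"] / s["population"] * s["std_weight"] for s in strata)


# Hypothetical county data; weights are NOT an official standard population.
county = [
    {"age": "0-49", "cases": 10, "population": 50_000, "std_weight": 0.6},
    {"age": "50-69", "cases": 40, "population": 20_000, "std_weight": 0.3},
    {"age": "70+", "cases": 30, "population": 5_000, "std_weight": 0.1},
]

adjusted = age_adjusted_rate(county)
crude = 100_000 * sum(s["cases"] for s in county) / sum(s["population"] for s in county)
```

With these illustrative numbers the adjusted rate is about 132 per 100,000 against a crude rate of about 107, which shows why the requirement should name the standard population explicitly.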
Data Lineage and Traceability
Public health reporting requires demonstrable data provenance. Requirements should specify:
- How data flows from source to final output
- Which transformations are applied at each layer
- How to trace any Gold-layer value back to its Bronze-layer source
This is analogous to the “chain of custody” concept in laboratory settings.
flowchart LR
subgraph Bronze["Bronze (Raw)"]
B1["HL7 Messages"]
B2["Lab Reports"]
B3["Vital Records"]
end
subgraph Silver["Silver (Cleansed)"]
S1["Deduplicated<br/>Patient Records"]
S2["Standardized<br/>Case Abstracts"]
end
subgraph Gold["Gold (Curated)"]
G1["Incidence<br/>Reports"]
G2["Analytics<br/>Dashboards"]
G3["Research<br/>Datasets"]
end
B1 --> S1
B2 --> S1
B3 --> S1
S1 --> S2
S2 --> G1
S2 --> G2
S2 --> G3
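The traceability requirement can be sketched as a walk over recorded parent links. The example below reuses the node names from the flowchart above; storing lineage as a simple parent map is an assumption made for illustration, and a production system would more likely record lineage per record or per batch.

```python
# Each node maps to the upstream nodes it was derived from (from the flowchart).
PARENTS = {
    "G1": ["S2"], "G2": ["S2"], "G3": ["S2"],
    "S2": ["S1"],
    "S1": ["B1", "B2", "B3"],
}


def trace_to_sources(node: str, parents: dict[str, list[str]]) -> list[str]:
    """Return every origin (a node with no recorded parents) that feeds `node`."""
    sources, seen, stack = set(), set(), [node]
    while stack:
        current = stack.pop()
        if current in seen:
            continue
        seen.add(current)
        upstream = parents.get(current, [])
        if upstream:
            stack.extend(upstream)   # keep walking toward the raw layer
        else:
            sources.add(current)     # nothing upstream: this is a source
    return sorted(sources)


incidence_sources = trace_to_sources("G1", PARENTS)
```

Here `incidence_sources` lists the three Bronze inputs behind the incidence report, which is the question an auditor would ask of any Gold-layer figure.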
8.1.2.5 Business Rules / Clinical Guidelines
Rules governing system behavior and data processing:
| BA Business Rule | PH Clinical Guideline |
|---|---|
| “Order cannot be placed if credit limit exceeded” | “Case is reportable if the patient’s residence at diagnosis is within state jurisdiction” |
| “Discount applies if quantity > 100” | “Stage is unknown if pathology report unavailable within 4 months” |
| “Manager approval required for refunds > $500” | “Multiple primary rules apply per SEER guidelines” |
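A rule in this table becomes testable once it is written as a function with explicit inputs. The sketch below encodes one interpretation of the four-month staging rule; the function names and the reading that "unavailable" means "not received by the deadline" are assumptions, and the real rule would be taken from the registry's coding manual.

```python
import calendar
from datetime import date
from typing import Optional


def add_months(d: date, months: int) -> date:
    """Add calendar months, clamping to the last day of shorter months."""
    month_index = d.month - 1 + months
    year = d.year + month_index // 12
    month = month_index % 12 + 1
    day = min(d.day, calendar.monthrange(year, month)[1])
    return date(year, month, day)


def resolve_stage(diagnosis_date: date,
                  pathology_received: Optional[date],
                  abstracted_stage: str) -> str:
    """Return 'Unknown' when no pathology report arrived within 4 months."""
    deadline = add_months(diagnosis_date, 4)
    if pathology_received is None or pathology_received > deadline:
        return "Unknown"
    return abstracted_stage


# Hypothetical case: diagnosed 15 Jan 2024, report received 1 Mar 2024.
stage = resolve_stage(date(2024, 1, 15), date(2024, 3, 1), "II")
```

Writing the rule this way also surfaces questions for stakeholders, such as how the deadline is counted and what happens when a late report eventually arrives.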
8.1.3 Data Standards as Primary Requirements
In commercial software projects, data standards (file formats, API specifications, integration protocols) are often treated as technical details to be resolved by developers during implementation. In public health IT, this approach fails.
Data standards are primary business requirements, not optional technical details.
Health information systems operate within a regulatory and interoperability landscape where specific standards are mandated, not merely preferred. These standards should be identified and documented early, during requirements analysis, not deferred to design or implementation.
| Standard | Purpose | Requirement Implication |
|---|---|---|
| HIPAA | Privacy and security | Security architecture, access controls, audit logging |
| HL7 v2 | Message-based data exchange | Interface specifications for lab results, ADT events |
| HL7 FHIR | Modern API-based exchange | RESTful API design for EHR integration |
| USCDI | Federal data interoperability | Required data classes for ONC certification |
| ICD-10 / ICD-O-3 | Diagnosis and oncology coding | Validation rules, lookup tables, code mapping |
| SNOMED CT | Clinical terminology | Concept mapping specifications |
| LOINC | Laboratory test coding | Interface specifications for electronic lab reporting |
| NAACCR | Cancer registry standards | Data dictionary, edit checks, submission formats |
When data standards are not identified as requirements, projects encounter costly surprises during integration testing. A system that functions correctly in isolation may fail when connected to external systems that expect specific data formats, codes, or protocols.
For business analysts entering public health IT: treat data standards as "Must Have" requirements from day one. Interview stakeholders about external data exchanges early, and document the specific standards each interface requires.
Standards-Based Requirements for CancerSurv:
| Standard | CancerSurv Requirement |
|---|---|
| HIPAA | All PHI encrypted at rest and in transit; role-based access; 6-year audit log retention |
| HL7 FHIR | Patient, Condition, and Observation resources for hospital EHR integration |
| NAACCR v24 | All required data items; automated EDITS validation; annual submission file generation |
| ICD-O-3 | Validated primary site and histology codes with cross-validation rules |
| LOINC | Mapping table for incoming electronic pathology reports |
These standards-based requirements appeared in the CancerSurv requirements specification alongside functional requirements, with the same priority and traceability as any other "Must Have" item.
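To show how a standard turns into a concrete validation rule, the sketch below checks the shape of ICD-O-3 codes: a topography code written as `C`, two digits, a dot, and one digit (C00.0 through C80.9), and a morphology code written as four digits, a slash, and a behavior digit. It is a format check only, under the assumption that codes arrive in this dotted and slashed notation; it does not confirm that a code exists in the ICD-O-3 tables or that a site and histology combination is plausible, which would require the official lookup and edit tables.

```python
import re

# Shape of a topography code in the range C00.0 to C80.9.
TOPOGRAPHY = re.compile(r"C(?:[0-7]\d|80)\.\d")
# Four-digit histology starting with 8 or 9, then a behavior digit (0,1,2,3,6,9).
MORPHOLOGY = re.compile(r"[89]\d{3}/[012369]")


def is_well_formed_topography(code: str) -> bool:
    """True if the code has the ICD-O-3 topography shape (not a table lookup)."""
    return TOPOGRAPHY.fullmatch(code) is not None


def is_well_formed_morphology(code: str) -> bool:
    """True if the code has the ICD-O-3 morphology shape (not a table lookup)."""
    return MORPHOLOGY.fullmatch(code) is not None


checks = {
    "C50.9": is_well_formed_topography("C50.9"),
    "C81.0": is_well_formed_topography("C81.0"),
    "8500/3": is_well_formed_morphology("8500/3"),
    "8500/4": is_well_formed_morphology("8500/4"),
}
```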
8.1.4 The Logic Model as Requirements Framework
Public health uses the Logic Model to specify program components. This structure maps directly to requirements categories:
flowchart LR
subgraph Inputs[" "]
I["**Inputs**<br/>(Resources)<br/>───────<br/>Funding<br/>Staff<br/>Data feeds<br/>Infrastructure"]
end
subgraph Activities[" "]
A["**Activities**<br/>(Functions)<br/>───────<br/>Case abstraction<br/>Data quality<br/>Reporting<br/>Analytics"]
end
subgraph Outputs[" "]
O["**Outputs**<br/>(Deliverables)<br/>───────<br/>Case records<br/>Quality reports<br/>NPCR submissions<br/>Dashboards"]
end
subgraph Outcomes[" "]
OC["**Outcomes**<br/>(Success Metrics)<br/>───────<br/>95% completeness<br/>Timely reporting<br/>User satisfaction"]
end
I --> A --> O --> OC
| Logic Model Component | Requirements Category |
|---|---|
| Inputs | Constraints, Assumptions, Dependencies |
| Activities | Functional Requirements |
| Outputs | System Deliverables, Features |
| Outcomes | Success Metrics, Acceptance Criteria |
| Impact | Strategic Objectives, Business Value |
8.1.5 Prioritization
8.1.5.1 Methods for Ranking Requirements
Not all requirements are equal. Prioritization ensures critical needs are addressed first:
MoSCoW Method:
- Must have: Essential for go-live
- Should have: Important but not critical
- Could have: Desirable if time permits
- Won’t have: Out of scope for this release
Weighted Scoring:
Assign weights to criteria (business value, regulatory requirement, user impact), score each requirement against every criterion, and rank requirements by their weighted totals.
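A small worked example of weighted scoring, using the three criteria named above. The weights and the 1-to-5 scores are invented for illustration; in practice they would be agreed with stakeholders.

```python
# Hypothetical weights (sum to 1) reflecting a regulatory-driven program.
WEIGHTS = {"business_value": 0.3, "regulatory": 0.5, "user_impact": 0.2}

# Hypothetical 1-5 scores for three candidate requirements.
SCORES = {
    "NPCR data submission capability": {"business_value": 4, "regulatory": 5, "user_impact": 3},
    "Real-time analytics dashboard": {"business_value": 4, "regulatory": 1, "user_impact": 4},
    "Patient portal for self-reported outcomes": {"business_value": 2, "regulatory": 1, "user_impact": 3},
}


def weighted_score(scores: dict[str, int], weights: dict[str, float]) -> float:
    """Sum of each criterion score multiplied by its weight."""
    return sum(weights[criterion] * scores[criterion] for criterion in weights)


totals = {name: weighted_score(s, WEIGHTS) for name, s in SCORES.items()}
ranked = sorted(totals, key=totals.get, reverse=True)
```

With these numbers the NPCR submission requirement ranks first (about 4.3), followed by the dashboard (about 2.5) and the patient portal (about 1.7).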
Kano Model:
- Basic needs (expected, cause dissatisfaction if missing)
- Performance needs (more is better)
- Delighters (unexpected features that excite)
Example MoSCoW prioritization for CancerSurv:
Must Have:
- Case entry and coding functionality
- HIPAA-compliant security
- NPCR data submission capability
Should Have:
- Real-time analytics dashboard
- Mobile-friendly interface
- Automated duplicate detection
Could Have:
- Machine learning for coding assistance
- Patient portal for self-reported outcomes
- Integration with research databases
8.1.6 Requirements Traceability
8.1.6.1 Linking Requirements to Objectives
Traceability ensures every requirement connects to a business need or program goal:
flowchart TB
BN[Business Need /<br/>Program Goal] --> FR[Functional<br/>Requirement]
BN --> NFR[Non-Functional<br/>Requirement]
FR --> TC[Test Case]
NFR --> TC
FR --> US[User Story]
TC --> TR[Test Result]
Traceability Matrix Example:
| Requirement ID | Description | Source | Priority | Test Case |
|---|---|---|---|---|
| FR-101 | Case search functionality | Registrar interviews | Must | TC-101, TC-102 |
| FR-102 | ICD-O-3 coding validation | NAACCR standards | Must | TC-103 |
| NFR-201 | 99.9% uptime | SLA requirements | Must | TC-201 |
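A traceability matrix is most useful when gaps can be detected automatically. The sketch below loads the three rows from the table above and adds one hypothetical row, FR-103, that deliberately lacks a source and test cases; the dictionary layout and the function name are assumptions for illustration.

```python
matrix = [
    {"id": "FR-101", "source": "Registrar interviews", "tests": ["TC-101", "TC-102"]},
    {"id": "FR-102", "source": "NAACCR standards", "tests": ["TC-103"]},
    {"id": "NFR-201", "source": "SLA requirements", "tests": ["TC-201"]},
    # Hypothetical untraced requirement, added only to show gap detection.
    {"id": "FR-103", "source": "", "tests": []},
]


def traceability_gaps(rows: list[dict]) -> dict[str, list[str]]:
    """Map each requirement ID to the traceability links it is missing."""
    gaps = {}
    for row in rows:
        problems = []
        if not row["source"]:
            problems.append("no source")
        if not row["tests"]:
            problems.append("no test case")
        if problems:
            gaps[row["id"]] = problems
    return gaps


gaps = traceability_gaps(matrix)
```

Running a check like this before each release review flags requirements that cannot be justified by a source or verified by a test.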
8.1.7 Specification Formats
8.1.7.1 Writing Good Requirements
Regardless of format, good requirements share characteristics:
| Characteristic | Description | Example |
|---|---|---|
| Complete | Contains all necessary information | Includes error handling, edge cases |
| Consistent | Does not contradict other requirements | Uses standard terminology |
| Unambiguous | Only one interpretation possible | “Within 3 seconds” not “quickly” |
| Verifiable | Can be tested | Measurable acceptance criteria |
| Traceable | Links to source and test | Includes requirement ID |
8.1.7.2 User Story Format
For Agile projects:
As a [role], I want [feature], so that [benefit].
Acceptance Criteria:
- Given [context], when [action], then [result]
8.1.7.3 GPS Format for Clinical Contexts
Given [clinical context], the [health worker role] should [specific action] to [health outcome].
User Story:
As a cancer registrar, I want to search for existing cases before creating a new record, so that I avoid creating duplicate entries.
GPS Format:
Given a new pathology report, the registrar should search existing cases by patient identifiers before abstracting, to maintain data integrity and accurate incidence counts.
Acceptance Criteria:
- Given a patient name, when the registrar searches, then matching cases display within 3 seconds
- Given a patient with no existing cases, when the registrar searches, then a “No matches found” message displays with option to create new case
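Acceptance criteria written in Given/When/Then form translate almost line for line into automated checks. The sketch below runs both criteria against a toy in-memory search; the sample cases, function names, and response layout are hypothetical and stand in for the real CancerSurv search service.

```python
import time

# Hypothetical case list standing in for the registry database.
CASES = [
    {"case_id": "C-001", "patient_name": "Example, Alex"},
    {"case_id": "C-002", "patient_name": "Sample, Jordan"},
]


def search_cases(name: str, cases: list[dict]) -> dict:
    """Case-insensitive name search returning matches or a no-match message."""
    needle = name.strip().lower()
    matches = [c for c in cases if needle and needle in c["patient_name"].lower()]
    if matches:
        return {"matches": matches, "message": "", "actions": []}
    return {"matches": [], "message": "No matches found", "actions": ["Create new case"]}


def timed_search(name: str, cases: list[dict]) -> tuple[dict, float]:
    """Run the search and report elapsed seconds for the 3-second criterion."""
    start = time.perf_counter()
    response = search_cases(name, cases)
    return response, time.perf_counter() - start


# Given a patient name, when the registrar searches, then matches display in time.
found, elapsed = timed_search("example", CASES)
# Given a patient with no existing cases, then the no-match message is offered.
missing, _ = timed_search("Nobody", CASES)
```

An in-memory timing check like this only demonstrates the structure of the test; the 3-second criterion is meaningful only when measured against a realistically sized database.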
8.1.8 Deliverables from This Phase
| BA Deliverable | PH Deliverable | Purpose |
|---|---|---|
| Requirements Specification | Logic Model | Document what must be built |
| Data Dictionary | Case Definition / Data Standards | Specify data structures |
| Business Rules Catalog | Clinical Protocol | Define processing rules |
| Traceability Matrix | Evaluation Framework | Link requirements to objectives |
| Prioritized Backlog | Workplan | Order implementation work |
8.1.9 Moving Forward
With requirements analyzed, prioritized, and specified, the next phase focuses on Design: defining how the solution will be built to meet these requirements.