OpenMRS Security Test Implementation Guideline
1. Introduction
The OpenMRS O3 Security Testing Framework is an automated security testing system designed to continuously evaluate OpenMRS vulnerabilities using standardized CVSS 4.0 scoring. This framework enables both security researchers and OpenMRS contributors to write behavior-driven tests that simulate real-world attacks against the OpenMRS platform, automatically calculating vulnerability severity scores based on observed system behavior.
Traditional security testing often relies on manual penetration testing or one-time security audits, which can be time-consuming, inconsistent, and difficult to repeat as the codebase evolves. This framework addresses these limitations by providing automated, repeatable security tests that run continuously through GitHub Actions. Each test not only identifies whether a vulnerability exists but also quantifies its severity using the industry-standard Common Vulnerability Scoring System (CVSS) version 4.0, making security findings immediately actionable for developers.
Contributors can write tests for various attack scenarios including authentication attacks (brute force, credential stuffing), authorization bypass attempts, session management vulnerabilities, and injection attacks. The framework automatically tracks CVSS scores over time, allowing teams to see whether security is improving or degrading as new code is deployed. Results are visualized through an automatically generated dashboard at http://cvss-report.openmrs.org.
2. Getting Started
2.1 Prerequisites
Before you begin, ensure you have the following installed:
- Python 3.11 or higher
- Docker and Docker Compose
- Git
You should also have a basic understanding of security testing concepts and familiarity with command-line interfaces. No deep security expertise is required, but understanding common web vulnerabilities (such as brute force attacks, session hijacking, etc.) will help you write more effective tests.
2.2 Repository Structure
| Directory / File | Description |
|---|---|
| `tests/` | Test implementation files (`.py`) and their Gherkin feature files (`.feature`) |
| `scripts/generate_security_dashboard.py` | Dashboard generation script that processes test results and creates the HTML dashboard |
| `.github/workflows/` | CI/CD automation configuration that runs tests automatically on every commit |
| `requirements.txt` | All Python dependencies needed for the framework |
2.3 Installation and Setup
```bash
# Clone the repository
git clone https://github.com/openmrs/openmrs-contrib-cvss-scanning.git
cd openmrs-contrib-cvss-scanning

# Install Python dependencies
pip install -r requirements.txt --break-system-packages

# Install Playwright browsers
python -m playwright install chromium

# Start the OpenMRS instance
docker compose up -d

# Monitor startup
docker compose logs -f
```
Note: The `--break-system-packages` flag is required for system Python installations.
2.4 Verifying Your Setup
Run an existing test to verify your setup:
```bash
pytest tests/authentication/test_01_brute_force_password.py -v -s
```

The test should execute successfully and display a CVSS score (typically 5.5 for this test when defenses are working). Then generate the dashboard locally:
```bash
python scripts/generate_security_dashboard.py
```

Open the resulting `security_dashboard.html` file in your browser to see test results visualized.
3. Framework Architecture
3.1 Component Overview
| Component | Role |
|---|---|
| Playwright | Browser automation for UI-based tests, simulating real attacker interactions |
| pytest-bdd | Behavior-driven test structure using Given-When-Then format |
| requests | API-level testing via direct HTTP requests, bypassing the UI layer |
| CVSS 4.0 Calculator | Converts observed vulnerabilities into standardized severity scores using the MacroVector lookup table |
| SQLite Database | Stores all historical test results for baseline tracking and trend analysis |
| Dashboard Generator | Processes test results and creates an HTML visualization |
3.2 Test Execution Flow
1. Execute the attack scenario (e.g., multiple failed login attempts)
2. Observe system behavior (account lockout, response codes, lockout duration)
3. Determine dynamic CVSS parameters based on observations
4. Calculate the final CVSS score (0.0–10.0) via the MacroVector lookup function
5. Save results to SQLite — the first run sets the baseline; subsequent runs track changes
6. Generate an updated HTML dashboard reflecting the current security posture
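The observe-and-score steps above can be sketched as follows. The function name and the simplified parameter mapping are illustrative, not the framework's actual API; real tests feed the resulting values into `calculate_cvss_v4_score()`.

```python
# Sketch of the observe -> parametrize flow. The mapping rules here are
# illustrative assumptions, not the framework's real detection logic.

def detect_dynamic_parameters(responses: list[int], locked_out: bool) -> dict:
    """Map observed system behavior to dynamic CVSS 4.0 metric values."""
    if locked_out:
        # Defenses engaged: the attacker gained nothing.
        return {"VC": "N", "VI": "N", "VA": "N"}
    if any(code == 200 for code in responses):
        # A login attempt succeeded: full account compromise.
        return {"VC": "H", "VI": "H", "VA": "L"}
    # No lockout, no success: treat as a limited information leak.
    return {"VC": "L", "VI": "N", "VA": "N"}

# Example: ten failed attempts, then the account locks.
params = detect_dynamic_parameters([401] * 10, locked_out=True)
print(params)  # {'VC': 'N', 'VI': 'N', 'VA': 'N'}
```

Keeping this logic in a standalone function (rather than inline in a step definition) is what makes it reusable and independently testable, as Section 4 recommends.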
3.3 From Test to Dashboard
The complete pipeline runs automatically via GitHub Actions on every push to main:
1. Download the previous test results database from artifacts (if it exists)
2. Spin up an OpenMRS instance in Docker
3. Run all security tests and capture detailed output logs
4. Process results through the dashboard generator
5. Calculate improvements against baselines and build trend data
6. Deploy the HTML dashboard to GitHub Pages at http://cvss-report.openmrs.org
7. Upload the updated database as an artifact (90-day retention) for future trend tracking
4. Writing Your First Test
4.1 Established Concepts
Security tests consist of two complementary files:
- Feature file (`.feature`) — Human-readable test scenarios in Gherkin format (Given-When-Then). Readable by non-technical stakeholders.
- Test implementation file (`.py`) — Python code that executes the test, defines CVSS parameters, and implements step definitions.
Feature file structure:
```gherkin
Given <initial state>
When <action being tested>
Then <expected outcome>
```

Test implementation file structure:
- CVSS parameter definitions (with rationale comments)
- MacroVector lookup function for score calculation
- Dynamic parameter detection functions
- pytest-bdd step implementations (`@pytest_bdd.given`, `@pytest_bdd.when`, `@pytest_bdd.then`)
4.2 Detailed Conventions — Coming Soon
The detailed methodology for feature file patterns and test structure conventions is currently being refined by the research team. For now, refer to existing tests as templates.
Reference files:
- `tests/authentication/test_01_brute_force_password.py` — Complete front-end UI test with Playwright, CVSS documentation, and dynamic parameter detection
- `tests/authentication/test_02_brute_force_api.py` — Same attack scenario tested against the REST API layer
- `tests/authentication/o3_authentication_security.feature` — Proper Gherkin syntax and scenario structure
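As a template, a hypothetical scenario in the style of `o3_authentication_security.feature` might look like this (the wording is invented for illustration, not copied from the repository):

```gherkin
Feature: O3 Authentication Security
  Scenario: Brute force attack against the login form
    Given the OpenMRS login page is available
    When an attacker submits 10 consecutive invalid passwords for the same account
    Then the account should be temporarily locked
    And the CVSS score should reflect the observed lockout behavior
```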
Key principles:
- Focus on one specific attack scenario per test
- Document CVSS parameters clearly at the top of each implementation file
- Create separate functions for dynamic parameter detection (reusable and testable)
- Include comprehensive output logging throughout test execution
5. CVSS Scoring Guidelines
5.1 Established Concepts
This framework uses CVSS 4.0 Base scoring on a scale of 0.0–10.0:
| Score Range | Severity |
|---|---|
| 0.0 | None |
| 0.1 – 3.9 | Low |
| 4.0 – 6.9 | Medium |
| 7.0 – 8.9 | High |
| 9.0 – 10.0 | Critical |
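The severity bands above can be expressed as a small helper, sketched here in Python (the framework's own implementation may differ):

```python
def severity(score: float) -> str:
    """Map a CVSS 4.0 base score (0.0-10.0) to its severity band."""
    if score == 0.0:
        return "None"
    if score <= 3.9:
        return "Low"
    if score <= 6.9:
        return "Medium"
    if score <= 8.9:
        return "High"
    return "Critical"

print(severity(5.5))  # Medium (the typical score for the brute-force test)
```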
Each test uses eleven CVSS parameters split into two types:
- Static parameters — Fixed based on the attack scenario itself (e.g., Attack Vector = Network for all remote web attacks), regardless of whether defenses are active
- Dynamic parameters — Determined at runtime based on observed system behavior (e.g., Confidentiality Impact depends on whether the attacker was blocked)
The framework uses the MacroVector lookup table method from CVSS 4.0 (not the mathematical formula from CVSS 3.1). Parameters are grouped into five Equivalence Classes (EQs), converted to a five-number key, and looked up in a table of 108 pre-calibrated values established by FIRST.org. Score calculation is handled automatically by `calculate_cvss_v4_score()`.
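To illustrate the mechanism only, here is a toy version of that lookup. The keys and scores in this table are invented for illustration; the real equivalence-class rules and pre-calibrated values come from the FIRST.org specification.

```python
# Toy illustration of the MacroVector method: equivalence-class levels are
# joined into a string key and looked up in a pre-calibrated table.
# These entries are INVENTED -- the real table has 108 calibrated values.

TOY_MACROVECTOR_TABLE = {
    "00000": 10.0,   # worst-case level in every equivalence class
    "10100": 7.2,
    "21211": 2.4,    # mostly benign levels
}

def toy_lookup(eq_levels: list[int]) -> float:
    """Join EQ levels into a key and look up the pre-calibrated score."""
    key = "".join(str(level) for level in eq_levels)
    return TOY_MACROVECTOR_TABLE.get(key, 0.0)

print(toy_lookup([1, 0, 1, 0, 0]))  # 7.2
```

The advantage of the lookup approach is determinism: the same observed parameters always map to the same calibrated score, with no formula drift between implementations.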
5.2 Parameter Selection Methodology — Coming Soon
Comprehensive methodology including decision frameworks, validation criteria, and edge case guidance is currently being developed. For now, reference existing tests for parameter selection patterns.
Key considerations when designing CVSS scoring:
- Think from the attacker's perspective, not the defender's
- Base static parameters on attack scenario characteristics, not deployed defenses
- Identify parameters that could change based on observable system behavior — these are candidates for dynamic detection
- Document your reasoning for each parameter, not just the value chosen
- Remember that parameter selection has academic implications — rigor and justification matter
6. Running Tests
6.1 Running Tests Locally
```bash
# Run a single test
pytest tests/authentication/test_01_brute_force_password.py -v -s

# Run all tests in a directory
pytest tests/authentication/ -v -s

# Run the full test suite
pytest tests/ -v -s

# Run with full reporting
pytest tests/ -v -s --html=report.html --self-contained-html --json-report --json-report-file=report.json
```
6.2 Viewing Test Output
During execution, the terminal displays:
- Real-time attack logs (each login attempt, response codes, system behavior)
- CVSS score calculation section at the end of each test (score, severity, parameter values, vector string)

After tests complete, open `report.html` for a visual summary or `report.json` for structured data used by the dashboard generator.
6.3 Generating the Dashboard Locally
```bash
python scripts/generate_security_dashboard.py
```

This initializes the SQLite database (creating it on first run), parses test results from `report.json`, saves results and sets baselines, and generates `security_dashboard.html`.
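A rough sketch of the report-processing step, assuming the `report.json` layout produced by pytest-json-report (a `tests` array with per-test `outcome` fields); the real generator's internals may differ:

```python
import json

# Parse a minimal pytest-json-report style payload and summarize outcomes.
# The nodeids below are real test paths from this repository; the summary
# HTML is an illustrative fragment, not the actual dashboard markup.
report = json.loads("""
{"tests": [
  {"nodeid": "tests/authentication/test_01_brute_force_password.py::test_brute_force",
   "outcome": "passed"},
  {"nodeid": "tests/authentication/test_02_brute_force_api.py::test_api_brute_force",
   "outcome": "failed"}
]}
""")

passed = sum(1 for t in report["tests"] if t["outcome"] == "passed")
failed = sum(1 for t in report["tests"] if t["outcome"] == "failed")

html = f"<h1>Security Dashboard</h1><p>Passed: {passed}, Failed: {failed}</p>"
print(html)
```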
6.4 GitHub Actions Workflow
On every push to main, the workflow automatically:
1. Downloads the previous test results database from artifacts
2. Spins up OpenMRS via Docker Compose
3. Runs all tests, capturing output to `test_output.log`
4. Generates the HTML dashboard
5. Deploys to GitHub Pages at http://cvss-report.openmrs.org
6. Uploads the updated database as artifact `test-results-db` (90-day retention)
6.5 Debugging Test Issues
| Issue | Solution |
|---|---|
| Unexpected test failure | Run with `-v -s` and check the terminal output for the failing step |
| Incorrect CVSS scores | Verify dynamic parameters are set properly in detection function output |
| OpenMRS connection issues | Run `docker ps` to confirm the containers are up; check `docker compose logs -f` |
| Test timeouts | Increase wait times in test code or allocate more memory to Docker |
| CVSS shows 0.0 | Verify the score calculation function ran; check dynamic parameters aren't `None` |
| Database errors | Check write permissions on `test_results.db` |
| Dashboard won't generate | Verify `report.json` exists and contains valid JSON |
| GitHub Actions failure | Check the "Run Security Tests" step logs under the Actions tab |
7. Understanding Results
7.1 Dashboard Structure
The dashboard is organized into:
- Summary cards — Total Tests, Passed, Failed, Duration
- Test results table — Name, attack description, status, CVSS score, severity, improvement vs. baseline, trend sparkline, duration
Note: "Passed" and "Failed" reflect test execution status, not security status. A passing test means it ran successfully and calculated a CVSS score.
7.2 CVSS Severity Levels
| Score | Severity | Description |
|---|---|---|
| 9.0 – 10.0 | Critical | Full system access possible with minimal effort — address immediately |
| 7.0 – 8.9 | High | Serious concern with significant access or impact — prompt attention required |
| 4.0 – 6.9 | Medium | Notable issue, may have mitigating factors — should be addressed |
| 0.1 – 3.9 | Low | Minor concern with limited exploitability or impact |
| 0.0 | None | No vulnerability, or defenses completely prevent exploitation |
7.3 Baseline System
- When a test runs for the first time, its CVSS score is saved as the baseline in SQLite
- The baseline remains constant unless manually reset
- All subsequent runs compare against this baseline to calculate improvement metrics
- The dashboard displays both the current score and the original baseline score
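The baseline mechanism can be sketched with an in-memory SQLite database. The table and column names below are assumptions for illustration, not the real `test_results.db` schema:

```python
import sqlite3

# Illustrative schema: the real test_results.db layout is not documented here.
conn = sqlite3.connect(":memory:")
conn.execute("""CREATE TABLE results
                (test_name TEXT, cvss REAL, is_baseline INTEGER)""")

def record_run(test_name: str, score: float) -> float:
    """Store a run; the first run becomes the baseline. Returns improvement."""
    row = conn.execute(
        "SELECT cvss FROM results WHERE test_name = ? AND is_baseline = 1",
        (test_name,)).fetchone()
    if row is None:
        conn.execute("INSERT INTO results VALUES (?, ?, 1)", (test_name, score))
        return 0.0                       # first run: nothing to compare against
    conn.execute("INSERT INTO results VALUES (?, ?, 0)", (test_name, score))
    return row[0] - score                # baseline - current = improvement

baseline_delta = record_run("brute_force_password", 5.5)  # sets the baseline
improvement = record_run("brute_force_password", 4.5)     # later run, lower score
print(improvement)  # 1.0 -> security improved
```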
7.4 Improvement Metrics
The Improvement column shows: baseline score − current score
| Display | Meaning |
|---|---|
| Positive value | Security improved — CVSS score decreased |
| Negative value | Security degraded — CVSS score increased |
| 0.0 | No change from baseline |
Remember: Lower CVSS = less vulnerable = positive improvement.
7.5 Mouseover Tooltip History
Hovering over any improvement value shows a tooltip with the last 10 test runs, including:
- Run number (current run marked with ◀)
- CVSS score for that run
- Delta from baseline (color-coded)
7.6 Trend Sparkline Visualization
The Trend column displays a small line chart of CVSS scores over the last 20 runs:
| Sparkline Shape | Meaning |
|---|---|
| Flat horizontal line | Consistent security — score unchanged |
| Downward slope | Improving security — scores decreasing |
| Upward slope | Degrading security — scores increasing |
| Volatile (up and down) | Inconsistent security or flaky test behavior |
Displays "Not enough data" until at least two runs exist.
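One way such a trend direction could be derived from recent scores (a sketch; the dashboard's actual sparkline logic is not specified here):

```python
# Classify a window of CVSS scores into a trend direction. The tolerance
# threshold is an illustrative assumption, not a documented framework value.

def classify_trend(scores: list[float], tolerance: float = 0.1) -> str:
    if len(scores) < 2:
        return "Not enough data"
    delta = scores[-1] - scores[0]
    if abs(delta) <= tolerance:
        return "flat"
    # Rising CVSS means more vulnerable, so an upward slope is degrading.
    return "degrading" if delta > 0 else "improving"

print(classify_trend([5.5, 5.5, 4.8, 4.2]))  # improving (scores decreasing)
print(classify_trend([5.5]))                 # Not enough data
```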
8. Contributing Tests
8.1 Before You Start
- Review `/tests/authentication/` and the feature file to avoid duplicating existing work
- Ensure your local OpenMRS instance is running
- Familiarize yourself with the CVSS 4.0 specification if needed
- Decide whether your scenario targets the frontend UI, REST API, or both
8.2 Development Workflow
1. Create a feature branch, e.g. `add-session-hijacking-test` or `test-sql-injection`
2. Write the feature file scenario first (Given-When-Then)
3. Implement the test file with full CVSS parameter documentation
4. Run locally multiple times to verify consistent results
5. Verify the CVSS score is appropriate for the scenario and observed defenses
6. Document parameter rationale thoroughly in comments
8.3 Submitting a Pull Request
Your PR description should include:
- What vulnerability or attack scenario the test evaluates
- Expected behavior when defenses are working
- Example output including the CVSS score
- Rationale for CVSS parameter choices (especially dynamic ones)
- References to related OpenMRS security issues or documentation
- At least one complete test run output showing attack progression and final score
8.4 Review Process
Reviewers will assess:
- Reliability — Test executes consistently and produces repeatable results
- CVSS justification — Parameters are appropriate and well-documented
- Dynamic detection logic — Correctly interprets system behavior
- CI/CD compatibility — Test passes in GitHub Actions, not just locally
- Code quality — Readability, documentation, and adherence to existing patterns
- Security accuracy — Test reveals a real vulnerability with an appropriately scored CVSS
8.5 Testing Checklist
Before submitting your PR, verify the following:
- [ ] Feature file follows Given-When-Then structure with clear, readable scenarios
- [ ] CVSS parameters are documented at the top of the test file with detailed rationale
- [ ] Test produces consistent results across multiple runs
- [ ] Dynamic parameters correctly detect system behavior (test with and without defenses if possible)
- [ ] Comprehensive output logging is included throughout test execution
- [ ] Test passes both locally and in GitHub Actions CI/CD
- [ ] Code is clear and documented — another developer should understand what and why
- [ ] Git commits are clean with meaningful commit messages
- [ ] All necessary files are included in the branch
9. Troubleshooting
9.1 Common Issues and Solutions
OpenMRS fails to start
- Run `docker ps` to verify Docker is running
- Check for port conflicts on ports 80 and 3306
- Run `docker compose down` then `docker compose up -d`
- Monitor with `docker compose logs -f`
Tests consistently time out
- Increase wait times in test code
- Allocate more memory to Docker
CVSS scores show as 0.0
- Verify the score calculation function executed by checking test output logs
- Confirm dynamic parameters were set to valid values (not `None`)
Database errors
- Check that `test_results.db` has write permissions
- Delete `test_results.db` to start fresh if corrupted
Dashboard fails to generate
- Ensure `report.json` exists and contains valid JSON
- Run `pip install -r requirements.txt --break-system-packages`
- Check dashboard generator output for specific error messages
GitHub Actions workflow fails
Check the "Run Security Tests" and "Extract CVSS Scores and Generate Dashboard" steps under the Actions tab
9.2 Where to Get Help
| Resource | Use For |
|---|---|
| GitHub Issues | Framework bugs, feature requests |
| OpenMRS Talk | Questions about OpenMRS behavior and features |