OpenMRS Security Test Implementation Guideline

1. Introduction

The OpenMRS O3 Security Testing Framework is an automated security testing system designed to continuously evaluate OpenMRS vulnerabilities using standardized CVSS 4.0 scoring. This framework enables both security researchers and OpenMRS contributors to write behavior-driven tests that simulate real-world attacks against the OpenMRS platform, automatically calculating vulnerability severity scores based on observed system behavior.

Traditional security testing often relies on manual penetration testing or one-time security audits, which can be time-consuming, inconsistent, and difficult to repeat as the codebase evolves. This framework addresses these limitations by providing automated, repeatable security tests that run continuously through GitHub Actions. Each test not only identifies whether a vulnerability exists but also quantifies its severity using the industry-standard Common Vulnerability Scoring System (CVSS) version 4.0, making security findings immediately actionable for developers.

Contributors can write tests for various attack scenarios including authentication attacks (brute force, credential stuffing), authorization bypass attempts, session management vulnerabilities, and injection attacks. The framework automatically tracks CVSS scores over time, allowing teams to see whether security is improving or degrading as new code is deployed. Results are visualized through an automatically generated dashboard at http://cvss-report.openmrs.org.


2. Getting Started

2.1 Prerequisites

Before you begin, ensure you have the following installed:

  • Python 3.11 or higher

  • Docker and Docker Compose

  • Git

You should also have a basic understanding of security testing concepts and familiarity with command-line interfaces. No deep security expertise is required, but understanding common web vulnerabilities (such as brute force attacks and session hijacking) will help you write more effective tests.

2.2 Repository Structure

Directory / File          Description
/tests/authentication/    Test implementation files (.py) and feature files (.feature)
/scripts/                 Dashboard generation script that processes test results and creates the HTML dashboard
.github/workflows/        CI/CD automation configuration that runs tests automatically on every commit
requirements.txt          All Python dependencies needed for the framework

2.3 Installation and Setup

# Clone the repository
git clone https://github.com/openmrs/openmrs-contrib-cvss-scanning.git
cd openmrs-contrib-cvss-scanning

# Install Python dependencies
pip install -r requirements.txt --break-system-packages

# Install Playwright browsers
python -m playwright install chromium

# Start the OpenMRS instance
docker compose up -d

# Monitor startup
docker compose logs -f

Note: The --break-system-packages flag is required for system Python installations.

2.4 Verifying Your Setup

Run an existing test to verify your setup:

pytest tests/authentication/test_01_brute_force_password.py -v -s

The test should execute successfully and display a CVSS score (typically 5.5 for this test when defenses are working). Then generate the dashboard locally:

python scripts/generate_security_dashboard.py

Open the resulting security_dashboard.html file in your browser to see test results visualized.


3. Framework Architecture

3.1 Component Overview

Component            Role
Playwright           Browser automation for UI-based tests, simulating real attacker interactions
pytest-bdd           Behavior-driven test structure using Given-When-Then format
requests             API-level testing via direct HTTP requests, bypassing the UI layer
CVSS 4.0 Calculator  Converts observed vulnerabilities into standardized severity scores using the MacroVector lookup table
SQLite Database      Stores all historical test results for baseline tracking and trend analysis
Dashboard Generator  Processes test results and creates an HTML visualization

3.2 Test Execution Flow

  1. Execute the attack scenario (e.g., multiple failed login attempts)

  2. Observe system behavior (account lockout, response codes, lockout duration)

  3. Determine dynamic CVSS parameters based on observations

  4. Calculate the final CVSS score (0.0–10.0) via the MacroVector lookup function

  5. Save results to SQLite — the first run sets the baseline; subsequent runs track changes

  6. Generate an updated HTML dashboard reflecting the current security posture
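The steps above can be sketched in Python as follows. This is an illustrative sketch only: the function names, parameter dictionaries, and values are assumptions for illustration, not the framework's actual API.

```python
# Illustrative sketch of the six-step flow; the function names and
# parameter values here are hypothetical, not the framework's actual API.

def detect_impact_parameters(lockout_observed: bool) -> dict:
    """Step 3: map observed system behavior to dynamic CVSS parameters."""
    if lockout_observed:
        # Attacker was blocked, so no confidentiality or integrity impact
        return {"VC": "N", "VI": "N", "VA": "N"}
    # Attacker got through: full impact on the vulnerable system
    return {"VC": "H", "VI": "H", "VA": "H"}

def run_brute_force_scenario(lockout_observed: bool) -> dict:
    """Steps 1-2 (attack and observation) are stubbed out here; a real test
    would drive the login form via Playwright or the REST API via requests."""
    static = {"AV": "N", "AC": "L", "PR": "N", "UI": "N"}  # fixed per scenario
    dynamic = detect_impact_parameters(lockout_observed)   # runtime-determined
    return {**static, **dynamic}
```

The resulting parameter set would then feed steps 4 through 6: score lookup, SQLite persistence, and dashboard generation.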

3.3 From Test to Dashboard

The complete pipeline runs automatically via GitHub Actions on every push to main:

  1. Download previous test results database from artifacts (if it exists)

  2. Spin up OpenMRS instance in Docker

  3. Run all security tests and capture detailed output logs

  4. Process results through the dashboard generator

  5. Calculate improvements against baselines and build trend data

  6. Deploy the HTML dashboard to GitHub Pages at http://cvss-report.openmrs.org

  7. Upload the updated database as an artifact (90-day retention) for future trend tracking


4. Writing Your First Test

4.1 Established Concepts

Security tests consist of two complementary files:

  • Feature file (.feature) — Human-readable test scenarios in Gherkin format (Given-When-Then). Readable by non-technical stakeholders.

  • Test implementation file (.py) — Python code that executes the test, defines CVSS parameters, and implements step definitions.

Feature file structure:

Given <initial state>
When <action being tested>
Then <expected outcome>
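For illustration, a scenario in this shape might look like the following. The feature name and steps are invented examples, not taken from the repository's actual feature files.

```gherkin
Feature: O3 authentication security
  Scenario: Brute force attack against the login form
    Given the OpenMRS login page is available
    When an attacker submits ten consecutive invalid passwords
    Then the account should be locked out
```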

Test implementation file structure:

  1. CVSS parameter definitions (with rationale comments)

  2. MacroVector lookup function for score calculation

  3. Dynamic parameter detection functions

  4. pytest-bdd step implementations (@pytest_bdd.given, @pytest_bdd.when, @pytest_bdd.then)

4.2 Detailed Conventions — Coming Soon

The detailed methodology for feature file patterns and test structure conventions is currently being refined by the research team. For now, refer to existing tests as templates.

Reference files:

  • tests/authentication/test_01_brute_force_password.py — Complete front-end UI test with Playwright, CVSS documentation, and dynamic parameter detection

  • tests/authentication/test_02_brute_force_api.py — Same attack scenario tested against the REST API layer

  • tests/authentication/o3_authentication_security.feature — Proper Gherkin syntax and scenario structure

Key principles:

  • Focus on one specific attack scenario per test

  • Document CVSS parameters clearly at the top of each implementation file

  • Create separate functions for dynamic parameter detection (reusable and testable)

  • Include comprehensive output logging throughout test execution


5. CVSS Scoring Guidelines

5.1 Established Concepts

This framework uses CVSS 4.0 Base scoring on a scale of 0.0–10.0:

Score Range   Severity
0.0           None
0.1 – 3.9     Low
4.0 – 6.9     Medium
7.0 – 8.9     High
9.0 – 10.0    Critical

Each test uses eleven CVSS parameters split into two types:

  • Static parameters — Fixed based on the attack scenario itself (e.g., Attack Vector = Network for all remote web attacks), regardless of whether defenses are active

  • Dynamic parameters — Determined at runtime based on observed system behavior (e.g., Confidentiality Impact depends on whether the attacker was blocked)

The framework uses the MacroVector lookup table method from CVSS 4.0 (not the mathematical formula from CVSS 3.1). Parameters are grouped into five Equivalence Classes (EQs), converted to a five-number key, and looked up in a table of 108 pre-calibrated values established by FIRST.org. Score calculation is handled automatically by calculate_cvss_v4_score().
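As a toy illustration of the lookup idea, the sketch below joins EQ levels into a key and looks it up in a table. The keys and scores are invented placeholders, not entries from the real FIRST.org table, and the function name is hypothetical (the framework's actual entry point is calculate_cvss_v4_score()).

```python
# Toy illustration of the MacroVector lookup idea; the keys and scores
# below are invented placeholders, not entries from the real table.
MACRO_VECTOR_SCORES = {
    "00020": 9.3,  # hypothetical entry: easily exploitable, high impact
    "10021": 5.5,  # hypothetical entry: partially mitigated
}

def lookup_macro_vector_score(eq_levels):
    """Join the five EQ levels into a key and look up the base score."""
    key = "".join(str(level) for level in eq_levels)
    return MACRO_VECTOR_SCORES.get(key, 0.0)
```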

5.2 Parameter Selection Methodology — Coming Soon

Comprehensive methodology including decision frameworks, validation criteria, and edge case guidance is currently being developed. For now, reference existing tests for parameter selection patterns.

Key considerations when designing CVSS scoring:

  • Think from the attacker's perspective, not the defender's

  • Base static parameters on attack scenario characteristics, not deployed defenses

  • Identify parameters that could change based on observable system behavior — these are candidates for dynamic detection

  • Document your reasoning for each parameter, not just the value chosen

  • Remember that parameter selection has academic implications — rigor and justification matter


6. Running Tests

6.1 Running Tests Locally

# Run a single test
pytest tests/authentication/test_01_brute_force_password.py -v -s

# Run all tests in a directory
pytest tests/authentication/ -v -s

# Run the full test suite
pytest tests/ -v -s

# Run with full reporting
pytest tests/ -v -s --html=report.html --self-contained-html --json-report --json-report-file=report.json

6.2 Viewing Test Output

During execution, the terminal displays:

  • Real-time attack logs (each login attempt, response codes, system behavior)

  • CVSS score calculation section at the end of each test (score, severity, parameter values, vector string)

After tests complete, open report.html for a visual summary or report.json for structured data used by the dashboard generator.

6.3 Generating the Dashboard Locally

python scripts/generate_security_dashboard.py

This will open the SQLite database (creating it on first run), parse test results from report.json, save results and set baselines, and generate security_dashboard.html.

6.4 GitHub Actions Workflow

On every push to main, the workflow automatically:

  1. Downloads the previous test results database from artifacts

  2. Spins up OpenMRS via Docker Compose

  3. Runs all tests, capturing output to test_output.log

  4. Generates the HTML dashboard

  5. Deploys to GitHub Pages at http://cvss-report.openmrs.org

  6. Uploads the updated database as artifact test-results-db (90-day retention)

6.5 Debugging Test Issues

Issue                      Solution
Unexpected test failure    Check terminal output with -v -s flags; review test_output.log
Incorrect CVSS scores      Verify dynamic parameters are set properly in detection function output
OpenMRS connection issues  Run docker compose ps and docker compose logs
Test timeouts              Increase wait times in test code or allocate more memory to Docker
CVSS shows 0.0             Verify score calculation function ran; check dynamic parameters aren't None
Database errors            Check write permissions on test_results.db or delete it to start fresh
Dashboard won't generate   Verify report.json exists and is valid; check all dependencies are installed
GitHub Actions failure     Check the "Run Security Tests" step logs under the Actions tab


7. Understanding Results

7.1 Dashboard Structure

The dashboard is organized into:

  • Summary cards — Total Tests, Passed, Failed, Duration

  • Test results table — Name, attack description, status, CVSS score, severity, improvement vs. baseline, trend sparkline, duration

Note: "Passed" and "Failed" reflect test execution status, not security status. A passing test means it ran successfully and calculated a CVSS score.

7.2 CVSS Severity Levels

Score        Severity   Description
9.0 – 10.0   Critical   Full system access possible with minimal effort — address immediately
7.0 – 8.9    High       Serious concern with significant access or impact — prompt attention required
4.0 – 6.9    Medium     Notable issue, may have mitigating factors — should be addressed
0.1 – 3.9    Low        Minor concern with limited exploitability or impact
0.0          None       No vulnerability, or defenses completely prevent exploitation

7.3 Baseline System

  • When a test runs for the first time, its CVSS score is saved as the baseline in SQLite

  • The baseline remains constant unless manually reset

  • All subsequent runs compare against this baseline to calculate improvement metrics

  • The dashboard displays both the current score and the original baseline score
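A minimal sketch of this baseline logic, assuming a hypothetical single-table schema (the framework's actual test_results.db layout may differ):

```python
import sqlite3

# Hypothetical schema for illustration; the framework's real
# test_results.db layout may differ.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE runs (test TEXT, seq INTEGER, score REAL)")

def record_run(test: str, score: float) -> float:
    """Save a run; the first score for a test becomes its baseline.
    Returns improvement = baseline - current (positive means less vulnerable)."""
    seq = conn.execute(
        "SELECT COUNT(*) FROM runs WHERE test = ?", (test,)
    ).fetchone()[0]
    conn.execute("INSERT INTO runs VALUES (?, ?, ?)", (test, seq, score))
    baseline = conn.execute(
        "SELECT score FROM runs WHERE test = ? AND seq = 0", (test,)
    ).fetchone()[0]
    return round(baseline - score, 1)
```

For example, a first run scoring 5.5 sets the baseline and reports 0.0 improvement; a later run scoring 3.0 reports +2.5.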

7.4 Improvement Metrics

The Improvement column shows: baseline score − current score

Display          Meaning
+2.5 ↑ (green)   Security improved — CVSS score decreased
-2.5 ↓ (red)     Security degraded — CVSS score increased
0.0 — (grey)     No change from baseline

Remember: Lower CVSS = less vulnerable = positive improvement.

7.5 Mouseover Tooltip History

Hovering over any improvement value shows a tooltip with the last 10 test runs, including:

  • Run number (the current run is marked)

  • CVSS score for that run

  • Delta from baseline (color-coded)

7.6 Trend Sparkline Visualization

The Trend column displays a small line chart of CVSS scores over the last 20 runs:

Sparkline Shape          Meaning
Flat horizontal line     Consistent security — score unchanged
Downward slope           Improving security — scores decreasing
Upward slope             Degrading security — scores increasing
Volatile (up and down)   Inconsistent security or flaky test behavior

Displays "Not enough data" until at least two runs exist.


8. Contributing Tests

8.1 Before You Start

  • Review /tests/authentication/ and the feature file to avoid duplicating existing work

  • Ensure your local OpenMRS instance is running

  • Familiarize yourself with the CVSS 4.0 specification if needed

  • Decide whether your scenario targets the frontend UI, REST API, or both

8.2 Development Workflow

  1. Create a feature branch: add-session-hijacking-test or test-sql-injection

  2. Write the feature file scenario first (Given-When-Then)

  3. Implement the test file with full CVSS parameter documentation

  4. Run locally multiple times to verify consistent results

  5. Verify the CVSS score is appropriate for the scenario and observed defenses

  6. Document parameter rationale thoroughly in comments

8.3 Submitting a Pull Request

Your PR description should include:

  • What vulnerability or attack scenario the test evaluates

  • Expected behavior when defenses are working

  • Example output including the CVSS score

  • Rationale for CVSS parameter choices (especially dynamic ones)

  • References to related OpenMRS security issues or documentation

  • At least one complete test run output showing attack progression and final score

8.4 Review Process

Reviewers will assess:

  • Reliability — Test executes consistently and produces repeatable results

  • CVSS justification — Parameters are appropriate and well-documented

  • Dynamic detection logic — Correctly interprets system behavior

  • CI/CD compatibility — Test passes in GitHub Actions, not just locally

  • Code quality — Readability, documentation, and adherence to existing patterns

  • Security accuracy — Test reveals a real vulnerability with an appropriately scored CVSS

8.5 Testing Checklist

Before submitting your PR, verify the following:

  • [ ] Feature file follows Given-When-Then structure with clear, readable scenarios

  • [ ] CVSS parameters are documented at the top of the test file with detailed rationale

  • [ ] Test produces consistent results across multiple runs

  • [ ] Dynamic parameters correctly detect system behavior (test with and without defenses if possible)

  • [ ] Comprehensive output logging is included throughout test execution

  • [ ] Test passes both locally and in GitHub Actions CI/CD

  • [ ] Code is clear and documented — another developer should understand what and why

  • [ ] Git commits are clean with meaningful commit messages

  • [ ] All necessary files are included in the branch


9. Troubleshooting

9.1 Common Issues and Solutions

OpenMRS fails to start

  • Run docker ps to verify Docker is running

  • Check for port conflicts on ports 80 and 3306

  • Run docker compose down then docker compose up -d

  • Monitor with docker compose logs -f

Tests consistently time out

  • Increase wait times in test code

  • Allocate more memory to Docker

CVSS scores show as 0.0

  • Verify the score calculation function executed by checking test output logs

  • Confirm dynamic parameters were set to valid values (not None)

Database errors

  • Check that test_results.db has write permissions

  • Delete test_results.db to start fresh if corrupted

Dashboard fails to generate

  • Ensure report.json exists and contains valid JSON

  • Run pip install -r requirements.txt --break-system-packages

  • Check dashboard generator output for specific error messages

GitHub Actions workflow fails

  • Check the "Run Security Tests" and "Extract CVSS Scores and Generate Dashboard" steps under the Actions tab

9.2 Where to Get Help

Resource       Use For
GitHub Issues  Framework bugs, feature requests
OpenMRS Talk   Questions about OpenMRS behavior and features