Platform Team Meeting Notes 2024
2024-10-09
Discuss DB clustering with Michaël Bontyes (joining for the 2nd half of the call)
2024-07-24
Performance improvements
Added visit parameter to fetching of orders and observations (these are only in Platform 2.7)
Adding features to RESTWS module
If we make changes to the Platform (e.g., in the latest version), how do we want to introduce these features to O3?
Do we set up a “special server” to test out changes? Is it something temporary or do we want something more permanent?
Using dev3 also means we could, in theory, leverage Jayasanka’s load testing infrastructure
OpenMRS 3.1 release planning – are there any modules that should be bumped?
Will plan on pulling in latest released version of each module as part of preparing release
2024-07-10
QA Refapp is showing liquibase errors
Audit logging
Tried creating an audit log using Hibernate interceptors; however, this logs only changes to data (inserts, edits, deletes), not reads.
We recognize there is a need for access logging (i.e., auditing who is looking at whose data). Burke gave simple example questions such as (1) who has accessed this patient’s record over this time period? and (2) which patients' records has this user accessed over this time period? It’s likely implementers will have more requirements for their local security needs (e.g., being able to generate a report that summarizes access or identifies red flags).
@Org Administrator will start a discussion on gathering requirements on access logging
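The two example questions above can be sketched as queries over a list of read events. This is only an illustration for the requirements discussion: the AccessEvent shape, the function names, and the sample data are all hypothetical and do not correspond to any existing OpenMRS API.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass(frozen=True)
class AccessEvent:
    """One read of a patient record (hypothetical shape, not an OpenMRS type)."""
    user: str
    patient: str
    at: datetime


def users_who_accessed(log, patient, start, end):
    """Question 1: who accessed this patient's record between start and end?"""
    return sorted({e.user for e in log if e.patient == patient and start <= e.at <= end})


def patients_accessed_by(log, user, start, end):
    """Question 2: which patients' records did this user access between start and end?"""
    return sorted({e.patient for e in log if e.user == user and start <= e.at <= end})


# Made-up sample data for illustration only.
log = [
    AccessEvent("alice", "patient-1", datetime(2024, 7, 1, 9, 0)),
    AccessEvent("bob", "patient-1", datetime(2024, 7, 2, 14, 30)),
    AccessEvent("alice", "patient-2", datetime(2024, 7, 3, 8, 15)),
    AccessEvent("bob", "patient-2", datetime(2024, 8, 1, 10, 0)),
]
```

A real implementation would presumably need to capture reads at the API or service layer, since the interceptor approach described above only sees writes.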
@Njidda Salifu pointed out a couple of issues
Docker nginx configuration limiting the size of data uploaded (proposed fix)
SDK does not offer options for location when logging in, because there are no locations defined in the database
@Manoj Rathnapriya shared some errors when getting Flag resource (http://localhost:8080/openmrs/ws/fhir2/R4/Flag/d073f2af-779d-4160-b2a4-3aa87486cf6b)
@Ian Bacher said it looks like something isn’t getting registered properly. It looks like the FHIR2 module was not listed as a dependency in a new maven submodule in the pom.xml. Adding a dependency on the patient-flags-fhir module should fix the problem.
2024-06-26
Performance
Visit Summary
@Org Administrator investigated whether diagnoses can be loaded for all encounters within a visit without loading all the other encounter data from the database and found this doesn’t exist; however, realized that, instead of creating a new visitsummary resource to get around this, a simpler solution could be to add encounter diagnoses as a property of the visit, i.e., asking for visit diagnoses would efficiently return all diagnoses across all encounters in the visit without having to fetch every encounter.
@Org Administrator to link ticket for this work into these notes
Startup performance
Startup currently takes about 10 minutes. A lot of it appears to be a “ridiculous” number of setup steps for the first setup while only warnings for teleconsultation setup are being reported. Need to investigate cause.
2024-06-19
Performance
Investigating approach to improving performance of the Visits views for O3 (Visit Summary and All Encounters tabs)
TODO: Need a ticket for this work (@Org Administrator will make a ticket for this)
Working on cybersecurity issues from recent penetration testing
Working on XStream whitelisting
TRUNK-6188: Upgrade xstream and migrate to whitelisting (status: Code Review (Initial))
Will need to create releases for affected modules prior to repeat testing.
Reviewed progress for 2024 for report to funders
2024-06-12
Performance issues
Palladium Kenya is currently deploying recent changes that Ian made
Some calls for obs have sorting at the API level that the REST API does not use
Does not have natural sorting, so order of results may vary
Some calls don’t have sorting capability at the API level (e.g., orders)
Some calls don’t have sorting at the API level, but do include some natural sorting (e.g., visits)
O3-3386: Add sorting to a few fields when loading paged data (status: To Do)
Search term
Observations within a specific concept or concept set
Currently, in O3 Visit Summary tab, we are making a call like:
https://o3.openmrs.org/openmrs/ws/rest/v1/visit?patient=64d1f948-2535-4788-b585-8d338b4ca0de&v=custom:(uuid,encounters:(uuid,diagnoses:(uuid,display,rank,diagnosis),form:(uuid,display),encounterDatetime,orders:full,obs:full,encounterType:(uuid,display,viewPrivilege,editPrivilege),encounterProviders:(uuid,display,encounterRole:(uuid,display),provider:(uuid,person:(uuid,display)))),visitType:(uuid,name,display),startDatetime,stopDatetime,patient,attributes:(attributeType:ref,display,uuid,value)&limit=5
This is a complicated query that covers a lot of data that is not initially rendered. Ideally, we would have a custom endpoint where we could get only the data needed for the initial view, with references to the data needed for expansion when needed.
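To make the cost of that call easier to reason about, here is a minimal sketch of building a much smaller custom representation programmatically. The helper names and the SUMMARY_FIELDS selection are assumptions for illustration; which properties the initial view actually needs was still an open question in this discussion.

```python
from urllib.parse import urlencode


def render_rep(fields):
    """Render a field list into the body of a v=custom:(...) representation.

    Each field is either a property name (str) or a (name, subfields) tuple
    describing a nested representation.
    """
    parts = []
    for field in fields:
        if isinstance(field, str):
            parts.append(field)
        else:
            name, subfields = field
            parts.append(f"{name}:({render_rep(subfields)})")
    return ",".join(parts)


def visit_url(base, patient_uuid, fields, limit):
    """Build a visit query URL, leaving the representation characters unescaped."""
    query = urlencode(
        {"patient": patient_uuid, "v": f"custom:({render_rep(fields)})", "limit": limit},
        safe=":(),",
    )
    return f"{base}/ws/rest/v1/visit?{query}"


# Hypothetical slimmed-down selection: no obs:full or orders:full up front.
SUMMARY_FIELDS = [
    "uuid",
    "startDatetime",
    "stopDatetime",
    ("visitType", ["uuid", "display"]),
    ("encounters", ["uuid", "encounterDatetime", ("encounterType", ["uuid", "display"])]),
]

url = visit_url(
    "https://o3.openmrs.org/openmrs",
    "64d1f948-2535-4788-b585-8d338b4ca0de",
    SUMMARY_FIELDS,
    5,
)
```

Observations and orders for an encounter would then be fetched only when the user expands that encounter.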
Approach? Create a new view or new resource?
One option would be to define a new view for this specific need, e.g., fetch visits with v=summary (or something similar). It would be nice to follow any existing convention for this rather than to create a one-off solution.
The other option, following the obstree approach, would be to create a separate visitsummary resource to meet this need.
Currently, we feel that following the existing obstree convention of creating a new resource to meet the specific application/business need is cleaner than creating a one-off custom view. So, we plan to create visitsummary.
The new visitsummary resource would return a paged array of visits similar to the custom call mentioned above; however, we would want to avoid preloading all data for all encounters, observations, etc. Figuring out what level of data is needed and what calls can be used to complete the views in a performant way in O3 will take some additional discovery & discussion. For example, considering the image below, we would want all the data needed for the green parts of the view along with, perhaps, links to the API endpoints to get data to fill in the blue part of the view:
2024-06-05
@Org Administrator still working to find time with @Antony Ojwang to identify slow requests
Performance issues
Most calls in O3 are using FHIR (only a few use OpenMRS custom REST API), which already supports filtering and sorting.
Very few (if any) take advantage of sorting.
Most use a large n to load 100+ results
Will focus on adding missing support (e.g., for sorting) on existing endpoints that are being used by the frontend. If there is something that can be just as easily migrated to a FHIR endpoint, that would be preferable; however, in the many cases where supporting sorting in the OpenMRS REST API is a trivial fix, we will do that.
We should focus on supporting as many sorting parameters as we can, but not support sorting on every property of every resource (would require too many indexes).
FHIR specs talk about unsupported sorting parameters, suggesting that unknown or unsupported parameters be ignored by default unless the client specifies handling=strict
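The lenient-versus-strict behaviour described above could look roughly like the sketch below. The function name, the exception type, and the SUPPORTED set are hypothetical; the leading "-" for descending order follows the convention of FHIR's _sort parameter.

```python
class UnsupportedSortError(ValueError):
    """Raised in strict mode when a client asks for an unsupported sort field."""


def resolve_sort(requested, supported, handling="lenient"):
    """Turn a comma-separated sort string into (field, descending) pairs.

    Unsupported fields are silently dropped in lenient mode and rejected
    in strict mode.
    """
    resolved = []
    for token in filter(None, (t.strip() for t in requested.split(","))):
        descending = token.startswith("-")
        field = token[1:] if descending else token
        if field in supported:
            resolved.append((field, descending))
        elif handling == "strict":
            raise UnsupportedSortError(f"unsupported sort parameter: {field}")
    return resolved


# Hypothetical: only fields backed by an index are sortable.
SUPPORTED = {"date", "code"}
```

Keeping the supported set small and explicit matches the point above about not sorting on every property of every resource.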
We’ve noted some places in the frontend where resources are called multiple times (e.g., patient resource or session endpoint)
Per @Ian Bacher, the plan is to introduce some custom caching (for 100s of milliseconds) to avoid repeatedly requesting the same resource multiple times in a single operation
PR for the SPA module sets cache headers to reduce unnecessary downloading of code
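A minimal sketch of that kind of very short-lived cache follows. It is written in Python for brevity rather than in the frontend's own language, and the class name, the 300 ms default, and the injectable clock are assumptions, not the planned implementation.

```python
import time


class ShortLivedCache:
    """Reuse a fetched value for a few hundred milliseconds, then refetch."""

    def __init__(self, ttl_seconds=0.3, clock=time.monotonic):
        self._ttl = ttl_seconds
        self._clock = clock  # injectable so the behaviour can be tested without sleeping
        self._entries = {}   # key -> (expires_at, value)

    def get(self, key, fetch):
        """Return the cached value for key, calling fetch() only if it is missing or expired."""
        now = self._clock()
        entry = self._entries.get(key)
        if entry is not None and entry[0] > now:
            return entry[1]
        value = fetch()
        self._entries[key] = (now + self._ttl, value)
        return value
```

With a window this short, several components asking for the session or patient resource during one operation share a single request, while later operations still see fresh data.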
Still need an endpoint to receive errors/exceptions from clients
Would ideally be able to capture catastrophic errors on the client (things that break the Spa module) and report these errors (post the exceptions) to the server
Use case: people in the field not uncommonly find OpenMRS 3 gets into an unusable state and have learned to solve the problem by clearing their cache. While this may get them functional again, it requires their client to re-download all the same code again, slowing down the client and wasting bandwidth. If critical exceptions occurring in the client were logged on the server, it would increase the chance of identifying & fixing the causes so the number of times the client gets into an unusable state heads toward zero.
Ideally, the server could tell the client which log level to report (e.g., ERROR, WARNING, INFO)
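The reporting idea above, including the server choosing the log level, might be sketched as follows. Everything here is hypothetical: the class, the payload fields, and the numeric level ordering are illustrative, and no such endpoint existed at the time of these notes.

```python
import json

# Illustrative severity ordering for the levels mentioned above.
LEVELS = {"INFO": 20, "WARNING": 30, "ERROR": 40}


class ClientErrorReporter:
    """Forward client-side events to the server when they meet the requested level."""

    def __init__(self, send, min_level="ERROR"):
        self._send = send  # callable that would POST a JSON string to the server
        self._min_level = min_level

    def set_level(self, level):
        """Apply the minimum level the server asked the client to report."""
        if level not in LEVELS:
            raise ValueError(f"unknown log level: {level}")
        self._min_level = level

    def report(self, level, message, context=None):
        """Send the event if it is severe enough; return True if it was sent."""
        if LEVELS[level] < LEVELS[self._min_level]:
            return False
        payload = {"level": level, "message": message, "context": context or {}}
        self._send(json.dumps(payload))
        return True
```

Defaulting to ERROR keeps bandwidth use low in the field, while letting the server lower the threshold temporarily when troubleshooting a specific site.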
Tracking & prioritizing performance issues
Not discussed
2024-05-29
@Org Administrator working with @Antony Ojwang to identify high priority slow requests
One clear area for performance improvement is to support filtering & sorting within the REST API so the client doesn’t need to request all data from the server
Examples of large queries that could be improved by supporting sorting & filtering within the REST API:
Query for observations
Query for Vitals & Biometrics
Do we have tickets for these? If not, we should create 1-2 epics (e.g., Support server-side sorting, filtering, and paging of observations)
Strategy
Enumerate the endpoints needed by the client (e.g., parameter supported for sorting, filtering, and paging) so the client can request a single page of data needed for display rather than all data to filter/sort locally.
Build/refactor the needed endpoints in the Platform
Refactor the frontend client to leverage the new endpoints, relying on the server to perform filtering, sorting, and paging of data rather than doing all of it in the client
Determine the extent to which these changes need to be backported to serve implementation needs
Outstanding question: how does this affect offline mode? Either these features become unavailable in offline mode or the client would need to prefetch some data to support a scaled down version of these features.
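The core of the strategy above is that the server filters, sorts, and pages, and the client receives only the page it will display. A minimal in-memory sketch of that contract is shown below; the function, the response field names, and the sample observations are illustrative assumptions, not an existing endpoint.

```python
def query_page(records, *, concept=None, sort_key="obsDatetime", descending=True,
               start_index=0, limit=10):
    """Filter, sort, and page on the server side so the client gets a single page."""
    matches = [r for r in records if concept is None or r["concept"] == concept]
    matches.sort(key=lambda r: r[sort_key], reverse=descending)
    page = matches[start_index:start_index + limit]
    return {
        "results": page,
        "totalCount": len(matches),
        "hasMore": start_index + limit < len(matches),
    }


# Made-up observations for illustration only.
OBS = [
    {"concept": "weight", "obsDatetime": "2024-05-01", "value": 70},
    {"concept": "height", "obsDatetime": "2024-05-01", "value": 172},
    {"concept": "weight", "obsDatetime": "2024-05-20", "value": 71},
    {"concept": "weight", "obsDatetime": "2024-04-10", "value": 69},
]

first_page = query_page(OBS, concept="weight", limit=2)
```

In a real endpoint the filtering and sorting would be pushed into the database query rather than done in memory, which is where the indexing concerns come in.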
@Org Administrator to define a process/approach to track prioritization & progress on performance-related issues
2024-05-22
Performance and bandwidth issues
@Org Administrator had discussions with Palladium Kenya and identified a number of performance issues
Looked into specific performance issues
Majority of bandwidth usage is for code that is unnecessarily reloaded. Caching can help, but the caching frequently needs to be cleared when pages don’t load completely.
If we could create an endpoint for receiving client-side errors, then it’s possible the SPA module could report errors to the server when they occur
@Ian Bacher & @Antony Ojwang discussed trying to find a time when they could connect while Antony is in the field to do some live troubleshooting
Do we have to use FHIR? It sends more information than our custom REST API.
Investigate FHIR’s GraphQL
When the database has a lot of data (e.g., large number of observations), some queries perform more slowly.
Might be able to address these by improving indexing or queries/paging
There are multiple points in the application where full representations are unnecessarily requested, when a custom representation could perform much better (return less unnecessary information)
Old hardware can cause adverse performance
OpenMRS could publish hardware requirements
Make sure the CI pipeline and developers are experiencing an application that more closely reflects real-world hardware
In some cases, multiple calls are made to handle a single operation where a single call would be more efficient.
Clustering
Created page: O3 Cluster and Cloud Deployments
2024-05-15
Performance Issues
@Jan Flowers - working on finding “real-world” type data set for using in testing
other possible pathways - work with Palladium to work real time on troubleshooting together or via VPN, synthetic data (pros/cons)
@Org Administrator - will follow up with Antony to determine a pathway for troubleshooting the issue they reported
Tracking/Prioritizing
Can we make an Epic at least? Grace is tagging
@Org Administrator making O3 chattiness Epic
How do we track the performance issues that are being reported
How do we make sure we are creating tickets for the performance issues we want to prioritize and focus on resolving; measure/track/target to resolve
E.g. Locations thread, supposedly fixed with indexing fix and closed, but with recent versions of Tomcat there is a noticeable slowness - is there a ticket for this and is it assigned to be addressed?
We are not in a situation where there are no actionable performance issues - Tomcat issue, and “chattiness” from O3 for Palladium
@Paul Biondich - can Daniel be responsible for driving the troubleshooting and resolving of OpenMRS performance issues
Daniel - challenges in troubleshooting to get to the point of creating epics/tickets
When Daniel can’t move something forward, should turn to Paul/Jan/Burke to help unblock and problem solve
Create momentum through shared responsibility for solving problems - holding folks to commitments for follow up, pinging when someone doesn’t follow up, etc.
Billing/Stock Management Module
@Org Administrator - working with ___(?) to generalize module that was harvested from Banda Health
Docker Images for recent JDKs
@raff - JDK 11 and 17? Ready for the master build, will backport for 2.6 and 2.5 release lines
Cloud hosting architecture
Looking into cluster containers and drafting architecture and approach for cloud based deployment of OpenMRS3, started talk post - waiting for feedback; will start R&D on this approach next week
MVP definition - request for OpenMRS to be run on multi-tenant environment
multiple instances for multiple facilities in a cluster, via kubernetes with centralized platform for deployment with monitoring
advising for AWS, Azure, etc deployment
not just about scaling the API, but also about the backend db - kubernetes supports the cluster of db, instances, but more work needed on the API
Goal: get to the point that this is a “best practice” approach and is a straightforward recipe/lift for implementing
Auto de-activation of users / timeouts - @isaiahmuli
Reviewing code and sorting through questions for Daniel
Need guidance on how improvements are made at code level, pointers to documentation
@Org Administrator use forums (talk and slack) as much as possible in public way so that others can help support (not just @Org Administrator directly), also improves knowledge base for others to get set up; edit documentation, point out gaps and problems, as you go through things
PM support for Platform/Backend
Can @jmwiinga spend some time helping here? Jeremiah and Jan to follow up to determine how he could help