Summer Of Code 2007


We are so thankful for the strong show of support from students and the larger open source community. We are pleased to announce that we received 134 eligible proposals! Applications came not only from 13 states within the US but also from 34 other countries, covering six of the seven continents (guess we're not seeing a lot of OpenMRS interest in Antarctica yet). Applicants also have a wide range of educational backgrounds, from a recent high school graduate to many PhD students: 56% of applicants were working towards a bachelor's degree, 36% were master's-level students, and approximately 8% were PhD candidates.

Summer Projects

Mentors and interns should see Summer of Code 2007 - Getting Started.


Intern: Hugo Rodrigues
Mentor: Darius Jazayeri

Abstract: OpenMRS is built upon a flexible, highly scalable, EAV-based (entity-attribute-value) data repository model. This stacked approach, while excellent for storing millions of rows of clinical data, is formatted in a way that is non-intuitive to most implementers. Therefore, we are actively developing tools that allow users to define both cohorts and the structure of flat output tables. Further work to design highly intuitive, web-based design tools is needed, and students will work closely with mentors on multiple interface designs.
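
To make the gap between the stacked EAV layout and a flat output table concrete, here is a minimal sketch of pivoting observation rows into one row per cohort member. The row layout, concept names, and values are invented for illustration and are not the actual OpenMRS schema or export tool.

```python
from collections import defaultdict

# Hypothetical EAV rows: (patient_id, concept, value). Names and values are
# invented for illustration and do not reflect the real OpenMRS schema.
OBS_ROWS = [
    (1, "WEIGHT_KG", 62.0),
    (1, "CD4_COUNT", 350),
    (2, "WEIGHT_KG", 71.5),
    (3, "CD4_COUNT", 120),
]


def pivot_to_flat_table(rows, cohort, columns):
    """Return one flat row (dict) per cohort member, one key per requested column."""
    by_patient = defaultdict(dict)
    for patient_id, concept, value in rows:
        # Later rows overwrite earlier ones, so the last value seen wins.
        by_patient[patient_id][concept] = value
    table = []
    for patient_id in cohort:
        flat = {"patient_id": patient_id}
        for column in columns:
            # Missing observations become None rather than being dropped.
            flat[column] = by_patient[patient_id].get(column)
        table.append(flat)
    return table


flat_table = pivot_to_flat_table(
    OBS_ROWS, cohort=[1, 3], columns=["WEIGHT_KG", "CD4_COUNT"]
)
for row in flat_table:
    print(row)
```

Here the cohort is simply a list of patient ids and the column list defines the table structure, which mirrors the two things the design tools would let a user specify.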


Intern: Sashikanth Damaraju
Mentor: Paul Biondich

Abstract: Medical informatics often involves managing and processing large volumes of information. For example, clinicians are often forced to infer treatment plans through interpretation of large sets of historical, laboratory, and other test data for each patient. Researchers evaluate large retrospective patient cohort data sets to find particularly effective treatment protocols or previously unrecognized relationships between risk factors and the later presence of disease. Data managers evaluate tens of millions of clinical observations to identify and correct nonsensical or outlier data. All of this work would benefit from modern advances in data visualization. We are particularly interested in web-based Flash-JavaScript integrations, such as the excellent Google Finance stock trend visualization tool.

In most cases, we are talking about either (1) numeric or (2) categorical data. Numeric results usually have associated ranges (e.g., ranges defining the bounds of "normal" or "critical" values). Categorical data have coded answers like yes/no/unknown or codes for various disease conditions. All data are timestamped.
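
As a rough illustration of the numeric case, the sketch below models a timestamped observation and labels it against "normal" and "critical" bounds. The class names and the bounds are assumptions made up for this example; they are not OpenMRS types and not clinical reference values.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass
class NumericRange:
    """Bounds for one concept; all four numbers are supplied by the caller."""
    low_critical: float
    low_normal: float
    high_normal: float
    high_critical: float


@dataclass
class NumericObs:
    concept: str
    value: float
    timestamp: datetime  # every observation carries a timestamp


def classify(obs, bounds):
    """Label a value as 'normal', 'abnormal', or 'critical' for display."""
    if obs.value < bounds.low_critical or obs.value > bounds.high_critical:
        return "critical"
    if obs.value < bounds.low_normal or obs.value > bounds.high_normal:
        return "abnormal"
    return "normal"


# Made-up bounds, for illustration only (not clinical guidance).
EXAMPLE_BOUNDS = NumericRange(
    low_critical=70, low_normal=90, high_normal=120, high_critical=180
)

readings = [
    NumericObs("SYSTOLIC BLOOD PRESSURE", 118, datetime(2007, 5, 1, 9, 30)),
    NumericObs("SYSTOLIC BLOOD PRESSURE", 150, datetime(2007, 6, 1, 9, 45)),
    NumericObs("SYSTOLIC BLOOD PRESSURE", 200, datetime(2007, 7, 1, 10, 0)),
]
labels = [classify(obs, EXAMPLE_BOUNDS) for obs in readings]
print(labels)
```

A visualization could use such labels to shade the normal band behind a trend line or to highlight critical points.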

[Figure: example numeric data]

Some data are related (like diastolic and systolic blood pressure) and would be viewed together in the same graph:


[Figure: systolic blood pressure and diastolic blood pressure plotted on the same graph]

Often, patients will have dozens of data points for each parameter. Clinicians and researchers will want to understand the trends of these values and how they relate over time. We would like to explore ways not only to present data in a readily understandable manner, but also to allow users to interact with the data to meet their needs.


Intern: Michael Rudd Zwolinski
Mentor: Justin Miranda

Abstract: There are a variety of open source reporting frameworks, such as BIRT, Pentaho, and JasperReports, which allow for rapid development of summary or patient-level reports. OpenMRS, being a modular architecture, would like to encourage exploration of integration with all of these frameworks to give end-users the freedom to choose packages which best suit their needs. Potential projects include:

In this project, a developer will design and develop a BIRT Open Data Access (ODA) adapter that interfaces with OpenMRS (through a web service) to expose OpenMRS cohorts and columns (similar to the way a JDBC driver would expose tables and columns). The ODA driver will need to interface with an OpenMRS endpoint (i.e., a web service) to retrieve available data columns (tokens/concepts). The main goal of the project is to create an interface to OpenMRS that makes building reports easier. The Logic web service will also be responsible for responding to queries made by the ODA driver.

See BIRT Open Data Access adapter and Logic Web Service

Logic Service Project GSoc 2007

Intern: Vladimir Mitrovic
Mentor: Burke Mamlin

Abstract: Medical data is extremely complex. While a dictionary-based system provides tremendous flexibility in collecting data, consuming the data for decision support, reporting, research questions, or even for display within the web application can be increasingly difficult. For example, even answering a simple question like "is the patient HIV positive?" may require a complex algorithm that considers multiple lab results, temporal relations of results, orders, and questionnaire responses. Ideally, this "business logic" is defined in one place and easily accessible to all aspects of the system: whether we're displaying HIV status on a web page, generating a PEPFAR report, or executing a decision support alert, the same algorithm should be used. We call this bit of business knowledge a "rule." The rules should be easily accessible and return predictable and flexible results.
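
The "defined in one place" idea can be sketched as a small rule registry: each rule is registered once under a name, and every consumer (web page, report, alert) evaluates it through the same entry point. The registry, the function names, and the toy HIV logic below are all hypothetical; this is neither the OpenMRS Logic Service API nor a clinically valid algorithm.

```python
from datetime import date

# Hypothetical registry mapping rule names to functions.
RULES = {}


def rule(name):
    """Register a function under a rule name so every caller shares one definition."""
    def decorator(func):
        RULES[name] = func
        return func
    return decorator


@rule("HIV POSITIVE")
def hiv_positive(patient):
    # Toy logic for illustration only: the most recent test result decides;
    # with no test on record, any antiretroviral order counts as evidence.
    tests = sorted(patient.get("hiv_tests", []), key=lambda t: t["date"])
    if tests:
        return tests[-1]["result"] == "POSITIVE"
    return bool(patient.get("arv_orders"))


def evaluate(name, patient):
    """Single entry point used by every part of the system."""
    return RULES[name](patient)


patient = {
    "hiv_tests": [
        {"date": date(2006, 1, 10), "result": "NEGATIVE"},
        {"date": date(2007, 3, 2), "result": "POSITIVE"},
    ],
    "arv_orders": [],
}
print(evaluate("HIV POSITIVE", patient))
```

Because callers only know the rule name, the algorithm behind it can be refined later without touching the pages or reports that depend on it.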

Patient Matching, Record Linkage, and Data Aggregation Project

Intern: Sarp Centel
Mentor: Shaun Grannis

Abstract: Uniquely identifying patients and accurately linking information across disparate medical record systems is a challenging but critical medical record system function. This challenge is further compounded in resource-constrained settings, where systems often operate in a disconnected fashion. Informatics researchers are actively working on adding core statistical/probabilistic matching functions to OpenMRS, and this provides promising students with an opportunity to contribute to this intellectually stimulating topic area. Particular areas of needed work include:

  • Extending GUIs and/or underlying methods that define the analytic parameters needed to drive statistical/probabilistic matching
  • Implementing/extending methods that perform operational patient matching using parameters defined in the analytic phase
  • Creating modules which integrate the previously developed statistical/probabilistic matching techniques into OpenMRS end functions

Because the specific fields (patient identifiers and demographics) used for matching will vary for each OpenMRS implementation, the matching algorithm must flexibly adapt to varying field combinations. We plan to initially implement patient matching using the Fellegi-Sunter maximum-likelihood algorithm because it offers such flexibility.
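
For readers unfamiliar with the approach, the following sketch shows the scoring step of a Fellegi-Sunter style matcher: each field contributes log2(m/u) when it agrees and log2((1-m)/(1-u)) when it disagrees, where m and u are the agreement rates among true matches and among non-matches. The field names and probabilities are invented, and estimating m and u (the maximum-likelihood part) is not shown.

```python
import math

# Hypothetical (m, u) pairs per field: m = P(field agrees | true match),
# u = P(field agrees | non-match). These values are invented.
FIELD_PARAMS = {
    "family_name": (0.95, 0.01),
    "birth_year": (0.90, 0.05),
    "village": (0.80, 0.20),
}


def match_weight(record_a, record_b, params=FIELD_PARAMS):
    """Sum per-field agreement and disagreement weights for a record pair."""
    total = 0.0
    for field_name, (m, u) in params.items():
        a, b = record_a.get(field_name), record_b.get(field_name)
        if a is None or b is None:
            # A field missing on either side contributes nothing, so the
            # score adapts to whichever fields a site actually collects.
            continue
        if a == b:
            total += math.log2(m / u)
        else:
            total += math.log2((1 - m) / (1 - u))
    return total


record_a = {"family_name": "EXAMPLE", "birth_year": 1980, "village": "VILLAGE_A"}
record_b = {"family_name": "EXAMPLE", "birth_year": 1980, "village": "VILLAGE_B"}
print(match_weight(record_a, record_b))
```

A pair would then be declared a match, a non-match, or a candidate for manual review by comparing the total against thresholds chosen for the data sources at hand.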

To determine agreement status among fields, we'll build hooks to incorporate a variety of string comparators including (but not limited to) the Longest Common Substring, Levenshtein Edit Distance, and the Jaro-Winkler comparator.
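
One possible shape for such hooks is a table of named comparators that each return a similarity between 0 and 1, with a threshold turning the similarity into an agree/disagree decision. The sketch below implements two of the comparators named above; Jaro-Winkler is omitted for brevity. The registry, the normalization by the longer string, and the 0.8 threshold are assumptions for illustration.

```python
def levenshtein(a, b):
    """Edit distance: minimum insertions, deletions, and substitutions."""
    previous = list(range(len(b) + 1))
    for i, char_a in enumerate(a, start=1):
        current = [i]
        for j, char_b in enumerate(b, start=1):
            cost = 0 if char_a == char_b else 1
            current.append(min(previous[j] + 1,          # deletion
                               current[j - 1] + 1,       # insertion
                               previous[j - 1] + cost))  # substitution
        previous = current
    return previous[-1]


def longest_common_substring(a, b):
    """Length of the longest run of characters shared by both strings."""
    best = 0
    previous = [0] * (len(b) + 1)
    for char_a in a:
        current = [0]
        for j, char_b in enumerate(b, start=1):
            length = previous[j - 1] + 1 if char_a == char_b else 0
            current.append(length)
            best = max(best, length)
        previous = current
    return best


def levenshtein_similarity(a, b):
    longest = max(len(a), len(b))
    return 1.0 if longest == 0 else 1.0 - levenshtein(a, b) / longest


def lcs_similarity(a, b):
    longest = max(len(a), len(b))
    return 1.0 if longest == 0 else longest_common_substring(a, b) / longest


# Hypothetical hook table: new comparators are added by registering a name.
COMPARATORS = {
    "levenshtein": levenshtein_similarity,
    "lcs": lcs_similarity,
}


def fields_agree(a, b, comparator="levenshtein", threshold=0.8):
    """Decide agreement by comparing a normalized similarity to a threshold."""
    return COMPARATORS[comparator](a.lower(), b.lower()) >= threshold


print(fields_agree("Mwangi", "Mwanji"))
print(fields_agree("Smith", "Jones"))
```

The agree/disagree result per field is what a probabilistic matcher would then weight and sum.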

In order to implement probabilistic matching in OpenMRS, the targeted data sources must be analyzed to calculate the parameters that drive the algorithm. We seek to update and extend an existing GUI that performs this analysis. This "analytic phase" calculates a number of parameters for data sources, including:

  • field agreement rates among true matches
  • field agreement rates among false matches
  • estimated number of true matches between two data sources
  • number of records in each data source
  • number of unique non-null values for each field
  • number of null values for each field
  • Shannon's entropy for each field
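
Several of the per-field parameters in the list above can be computed directly from a data source. The sketch below counts records, unique non-null values, and null values for a field, and computes Shannon's entropy H = -Σ p·log2(p) over the observed values. Computing entropy over non-null values only is an assumption of this sketch, and the sample records are invented; the agreement rates and the estimated number of true matches need a second data source and are not shown.

```python
import math
from collections import Counter


def field_statistics(records, field_name):
    """Compute simple analytic-phase parameters for one field of a data source."""
    values = [record.get(field_name) for record in records]
    non_null = [value for value in values if value is not None]
    counts = Counter(non_null)
    total = len(non_null)
    # Shannon's entropy in bits, over non-null values only (an assumption).
    entropy = 0.0
    for count in counts.values():
        p = count / total
        entropy -= p * math.log2(p)
    return {
        "records": len(values),
        "unique_non_null": len(counts),
        "nulls": len(values) - total,
        "entropy_bits": entropy,
    }


# Invented sample data source.
SAMPLE_RECORDS = [
    {"sex": "F", "birth_year": 1980},
    {"sex": "M", "birth_year": 1975},
    {"sex": "F", "birth_year": 1980},
    {"sex": "M", "birth_year": None},
    {"sex": None, "birth_year": 1990},
]

sex_stats = field_statistics(SAMPLE_RECORDS, "sex")
print(sex_stats)
```

Fields with higher entropy carry more discriminating power, which is one reason the analytic phase reports it per field.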

Additional background can be found at Patient Matching Design Doc. Shaun Grannis, the lead mentor on this project, would welcome questions and would be happy to guide potential applicants toward an application that best suits particular coding strengths or abilities. Contact Shaun (sgrannis) if you have questions.

Mobile Data Collection Project

Intern: Matthias Nuessler
Mentor: Simon Kelly

Abstract: HIV medical outreach frequently occurs in resource-constrained settings, often far away from primary clinical locations. Handheld computers are often more practical for data management in remote settings. In addition, cell phones are used extensively in developing countries. As such, there's a need for tools to both collect and process information related to a person's care using mobile devices. We are developing software to run on mobile devices to carry out this work. An OpenMRS server will export X-Forms, patient sets, and information about which X-Forms should be collected on which patients to a handheld computer. Software on the handheld will allow the user to look up patients, create new ones, and fill out X-Forms. The completed X-Forms will be uploaded to the server and included in the main database. Possible summer projects include (1) working on a "transport layer" so that completed forms can be sent over the web, the phone line, or by manually carrying electronic media, and (2) extending our system's ability to present data on the handheld.

Drug Order Entry Project

Intern: Desmond Elliott
Mentor: Hamish Fraser

Abstract: Electronic Medical Records can collect a wide range of clinical data from patient history, physical exam, labs, x-rays, etc. One type of data has been shown to be particularly important for good patient care, and that is the patient's drug regimen. In most of our patients this is the critical factor in their treatment, and we know from US-based studies that carefully designed forms and displays for managing medication data can reduce medical errors and improve quality of care. We also have early studies showing this effect in Peru. We have extensive experience in building these tools but need to create new elegant systems to help our busy and often inexperienced staff enter drugs accurately, including the drug name, type, dose per day and per week, dates of starting and stopping, and instructions about dosing. It is important to support range checks and rules that link existing data about the patient to drug choices, e.g., "the patient is too light for this dose" or "the patient is known to have kidney failure and we need to lower the dose" or "the patient is pregnant and can't take this drug". There are several design patterns that have been used, including having rich data about each drug show up as you enter it or a multi-page process with issues around individual drugs highlighted.
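
The kind of range checks and patient-linked rules described above could be sketched as a function that takes a patient and a proposed order and returns a list of warnings for the entry form to display. The drug name, dose limit, and condition labels are invented placeholders, not clinical guidance, and the class names are hypothetical rather than OpenMRS types.

```python
from dataclasses import dataclass, field


@dataclass
class Patient:
    weight_kg: float
    pregnant: bool = False
    conditions: set = field(default_factory=set)


@dataclass
class DrugOrder:
    drug: str
    dose_mg_per_day: float


# Hypothetical drug metadata. All numbers are invented for illustration.
DRUG_INFO = {
    "EXAMPLE_DRUG": {
        "max_mg_per_kg_per_day": 10.0,
        "avoid_in_pregnancy": True,
        "reduce_dose_conditions": {"KIDNEY FAILURE"},
    }
}


def check_order(patient, order):
    """Return human-readable warnings linking patient data to the drug choice."""
    info = DRUG_INFO[order.drug]
    warnings = []
    # Range check: the daily dose must not exceed the weight-based maximum.
    if order.dose_mg_per_day > info["max_mg_per_kg_per_day"] * patient.weight_kg:
        warnings.append("the patient is too light for this dose")
    if patient.pregnant and info["avoid_in_pregnancy"]:
        warnings.append("the patient is pregnant and can't take this drug")
    if patient.conditions & info["reduce_dose_conditions"]:
        warnings.append("the patient has a condition that calls for a lower dose")
    return warnings


example_patient = Patient(weight_kg=40.0, conditions={"KIDNEY FAILURE"})
example_warnings = check_order(example_patient, DrugOrder("EXAMPLE_DRUG", 600.0))
for warning in example_warnings:
    print(warning)
```

Returning warnings rather than blocking the order leaves the final decision with the clinician, which suits either of the design patterns mentioned above.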

Coding would require a heavy amount of AJAX and Java. Good design skills and enthusiasm for HCI are important.


Intern: Gjergji Strakosha
Mentor: Justin Miranda
Backup Mentor: Mike Seaton



Intern: Matthew Harrison
Mentor: Andreas Kollegger
Backup Mentor: Dr. Chris Seebregts

Abstract: The myriad workflows inherent in health care necessitate a variety of mechanisms and media to collect data into a medical record system. The OpenMRS project has already implemented a module which utilizes Microsoft InfoPath for client-side data entry, but there are a number of promising open source alternatives for forms (including OpenOffice and Orbeon Forms) which would provide needed options for data collection. Ideally, these tools utilize the XForms standard.

OpenMRS Installer Project

Intern: Zach Elko
Mentor: Dirk de Jager

Abstract: Work has begun on an IzPack installer (source). The installer needs to be completed, tested, expanded, and maintained. For further details, see ticket TRAC-149 (or all tickets).

Data Synchronization Project

Intern: Anders Gjendem
Mentor: Maros Cunderlik

Abstract: As described under the remote data entry and mobile data collection projects, we have a need to synchronize local data storage with the central servers. Local data storage may be a fully functioning OpenMRS instance or perhaps a scaled-down version of the server. While a complete end-to-end synchronization solution featuring master-master replication, conflict resolution, contextual data replication, and automatic schema updates is clearly beyond the scale of a summer project, significant contributions toward this critical feature could be made. In the initial phase of this effort, we would like to implement data synchronization under the assumption that the remote data entry site would be able to add new patients and encounters but not modify existing records. As such, it is assumed that there will be only a single 'master' copy of data at any given point, and thus conflict detection and resolution are not needed at this point. Consequently, the synchronization in this context can be viewed as export, transfer, and import (analogous to ETL techniques used in data warehousing). Unlike ETL (which generally deals with the one-directional transformation between different E-R schemas and data semantics), however, it is assumed and expected that advanced synchronization features will be added to OpenMRS in the near future. Thus, broader consideration and awareness of the challenges in bi-directional replication and contextual replication are desired and expected to be reflected in the proposed solution.

High-level project plan and tasks (proposed):

  • Explore and evaluate existing open source frameworks for data synchronization (see Funambol)
  • Design and impact analysis: What is the proposed design? Outline how the proposed design matches the desired capabilities. What changes (if any) are required in the existing code base and data model? Is there an overlap with other OpenMRS initiatives and/or code already in existence (e.g., XML serialization may be used in data export and mobile applications)?
  • Implementation: Export. Implement the mechanism for generating a local changeset (i.e. data changes since last synchronized expressed as XML).
  • Implementation: Transfer. Explore pros/cons of various transfer mechanisms, and implement selected choice(s)
  • Implementation: Import. Code server-side 'sync service' that is capable of processing the client changesets and applying them to the central server
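
Under the add-only assumption described above, the export, transfer, and import steps can be sketched in a few lines. The record layout, the XML element and attribute names, and the integer timestamps are all invented for illustration; a real changeset format would come out of the design phase.

```python
import xml.etree.ElementTree as ET


def export_changeset(local_records, last_sync):
    """Serialize records created after last_sync as an XML changeset string."""
    root = ET.Element("changeset", {"since": str(last_sync)})
    for record in local_records:
        if record["created"] > last_sync:
            item = ET.SubElement(root, "record", {
                "id": record["id"],
                "type": record["type"],
                "created": str(record["created"]),
            })
            item.text = record["payload"]
    return ET.tostring(root, encoding="unicode")


def import_changeset(xml_text, central_store):
    """Apply a changeset to the central store; return how many records were added."""
    added = 0
    for item in ET.fromstring(xml_text).iter("record"):
        record_id = item.get("id")
        if record_id in central_store:
            # Add-only model: existing records are never modified, and
            # re-sending the same changeset is harmless.
            continue
        central_store[record_id] = {
            "type": item.get("type"),
            "created": int(item.get("created")),
            "payload": item.text or "",
        }
        added += 1
    return added


# Invented local data; "created" is a simple counter standing in for a timestamp.
LOCAL_RECORDS = [
    {"id": "p-1", "type": "patient", "created": 5, "payload": "already synced"},
    {"id": "p-2", "type": "patient", "created": 15, "payload": "new patient"},
    {"id": "e-1", "type": "encounter", "created": 25, "payload": "new encounter"},
]

changeset = export_changeset(LOCAL_RECORDS, last_sync=10)
central = {}
# "Transfer" here is just passing the string; it could equally travel over
# the web or on carried media.
first_pass = import_changeset(changeset, central)
second_pass = import_changeset(changeset, central)
print(first_pass, second_pass)
```

Keying the import on record ids keeps it safe to repeat after an interrupted transfer, which matters for disconnected sites.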