OCC Justification

What is this?

Hi, this is Paul here. I want to try something radically different (for me at least) in my attempts to describe and build a clear understanding of the OpenMRS Concept Cooperative (OCC). I believe that this service will ultimately be very important to the long-term sustainability and growth of the OpenMRS collaborative, and so I think it's very important to communicate these concepts to the broadest audience possible. That being said, I'm going to model this wiki page (or maybe pages as it grows) in a style which will ultimately become a journal submission. It's my full intention to publish this "article" in the scientific literature once it reaches maturity. By posting this work-in-progress here, I offer an open invitation to you all to both help me craft this document and, in doing so, become a co-author of the ultimate submission.

At this point, my rough thoughts are that there are two kinds of contributions (even this is up for discussion, I just don't want to get bogged down too much on these details):

  1. "Idea refiner": an author who brings substantive new contributions to either the style or content of the document. You will be a co-author. I'll determine author order by relative contribution.
  2. "Copy editor": a reader who has edits to the grammar or wording of the document. You will receive acknowledgements for your help.

Questions/thoughts? Email me: paul at openmrsdotorg


Evolving Practical Standard Terminologies Using a Community-Developed Approach: The OpenMRS Concept Cooperative

Introduction / Background

Electronic medical record (EMR) systems are built with the express intention of managing clinical information. As the health care enterprise cares for an individual, it creates myriad data artifacts, which represent both what was observed and what actions were taken to optimize his or her health status. (Perhaps an example or two of what a "data point" is?) EMRs, as they've evolved over the years, have represented these data points within their systems in differing ways. This has led to poor information interoperability, as the ways in which data points are modeled, codified, and represented differ from one system to the next.

Typically, system designers resolve issues related to semantic interoperability through the use of standard vocabularies. While the health care community has worked with these code sets since the 1960s, it wasn't until the 1990s that robust clinical vocabularies emerged. Clinical vocabularies such as LOINC and SNOMED attempt to model a set of terms, with their corresponding definitional information and other relevant metadata, at an appropriate level of specificity to document patient care. These so-called reference terminologies are by their nature controlled and developed through the hard work of organizations such as the College of American Pathologists, the World Health Organization, and the Regenstrief Institute.

However, there are many challenges facing the development community as these reference terminologies get utilized in real world scenarios.

  • No "one stop shop" -> considerable overlap between terminologies (I don't disagree completely, but in the case of LOINC and SNOMED - which many folks assumed there was a lot of overlap - recent evaluations by some of the Utah folks showed that there was really very little overlap. -Dan)
  • Artificial nature of how work has been split up among organizations doesn't allow a proper metadata model for an actual system – e.g., you need normal ranges and units for tests, etc. (True, which is why LOINC asks submitters to include this information in their requests for new terms. But, this of course is not the same as knowing what normal ranges are for everyone who has mapped to a certain concept. I think that the heterogeneity in healthcare and system implementations probably exceeds the capability of SDOs to be able to specify at that level. For example, the LOINC Committee recently sent out a call for large institutions and reference labs who had mapped to LOINC to send us their units of measure. We have gathered these and are now doing the analysis to determine the degree of variability and to see whether there is any possibility of asserting a 'preferred' unit - at least for the US. Of course, there are differences among countries as well in terms of prevalence of mass concentration vs substance concentration measurement, etc. There just haven't really been many evaluations of this kind though. - Dan)
  • Healthcare is under constant revision - vocabularies managed by organizations can't possibly keep up with this evolution - ?'s about sustainability of this approach
  • Vocabulary development organizations are not close enough to the actual health care environment to understand needs: most are more technologists than clinicians (This challenge seems to beg the question as to why vocabulary developers are not close enough to the health care environment ... Creating and maintaining clinical vocabularies has been traditionally so resource intensive that those who develop terminology lack the resources to also maintain deep knowledge about the vagaries of the health care environment. Should it be asserted that one of the great values of OCC is its potential to bring terminology development and management "to the masses"? – Shaun)
    (From the LOINC perspective, I would have to disagree with this assertion. From the very beginning (see original LOINC paper), the entire vocabulary was built on terms used in real systems. The explicit purpose was to provide universal codes for terms in existing systems. There are few, if any, LOINC codes that weren't first instantiated in someone's real system. Now, this might not be the case for other terminologies (smile). - Dan)
  • Deep challenges inherent in modeling a given concept: what’s the proper way to model a developmental milestone?
    For one, given the immense complexity and quantity of terminology necessary to represent clinical care, vocabulary development organizations in large part start by attempting to focus upon specific domains of expertise. For example, LOINC initially focused upon laboratory testing.
    (I agree. And as you point out later, there isn't really a right or wrong way to go about it...it depends on specific use-cases. For example, LOINC built its naming convention and level of granularity specifically to distinguish between tests reported on lab reports. That is a different use case than, say, decision support, and thus you might make different choices about the level of granularity if you were building a terminology for that purpose etc. Obviously, the bigger the net you cast - like SNOMED for example - the tougher the job to meet all demands. - Dan)

New Opportunities Modern Technologies Afford the Community

  • Internet and so called "2.0" technologies -> foster collaborative content development
  • Example of MediaWiki: "The WHO adopts a community ('wisdom of the crowds') approach for the ICD" (http://www.cbc.ca/health/story/2007/05/02/disease-wiki.html)
  • "Folksonomies"

The OpenMRS Concept Cooperative - What it does

OCC is a collection of the cumulative concept development work of the OpenMRS community, shared and viewable in such a way as to allow commonly used conventions to "rise to the top". Perhaps with enough participation, common modeling conventions and commonly used concepts will themselves become "de-facto standards".

  • OCC's foundation is the OpenMRS concept model, which represents, to the best of our knowledge, the relevant metadata needed to actually drive system behavior. (Unclear how 'metadata ... drives system behavior' – Shaun) (Agree that it would be valuable to elaborate on this a bit more to describe how additional attributes about the concept besides the term name are needed to create flowsheets, data entry screens, and essentially anything else you want to do - Dan).
  • OCC concepts can be linked to 1 to n standardized reference vocabularies (such as SNOMED, LOINC, ICD, etc)

    (Does OCC have concepts...or simply mappings? "OCC Concept" could be misleading. Would it be better to say "OCC concept mappings can be linked to..." or "OCC-linked concepts can be mapped to..."? -Burke)

  • OCC's key ingredient is tight linkage to the vocabulary development mechanisms inherent in the OpenMRS Base install. (May be overstating the obvious, but if the reason that 'tight linkage' is a 'key ingredient' is because it eases the oft-cumbersome process of accessing and browsing terminologies in the familiar and friendly OpenMRS interface, then it may be worth stating that very point – Shaun) Using network connectivity, users can browse the OCC resource within the OpenMRS dictionary editor, and import concepts into an implementation.
  • Implementations which import a given concept create an automatic mapping between their site and all other sites which have used the concept. They also import all of the collective work for that concept. (I think this is something that should be elaborated on. I would think that the more information about what has been mapped to a given concept and all things related to its current usage would be a huge help in mapping. - Dan) So, if any site maps the concept to a standardized vocabulary, all of the sites benefit from that new mapping. (Would be interested in hearing more about your thoughts on this particular point. As you know, we've taken a centralized approach to mapping in the INPC in part because the resources and expertise needed to do it are more than many local sites can expend. OpenMRS has taken a different approach. Either way, you want to take advantage of the work wherever it occurs. - Dan)
  • The assumption will be that implementations will vote by their actions on which concepts are more germane and "correct", and the OCC will highlight these votes by rendering the dictionary in ways that will show these decisions. (Agree ... this is one of the key points about the OCC that jump out to me. It is an additional attribute that you can actually model, similar to the actual frequency of use in the repository, and could help in mapping. This idea (frequency of mapping to a given concept) is part of what 3M actually uses in their mapping service, described originally (I think) in a paper from 2000. - Dan)
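
To make the sharing and "voting by action" ideas above more concrete, here is a minimal Python sketch. It is only an illustration under stated assumptions, not the OCC design or the OpenMRS API: the names SharedConcept, Cooperative, import_concept, add_mapping and ranked are hypothetical, the site names and concept names are made up, and the mapping code is a placeholder rather than a real LOINC code. The sketch assumes every importing site references the same shared concept record, so a mapping added once is visible to all of them, and that the number of importing sites is used as the "vote" when ordering the dictionary.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Set


@dataclass
class SharedConcept:
    """A concept record shared by all sites (hypothetical structure)."""
    name: str
    # Sites that imported this concept; each import counts as one "vote".
    sites: Set[str] = field(default_factory=set)
    # Mappings to reference vocabularies, keyed by vocabulary name.
    mappings: Dict[str, str] = field(default_factory=dict)


class Cooperative:
    """A toy in-memory stand-in for a shared concept resource."""

    def __init__(self) -> None:
        self.concepts: Dict[str, SharedConcept] = {}

    def publish(self, name: str) -> SharedConcept:
        # Register a concept once; later calls return the existing record.
        return self.concepts.setdefault(name, SharedConcept(name))

    def import_concept(self, site: str, name: str) -> SharedConcept:
        # Importing links the site to every other site using the concept
        # and gives it access to the collective mapping work.
        concept = self.concepts[name]
        concept.sites.add(site)
        return concept

    def add_mapping(self, name: str, vocabulary: str, code: str) -> None:
        # A mapping contributed by any one site benefits all importers,
        # because they all share the same record.
        self.concepts[name].mappings[vocabulary] = code

    def ranked(self) -> List[SharedConcept]:
        # Most widely adopted concepts first; ties broken by name.
        return sorted(self.concepts.values(),
                      key=lambda c: (-len(c.sites), c.name))


occ = Cooperative()
occ.publish("WEIGHT (KG)")
occ.publish("BODY WEIGHT")

site_a_view = occ.import_concept("site-a", "WEIGHT (KG)")
occ.import_concept("site-b", "WEIGHT (KG)")
occ.import_concept("site-c", "BODY WEIGHT")

# Any site contributes a mapping (placeholder code, not a real LOINC code).
occ.add_mapping("WEIGHT (KG)", "LOINC", "PLACEHOLDER-CODE")

print(site_a_view.mappings)             # site-a sees the new mapping too
print([c.name for c in occ.ranked()])   # most-imported concept comes first
```

In a real deployment each site would hold its own copy of the dictionary and synchronize over the network, so the single shared record used here is a simplification chosen to keep the sketch short.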

The OpenMRS Concept Cooperative - What it doesn't do

*Serve as a standard vocabulary: By definition, the OCC does not centrally manage a standard terminology. It merely makes the work of all OpenMRS sites more transparent. The OpenMRS leadership team has no intention of managing and supporting the work involved in maintaining its own vocabulary.

*Manage "concept_proposals": Concept proposals are a feature in which end sites for a given OpenMRS implementation can suggest augmentations to the concept dictionary. These proposals are intended to be vetted by those in charge of a given implementation's vocabulary, and as such aren't considered part of the OCC until they have been accepted as official changes to the implementation's dictionary. The OCC is intended to make a given site's official vocabulary transparent to the rest of the community. For management of concept proposals across a large series of clinics, please refer to the work being done on the "terminology services bureau" led by Unlicensed user @ MVP. We see this work as very complementary to the OCC.

Beneficiaries of the OCC

*Beginning users of OpenMRS: Many end-users of OpenMRS are operating in resource-constrained environments, limited by time, finances, and technical know-how. Given the acknowledged challenges related to vocabulary development, the OCC is meant to serve as a modeling "starting point" informed by the growing OpenMRS implementation community. It is a way for the collective intelligence of the community to be shared amongst all participants.

  • For beginning users and already experienced users scaling up their implementation (Not sure of how this fits in though): The OCC could help in the exchange and sharing of forms, because concepts are the basis for forms (and formlets, as we might see in the future).

*Large, established treatment programs: Many large-scale environments will be interested in centrally managing their own vocabularies amongst all of their sites. In such settings, vocabularies won't be developed directly by clinical sites. However, the OCC can help more experienced dictionary developers within these programs to take advantage of the mapping work provided by the community, and by participating they will create ways to communicate with other OpenMRS implementations.

*Informatics community: There are many unanswered questions related to concept modeling. The OCC provides a real-time look into how a growing community has chosen to model medical concepts, and might ultimately serve as a cornerstone of research in this regard.

*Clinical research community: Those interested in multi-site research efforts will be able to use the OCC as a starting point towards understanding what information lies within the OpenMRS install base, and perhaps begin the work around data aggregation. In one place, researchers will know what information is available, how it's modeled, and where it's located. One could also imagine receiving summary statistics on the amount of data available for each of these concepts.


Reference terminology: a terminology which can be used for mapping among different terminologies. Compositional concepts, independent of human language (within the machine). Supports mapping and subsequent aggregation.

Interface terminology: set of designations optimized for data entry by humans


http://online.wsj.com/public/article/SB115756239753455284-A4hdSU1xZOC9Y9PFhJZV16jFlLM_20070911.html -> excellent debate between the founder of Wikipedia and a leader of Encyclopedia Britannica

http://en.wikipedia.org/wiki/The_Wisdom_of_Crowds -> a strong personal inspiration

"The irony of group wisdom is that it is only when a group is unaware of its intelligence that it can be effective." <--- a key reason we must hold very strongly to the notion that the OCC is not a standardized reference terminology

Great story from "Wisdom of Crowds":

In 1906, Francis Galton, a British scientist, visited a country fair. At the fair, he came across a weight-judging competition, where for sixpence, participants could buy a ticket on which they could fill in their names and addresses, and their guess as to the weight of an ox which was on display. At the end of the day, the closest estimates would win prizes.

After the competition, Galton collected the used tickets, and analyzed the guesses of the crowd. He was astonished by the results. After the ox had been slaughtered and dressed, it weighed 1,198 pounds. And the average of the crowd's guesses? 1,197 pounds. The crowd's guess was thus amazingly accurate. This result, where the crowd's guess is better than the estimate of even the most expert individual in the crowd, has been replicated in numerous experiments.