2014 Internship Project
This project is being considered as a potential project for 2014 Internships. If you are a potential intern and are interested in working on this project, please discuss it in detail with the mentor(s) listed here before submitting your internship proposal.
Primary mentor | |
Backup mentor | TBD |
Assigned to |
Abstract
The amount of data being generated grows every day, and so does the appetite for extracting information from it. Transactional databases alone cannot satisfy this growing demand for data analysis. As organizations strive to separate their data warehouses from their transactional databases so that they can track historical data better, the need for ETL tools keeps growing. The goal of this project is to build an ETL module that can interact with multiple data warehouse (DW) systems on which predictive modeling code can run, so that healthcare providers can review predictive modeling results based on the historical data they hold or load. By the end of this summer we need at least a basic module that performs ETL into MySQL and Hadoop (Hive), and that provides a way to submit predictive modeling code to the warehouse directly from the OpenMRS UI and fetch back the results.
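To make the extract, transform, and load steps concrete, here is a minimal sketch in Python. It is an illustration under stated assumptions, not the proposed module: in-memory SQLite databases stand in for both the transactional source and the warehouse (the project itself targets MySQL and Hive), and the table names `obs` and `fact_obs`, their columns, and the sample rows are hypothetical.

```python
import sqlite3


def extract(source):
    """Read raw observation rows from the (stand-in) transactional store."""
    return source.execute(
        "SELECT patient_id, concept, value, obs_date FROM obs"
    ).fetchall()


def transform(rows):
    """Drop rows with no value and normalize concept names."""
    return [
        (pid, concept.strip().lower(), float(value), obs_date)
        for pid, concept, value, obs_date in rows
        if value is not None
    ]


def load(warehouse, rows):
    """Append cleaned rows to a hypothetical warehouse fact table."""
    warehouse.execute(
        "CREATE TABLE IF NOT EXISTS fact_obs ("
        "patient_id INTEGER, concept TEXT, value REAL, obs_date TEXT)"
    )
    warehouse.executemany("INSERT INTO fact_obs VALUES (?, ?, ?, ?)", rows)
    warehouse.commit()
    return len(rows)


def run_etl(source, warehouse):
    """Run the three steps in order and return the number of rows loaded."""
    return load(warehouse, transform(extract(source)))


def demo():
    # Assumption: SQLite replaces the real source and warehouse connections.
    source = sqlite3.connect(":memory:")
    source.execute(
        "CREATE TABLE obs ("
        "patient_id INTEGER, concept TEXT, value REAL, obs_date TEXT)"
    )
    source.executemany(
        "INSERT INTO obs VALUES (?, ?, ?, ?)",
        [
            (1, " Weight ", 70.5, "2014-01-10"),
            (1, "WEIGHT", None, "2014-02-10"),  # missing value, filtered out
            (2, "weight", 82.0, "2014-01-12"),
        ],
    )
    warehouse = sqlite3.connect(":memory:")
    loaded = run_etl(source, warehouse)
    # An analytical query that would run against the warehouse, not the source.
    avg = warehouse.execute(
        "SELECT AVG(value) FROM fact_obs WHERE concept = 'weight'"
    ).fetchone()[0]
    return loaded, avg
```

Keeping the three steps as separate functions means the `load` step could, in principle, be swapped for a MySQL or Hive writer without touching extraction or transformation; how the actual module would structure this is left open by the project description.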
Project Champions
Objectives
Extra Credit
1. Adding a UI for doing ETL, similar to what Informatica has.
2. Providing an interactive UI to analyze the predictive modeling results coming out of the data warehouse.
Resources
- Data Warehousing: http://en.wikipedia.org/wiki/Data_warehouse
- ETL (Extract, Transform, Load): http://en.wikipedia.org/wiki/Extract,_transform,_load
- Apache Hive: http://hive.apache.org/
- Apache Mahout: https://mahout.apache.org/