Comment: Some work I did earlier, according to the timeline

...

  1. Manual Loading - Export the MySQL table as CSV data and import that CSV data into Hive tables.
  2. Sqoop
  3. Tungsten Replicator
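The manual CSV route in option 1 can be sketched roughly as follows. This is a minimal illustration, not the project's actual loader: the table name `customers`, its columns, the sample rows, and the path `/tmp/customers.csv` are all hypothetical, and the rows are given in memory instead of being read from MySQL. It also assumes the values contain no commas or newlines, since Hive's plain delimited text format does not interpret CSV quoting.

```python
import csv
import io


def rows_to_csv(rows):
    """Serialize rows (sequences of values) to CSV text without a header line."""
    buf = io.StringIO()
    writer = csv.writer(buf, lineterminator="\n")
    writer.writerows(rows)
    return buf.getvalue()


def hive_create_table(table, columns):
    """Build a HiveQL CREATE TABLE statement for comma-delimited text files."""
    cols = ", ".join(f"`{name}` {hive_type}" for name, hive_type in columns)
    return (
        f"CREATE TABLE IF NOT EXISTS `{table}` ({cols}) "
        "ROW FORMAT DELIMITED FIELDS TERMINATED BY ',' "
        "STORED AS TEXTFILE"
    )


def hive_load_data(table, local_path):
    """Build a HiveQL LOAD DATA statement that imports a local file."""
    return f"LOAD DATA LOCAL INPATH '{local_path}' INTO TABLE `{table}`"


# Hypothetical sample standing in for the result of a MySQL export.
columns = [("id", "INT"), ("name", "STRING")]
rows = [(1, "alice"), (2, "bob")]

csv_text = rows_to_csv(rows)
create_stmt = hive_create_table("customers", columns)
load_stmt = hive_load_data("customers", "/tmp/customers.csv")

print(csv_text, end="")
print(create_stmt)
print(load_stmt)
```

In practice the CSV text would be written to the file named in the LOAD DATA statement, and the two generated statements would be run in the Hive shell or through a Hive client.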

 

 

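|                         | Manual via CSV            | Sqoop                         | Tungsten Replicator    |
|-------------------------|---------------------------|-------------------------------|------------------------|
| Process                 | Manual/Scripted           | Manual/Scripted               | Fully automated        |
| Incremental Loading     | Possible with DDL changes | Requires DDL changes          | Fully supported        |
| Latency                 | Full-load                 | Intermittent                  | Real-time              |
| Extraction Requirements | Full table scan           | Full and partial table scans  | Low-impact binlog scan |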
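The "Incremental Loading" entries for the CSV and Sqoop routes mention DDL changes, presumably because the source table needs a column that marks which rows are new, such as an auto-increment ID or a last-modified timestamp. The sketch below illustrates that idea with a simple high-water mark. It is an assumption-laden example: `sqlite3` stands in for MySQL so the code runs on its own, and the `customers` table, its `id` column, and the helper `fetch_new_rows` are hypothetical. Sqoop offers a comparable mechanism through its `--incremental append`, `--check-column`, and `--last-value` options.

```python
import sqlite3


def fetch_new_rows(conn, last_id):
    """Return rows with id greater than last_id, plus the updated watermark."""
    cur = conn.execute(
        "SELECT id, name FROM customers WHERE id > ? ORDER BY id",
        (last_id,),
    )
    rows = cur.fetchall()
    # Keep the old watermark when nothing new has arrived.
    new_last_id = rows[-1][0] if rows else last_id
    return rows, new_last_id


# In-memory database standing in for the MySQL source (assumption).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT)")
conn.executemany(
    "INSERT INTO customers (id, name) VALUES (?, ?)",
    [(1, "alice"), (2, "bob")],
)

# First run: everything is new because the watermark starts at 0.
first_batch, watermark = fetch_new_rows(conn, 0)

# A row arrives later; the second run picks up only that row.
conn.execute("INSERT INTO customers (id, name) VALUES (?, ?)", (3, "carol"))
second_batch, watermark = fetch_new_rows(conn, watermark)

print(first_batch)
print(second_batch)
print(watermark)
```

A watermark on an ID column only captures inserted rows. Capturing updates would need a last-modified column instead, and deletes are not seen at all, which is part of why a binlog-based tool is listed as fully supporting incremental loading.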

 

 

...

  • 19 May - 22 May: Study MySQL DB transformation using JSP.
  • 22 May - 26 May: Study Direct Web Remoting (DWR) and Spring Controller.
  • 26 May - 4 June: DWR implementation and back-end design using JavaScript and Java.
  • 5 June - 8 June: Build a web wizard similar to the mockup UI mentioned.
  • 9 June - 12 June: jQuery drag-and-drop support for table columns.
  • 12 June - 14 June: Hadoop, Hive, and Thrift study, setup, and resource collection.
  • 14 June - 15 June: Test and implement the data warehouse login (back end and front end).
  • 16 June - 22 June: Transform and load tables from MySQL to the Hive data warehouse. Implement various joins.
  • 23 June - 30 June: Add more features, such as multiple table or database selection and Hive table editing. Load data into Hive.
  • 1 July - 28 July: Study Apache Mahout; implementation and web interface.
  • 28 July - 18 August: Code fixing, bug solving, and finishing work.

...