Saturday, November 2, 2013

Big Data tools and relational BI analysis tools used in different stages of data analysis

I have been thinking about the relationship between traditional BI platforms and Big Data analytics tools. I think they should coexist and serve different stages of data analysis. That is, with a data-first approach, people use Big Data analytics tools to explore the data and find the right questions. Then, with the right and interesting questions in hand, people drill into the data and try to answer them on a BI platform built on a relational database.

I happened to watch a presentation given by Facebook. It looks like they use both Big Data tools and relational DB tools for different stages of data analysis. The material below is cited from the presentation and its video on YouTube.



Sunday, August 18, 2013

FlywayFile, a Maven plugin for generating migration script version numbers for FlywayDB

I used to think I did not need a DB migration tool as long as I was good at handling version control and used O/R mapping in the Java world. But after using a Ruby DB migration tool for a while in a Big Data project, I feel that an agile DB migration tool gives us peace of mind. In Ruby, we have a nice DB migration tool in Active Record Migrations. I use it, and it is pretty handy in an agile environment. The Java world has agile DB migration tools too. One I found recently is FlywayDB. It looks neat and well documented. I also like the feature comparison on its homepage.

But it seems that FlywayDB does not have a tool for generating migration files that follow its naming convention. This post introduces my quick work on a Maven plugin that helps users generate a script file name with proper version info as its prefix.
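To make the naming convention concrete, here is a minimal sketch of building such a file name. It relies on Flyway's default pattern for versioned SQL migrations (prefix V, a version, two underscores, a description, and the .sql suffix). The class name, the method name, and the timestamp layout used for the version are assumptions chosen for illustration; they are not taken from the FlywayFile source.

```java
import java.time.LocalDateTime;
import java.time.format.DateTimeFormatter;

public class MigrationFileNameSketch {

    // Assumed layout for the version part; the real plugin may use a different one.
    private static final DateTimeFormatter VERSION_FORMAT =
            DateTimeFormatter.ofPattern("yyyyMMddHHmmss");

    // Builds a name of the form V<version>__<description>.sql.
    static String buildFileName(LocalDateTime now, String description) {
        String version = now.format(VERSION_FORMAT);
        // Collapse whitespace runs into single underscores so the name stays one token.
        String cleaned = description.trim().replaceAll("\\s+", "_");
        return "V" + version + "__" + cleaned + ".sql";
    }

    public static void main(String[] args) {
        // "jia" mirrors the -DmyFilename=jia value used later in this post.
        System.out.println(buildFileName(LocalDateTime.now(), "jia"));
    }
}
```

A timestamp-based version keeps file names sortable in creation order, which is the same idea Active Record Migrations uses.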

The code has been shared on GitHub. It is named FlywayFile because I think it could be expanded to handle other file operations for FlywayDB in the future. Actually, I will ask whether FlywayDB will accept this as one of their Maven plugins.

There are two Eclipse Maven projects in the repository. One is the main part, the Maven plugin. The second is an extremely simple servlet, which generates a version number based on the system time and a random number.
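The core of such a servlet can be sketched without the servlet API. The following is only an illustration of combining the system time with a random number; the class name, the method name, and the exact layout (epoch milliseconds followed by a three-digit random suffix) are assumptions and may differ from what the VersionNumber servlet really does.

```java
import java.util.Random;

public class VersionNumberSketch {

    // Combines a timestamp with a random suffix so that two requests arriving
    // in the same millisecond are unlikely to receive the same version.
    static String nextVersion(long epochMillis, Random random) {
        int suffix = random.nextInt(1000);               // 0..999
        return String.format("%d%03d", epochMillis, suffix);
    }

    public static void main(String[] args) {
        System.out.println(nextVersion(System.currentTimeMillis(), new Random()));
    }
}
```

Because the time is taken on one server, the result does not depend on the clock or time zone of any developer machine. The random suffix only lowers the chance of a collision; it does not rule one out.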

I added this servlet because there might be cases where developers work in different time zones but share one version control system, such as Subversion or Git. To avoid confusion, we can use a central server to generate the migration script file name. However, this version of the code does not support complicated network environments yet. It is just a prototype. I can improve it later if working across time zones turns out to be a common case. The picture below illustrates the idea.


To use this Maven plugin, follow these steps:
  1. Install the plugin into your local Maven repository, as I have not registered it with the central Maven repository.
  2. Add the "org.idatamining" plugin group to your project's pom.xml file or your global settings.xml file.
  3. Add the property "flyway.SQL.directory" to your project's pom.xml. This property specifies where the SQL migration script file will be created.
  4. Optionally, add the property "flyway.filename.generator" to your project's pom.xml. It specifies the URL where the VersionNumber servlet listens.
  5. Call the generate goal: mvn flywayFile:generate -DmyFilename=jia
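Step 4 makes the central server optional. One way a plugin could act on that is sketched below: ask the server when a URL is configured, otherwise derive the version from the local clock. This is an assumption about the behavior, not the plugin's actual code. The class and method names are hypothetical, the servlet is assumed to return the version as a plain-text response body, and java.net.http.HttpClient requires Java 11 or later.

```java
import java.io.IOException;
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;
import java.time.LocalDateTime;
import java.time.format.DateTimeFormatter;

public class VersionSourceSketch {

    // Local fallback: a timestamp taken from this machine's clock (assumed layout).
    static String localVersion(LocalDateTime now) {
        return now.format(DateTimeFormatter.ofPattern("yyyyMMddHHmmss"));
    }

    // generatorUrl stands for the value of flyway.filename.generator; null or blank when unset.
    static String resolveVersion(String generatorUrl, LocalDateTime now) {
        if (generatorUrl == null || generatorUrl.isBlank()) {
            return localVersion(now);
        }
        try {
            HttpClient client = HttpClient.newHttpClient();
            HttpRequest request = HttpRequest.newBuilder(URI.create(generatorUrl)).GET().build();
            HttpResponse<String> response =
                    client.send(request, HttpResponse.BodyHandlers.ofString());
            if (response.statusCode() == 200) {
                return response.body().trim();
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();   // preserve the interrupt flag
        } catch (IOException | IllegalArgumentException e) {
            // Server unreachable or URL malformed: fall through to the local clock.
        }
        return localVersion(now);
    }

    public static void main(String[] args) {
        // No URL configured, so the version comes from the local clock.
        System.out.println(resolveVersion(null, LocalDateTime.now()));
    }
}
```

Falling back silently keeps the goal usable offline, but it also gives up the ordering guarantee the central server is meant to provide. Failing the build instead would be a reasonable alternative for distributed teams.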
Below is a sample pom.xml file I used to test this Maven plugin. The property "flyway.filename.generator" is optional; set it only when the version number should come from a central server.

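  <project>
    <modelVersion>4.0.0</modelVersion>
    <groupId>org.idatamining</groupId>
    <artifactId>testFly</artifactId>
    <packaging>jar</packaging>
    <version>1.0-SNAPSHOT</version>
    <name>testFly</name>
    <url>http://maven.apache.org</url>
    <properties>
      <flyway.SQL.directory>/home/yiyujia/workingDir/eclipseWorkspace/testFly</flyway.SQL.directory>
      <flyway.filename.generator>http://localhost:8080/versionGenerator/VersionNumber</flyway.filename.generator>
    </properties>
    <dependencies>
      <dependency>
        <groupId>junit</groupId>
        <artifactId>junit</artifactId>
        <version>3.8.1</version>
        <scope>test</scope>
      </dependency>
    </dependencies>
  </project>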


  mvn package
  mvn install:install-file -Dfile=./flyway-version-maven-plugin-1.0.jar -DgroupId=com.nokia.flyway -DartifactId=flyway-version-maven-plugin -Dversion=1.0 -Dpackaging=jar
  mvn flyway-version:flyway-version
  mvn com.nokia.flyway:flyway-version:1.0:create -e -DuniqueVersion=false -DmyFile=testFile
  mvn archetype:generate -DgroupId=idatamining.org -DartifactId=maven-flyway-generator-plugin -DarchetypeGroupId=org.apache.maven.archetypes -DarchetypeArtifactId=maven-archetype-mojo
  mvn -o flywayGenerator:create -DmyFile=yiyuFun -e -X
  mvn archetype:generate -DgroupId=org.idatamining -DartifactId=testFly -DarchetypeArtifactId=maven-archetype-quickstart -DinteractiveMode=false
  mvn -o flywayGenerator:create -DmyFilename="yiyuFunny" -e -X
  mvn -o flywayGenerator:create -DmyFilename="yiyuFunny" -Dprefix=A
  mvn archetype:generate -DgroupId=org.idatamining -DartifactId=versionGenerator -Dversion=1.0 -DarchetypeArtifactId=maven-archetype-webapp -DinteractiveMode=false