Wednesday, July 6, 2016

Applying HDFS to Medicurator




This week, I applied HDFS to Medicurator.

As we all know, the Hadoop Distributed File System (HDFS) is a distributed file system designed to run on commodity hardware. It is highly fault-tolerant and designed to be deployed on low-cost hardware. HDFS provides high-throughput access to application data and is suitable for applications with large data sets.

To help Medicurator cope with high-throughput workloads, I decided to use HDFS. I subclassed the Storage class and adapted the existing LocalStorage into a new HdfsStorage. To do this, I mainly used the Hadoop API, documented at https://hadoop.apache.org/docs/r2.6.1/api/overview-summary.html.
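
To give an idea of what an HDFS-backed storage class can look like, here is a minimal sketch built on the Hadoop FileSystem API. It is not the actual Medicurator code: the class name HdfsStorageSketch and the methods save, load, exists and remove are hypothetical, and the real Storage class may expose a different interface. It needs the Hadoop client libraries on the classpath.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.net.URI;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IOUtils;

// Hypothetical sketch; Medicurator's real HdfsStorage may differ.
class HdfsStorageSketch {
    private final FileSystem fs;
    private final Path baseDir;

    // hdfsUri is e.g. "hdfs://localhost:9000/", baseDir e.g. "/user/xxx/medicurator/".
    HdfsStorageSketch(String hdfsUri, String baseDir) throws IOException {
        this.fs = FileSystem.get(URI.create(hdfsUri), new Configuration());
        this.baseDir = new Path(baseDir);
    }

    // Copy the stream into a file under the base directory, overwriting any existing file.
    void save(String name, InputStream in) throws IOException {
        try (OutputStream out = fs.create(new Path(baseDir, name), true)) {
            IOUtils.copyBytes(in, out, 4096, false);
        }
    }

    // Open a stored file for reading; the caller closes the returned stream.
    InputStream load(String name) throws IOException {
        return fs.open(new Path(baseDir, name));
    }

    boolean exists(String name) throws IOException {
        return fs.exists(new Path(baseDir, name));
    }

    // Delete a single file (non-recursive); returns true if it was deleted.
    boolean remove(String name) throws IOException {
        return fs.delete(new Path(baseDir, name), false);
    }
}
```

Because FileSystem.get picks the implementation from the URI scheme, a sketch like this should also work against a local path (file:///) when no HDFS instance is available.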

In my tests it works well. So far I have only run it on a single machine; some work remains before it can run on a cluster.

In addition, users can choose between local storage and HDFS storage according to their preference by changing STORAGE (hdfs/local) in Constants.java. To use HDFS, set HDFS_URI and HDFS_BASEDIR in Constants.java. For example, HDFS_URI = "hdfs://localhost:9000/" and HDFS_BASEDIR = "/user/xxx/medicurator/".
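
As a rough illustration of how such a switch can work, the sketch below picks a storage backend from a STORAGE value. Only the constant names STORAGE, HDFS_URI and HDFS_BASEDIR and their example values come from the setup described above; the StorageSelection class, the create and describe methods, and the stub storage classes are assumptions made for this example, and the stubs do no real I/O.

```java
// Hypothetical sketch of selecting a backend from Constants-style settings.
class StorageSelection {
    // Stand-ins for the values in Constants.java.
    static final String STORAGE = "hdfs"; // "hdfs" or "local"
    static final String HDFS_URI = "hdfs://localhost:9000/";
    static final String HDFS_BASEDIR = "/user/xxx/medicurator/";

    interface Storage {
        String describe();
    }

    // Stub only: a real implementation would read and write local files.
    static class LocalStorage implements Storage {
        public String describe() {
            return "local file system";
        }
    }

    // Stub only: a real implementation would talk to HDFS.
    static class HdfsStorage implements Storage {
        private final String uri;
        private final String baseDir;

        HdfsStorage(String uri, String baseDir) {
            this.uri = uri;
            this.baseDir = baseDir;
        }

        // Join the URI and base directory without doubling the slash.
        public String describe() {
            String base = uri.endsWith("/") ? uri.substring(0, uri.length() - 1) : uri;
            String dir = baseDir.startsWith("/") ? baseDir : "/" + baseDir;
            return base + dir;
        }
    }

    // Map the STORAGE setting to a backend; reject anything unexpected.
    static Storage create(String kind) {
        switch (kind) {
            case "local":
                return new LocalStorage();
            case "hdfs":
                return new HdfsStorage(HDFS_URI, HDFS_BASEDIR);
            default:
                throw new IllegalArgumentException("Unknown STORAGE value: " + kind);
        }
    }

    public static void main(String[] args) {
        System.out.println(create(STORAGE).describe());
    }
}
```

Failing fast on an unknown value is a design choice in this sketch; it makes a typo in the setting visible at startup instead of silently falling back to one backend.
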
Source code and more information:
https://bitbucket.org/BMI/medicurator
