Wednesday, May 11, 2016

My thoughts on MEDIator: A Data Sharing Synchronization Platform for Heterogeneous Medical Image Archives



MEDIator: A Data Sharing Synchronization Platform for Heterogeneous Medical Image Archives
This is a paper written by my mentor. Its content is connected to the project I am going to do.

Here, I list some things I learned from this paper.

While sharing data is encouraged in science, algorithms and architectures should be designed for mashing up and sharing medical data efficiently. Hence, a data sharing synchronization system should be secure and should minimize data duplication in client instances, in addition to meeting the regular requirements of data access integration platforms. A data sharing synchronization platform should let data consumers view subsets of data that satisfy user-defined search criteria, and share them with others using pointers to the actual data.

This paper presents MEDIator, a data sharing and synchronization middleware platform for heterogeneous medical image archives. MEDIator allows sharing pointers to medical data efficiently, while letting the consumers manipulate the pointers without modifying the raw medical data. MEDIator has been implemented for multiple data sources, including Amazon S3, The Cancer Imaging Archive (TCIA), caMicroscope, and metadata from CSV files for cancer images.


Also, an in-memory data grid can be an alternative to traditional storage for the replica sets, as it provides faster storage, access, and execution. The paper uses Infinispan as this platform. By the way, I plan to use Infinispan in my project as well.

MEDIator lets the users create, update, retrieve, and delete replica sets, and share the replica sets with others. 
[Figure: Higher-Level Use Case View]
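
To make the replica set idea concrete for myself, here is a minimal sketch in Python. It is not MEDIator's actual API: the class `ReplicaSetStore`, its method names, and the owner/shared-with check are all my own assumptions, and a plain dict stands in for the in-memory data grid. The point it tries to show is that a replica set only holds pointers (image IDs), so creating, updating, sharing, or deleting a replica set never touches the raw images.

```python
import uuid


class ReplicaSetStore:
    """Hypothetical store of replica sets; a dict stands in for the data grid."""

    def __init__(self):
        # replica set ID -> owner, pointers (image IDs), users it is shared with
        self._sets = {}

    def create(self, owner, pointers):
        """Create a replica set from pointers and return its generated ID."""
        replica_set_id = str(uuid.uuid4())
        self._sets[replica_set_id] = {
            "owner": owner,
            "pointers": set(pointers),
            "shared_with": set(),
        }
        return replica_set_id

    def retrieve(self, replica_set_id, user):
        """Return the pointers, but only to the owner or a user it was shared with."""
        entry = self._sets[replica_set_id]
        if user != entry["owner"] and user not in entry["shared_with"]:
            raise PermissionError(f"{user} cannot read {replica_set_id}")
        return sorted(entry["pointers"])

    def update(self, replica_set_id, pointers):
        """Replace the pointers; the raw images themselves are not modified."""
        self._sets[replica_set_id]["pointers"] = set(pointers)

    def delete(self, replica_set_id):
        """Delete the replica set (the pointers only, not the images)."""
        del self._sets[replica_set_id]

    def share(self, replica_set_id, user):
        """Give another user read access to the replica set."""
        self._sets[replica_set_id]["shared_with"].add(user)


store = ReplicaSetStore()
rs_id = store.create("alice", ["image-002", "image-001"])
store.share(rs_id, "bob")
print(store.retrieve(rs_id, "bob"))
```

Sharing here means handing over the replica set ID, so the other user resolves the same pointers instead of receiving a copy of the data.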


MEDIator APIs: InterfaceAPI, PubConsAPI, Integrator
(details omitted)


Integration with Medical Data Sources 

Clinical data is deployed in multiple data sources such as TCIA, caMicroscope, and Amazon S3. Figure 3 of the paper depicts the deployment of the system with multiple medical data sources. This part can help us access different sources of data.


"MEDIator is multi-tenanted where multiple users co-exist without the knowledge of existence of the other users, sharing the same cache space. Involving a time stamp for the class extending PubConsAPI, downloaded items can be tracked, and the diffs can be produced for the user download. Thus a download can be paused and resumed later, downloading the images that have not been downloaded yet."

I am not clear about this paragraph; to be discussed.
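
One possible reading of the quoted paragraph, as a rough sketch: if every finished download is recorded with a time stamp, the difference between the replica set and the recorded items is exactly what remains to be downloaded, so a paused download can resume from there. The class `DownloadTracker` and its methods are hypothetical names of mine, not taken from the paper or from PubConsAPI.

```python
import time


class DownloadTracker:
    """Hypothetical tracker: records which pointers a user has downloaded, and when."""

    def __init__(self):
        # pointer (image ID) -> time stamp of the completed download
        self._downloaded = {}

    def mark_downloaded(self, pointer, timestamp=None):
        """Record a finished download with its time stamp."""
        self._downloaded[pointer] = time.time() if timestamp is None else timestamp

    def remaining(self, replica_set_pointers):
        """Diff: pointers in the replica set that have no download record yet."""
        return [p for p in replica_set_pointers if p not in self._downloaded]


replica_set = ["image-001", "image-002", "image-003"]
tracker = DownloadTracker()

# First session: one image is downloaded, then the user pauses.
tracker.mark_downloaded("image-001")

# Resuming later: only the images without a download record are fetched.
for pointer in tracker.remaining(replica_set):
    tracker.mark_downloaded(pointer)

print(tracker.remaining(replica_set))
```

The time stamp is not needed for the diff itself in this sketch; I assume it would matter when the tracked items have to be compared against later changes to the replica set, which is one of the points I still want to discuss.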

To conclude: first, I think I can apply the Representation of Medical Image Sources part to my project, since it can help me represent the source data. Second, I can build on MEDIator to access the data and then do the Near Duplicate Detection work on top of that.

