With huge quantities of data distributed across the scientific and industrial worlds, the demand for parallel access to information motivates the use of grid computing. Grid computing is an emerging infrastructure model for a broad variety of disciplines with high-volume and varied data sets. Because parallel access to data in the grid is limited, and because the requirements of both service providers and clients must be satisfied, efficient data integration becomes an important challenge. In this paper, an efficient technique called Derived Genetic Key Matching (DGKM) is developed for fast parallel access to heart disease diagnosis data from multiple grid locations, and seamless integration of data spread over the distributed grid servers is introduced. Storage keys are synchronized with grid locations to identify the requested data, based on the factors leading to heart disease, for multiple users (i.e., patients) at different locations, thereby improving the data integrity rate. DGKM in distributed grid services allows parallel and integrated data access (i.e., access to different features) using derived gene populations of key matching indexes, with the aim of reducing the time taken for key matching. Finally, the Vantage Point (VP) Tree Indexed Berkeley Key Matching algorithm is developed to optimize storage across different data grids, with the objective of returning results, based on the factors resulting in heart disease, to the corresponding grid server location, and thus improving the speed of parallel data access from distributed grids. The proposed technique is implemented in GridSim, a toolkit for resource modeling and application scheduling in parallel computing. DGKM performance is tested with grid file access for online data repositories, using the Cleveland Clinic Foundation heart disease data set available from the UCI repository, with metrics such as data grid access speed, data integrity rate, time taken for key matching, and accuracy of grid location identification.
Experimental results show that the proposed technique achieves better performance, improving data access speed by 17.04% and the accuracy of grid location identification by 15.13% compared to state-of-the-art works.
Grid computing, Derived Genetic, Key Matching, Parallel accessing, Gene populations, Parallel computing
© The Author(s) 2015. Open Access. This article is distributed under the terms of the Creative Commons Attribution 4.0 International License which permits unrestricted use, sharing, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made.