Last edited by
Quratt ul ain Siddique
Summary:
This paper discusses data mining of the mouse genome, an area of importance because of its relevance to understanding the basic genetics of other mammals, in particular humans, as well as livestock genetics and breeding.
The data mining tools used in this research were multiplot, data partition, clustering, self-organizing maps (SOM), regression, association, and neural networks. The paper demonstrates data mining and visualization results including a virtual gene map, mouse genomic features by chromosome, clustering, cluster proximity, the effect of T-scores, a self-organizing map, and regression analysis. One novelty of this research is that the data mining is performed at the genomic level of a mammal that is commonly used as a prototype in testing for humans.
The data mining performed on the mouse genome data indicated a linear regression relationship for the B16F0 chromosome, a significant reduction in the average error when neural network algorithms were used, a significant effect on the visualization plots when self-organizing maps (SOM) were used, and a nonlinear relationship between the cubic clustering criterion and the number of clusters, with discontinuities when the number of clusters reached 22 and 38.
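To make the cluster-count finding concrete, the following is a minimal Python sketch, not the paper's method or software, of sweeping the number of clusters and recording a fit criterion at each value. It assumes synthetic two-dimensional toy data and uses the within-cluster sum of squared errors from a plain k-means as a simpler stand-in for the cubic clustering criterion; the names toy_points, dist2, mean, and kmeans_sse are hypothetical.

```python
import random

def dist2(p, q):
    """Squared Euclidean distance between two points given as tuples."""
    return sum((a - b) ** 2 for a, b in zip(p, q))

def mean(points):
    """Coordinate-wise mean of a non-empty list of tuples."""
    n = len(points)
    return tuple(sum(coord) / n for coord in zip(*points))

def kmeans_sse(points, k, iters=50, seed=0):
    """Run plain Lloyd's k-means and return the within-cluster SSE.

    This is a stand-in criterion (assumption), not the cubic
    clustering criterion used in the paper.
    """
    rng = random.Random(seed)
    centers = rng.sample(points, k)  # k distinct data points as initial centers
    for _ in range(iters):
        clusters = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda i: dist2(p, centers[i]))
            clusters[nearest].append(p)
        # Keep the old center if a cluster ends up empty.
        updated = [mean(c) if c else centers[i] for i, c in enumerate(clusters)]
        if updated == centers:
            break
        centers = updated
    return sum(min(dist2(p, c) for c in centers) for p in points)

# Synthetic toy data (assumption): three well-separated blobs of 30 points each.
rng = random.Random(42)
toy_points = [(rng.gauss(cx, 0.5), rng.gauss(cy, 0.5))
              for cx, cy in [(0, 0), (5, 5), (0, 5)]
              for _ in range(30)]

# Sweep the number of clusters and record the criterion for each value.
sse = {k: kmeans_sse(toy_points, k) for k in range(1, 7)}
for k, value in sse.items():
    print(k, round(value, 2))
```

Plotting the recorded values against the number of clusters and looking for abrupt changes is the same kind of inspection that would expose discontinuities such as those the paper reports at 22 and 38 clusters.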
The results also indicated that visualization at the genomic level was useful for the mouse data. The analysis shown here can help researchers interested in genome data, and others, to visualize the use of data mining at this micro-dimensional level.
Future directions of the research are to continue data mining of the mouse genome data, which may entail using other data mining tools and software. Other future directions are to perform data mining on other databases, such as those for other mammals that are evolutionarily related to humans, and on other genomic databases of differing dimensionalities, to contrast with the findings presented in this paper.