Evolutionary Learning of Concepts

Read full paper at:

http://www.scirp.org/journal/PaperInformation.aspx?PaperID=47412

Author(s)

Rodrigo Morgon, Silvio do Lago Pereira

Affiliation(s)

Department of Information Technology, FATEC-SP/CEETEPS, São Paulo, Brazil.

ABSTRACT

Concept learning is a kind of classification task that has interesting practical applications in several areas. In this paper, a new evolutionary concept learning algorithm is proposed and a corresponding learning system, called ECL (Evolutionary Concept Learner), is implemented. This system is compared to three traditional learning systems: MLP (Multilayer Perceptron), ID3 (Iterative Dichotomiser) and NB (Naïve Bayes). The comparison takes into account target concepts of varying complexities (e.g., with interacting attributes) and different qualities of training sets (e.g., with imbalanced classes and noisy class labels). The comparison results show that, although no single system is the best in all situations, the proposed system ECL has a very good overall performance.
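
As a concrete illustration of the evolutionary approach described above, here is a minimal sketch in Python. It is not the authors' ECL implementation, and the attribute count, population size, rule-set size, and generation count are all assumed values: an individual is a small DNF rule set over Boolean attributes, fitness is training accuracy, and the search uses truncation selection with point mutation.

```python
# Minimal sketch of an evolutionary concept learner (illustrative, not the
# authors' ECL): an individual is a small DNF rule set over Boolean
# attributes, fitness is training accuracy, and search uses truncation
# selection with point mutation. All parameter values are assumptions.
import random

N_ATTRS, POP, GENS = 6, 50, 200

def random_rule():
    # a rule tests each attribute for 0, 1, or None ("don't care")
    return [random.choice([0, 1, None]) for _ in range(N_ATTRS)]

def covers(rule, x):
    return all(r is None or r == v for r, v in zip(rule, x))

def classify(rules, x):
    # DNF semantics: an example is positive if any rule covers it
    return any(covers(r, x) for r in rules)

def fitness(rules, data):
    return sum(classify(rules, x) == y for x, y in data) / len(data)

def mutate(rules):
    rules = [list(r) for r in rules]
    r = random.choice(rules)
    r[random.randrange(N_ATTRS)] = random.choice([0, 1, None])
    return rules

def evolve(data):
    pop = [[random_rule() for _ in range(3)] for _ in range(POP)]
    for _ in range(GENS):
        pop.sort(key=lambda rs: fitness(rs, data), reverse=True)
        parents = pop[:POP // 2]               # truncation selection
        pop = parents + [mutate(random.choice(parents))
                         for _ in range(POP - len(parents))]
    return max(pop, key=lambda rs: fitness(rs, data))
```

Class imbalance and label noise, the training-set qualities studied in the paper, would enter through the composition of the (x, y) pairs passed to evolve.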

KEYWORDS

Evolutionary Algorithms, Machine Learning, Classification, Interaction, Imbalance, Noise

Cite this paper

Morgon, R. and Pereira, S. (2014) Evolutionary Learning of Concepts. Journal of Computer and Communications, 2, 76-86. doi: 10.4236/jcc.2014.28008.

References

[1] Michie, D., Spiegelhalter, D.J. and Taylor, C.C. (1994) Machine Learning, Neural and Statistical Classification. Ellis Horwood, New York. http://www1.maths.leeds.ac.uk/~charles/statlog/whole.pdf
[2] Kotsiantis, S.B., Zaharakis, I.D. and Pintelas, P.E. (2006) Machine Learning: A Review of Classification and Combining Techniques. Artificial Intelligence Review, 26, 159-190.
http://dx.doi.org/10.1007/s10462-007-9052-3
[3] Moreira, L.M. (2000) The Use of Boolean Concepts in General Classification Contexts. Ph.D. Thesis, École Polytechnique Fédérale de Lausanne, Lausanne.
http://infoscience.epfl.ch/record/82654/files/rr00-46.pdf
[4] Menon, A.K., Narasimhan, H., Agarwal, S. and Chawla, S. (2013) On the Statistical Consistency of Algorithms for Binary Classification under Class Imbalance. Proceedings of the 30th International Conference on Machine Learning, Atlanta, 16-21 June 2013, 603-611.
http://clweb.csa.iisc.ernet.in/harikrishna/Papers/Class-imbalance/icml13-class-imbalance.pdf
[5] Jakulin, A. (2003) Attribute Interactions in Machine Learning. M.Sc. Thesis, University of Ljubljana, Ljubljana. http://www.stat.columbia.edu/~jakulin/Int/interactions_full.pdf
[6] Natarajan, N., Dhillon, I., Ravikumar, P. and Tewari, A. (2013) Learning with Noisy Labels. Advances in Neural Information Processing Systems, NIPS, 1196-1204. http://papers.nips.cc/paper/5073-learning-with-noisy-labels
[7] Whitley, D. (2001) An Overview of Evolutionary Algorithms: Practical Issues and Common Pitfalls. Information and Software Technology, 43, 817-831. http://dx.doi.org/10.1016/S0950-5849(01)00188-4
[8] Hekanaho, J. (1998) An Evolutionary Approach to Concept Learning. Ph.D. Thesis, Abo Akademi University, Vasa.
http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.27.6647&rep=rep1&type=pdf
[9] Thrun, S.B., et al. (1991) The Monk’s Problems—A Performance Comparison of Different Learning Algorithms. Technical Report, Carnegie Mellon University.
http://people.cs.missouri.edu/~skubicm/375/thrun.comparison.pdf
[10] Labatut, V. and Cherifi, H. (2012) Accuracy Measures for the Comparison of Classifiers. Proceedings of the 5th International Conference on Information Technology, Chania, Crete, 7-9 July 2014, 1-5. http://arxiv.org/ftp/arxiv/papers/1207/1207.3790.pdf
[11] De Jong, K.A. (2006) Evolutionary Computation: A Unified Approach. MIT Press, London.
[12] Weise, T. (2008) Global Optimization Algorithms: Theory and Application. 2nd Edition. http://www.it-weise.de
[13] Koza, J.R. (1998) Genetic Programming. MIT Press, London.
[14] Fogel, L.J. (1964) On the Organization of Intellect. Ph.D. Thesis, University of California, Los Angeles.
[15] Rechenberg, I. (1965) Cybernetic Solution Path of an Experimental Problem. Royal Aircraft Establishment, Library Translation 1122, Farnborough.
[16] Witten, I.H., Frank, E. and Hall, M.A. (2011) Data Mining. 3rd Edition, Morgan Kaufmann, Burlington.
[17] Ceder, V.L. (2010) The Quick Python Book. 2nd Edition, Manning Publications Co., Greenwich.
[18] Alcalá-Fdez, J., et al. (2011) KEEL Data-Mining Software Tool: Data Set Repository, Integration of Algorithms and Experimental Analysis Framework. Journal of Multiple-Valued Logic and Soft Computing, 17, 255-287. http://www.keel.es
[19] Bache, K. and Lichman, M. (2013) UCI Machine Learning Repository. University of California, School of Information and Computer Science. http://archive.ics.uci.edu/ml

Causal Groupoid Symmetries and Big Data

Read full paper at:

http://www.scirp.org/journal/PaperInformation.aspx?PaperID=52267

Author(s)

Pissanetzky, S.

ABSTRACT

The big problem of Big Data is the lack of a machine learning process that scales and finds meaningful features. Humans fill in for the insufficient automation, but the complexity of the tasks outpaces the human mind’s capacity to comprehend the data. Heuristic partition methods may help but still need humans to adjust the parameters. The same problems exist in many other disciplines and technologies that depend on Big Data or Machine Learning. Proposed here is a fractal groupoid-theoretical method that recursively partitions the problem and requires no heuristics or human intervention. It takes two steps. First, make explicit the fundamental causal nature of information in the physical world by encoding it as a causal set. Second, construct a functor F: C → C′ on the category of causal sets that morphs causal set C into a smaller causal set C′ by partitioning C into a set of invariant groupoid-theoretical blocks. Repeating the construction, there arises a sequence of progressively smaller causal sets C, C′, C″, … The sequence defines a fractal hierarchy of features, with the features being invariant and hence endowed with a physical meaning, and the hierarchy being scale-free and hence ensuring proper scaling at all granularities. Fractals exist in nature nearly everywhere and at all physical scales, and invariants have long been known to be meaningful to us. The theory is also of interest for NP-hard combinatorial problems that can be expressed as a causal set, such as the Traveling Salesman Problem (TSP). The recursive groupoid partition promoted by functor F works against their combinatorial complexity and appears to allow a low-order polynomial solution. A true test of this property requires special hardware, not yet available. However, as a proof of concept, a suite of sequential, non-heuristic algorithms was developed and used to solve a real-world 120-city instance of the TSP on a personal computer. The results are reported.
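
As a toy illustration of the coarsening step, and assuming a causal set encoded simply as a set of cause-effect pairs, the following Python sketch shows how a partition of the elements into blocks induces a smaller causal set, the analogue of one application of the functor F. The partition here is supplied by hand; the paper derives it from groupoid symmetries, which this sketch does not attempt.

```python
# Toy illustration (not the paper's groupoid construction): a causal set as
# a set of cause -> effect pairs, coarsened by a partition of its elements
# into blocks. The partition is hand-supplied here; the paper derives it
# from groupoid symmetries.
causal_set = {("a", "c"), ("b", "c"), ("c", "d"), ("c", "e")}
blocks = {"a": "A", "b": "A", "c": "B", "d": "C", "e": "C"}   # assumed blocks

def coarsen(cset, blocks):
    # map each relation (x, y) to (blocks[x], blocks[y]); drop self-loops
    return {(blocks[x], blocks[y]) for x, y in cset if blocks[x] != blocks[y]}

smaller = coarsen(causal_set, blocks)        # one application of F
print(smaller)                               # {('A', 'B'), ('B', 'C')}
# repeating with a partition of {A, B, C} yields the next causal set C''
```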

Cite this paper

Pissanetzky, S. (2014) Causal Groupoid Symmetries and Big Data. Applied Mathematics, 5, 3489-3510. doi: 10.4236/am.2014.521327.

References

[1] Pissanetzky, S. (2014) Causal Groupoid Symmetries. Applied Mathematics, 5, 628-641.
http://www.scirp.org/Journal/Home.aspx?IssueID=4511
http://dx.doi.org/10.4236/am.2014.54059
[2] Kauffman, S. (2011) Answering Descartes: Beyond Turing. Proceedings of European Conference on Artificial Life (ECAL 2011), Paris, 8-12 August 2011, 11-22.
http://mitpress.mit.edu/sites/default/files/titles/alife/0262297140chap4.pdf
[3] Opdyke, W.F. (1992) Refactoring Object-Oriented Frameworks. Ph.D. Thesis, Department of Computer Science, University of Illinois, Urbana Champaign, Illinois.
http://dl.acm.org/citation.cfm?id=169783
[4] Pissanetzky, S. (2012) Reasoning with Computer Code: A New Mathematical Logic. Journal of Artificial General Intelligence, 3, 11-42.
http://www.degruyter.com/view/j/jagi.2012.3.issue-3/issue-files/jagi.2012.3.issue-3.xml
[5] Cuntz, H., Mathy, A. and Hausser, M. (2012) A Scaling Law Derived from Optimal Dendritic Wiring. Proceedings of the National Academy of Sciences of the United States of America, 109, 11014-11018.
http://www.pnas.org/content/109/27/11014.abstract
http://dx.doi.org/10.1073/pnas.1200430109
[6] Pissanetzky, S. and Lanzalaco, F. (2013) Black-Box Brain Experiments, Causal Mathematical Logic, and the Thermodynamics of Intelligence. Journal of Artificial General Intelligence, 4, 10-43.
http://www.degruyter.com/view/j/jagi.2013.4.issue-3/jagi-2013-0005/jagi-2013-0005.xml
[7] Lanzalaco, F. and Pissanetzky, S. (2013) Causal Mathematical Logic as a Guiding Framework for the Prediction of “Intelligence Signals” in Brain Simulations. Journal of Artificial General Intelligence, 4, 44-88.
http://www.degruyter.com/view/j/jagi.2013.4.issue-3/jagi-2013-0006/jagi-2013-0006.xml
[8] MacGregor, J.N. and Chu, Y. (2011) Human Performance on the Traveling Salesman and Related Problems: A Review. The Journal of Problem Solving, 3, 1-29.
http://docs.lib.purdue.edu/jps/vol3/iss2/2/
http://dx.doi.org/10.7771/1932-6246.1090
[9] Dorigo, M. and Gambardella, L.M. (1997) Ant Colonies for the Travelling Salesman Problem. Biosystems, 43, 73-81.
http://www.sciencedirect.com/science/article/pii/S0303264797017085
http://dx.doi.org/10.1016/S0303-2647(97)01708-5
[10] Martin, C.F., Bhui, R., Bossaerts, P., Matsuzawa, T. and Camerer, C. (2014) Chimpanzee Choice Rates in Competitive Games Match Equilibrium Game Theory Predictions. Scientific Reports, 4, Article No. 5182.
http://www.nature.com/srep/2014/140605/srep05182/full/srep05182.html
http://dx.doi.org/10.1038/srep05182
[11] Merolla, P.A., Arthur, J.V., Alvarez-Icaza, R., Cassidy, A.S., Sawada, J., Akopyan, F., et al. (2014) A Million Spiking-Neuron Integrated Circuit with a Scalable Communication Network and Interface. Science, 345, 668-673.
http://www.sciencemag.org/content/345/6197/668.abstract
http://dx.doi.org/10.1126/science.1254642
[12] Zhai, Y., Ong, Y.S. and Tsang, I.W. (2014) The Emerging “Big Dimensionality”. IEEE Computational Intelligence Magazine, 9, 14-26.
http://www.IEEE-CIS.org
[13] Huijse, P., Estevez, P.A., Protopapas, P., Principe, J.C. and Zegers, P. (2014) Computational Intelligence Challenges and Applications on Large-Scale Astronomical Time Series Databases. IEEE Computational Intelligence Magazine, 9, 27-39.
http://www.IEEE-CIS.org
[14] Zaremba, W., Szegedy, C., Sutskever, I., Bruna, J., Erhan, D., Goodfellow, I. and Fergus, R. (2014) Intriguing Properties of Neural Networks. Computer Vision and Pattern Recognition, arxiv.org/abs/1312.6199.
[15] Ng, A. (2014) RSS2014: 07/16 09:00-10:00 Invited Talk: Andrew Ng (Stanford University): Deep Learning.
http://www.youtube.com/watch?v=W15K9PegQt0
[16] Pissanetzky, S. (2011) Emergence and Self-Organization in Partially Ordered Sets. Complexity, 17, 19-38.
http://dx.doi.org/10.1002/cplx.20389
[17] Connes, A. (1994) Noncommutative Geometry. Academic Press, San Diego.
http://www.alainconnes.org/docs/book94bigpdf.pdf
[18] Hulpke, A. (2010) Notes on Computational Group Theory.
http://www.math.colostate.edu/~hulpke/CGT/cgtnotes.pdf
[19] Fuster, J.M. (2005) Cortex and Mind. Oxford University Press, New York.
http://ukcatalogue.oup.com/product/9780195300840.do
[20] Fuster, J.M. (2009) Cortex and Memory: Emergence of a New Paradigm. Journal of Cognitive Neuroscience, 21, 2047-2072.
http://cogsci.fmph.uniba.sk/~farkas/courses/Neurocomp/References/fuster.memory.jocn09.pdf
http://dx.doi.org/10.1162/jocn.2009.21280
[21] Berut, A., Arakelyan, A., Petrosyan, A., Ciliberto, S., Dillenschneider, R. and Lutz, E. (2012) Experimental Verification of Landauer’s Principle Linking Information and Thermodynamics. Nature, 483, 187-189.
http://www.nature.com/nature/journal/v483/n7388/full/nature10872.html
http://dx.doi.org/10.1038/nature10872
[22] Pissanetzky, S. (2014) Tours for the Traveling Salesman Problem gr120.
http://www.scicontrols.com/Publications/TheTours.txt
[23] Eguro, K., Hauck, S. and Sharma, A. (2005) Architecture Adaptive Range Limit Windowing for Simulated Annealing FPGA Placement. Microsoft Research, Design Automation Conference, San Francisco, 14-17 June 2005, 439-444.
http://ieeexplore.ieee.org/xpl/mostRecentIssue.jsp?punumber=10020
[24] Reinelt, G. (1995) Discrete and Combinatorial Optimization. TSPLIB.
http://comopt.ifi.uni-heidelberg.de/software/TSPLIB95/
[25] Index of /groups/comopt/software/tsplib95/xmltsplib/instances. 1995.
http://www.iwr.uni-heidelberg.de/groups/comopt/software/TSPLIB95/XML-TSPLIB/instances/
[26] Assembla Oocplex.
http://www.assembla.com/code/oocplex/subversion/nodes/3/objectOrientedIntegerProgramming/sampleData/TSPLIB/gr120.opt.tour
[27] Assembla Oocplex.
http://www.assembla.com/code/oocplex/subversion/nodes/3/objectOrientedIntegerProgramming/sampleData/TSPLIB/gr120.tsp
[28] The Traveling Salesman Problem.
http://www.math.uwaterloo.ca/tsp/
[29] Wissner-Gross, A.D. and Freer, C.E. (2013) Causal Entropic Forces. Physical Review Letters, 110, Article ID: 168702.
http://www.alexwg.org/publications/PhysRevLett_110-168702.pdf

Visualizing Random Forest’s Prediction Results

Read full paper at:

http://www.scirp.org/journal/PaperInformation.aspx?PaperID=52114

ABSTRACT

The current paper proposes a new visualization tool to help check the quality of random forest predictions by plotting the proximity matrix as a weighted network. This new visualization technique is compared with the traditional multidimensional scaling plot. The paper also introduces a new accuracy index (proportion of misplaced cases) and compares it to total accuracy, sensitivity and specificity. It also applies clustering coefficients to weighted graphs, in order to understand how well the random forest algorithm is separating two classes. Two datasets were analyzed, one from medical research (breast cancer) and the other from psychological research (medical students’ academic achievement), varying the sample sizes and the predictive accuracy. With different numbers of observations and different possible prediction accuracies, it was possible to compare how each visualization technique behaves in each situation. The results showed that the visualization of the random forest’s predictive performance was easier and more intuitive to interpret using the weighted network of the proximity matrix than using the multidimensional scaling plot. The proportion of misplaced cases was highly related to total accuracy, sensitivity and specificity. This strategy, together with the computation of Zhang and Horvath’s (2005) clustering coefficient for weighted graphs, can be very helpful in understanding how well a random forest prediction is doing in terms of classification.
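
The paper’s references point to R packages (randomForest, qgraph); the sketch below is an analogous Python construction, not the authors’ code. Proximity of two cases is the fraction of trees in which they fall in the same terminal node, and the 0.5 edge threshold is an arbitrary assumption.

```python
# Analogous Python sketch (the paper works in R with randomForest + qgraph).
# Proximity of two cases = fraction of trees in which they share a leaf;
# the proximity matrix is then drawn as a weighted network with a
# force-directed layout. The 0.5 edge threshold is an assumption.
import numpy as np
import networkx as nx
import matplotlib.pyplot as plt
from sklearn.datasets import load_breast_cancer
from sklearn.ensemble import RandomForestClassifier

X, y = load_breast_cancer(return_X_y=True)
rf = RandomForestClassifier(n_estimators=500, random_state=0).fit(X, y)

leaves = rf.apply(X)                        # (n_samples, n_trees) leaf ids
n = X.shape[0]
prox = np.zeros((n, n))
for t in range(leaves.shape[1]):
    prox += leaves[:, t][:, None] == leaves[:, t][None, :]
prox /= leaves.shape[1]

G = nx.Graph()
rows, cols = np.where(np.triu(prox, k=1) > 0.5)
G.add_weighted_edges_from((i, j, prox[i, j]) for i, j in zip(rows, cols))
pos = nx.spring_layout(G, weight="weight", seed=0)   # force-directed layout
nx.draw(G, pos, node_size=20, node_color=[y[v] for v in G.nodes])
plt.show()
```

In a well-separated forest, the two classes appear as two dense clusters of the network with few cross-class edges, which is what the clustering coefficient is meant to quantify.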

Cite this paper

Golino, H. & Gomes, C. (2014). Visualizing Random Forest’s Prediction Results. Psychology, 5, 2084-2098. doi: 10.4236/psych.2014.519211.

References

[1] Antoniou, I. E., & Tsompa, E. T. (2008). Statistical Analysis of Weighted Networks. Discrete Dynamics in Nature and Society, 2008, 16.
http://dx.doi.org/10.1155/2008/375452
[2] Barrat, A., Barthelemy, M., Pastor-Satorras, R., & Vespignani, A. (2004). The Architecture of Complex Weighted Networks. Proceedings of the National Academy of Sciences of the United States of America, 101, 3747-3752.
http://dx.doi.org/10.1073/pnas.0400087101
[3] Bennett, K. P., & Mangasarian, O. L. (1992) Robust Linear Programming Discrimination of Two Linearly Inseparable Sets. Optimization Methods and Software, 1, 23-34.
http://dx.doi.org/10.1080/10556789208805504
[4] Blanch, A., & Aluja, A. (2013). A Regression Tree of the Aptitudes, Personality, and Academic Performance Relationship. Personality and Individual Differences, 54, 703-708.
http://dx.doi.org/10.1016/j.paid.2012.11.032
[5] Borg, I. & Groenen, P. (2005). Modern Multidimensional Scaling: Theory and Applications (2nd ed.). New York: Springer-Verlag.
[6] Borkin, M. A., Vo, A. A., Bylinskii, Z., Isola, P., Sunkavalli, S., Oliva, A., & Pfister, H. (2013). What Makes a Visualization Memorable? IEEE Transactions on Visualization and Computer Graphics, 19, 2306-2315.
http://dx.doi.org/10.1109/TVCG.2013.234
[7] Breiman, L. (2001). Random Forests. Machine Learning, 45, 5-32.
http://dx.doi.org/10.1023/A:1010933404324
[8] Breiman, L., Friedman, J., Olshen, R., & Stone, C. (1984). Classification and Regression Trees. New York: Chapman & Hall.
[9] Cortez, P., & Silva, A. M. G. (2008). Using Data Mining to Predict Secondary School Student Performance. In A. Brito, & J. Teixeira (Eds.), Proceedings of 5th Annual Future Business Technology Conference, Porto, 5-12.
[10] Eloyan, A., Muschelli, J., Nebel, M., Liu, H., Han, F., Zhao, T., Caffo, B. et al. (2012). Automated Diagnoses of Attention Deficit Hyperactive Disorder Using Magnetic Resonance Imaging. Frontiers in Systems Neuroscience, 6, 61.
http://dx.doi.org/10.3389/fnsys.2012.00061
[11] Epskamp, S., Cramer, A. O. J., Waldorp, L. J., Schmittmann, V. D., & Borsboom, D. (2012). Qgraph: Network Visualizations of Relationships in Psychometric Data. Journal of Statistical Software, 48, 1-18.
http://www.jstatsoft.org/v48/i04/
[12] Fruchterman, T. M. J., & Reingold, E. M. (1991). Graph Drawing by Force-Directed Placement. Software: Practice and Experience, 21, 1129-1164.
http://dx.doi.org/10.1002/spe.4380211102
[13] Geurts, P., Irrthum, A., & Wehenkel, L. (2009). Supervised Learning with Decision Tree-Based Methods in Computational and Systems Biology. Molecular Biosystems, 5, 1593-1605.
http://dx.doi.org/10.1039/b907946g
[14] Golino, H. F., & Gomes, C. M. A. (2014). Four Machine Learning Methods to Predict Academic Achievement of College Students: A Comparison Study. Revista E-Psi, 4, 68-101.
[15] Hardman, J., Paucar-Caceres, A., & Fielding, A. (2013). Predicting Students’ Progression in Higher Education by Using the Random Forest Algorithm. Systems Research and Behavioral Science, 30, 194-203.
http://dx.doi.org/10.1002/sres.2130
[16] Hastie, T., Tibshirani, R., & Friedman, J. (2009). The Elements of Statistical Learning: Data Mining, Inference and Prediction (2nd ed.). New York: Springer.
http://dx.doi.org/10.1007/978-0-387-84858-7
[17] Honarkhah, M., & Caers, J. (2010). Stochastic Simulation of Patterns Using Distance-Based Pattern Modeling. Mathematical Geosciences, 42, 487-517.
http://dx.doi.org/10.1007/s11004-010-9276-7
[18] James, G., Witten, D., Hastie, T., & Tibshirani, R. (2013). An Introduction to Statistical Learning with Applications in R. New York: Springer.
http://dx.doi.org/10.1007/978-1-4614-7138-7
[19] Kalna, G., & Higham, D. J. (2007). A Clustering Coefficient for Weighted Networks, with Application to Gene Expression Data. Journal of AI Communications-Network Analysis in Natural Sciences and Engineering, 20, 263-271.
[20] Kuhn, M., & Johnson, K. (2013). Applied Predictive Modeling. New York: Springer.
http://dx.doi.org/10.1007/978-1-4614-6849-3
[21] Lemon, J. (2006). Plotrix: A Package in the Red Light District of R. R-News, 6, 8-12.
[22] Liaw, A., & Wiener, M. (2012). Random Forest: Breiman and Cutler’s Random Forests for Classification and Regression. R Package Version 4.6-7.
[23] Mangasarian, O. L., & Wolberg, W. H. (1990). Cancer Diagnosis via Linear Programming. SIAM News, 23, 1-18.
[24] Mangasarian, O. L., Setiono, R., & Wolberg, W. H. (1990). Pattern Recognition via Linear Programming: Theory and Application to Medical Diagnosis. In T. F. Coleman, & Y. Y. Li (Eds.), Large-Scale Numerical Optimization (pp. 22-30). Philadelphia, PA: SIAM Publications.
[25] Onnela, J. P., Saramaki, J., Kertesz, J., & Kaski, K. (2005). Intensity and Coherence of Motifs in Weighted Complex Networks. Physical Review E, 71, Article ID: 065103.
http://dx.doi.org/10.1103/PhysRevE.71.065103
[26] Quach, A. T. (2012). Interactive Random Forests Plots. All Graduate Plan B and Other Reports, Paper 134, Utah State University.
[27] R Development Core Team (2011). R: A Language and Environment for Statistical Computing. Vienna: R Foundation for Statistical Computing.
http://www.R-project.org
[28] Seni, G., & Elder, J. F. (2010). Ensemble Methods in Data Mining: Improving Accuracy through Combining Predictions. Morgan & Claypool Publishers.
http://dx.doi.org/10.2200/S00240ED1V01Y200912DMK002
[29] Skogli, E., Teicher, M. H., Andersen, P., Hovik, K., & Øie, M. (2013). ADHD in Girls and Boys—Gender Differences in Co-Existing Symptoms and Executive Function Measures. BMC Psychiatry, 13, 298.
http://dx.doi.org/10.1186/1471-244X-13-298
[30] Steincke, K. K. (1948). Farvel og tak: Også en Tilværelse IV. København: Fremad.
[31] Venables, W. N., & Ripley, B. D. (2002). Modern Applied Statistics with S (4th ed.). New York: Springer.
http://dx.doi.org/10.1007/978-0-387-21706-2
[32] Wickham, H., Caragea, D., & Cook, D. (2006). Exploring High-Dimensional Classification Boundaries. Proceedings of the 38th Symposium on the Interface of Statistics, Computing Science, and Applications—Interface 2006: Massive Data Sets and Streams, Pasadena, May 24-27 2006.
[33] Wolberg, W. H., & Mangasarian, O. L. (1990) Multisurface Method of Pattern Separation for Medical Diagnosis Applied to Breast Cytology. Proceedings of the National Academy of Sciences of the United States of America, 87, 9193-9196.
[34] Zhang, B., & Horvath, S. (2005). A General Framework for Weighted Gene Co-Expression Network Analysis. Statistical Applications in Genetics and Molecular Biology, 4.
http://dx.doi.org/10.2202/1544-6115.1128

Predicting Academic Achievement of High-School Students Using Machine Learning

Read full paper at:

http://www.scirp.org/journal/PaperInformation.aspx?PaperID=51702

ABSTRACT

The present paper presents a relatively new non-linear method to predict the academic achievement of high school students, integrating the fields of psychometrics and machine learning. A sample composed of 135 high-school students (10th grade, 50.34% boys), aged between 14 and 19 years old (M = 15.44, SD = 1.09), answered three psychological instruments: the Inductive Reasoning Developmental Test (TDRI), the Metacognitive Control Test (TCM) and the Brazilian Learning Approaches Scale (BLAS-Deep Approach). The first two tests have a self-appraisal scale attached, so there are five independent variables in total. The students’ responses to each test/scale were analyzed using the Rasch model. A subset of the original sample was created in order to separate the students into two balanced classes, high achievement (n = 41) and low achievement (n = 47), using grades from nine school subjects. In order to predict class membership, a non-linear machine learning model named Random Forest was used. The subset with the two classes was randomly split into two sets (training and testing) for cross-validation. The Random Forest achieved an overall accuracy of 75%, a specificity of 73.69% and a sensitivity of 68% in the training set. In the testing set, the overall accuracy was 68.18%, with a specificity of 63.63% and a sensitivity of 72.72%. The most important variable in the prediction was the TDRI. Finally, implications of the present study for the field of educational psychology are discussed.
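
The evaluation pipeline can be sketched as follows, with synthetic data standing in for the five Rasch-scaled predictors (the TDRI, TCM, and BLAS scores are not reproduced here, so the numbers below will not match the paper’s); the sketch fits a random forest on a training split and reports the overall accuracy, sensitivity, and specificity quoted above.

```python
# Minimal sketch of the evaluation described above, with synthetic data in
# place of the five Rasch-scaled psychometric predictors (an assumption).
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split
from sklearn.metrics import confusion_matrix

rng = np.random.default_rng(0)
X = rng.normal(size=(88, 5))       # 88 students, 5 predictors (TDRI, TCM, ...)
y = rng.integers(0, 2, size=88)    # 1 = high achievement, 0 = low achievement

X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.25, random_state=0)
rf = RandomForestClassifier(n_estimators=1000, random_state=0).fit(X_tr, y_tr)

tn, fp, fn, tp = confusion_matrix(y_te, rf.predict(X_te)).ravel()
print("accuracy:   ", (tp + tn) / (tp + tn + fp + fn))
print("sensitivity:", tp / (tp + fn))    # true positive rate
print("specificity:", tn / (tn + fp))    # true negative rate
```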

Cite this paper

Golino, H. , Gomes, C. & Andrade, D. (2014). Predicting Academic Achievement of High-School Students Using Machine Learning. Psychology, 5, 2046-2057. doi: 10.4236/psych.2014.518207.

References

[1] Baca-Garcia, E., Perez-Rodriguez, M., Saiz-Gonzalez, D., Basurte-Villamor, I., Saiz-Ruiz, J., Leiva-Murillo, J. M., & de Leon, J. (2007). Variables Associated with Familial Suicide Attempts in a Sample of Suicide Attempters. Progress in Neuro-Psychopharmacology & Biological Psychiatry, 31, 1312-1316.
http://dx.doi.org/10.1016/j.pnpbp.2007.05.019
[2] Blanch, A., & Aluja, A. (2013). A Regression Tree of the Aptitudes, Personality, and Academic Performance Relationship. Personality and Individual Differences, 54, 703-708.
http://dx.doi.org/10.1016/j.paid.2012.11.032
[3] Breiman, L. (2001a). Random Forests. Machine Learning, 45, 5-32.
http://dx.doi.org/10.1023/A:1010933404324
[4] Breiman, L. (2001b). Bagging Predictors. Machine Learning, 24, 123-140.
http://dx.doi.org/10.1007/BF00058655
[5] Breiman, L., Friedman, J., Olshen, R., & Stone, C. (1984). Classification and Regression Trees. New York: Chapman & Hall.
[6] Cohen, J. (1988). Statistical Power Analysis for the Behavioral Sciences (2nd ed.), Hillsdale, NJ: Lawrence Erlbaum Associates.
[7] Commons, M. L., & Richards, F. A. (1984). Applying the General Stage Model. In M. L. Commons, F. A. Richards, & C. Armon (Eds.), Beyond Formal Operations. Late Adolescent and Adult Cognitive Development: Late Adolescent and Adult Cognitive Development (Vol. 1, pp. 141-157). New York: Praeger.
[8] Commons, M. L. (2008). Introduction to the Model of Hierarchical Complexity and Its Relationship to Postformal Action. World Futures, 64, 305-320.
http://dx.doi.org/10.1080/02604020802301105
[9] Commons, M. L., & Pekker, A. (2008). Presenting the Formal Theory of Hierarchical Complexity. World Futures, 64, 375-382.
http://dx.doi.org/10.1080/02604020802301204
[10] Cortez, P., & Silva, A. M. G. (2008). Using Data Mining to Predict Secondary School Student Performance. In A. Brito, & J. Teixeira (Eds.), Proceedings of 5th Annual Future Business Technology Conference, Porto, 5-12.
[11] Del Re, A. C. (2013). compute.es: Compute Effect Sizes. R Package Version 0.2-2.
http://cran.r-project.org/web/packages/compute.es
[12] Eloyan, A., Muschelli, J., Nebel, M., Liu, H., Han, F., Zhao, T., Caffo, B. et al. (2012). Automated Diagnoses of Attention Deficit Hyperactive Disorder Using Magnetic Resonance Imaging. Frontiers in Systems Neuroscience, 6, 61.
http://dx.doi.org/10.3389/fnsys.2012.00061
[13] Fischer, K. W. (1980). A Theory of Cognitive Development: The Control and Construction of Hierarchies of Skills. Psychological Review, 87, 477-531.
http://dx.doi.org/10.1037/0033-295X.87.6.477
[14] Fischer, K. W., & Yan, Z. (2002). The Development of Dynamic Skill Theory. In R. Lickliter, & D. Lewkowicz (Eds.), Conceptions of Development: Lessons from the Laboratory. Hove: Psychology Press.
[15] Flach, P. (2012). Machine Learning: The Art and Science of Algorithms That Make Sense of Data. Cambridge: Cambridge University Press.
http://dx.doi.org/10.1017/CBO9780511973000
[16] Frederick, S. (2005). Cognitive Reflection and Decision Making. Journal of Economic Perspectives, 19, 25-42.
http://dx.doi.org/10.1257/089533005775196732
[17] Geurts, P., Irrthum, A., & Wehenkel, L. (2009). Supervised Learning with Decision Tree-Based Methods in Computational and Systems Biology. Molecular BioSystems, 5, 1593-1605.
http://dx.doi.org/10.1039/b907946g
[18] Gibbons, R. D., Hooker, G., Finkelman, M. D., Weiss, D. J., Pilkonis, P. A., Frank, E., Moore, T., & Kupfer, D. J. (2013). The Computerized Adaptive Diagnostic Test for Major Depressive Disorder (CAD-MDD): A Screening Tool for Depression. Journal of Clinical Psychiatry, 74, 669-674.
http://dx.doi.org/10.4088/JCP.12m08338
[19] Golino, H. F., & Gomes, C. M. A. (2012). The Structural Validity of the Inductive Reasoning Developmental Test for the Measurement of Developmental Stages. In K. Stalne (Chair), Adult Development: Past, Present and New Agendas of Research, Symposium Conducted at the Meeting of the European Society for Research on Adult Development, Coimbra, 7-8 July 2012.
[20] Golino, H. F., & Gomes, C. M. A. (2013). Controlando pensamentos intuitivos: O que o pão de queijo e o café podem dizer sobre a forma como pensamos. In C. M. A. Gomes (Chair), Neuroeconomia e Neuromarketing, Symposium conducted at the VII Simpósio de Neurociências da Universidade Federal de Minas Gerais, Belo Horizonte.
[21] Golino, H. F., & Gomes, C. M. A. (2014). Four Machine Learning Methods to Predict Academic Achievement of College Students: A Comparison Study. Revista E-PSI, 4, 68-101.
[22] Gomes, C. M. A., & Golino, H. F. (2009). Estudo exploratório sobre o Teste de Desenvolvimento do Raciocínio Indutivo (TDRI). In D. Colinvaux (Ed.), Anais do VII Congresso Brasileiro de Psicologia do Desenvolvimento: Desenvolvimento e Direitos Humanos (pp. 77-79). Rio de Janeiro: UERJ.
http://www.abpd.psc.br/files/congressosAnteriores/AnaisVIICBPD.pdf
[23] Gomes, C. M. A. (2010). Perfis de estudantes e a relação entre abordagens de aprendizagem e rendimento escolar. Psico, 41, 503-509.
[24] Gomes, C. M. A., & Golino, H. F. (2012). Validade incremental da Escala de Abordagens de Aprendizagem. Psicologia: Reflexão e Crítica, 25, 623-633.
http://dx.doi.org/10.1590/S0102-79722012000400001
[25] Gomes, C. M. A., Golino, H. F., Pinheiro, C. A. R., Miranda, G. R., & Soares, J. M. T. (2011). Validação da Escala de Abordagens de Aprendizagem (EABAP) em uma amostra brasileira. Psicologia: Reflexão e Crítica, 24, 19-27.
http://dx.doi.org/10.1590/S0102-79722011000100004
[26] Hardman, J., Paucar-Caceres, A., & Fielding, A. (2013). Predicting Students’ Progression in Higher Education by Using the Random Forest Algorithm. Systems Research and Behavioral Science, 30, 194-203.
http://dx.doi.org/10.1002/sres.2130
[27] Hastie, T., Tibshirani, R., & Friedman, J. (2009). The Elements of Statistical Learning: Data Mining, Inference and Prediction (2nd ed.). New York: Springer.
http://dx.doi.org/10.1007/978-0-387-84858-7
[28] James, G., Witten, D., Hastie, T., & Tibshirani, R. (2013). An Introduction to Statistical Learning with Applications in R. New York: Springer.
http://dx.doi.org/10.1007/978-1-4614-7138-7
[29] Kuroki, Y., & Tilley, J. L. (2012). Recursive Partitioning Analysis of Lifetime Suicidal Behaviors in Asian Americans. Asian American Journal of Psychology, 3, 17-28.
http://dx.doi.org/10.1037/a0026586
[30] Liaw, A., & Wiener, M. (2012). Random Forest: Breiman and Cutler’s Random Forests for Classification and Regression. R Package Version 4.6-7.
http://cran.r-project.org/web/packages/randomForest/
[31] Linacre, J. M. (2012). Winsteps® Rasch Measurement Computer Program. Beaverton, OR: Winsteps.com.
[32] McGraw, K. O., & Wong, S. P. (1992). A Common Language Effect Size Statistic. Psychological Bulletin, 111, 361-365.
http://dx.doi.org/10.1037/0033-2909.111.2.361
[33] Scott, S. B., Jackson, B. R., & Bergeman, C. S. (2011). What Contributes to Perceived Stress in Later Life? A Recursive Partitioning Approach. Psychology and Aging, 26, 830-843.
http://dx.doi.org/10.1037/a0023180
[34] Skogli, E., Teicher, M. H., Andersen, P., Hovik, K., & Øie, M. (2013). ADHD in Girls and Boys—Gender Differences in Co-Existing Symptoms and Executive Function Measures. BMC Psychiatry, 13, 298.
http://dx.doi.org/10.1186/1471-244X-13-298
[35] Tian, F., Gao, P., Li, L., Zhang, W., Liang, H., Qian, Y., & Zhao, R. (2014). Recognizing and Regulating e-Learners’ Emotions Based on Interactive Chinese Texts in e-Learning Systems. Knowledge-Based Systems, 55, 148-164.
http://dx.doi.org/10.1016/j.knosys.2013.10.019
[36] van der Wal, C., & Kowalczyk, W. (2013). Detecting Changing Emotions in Human Speech by Machine and Humans. Applied Intelligence, 39, 675-691.
http://dx.doi.org/10.1007/s10489-013-0449-1

Detecting Design Patterns in Object-Oriented Program Source Code by Using Metrics and Machine Learning

Read full paper at:

http://www.scirp.org/journal/PaperInformation.aspx?PaperID=51394

ABSTRACT

Detecting well-known design patterns in object-oriented program source code can help maintainers understand the design of a program. Through such detection, the understandability, maintainability, and reusability of object-oriented programs can be improved. There are automated detection techniques; however, many existing techniques are based on static analysis and use strict conditions defined over class structure data. Hence, it is difficult for them to detect and distinguish design patterns in which the class structures are similar. Moreover, it is difficult for them to deal with the diversity of design pattern applications. To solve these problems in existing techniques, we propose a design pattern detection technique using source code metrics and machine learning. Our technique judges candidates for the roles that compose design patterns by using machine learning and measurements of several metrics, and it detects design patterns by analyzing the relations between candidates. It suppresses false negatives and distinguishes patterns in which the class structures are similar. As a result of experimental evaluations with a set of programs, we confirmed that our technique is more accurate than two conventional techniques.
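
A highly simplified sketch of the first stage follows, with hypothetical metric vectors and role labels, and scikit-learn’s MLP standing in for the paper’s neural network: a classifier judges each class’s candidacy for a pattern role from its metric measurements. The second, structural stage that analyzes relations between candidates is omitted.

```python
# Illustrative sketch of the role-judgment stage only (hypothetical metric
# vectors and role labels; the paper trains a neural network over software
# metrics). The second stage, checking structural relations between role
# candidates, is omitted here.
from sklearn.neural_network import MLPClassifier

# metrics per class: [num_methods, num_fields, num_children, has_static_self]
train_X = [[3, 1, 0, 1], [12, 4, 3, 0], [2, 0, 5, 0], [8, 2, 0, 0]]
train_y = ["Singleton", "Context", "AbstractState", "Other"]   # assumed roles

clf = MLPClassifier(hidden_layer_sizes=(8,), max_iter=5000,
                    random_state=0).fit(train_X, train_y)

candidate = [[4, 1, 0, 1]]        # metrics measured on a class to be judged
print(clf.predict(candidate))     # role candidacy for the relation analysis
```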

Cite this paper

Uchiyama, S. , Kubo, A. , Washizaki, H. and Fukazawa, Y. (2014) Detecting Design Patterns in Object-Oriented Program Source Code by Using Metrics and Machine Learning. Journal of Software Engineering and Applications, 7, 983-998. doi: 10.4236/jsea.2014.712086.

References

[1] Gamma, E., Helm, R., Johnson, R. and Vlissides, J. (1994) Design Patterns: Elements of Reusable Object-Oriented Software. Addison-Wesley, Boston.
[2] Lorenz, M. and Kidd, J. (1994) Object-Oriented Software Metrics. Prentice Hall, Upper Saddle River.
[3] Uchiyama, S., Kubo, A., Washizaki, H. and Fukazawa, Y. (2011) Design Pattern Detection Using Software Metrics and Machine Learning. Proceedings of the 5th International Workshop on Software Quality and Maintainability, Oldenburg, 1 March 2011, 38-47.
[4] Tsantalis, N., Chatzigeorgiou, A., Stephanides, G. and Halkidis, S. (2006) Design Pattern Detection Using Similarity Scoring. IEEE Transactions on Software Engineering, 32, 896-909.
http://dx.doi.org/10.1109/TSE.2006.112
[5] Blewitt, A., Bundy, A. and Stark, L. (2005) Automatic Verification of Design Patterns in Java. Proceedings of the 20th International Conference on Automated Software Engineering, Long Beach, 7-11 November 2005, 224-232.
[6] Kim, H. and Boldyreff, C. (2000) A Method to Recover Design Patterns Using Software Product Metrics. In: Proceedings of the 6th International Conference on Software Reuse: Advances in Software Reusability, Vienna, 27-29 June 2000, 318-335.
[7] Ferenc, R., Beszedes, A., Fulop, L. and Lele, J. (2005) Design Pattern Mining Enhanced by Machine Learning. Proceedings of the 21st IEEE International Conference on Software Maintenance, Budapest, 26-29 September 2005, 295-304. http://dx.doi.org/10.1109/ICSM.2005.40
[8] Washizaki, H., Fukaya, K., Kubo, A. and Fukazawa, Y. (2009) Detecting Design Patterns Using Source Code of before Applying Design Patterns. In: Proceedings of the 8th IEEE/ACIS International Conference on Computer and Information Science, Shanghai, 1-3 June 2009, 933-938.
[9] Shi, N. and Olsson, R.A. (2006) Reverse Engineering of Design Patterns from Java Source Code. In: Proceedings of the 21st IEEE/ACM International Conference on Automated Software Engineering, Tokyo, 18-22 September 2006, 123-134. http://dx.doi.org/10.1109/ASE.2006.57
[10] Lee, H., Youn, H. and Lee, S. (2007) Automatic Detection of Design Pattern for Reverse Engineering. In: Proceedings of the 5th ACIS International Conference on Software Engineering Research, Management and Applications, Busan, 20-22 August 2007, 577-583.
[11] Wendehals, L. and Orso, A. (2006) Recognizing Behavioral Patterns at Runtime Using Finite Automata. Proceedings of the 4th ICSE 2006 Workshop on Dynamic Analysis, Shanghai, 26 May 2006, 33-40. http://dx.doi.org/10.1145/1138912.1138920
[12] Hayashi, S., Katada, J., Sakamoto, R., Kobayashi, T. and Saeki, M. (2008) Design Pattern Detection by Using Meta Patterns. IEICE Transactions on Information and Systems, 91-D, 933-944.
http://dx.doi.org/10.1093/ietisy/e91-d.4.933
[13] Lucia, A., Deufemia, V., Gravino, C. and Risi, M. (2009) Design Pattern Recovery through Visual Language Parsing and Source Code Analysis. Journal of Systems and Software, 82, 1177-1193.
http://dx.doi.org/10.1016/j.jss.2009.02.012
[14] Guéhéneuc, Y. and Antoniol, G. (2008) DeMIMA: A Multilayered Approach for Design Pattern Identification. IEEE Transactions on Software Engineering, 34, 667-684.
http://dx.doi.org/10.1109/TSE.2008.48
[15] Dietrich, J. and Elgar, C. (2007) Towards a Web of Patterns. Journal of Web Semantics, 5, 108-116. http://dx.doi.org/10.1016/j.websem.2006.11.007
[16] Basili, V.R. and Weiss, D.M. (1984) A Methodology for Collecting Valid Software Engineering Data. IEEE Transactions on Software Engineering, 10, 728-738.
http://dx.doi.org/10.1109/TSE.1984.5010301
[17] Segaran, T. (2007) Programming Collective Intelligence. O’Reilly, Sebastopol.
[18] Hirano, H. (2008) Neural Network Implemented with C++ and Java. Personal Media, Tokyo.
[19] Kurita, T. (1990) Deciding Unit Number of Hidden Layer in Three-Layer-Neural Network by Using Information Criteria. IEICE Transactions on Information and Systems, 73, 1872-1878.
[20] Yuki, H. (2014) An Introduction to Design Patterns to Study by Java. http://www.hyuki.com/dp/
[21] Tanaka, H. (2010) Hello World with Java! http://web.archive.org/web/20100808072152/
[22] Oracle. Oracle Technology Network for Java Developers.
http://www.oracle.com/technetwork/java/index.html
[23] JUnit.org. Resources for Test Driven Development. http://www.junit.org/
[24] SpringSource.org. Spring Source. http://www.springsource.org/

The Computational Theory of Intelligence: Information Entropy

Read full paper at:

http://www.scirp.org/journal/PaperInformation.aspx?PaperID=50204

Author(s)

Kovach, D.

ABSTRACT

This paper presents an information theoretic approach to the concept of intelligence in the computational sense. We introduce a probabilistic framework from which computational intelligence is shown to be an entropy minimizing process at the local level. Using this new scheme, we develop a simple data-driven clustering example and discuss its applications.
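
In the spirit of the paper’s data-driven example, though not its exact algorithm, the following Python sketch computes the Shannon entropy of a labeling and produces one with a nearest-center clustering pass; the data and seeding choices are illustrative assumptions.

```python
# Toy data-driven sketch in the spirit of the paper (not its algorithm):
# Shannon entropy of a labeling, plus a nearest-center clustering pass as
# a simple proxy for locally reducing uncertainty about the data.
import math
from collections import Counter

def entropy(labels):
    # H = -sum p * log2(p) over the empirical label distribution
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())

def cluster(points):
    centers = [points[0], points[-1]]     # two seed centers (assumption)
    return [min((0, 1), key=lambda i: abs(p - centers[i])) for p in points]

data = [0.1, 0.2, 0.15, 5.0, 5.2, 4.9]
labels = cluster(data)
print(labels, "H =", round(entropy(labels), 3))   # balanced split: H = 1.0
```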

Cite this paper

Kovach, D. (2014) The Computational Theory of Intelligence: Information Entropy. International Journal of Modern Nonlinear Theory and Application, 3, 182-190. doi: 10.4236/ijmnta.2014.34020.

References

[1] Wechsler, D. and Matarazzo, J.D. (1972) Wechsler’s Measurement and Appraisal of Adult Intelligence. Oxford University Press, New York.
[2] Gardner, H. (1993) Frames of Mind: The Theory of Multiple Intelligences. Basic Books, New York.
[3] Sternberg, R.J. (1982) Handbook of Human Intelligence. Cambridge University Press, Cambridge.
[4] Hawkins, J. and Blakeslee, S. (2004) On Intelligence. Times Books, New York.
[5] Ihara, S. (1993) Information Theory for Continuous Systems. World Scientific, Singapore.
[6] Schroeder, D.V. (2000) An Introduction to Thermal Physics. Addison Wesley, San Francisco.
[7] Penrose, R. (2011) Cycles of Time: An Extraordinary New View of the Universe. Alfred A. Knopf, New York.
[8] Hawking, S.W. (1998) A Brief History of Time. Bantam, New York.
[9] Jones, M.T. (2008) Artificial Intelligence: A Systems Approach. Infinity Science Press, Hingham.
[10] Russell, S.J. and Norvig, P. (2003) Artificial Intelligence: A Modern Approach. Prentice Hall/Pearson Education, Upper Saddle River.
[11] (2013) Download Python. Accessed 17 August 2013. http://www.python.org/getit
[12] Marr, D. and Poggio, T. (1979) A Computational Theory of Human Stereo Vision. Proceedings of the Royal Society B: Biological Sciences, 204, 301-328. http://dx.doi.org/10.1098/rspb.1979.0029
[13] Meyer, D.E. and Kieras, D.E. (1997) A Computational Theory of Executive Cognitive Processes and Multiple-Task Performance: Part I. Basic Mechanisms. Psychological Review, 104, 3-65.
[14] Wissner-Gross, A. and Freer, C. (2013) Causal Entropic Forces. Physical Review Letters, 110, Article ID: 168702.
http://dx.doi.org/10.1103/PhysRevLett.110.168702
[15] (2013) Entropica. Accessed 17 August 2013. http://www.entropica.com

A Fully Bayesian Sparse Probit Model for Text Categorization

Read full paper at:

http://www.scirp.org/journal/PaperInformation.aspx?PaperID=49944

ABSTRACT

Nowadays, a common problem when processing data sets with a large number of covariates compared to small sample sizes (fat data sets) is to estimate the parameters associated with each covariate. When the number of covariates far exceeds the number of samples, parameter estimation becomes very difficult. Researchers in many fields, such as text categorization, deal with the burden of finding and estimating important covariates without overfitting the model. In this study, we developed a Sparse Probit Bayesian Model (SPBM) based on Gibbs sampling which utilizes a double exponential prior to induce shrinkage and reduce the number of covariates in the model. The method was evaluated using ten domains, such as mathematics, the corpora of which were downloaded from Wikipedia. From the downloaded corpora, we created the TF-IDF matrix corresponding to all domains and divided the whole data set randomly into training and testing groups of size 300. To make the model more robust, we performed 50 re-samplings of the training and test group selection. The model was implemented in R, and the Gibbs sampler ran for 60,000 iterations, with the first 20,000 discarded as burn-in. We performed classification on the training and test groups by calculating P(yi = 1); following [1] [2], a threshold of 0.5 was used as the decision rule. Our model’s performance was compared to Support Vector Machines (SVM) using average sensitivity and specificity across the 50 runs. The SPBM achieved high classification accuracy and outperformed SVM in almost all domains analyzed.
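
For readers who want a feel for the machinery, here is a bare-bones sketch of an Albert-Chib style probit Gibbs sampler (reference [34] below) with the double exponential prior expressed through its scale-mixture-of-normals (Bayesian lasso) representation. It is a simplified stand-in for SPBM, not the authors’ R code, and the shrinkage parameter lam2 is held fixed rather than given a hyperprior.

```python
# Simplified probit Gibbs sampler with a Laplace (double exponential) prior
# via its normal scale-mixture form (Bayesian lasso); a sketch, not the
# paper's SPBM. lam2 is a fixed shrinkage parameter (an assumption).
import numpy as np
from scipy.stats import truncnorm

def gibbs_probit_lasso(X, y, lam2=1.0, iters=60000, burn=20000, seed=0):
    rng = np.random.default_rng(seed)
    n, p = X.shape
    beta, tau2 = np.zeros(p), np.ones(p)
    draws = []
    for it in range(iters):
        # 1) latent utilities: z_i ~ N(x_i'beta, 1), truncated by the label
        mu = X @ beta
        lo = np.where(y == 1, -mu, -np.inf)    # z - mu > -mu  <=>  z > 0
        hi = np.where(y == 1, np.inf, -mu)     # z - mu < -mu  <=>  z < 0
        z = mu + truncnorm.rvs(lo, hi, random_state=rng)
        # 2) coefficients: conjugate normal update given z and tau2
        cov = np.linalg.inv(X.T @ X + np.diag(1.0 / tau2))
        beta = rng.multivariate_normal(cov @ X.T @ z, cov)
        # 3) shrinkage: 1/tau2_j is inverse Gaussian under the Laplace prior
        tau2 = 1.0 / rng.wald(np.sqrt(lam2 / beta ** 2), lam2)
        if it >= burn:
            draws.append(beta)
    return np.asarray(draws).mean(axis=0)      # posterior mean of beta

# Classification: P(y_i = 1) = Phi(x_i'beta); threshold at 0.5 as above.
```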

Cite this paper

Madahian, B. and Faghihi, U. (2014) A Fully Bayesian Sparse Probit Model for Text Categorization. Open Journal of Statistics, 4, 611-619. doi: 10.4236/ojs.2014.48057.
References

[1] Pike, M., et al. (1980) Bias and Efficiency in Logistic Analyses of Stratified Case-Control Studies. International Journal of Epidemiology, 9, 89-95.
http://dx.doi.org/10.1093/ije/9.1.89
[2] Genkin, A., Lewis, D.D. and Madigan, D. (2007) Large-Scale Bayesian Logistic Regression for Text Categorization. Technometrics, 49, 291-304.
http://dx.doi.org/10.1198/004017007000000245
[3] Cao, J. and Zhang, S. (2010) Measuring Statistical Significance for Full Bayesian Methods in Microarray Analyses. Bayesian Analysis, 5, 413-427.
http://dx.doi.org/10.1214/10-BA608
[4] Li, J., et al. (2011) The Bayesian Lasso for Genome-Wide Association Studies. Bioinformatics, 27, 516-523.
http://dx.doi.org/10.1093/bioinformatics/btq688
[5] Bae, K. and Mallick, B.K. (2004) Gene Selection Using a Two-Level Hierarchical Bayesian Model. Bioinformatics, 20, 3423-3430.
http://dx.doi.org/10.1093/bioinformatics/bth419
[6] Tibshirani, R. (1996) Regression Shrinkage and Selection via the Lasso. Journal of the Royal Statistical Society Series B, 58, 267-288.
[7] Madahian, B., Deng, L.Y. and Homayouni, R. (2014) Application of Sparse Bayesian Generalized Linear Model to Gene Expression Data for Classification of Prostate Cancer Subtypes. Open Journal of Statistics, 4, 518-526.
http://dx.doi.org/10.4236/ojs.2014.47049
[8] Wu, T.T., et al. (2009) Genome-Wide Association Analysis by Lasso Penalized Logistic Regression. Bioinformatics, 25, 714-721.
http://dx.doi.org/10.1093/bioinformatics/btp041
[9] Yang, J., et al. (2010) Common SNPs Explain a Large Proportion of the Heritability for Human Height. Nature Reviews Genetics, 42, 565-569.
http://dx.doi.org/10.1038/ng.608
[10] Madsen, H. and Thyregod, P. (2011) Introduction to General and Generalized Linear Models. Chapman & Hall/CRC, Boca Raton.
[11] Gelfand, A. and Smith, A.F.M. (1990) Sampling-Based Approaches to Calculating Marginal Densities. Journal of the American Statistical Association, 85, 398-409.
http://dx.doi.org/10.1080/01621459.1990.10476213
[12] Gilks, W.R., Richardson, S. and Spiegelhalter, D. (1995) Markov Chain Monte Carlo in Practice. Chapman and Hall/CRC, London.
[13] Leopold, E. and Kindermann, J. (2002) Text Categorization with Support Vector Machines. How to Represent Texts in Input Space? Machine Learning, 46, 423-444.
http://dx.doi.org/10.1023/A:1012491419635
[14] Kim, H., Howland, P. and Park, H. (2005) Dimension Reduction in Text Classification with Support Vector Machines. Journal of Machine Learning Research, 6, 37-53.
[15] Joachims, T. (1998) Text Categorization with Support Vector Machines: Learning with Many Relevant Features. Springer, Berlin Heidelberg.
[16] Guyon, I., et al. (2002) Gene Selection for Cancer Classification Using Support Vector Machines. Machine Learning, 46, 389-422.
http://dx.doi.org/10.1023/A:1012487302797
[17] Weston, J., et al. (2002) Feature Selection for SVMs. Advances in Neural Information Processing Systems. MIT Press, Cambridge.
[18] Blei, D.M. (2012) Probabilistic Topic Models. Communications of the ACM, 55, 77-84.
http://dx.doi.org/10.1145/2133806.2133826
[19] Blei, D.M., Ng, A.Y. and Jordan, M.I. (2003) Latent Dirichlet Allocation. The Journal of Machine Learning Research, 3, 993-1022.
[20] Schmidt, B. (2013) Sapping Attention: Keeping the Words in Topic Models.
http://sappingattention.blogspot.com/2013/01/keeping-words-in-topic-models.html
[21] Weingart, S.B. (2012) Topic Modeling for Humanists: A Guided Tour.
http://www.scottbot.net/HIAL/?p=19113
[22] Wedderburn, R.W.M. (1974) Quasi-Likelihood Functions, Generalized Linear Models, and the Gauss-Newton Method. Biometrika, 61, 439-447.
[23] Jennrich, R.I. and Sampson, P.F. (1976) Newton-Raphson and Related Algorithms for Maximum Likelihood Variance Component Estimation. Technometrics, 18, 11-17.
http://dx.doi.org/10.2307/1267911
[24] Hastie, T., Tibshirani, R. and Friedman, J. (2009) Linear Methods for Regression. Springer, New York.
[25] Hoerl, A.E. and Kennard, R.W. (1970) Ridge Regression: Biased Estimation for Nonorthogonal Problems. Technometrics, 12, 55-67.
http://dx.doi.org/10.1080/00401706.1970.10488634
[26] Li, Z. and Sillanpää, M.J. (2012) Overview of LASSO-Related Penalized Regression Methods for Quantitative Trait Mapping and Genomic Selection. Theoretical and Applied Genetics, 125, 419-435.
http://dx.doi.org/10.1007/s00122-012-1892-9
[27] Knight, K. and Fu, W. (2000) Asymptotics for Lasso-Type Estimators. The Annals of Statistics, 28, 1356-1378.
http://dx.doi.org/10.1214/aos/1015957397
[28] Yuan, M. and Lin, Y. (2005) Efficient Empirical Bayes Variable Selection and Estimation in Linear Models. Journal of the American Statistical Association, 100, 1215-1225.
http://dx.doi.org/10.1198/016214505000000367
[29] Zou, H. (2006) The Adaptive Lasso and Its Oracle Properties. Journal of the American Statistical Association, 101, 1418-1429.
http://dx.doi.org/10.1198/016214506000000735
[30] Zou, H. and Li, R. (2008) One-Step Sparse Estimates in Non-Concave Penalized Likelihood Models. The Annals of Statistics, 36, 1509-1533.
http://dx.doi.org/10.1214/009053607000000802
[31] Park, T. and Casella, G. (2008) The Bayesian Lasso. Journal of the American Statistical Association, 103, 681-686.
http://dx.doi.org/10.1198/016214508000000337
[32] Hans, C. (2009) Bayesian Lasso Regression. Biometrika, 96, 835-845.
http://dx.doi.org/10.1093/biomet/asp047
[33] Griffin, J.E. and Brown, P.J. (2007) Bayesian Adaptive Lassos with Non-Convex Penalization. Technical Report, IMSAS, University of Kent, Canterbury.
[34] Albert, J. and Chib, S. (1993) Bayesian Analysis of Binary and Polychotomous Response Data. Journal of the American Statistical Association, 88, 669-679.
http://dx.doi.org/10.1080/01621459.1993.10476321
[35] Bae, K. and Mallick, B.K. (2004) Gene Selection Using a Two-Level Hierarchical Bayesian Model. Bioinformatics, 20, 3423-3430.
http://dx.doi.org/10.1093/bioinformatics/bth419
[36] Chen, J., et al. (2006) Decision Threshold Adjustment in Class Prediction. SAR and QSAR in Environmental Research, 17, 337-352.
http://dx.doi.org/10.1080/10659360600787700
[37] Altman, D.G. and Bland, J.M. (1994) Diagnostic Tests 1: Sensitivity and Specificity. British Medical Journal, 308, 1552.
http://dx.doi.org/10.1136/bmj.308.6943.1552
[38] Karatzoglou, A., Meyer, D. and Hornik, K. (2005) Support Vector Machines in R. Journal of Statistical Software, 15, 1-28.
[39] Karatzoglou, A., et al. (2004) Kernlab—An S4 Package for Kernel Methods in R. Journal of Statistical Software, 11, 1-20.
[40] Williams, P.M. (1995) Bayesian Regularization and Pruning Using a Laplace Prior. Neural Computation, 7, 117-143.
http://dx.doi.org/10.1162/neco.1995.7.1.117