1. Analysis and algebra
(1) Walter Rudin, Principles of Mathematical Analysis
(2) Walter Rudin, Real and Complex Analysis
(3) Walter Rudin, Functional Analysis
(4) Peter Lax, Linear Algebra and Its Applications
(5) Roger A. Horn and Charles R. Johnson, Matrix Analysis
(6) Gene H. Golub and Charles F. Van Loan, Matrix Computations
(7) Lloyd N. Trefethen and David Bau III, Numerical Linear Algebra
2. Probability and statistics
(1) G. Casella and R. L. Berger, Statistical Inference
(2) A. Gelman et al., Bayesian Data Analysis
(3) C. Robert, The Bayesian Choice
(4) C. Robert and G. Casella, Monte Carlo Statistical Methods
(5) Patrick Billingsley, Probability and Measure
3. Optimization
(1) S. Boyd and L. Vandenberghe, Convex Optimization
(2) D. Bertsekas, Nonlinear Programming
4. Machine learning
(1) C. Bishop, Pattern Recognition and Machine Learning
(2) T. Hastie, R. Tibshirani, and J. Friedman, The Elements of Statistical Learning
(3) Daphne Koller and Nir Friedman, Probabilistic Graphical Models
(4) Ian Goodfellow, Yoshua Bengio, and Aaron Courville, Deep Learning