Tuesday, April 25, 2023
Sequences, series, Zeno's Paradox and Donald Duck
Friday, April 14, 2023
Cross Validation lecture for joint BYU-City Tech undergraduate data science seminar
Today I delivered a lecture on cross-validation and the bootstrap, resampling methods used to estimate a machine learning model's test error. I based my lecture on the resampling chapter of "An Introduction to Statistical Learning: with Applications in R" by Gareth James, Daniela Witten, Trevor Hastie, and Robert Tibshirani (2013).
During the lecture, we first discussed the bias-variance trade-off and how the training error becomes an increasingly poor estimate of the test error as the model's complexity grows. We then explored the performance of various ways of splitting the data, such as the validation-set approach (splitting the data into two equal parts) and leave-one-out splitting.
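The two splitting strategies above can be sketched in a few lines. This is a minimal illustration of my own (not the lecture's code), assuming a simple one-dimensional least-squares line as the model; the helper names `fit_line` and `mse` are mine:

```python
import random

def fit_line(xs, ys):
    # Ordinary least squares for y = a + b*x (closed-form slope/intercept).
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    b = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sum((x - mx) ** 2 for x in xs)
    return my - b * mx, b

def mse(a, b, xs, ys):
    # Mean squared error of the fitted line on (xs, ys).
    return sum((y - (a + b * x)) ** 2 for x, y in zip(xs, ys)) / len(xs)

random.seed(0)
xs = [random.uniform(0, 10) for _ in range(40)]
ys = [2 + 3 * x + random.gauss(0, 1) for x in xs]

# Validation-set approach: a single 50/50 split.
idx = list(range(len(xs)))
random.shuffle(idx)
half = len(idx) // 2
tr, va = idx[:half], idx[half:]
a, b = fit_line([xs[i] for i in tr], [ys[i] for i in tr])
val_mse = mse(a, b, [xs[i] for i in va], [ys[i] for i in va])

# Leave-one-out: n fits, each validated on the single held-out point.
sq_errs = []
for i in range(len(xs)):
    rest = [j for j in range(len(xs)) if j != i]
    a, b = fit_line([xs[j] for j in rest], [ys[j] for j in rest])
    sq_errs.append((ys[i] - (a + b * xs[i])) ** 2)
loo_mse = sum(sq_errs) / len(sq_errs)
print(round(val_mse, 2), round(loo_mse, 2))
```

With noise of unit variance, both estimates should land in the vicinity of 1; the validation-set estimate is noisier because it trains on only half the data.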
Next, we discussed K-fold cross-validation, which splits the data into K subsets of equal size and iterates K times, each time holding out one fold as the validation set and training on the remaining K-1 folds. The cross-validation score is the weighted average of the mean squared error (MSE) over the folds. We demonstrated an experiment with 10-fold MSE and noted that K = 5 or 10 is usually a good compromise in terms of the bias-variance trade-off.
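The K-fold procedure itself fits in one short function. A sketch of mine, assuming a trivial mean-of-y "model" just to keep it self-contained (the names `kfold_mse`, `fit`, and `predict` are not from the lecture); summing squared errors across folds and dividing by n is exactly the fold-size-weighted average of the per-fold MSEs:

```python
import random

def kfold_mse(xs, ys, k, fit, predict):
    # Shuffle indices once and split them into k (nearly) equal folds.
    idx = list(range(len(xs)))
    random.shuffle(idx)
    folds = [idx[i::k] for i in range(k)]
    total = 0.0
    for fold in folds:
        held = set(fold)
        tr = [i for i in idx if i not in held]
        model = fit([xs[i] for i in tr], [ys[i] for i in tr])
        # Summing squared errors (not averaging per fold) weights each
        # fold by its size; dividing by n below gives the weighted mean.
        total += sum((ys[i] - predict(model, xs[i])) ** 2 for i in fold)
    return total / len(xs)

# Stand-in model: predict the training mean of y, ignoring x.
fit = lambda xs, ys: sum(ys) / len(ys)
predict = lambda m, x: m

random.seed(1)
xs = [random.uniform(0, 1) for _ in range(50)]
ys = [5 + random.gauss(0, 1) for _ in xs]
print(round(kfold_mse(xs, ys, 5, fit, predict), 2))
```

Swapping in a real model only requires replacing the `fit`/`predict` pair; the splitting and scoring logic is unchanged.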
Finally, we discussed an example of two-class classification with 5000 predictors and 50 samples. The wrong way to perform K-fold CV is to carry out the first filtering step (screening the predictors) on the full data set and then cross-validate only the second step (fitting the model); the right way is to apply CV to both steps, repeating the predictor screening inside each fold. We stressed that the correlation between the K training sets feeds into the bias-variance trade-off and so into the optimal choice of K for the CV folds.
Overall, the lecture was well received, and colleagues and students asked many interesting questions. This topic comes up frequently in data science forums, and it is worth emphasizing both the bias-variance trade-off and the impact that the choice of data split has on the results of cross-validation.
Thursday, April 6, 2023
Functions before Spring break!
So here we are, a few days after our last class, in the middle of Spring break. I announced that I was pushing the exam to the Wednesday after Spring break, and I could sense the students' joy at the news.
This Monday, 4/3, we continued talking about functions. We stressed key definitions related to functions such as:
- domain
- codomain
- range
- image
- preimage
- one-to-one functions
- onto functions
- bijection
- inverse functions
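For finite functions, these definitions can be checked by brute force, which makes a nice sanity check. A small sketch of my own (not from the class), representing a function as a Python dictionary:

```python
def is_one_to_one(f, domain):
    # Injective: no two domain elements share an image.
    images = [f[x] for x in domain]
    return len(images) == len(set(images))

def is_onto(f, domain, codomain):
    # Surjective: the range (set of images) equals the codomain.
    return {f[x] for x in domain} == set(codomain)

def inverse(f, domain, codomain):
    # An inverse exists exactly when f is a bijection (one-to-one AND onto).
    if is_one_to_one(f, domain) and is_onto(f, domain, codomain):
        return {f[x]: x for x in domain}
    return None

f = {1: 'a', 2: 'b', 3: 'c'}  # f: {1,2,3} -> {'a','b','c'}
print(is_one_to_one(f, f.keys()))             # True
print(is_onto(f, f.keys(), {'a', 'b', 'c'}))  # True
print(inverse(f, f.keys(), {'a', 'b', 'c'}))  # {'a': 1, 'b': 2, 'c': 3}
```

Shrinking the codomain (e.g. to {'a', 'b', 'c', 'd'}) breaks surjectivity, and mapping two inputs to the same image breaks injectivity; either way `inverse` returns `None`.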
https://www.teacherspayteachers.com/Product/Spring-Break-Math-Practice-Packet-Common-Core-Aligned-622786