Course Webpage
Syllabus
If you are taking this class, you already have a solid grasp of ‘pen and paper’ physics. Modern physics, however, often requires tools beyond ‘pen and paper’ in two ways:
This is a crash course in learning to deal with data in physical systems and numerical solutions of problems using computers. This is neither a computer science course on algorithms nor a data-based lab course, but it is somewhere in between.
Students: please use GitHub Classroom. Use the “[submit]” link to create a private repository where you can submit homework.
Schedule
| Slot | Name | Topic |
|---|---|---|
| 1 | Julian | Binary Search |
| 2 | Gus | Stacks |
| 3 | Nick | Divide & Conquer |
| 4 | Bre | Quicksort |
| 5 | Steven | Hash Table |
| 6 | Miriam | Breadth First Graph |
| 7 | Cindy | Breadth First + Queue |
| 8 | Adam | Trading for a Piano |
| 9 | Van | Greedy Algorithm |
| 10 | Tiffany | Set Covering |
| 11 | Stephen | Traveling Salesperson |
| 12 | James | Dynamic Programming |
| 13 | Jeremy | Longest Substring |
| 14 | Dallas | K-Nearest Neighbors |
| 15 | Bran | Regression |
| 16 | Trevor | TBA |
| 17 | Quentin | TBA |
The latest grade sheet is available here (pdf) or here (web). The grades are anonymized and listed according to your assigned P177 student key number.
This week we settled into the class. Some students are having trouble registering, so if you’re not yet registered, please don’t wait until the last minute in case there are problems. (There’s something weird going on with the course registration system.)
Lecture 1: A soft introduction to the class. Why does “computational physics” exist, anyway? Is it different from “real physics”? The Golden Rule for this class: Professor Tanedo regrets that he cannot help you with technical computer problems. Please make time to get your system up and running—but once you’re in a Jupyter notebook, it should be smooth sailing. Index card: name / SID / year / programming background / anything I should know about you.
Lecture 2: Using GitHub. By the way, here’s an example of how best to look for resources online. The discussion on rounding error roughly follows the Python 3 Tutorial, section 15. More on representation of numbers:
Lecture 3: Basic integration using Riemann sums, installing packages.
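For flavor, the left-endpoint Riemann sum can be written in a few lines of Python. This is a minimal sketch (not the class notebook); the function name and test integrand are illustrative:

```python
import numpy as np

def riemann_sum(f, a, b, n):
    """Left-endpoint Riemann sum of f on [a, b] with n slices."""
    x = np.linspace(a, b, n, endpoint=False)  # left edge of each slice
    dx = (b - a) / n
    return np.sum(f(x)) * dx

# Integrate sin(x) from 0 to pi; the exact answer is 2.
approx = riemann_sum(np.sin, 0.0, np.pi, 10_000)
```

With 10,000 slices the answer agrees with the exact value to several decimal places; more slices buy more accuracy (up to a point, as we’ll see in Lecture 5).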
Lecture 4: Integration with trapezoids and parabolas.
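The trapezoidal rule replaces each slice with a straight line; Simpson’s rule fits a parabola through each pair of slices. A minimal sketch of both (function names are illustrative, not the class code):

```python
import numpy as np

def trapezoid(f, a, b, n):
    """Composite trapezoidal rule with n slices."""
    x = np.linspace(a, b, n + 1)
    y = f(x)
    dx = (b - a) / n
    return dx * (y[0] / 2 + y[1:-1].sum() + y[-1] / 2)

def simpson(f, a, b, n):
    """Composite Simpson's rule (parabolas); n must be even."""
    if n % 2:
        raise ValueError("n must be even")
    x = np.linspace(a, b, n + 1)
    y = f(x)
    dx = (b - a) / n
    # Endpoints once, odd interior points weighted 4, even ones weighted 2.
    return dx / 3 * (y[0] + 4 * y[1:-1:2].sum() + 2 * y[2:-1:2].sum() + y[-1])

# Test on exp(x) over [0, 1]; the exact value is e - 1.
t_est = trapezoid(np.exp, 0.0, 1.0, 100)
s_est = simpson(np.exp, 0.0, 1.0, 100)
```

At the same number of slices, Simpson’s rule is dramatically more accurate: its error falls like 1/n⁴ versus 1/n² for trapezoids.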
Note: there is NO CLASS on Tuesday, April 17. Homework is still due!
Lecture 5: Errors on integrals. What is a reasonable “maximum number of slices” when integrating? Comparing rounding error (numerical precision) to approximation error.
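The tradeoff can be seen numerically. Here is a quick sketch (illustrative, not the lecture code) of how the trapezoidal-rule approximation error shrinks as the number of slices grows, for exp(x) on [0, 1]:

```python
import numpy as np

exact = np.e - 1
errors = []
for n in (10, 100, 1000):
    x = np.linspace(0.0, 1.0, n + 1)
    y = np.exp(x)
    dx = 1.0 / n
    errors.append(abs(dx * (y[0] / 2 + y[1:-1].sum() + y[-1] / 2) - exact))
# Each factor of 10 in slices buys roughly 100x in accuracy (error ~ 1/n^2);
# this continues until the approximation error sinks below the ~1e-16
# floating-point rounding floor, after which more slices stop helping.
```

The “maximum reasonable number of slices” is roughly where the shrinking approximation error meets the rounding-error floor of double-precision arithmetic.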
Lecture 6: Ordinary differential equations. Runge-Kutta.
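The fourth-order Runge–Kutta step fits in a few lines. A minimal sketch (illustrative names; tested on dy/dt = −y, whose exact solution is exp(−t)):

```python
import numpy as np

def rk4_step(f, t, y, h):
    """One fourth-order Runge-Kutta step for dy/dt = f(t, y)."""
    k1 = f(t, y)
    k2 = f(t + h / 2, y + h / 2 * k1)
    k3 = f(t + h / 2, y + h / 2 * k2)
    k4 = f(t + h, y + h * k3)
    return y + h / 6 * (k1 + 2 * k2 + 2 * k3 + k4)

# Integrate dy/dt = -y from t = 0 to t = 1 with y(0) = 1.
f = lambda t, y: -y
y, h = 1.0, 0.01
for i in range(100):
    y = rk4_step(f, i * h, y, h)
```

With h = 0.01 the result matches exp(−1) to many decimal places; the global error of RK4 scales like h⁴.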
Lecture 7: A peek at RK4, a low-pass filter, and something rotten with energy conservation.
Note: there is NO CLASS on Tuesday, May 1st. Homework is still due!
Lecture 8: Leap-frog, boundary value problems, partial differential equations
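The leap-frog (kick–drift–kick) scheme is the cure for the energy-conservation problem from Lecture 7: it is symplectic, so the energy oscillates around the true value instead of drifting. A sketch (illustrative, tested on a simple harmonic oscillator):

```python
import numpy as np

def leapfrog(accel, x0, v0, dt, n_steps):
    """Kick-drift-kick leap-frog integration of d2x/dt2 = accel(x)."""
    x, v = x0, v0
    xs = [x]
    for _ in range(n_steps):
        v_half = v + 0.5 * dt * accel(x)   # half kick
        x = x + dt * v_half                # full drift
        v = v_half + 0.5 * dt * accel(x)   # half kick
        xs.append(x)
    return np.array(xs), v

# Simple harmonic oscillator: a(x) = -x, starting at x = 1 at rest.
xs, v = leapfrog(lambda x: -x, 1.0, 0.0, 0.01, 1000)
energy = 0.5 * v**2 + 0.5 * xs[-1] ** 2    # should stay near 0.5
```

After 1000 steps the energy is still 0.5 to within O(dt²), and the position tracks cos(t).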
Lecture 9: Partial differential equations, Monte Carlo
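The simplest Monte Carlo integrator averages the function at random points; the sample standard deviation gives a free error estimate that shrinks like 1/√n. A minimal sketch (illustrative names, fixed seed):

```python
import numpy as np

rng = np.random.default_rng(0)

def mc_integrate(f, a, b, n):
    """Monte Carlo estimate of the integral of f on [a, b], with 1-sigma error."""
    x = a + (b - a) * rng.random(n)        # n uniform sample points
    y = f(x)
    return (b - a) * y.mean(), (b - a) * y.std() / np.sqrt(n)

# sin(x) on [0, pi]; exact answer is 2.
est, err = mc_integrate(np.sin, 0.0, np.pi, 100_000)
```

Slow convergence compared to Simpson’s rule in one dimension, but the 1/√n scaling is independent of dimension, which is why Monte Carlo wins for high-dimensional integrals.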
Lecture 10: Buffon’s needle, Markov chains
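Buffon’s needle in code: drop needles of length L on lines spaced d apart (L ≤ d); the crossing probability is 2L/(πd), so the crossing fraction gives an estimate of π. A sketch with a fixed seed (variable names are illustrative):

```python
import numpy as np

rng = np.random.default_rng(42)

def buffon(n, needle=1.0, spacing=1.0):
    """Estimate pi from n needle drops (requires needle <= spacing)."""
    y = rng.random(n) * spacing / 2        # center's distance to nearest line
    theta = rng.random(n) * np.pi / 2      # needle's angle with the lines
    crossing_frac = np.mean(y <= (needle / 2) * np.sin(theta))
    return 2 * needle / (spacing * crossing_frac)

pi_est = buffon(1_000_000)
```

A million drops typically pins down π to two or three decimal places; like all Monte Carlo estimates, the error shrinks only like 1/√n.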
Note: we will have a make up lecture on Monday, May 14
Lecture 11: Statistical Mechanics (see Lec 10 for a clean version of the code)
Lecture 12: Review: statistical mechanics sucks
Lecture 13: Metropolis
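The Metropolis rule in its simplest form: propose a random move, always accept it if the energy drops, otherwise accept with probability exp(−βΔE). A sketch sampling a 1D Boltzmann distribution (illustrative names, fixed seed; not the class code):

```python
import numpy as np

rng = np.random.default_rng(1)

def metropolis(energy, x0, beta, step, n_samples):
    """Metropolis sampling from exp(-beta * energy(x)) with Gaussian proposals."""
    x = x0
    samples = np.empty(n_samples)
    for i in range(n_samples):
        x_new = x + step * rng.normal()
        dE = energy(x_new) - energy(x)
        if dE <= 0 or rng.random() < np.exp(-beta * dE):
            x = x_new                      # accept the proposed move
        samples[i] = x                     # a rejection repeats the old x
    return samples

# E(x) = x^2/2 at beta = 1 is a standard Gaussian: mean 0, variance 1.
samples = metropolis(lambda x: 0.5 * x**2, 0.0, 1.0, 1.0, 50_000)
```

The histogram of `samples` reproduces the Boltzmann distribution without ever computing the partition function.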
Note: we will have a make up lecture on Wednesday, May 23
Lecture 14: Simulated Annealing, traveling salesperson
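Simulated annealing is just Metropolis with a slowly decreasing temperature: hot enough early on to hop out of local minima, cold at the end so the walker settles into a deep one. A toy 1D sketch (the cost function, schedule, and seed are illustrative, not the traveling-salesperson code):

```python
import numpy as np

rng = np.random.default_rng(3)

def anneal(cost, x0, t_start=1.0, t_end=1e-3, cooling=0.999, step=0.5):
    """Simulated annealing with Gaussian moves and exponential cooling."""
    x, t = x0, t_start
    best_x, best_c = x, cost(x)
    while t > t_end:
        x_new = x + step * rng.normal()
        dc = cost(x_new) - cost(x)
        if dc <= 0 or rng.random() < np.exp(-dc / t):
            x = x_new                      # Metropolis accept/reject at temperature t
        if cost(x) < best_c:
            best_x, best_c = x, cost(x)    # remember the best point ever visited
        t *= cooling                       # cool down a little each step
    return best_x, best_c

# A wiggly cost function with several local minima near the origin.
f = lambda x: x**2 + 2 * np.sin(5 * x) + 2
best_x, best_c = anneal(f, x0=4.0)
```

For the traveling salesperson the same loop applies, with “move” meaning swap or reverse a segment of the tour and “cost” meaning total tour length.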
Lecture 15: Ising model theory
Lecture 16: Ising model simulation
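The skeleton of a single-spin-flip Metropolis simulation of the 2D Ising model (J = 1, periodic boundaries). This is a sketch, not the class notebook; lattice size, temperature, and seed are illustrative:

```python
import numpy as np

rng = np.random.default_rng(7)

def ising_sweep(spins, beta):
    """One Metropolis sweep: L*L single-spin-flip attempts (J = 1, periodic)."""
    L = spins.shape[0]
    for _ in range(L * L):
        i, j = rng.integers(L), rng.integers(L)
        nn = (spins[(i + 1) % L, j] + spins[(i - 1) % L, j]
              + spins[i, (j + 1) % L] + spins[i, (j - 1) % L])
        dE = 2 * spins[i, j] * nn          # energy cost of flipping spin (i, j)
        if dE <= 0 or rng.random() < np.exp(-beta * dE):
            spins[i, j] *= -1
    return spins

def energy_per_spin(spins):
    """Energy per spin, counting each nearest-neighbor bond once."""
    return -np.mean(spins * (np.roll(spins, 1, axis=0) + np.roll(spins, 1, axis=1)))

L = 16
spins = rng.choice([-1, 1], size=(L, L))   # random (hot) start
for _ in range(200):
    ising_sweep(spins, beta=0.6)           # beta well above beta_c ~ 0.44: ordered phase
m = abs(spins.mean())                      # magnetization per spin
```

Below the critical temperature the lattice orders: the energy per spin approaches its ground-state value of −2 and the magnetization grows toward 1.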
Lectures 17-18 this week will be given by Corey Kownacki at the usual time and place
The topic will be an introduction to machine learning and Kaggle competitions. Here’s Corey’s summary:
I began with a five-minute introduction to the nifty things ML can do before leading the class through Kaggle and my own tutorial notebook. To keep the initial code approachable and the discussion generalizable, I created a simple toy dataset with which to begin. After briefly surveying how unsupervised clustering works, we used scikit-learn to implement two of the most common supervised-learning algorithms: Ordinary Least Squares (OLS) for regression and Classification and Regression Trees (CART) for classification. I made an effort here to outline a typical machine-learning model from start to finish, highlighting the loss function’s singular role in algorithmic learning (as it is quite general and intuitive).
After a quick break, I introduced them to the IRIS dataset and the powerful, efficient packages used to streamline data visualization (…that means pandas and seaborn :P). Unfortunately this felt unavoidable as the students were not familiar with dictionaries, forcing me to choose between scrolling through a huge mess of loops, functions, and variable definitions or just simply using pandas. At any rate, I restricted use to the simplest methods and showed how to quickly and easily generate numpy arrays from DataFrames (if they were too scary). We finished by generating and analyzing a series of plots (scatter, correlation, etc) that informed how the analysis should proceed.
For Part II, I plan to talk about the scoring metric, model selection, and cliff-notes’d optimization / parameter-tuning.
Combining both days, I expect to have covered and presented examples for the following algorithms: OLS Regression, CART, Logistic Regression, k-Nearest Neighbors, Linear Discriminant Analysis (LDA), and Principal Component Analysis (PCA). Based on the lectures and the code available to them, students should then be able to: 1) Visualize and discuss the IRIS data. 2) Naively apply one or more ML algorithms to the data. 3) Defend their choice of model & parameters.
Lecture 19: Now that you can code, what can you do with your newfound superpowers? Today: how to get involved in research as an undergraduate, how to apply for graduate school. Overleaf for LaTeX. REU lists: NSF, SULI
Lecture 20: Crash course introduction to LaTeX and BibTeX. Prof. Tanedo’s AMA. Naive thoughts on succeeding in the next stage of your lives.
“Final exam”: Rochelle Silva’s Kaggle Titanic tutorial. Create a Kaggle account, sign up for the Titanic tutorial, follow Rochelle Silva’s tutorial, and upload the Jupyter notebook to GitHub. Due Friday, June 15th.