This workshop provides an overview of how to scrape data from HTML pages and website APIs using Python. We will mostly rely on the Python requests, beautifulsoup, and retry modules, together with the browser developer tools. The workshop is intended for users with basic Python knowledge. Anaconda Python 3.5 will be used.
Modern computers have a CPU with multiple cores (usually between 4 and 8). Come learn how to take advantage of them to parallelize and speed up your code. We’ll show you how to structure your code so you can parallelize it in five lines or fewer. We will also cover some theory and a few practical considerations, and work through some basic exercises. We’ll be using the multiprocessing module in Python. The workshop is intended for users with basic Python knowledge and assumes you know how to do the following in Python: i) write a for loop, ii) write a function that has inputs and outputs. Anaconda Python 3.5 will be used.
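To give a flavor of the "for loop plus function" structure mentioned above, here is a minimal sketch of what parallelizing with the multiprocessing module might look like. The names slow_square, run_serial, and run_parallel are hypothetical placeholders, not workshop material, and the worker count of 4 is only an assumption about the machine.

```python
from multiprocessing import Pool


def slow_square(x):
    """Hypothetical stand-in for an expensive, independent computation."""
    return x * x


def run_serial(values):
    # The ordinary for loop: one input at a time on a single core.
    results = []
    for v in values:
        results.append(slow_square(v))
    return results


def run_parallel(values, processes=4):
    # Same work, split across worker processes.
    # Pool.map returns results in the same order as the inputs.
    with Pool(processes=processes) as pool:
        return pool.map(slow_square, values)


if __name__ == "__main__":
    # The guard keeps worker processes from re-running this section
    # on platforms that start workers by re-importing the script.
    values = list(range(10))
    assert run_parallel(values) == run_serial(values)
    print(run_parallel(values))
```

Once the work for a single input lives in its own function, the parallel version is only a few lines longer than the loop. For a task as cheap as squaring a number, the cost of starting processes will likely outweigh any gain; the pattern pays off when each call does substantial work.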