I had to implement a web scraping script that fetches data from a website at a given interval. The problem was that the data provider sometimes updated values, so the corresponding entries in my scraping database had to be reloaded and updated as well.

We wanted to speed up a PHP web scraper. It uses a while loop to load all data sequentially from a CSV file. Performance was quite bad because the script blocks on every loop iteration while it waits for an HTTP response. I used the following approach to parallelise the execution with minimal rewriting effort.