Airflow DAG: Web scraping using Selenium

In my pipeline, I have a step that does some web scraping with Selenium: it opens a web page and uses locators to extract information from the site. Ideally, I would like to wrap this in a Python callable and run it in Airflow/Composer. However, Selenium needs a Chrome driver to load the page in headless mode (without opening a visible browser window) and retrieve the information. What is the best way to do this? Do I need a virtual machine with Chrome installed, or can I do this within Airflow/Composer? CC: @ms4446
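
For context, here is a rough sketch of the kind of callable I have in mind. It assumes Chrome and a matching chromedriver are already available wherever the task runs, which is exactly the part I am unsure about. The function names `build_chrome_args` and `scrape_page`, and the idea of passing a URL and a CSS selector, are only placeholders for illustration.

```python
from typing import List


def build_chrome_args() -> List[str]:
    """Command-line flags for running Chrome without a display."""
    return [
        "--headless=new",           # no visible browser window
        "--no-sandbox",             # commonly needed when running inside containers
        "--disable-dev-shm-usage",  # avoid issues with a small /dev/shm in containers
    ]


def scrape_page(url: str, css_selector: str) -> List[str]:
    """Open `url` in headless Chrome and return the text of matching elements.

    Assumption: Chrome and chromedriver are installed on the machine that
    executes this function. Selenium is imported lazily so that the module
    can be parsed even where Selenium is not installed.
    """
    from selenium import webdriver
    from selenium.webdriver.chrome.options import Options
    from selenium.webdriver.common.by import By

    options = Options()
    for arg in build_chrome_args():
        options.add_argument(arg)

    driver = webdriver.Chrome(options=options)
    try:
        driver.get(url)
        elements = driver.find_elements(By.CSS_SELECTOR, css_selector)
        return [element.text for element in elements]
    finally:
        # Always release the browser process, even if scraping fails.
        driver.quit()
```

The plan would be to pass `scrape_page` as the `python_callable` of a `PythonOperator`, with the URL and selector supplied through `op_kwargs`.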

Thanks
