Team:Minnesota/Web Scrape


Biotechnology and the Web

        The advent of the internet has created a common ground where the public can rapidly generate and spread ideas. Although this has undoubtedly shaped and improved our lives, the patterns of misinformation that spread through these channels present serious obstacles to the advancement of biotechnology in societal applications. To address this, we have taken preliminary steps toward developing a module, powered by Google, that can probe the web across both content and time ranges to give researchers a reference for public outlook.

The Rising Coalition

        The past few years have brought biotechnology to the forefront of public scientific discussion. Vaccination and genetically modified organisms are the most prominent examples of this debate, and of how the public can favor emotionally driven arguments over scientific reasoning. How can we study these behaviors?

        "Web scraping" is a term used in computer science to describe the process of extracting data and information from websites in a highly automated manner. With the effectively endless supply of opinions, data, and articles on the web, we can clean up the text from these websites and computationally sweep it for both objective and emotional content.
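The "clean up text" step can be sketched as follows. This is a minimal illustration using only Python's standard library, not the module's actual code: the names VisibleText and clean_html are hypothetical, and the sketch assumes the page's HTML has already been downloaded as a string.

```python
from html.parser import HTMLParser
import re


class VisibleText(HTMLParser):
    """Collect text that appears outside <script> and <style> elements."""

    SKIP = {"script", "style"}

    def __init__(self):
        super().__init__()
        self._skip_depth = 0  # > 0 while inside a skipped element
        self.chunks = []

    def handle_starttag(self, tag, attrs):
        if tag in self.SKIP:
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in self.SKIP and self._skip_depth > 0:
            self._skip_depth -= 1

    def handle_data(self, data):
        if self._skip_depth == 0:
            self.chunks.append(data)


def clean_html(html):
    """Return the readable text of an HTML string with whitespace collapsed."""
    parser = VisibleText()
    parser.feed(html)
    parser.close()
    return re.sub(r"\s+", " ", " ".join(parser.chunks)).strip()
```

The cleaned string can then be passed to whatever word-level or sentiment analysis follows.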

        One example of what you can do with this software is to examine keyword hotspots across the web. After Google returns a list of websites for a query, you can sweep their text for specific words and look at the neighborhood of words around each occurrence. This can give you an idea of how people feel and express their viewpoints about a specific technology. In our example, we searched "GMO Benefits" and "Why to avoid GMOs" in separate queries, then used hotspot words such as "genetically modified", "GMO", and "genetics". The words collected from the neighborhoods for each query were passed to Wordle to visualize the most prevalent terminology (above). This is one way to identify trends in public communication about biotechnology.
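The hotspot idea can be sketched in a few lines. This is an illustrative version under stated assumptions rather than the module's own implementation: the function name keyword_neighborhoods, the simple lowercase tokenizer, and the default window of five words on each side are all hypothetical choices.

```python
import re
from collections import Counter


def keyword_neighborhoods(text, hotspots, window=5):
    """Count the words found within `window` tokens of any hotspot phrase."""
    tokens = re.findall(r"[a-z']+", text.lower())
    # Split each hotspot into tokens so multi-word phrases can be matched.
    phrases = [h.lower().split() for h in hotspots if h.split()]
    neighbors = []
    for i in range(len(tokens)):
        for phrase in phrases:
            n = len(phrase)
            if tokens[i:i + n] == phrase:
                neighbors.extend(tokens[max(0, i - window):i])  # words before
                neighbors.extend(tokens[i + n:i + n + window])  # words after
    return Counter(neighbors)
```

Calling counts.most_common(50) on the result gives a ranked word list that could be fed to a word-cloud tool such as Wordle. Filtering out common stop words first would likely make the output more informative.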

        Another example is tracking public opinion of a biotechnology company over time. If you want to apply technology developed at your company, you will always have a much easier time if there is a favorable public outlook. Tracking the sentiment of the web can reveal impending PR problems, or show whether your company's image has recovered after a period of public criticism. To illustrate, we used "Google_Time_Lapse" to analyze content from the top webpages published in each month for the query. The sentiment analysis included in the Python module TextBlob lets us see the trends in Monsanto's image, which reaches an all-time low in May 2015. This coincides with protests held at that time, partially in response to Monsanto's actions involving genetically modified foods.
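The month-by-month averaging can be sketched as below. The internals of "Google_Time_Lapse" are not described here, so this is only an assumed outline: monthly_sentiment and toy_polarity are hypothetical names, and the tiny word lists are a stand-in scorer so the sketch runs without extra packages. With TextBlob installed, passing score=lambda t: TextBlob(t).sentiment.polarity should give TextBlob's polarity instead.

```python
from collections import defaultdict
from statistics import mean

# Stand-in word lists (hypothetical); a real analysis would use TextBlob.
POSITIVE = {"good", "safe", "beneficial", "helpful"}
NEGATIVE = {"bad", "harmful", "dangerous", "toxic"}


def toy_polarity(text):
    """Score text from -1 (negative) to +1 (positive) using the word lists."""
    words = text.lower().split()
    hits = [1 if w in POSITIVE else -1 for w in words if w in POSITIVE | NEGATIVE]
    return mean(hits) if hits else 0.0


def monthly_sentiment(pages, score=toy_polarity):
    """Average the sentiment score of all pages published in each month.

    `pages` is an iterable of (month, text) pairs, e.g. ("2015-05", "...").
    """
    by_month = defaultdict(list)
    for month, text in pages:
        by_month[month].append(score(text))
    return {month: mean(scores) for month, scores in sorted(by_month.items())}
```

The month with the lowest average can then be found with min(result, key=result.get), which is one way to locate a low point in a company's public image.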

Contact:

Patrick V. Holec
University of Minnesota