Accessing Web Data with Python’s urllib: A Practical Guide

What Is urllib?

urllib is a package in Python’s standard library for opening URLs and working with web resources. Its submodules provide functions and classes to fetch URLs (urllib.request), handle request errors (urllib.error), and parse URLs (urllib.parse), so you can retrieve XML, HTML, JSON, and other data formats directly from the internet without installing anything extra.
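As a small illustration of the URL-parsing side of the package, the sketch below splits this tutorial's example URL into its components and builds a query string. The query parameters (`q`, `page`) are made up for the example and are not part of the original tutorial.

```python
from urllib.parse import urlencode, urlsplit

# Split the tutorial's example URL into its components.
parts = urlsplit("https://www.youtube.com/user/guru99com")
print(parts.scheme)  # https
print(parts.netloc)  # www.youtube.com
print(parts.path)    # /user/guru99com

# Build a query string from a dict (hypothetical parameters, for illustration only).
query = urlencode({"q": "python urllib", "page": 2})
print(query)  # q=python+urllib&page=2
```

No network access is involved here: urllib.parse only manipulates strings.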

In this tutorial we’ll demonstrate how to use urllib to open a URL, check the HTTP status code, and read the HTML the server returns.

Opening a URL with urllib

Before executing any network request, import the appropriate urllib module:

import urllib.request  # Python 3
# import urllib2          # Python 2 only (replaced by urllib.request in Python 3)

Define your main routine, specify the target URL, and open a connection with urlopen(). We’ll use the Guru99 YouTube channel as an example:

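def main():
    # urlopen() returns a response object; the with block closes it when done
    with urllib.request.urlopen("https://www.youtube.com/user/guru99com") as response:
        print("result code: " + str(response.getcode()))
        data = response.read()  # raw bytes of the page
    print(data.decode('utf-8'))

if __name__ == "__main__":
    main()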

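The getcode() method returns the HTTP status code (e.g., 200 for success). The read() method returns the page’s raw bytes, which we decode from UTF‑8 into a string so the output is readable. Note that urlopen() raises urllib.error.HTTPError for error statuses such as 404, so getcode() is normally reached only when the request succeeded.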
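Network requests can fail, and not every page is encoded as UTF‑8. The following is a minimal sketch of a more defensive version of the same steps. The helper name `fetch_text`, the 10-second timeout, and the UTF‑8 fallback are assumptions made for this example, not part of the original tutorial.

```python
import urllib.error
import urllib.request


def fetch_text(url, timeout=10):
    """Return (status, text) for url, or (None, None) if the request fails."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            status = response.getcode()
            # Prefer the charset declared in the response headers;
            # falling back to UTF-8 is an assumption of this sketch.
            charset = response.headers.get_content_charset() or "utf-8"
            text = response.read().decode(charset, errors="replace")
            return status, text
    except urllib.error.HTTPError as err:
        # The server answered, but with an error status such as 404 or 500.
        print("server returned an error status:", err.code)
    except urllib.error.URLError as err:
        # The request never got a valid response (DNS failure, refused connection, ...).
        print("could not complete the request:", err.reason)
    return None, None
```

Calling `fetch_text("https://www.youtube.com/user/guru99com")` would return the status code and the decoded page on success. HTTPError is caught before URLError because it is a subclass of URLError.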

Reading the HTML Content

Once the URL is open, you can extract the HTML by calling read() on the response. The resulting bytes represent the entire HTML document; after decoding them, you can parse the text with a library such as Beautiful Soup or display it directly.
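Beautiful Soup is a third-party package, so as a dependency-free illustration the sketch below uses the standard library's html.parser instead to pull the page title out of decoded HTML. The class and function names (`TitleParser`, `extract_title`) and the sample markup are hypothetical and exist only for this example.

```python
from html.parser import HTMLParser


class TitleParser(HTMLParser):
    """Collect the text inside the first <title> element."""

    def __init__(self):
        super().__init__()
        self._in_title = False
        self.title = None

    def handle_starttag(self, tag, attrs):
        # Only the first <title> is recorded.
        if tag == "title" and self.title is None:
            self._in_title = True
            self.title = ""

    def handle_data(self, data):
        if self._in_title:
            self.title += data

    def handle_endtag(self, tag):
        if tag == "title":
            self._in_title = False


def extract_title(html_text):
    """Return the stripped title text, or None if the document has no <title>."""
    parser = TitleParser()
    parser.feed(html_text)
    parser.close()
    return parser.title.strip() if parser.title is not None else None


sample = "<html><head><title>Example page</title></head><body><p>Hi</p></body></html>"
print(extract_title(sample))  # Example page
```

In practice you would pass the decoded output of read() to `extract_title` instead of the sample string.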

[Screenshot: the decoded HTML output printed to the console]

Complete Code Samples

For reference, the following sections provide full scripts for both Python 2 and Python 3 environments. Python 2 reached end of life in January 2020, so prefer the Python 3 version for new code.

Python 2 Example

import urllib2

def main():
    web_url = urllib2.urlopen("https://www.youtube.com/user/guru99com")
    print("result code: " + str(web_url.getcode()))
    data = web_url.read()
    print(data)

if __name__ == "__main__":
    main()

Python 3 Example

import urllib.request

web_url = urllib.request.urlopen('https://www.youtube.com/user/guru99com')
print('result code: ' + str(web_url.getcode()))
print(web_url.read().decode('utf-8'))
