Unleashing the Power of Regular Expressions in Python
Part 7 : Python for Scripting
Welcome back to our Python for Scripting series! In this episode, we're diving deep into the fascinating world of Regular Expressions (regex) in Python. Buckle up as we explore the ins and outs of regex for powerful text manipulation!
Understanding Regular Expressions
Regular expressions are sequences of characters that define search patterns. They act as a secret code to find, extract, or manipulate specific pieces of text. The re
module in Python empowers us to work magic with these patterns.
Exploring the re
Module for Pattern Matching
Let's kick off our regex adventure by finding occurrences of the word "world" in a text:
import re
pattern = r"world"
text = "Hello, world! How's the world treating you?"
matches = re.findall(pattern, text)
print("Occurrences of 'world':", len(matches))
Expected Output:
Occurrences of 'world': 2
This code snippet uses re.findall()
to search for the pattern "world" within the text and counts how many times it appears.
Common Regex Patterns: Unveiling the Magic
1. Literal Matches
Consider finding instances of the word "apple" in a text:
pattern = r"apple"
text = "An apple a day keeps the doctor away. I love apples!"
matches = re.findall(pattern, text)
print("Occurrences of 'apple':", len(matches))
Expected Output:
Occurrences of 'apple': 2
2. Character Classes
Let's explore extracting vowels from a sentence:
pattern = r"[aeiou]"
text = "Regular expressions are amazing!"
vowels = re.findall(pattern, text)
print("Vowels found:", vowels)
Expected Output:
Vowels found: ['e', 'u', 'a', 'e', 'i', 'o', 'a', 'i', 'a', 'o', 'a', 'i']
3. Quantifiers
Extracting years from a sentence:
pattern = r"\d{2,4}"
text = "Python 3.9 was released in 2020."
years = re.findall(pattern, text)
print("Years found:", years)
Expected Output:
Years found: ['3', '9', '2020']
Validating and Extracting Data with Regex
1. Email Validation
Verify the validity of an email address:
pattern = r"[\w\.-]+@[\w\.-]+"
email = "user@example.com"
if re.match(pattern, email):
print("Valid email address.")
else:
print("Invalid email address.")
2. Extracting Dates
Extract dates from a text:
pattern = r"\d{2}/\d{2}/\d{4}"
text = "Date: 12/31/2022, Meeting on 05/15/2023"
dates = re.findall(pattern, text)
print("Dates found:", dates)
Expected Output:
Dates found: ['12/31/2022', '05/15/2023']
Understanding these regex patterns allows us to perform precise text operations. The examples provided showcase their versatility in searching, validating, and extracting information from text data. Practice these patterns to wield the power of regex effectively in your Python scripts! Stay tuned for more Python scripting adventures.
Solutions to Exercises from the Previous Blog
Exercise 1: Counting Words in a Text File
# Open the file in read mode
with open('sample.txt', 'r') as file:
text = file.read()
word_count = len(text.split())
print("Number of words in the file:", word_count)
Explanation:
open('sample.txt', 'r')
: Opens the file named 'sample.txt' in read mode.file.read
()
: Reads the entire content of the file and stores it in the variabletext
.text.split()
: Splits the text into words based on spaces and returns a list.len(text.split())
: Calculates the length of the list of words, which represents the total number of words in the file.
Expected Output:
Number of words in the file: 56
Exercise 2: Appending User Input to an Existing File
user_input = input("Enter text to append: ")
with open('existing_file.txt', 'a') as file:
file.write('\n' + user_input)
Explanation:
input("Enter text to append: ")
: Prompts the user to enter text.open('existing_file.txt', 'a')
: Opens the file 'existing_file.txt' in append mode.file.write('\n' + user_input)
: Appends the user's input to the file on a new line.
Exercise 3: Writing String to a File in Append Mode
def write_to_file(filename, text):
with open(filename, 'a') as file:
file.write('\n' + text)
write_to_file('data.txt', 'This text will be appended to the file.')
Explanation:
def write_to_file(filename, text)
: Defines a function that takes a filename and a string as arguments.file.write('\n' + text)
: Appends the provided text to the file in append mode.
Exercise 4: Reading and Printing CSV Contents
import csv
with open('data.csv', newline='') as csvfile:
reader = csv.reader(csvfile)
for row in reader:
print(', '.join(row))
Explanation:
import csv
: Imports the CSV module for handling CSV files.csv.reader(csvfile)
: Creates a CSV reader object.for row in reader: print(', '.join(row))
: Iterates through each row of the CSV file and prints its contents as a table.
Exercise 5: Searching for a Specific Word in a Text File
search_word = 'specific'
with open('sample.txt', 'r') as file:
for line_num, line in enumerate(file, 1):
if search_word in line:
print(f"Word found at line {line_num}: {line.strip()}")
Explanation:
search_word = 'specific'
: Specifies the word to search.enumerate(file, 1)
: Enumerates through each line of the file, starting from line 1.if search_word in line: print(f"Word found at line {line_num}: {line.strip()}")
: Checks if the search word is present in each line and displays the line number and line content where the word is found.
These solutions provide practical implementations for the exercises, showcasing how Python can effectively handle file operations and text processing tasks.
Putting Your Regex Skills to the Test: Practical Exercises
Let's apply your newfound knowledge of regular expressions through hands-on exercises. These exercises cover real-life scenarios where regex can be incredibly useful.
Exercise 1: Phone Number Validator
Design a regex pattern to validate various phone number formats like (123) 456-7890, 123-456-7890, 123.456.7890, etc.
Exercise 2: Password Strength Checker
Create a regex pattern to check the strength of passwords. Consider criteria like minimum length, presence of uppercase letters, numbers, and special characters.
Exercise 3: URL Extractor
Develop a script that extracts URLs from a block of text using regex. Consider various URL formats (http, https, www, etc.).
Exercise 4: IP Address Finder
Craft a regex pattern to identify and extract IP addresses from a piece of text. Explore both IPv4 and IPv6 address formats.
Exercise 5: HTML Tag Remover
Build a function that uses regex to remove HTML tags from a given HTML document. Consider different tag formats and nested tags.
Try these exercises to deepen your understanding of regular expressions in practical scenarios. Happy coding!