5 Useful DIY Python Functions for Parsing Dates and Times

# Introduction

Parsing dates and times is one of those tasks that seems easy until you actually try to do it. Python’s datetime module handles standard formats well, but real-world data is messy. User input, scraped web data, and legacy systems often throw curveballs.

This article walks you through five practical functions for common date and time parsing tasks. By the end, you’ll understand how to build flexible parsers that handle the messy date formats you see in real projects.

Link to the code on GitHub

# 1. Parsing relative time strings

Social media apps, chat applications, and activity feeds display timestamps such as “5 minutes ago” or “2 days ago”. When you scrape or process this data, you need to convert these relative strings into actual datetime objects.

Here’s a function that handles common relative time expressions:

from datetime import datetime, timedelta
import re

def parse_relative_time(time_string, reference_time=None):
    """
    Convert relative time strings to datetime objects.
    
    Examples: "2 hours ago", "3 days ago", "1 week ago"
    """
    if reference_time is None:
        reference_time = datetime.now()
    
    # Normalize the string
    time_string = time_string.lower().strip()
    
    # Pattern: number + time unit + "ago"
    pattern = r'(\d+)\s*(second|minute|hour|day|week|month|year)s?\s*ago'
    match = re.match(pattern, time_string)
    
    if not match:
        raise ValueError(f"Cannot parse: {time_string}")
    
    amount = int(match.group(1))
    unit = match.group(2)
    
    # Map units to timedelta kwargs
    unit_mapping = {
        'second': 'seconds',
        'minute': 'minutes',
        'hour': 'hours',
        'day': 'days',
        'week': 'weeks',
    }
    
    if unit in unit_mapping:
        delta_kwargs = {unit_mapping[unit]: amount}
        return reference_time - timedelta(**delta_kwargs)
    elif unit == 'month':
        # Approximate: 30 days per month
        return reference_time - timedelta(days=amount * 30)
    elif unit == 'year':
        # Approximate: 365 days per year
        return reference_time - timedelta(days=amount * 365)

For units that timedelta supports natively (seconds through weeks), we create a timedelta and subtract it from the reference time. For months and years, we approximate using 30 and 365 days, respectively. It’s not perfect, but it’s good enough for most use cases.
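If the 30-day approximation is too coarse for your data, one possible refinement is to step back whole calendar months and clamp the day to the length of the target month. The helper below is only a sketch: the name subtract_months is hypothetical and not part of the function above, and clamping (rather than raising an error) is an assumption about what you want for dates like March 31.

```python
import calendar
from datetime import datetime

def subtract_months(dt, months):
    """Step back a whole number of calendar months, clamping the day.

    Hypothetical helper for illustration; not part of parse_relative_time.
    """
    # Work with a running month count so year boundaries are handled by divmod
    total = dt.year * 12 + (dt.month - 1) - months
    year, month_index = divmod(total, 12)
    month = month_index + 1
    # Clamp the day: March 31 minus one month becomes the last day of February
    last_day = calendar.monthrange(year, month)[1]
    return dt.replace(year=year, month=month, day=min(dt.day, last_day))

reference = datetime(2026, 3, 31, 9, 0)
print(subtract_months(reference, 1))   # 2026-02-28 09:00:00
print(subtract_months(reference, 3))   # 2025-12-31 09:00:00
```

You could call a helper like this from the month branch, and use months * 12 for the year branch, at the cost of a little extra code.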

The reference_time parameter lets you specify a different “now” for testing or when processing historical data.

Let’s check it out:

result1 = parse_relative_time("2 hours ago")
result2 = parse_relative_time("3 days ago")
result3 = parse_relative_time("1 week ago")

print(f"2 hours ago: {result1}")
print(f"3 days ago: {result2}")
print(f"1 week ago: {result3}")

Output:

2 hours ago: 2026-01-06 12:09:34.584107
3 days ago: 2026-01-03 14:09:34.584504
1 week ago: 2025-12-30 14:09:34.584558

# 2. Extracting dates from natural language text

Sometimes you need to find dates buried in text: “Meeting is scheduled for January 15, 2026” or “Please respond by March 3”. Instead of manually parsing the entire sentence, you just want to extract the date.

Here is a function that finds and extracts dates from natural language:

import re
from datetime import datetime

def extract_date_from_text(text, current_year=None):
    """
    Extract dates from natural language text.
    
    Handles formats like:
    - "January 15th, 2024"
    - "March 3rd"
    - "Dec 25th, 2023"
    """
    if current_year is None:
        current_year = datetime.now().year
    
    # Month names (full and abbreviated)
    months = {
        'january': 1, 'jan': 1,
        'february': 2, 'feb': 2,
        'march': 3, 'mar': 3,
        'april': 4, 'apr': 4,
        'may': 5,
        'june': 6, 'jun': 6,
        'july': 7, 'jul': 7,
        'august': 8, 'aug': 8,
        'september': 9, 'sep': 9, 'sept': 9,
        'october': 10, 'oct': 10,
        'november': 11, 'nov': 11,
        'december': 12, 'dec': 12
    }
    
    # Pattern: Month Day(st/nd/rd/th), Year (year optional)
    pattern = r'(january|jan|february|feb|march|mar|april|apr|may|june|jun|july|jul|august|aug|september|sep|sept|october|oct|november|nov|december|dec)\s+(\d{1,2})(?:st|nd|rd|th)?(?:,?\s+(\d{4}))?'
    
    matches = re.findall(pattern, text.lower())
    
    if not matches:
        return None
    
    # Take the first match
    month_str, day_str, year_str = matches[0]
    
    month = months[month_str]
    day = int(day_str)
    year = int(year_str) if year_str else current_year
    
    return datetime(year, month, day)

The function creates a dictionary mapping month names (both full and abbreviated) to their numeric values. The regex pattern matches a month name followed by a day number, with an optional ordinal suffix (st, nd, rd, th) and an optional year.

The (?:...) syntax creates a non-capturing group. This means we match the pattern but don’t store it separately. This is useful for optional parts such as the ordinal suffix and the comma before the year.
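To see the difference in practice, here is a small sketch that compares a capturing and a non-capturing suffix group with re.findall. The simplified patterns (matching only “march”) are chosen for illustration and are not the full pattern from the function.

```python
import re

text = "march 3rd, 2026"

# Capturing group around the suffix: it shows up in the result tuple
capturing = re.findall(r'(march)\s+(\d{1,2})(st|nd|rd|th)?', text)
print(capturing)       # [('march', '3', 'rd')]

# Non-capturing group: the suffix is still matched, but not returned
non_capturing = re.findall(r'(march)\s+(\d{1,2})(?:st|nd|rd|th)?', text)
print(non_capturing)   # [('march', '3')]

# An optional capturing group that does not take part in the match
# comes back as an empty string, which is why a missing year is falsy
no_year = re.findall(
    r'(march)\s+(\d{1,2})(?:st|nd|rd|th)?(?:,?\s+(\d{4}))?', "march 3rd"
)
print(no_year)         # [('march', '3', '')]
```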

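When no year is provided, the function defaults to the current year. This is a reasonable default: if someone mentions “March 3” in January, they usually mean the upcoming March, not the previous year. Keep in mind that the same text processed in December still resolves to the current year, which is by then in the past.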
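If you would rather have a year-less date always resolve to its next occurrence, a small helper along these lines could replace the plain current-year default. This is a sketch under assumptions: resolve_year is a hypothetical name, it takes “today” as an explicit argument so it is easy to test, and it does not handle February 29 in non-leap years (date() would raise a ValueError there).

```python
from datetime import date

def resolve_year(month, day, today):
    """Return the next date with this month and day, on or after `today`.

    Hypothetical helper; February 29 in a non-leap year is not handled.
    """
    candidate = date(today.year, month, day)
    if candidate < today:
        # The date has already passed this year, so roll over to next year
        candidate = date(today.year + 1, month, day)
    return candidate

print(resolve_year(3, 3, today=date(2026, 1, 10)))   # 2026-03-03
print(resolve_year(3, 3, today=date(2026, 12, 1)))   # 2027-03-03
```

Whether “next occurrence” is the right rule depends on your data: for archived text, the most recent past occurrence may be the better guess.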

Let’s test it with different text formats:

text1 = "The meeting is scheduled for January 15th, 2026 at 3pm"
text2 = "Please respond by March 3rd"
text3 = "Deadline: Dec 25th, 2026"

date1 = extract_date_from_text(text1)
date2 = extract_date_from_text(text2)
date3 = extract_date_from_text(text3)

print(f"From '{text1}': {date1}")
print(f"From '{text2}': {date2}")
print(f"From '{text3}': {date3}")

Output:

From 'The meeting is scheduled for January 15th, 2026 at 3pm': 2026-01-15 00:00:00
From 'Please respond by March 3rd': 2026-03-03 00:00:00
From 'Deadline: Dec 25th, 2026': 2026-12-25 00:00:00

# 3. Parsing flexible date formats with smart detection

Real-world data comes in many formats. Writing a separate parser for each format is tedious. Instead, let’s create a function that tries multiple formats automatically.

Here’s a smart date parser that handles common formats:

from datetime import datetime

def parse_flexible_date(date_string):
    """
    Parse dates in multiple common formats.
    
    Tries various formats and returns the first match.
    """
    date_string = date_string.strip()
    
    # List of common date formats
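    formats = [
        '%Y-%m-%d',    # 2026-01-15
        '%Y/%m/%d',    # 2026/01/15
        '%d-%m-%Y',    # 15-01-2026
        '%d/%m/%Y',    # 15/01/2026
        '%m/%d/%Y',    # 01/15/2026
        '%d.%m.%Y',    # 15.01.2026
        '%Y%m%d',      # 20260115
        '%B %d, %Y',   # January 15, 2026
        '%b %d, %Y',   # Jan 15, 2026
        '%d %B %Y',    # 15 January 2026
        '%d %b %Y',    # 15 Jan 2026
    ]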
    
    # Try each format
    for fmt in formats:
        try:
            return datetime.strptime(date_string, fmt)
        except ValueError:
            continue
    
    # If nothing worked, raise an error
    raise ValueError(f"Unable to parse date: {date_string}")

This function uses a brute-force approach: it tries each format until one works. The strptime function raises a ValueError if the date string does not match the format, so we catch the exception and move on to the next format.

The order of the formats matters. We put the International Organization for Standardization (ISO) format (%Y-%m-%d) first because it is the most common in technical contexts. Ambiguous formats like %d/%m/%Y and %m/%d/%Y appear later, and the day-first version wins whenever both would match. If you know your data consistently uses one format, reorder the list to prioritize it.
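The ambiguity is easy to reproduce with strptime alone. The short sketch below parses one string with both slash formats; the sample value is made up for illustration.

```python
from datetime import datetime

ambiguous = "03/04/2026"

# Both formats accept the string, but they disagree on what it means
day_first = datetime.strptime(ambiguous, "%d/%m/%Y")
month_first = datetime.strptime(ambiguous, "%m/%d/%Y")

print(day_first.date())    # 2026-04-03
print(month_first.date())  # 2026-03-04
```

Because a parser that tries formats in order returns on the first success, whichever of the two comes first in the list silently decides the result for strings like this one.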

Let’s test this with different date formats:

# Test different formats
dates = (
    "2026-01-15",
    "15/01/2026",
    "01/15/2026",
    "15.01.2026",
    "20260115",
    "January 15, 2026",
    "15 Jan 2026"
)

for date_str in dates:
    parsed = parse_flexible_date(date_str)
    print(f"{date_str:20} -> {parsed}")

Output:

2026-01-15           -> 2026-01-15 00:00:00
15/01/2026           -> 2026-01-15 00:00:00
01/15/2026           -> 2026-01-15 00:00:00
15.01.2026           -> 2026-01-15 00:00:00
20260115             -> 2026-01-15 00:00:00
January 15, 2026     -> 2026-01-15 00:00:00
15 Jan 2026          -> 2026-01-15 00:00:00

This approach isn’t the most efficient, but it’s simple and handles the majority of date formats you’ll encounter.

# 4. Parsing time durations

Video players, exercise trackers, and time tracking apps display durations such as “1h 30m” or “2:45:30”. When parsing user input or scraped data, you need to convert them to timedelta objects for calculations.

Here’s a function that parses common duration formats:

from datetime import timedelta
import re

def parse_duration(duration_string):
    """
    Parse duration strings into timedelta objects.
    
    Handles formats like:
    - "1h 30m 45s"
    - "2:45:30" (H:M:S)
    - "90 minutes"
    - "1.5 hours"
    """
    duration_string = duration_string.strip().lower()
    
    # Try colon format first (H:M:S or M:S)
    if ':' in duration_string:
        parts = duration_string.split(':')
        if len(parts) == 2:
            # M:S format
            minutes, seconds = map(int, parts)
            return timedelta(minutes=minutes, seconds=seconds)
        elif len(parts) == 3:
            # H:M:S format
            hours, minutes, seconds = map(int, parts)
            return timedelta(hours=hours, minutes=minutes, seconds=seconds)
    
    # Try unit-based format (1h 30m 45s)
    total_seconds = 0
    
    # Find hours
    hours_match = re.search(r'(\d+(?:\.\d+)?)\s*h(?:ours?)?', duration_string)
    if hours_match:
        total_seconds += float(hours_match.group(1)) * 3600
    
    # Find minutes
    minutes_match = re.search(r'(\d+(?:\.\d+)?)\s*m(?:in(?:ute)?s?)?', duration_string)
    if minutes_match:
        total_seconds += float(minutes_match.group(1)) * 60
    
    # Find seconds
    seconds_match = re.search(r'(\d+(?:\.\d+)?)\s*s(?:ec(?:ond)?s?)?', duration_string)
    if seconds_match:
        total_seconds += float(seconds_match.group(1))
    
    if total_seconds > 0:
        return timedelta(seconds=total_seconds)
    
    raise ValueError(f"Unable to parse duration: {duration_string}")

The function handles two main formats: colon-separated time and unit-based strings. For the colon format, we split on the colon and interpret the segments as hours, minutes, and seconds (or just minutes and seconds for two-part durations).

For the unit-based format, we use three separate regex patterns to find hours, minutes, and seconds. The pattern (\d+(?:\.\d+)?) matches an integer or a decimal such as “1.5”. The pattern \s*h(?:ours?)? matches “h”, “hour”, or “hours”, with optional whitespace before it.
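To check what the hours pattern actually captures, you can run it on its own against a few sample strings. The wrapper function find_hours below is a hypothetical helper added only to make the pattern easy to try out.

```python
import re

HOURS_PATTERN = r'(\d+(?:\.\d+)?)\s*h(?:ours?)?'

def find_hours(text):
    """Return the captured number as a string, or None if nothing matches."""
    match = re.search(HOURS_PATTERN, text)
    return match.group(1) if match else None

for sample in ["1.5 hours", "2h", "3 hour", "90 minutes"]:
    print(f"{sample:12} -> {find_hours(sample)}")
```

The first three samples yield “1.5”, “2”, and “3”, while “90 minutes” yields None because no number is followed by an “h”.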

Each matched value is converted to seconds and added to the total. This approach lets the function handle partial durations such as “45s” or “2h 15m” without requiring all units to be present.

Let’s now test the function with different duration formats:

durations = (
    "1h 30m 45s",
    "2:45:30",
    "90 minutes",
    "1.5 hours",
    "45s",
    "2h 15m"
)

for duration in durations:
    parsed = parse_duration(duration)
    print(f"{duration:15} -> {parsed}")

Output:

1h 30m 45s      -> 1:30:45
2:45:30         -> 2:45:30
90 minutes      -> 1:30:00
1.5 hours       -> 1:30:00
45s             -> 0:00:45
2h 15m          -> 2:15:00

# 5. Parsing ISO week dates

Some systems use ISO week dates instead of regular calendar dates. An ISO week date such as “2026-W03-2” means “Week 3, Day 2 (Tuesday) of 2026”. This format is common in business contexts where planning is weekly.

Here is a function to parse ISO week dates:

from datetime import datetime, timedelta

def parse_iso_week_date(iso_week_string):
    """
    Parse ISO week date format: YYYY-Www-D
    
    Example: "2024-W03-2" = Week 3 of 2024, Tuesday
    
    ISO week numbering:
    - Week 1 is the week with the first Thursday of the year
    - Days are numbered 1 (Monday) through 7 (Sunday)
    """
    # Parse the format: YYYY-Www-D
    parts = iso_week_string.split('-')
    
    if len(parts) != 3 or not parts[1].startswith('W'):
        raise ValueError(f"Invalid ISO week format: {iso_week_string}")
    
    year = int(parts[0])
    week = int(parts[1][1:])  # Remove 'W' prefix
    day = int(parts[2])
    
    if not (1 <= week <= 53):
        raise ValueError(f"Week must be between 1 and 53: {week}")
    
    if not (1 <= day <= 7):
        raise ValueError(f"Day must be between 1 and 7: {day}")
    
    # Find January 4th (always in week 1)
    jan_4 = datetime(year, 1, 4)
    
    # Find Monday of week 1
    week_1_monday = jan_4 - timedelta(days=jan_4.weekday())
    
    # Calculate the target date
    target_date = week_1_monday + timedelta(weeks=week - 1, days=day - 1)
    
    return target_date

ISO week dates follow specific rules. Week 1 is defined as the week that includes the first Thursday of the year. This means week 1 could start in December of the previous year.
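You can confirm the December case with the standard library’s isocalendar() method. In the sketch below, the date was picked because January 1, 2026 falls on a Thursday, so week 1 of 2026 begins on the preceding Monday.

```python
from datetime import date

# Monday, December 29, 2025 already belongs to ISO week 1 of 2026
dec_29 = date(2025, 12, 29)
iso_year, iso_week, iso_day = dec_29.isocalendar()

print(iso_year, iso_week, iso_day)  # 2026 1 1
```

Note that the ISO year (2026) differs from the calendar year (2025) for this date, which is why week-based data should always carry the ISO year rather than the calendar year.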

The function uses a reliable approach: find January 4th (which is always in week 1), then find the Monday of that week. From there, we add the appropriate number of weeks and days to reach the target date.

Calling jan_4.weekday() returns 0 for Monday through 6 for Sunday. Subtracting that many days from January 4 gives us the Monday of week 1. Then we add (week - 1) weeks and (day - 1) days to get the final date.
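A handy way to sanity-check a hand-written ISO week parser is to compare it against the standard library, which offers date.fromisocalendar() (Python 3.8 and later) and the %G, %V, and %u directives for strptime. The sketch below assumes a reasonably recent Python version and is meant as a cross-check, not as a replacement for the function above.

```python
from datetime import date, datetime

# Built-in conversion from ISO year, week, and weekday
print(date.fromisocalendar(2024, 3, 2))               # 2024-01-16

# The same date parsed straight from the string
print(datetime.strptime("2024-W03-2", "%G-W%V-%u"))   # 2024-01-16 00:00:00

# The built-in also rejects week numbers a given year does not have:
# 2024 has only 52 ISO weeks, so week 53 is an error
try:
    date.fromisocalendar(2024, 53, 1)
    week_53_rejected = False
except ValueError:
    week_53_rejected = True
print(week_53_rejected)                               # True
```

The DIY function accepts week 53 for every year and simply rolls over into the following year, so if strict validation matters, a check like the one above is worth adding.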

Let’s check it out:

# Test ISO week dates
iso_dates = (
    "2024-W01-1",  # Week 1, Monday
    "2024-W03-2",  # Week 3, Tuesday
    "2024-W10-5",  # Week 10, Friday
)

for iso_date in iso_dates:
    parsed = parse_iso_week_date(iso_date)
    print(f"{iso_date} -> {parsed.strftime('%Y-%m-%d (%A)')}")

Output:

2024-W01-1 -> 2024-01-01 (Monday)
2024-W03-2 -> 2024-01-16 (Tuesday)
2024-W10-5 -> 2024-03-08 (Friday)

This format is less common than regular dates, but when encountered, having the parser ready saves significant time.

# Wrapping up

The functions in this article rely on regex patterns, strptime format strings, and datetime arithmetic to handle variations in formatting. These techniques transfer to other parsing challenges, and you can adapt the patterns for custom date formats in your own projects.

Building your own parsers helps you understand how date parsing works. When you run into non-standard date formats that standard libraries can’t handle, you’ll be ready to write custom solutions.

These functions are especially useful for small scripts, prototypes, and learning projects where a heavy external dependency would add unnecessary overhead. Happy coding!

Bala Priya C is a developer and technical writer from India. She loves working at the intersection of mathematics, programming, data science, and content creation. Her areas of interest and expertise include DevOps, data science, and natural language processing. She enjoys reading, writing, coding, and coffee! Currently, she is working on learning and sharing her knowledge with the developer community by authoring tutorials, how-to guides, opinion pieces, and more. Bala also creates engaging resource overviews and coding tutorials.
