

# Introduction
Parsing dates and times is one of those tasks that seems easy until you actually try to do it. Python’s `datetime` module handles standard formats well, but real-world data is messy. User input, scraped web data, and legacy systems often throw curveballs.
This article walks you through five practical functions for handling common date and time parsing tasks. By the end, you’ll understand how to build flexible parsers that handle the messy date formats you see in projects.
# 1. Parsing relative time strings
Social media apps, chat applications, and activity feeds display timestamps such as “5 minutes ago” or “2 days ago”. When you scrape or process this data, you need to convert these relative strings into actual `datetime` objects.
Here’s a function that handles common relative time expressions:
```python
from datetime import datetime, timedelta
import re

def parse_relative_time(time_string, reference_time=None):
    """
    Convert relative time strings to datetime objects.
    Examples: "2 hours ago", "3 days ago", "1 week ago"
    """
    if reference_time is None:
        reference_time = datetime.now()

    # Normalize the string
    time_string = time_string.lower().strip()

    # Pattern: number + time unit + "ago"
    pattern = r'(\d+)\s*(second|minute|hour|day|week|month|year)s?\s*ago'
    match = re.match(pattern, time_string)

    if not match:
        raise ValueError(f"Cannot parse: {time_string}")

    amount = int(match.group(1))
    unit = match.group(2)

    # Map units to timedelta kwargs
    unit_mapping = {
        'second': 'seconds',
        'minute': 'minutes',
        'hour': 'hours',
        'day': 'days',
        'week': 'weeks',
    }

    if unit in unit_mapping:
        delta_kwargs = {unit_mapping[unit]: amount}
        return reference_time - timedelta(**delta_kwargs)
    elif unit == 'month':
        # Approximate: 30 days per month
        return reference_time - timedelta(days=amount * 30)
    elif unit == 'year':
        # Approximate: 365 days per year
        return reference_time - timedelta(days=amount * 365)
```

The function uses a regular expression (regex) to extract a number and a time unit from the string. The pattern `(\d+)` captures one or more digits, and `(second|minute|hour|day|week|month|year)` matches the time unit. The `s?` makes the plural “s” optional, so both “hour” and “hours” work.
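To see what the pattern accepts, here is a small sketch that runs the same regex against a few inputs. The sample strings are made up for illustration; only the pattern itself comes from the function above.

```python
import re

# Same pattern as in parse_relative_time
pattern = r'(\d+)\s*(second|minute|hour|day|week|month|year)s?\s*ago'

# Illustrative inputs (not taken from any real dataset)
samples = ["1 hour ago", "2 hours ago", "15minutes ago", "yesterday"]

parsed = {}
for text in samples:
    match = re.match(pattern, text)
    # groups() yields (amount, unit) as strings; None means no match
    parsed[text] = match.groups() if match else None

for text, groups in parsed.items():
    print(f"{text!r:18} -> {groups}")
```

Both the singular and plural forms yield the unit `'hour'`, the whitespace between number and unit is optional, and anything that is not in “N units ago” form (such as “yesterday”) is not matched.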
For units that `timedelta` supports directly (seconds through weeks), we create a `timedelta` and subtract it from the reference time. For months and years, we approximate using 30 and 365 days, respectively. It’s not perfect, but it’s good enough for most use cases.
The `reference_time` parameter lets you specify a different “now” for testing or when processing historical data.
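If the 30-day approximation is too coarse, months can be subtracted on the calendar instead. The following is a minimal sketch using only the standard library. `subtract_months` is a hypothetical helper, not part of the function above, and clamping to the last day of the target month is just one possible convention.

```python
import calendar
from datetime import datetime

def subtract_months(reference_time, months):
    """Go back a whole number of calendar months (hypothetical helper)."""
    # Count months from year 0 so that crossing a year boundary is plain arithmetic
    total = reference_time.year * 12 + (reference_time.month - 1) - months
    year, month_index = divmod(total, 12)
    month = month_index + 1
    # Clamp the day so that, e.g., March 31 minus one month lands on the
    # last day of February instead of raising ValueError
    last_day = calendar.monthrange(year, month)[1]
    day = min(reference_time.day, last_day)
    return reference_time.replace(year=year, month=month, day=day)

print(subtract_months(datetime(2026, 3, 31), 1))   # 2026-02-28 00:00:00
print(subtract_months(datetime(2026, 1, 15), 2))   # 2025-11-15 00:00:00
```

A helper like this could replace the `timedelta(days=amount * 30)` branch when exact calendar months matter more than simplicity.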
Let’s check it out:
```python
result1 = parse_relative_time("2 hours ago")
result2 = parse_relative_time("3 days ago")
result3 = parse_relative_time("1 week ago")

print(f"2 hours ago: {result1}")
print(f"3 days ago: {result2}")
print(f"1 week ago: {result3}")
```

Output:
```
2 hours ago: 2026-01-06 12:09:34.584107
3 days ago: 2026-01-03 14:09:34.584504
1 week ago: 2025-12-30 14:09:34.584558
```

# 2. Extracting dates from natural language text
Sometimes you need to find dates buried in text: “Meeting is scheduled for January 15, 2026” or “Please respond by March 3”. Instead of manually parsing the entire sentence, you just want to extract the date.
Here is a function that finds and extracts dates from natural language:
```python
import re
from datetime import datetime

def extract_date_from_text(text, current_year=None):
    """
    Extract dates from natural language text.
    Handles formats like:
    - "January 15th, 2024"
    - "March 3rd"
    - "Dec 25th, 2023"
    """
    if current_year is None:
        current_year = datetime.now().year

    # Month names (full and abbreviated)
    months = {
        'january': 1, 'jan': 1,
        'february': 2, 'feb': 2,
        'march': 3, 'mar': 3,
        'april': 4, 'apr': 4,
        'may': 5,
        'june': 6, 'jun': 6,
        'july': 7, 'jul': 7,
        'august': 8, 'aug': 8,
        'september': 9, 'sep': 9, 'sept': 9,
        'october': 10, 'oct': 10,
        'november': 11, 'nov': 11,
        'december': 12, 'dec': 12
    }

    # Pattern: Month Day(st/nd/rd/th), Year (year optional)
    pattern = r'(january|jan|february|feb|march|mar|april|apr|may|june|jun|july|jul|august|aug|september|sep|sept|october|oct|november|nov|december|dec)\s+(\d{1,2})(?:st|nd|rd|th)?(?:,?\s+(\d{4}))?'
    matches = re.findall(pattern, text.lower())

    if not matches:
        return None

    # Take the first match
    month_str, day_str, year_str = matches[0]
    month = months[month_str]
    day = int(day_str)
    year = int(year_str) if year_str else current_year

    return datetime(year, month, day)
```

The function creates a dictionary mapping month names (both full and abbreviated) to their numeric values. The regex pattern matches a month name followed by a day number with an optional ordinal suffix (st, nd, rd, th) and an optional year.
The `(?:...)` syntax creates a non-capturing group. This means we match the pattern but don’t store it separately. This is useful for optional parts such as the ordinal suffix and the comma before the year.
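The difference between capturing and non-capturing groups is easiest to see in what `re.findall` returns. This is a small sketch with a shortened pattern (only “dec” as the month) to keep the output readable; the sample text is illustrative.

```python
import re

text = "dec 25th, 2026"

# Capturing group around the suffix: findall returns it as an extra field
with_capture = re.findall(r'(dec)\s+(\d{1,2})(st|nd|rd|th)?', text)

# Non-capturing group: the suffix is still matched but not returned
without_capture = re.findall(r'(dec)\s+(\d{1,2})(?:st|nd|rd|th)?', text)

print(with_capture)     # [('dec', '25', 'th')]
print(without_capture)  # [('dec', '25')]
```

Keeping the suffix out of the result means the tuple can be unpacked directly into month, day, and year without skipping fields.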
When no year is provided, the function defaults to the current year. This is logical because if someone mentions “March 3” in January, they are usually referring to the upcoming March, not the previous year.
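Note that filling in the current year means “March 3” parsed in June resolves to a date that has already passed. If “the next occurrence” is the behavior you want, one option is to roll forward a year when needed. The sketch below shows the idea; `resolve_missing_year` is a hypothetical helper, and treating a date that falls on today as still upcoming is an assumption.

```python
from datetime import datetime

def resolve_missing_year(month, day, today=None):
    """Return the next occurrence of month/day on or after `today` (hypothetical helper)."""
    if today is None:
        today = datetime.now()
    # Note: February 29 raises ValueError when the chosen year is not a leap year
    candidate = datetime(today.year, month, day)
    # Compare calendar dates only, so a date that falls on today is kept
    if candidate.date() < today.date():
        candidate = datetime(today.year + 1, month, day)
    return candidate

# "March 3" seen in January stays in the same year
print(resolve_missing_year(3, 3, today=datetime(2026, 1, 10)))  # 2026-03-03 00:00:00
# "March 3" seen in June rolls over to the following year
print(resolve_missing_year(3, 3, today=datetime(2026, 6, 1)))   # 2027-03-03 00:00:00
```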
Let’s test it with different text formats:
```python
text1 = "The meeting is scheduled for January 15th, 2026 at 3pm"
text2 = "Please respond by March 3rd"
text3 = "Deadline: Dec 25th, 2026"

date1 = extract_date_from_text(text1)
date2 = extract_date_from_text(text2)
date3 = extract_date_from_text(text3)

print(f"From '{text1}': {date1}")
print(f"From '{text2}': {date2}")
print(f"From '{text3}': {date3}")
```

Output:
```
From 'The meeting is scheduled for January 15th, 2026 at 3pm': 2026-01-15 00:00:00
From 'Please respond by March 3rd': 2026-03-03 00:00:00
From 'Deadline: Dec 25th, 2026': 2026-12-25 00:00:00
```

# 3. Parsing flexible date formats with smart detection
Real-world data comes in many formats. Writing a separate parser for each format is tedious. Instead, let’s create a function that tries multiple formats automatically.
Here’s a smart date parser that handles common formats:
```python
from datetime import datetime

def parse_flexible_date(date_string):
    """
    Parse dates in multiple common formats.
    Tries various formats and returns the first match.
    """
    date_string = date_string.strip()

    # List of common date formats
    formats = [
        '%Y-%m-%d',
        '%Y/%m/%d',
        '%d-%m-%Y',
        '%d/%m/%Y',
        '%m/%d/%Y',
        '%d.%m.%Y',
        '%Y%m%d',
        '%B %d, %Y',
        '%b %d, %Y',
        '%d %B %Y',
        '%d %b %Y',
    ]

    # Try each format
    for fmt in formats:
        try:
            return datetime.strptime(date_string, fmt)
        except ValueError:
            continue

    # If nothing worked, raise an error
    raise ValueError(f"Unable to parse date: {date_string}")
```

This function uses a brute-force approach: it tries each format until one works. `strptime` raises a `ValueError` if the date string doesn’t match the format, so we catch the exception and move on to the next format.
The order of the formats matters. We put the ISO 8601 format (`%Y-%m-%d`) first because it is the most common in technical contexts. Ambiguous formats like `%d/%m/%Y` and `%m/%d/%Y` appear later, and because the day-first variant is listed before the month-first one, a string such as “03/04/2026” is parsed as 3 April rather than March 4. If you know your data consistently uses one convention, reorder the list to prioritize it.
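One way to make that ordering choice explicit is to expose it as a parameter. The following is a minimal sketch restricted to the two slash formats; `parse_slash_date` and its `dayfirst` flag are hypothetical names introduced here for illustration.

```python
from datetime import datetime

def parse_slash_date(date_string, dayfirst=True):
    """Parse DD/MM/YYYY or MM/DD/YYYY, trying the preferred order first."""
    formats = ['%d/%m/%Y', '%m/%d/%Y']
    if not dayfirst:
        formats.reverse()
    for fmt in formats:
        try:
            return datetime.strptime(date_string.strip(), fmt)
        except ValueError:
            continue
    raise ValueError(f"Unable to parse date: {date_string}")

print(parse_slash_date("03/04/2026"))                  # 2026-04-03 00:00:00
print(parse_slash_date("03/04/2026", dayfirst=False))  # 2026-03-04 00:00:00
# Unambiguous input still parses, because the other format is tried as a fallback
print(parse_slash_date("01/15/2026"))                  # 2026-01-15 00:00:00
```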
Let’s test this with different date formats:
```python
# Test different formats
dates = [
    "2026-01-15",
    "15/01/2026",
    "01/15/2026",
    "15.01.2026",
    "20260115",
    "January 15, 2026",
    "15 Jan 2026"
]

for date_str in dates:
    parsed = parse_flexible_date(date_str)
    print(f"{date_str:20} -> {parsed}")
```

Output:
```
2026-01-15           -> 2026-01-15 00:00:00
15/01/2026           -> 2026-01-15 00:00:00
01/15/2026           -> 2026-01-15 00:00:00
15.01.2026           -> 2026-01-15 00:00:00
20260115             -> 2026-01-15 00:00:00
January 15, 2026     -> 2026-01-15 00:00:00
15 Jan 2026          -> 2026-01-15 00:00:00
```

This approach isn’t the most efficient, but it’s simple and handles the majority of date formats you’ll encounter.
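Because the first matching format wins silently, it can help while exploring a new dataset to see every format that accepts a given string. This is a sketch under the assumption that a short candidate list is enough; `CANDIDATE_FORMATS`, `all_interpretations`, and `is_ambiguous` are hypothetical names, not part of the parser above.

```python
from datetime import datetime

# A deliberately short list for illustration
CANDIDATE_FORMATS = ['%Y-%m-%d', '%d/%m/%Y', '%m/%d/%Y', '%d.%m.%Y']

def all_interpretations(date_string, formats=CANDIDATE_FORMATS):
    """Return {format: datetime} for every format that parses the string."""
    results = {}
    for fmt in formats:
        try:
            results[fmt] = datetime.strptime(date_string.strip(), fmt)
        except ValueError:
            pass
    return results

def is_ambiguous(date_string):
    """True when different formats yield different dates."""
    return len(set(all_interpretations(date_string).values())) > 1

print(is_ambiguous("03/04/2026"))  # True: 3 April or March 4
print(is_ambiguous("15/01/2026"))  # False: only day-first is valid
print(is_ambiguous("05/05/2026"))  # False: both readings give the same date
```

Flagging ambiguous rows for review is often safer than guessing, especially when the data mixes sources with different conventions.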
# 4. Parsing time durations
Video players, exercise trackers, and time-tracking apps display durations such as “1h 30m” or “2:45:30”. When parsing user input or scraped data, you need to convert them to `timedelta` objects for calculations.
Here’s a function that parses common duration formats:
```python
from datetime import timedelta
import re

def parse_duration(duration_string):
    """
    Parse duration strings into timedelta objects.
    Handles formats like:
    - "1h 30m 45s"
    - "2:45:30" (H:M:S)
    - "90 minutes"
    - "1.5 hours"
    """
    duration_string = duration_string.strip().lower()

    # Try colon format first (H:M:S or M:S)
    if ':' in duration_string:
        parts = duration_string.split(':')
        if len(parts) == 2:
            # M:S format
            minutes, seconds = map(int, parts)
            return timedelta(minutes=minutes, seconds=seconds)
        elif len(parts) == 3:
            # H:M:S format
            hours, minutes, seconds = map(int, parts)
            return timedelta(hours=hours, minutes=minutes, seconds=seconds)

    # Try unit-based format (1h 30m 45s)
    total_seconds = 0

    # Find hours
    hours_match = re.search(r'(\d+(?:\.\d+)?)\s*h(?:ours?)?', duration_string)
    if hours_match:
        total_seconds += float(hours_match.group(1)) * 3600

    # Find minutes
    minutes_match = re.search(r'(\d+(?:\.\d+)?)\s*m(?:in(?:ute)?s?)?', duration_string)
    if minutes_match:
        total_seconds += float(minutes_match.group(1)) * 60

    # Find seconds
    seconds_match = re.search(r'(\d+(?:\.\d+)?)\s*s(?:ec(?:ond)?s?)?', duration_string)
    if seconds_match:
        total_seconds += float(seconds_match.group(1))

    if total_seconds > 0:
        return timedelta(seconds=total_seconds)

    raise ValueError(f"Unable to parse duration: {duration_string}")
```

The function handles two main formats: colon-separated time and unit-based strings. For the colon format, we split on the colon and interpret the segments as hours, minutes, and seconds (or just minutes and seconds for two-part durations).
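One side effect of passing the parts straight to `timedelta` is that out-of-range values are silently normalized: “1:75:00” becomes 2:15:00. If you would rather reject such input, a stricter variant might look like the sketch below. `parse_clock_duration` is a hypothetical name, and allowing minutes above 59 in the two-part form (so “90:15” means 90 minutes) is an assumption made here.

```python
from datetime import timedelta

def parse_clock_duration(duration_string):
    """Parse "M:S" or "H:M:S", rejecting out-of-range fields (hypothetical helper)."""
    parts = [int(part) for part in duration_string.strip().split(':')]
    if min(parts) < 0:
        raise ValueError(f"Negative field in duration: {duration_string}")
    if len(parts) == 2:
        hours = 0
        minutes, seconds = parts
    elif len(parts) == 3:
        hours, minutes, seconds = parts
        # With an hours field present, minutes must stay below 60
        if minutes >= 60:
            raise ValueError(f"Minutes out of range: {duration_string}")
    else:
        raise ValueError(f"Expected M:S or H:M:S: {duration_string}")
    if seconds >= 60:
        raise ValueError(f"Seconds out of range: {duration_string}")
    return timedelta(hours=hours, minutes=minutes, seconds=seconds)

print(parse_clock_duration("2:45:30"))  # 2:45:30
print(parse_clock_duration("90:15"))    # 1:30:15
```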
For the unit-based format, we use three separate regex patterns to find hours, minutes, and seconds. The pattern `(\d+(?:\.\d+)?)` matches an integer or a decimal such as “1.5”. The pattern `\s*h(?:ours?)?` matches “h”, “hour”, or “hours” with optional leading whitespace.
Each matched value is converted to seconds and added to the total. This approach allows the function to handle partial durations such as “45s” or “2h 15m” without requiring all units to be present.
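These patterns are intentionally loose, which has one consequence worth knowing: the minutes pattern only needs a number followed by “m”, so “3 months” is read as 3 minutes. A negative lookahead is one way to tighten this while still accepting compact input like “2h30m”. The sketch below assumes the input has already been lowercased, as in `parse_duration`; the variable names are illustrative.

```python
import re

NUMBER = r'(\d+(?:\.\d+)?)'

# Minutes pattern in the same style as parse_duration
loose_minutes = re.compile(NUMBER + r'\s*m(?:in(?:ute)?s?)?')

# Stricter variant: the unit must not be followed by another lowercase letter
strict_minutes = re.compile(NUMBER + r'\s*m(?:in(?:ute)?s?)?(?![a-z])')

print(bool(loose_minutes.search("3 months")))        # True (misread as minutes)
print(bool(strict_minutes.search("3 months")))       # False
print(strict_minutes.search("2h30m").group(1))       # 30
print(strict_minutes.search("90 minutes").group(1))  # 90
```

A word boundary (`\b`) would also reject “3 months”, but it would break “2h30m” for the hours pattern, because a digit directly after “h” does not count as a boundary.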
Let’s now test the function with different duration formats:
```python
durations = [
    "1h 30m 45s",
    "2:45:30",
    "90 minutes",
    "1.5 hours",
    "45s",
    "2h 15m"
]

for duration in durations:
    parsed = parse_duration(duration)
    print(f"{duration:15} -> {parsed}")
```

Output:
```
1h 30m 45s      -> 1:30:45
2:45:30         -> 2:45:30
90 minutes      -> 1:30:00
1.5 hours       -> 1:30:00
45s             -> 0:00:45
2h 15m          -> 2:15:00
```

# 5. Parsing ISO week dates
Some systems use ISO week dates instead of regular calendar dates. An ISO week date such as “2026-W03-2” means “Week 3, Day 2 (Tuesday) of 2026”. This format is common in business contexts where planning is weekly.
Here is a function to parse ISO week dates:
```python
from datetime import datetime, timedelta

def parse_iso_week_date(iso_week_string):
    """
    Parse ISO week date format: YYYY-Www-D
    Example: "2024-W03-2" = Week 3 of 2024, Tuesday
    ISO week numbering:
    - Week 1 is the week with the first Thursday of the year
    - Days are numbered 1 (Monday) through 7 (Sunday)
    """
    # Parse the format: YYYY-Www-D
    parts = iso_week_string.split('-')
    if len(parts) != 3 or not parts[1].startswith('W'):
        raise ValueError(f"Invalid ISO week format: {iso_week_string}")

    year = int(parts[0])
    week = int(parts[1][1:])  # Remove 'W' prefix
    day = int(parts[2])

    if not (1 <= week <= 53):
        raise ValueError(f"Week must be between 1 and 53: {week}")
    if not (1 <= day <= 7):
        raise ValueError(f"Day must be between 1 and 7: {day}")

    # Find January 4th (always in week 1)
    jan_4 = datetime(year, 1, 4)

    # Find Monday of week 1
    week_1_monday = jan_4 - timedelta(days=jan_4.weekday())

    # Calculate the target date
    target_date = week_1_monday + timedelta(weeks=week - 1, days=day - 1)
    return target_date
```

ISO week dates follow specific rules. Week 1 is defined as the week that includes the first Thursday of the year. This means week 1 could start in December of the previous year.
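The standard library can confirm this rule for a concrete year. In 2026, January 1 falls on a Thursday, so ISO week 1 of 2026 begins on Monday, December 29, 2025. The short sketch below checks this with `date.isocalendar()`; the variable names are illustrative.

```python
from datetime import date

first_thursday = date(2026, 1, 1)
week_1_monday = date(2025, 12, 29)

# isoweekday() numbers days 1 (Monday) through 7 (Sunday)
print(first_thursday.isoweekday())          # 4

# isocalendar() returns (ISO year, ISO week, ISO weekday)
print(tuple(week_1_monday.isocalendar()))   # (2026, 1, 1)
print(tuple(first_thursday.isocalendar()))  # (2026, 1, 4)
```

Note that the ISO year of December 29, 2025 is 2026, which is why ISO week dates should not be mixed with the calendar year of the same date.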
The function uses a reliable approach: find January 4th (which is always in week 1), then find the Monday of that week. From there, we add the appropriate number of weeks and days to reach the target date.
Calling `jan_4.weekday()` returns 0 for Monday through 6 for Sunday. Subtracting that many days from January 4 gives us the Monday of week 1. Then we add `(week - 1)` weeks and `(day - 1)` days to get the final date.
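For cross-checking the manual arithmetic, recent Python versions offer built-in equivalents: `datetime.fromisocalendar` (Python 3.8+) and the `%G`, `%V`, and `%u` directives in `strptime` (Python 3.6+). The sketch below also illustrates a difference in validation: the manual function accepts week 53 for any year, while 2024 has only 52 ISO weeks, so the built-in rejects it. The variable names are illustrative.

```python
from datetime import datetime

# Built-in equivalent of the manual calculation
built_in = datetime.fromisocalendar(2024, 3, 2)
print(built_in)        # 2024-01-16 00:00:00

# strptime with ISO year (%G), ISO week (%V), and ISO weekday (%u)
via_strptime = datetime.strptime("2024-W03-2", "%G-W%V-%u")
print(via_strptime)    # 2024-01-16 00:00:00

# Week 53 does not exist in 2024, so the built-in raises ValueError
week_53_rejected = False
try:
    datetime.fromisocalendar(2024, 53, 1)
except ValueError:
    week_53_rejected = True
print(week_53_rejected)  # True
```

With the manual function, “2024-W53-1” would quietly resolve to December 30, 2024, which actually belongs to week 1 of 2025, so an extra check may be worthwhile if your input is untrusted.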
Let’s check it out:
```python
# Test ISO week dates
iso_dates = [
    "2024-W01-1",  # Week 1, Monday
    "2024-W03-2",  # Week 3, Tuesday
    "2024-W10-5",  # Week 10, Friday
]

for iso_date in iso_dates:
    parsed = parse_iso_week_date(iso_date)
    print(f"{iso_date} -> {parsed.strftime('%Y-%m-%d (%A)')}")
```

Output:
```
2024-W01-1 -> 2024-01-01 (Monday)
2024-W03-2 -> 2024-01-16 (Tuesday)
2024-W10-5 -> 2024-03-08 (Friday)
```

This format is less common than regular dates, but when you do encounter it, having a parser ready saves significant time.
# Wrapping up
The functions in this article rely on regex patterns, `strptime`, and datetime arithmetic to handle variations in formatting. These techniques transfer to other parsing challenges, and you can adapt the patterns for custom date formats in your projects.
Building your own parsers helps you understand how date parsing works. When you run into non-standard date formats that standard libraries can’t handle, you’ll be ready to write custom solutions.
These functions are especially useful for small scripts, prototypes, and learning projects where adding heavy external dependencies would be unnecessary overhead. Happy coding!
Bala Priya C is a developer and technical writer from India. She loves working at the intersection of mathematics, programming, data science, and content creation. Her areas of interest and expertise include DevOps, data science, and natural language processing. She enjoys reading, writing, coding, and coffee! Currently, she is working on learning and sharing her knowledge with the developer community by authoring tutorials, how-to guides, opinion pieces, and more. Bala also creates engaging resource overviews and coding tutorials.