URL Regex Python Validator


Use the URL Regex Python Validator to accurately test patterns for validating website links in Python. Whether you’re checking for http, https, or complex paths, this tool helps ensure your URLs are clean, correct, and reliable. For more regex testing, explore the Python Regex Tester, Email Regex Python Validator, or IP Address Regex Python Validator.


Regular Expression - Documentation

What is the URL Regex Python Validator?

The URL Regex Python Validator is designed to help you check whether your regular expressions correctly match valid web addresses. This includes checking for:


  • Protocols like http or https

  • Domain names and subdomains

  • Optional ports, paths, query parameters, and fragments


It uses Python’s re module and is ideal for form validation, web crawling, data parsing, and link-checking tasks.


Common URL Regex Patterns

  1. Basic HTTP/HTTPS URL

    ^(http|https):\/\/[a-zA-Z0-9.-]+\.[a-zA-Z]{2,}$

    Matches: http://example.com, https://qodex.ai

    Does not match: example.com, ftp://server.com


  2. Full URL with Optional Paths & Queries

    ^(http|https):\/\/[a-zA-Z0-9.-]+\.[a-zA-Z]{2,}(/[a-zA-Z0-9\-._~:/?#[\]@!$&'()*+,;=]*)?$

    Matches: https://site.com/path?search=value, http://domain.org


  3. With Optional Port

    ^(http|https):\/\/[a-zA-Z0-9.-]+\.[a-zA-Z]{2,}(:\d{2,5})?(\/.*)?$

    Matches: https://api.site.com:443/v1, http://example.com:8000/home (note: http://localhost:8000 does not match this pattern, since the host must contain a dot)
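
A quick sanity check of patterns 1 and 3 above (the names BASIC and WITH_PORT are illustrative, not part of the original patterns), using re.fullmatch so the whole string must conform:

```python
import re

# Pattern 1 (basic HTTP/HTTPS URL) and pattern 3 (with optional port), as listed above.
BASIC = re.compile(r"^(http|https):\/\/[a-zA-Z0-9.-]+\.[a-zA-Z]{2,}$")
WITH_PORT = re.compile(r"^(http|https):\/\/[a-zA-Z0-9.-]+\.[a-zA-Z]{2,}(:\d{2,5})?(\/.*)?$")

print(bool(BASIC.fullmatch("https://qodex.ai")))                 # True
print(bool(BASIC.fullmatch("ftp://server.com")))                 # False
print(bool(WITH_PORT.fullmatch("https://api.site.com:443/v1")))  # True
```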


What Does the Regex Actually Check?

  • HTTP Basic Authentication: Optionally allows user:pass@ credentials before the domain.

  • Domain Structure: Enforces at least one subdomain (e.g., the api in api.example.com), accepts dashes within subdomains, and ensures each subdomain and the top-level domain don't exceed length limits (max 63 characters each, full domain under 253).

  • Top-Level Domain: Only allows alphanumeric characters (no dashes).

  • Localhost Support: Accepts localhost as a valid domain.

  • Port Numbers: Optionally matches ports up to 5 digits (e.g., :8080).

  • IPv4 Addresses: Recognizes standard IPv4 addresses in the netloc.

  • IPv6 Addresses: For IPv6 validation, you’ll want to supplement with a dedicated IPv6 validator, as the regex alone may not cover all edge cases.


Handling Complex and Edge Cases

While the above regex patterns cover most use-cases, URLs in the wild can be tricky—especially with top-level domains like .co.uk or unconventional subdomain structures. If you need a more robust solution that accounts for these "weird cases," consider a regex pattern that also allows for:

  • Optional protocol (e.g., http://, https://, or none)

  • Optional subdomains (like www.)

  • Support for multi-part TLDs (e.g., co.uk)

  • Paths, query strings, and fragments

  • Hyphens in domain names


Example Enhanced Regex (Python-style)

import re

regex = re.compile(
    r"(\w+://)?"                # protocol (optional)
    r"(\w+\.)?"                 # subdomain (optional)
    r"(([\w-]+)\.(\w+))"        # domain
    r"(\.\w+)*"                 # additional TLD parts (optional)
    r"([\w\-\.\_\~/]*)"         # path, query, fragments (optional)
)


Test Cases for Thoroughness

This more flexible approach will match a variety of real-world URLs, such as:

  • http://www.google.com

  • https://google.co.uk

  • google.com/~user/profile

  • www.example.org

  • https://sub.domain.co.uk/path/to/page

  • example.com

  • .google.com (edge case, may require post-processing)

By testing against a broad set of examples—including those with extra dots, missing subdomains, or unusual TLDs—you can ensure your regex is both comprehensive and resilient.
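
As a minimal sketch, here is the enhanced pattern above run against a few of these test cases (using fullmatch so the entire string must conform):

```python
import re

regex = re.compile(
    r"(\w+://)?"                # protocol (optional)
    r"(\w+\.)?"                 # subdomain (optional)
    r"(([\w-]+)\.(\w+))"        # domain
    r"(\.\w+)*"                 # additional TLD parts (optional)
    r"([\w\-\.\_\~/]*)"         # path segments (optional)
)

for url in ["http://www.google.com",
            "google.com/~user/profile",
            "https://sub.domain.co.uk/path/to/page"]:
    print(url, "->", bool(regex.fullmatch(url)))  # each prints True
```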

Feel free to adjust the patterns and test cases as needed to suit the specific requirements of your application. Regex is powerful, but always test thoroughly to avoid surprises!


Advanced Regex for Edge Cases

A more thorough regex can handle authentication, IPv4/IPv6 addresses, localhost, port numbers, and more:

import re
from urllib.parse import urlparse

DOMAIN_FORMAT = re.compile(
    r"(?:^(\w{1,255}):(.{1,255})@|^)"  # http basic authentication [optional]
    r"(?:(?:(?=\S{0,253}(?:$|:))"  # check that the full domain is at most 253 characters
    r"((?:[a-z0-9](?:[a-z0-9-]{0,61}[a-z0-9])?\.)+"  # subdomains (max 63 characters each, dashes allowed inside)
    r"(?:[a-z0-9]{1,63})))"  # top level domain (no dashes)
    r"|localhost)"  # also accept "localhost" on its own
    r"(:\d{1,5})?",  # port [optional]
    re.IGNORECASE
)

SCHEME_FORMAT = re.compile(
    r"^(http|hxxp|ftp|fxp)s?$",  # scheme: http(s) or ftp(s), plus defanged variants
    re.IGNORECASE
)

def validate_url(url: str):
    url = url.strip()
    if not url:
        raise Exception("No URL specified")
    if len(url) > 2048:
        raise Exception(f"URL exceeds its maximum length of 2048 characters (given length={len(url)})")
    result = urlparse(url)
    scheme = result.scheme
    domain = result.netloc
    if not scheme:
        raise Exception("No URL scheme specified")
    if not re.fullmatch(SCHEME_FORMAT, scheme):
        raise Exception(f"URL scheme must either be http(s) or ftp(s) (given scheme={scheme})")
    if not domain:
        raise Exception("No URL domain specified")
    if not re.fullmatch(DOMAIN_FORMAT, domain):
        raise Exception(f"URL domain malformed (domain={domain})")
    return url

This approach splits the URL and validates the scheme and domain separately, handling a wider array of valid URLs (including those with authentication and ports). For even greater accuracy (such as validating IPv6), you might want to add an IPv6 validator.


Alternative: Using Validation Libraries

While regex is great for quick URL checks, Python has some powerful validation libraries that can save you time and headaches, especially when edge cases start popping up.


Using the validators Package

The validators package provides simple functions for validating URLs (and many other types of data like emails and IP addresses). Here’s how you can use it:

import validators
print(validators.url("http://localhost:8000"))  # True 
print(validators.url("ftp://invalid.com"))  # ValidationFailure object (evaluates to False)

For more robust code, consider wrapping this check to always return a boolean:

import validators
from validators import ValidationFailure

def is_string_an_url(url_string: str) -> bool:
    # Always strip whitespace before validating!
    result = validators.url(url_string.strip())
    return result is True  # Only True is valid; ValidationFailure is falsy

Examples

print(is_string_an_url("http://localhost:8000"))  # True 
print(is_string_an_url("http://.www.foo.bar/"))   # False 
print(is_string_an_url("http://localhost:8000 ")) # True (after .strip())

Tip: Always trim leading and trailing spaces before validating URLs, as even a single space will cause most validators—including regex and libraries like these—to reject the input.


Beyond Regex: Defensive Validation

While regex is great for quick URL checks, rigorous validation often means adding more logic. Consider these defensive steps:

  • Trim whitespace before validation—accidental spaces cause most validators to reject otherwise valid URLs.

  • Check for empty input and enforce reasonable length limits (e.g., 2048 characters).

  • Validate scheme: Only allow http, https, or your required protocols.

  • Domain verification: Use regex or libraries to ensure the domain is well-formed.

Here’s an example of thorough validation logic:

import re
import urllib.parse

SCHEME_FORMAT = r"https?|ftps?"
DOMAIN_FORMAT = r"[a-zA-Z0-9.-]+\.[a-zA-Z]{2,}"

def validate_url(url: str):
    url = url.strip()
    if not url:
        raise Exception("No URL specified")
    if len(url) > 2048:
        raise Exception(f"URL exceeds its maximum length of 2048 characters (given length={len(url)})")
    result = urllib.parse.urlparse(url)
    scheme = result.scheme
    domain = result.netloc
    if not scheme:
        raise Exception("No URL scheme specified")
    if not re.fullmatch(SCHEME_FORMAT, scheme):
        raise Exception(f"URL scheme must either be http(s) or ftp(s) (given scheme={scheme})")
    if not domain:
        raise Exception("No URL domain specified")
    if not re.fullmatch(DOMAIN_FORMAT, domain):
        raise Exception(f"URL domain malformed (domain={domain})")
    return url


Other Approaches: RFC-Based Validation

If you want to validate URLs according to official standards, look into tools that implement RFC 3696, which defines recommendations for validating HTTP URLs and email addresses. For instance, you can use parser libraries (such as LEPL, though it’s no longer maintained) that follow these recommendations for higher accuracy in tricky cases.

A typical workflow with a parser library might look like this:

from lepl.apps.rfc3696 import HttpUrl 

validator = HttpUrl()

print(validator('google'))            # False  
print(validator('http://google'))     # False  
print(validator('http://google.com')) # True

While LEPL is archived, the above pattern shows how you might leverage standards-based parsing for edge cases that regex or general-purpose validators can miss. For modern projects, stick with maintained libraries, but it’s helpful to know these standards exist if you ever need to write your own validator or debug why something isn’t matching.


Validation Using Django’s URLValidator

If you’re already using Django, leverage its built-in URL validator for comprehensive checks:

from django.core.validators import URLValidator
from django.core.exceptions import ValidationError

def is_string_an_url(url_string: str) -> bool:
    validate_url = URLValidator()
    try:
        validate_url(url_string.strip())
        return True
    except ValidationError:
        return False

Examples:

print(is_string_an_url("https://example.com"))  # True
print(is_string_an_url("not a url"))            # False

Adding Django just for its URL validation is probably overkill, but if you’re in a Django project already, this is one of the most reliable approaches.


Bonus: Pydantic for Structured Validation

If you’re working with data models or APIs, Pydantic provides another robust way to validate URLs (and more) using Python type hints and schema validation. It’s especially handy when you want validation and structured error handling as part of your model definitions.

from requests import get, HTTPError, ConnectionError
from pydantic import BaseModel, AnyHttpUrl, ValidationError

class MyConfModel(BaseModel):
    URI: AnyHttpUrl

try:
    myAddress = MyConfModel(URI="http://myurl.com/")
    req = get(myAddress.URI, verify=False)
    print(myAddress.URI)
except ValidationError:
    print('Invalid destination')

Pydantic’s AnyHttpUrl will catch invalid URLs and raise a ValidationError. This is useful for ensuring that configuration, user input, or API parameters are valid before making requests or processing data.


Tested Patterns

Pydantic’s built-in validators are quite thorough. For example, the following URLs pass:

  • http://localhost

  • http://localhost:8080

  • http://example.com

  • http://user:password@example.com

  • http://_example.com

But these will fail validation:

  • http://&example.com

  • http://-example.com

If you need structured validation and meaningful error handling—especially in data models—Pydantic is a great addition to your toolkit.


Practical Testing and Edge Cases

Testing matters! Don’t forget to write cases for empty URLs, missing schemes, malformed domains, and subtle variants:

import pytest

def test_empty_url():
    with pytest.raises(Exception, match="No URL specified"):
        validate_url("")

def test_missing_scheme():
    with pytest.raises(Exception, match="No URL scheme specified"):
        validate_url("example.com")

def test_malformed_domain():
    with pytest.raises(Exception, match="URL domain malformed"):
        validate_url("http://.bad_domain")

Testing both the positive and negative cases ensures your validator does exactly what you expect—no more, no less.

Why Trim Whitespace Before URL Validation?

Before validating a URL, it’s essential to remove any leading or trailing spaces from the string. Even an extra space at the start or end—something easy to miss when copying and pasting—will cause most validation methods, including Python’s strict regex patterns, to treat the URL as invalid.

For example, "http://localhost:8000 " (with a trailing space) will fail validation, even though the actual URL is fine. By using Python’s strip() method, you ensure you’re testing the true URL as intended:

url = "http://localhost:8000 "
is_valid = is_string_an_url(url.strip())  # Returns True

Trimming whitespace helps your validations stay reliable, prevents false negatives, and ensures your applications don’t accidentally reject legitimate URLs due to minor copy-paste issues.

With these approaches—regex for quick checks, and libraries for thorough validation—you can confidently handle URL validation in a variety of Python projects.


A More Robust Solution: Comprehensive URL Validation

While the above regex patterns cover most everyday use-cases, URLs in the wild can be quite unpredictable. For bulletproof validation—recognizing everything from localhost to exotic internationalized domains, and robustly excluding invalid edge cases—you may want something more thorough.

Here's a regex pattern that takes into account:

  • Protocols: Supports http, https, ftp, rtsp, rtp, and mmp

  • Authentication: Handles optional user:pass@ credentials

  • IP Addresses: Accepts public IPs, rejects private/local addresses (e.g., 127.0.0.1, 192.168.x.x)

  • Hostnames & International Domains: Supports Unicode characters and punycode

  • Ports: Optional, supports typical port ranges

  • Paths & Queries: Optional, matches any valid path, query string, or fragment

import re

ip_middle_octet = r"(?:\.(?:1?\d{1,2}|2[0-4]\d|25[0-5]))"
ip_last_octet = r"(?:\.(?:[1-9]\d?|1\d\d|2[0-4]\d|25[0-4]))"

URL_PATTERN = re.compile(
    r"^"  # start of string
    r"(?:(?:https?|ftp|rtsp|rtp|mmp)://)"  # protocol
    r"(?:\S+(?::\S*)?@)?"                  # optional user:pass@
    r"("                                   # host/ip group
    r"(?:localhost)"                       # localhost
    r"|(?P<private_ip>"                    # private/local ranges, named so they can be rejected
    r"(?:(?:10|127)" + ip_middle_octet + r"{2}" + ip_last_octet + r")"  # 10.x.x.x, 127.x.x.x
    r"|(?:(?:169\.254|192\.168)" + ip_middle_octet + ip_last_octet + r")"  # 169.254.x.x, 192.168.x.x
    r"|(?:172\.(?:1[6-9]|2\d|3[0-1])" + ip_middle_octet + ip_last_octet + r"))"  # 172.16.x.x - 172.31.x.x
    r"|(?:(?:[1-9]\d?|1\d\d|2[01]\d|22[0-3])" + ip_middle_octet + r"{2}" + ip_last_octet + r")"  # public IPs
    r"|(?:(?:[a-z\u00a1-\uffff0-9_-]-?)*[a-z\u00a1-\uffff0-9_-]+"
    r"(?:\.(?:[a-z\u00a1-\uffff0-9_-]-?)*[a-z\u00a1-\uffff0-9_-]+)*"
    r"(?:\.(?:[a-z\u00a1-\uffff]{2,})))"  # domain names with TLD
    r")"
    r"(?::\d{2,5})?"            # optional port
    r"(?:/\S*)?"                # optional resource path
    r"(?:\?\S*)?"               # optional query
    r"$",
    re.UNICODE | re.IGNORECASE
)

def url_validate(url):
    """URL string validation; private/local IP addresses are rejected."""
    match = URL_PATTERN.match(url)
    if match and match.group("private_ip"):
        return None
    return match

Why use this?
If you're building forms or tools that need to reliably validate user-submitted URLs—including those with edge-case hostnames or public IP addresses—this pattern will catch what simpler regexes might miss. For example, it will recognize http://sub.例子.测试:8080/path?foo=bar and reject a string like http://192.168.1.1, which is a private IP.

Choose the right level of strictness for your needs:

  • For simple checks (e.g., ensuring a URL looks legit), the first few regexes are fast and easy.

  • If you need enterprise-grade validation or want to be sure you’re not letting through malformed or local network URLs, the comprehensive solution above is your friend.


Extending URL Validation for IPv6 Support

To make your URL validation regex compatible with IPv6 addresses, you’ll need to do two things:

  • Integrate a robust IPv6 validator regex (for example, from a trusted library or resource like Markus Jarderot’s pattern).

  • Adjust your URL parsing logic so it can recognize and accept IPv6 notation within URLs. This typically involves allowing square brackets around the IP portion (e.g., http://[2001:db8::1]:8080/).

A sample step in your validation routine could look like this:

  • When parsing the domain or host part of the URL, check if it’s an IPv6 address using your IPv6 validator. If so, ensure it matches the expected bracketed format for URLs.

By adding these enhancements, your validator will be able to handle URLs featuring IPv6 addresses alongside standard domain names or IPv4 addresses.
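
As a sketch of that parsing step, the standard library can handle the IPv6 check without regex: urlparse exposes the bracket-stripped host via .hostname, and the ipaddress module validates the address (the function name here is illustrative):

```python
import ipaddress
from urllib.parse import urlparse

def host_is_valid_ipv6(url: str) -> bool:
    # .hostname strips the square brackets from an IPv6 netloc like [2001:db8::1]:8080
    host = urlparse(url).hostname
    if host is None:
        return False
    try:
        ipaddress.IPv6Address(host)
        return True
    except ValueError:
        return False

print(host_is_valid_ipv6("http://[2001:db8::1]:8080/"))  # True
print(host_is_valid_ipv6("http://example.com/"))         # False
```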


Ensuring Your String Is a Single Valid URL

A common pitfall in URL validation is accidentally matching inputs like http://www.google.com/path,www.yahoo.com/path as a single valid URL, when it's really two URLs separated by a comma. To prevent this and ensure your string is exactly one, clean, valid URL, follow these tips:

  • Anchor the regex: Always use ^ (start) and $ (end) in your pattern. This way, only a string that is a single URL—nothing more, nothing less—will be accepted.

  • Avoid matching delimiters: Do not allow characters such as commas or spaces after (or before) the URL in your regex.

  • No partial matches: Use the fullmatch() method rather than match() or search(). It checks if the whole string matches your pattern—not just a part of it.

Here's how your validation logic should look in Python:

import re

def is_strict_single_url(url):
    # Regex allows http/https, domains, subdomains, and optional paths/queries.
    # The comma is deliberately left out of the path character class so that
    # comma-joined inputs like "url1,url2" fail validation.
    pattern = re.compile(r'^(http|https):\/\/[a-zA-Z0-9.-]+\.[a-zA-Z]{2,}(/[a-zA-Z0-9\-._~:/?#[\]@!$&\'()*+;=]*)?$')
    return bool(pattern.fullmatch(url))

By using pattern.fullmatch(url), any extra commas, whitespace, or multiple URLs in the string will cause validation to fail—ensuring only single, proper URLs get through.
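
To see why fullmatch matters, compare it with search on a comma-joined input (a minimal sketch with a simplified, path-free pattern):

```python
import re

# Simplified, unanchored pattern used only to contrast search vs fullmatch.
pattern = re.compile(r"(http|https)://[a-zA-Z0-9.-]+\.[a-zA-Z]{2,}")

text = "http://www.google.com,extra"
print(bool(pattern.search(text)))     # True  — finds an embedded URL
print(bool(pattern.fullmatch(text)))  # False — the whole string is not one URL
```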


Enforcing a Maximum URL Length

To make sure your URLs don’t sneak past a set maximum length—commonly 2048 characters—you can add a simple length check before validating the rest of the URL. This is useful for keeping your forms and applications safe from overly long or potentially malicious links.

Here’s what you can do:

  • Trim whitespace from the input to avoid counting accidental spaces.

  • Check the length of the URL string.

  • Raise an error or reject the URL if it’s too long.

For example, before running your usual regex or validation logic:

MAX_URL_LENGTH = 2048
url = url.strip()
if len(url) > MAX_URL_LENGTH:
    raise ValueError(f"URL exceeds the maximum length of {MAX_URL_LENGTH} characters (got {len(url)})")
# Proceed with your normal URL validation checks here

This way, you immediately filter out any URLs that overshoot your preferred limit, keeping your processing tight and controlled. In most web and API environments, 2048 characters is a practical upper bound—used by browsers like Chrome and tools such as Postman—so it’s a solid default.


Python Example Code

import re

def is_valid_url(url):
    pattern = re.compile(r'^(http|https):\/\/[a-zA-Z0-9.-]+\.[a-zA-Z]{2,}(/[a-zA-Z0-9\-._~:/?#[\]@!$&\'()*+,;=]*)?$')
    return bool(pattern.fullmatch(url))

# Test URLs
print(is_valid_url("https://qodex.ai"))             # True
print(is_valid_url("http://example.com/path"))      # True
print(is_valid_url("ftp://invalid.com"))            # False

Try variations using the Python Regex Tester.


How to Split and Validate a URL Using urllib.parse and Regex

To thoroughly validate a URL in Python, you'll often need to do more than just match the full string with a single regex. Here's a flexible approach combining Python’s urllib.parse with targeted regular expressions for each URL component:

  1. Break Down the URL:
    Use urllib.parse.urlparse() to split your URL into its core parts:

    • Scheme (http, https)

    • Netloc (domain, subdomain, and optional port)

    • Path, query, fragment, etc.

  2. Validate Each Piece:
    Apply regular expressions to the components that matter most for your use case:

    • Scheme: Ensure it’s http or https.

    • Netloc: Confirm it’s a valid domain name or IP address, and optionally check for a port (e.g., example.com:8080).

    • Path: If needed, add checks for valid characters in the path segment.

  3. IP Address Support:
    If your URLs might contain IP addresses instead of domain names, include regex patterns capable of matching IPv4 addresses. For IPv6 support, use a specialized IPv6 validator—such as Markus Jarderot’s widely regarded regex—for robust parsing.

  4. Example Workflow:

    • Parse the URL:

      from urllib.parse import urlparse
      import re
      
      url = "https://127.0.0.1:5000/home"
      parsed = urlparse(url)
    • Validate scheme:

      if parsed.scheme not in ["http", "https"]:
          raise ValueError("Invalid scheme")  # handle invalid scheme
    • Validate netloc (domain or IP, with optional port):

      domain_pattern = r"^[a-zA-Z0-9.-]+\.[a-zA-Z]{2,}$"
      ipv4_pattern = r"^\d{1,3}(\.\d{1,3}){3}$"
      port_pattern = r":\d{2,5}$"
      
      netloc = parsed.netloc.split(':')[0]  # Extract domain/IP
    • For IPv6, integrate a dedicated validation function to accurately detect and confirm legitimate IPv6 addresses.

This modular technique gives you fine-grained control: you can adapt the regex and logic to your specific form, crawler, or parser requirements. It’s a great way to manage tricky edge cases that simple string-wide regex approaches might miss.
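
Putting the workflow together, a minimal sketch (the pattern names and the validate_parts helper are illustrative, not a definitive implementation):

```python
import re
from urllib.parse import urlparse

DOMAIN = re.compile(r"^[a-zA-Z0-9.-]+\.[a-zA-Z]{2,}$")
IPV4 = re.compile(r"^\d{1,3}(\.\d{1,3}){3}$")

def validate_parts(url: str) -> bool:
    parsed = urlparse(url)
    # 1. Validate the scheme.
    if parsed.scheme not in ("http", "https"):
        return False
    # 2. Validate the netloc: .hostname drops any :port and user:pass@ parts.
    host = parsed.hostname or ""
    return bool(DOMAIN.fullmatch(host) or IPV4.fullmatch(host))

print(validate_parts("https://127.0.0.1:5000/home"))  # True
print(validate_parts("https://example.com/x"))        # True
print(validate_parts("ftp://example.com"))            # False
```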


Detecting Potential URLs Beyond Strict Patterns

Sometimes, you may need to recognize tokens that could be URLs—even if they're not perfectly formed. For instance, you might encounter strings like yandex.ru.com/somepath or www.example.org, which don’t always match the strictest regex but still represent URLs in practice.

To address this, consider checking two things:

  • Does the text start with common URL schemes or prefixes (like http, www, or ftp)?

  • Does it end with a valid public domain suffix?

Here's a practical Python example that fetches an up-to-date list of public domain suffixes and uses them to identify likely URLs:

import requests

def get_domain_suffixes():
    res = requests.get('https://publicsuffix.org/list/public_suffix_list.dat')
    suffixes = set()
    for line in res.text.splitlines():
        if not line.startswith('//'):
            domains = line.split('.')
            cand = domains[-1]
            if cand:
                suffixes.add('.' + cand)
    return tuple(sorted(suffixes))

domain_suffixes = get_domain_suffixes()

def reminds_url(txt: str):
    """
    Returns True if the text looks like a URL.
    Example:
        >>> reminds_url('yandex.ru.com/somepath')
        True
    """
    ltext = txt.lower().split('/')[0]
    return ltext.startswith(('http', 'www', 'ftp')) or ltext.endswith(domain_suffixes)

This approach is especially useful for quick validation or preprocessing, where you want to capture URLs even if they're missing a protocol or have unusual structures.


Handling Python 2 and Python 3 for URL Validation

Python’s urlparse module is a handy way to validate URLs, but the import path changes between Python 2 and Python 3. Here’s how to ensure compatibility and robust URL checking across both versions.


Cross-Version Import

Depending on your environment, you’ll want to handle the import gracefully:

try:
    # For Python 2
    from urlparse import urlparse
except ImportError:
    # For Python 3
    from urllib.parse import urlparse

Example Function for URL Validation

After importing urlparse, you can create a simple validator function:

def is_valid_url(url):
    try:
        result = urlparse(url)
        return all([result.scheme, result.netloc])
    except (AttributeError, TypeError):
        return False

This function checks for both a scheme (like http or https) and a network location, filtering out partial paths, numbers, and malformed strings.

Sample Usage

urls = [
    "http://www.cwi.nl:80/%7Eguido/Python.html",  # Valid
    "/data/Python.html",                          # Invalid (missing scheme)
    532,                                          # Invalid (not a string)
    u"dkakasdkjdjakdjadjfalskdjfalk",             # Invalid (nonsense string)
    "https://qodex.ai"                            # Valid
]

results = [is_valid_url(u) for u in urls]
print(results)  # Output: [True, False, False, False, True]

This approach keeps your validation logic compatible regardless of whether you're running Python 2 or Python 3. And of course, it’s a good companion to using regex for more nuanced rules.


Validating URLs with Pydantic

Another slick option for URL validation in Python is the Pydantic library. While it’s most famous for parsing and validating data for FastAPI and configuration models, Pydantic actually provides a robust set of URL data types out of the box.


Pydantic’s URL Types: A Quick Overview

Pydantic comes with several helpful field types—perfect if you need more specificity than just “any old URL.” For example:

  • AnyUrl: Accepts nearly all valid URLs, including custom schemes.

  • AnyHttpUrl: Restricts to HTTP and HTTPS URLs.

  • HttpUrl: Demands HTTP/HTTPS, includes checks for host and TLD.

  • FileUrl, PostgresDsn, etc.: Specialized for files or specific database connections.

Refer to the Pydantic documentation for a full list of options and scheme support.


How to Use with a Minimal Example

Here’s a typical usage pattern with Pydantic:

from pydantic import BaseModel, AnyHttpUrl, ValidationError

class Config(BaseModel):
    endpoint: AnyHttpUrl  # or choose the URL type you need

try:
    conf = Config(endpoint="http://localhost:8080")
    print(conf.endpoint)  # Will print a validated URL
except ValidationError:
    print("Not a valid HTTP(s) URL")

  • Attempting to create a model with an invalid URL will raise a ValidationError you can catch to handle input errors gracefully.

  • Pydantic also helps clarify why a value is invalid in its error messages.


Limitations and Gotchas

While Pydantic’s validators are thorough, keep in mind:

  • Some schemes (like ftp or database DSNs) require AnyUrl or more specific types (like PostgresDsn).

  • The strictness of validation depends on which field type you pick.

  • Leading/trailing spaces should be trimmed before assignment (Pydantic will usually do this, but don’t rely on it for noisy or poorly sanitized input).


Sample URLs and Outcomes

Here’s a taste of how Pydantic’s AnyHttpUrl responds:

  • "http://localhost" – valid

  • "http://localhost:8080" – valid

  • "http://user:password@example.com" – valid

  • "http://_example.com" – valid (underscore accepted)

  • "http://&example.com" – invalid (symbol not allowed)

  • "http://-example.com" – invalid (hyphen at start is rejected)

For comprehensive URL checks, Pydantic combines convenience with clarity—making your data models safer with minimal effort.


Checking the Latest Public Suffixes for Domain Validation

Sometimes, validating a URL or domain isn’t just about confirming the syntax—especially if you want to ensure your code recognizes valid top-level domains (TLDs) and public suffixes. To stay current with domain extensions (including newer ones like .dev, .app, or .io), you can programmatically retrieve the official public suffix list maintained by Mozilla.

Here’s a simple Python approach that pulls the latest list directly from publicsuffix.org and extracts all recognized domain suffixes:

import requests

def fetch_public_suffixes():
    response = requests.get('https://publicsuffix.org/list/public_suffix_list.dat')
    suffixes = set()
    for line in response.text.splitlines():
        line = line.strip()
        # Skip comments and empty lines
        if line and not line.startswith('//'):
            suffixes.add('.' + line)
    return tuple(sorted(suffixes))

# Fetch the latest suffixes
domain_suffixes = fetch_public_suffixes()
  • What this does:

    • Downloads the current public suffix list.

    • Ignores comments and empty lines in the dataset.

    • Collects each suffix into a tuple for easy lookups.

This technique helps ensure your domain validation logic is aware of every TLD currently recognized by major browsers and libraries—so you’re not blindsided by new suffixes.

Use this in your URL or email checker to make your validations future-proof and standards-compliant.
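Once fetched, the suffix tuple plugs straight into a simple lookup helper. Here is a minimal sketch (the helper name and the hard-coded sample tuple are illustrative; in practice you would pass the result of fetch_public_suffixes()):

```python
def has_known_suffix(domain: str, suffixes) -> bool:
    # Case-insensitive check of a domain against the fetched suffix tuple
    domain = domain.lower().rstrip(".")
    return any(domain.endswith(s) for s in suffixes)

# Hypothetical subset of the fetched list, for illustration only:
sample_suffixes = (".com", ".co.uk", ".dev")
print(has_known_suffix("example.co.uk", sample_suffixes))    # True
print(has_known_suffix("example.internal", sample_suffixes)) # False
```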


When Should You Do a DNS Check?

A quick note for thoroughness: validating a URL's format—whether using regex, the validators package, or Django’s built-in tools—only ensures the string looks like a URL. It doesn’t tell you whether that URL actually exists or leads to a live destination.

That’s where DNS checks come in. If you truly need to confirm that a URL points to a real, resolvable domain (e.g., verifying "https://www.google" isn’t just well-formed, but actually goes somewhere), you’ll need to go a step further by performing a DNS lookup. This process asks, "Does this domain exist on the internet right now?"—something no regex or typical package will answer for you.

DNS checks aren’t always necessary for basic validation tasks like form inputs or static checks. But, if you’re building anything that relies on external connectivity (think: crawlers, link checkers, or automated testing tools), adding a DNS resolution step is a good way to catch invalid or unavailable domains before they cause trouble later in your workflow.
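A minimal sketch of such a DNS check using only the standard library (the helper name is illustrative; it performs a live lookup, so results depend on network access and current DNS state):

```python
import socket
from urllib.parse import urlparse

def domain_resolves(url: str) -> bool:
    # Extract the hostname, then ask the OS resolver whether it exists right now
    hostname = urlparse(url.strip()).hostname
    if not hostname:
        return False
    try:
        socket.gethostbyname(hostname)
        return True
    except socket.error:
        return False
```

Strings without a parseable hostname short-circuit to False, so the helper is safe to call on arbitrary input before spending a network round trip.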


Using the validators Package for URL Validation

If you prefer not to write your own regex, you can easily check if a string is a valid URL by using the popular validators Python package. Here’s a straightforward approach:

import validators

def is_valid_url(url_string):
    result = validators.url(url_string)
    return result is True  # Returns True if valid, False otherwise

# Example usage
print(is_valid_url("http://localhost:8000"))         # True
print(is_valid_url("http://.www.foo.bar/"))          # False
  • The function is_valid_url returns True only when the provided string passes all URL checks performed by the library.

  • Internally, validators.url() returns True when valid, or a ValidationFailure object when not—so this function keeps things simple.

Use this for quick, robust validation without wrangling regex patterns.


Quick tip: Always strip spaces before validation, especially if the URL is coming from user input or copy-paste operations.

This approach is efficient, readable, and saves you from reinventing the wheel when working with URLs in Python.



Using Python Standard Library for URL Validation

Prefer to stick with the standard library? You can use urlparse (available via urllib.parse in Python 3 and urlparse in Python 2) to check whether a string is structured like a URL—without installing any third-party libraries.

Here's a basic approach:

try:
    # Python 2
    from urlparse import urlparse
except ImportError:
    # Python 3
    from urllib.parse import urlparse

def is_valid_url(string):
    try:
        result = urlparse(str(string).strip())
        # Validation: must have scheme (like http/https) and netloc (domain)
        return all([result.scheme, result.netloc])
    except Exception:
        return False

Examples:

print(is_valid_url('http://www.python.org'))        # True
print(is_valid_url('/just/a/path/file.txt'))        # False
print(is_valid_url(12345))                          # False
print(is_valid_url('not a url at all'))             # False
print(is_valid_url('https://github.com'))           # True

Note:
URL parsing checks structure, not whether the URL is actually reachable on the internet. For more extensive validation (including syntax and even DNS lookups), consider using packages like validators or requests. But for basic checks, urlparse fits the bill.


Making Validated URL Objects Act Like Strings

Say you’ve wrapped URL validation inside a custom class—how do you make sure your objects still behave like regular strings throughout your codebase? It’s simple: just ensure your class inherits from str directly, or implements the required string methods. This way, once a URL has passed your checks, you can use it anywhere a string is expected.

For example:

class ReachableURL(str):
    def __new__(cls, url):
        # Validate the URL here...
        # (Assume validation passes for this example)
        return str.__new__(cls, url)

Now, instances of ReachableURL can be used seamlessly—just like ordinary strings:

url_instance = ReachableURL("http://example.com")
print(isinstance(url_instance, str))  # True
print(url_instance.upper())           # HTTP://EXAMPLE.COM

This approach lets you layer extra functionality (like validation or reachability checks) while retaining all the familiar power of Python’s string operations. So, whether you’re concatenating, slicing, or handing off URLs to other libraries, you’ll keep everything as clean and Pythonic as possible.


Using Django’s URLValidator to Check URLs in Python

Django comes with a handy built-in tool for validating URLs: the URLValidator from the django.core.validators module. This validator is designed to determine whether a given string matches the criteria for a valid web address. If you’re already using Django in your project, it’s a convenient and reliable approach for URL validation.

Here’s how you can use it:

  1. Import the Necessary Classes:

    • URLValidator for the actual validation

    • ValidationError to handle invalid cases

  2. Write a Simple Validation Function:

    from django.core.validators import URLValidator
    from django.core.exceptions import ValidationError
    
    def is_valid_url(url: str) -> bool:
        validator = URLValidator()
        try:
            validator(url)
            return True
        except ValidationError:
            return False
    • When you call validator(url), it checks if the url string adheres to standard URL patterns.

    • If the supplied value isn’t a valid URL, it raises a ValidationError. The function returns True for valid URLs and False for invalid ones.

While Django’s URLValidator is powerful, keep in mind that adding Django as a dependency may be unnecessary for lightweight projects. However, for those already using Django, it’s a robust option for all your URL validation needs.


How validators.url Works in Python

If you're looking for a quick, reliable way to check whether a string is a valid URL in Python, the validators package is a handy tool. Its url function makes URL validation straightforward—even for tricky cases.

How It Validates

  • Pass your URL as a string to validators.url().

  • If the input is a valid URL, you'll get True as the result.

  • If the URL is not valid, instead of a simple False, it returns an object called ValidationFailure. While this might feel a bit unexpected, it still makes it easy to know whether your URL passes or fails validation.

Example Usage

Here’s what a typical validation flow might look like:

import validators
from validators import ValidationFailure

def is_string_a_url(candidate: str) -> bool:
    result = validators.url(candidate)
    return False if isinstance(result, ValidationFailure) else result

# Checking results
print(is_string_a_url("http://localhost:8000"))      # Outputs: True
print(is_string_a_url("http://.www.foo.bar/"))       # Outputs: False

This approach ensures you're only working with recognized, well-formed URLs—perfect for situations where data quality matters most.


Enforcing URL Validation with Python Classes and Type Checking

If you want to enforce URL validation at a deeper level across your codebase—not just at input time or form submission—you can use Python’s class inheritance and type checking. By encapsulating validation logic inside custom types, you make it much harder for invalid URLs to sneak into your application logic or data models.


Example: Creating a URL Type with Built-in Validation

You can define a custom string subclass that validates its input every time it’s instantiated. This approach leverages Python’s rich data model and is especially useful if you’re working in larger codebases, or when you want your function signatures and type hints to truly mean “URL—not just any string!”

Here’s a typical pattern using standard library features and popular utilities:

from urllib.parse import urlparse

class URL(str):
    def __new__(cls, value: str):
        result = urlparse(value)
        # Only allow non-empty scheme and netloc (host/address)
        if not (result.scheme and result.netloc):
            raise ValueError(f"Invalid URL: {value!r}")
        return str.__new__(cls, value)

Usage Example:

site = URL("https://wikipedia.org")        # works fine
another = URL("not a url")                 # raises ValueError

Any attempt to create a URL object with an invalid address immediately results in an error, so only valid URLs can be used downstream.


Benefits of Using Custom Types

  • Early Validation: Problems surface instantly at the object creation stage.

  • Type Safety: Your IDE and static analysis tools (e.g., mypy) can help catch mistakes when you annotate with your custom URL type.

  • Cleaner Code: Functions and classes that require URLs can explicitly declare so, boosting readability and reducing runtime surprises.


Extending Functionality

For stricter checks—such as ensuring the URL is reachable, uses HTTPS, or isn’t a localhost address—simply extend your base URL class and add additional validation in __new__.

import socket

class ReachableURL(URL):
    def __new__(cls, value: str):
        instance = super().__new__(cls, value)
        hostname = urlparse(instance).hostname
        if not hostname:
            raise ValueError(f"Invalid URL: {value!r}")
        try:
            socket.gethostbyname(hostname)
        except socket.error:
            raise ValueError(f"Hostname not resolvable: {hostname}")
        return instance

With this approach, your URL handling code stays explicit, self-documenting, and robust—whether you’re writing a web crawler, building APIs with FastAPI or Django, or just aiming for cleaner domain models.


Django’s URL Validator vs. Standalone Packages

If you're considering URL validation for your project, you might wonder whether to rely on Django’s built-in utility or opt for a lightweight standalone package. Here’s a quick breakdown to help you weigh the options:

Advantages of Django’s URL Validator:

  • Comprehensive Checks: Django’s validator is well-tested and supports standard URL patterns. (Old Django versions also offered a verify_exists option to check that a URL resolves, but it was removed in Django 1.4 for security and performance reasons.)

  • Integration: Seamlessly fits within the rest of Django’s validation ecosystem, making it perfect for projects already using Django for forms or models.

  • Community and Documentation: You benefit from a large, active community and thorough documentation—making it easier to troubleshoot or extend.

Drawbacks to Consider:

  • Dependency Bloat: Including Django just for URL validation can be overkill—Django is a robust, full-featured framework, and significantly increases your project’s size and dependencies if you’re not already using it.

  • Complexity: For smaller scripts, microservices, or non-Django projects, a standalone library (such as validators or simple regex-based checks) will keep things lean and more easily maintained.

  • Performance: Extra dependencies sometimes add startup time and potential version conflicts, especially in minimalist environments.

Summary:
If your stack already uses Django, leveraging its URL validator is a solid and hassle-free choice. For lightweight projects or scripts, standalone validation packages or tailored regex rules will keep your footprint minimal and setup easier. Choose based on your project’s needs and existing tech stack!


Why Trim Whitespace Before URL Validation?

Before validating a URL, it’s essential to remove any leading or trailing spaces from the string. Even an extra space at the start or end—something easy to miss when copying and pasting—will cause most validation methods, including Python’s strict regex patterns, to treat the URL as invalid.

For example, "http://localhost:8000 " (with a trailing space) will fail validation, even though the actual URL is fine. By using Python’s strip() method, you ensure you’re testing the true URL as intended:

url = "http://localhost:8000 "
is_valid = is_string_an_url(url.strip())  # Returns True

Trimming whitespace helps your validations stay reliable, prevents false negatives, and ensures your applications don’t accidentally reject legitimate URLs due to minor copy-paste issues.


Use Cases

  • Form Validation: Ensure users submit well-structured URLs in web forms.

  • Data Cleaning: Remove or fix malformed links in large datasets.

  • Crawlers & Scrapers: Verify URLs before crawling or scraping content.

  • Security Filtering: Block suspicious or malformed URLs from being stored or executed.




Categorized Regex Metacharacters 

  • ^ : Matches the start of the string

  • $ : Matches the end of the string

  • . : Matches any character (except newline)

  • + : Matches one or more of the previous token

  • * : Matches zero or more of the previous token

  • ? : Makes the preceding token optional

  • [] : Matches any one character inside the brackets

  • () : Groups patterns

  • | : OR operator

  • \ : Escapes special characters (e.g., \. matches a literal dot)

Pro Tips

  • Always use raw strings (r'') in Python to avoid escaping issues.

  • Add anchors ^ and $ to match the full URL and avoid partial matches.

  • Use non-capturing groups (?:...) for cleaner matching if needed.

  • Test localhost or custom ports using a regex like: localhost:\d{2,5}

  • Combine this validator with IP Address Regex Python Validator for APIs or internal tools.
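A few of these tips in action — a raw string, start/end anchors, and a non-capturing group for the scheme (the pattern itself is just illustrative):

```python
import re

# Raw string, ^...$ anchors, and a non-capturing (?:...) group for the scheme
URL_RE = re.compile(r"^(?:http|https)://[a-zA-Z0-9.-]+\.[a-zA-Z]{2,}$")

print(bool(URL_RE.match("https://example.com")))      # True
print(bool(URL_RE.match("see https://example.com")))  # False: anchored at start
```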


More on Domain and Port Validation

When validating URLs, it's important to remember that the regex should handle both the scheme (like http or https) and the domain (or netloc) parts of the URL. The domain section includes everything up to the first slash /, so port numbers (like :8000) are safely included in this part of the match.


This approach ensures that your validator can match standard domains, custom ports, and even IPv4 addresses.
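Python’s own urlparse makes this split easy to see — the port stays inside the netloc, and the path begins at the first slash:

```python
from urllib.parse import urlparse

# The netloc (domain part) runs up to the first slash, so the port stays inside it
parts = urlparse("http://localhost:8000/home")
print(parts.netloc)  # localhost:8000
print(parts.path)    # /home
```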


Supporting IPv6 Addresses

If you need to validate URLs containing IPv6 addresses, consider enhancing your regex or integrating a specialized IPv6 validator. A comprehensive IPv6 regex will handle the full range of valid address formats, so incorporate a solution like Markus Jarderot's IPv6 validator for best results. Remember to check both the domain and the IP format when validating.
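As a lighter-weight alternative to a full IPv6 regex, the standard library’s ipaddress module can validate the host portion after parsing. A sketch, with an illustrative helper name and sample URLs:

```python
import ipaddress
from urllib.parse import urlparse

def has_valid_ipv6_host(url: str) -> bool:
    # urlparse strips the [ ] brackets around IPv6 literals for us
    host = urlparse(url).hostname
    if host is None:
        return False
    try:
        ipaddress.IPv6Address(host)
        return True
    except ValueError:
        return False

print(has_valid_ipv6_host("http://[2001:db8::1]:8080/path"))  # True
print(has_valid_ipv6_host("http://example.com/"))             # False
```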


Examples in Action

  • IPv4 and Alphanumeric Domains:
    Use a regex that matches standard domains and IPv4 addresses. Regex testing tools can help you test and refine your patterns.

  • IPv6 Support:
    With the right regex, you can capture URLs using IPv6 addresses, ensuring your validation routine is robust for any environment—including internal networks or modern APIs.

By combining these approaches, your URL validation will be flexible enough for everything from localhost development to production-grade, multi-protocol endpoints.


Frequently asked questions

Can I match localhost or internal domains?
Yes. Adjust the regex to allow patterns like localhost or .local.
Does this support query parameters or fragments?
What if I want to validate FTP or other protocols?
Can this match URLs with trailing slashes?
Can I use this in Django or Flask form validation?


URL Regex Python Validator - Documentation

What is the URL Regex Python Validator?

The URL Regex Python Validator is designed to help you check whether your regular expressions correctly match valid web addresses. This includes checking for:


  • Protocols like http or https

  • Domain names and subdomains

  • Optional ports, paths, query parameters, and fragments


It uses Python’s re module and is ideal for form validation, web crawling, data parsing, and link-checking tasks.


Common URL Regex Patterns

  1. Basic HTTP/HTTPS URL

    ^(http|https):\/\/[a-zA-Z0-9.-]+\.[a-zA-Z]{2,}$

    Matches: http://example.com, https://qodex.ai

    Does not match: example.com, ftp://server.com


  2. Full URL with Optional Paths & Queries

    ^(http|https):\/\/[a-zA-Z0-9.-]+\.[a-zA-Z]{2,}(/[a-zA-Z0-9\-._~:/?#[\]@!$&'()*+,;=]*)?$

    Matches: https://site.com/path?search=value, http://domain.org


  3. With Optional Port

    ^(http|https):\/\/[a-zA-Z0-9.-]+\.[a-zA-Z]{2,}(:\d{2,5})?(\/.*)?$

    Matches: http://localhost:8000/home, https://api.site.com:443/v1


What Does the Regex Actually Check?

  • HTTP Basic Authentication: Optionally allows user:password@ credentials before the domain.

  • Domain Structure: Enforces at least one subdomain (e.g., www in www.example.com), accepts dashes within subdomains, and ensures each subdomain and the top-level domain don't exceed length limits (max 63 characters each, full domain under 253).

  • Top-Level Domain: Only allows alphanumeric characters (no dashes).

  • Localhost Support: Accepts localhost as a valid domain.

  • Port Numbers: Optionally matches ports up to 5 digits (e.g., :8080).

  • IPv4 Addresses: Recognizes standard IPv4 addresses in the netloc.

  • IPv6 Addresses: For IPv6 validation, you’ll want to supplement with a dedicated IPv6 validator, as the regex alone may not cover all edge cases.


Handling Complex and Edge Cases

While the above regex patterns cover most use-cases, URLs in the wild can be tricky—especially with top-level domains like .co.uk or unconventional subdomain structures. If you need a more robust solution that accounts for these "weird cases," consider a regex pattern that also allows for:

  • Optional protocol (e.g., http://, https://, or none)

  • Optional subdomains (like www.)

  • Support for multi-part TLDs (e.g., co.uk)

  • Paths, query strings, and fragments

  • Hyphens in domain names


Example Enhanced Regex (Python-style)

import re

regex = re.compile(
    r"(\w+://)?"                # protocol (optional)
    r"(\w+\.)?"                 # subdomain (optional)
    r"(([\w-]+)\.(\w+))"        # domain
    r"(\.\w+)*"                 # additional TLD parts (optional)
    r"([\w\-\.\_\~/]*)"         # path, query, fragments (optional)
)


Test Cases for Thoroughness

This more flexible approach will match a variety of real-world URLs, such as:

  • http://www.google.com

  • https://google.co.uk

  • google.com/~user/profile

  • www.example.org

  • https://sub.domain.co.uk/path/to/page

  • example.com

  • .google.com (edge case, may require post-processing)

By testing against a broad set of examples—including those with extra dots, missing subdomains, or unusual TLDs—you can ensure your regex is both comprehensive and resilient.

Feel free to adjust the patterns and test cases as needed to suit the specific requirements of your application. Regex is powerful, but always test thoroughly to avoid surprises!
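To sanity-check the pattern against a list like this, a quick loop with re.search() works (repeating the same illustrative pattern so the snippet is self-contained):

```python
import re

# Same illustrative pattern as above, used with search() to find URL-like substrings
regex = re.compile(
    r"(\w+://)?(\w+\.)?(([\w-]+)\.(\w+))(\.\w+)*([\w\-\.\_\~/]*)"
)

for candidate in ["http://www.google.com", "google.com/~user/profile", "no dots here"]:
    print(candidate, "->", bool(regex.search(candidate)))
```

Note that search() accepts any string containing a URL-like substring; add ^ and $ anchors (or use fullmatch()) if you need the entire input to be a URL.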


Advanced Regex for Edge Cases

A more thorough regex can handle authentication, IPv4/IPv6 addresses, localhost, port numbers, and more:

import re

DOMAIN_FORMAT = re.compile(
    r"(?:^(\w{1,255}):(.{1,255})@|^)"  # http basic authentication [optional]
    r"(?:(?:(?=\S{0,253}(?:$|:))"  # check full domain length is 253 or less (starting after basic auth, stopping before port)
    r"((?:(?:[a-z0-9][a-z0-9-]{0,61})?[a-z0-9]\.)+"  # subdomains
    r"(?:[a-z0-9]{1,63})))"  # top level domain
    r"|localhost)"  # accept "localhost" as well
    r"(:\d{1,5})?",  # port [optional]
    re.IGNORECASE
)

SCHEME_FORMAT = re.compile(
    r"^(http|hxxp|ftp|fxp)s?$",  # scheme: http(s) or ftp(s)
    re.IGNORECASE
)

from urllib.parse import urlparse

def validate_url(url: str):
    url = url.strip()
    if not url:
        raise Exception("No URL specified")
    if len(url) > 2048:
        raise Exception(f"URL exceeds its maximum length of 2048 characters (given length={len(url)})")
    result = urlparse(url)
    scheme = result.scheme
    domain = result.netloc
    if not scheme:
        raise Exception("No URL scheme specified")
    if not re.fullmatch(SCHEME_FORMAT, scheme):
        raise Exception(f"URL scheme must either be http(s) or ftp(s) (given scheme={scheme})")
    if not domain:
        raise Exception("No URL domain specified")
    if not re.fullmatch(DOMAIN_FORMAT, domain):
        raise Exception(f"URL domain malformed (domain={domain})")
    return url

This approach splits the URL and validates the scheme and domain separately, handling a wider array of valid URLs (including those with authentication and ports). For even greater accuracy (such as validating IPv6), you might want to add an IPv6 validator.


Alternative: Using Validation Libraries

While regex is great for quick URL checks, Python has some powerful validation libraries that can save you time and headaches, especially when edge cases start popping up.


Using the validators Package

The validators package provides simple functions for validating URLs (and many other types of data like emails and IP addresses). Here’s how you can use it:

import validators
print(validators.url("http://localhost:8000"))  # True 
print(validators.url("ftp://invalid.com"))  # ValidationFailure object (evaluates to False)

For more robust code, consider wrapping this check to always return a boolean:

import validators
from validators import ValidationFailure

def is_string_an_url(url_string: str) -> bool:
    # Always strip whitespace before validating!
    result = validators.url(url_string.strip())
    return result is True  # Only True is valid; ValidationFailure is falsy

Examples

print(is_string_an_url("http://localhost:8000"))  # True 
print(is_string_an_url("http://.www.foo.bar/"))   # False 
print(is_string_an_url("http://localhost:8000 ")) # True (after .strip())

Tip: Always trim leading and trailing spaces before validating URLs, as even a single space will cause most validators—including regex and libraries like these—to reject the input.


Even More Powerful: Pydantic and Django

If you’re using frameworks like Pydantic or Django, you get validation utilities that handle a lot of this for you:


Validation Using Django’s URLValidator

If you’re already using Django, leverage its built-in URL validator for comprehensive checks:

from django.core.validators import URLValidator
from django.core.exceptions import ValidationError

def is_string_an_url(url_string: str) -> bool:
    validate_url = URLValidator()
    try:
        validate_url(url_string.strip())
        return True
    except ValidationError:
        return False

Examples

print(is_string_an_url("https://example.com"))  # True 
print(is_string_an_url("not a url"))            # False

Adding Django just for its URL validation is probably overkill, but if you’re in a Django project already, this is one of the most reliable approaches.


Validation Using Pydantic

Pydantic offers types like AnyHttpUrl for strict URL validation:

from pydantic import BaseModel, AnyHttpUrl, ValidationError

class MyConfModel(BaseModel):
    URI: AnyHttpUrl

try:
    myAddress = MyConfModel(URI="http://myurl.com/")
    print(myAddress.URI)
except ValidationError:
    print('Invalid destination')

This approach raises exceptions for invalid URLs and supports a variety of URL types.

With these approaches—regex for quick checks, and libraries for thorough validation—you can confidently handle URL validation in a variety of Python projects. Whether you want something lightweight for a script or full-featured for an enterprise app, Python’s got you covered.


Beyond Regex: Defensive Validation

While regex is great for quick URL checks, rigorous validation often means adding more logic. Consider these defensive steps:

  • Trim whitespace before validation—accidental spaces cause most validators to reject otherwise valid URLs.

  • Check for empty input and enforce reasonable length limits (e.g., 2048 characters).

  • Validate scheme: Only allow http, https, or your required protocols.

  • Domain verification: Use regex or libraries to ensure the domain is well-formed.

Here’s an example of thorough validation logic:

import re
import urllib.parse

SCHEME_FORMAT = r"https?|ftp"
DOMAIN_FORMAT = r"[a-zA-Z0-9.-]+\.[a-zA-Z]{2,}"

def validate_url(url: str):
    url = url.strip()
    if not url:
        raise Exception("No URL specified")
    if len(url) > 2048:
        raise Exception(f"URL exceeds its maximum length of 2048 characters (given length={len(url)})")
    result = urllib.parse.urlparse(url)
    scheme = result.scheme
    domain = result.netloc
    if not scheme:
        raise Exception("No URL scheme specified")
    if not re.fullmatch(SCHEME_FORMAT, scheme):
        raise Exception(f"URL scheme must either be http(s) or ftp(s) (given scheme={scheme})")
    if not domain:
        raise Exception("No URL domain specified")
    if not re.fullmatch(DOMAIN_FORMAT, domain):
        raise Exception(f"URL domain malformed (domain={domain})")
    return url




Other Approaches: RFC-Based Validation

If you want to validate URLs according to official standards, look into tools that implement RFC 3696, which defines recommendations for validating HTTP URLs and email addresses. For instance, you can use parser libraries (such as LEPL, though it’s no longer maintained) that follow these recommendations for higher accuracy in tricky cases.

A typical workflow with a parser library might look like this:

from lepl.apps.rfc3696 import HttpUrl 

validator = HttpUrl()

print(validator('google'))            # False  
print(validator('http://google'))     # False  
print(validator('http://google.com')) # True

While LEPL is archived, the above pattern shows how you might leverage standards-based parsing for edge cases that regex or general-purpose validators can miss. For modern projects, stick with maintained libraries, but it’s helpful to know these standards exist if you ever need to write your own validator or debug why something isn’t matching.




Bonus: Pydantic for Structured Validation

If you’re working with data models or APIs, Pydantic provides another robust way to validate URLs (and more) using Python type hints and schema validation. It’s especially handy when you want validation and structured error handling as part of your model definitions.

from requests import get, HTTPError, ConnectionError
from pydantic import BaseModel, AnyHttpUrl, ValidationError

class MyConfModel(BaseModel):
    URI: AnyHttpUrl

try:
    myAddress = MyConfModel(URI="http://myurl.com/")
    req = get(myAddress.URI, verify=False)
    print(myAddress.URI)
except ValidationError:
    print('Invalid destination')
except (HTTPError, ConnectionError):
    print('Valid URL, but the destination could not be reached')

Pydantic’s AnyHttpUrl will catch invalid URLs and raise a ValidationError. This is useful for ensuring that configuration, user input, or API parameters are valid before making requests or processing data.


Tested Patterns

Pydantic’s built-in validators are quite thorough. For example, the following URLs pass:

  • http://localhost

  • http://localhost:8080

  • http://example.com

  • http://user:password@example.com

  • http://_example.com

But these will fail validation:

  • http://&example.com

  • http://-example.com

If you need structured validation and meaningful error handling—especially in data models—Pydantic is a great addition to your toolkit.


Practical Testing and Edge Cases

Testing matters! Don’t forget to write cases for empty URLs, missing schemes, malformed domains, and subtle variants:

import pytest

def test_empty_url():
    with pytest.raises(Exception, match="No URL specified"):
        validate_url("")

def test_missing_scheme():
    with pytest.raises(Exception, match="No URL scheme specified"):
        validate_url("example.com")

def test_malformed_domain():
    with pytest.raises(Exception, match="URL domain malformed"):
        validate_url("http://.bad_domain")

Testing both the positive and negative cases ensures your validator does exactly what you expect—no more, no less.
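These tests presuppose a validate_url helper that raises with matching messages. As a rough, stdlib-only sketch of what such a helper could look like (the exact rules are yours to define):

```python
from urllib.parse import urlparse

def validate_url(url: str) -> None:
    """Raise ValueError with a descriptive message when the URL is invalid."""
    if not url:
        raise ValueError("No URL specified")
    parsed = urlparse(url)
    if not parsed.scheme:
        raise ValueError("No URL scheme specified")
    host = parsed.hostname or ""
    # Reject empty hosts and hosts with a leading or trailing dot
    if not host or host.startswith(".") or host.endswith("."):
        raise ValueError("URL domain malformed")
```

Against this sketch, the three failing cases above raise the expected messages, and pytest.raises catches them since ValueError is a subclass of Exception.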

Why Trim Whitespace Before URL Validation?

Before validating a URL, it’s essential to remove any leading or trailing spaces from the string. Even an extra space at the start or end—something easy to miss when copying and pasting—will cause most validation methods, including Python’s strict regex patterns, to treat the URL as invalid.

For example, "http://localhost:8000 " (with a trailing space) will fail validation, even though the actual URL is fine. By using Python’s strip() method, you ensure you’re testing the true URL as intended:

url = "http://localhost:8000 "
is_valid = is_string_an_url(url.strip())  # Returns True

Trimming whitespace helps your validations stay reliable, prevents false negatives, and ensures your applications don’t accidentally reject legitimate URLs due to minor copy-paste issues.

With these approaches—regex for quick checks, and libraries for thorough validation—you can confidently handle URL validation in a variety of Python projects.


A More Robust Solution: Comprehensive URL Validation

While the above regex patterns cover most everyday use-cases, URLs in the wild can be quite unpredictable. For bulletproof validation—recognizing everything from localhost to exotic internationalized domains, and robustly excluding invalid edge cases—you may want something more thorough.

Here's a regex pattern that takes into account:

  • Protocols: Supports http, https, ftp, rtsp, rtp, and mmp

  • Authentication: Handles optional user:pass@ credentials

  • IP Addresses: Accepts public IPs, rejects private/local addresses (e.g., 127.0.0.1, 192.168.x.x)

  • Hostnames & International Domains: Supports Unicode characters and punycode

  • Ports: Optional, supports typical port ranges

  • Paths & Queries: Optional, matches any valid path, query string, or fragment

import re

ip_middle_octet = r"(?:\.(?:1?\d{1,2}|2[0-4]\d|25[0-5]))"
ip_last_octet = r"(?:\.(?:[1-9]\d?|1\d\d|2[0-4]\d|25[0-4]))"

URL_PATTERN = re.compile(
    r"^"  # start of string
    r"(?:(?:https?|ftp|rtsp|rtp|mmp)://)"  # protocol
    r"(?:\S+(?::\S*)?@)?"                  # optional user:pass@
    r"("                                   # host/ip group
        r"(?:localhost)|"                  # localhost
        r"(?!(?:10|127)" + ip_middle_octet + r"{2}" + ip_last_octet + r")"         # exclude 10.x.x.x, 127.x.x.x
        r"(?!(?:169\.254|192\.168)" + ip_middle_octet + ip_last_octet + r")"       # exclude 169.254.x.x, 192.168.x.x
        r"(?!172\.(?:1[6-9]|2\d|3[0-1])" + ip_middle_octet + ip_last_octet + r")"  # exclude 172.16.x.x - 172.31.x.x
        r"(?:(?:[1-9]\d?|1\d\d|2[01]\d|22[0-3])" + ip_middle_octet + r"{2}" + ip_last_octet + r")|"  # public IPs
        r"(?:(?:[a-z\u00a1-\uffff0-9_-]-?)*[a-z\u00a1-\uffff0-9_-]+"
        r"(?:\.(?:[a-z\u00a1-\uffff0-9_-]-?)*[a-z\u00a1-\uffff0-9_-]+)*"
        r"(?:\.(?:[a-z\u00a1-\uffff]{2,})))"  # domain names with TLD
    r")"
    r"(?::\d{2,5})?"            # optional port
    r"(?:/\S*)?"                # optional resource path
    r"(?:\?\S*)?"               # optional query
    r"$",
    re.UNICODE | re.IGNORECASE
)

def url_validate(url):
    """URL string validation"""
    return URL_PATTERN.match(url)

Why use this?
If you're building forms or tools that need to reliably validate user-submitted URLs—including those with edge-case hostnames or public IP addresses—this pattern will catch what simpler regexes might miss. For example, it will recognize http://sub.例子.测试:8080/path?foo=bar and reject a string like http://192.168.1.1, which is a private IP.

Choose the right level of strictness for your needs:

  • For simple checks (e.g., ensuring a URL looks legit), the first few regexes are fast and easy.

  • If you need enterprise-grade validation or want to be sure you’re not letting through malformed or local network URLs, the comprehensive solution above is your friend.


Extending URL Validation for IPv6 Support

To make your URL validation regex compatible with IPv6 addresses, you’ll need to do two things:

  • Integrate a robust IPv6 validator regex (for example, from a trusted library or resource like Markus Jarderot’s pattern).

  • Adjust your URL parsing logic so it can recognize and accept IPv6 notation within URLs. This typically involves allowing square brackets around the IP portion (e.g., http://[2001:db8::1]:8080/).

A sample step in your validation routine could look like this:

  • When parsing the domain or host part of the URL, check if it’s an IPv6 address using your IPv6 validator. If so, ensure it matches the expected bracketed format for URLs.

By adding these enhancements, your validator will be able to handle URLs featuring IPv6 addresses alongside standard domain names or IPv4 addresses.
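As a sketch of that routine — using the standard library's ipaddress module in place of a hand-rolled IPv6 regex (the function name is ours):

```python
import ipaddress
from urllib.parse import urlparse

def host_is_valid(url: str) -> bool:
    """Accept bracketed IPv6 literals, IPv4 literals, or plain hostnames."""
    host = urlparse(url).hostname  # urlparse strips the [] around an IPv6 host
    if not host:
        return False
    try:
        ipaddress.ip_address(host)  # parses any valid IPv4 or IPv6 address
        return True
    except ValueError:
        # Not an IP literal: fall back to a crude hostname check
        return host == "localhost" or "." in host
```

Because urlparse already handles the bracketed notation, the same function covers http://[2001:db8::1]:8080/ and ordinary domain URLs alike.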


Ensuring Your String Is a Single Valid URL

A common pitfall in URL validation is accidentally matching inputs like http://www.google.com/path,www.yahoo.com/path as a single valid URL, when it's really two URLs separated by a comma. To prevent this and ensure your string is exactly one, clean, valid URL, follow these tips:

  • Anchor the regex: Always use ^ (start) and $ (end) in your pattern. This way, only a string that is a single URL—nothing more, nothing less—will be accepted.

  • Avoid matching delimiters: Do not allow characters such as commas or spaces after (or before) the URL in your regex.

  • No partial matches: Use the fullmatch() method rather than match() or search(). It checks if the whole string matches your pattern—not just a part of it.

Here's how your validation logic should look in Python:

import re

def is_strict_single_url(url):
    # Regex allows http/https, domains, subdomains, and optional paths/queries
    pattern = re.compile(r'^(http|https):\/\/[a-zA-Z0-9.-]+\.[a-zA-Z]{2,}(/[a-zA-Z0-9\-._~:/?#[\]@!$&\'()*+,;=]*)?$')
    return bool(pattern.fullmatch(url))

By using pattern.fullmatch(url), any extra commas, whitespace, or multiple URLs in the string will cause validation to fail—ensuring only single, proper URLs get through.
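To see the difference concretely, here is a toy pattern (deliberately simplified, not a full URL validator):

```python
import re

pattern = re.compile(r"https?://\S+")
s = "http://www.google.com/path,www.yahoo.com/path and trailing text"

print(bool(pattern.match(s)))      # True  — match() only anchors at the start
print(bool(pattern.fullmatch(s)))  # False — fullmatch() requires the whole string
```

match() happily matches the leading portion and ignores the rest, while fullmatch() rejects the string because of the trailing text.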


Enforcing a Maximum URL Length

To make sure your URLs don’t sneak past a set maximum length—commonly 2048 characters—you can add a simple length check before validating the rest of the URL. This is useful for keeping your forms and applications safe from overly long or potentially malicious links.

Here’s what you can do:

  • Trim whitespace from the input to avoid counting accidental spaces.

  • Check the length of the URL string.

  • Raise an error or reject the URL if it’s too long.

For example, before running your usual regex or validation logic:

MAX_URL_LENGTH = 2048
url = url.strip()
if len(url) > MAX_URL_LENGTH:
    raise ValueError(f"URL exceeds the maximum length of {MAX_URL_LENGTH} characters (got {len(url)})")
# Proceed with your normal URL validation checks here

This way, you immediately filter out any URLs that overshoot your preferred limit, keeping your processing tight and controlled. In most web and API environments, 2048 characters is a practical upper bound—used by browsers like Chrome and tools such as Postman—so it’s a solid default.


Python Example Code

import re

def is_valid_url(url):
    pattern = re.compile(r'^(http|https):\/\/[a-zA-Z0-9.-]+\.[a-zA-Z]{2,}(/[a-zA-Z0-9\-._~:/?#[\]@!$&\'()*+,;=]*)?$')
    return bool(pattern.fullmatch(url))

# Test URLs
print(is_valid_url("https://qodex.ai"))             # True
print(is_valid_url("http://example.com/path"))      # True
print(is_valid_url("ftp://invalid.com"))            # False

Try variations using the Python Regex Tester.


How to Split and Validate a URL Using urllib.parse and Regex

To thoroughly validate a URL in Python, you'll often need to do more than just match the full string with a single regex. Here's a flexible approach combining Python’s urllib.parse with targeted regular expressions for each URL component:

  1. Break Down the URL:
    Use urllib.parse.urlparse() to split your URL into its core parts:

    • Scheme (http, https)

    • Netloc (domain, subdomain, and optional port)

    • Path, query, fragment, etc.

  2. Validate Each Piece:
    Apply regular expressions to the components that matter most for your use case:

    • Scheme: Ensure it’s http or https.

    • Netloc: Confirm it’s a valid domain name or IP address, and optionally check for a port (e.g., example.com:8080).

    • Path: If needed, add checks for valid characters in the path segment.

  3. IP Address Support:
    If your URLs might contain IP addresses instead of domain names, include regex patterns capable of matching IPv4 addresses. For IPv6 support, use a specialized IPv6 validator—such as Markus Jarderot’s widely regarded regex—for robust parsing.

  4. Example Workflow:

    • Parse the URL:

      from urllib.parse import urlparse
      import re
      
      url = "https://127.0.0.1:5000/home"
      parsed = urlparse(url)
    • Validate scheme:

      if parsed.scheme not in ["http", "https"]:
          raise ValueError("Invalid URL scheme")
    • Validate netloc (domain or IP, with optional port):

      domain_pattern = r"^[a-zA-Z0-9.-]+\.[a-zA-Z]{2,}$"
      ipv4_pattern = r"^\d{1,3}(\.\d{1,3}){3}$"
      port_pattern = r":\d{2,5}$"  # optional: check with re.search(port_pattern, parsed.netloc)
      
      netloc = parsed.netloc.split(':')[0]  # strip the optional :port to get the domain/IP
      if not (re.fullmatch(domain_pattern, netloc) or re.fullmatch(ipv4_pattern, netloc)):
          raise ValueError("Invalid host")
    • For IPv6, integrate a dedicated validation function to accurately detect and confirm legitimate IPv6 addresses.

This modular technique gives you fine-grained control: you can adapt the regex and logic to your specific form, crawler, or parser requirements. It’s a great way to manage tricky edge cases that simple string-wide regex approaches might miss.
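One way to tie these steps together — the pattern names and rules here are illustrative, not canonical:

```python
import re
from urllib.parse import urlparse

DOMAIN_RE = re.compile(r"^[a-zA-Z0-9.-]+\.[a-zA-Z]{2,}$")
IPV4_RE = re.compile(r"^\d{1,3}(\.\d{1,3}){3}$")

def validate_url_parts(url: str) -> bool:
    """Validate scheme, host, and port separately via urlparse + regex."""
    parsed = urlparse(url)
    if parsed.scheme not in ("http", "https"):
        return False
    try:
        port = parsed.port  # raises ValueError for a malformed port
    except ValueError:
        return False
    if port is not None and not (1 <= port <= 65535):
        return False
    host = parsed.hostname or ""
    return bool(DOMAIN_RE.fullmatch(host) or IPV4_RE.fullmatch(host))
```

Each component check is independent, so you can tighten or relax one part (say, the port range) without rewriting the rest.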


Detecting Potential URLs Beyond Strict Patterns

Sometimes, you may need to recognize tokens that could be URLs—even if they're not perfectly formed. For instance, you might encounter strings like yandex.ru.com/somepath or www.example.org, which don’t always match the strictest regex but still represent URLs in practice.

To address this, consider checking two things:

  • Does the text start with common URL schemes or prefixes (like http, www, or ftp)?

  • Does it end with a valid public domain suffix?

Here's a practical Python example that fetches an up-to-date list of public domain suffixes and uses them to identify likely URLs:

import requests

def get_domain_suffixes():
    res = requests.get('https://publicsuffix.org/list/public_suffix_list.dat')
    suffixes = set()
    for line in res.text.splitlines():
        line = line.strip()
        if line and not line.startswith('//'):
            cand = line.split('.')[-1]
            if cand:
                suffixes.add('.' + cand)
    return tuple(sorted(suffixes))

domain_suffixes = get_domain_suffixes()

def reminds_url(txt: str):
    """
    Returns True if the text looks like a URL.
    Example:
        >>> reminds_url('yandex.ru.com/somepath')
        True
    """
    ltext = txt.lower().split('/')[0]
    return ltext.startswith(('http', 'www', 'ftp')) or ltext.endswith(domain_suffixes)

This approach is especially useful for quick validation or preprocessing, where you want to capture URLs even if they're missing a protocol or have unusual structures.


Handling Python 2 and Python 3 for URL Validation

Python’s urlparse module is a handy way to validate URLs, but the import path changes between Python 2 and Python 3. Here’s how to ensure compatibility and robust URL checking across both versions.


Cross-Version Import

Depending on your environment, you’ll want to handle the import gracefully:

try:
    # For Python 2
    from urlparse import urlparse
except ImportError:
    # For Python 3
    from urllib.parse import urlparse

Example Function for URL Validation

After importing urlparse, you can create a simple validator function:

def is_valid_url(url):
    try:
        result = urlparse(url)
        return all([result.scheme, result.netloc])
    except (AttributeError, TypeError):
        return False

This function checks for both a scheme (like http or https) and a network location, filtering out partial paths, numbers, and malformed strings.

Sample Usage

urls = [
    "http://www.cwi.nl:80/%7Eguido/Python.html",  # Valid
    "/data/Python.html",                          # Invalid (missing scheme)
    532,                                          # Invalid (not a string)
    u"dkakasdkjdjakdjadjfalskdjfalk",             # Invalid (nonsense string)
    "https://qodex.ai"                            # Valid
]

results = [is_valid_url(u) for u in urls]
print(results)  # Output: [True, False, False, False, True]

This approach keeps your validation logic compatible regardless of whether you're running Python 2 or Python 3. And of course, it’s a good companion to using regex for more nuanced rules.


Alternative: Using Validation Libraries

While regex is great for quick URL checks, Python has some powerful validation libraries that can save you time and headaches, especially when edge cases start popping up.


Using the validators Package

The validators package provides simple functions for validating URLs (and many other types of data like emails and IP addresses). Here’s how you can use it:

import validators
print(validators.url("http://localhost:8000")) # True 
print(validators.url("http://.www.foo.bar/")) # ValidationFailure object (evaluates to False)

For more robust code, consider wrapping this check to always return a boolean:

import validators
from validators import ValidationFailure

def is_string_an_url(url_string: str) -> bool:
    # Always strip whitespace before validating!
    result = validators.url(url_string.strip())
    return result is True  # Only True is valid; ValidationFailure is falsy

Examples

print(is_string_an_url("http://localhost:8000")) # True 
print(is_string_an_url("http://.www.foo.bar/")) # False 
print(is_string_an_url("http://localhost:8000 ")) # True (after .strip())

Tip: Always trim leading and trailing spaces before validating URLs, as even a single space will cause most validators—including regex and libraries like validators—to reject the input.


Validation Using Django’s URLValidator

If you’re already using Django, leverage its built-in URL validator for comprehensive checks:

from django.core.validators import URLValidator
from django.core.exceptions import ValidationError

def is_string_an_url(url_string: str) -> bool:
    validate_url = URLValidator()
    try:
        validate_url(url_string.strip())
        return True
    except ValidationError:
        return False

Examples

print(is_string_an_url("https://example.com")) # True 
print(is_string_an_url("not a url")) # False

Adding Django just for its URL validation is probably overkill, but if you’re in a Django project already, this is one of the most reliable approaches.



Validating URLs with Pydantic

Another slick option for URL validation in Python is the Pydantic library. While it’s most famous for parsing and validating data for FastAPI and configuration models, Pydantic actually provides a robust set of URL data types out of the box.


Pydantic’s URL Types: A Quick Overview

Pydantic comes with several helpful field types—perfect if you need more specificity than just “any old URL.” For example:

  • AnyUrl: Accepts nearly all valid URLs, including custom schemes.

  • AnyHttpUrl: Restricts to HTTP and HTTPS URLs.

  • HttpUrl: Demands HTTP/HTTPS, includes checks for host and TLD.

  • FileUrl, PostgresDsn, etc.: Specialized for files or specific database connections.

Refer to the documentation for a full list of options and scheme support.


How to Use with a Minimal Example

Here’s a typical usage pattern with Pydantic:

from pydantic import BaseModel, AnyHttpUrl, ValidationError

class Config(BaseModel):
    endpoint: AnyHttpUrl  # or choose the URL type you need

try:
    conf = Config(endpoint="http://localhost:8080")
    print(conf.endpoint)  # Will print a validated URL
except ValidationError:
    print("Not a valid HTTP(s) URL")

  • Attempting to create a model with an invalid URL will raise a ValidationError you can catch to handle input errors gracefully.

  • Pydantic also helps clarify why a value is invalid in its error messages.


Limitations and Gotchas

While Pydantic’s validators are thorough, keep in mind:

  • Some schemes (like ftp or database DSNs) require AnyUrl or more specific types (like PostgresDsn).

  • The strictness of validation depends on which field type you pick.

  • Leading/trailing spaces should be trimmed before assignment (Pydantic will usually do this, but don’t rely on it for noisy or poorly sanitized input).


Sample URLs and Outcomes

Here’s a taste of how Pydantic’s AnyHttpUrl responds:

  • "http://localhost" – valid

  • "http://localhost:8080" – valid

  • "http://user:password@example.com" – valid

  • "http://_example.com" – valid (underscore accepted)

  • "http://&example.com" – invalid (symbol not allowed)

  • "http://-example.com" – invalid (hyphen at start is rejected)

For comprehensive URL checks, Pydantic combines convenience with clarity—making your data models safer with minimal effort.


Checking the Latest Public Suffixes for Domain Validation

Sometimes, validating a URL or domain isn’t just about confirming the syntax—especially if you want to ensure your code recognizes valid top-level domains (TLDs) and public suffixes. To stay current with domain extensions (including newer ones like .dev, .app, or .io), you can programmatically retrieve the official public suffix list maintained by Mozilla.

Here’s a simple Python approach that pulls the latest list directly from publicsuffix.org and extracts all recognized domain suffixes:

import requests

def fetch_public_suffixes():
    response = requests.get('https://publicsuffix.org/list/public_suffix_list.dat')
    suffixes = set()
    for line in response.text.splitlines():
        line = line.strip()
        # Skip comments and empty lines
        if line and not line.startswith('//'):
            suffixes.add('.' + line)
    return tuple(sorted(suffixes))

# Fetch the latest suffixes
domain_suffixes = fetch_public_suffixes()

  • What this does:

    • Downloads the current public suffix list.

    • Ignores comments and empty lines in the dataset.

    • Collects each suffix into a tuple for easy lookups.

This technique helps ensure your domain validation logic is aware of every TLD currently recognized by major browsers and libraries—so you’re not blindsided by new suffixes.

Use this in your URL or email checker to make your validations future-proof and standards-compliant.


When Should You Do a DNS Check?

A quick note for thoroughness: validating a URL's format—whether using regex, the validators package, or Django’s built-in tools—only ensures the string looks like a URL. It doesn’t tell you whether that URL actually exists or leads to a live destination.

That’s where DNS checks come in. If you truly need to confirm that a URL points to a real, resolvable domain (e.g., verifying "https://www.google" isn’t just well-formed, but actually goes somewhere), you’ll need to go a step further by performing a DNS lookup. This process asks, "Does this domain exist on the internet right now?"—something no regex or typical package will answer for you.

DNS checks aren’t always necessary for basic validation tasks like form inputs or static checks. But, if you’re building anything that relies on external connectivity (think: crawlers, link checkers, or automated testing tools), adding a DNS resolution step is a good way to catch invalid or unavailable domains before they cause trouble later in your workflow.
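If you do need that step, a minimal sketch using the standard library's socket module might look like this (the function name is ours):

```python
import socket
from urllib.parse import urlparse

def domain_resolves(url: str) -> bool:
    """Return True if the URL's host currently resolves in DNS."""
    host = urlparse(url).hostname
    if not host:
        return False
    try:
        socket.gethostbyname(host)  # performs the actual DNS lookup
        return True
    except socket.gaierror:
        return False
```

Keep in mind this performs a live network lookup: results vary with connectivity, and lookups can be slow, so consider caching or rate-limiting in crawlers and link checkers.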


Using the validators Package for URL Validation

If you prefer not to write your own regex, you can easily check if a string is a valid URL by using the popular validators Python package. Here’s a straightforward approach:

import validators

def is_valid_url(url_string):
    result = validators.url(url_string)
    return result is True  # Returns True if valid, False otherwise

# Example usage
print(is_valid_url("http://localhost:8000"))         # True
print(is_valid_url("http://.www.foo.bar/"))          # False

  • The function is_valid_url returns True only when the provided string passes all URL checks performed by the library.

  • Internally, validators.url() returns True when valid, or a ValidationFailure object when not—so this function keeps things simple.

Use this for quick, robust validation without wrangling regex patterns.


Quick tip: Always strip spaces before validation, especially if the URL is coming from user input or copy-paste operations.

This approach is efficient, readable, and saves you from reinventing the wheel when working with URLs in Python.



Using Python Standard Library for URL Validation

Prefer to stick with the standard library? You can use urlparse (available via urllib.parse in Python 3 and urlparse in Python 2) to check whether a string is structured like a URL—without installing any third-party libraries.

Here's a basic approach:

try:
    # Python 2
    from urlparse import urlparse
except ImportError:
    # Python 3
    from urllib.parse import urlparse

def is_valid_url(string):
    try:
        result = urlparse(str(string).strip())
        # Validation: must have scheme (like http/https) and netloc (domain)
        return all([result.scheme, result.netloc])
    except Exception:
        return False

Examples:

print(is_valid_url('http://www.python.org'))        # True
print(is_valid_url('/just/a/path/file.txt'))        # False
print(is_valid_url(12345))                          # False
print(is_valid_url('not a url at all'))             # False
print(is_valid_url('https://github.com'))           # True

Note:
URL parsing checks structure, not whether the URL is actually reachable on the internet. For more extensive validation (including syntax and even DNS lookups), consider using packages like validators or requests. But for basic checks, urlparse fits the bill.


Making Validated URL Objects Act Like Strings

Say you’ve wrapped URL validation inside a custom class—how do you make sure your objects still behave like regular strings throughout your codebase? It’s simple: just ensure your class inherits from str directly, or implements the required string methods. This way, once a URL has passed your checks, you can use it anywhere a string is expected.

For example:

class ReachableURL(str):
    def __new__(cls, url):
        # Validate the URL here...
        # (Assume validation passes for this example)
        return str.__new__(cls, url)

Now, instances of ReachableURL can be used seamlessly—just like ordinary strings:

url_instance = ReachableURL("http://example.com")
print(isinstance(url_instance, str))  # True
print(url_instance.upper())           # HTTP://EXAMPLE.COM

This approach lets you layer extra functionality (like validation or reachability checks) while retaining all the familiar power of Python’s string operations. So, whether you’re concatenating, slicing, or handing off URLs to other libraries, you’ll keep everything as clean and Pythonic as possible.


Using Django’s URLValidator to Check URLs in Python

Django comes with a handy built-in tool for validating URLs: the URLValidator from the django.core.validators module. This validator is designed to determine whether a given string matches the criteria for a valid web address. If you’re already using Django in your project, it’s a convenient and reliable approach for URL validation.

Here’s how you can use it:

  1. Import the Necessary Classes:

    • URLValidator for the actual validation

    • ValidationError to handle invalid cases

  2. Write a Simple Validation Function:

    from django.core.validators import URLValidator
    from django.core.exceptions import ValidationError
    
    def is_valid_url(url: str) -> bool:
        validator = URLValidator()
        try:
            validator(url)
            return True
        except ValidationError:
            return False

    • When you call validator(url), it checks if the url string adheres to standard URL patterns.

    • If the supplied value isn’t a valid URL, it raises a ValidationError. The function returns True for valid URLs and False for invalid ones.

While Django’s URLValidator is powerful, keep in mind that adding Django as a dependency may be unnecessary for lightweight projects. However, for those already using Django, it’s a robust option for all your URL validation needs.


How validators.url Works in Python

If you're looking for a quick, reliable way to check whether a string is a valid URL in Python, the validators package is a handy tool. Its url function makes URL validation straightforward—even for tricky cases.

How It Validates

  • Pass your URL as a string to validators.url().

  • If the input is a valid URL, you'll get True as the result.

  • If the URL is not valid, instead of a simple False, it returns an object called ValidationFailure. While this might feel a bit unexpected, it still makes it easy to know whether your URL passes or fails validation.

Example Usage

Here’s what a typical validation flow might look like:

import validators
from validators import ValidationFailure

def is_string_a_url(candidate: str) -> bool:
    result = validators.url(candidate)
    return False if isinstance(result, ValidationFailure) else result

# Checking results
print(is_string_a_url("http://localhost:8000"))      # Outputs: True
print(is_string_a_url("http://.www.foo.bar/"))       # Outputs: False

This approach ensures you're only working with recognized, well-formed URLs—perfect for situations where data quality matters most.


Enforcing URL Validation with Python Classes and Type Checking

If you want to enforce URL validation at a deeper level across your codebase—not just at input time or form submission—you can use Python’s class inheritance and type checking. By encapsulating validation logic inside custom types, you make it much harder for invalid URLs to sneak into your application logic or data models.


Example: Creating a URL Type with Built-in Validation

You can define a custom string subclass that validates its input every time it’s instantiated. This approach leverages Python’s rich data model and is especially useful if you’re working in larger codebases, or when you want your function signatures and type hints to truly mean “URL—not just any string!”

Here’s a typical pattern using standard library features and popular utilities:

from urllib.parse import urlparse

class URL(str):
    def __new__(cls, value: str):
        result = urlparse(value)
        # Only allow non-empty scheme and netloc (host/address)
        if not (result.scheme and result.netloc):
            raise ValueError(f"Invalid URL: {value!r}")
        return str.__new__(cls, value)

Usage Example:

site = URL("https://wikipedia.org")        # works fine
another = URL("not a url")                 # raises ValueError

Any attempt to create a URL object with an invalid address immediately results in an error, so only valid URLs can be used downstream.


Benefits of Using Custom Types

  • Early Validation: Problems surface instantly at the object creation stage.

  • Type Safety: Your IDE and static analysis tools (e.g., mypy) can help catch mistakes when you annotate with your custom URL type.

  • Cleaner Code: Functions and classes that require URLs can explicitly declare so, boosting readability and reducing runtime surprises.


Extending Functionality

For stricter checks—such as ensuring the URL is reachable, uses HTTPS, or isn’t a localhost address—simply extend your base URL class and add additional validation in __new__.

import socket

class ReachableURL(URL):
    def __new__(cls, value: str):
        instance = super().__new__(cls, value)
        hostname = urlparse(instance).hostname
        if not hostname:
            raise ValueError(f"Invalid URL: {value!r}")
        try:
            socket.gethostbyname(hostname)
        except socket.error:
            raise ValueError(f"Hostname not resolvable: {hostname}")
        return instance

With this approach, your URL handling code stays explicit, self-documenting, and robust—whether you’re writing a web crawler, building APIs with FastAPI or Django, or just aiming for cleaner domain models.


Django’s URL Validator vs. Standalone Packages

If you're considering URL validation for your project, you might wonder whether to rely on Django’s built-in utility or opt for a lightweight standalone package. Here’s a quick breakdown to help you weigh the options:

Advantages of Django’s URL Validator:

  • Comprehensive Checks: Django’s validator is well-tested and supports standard URL patterns. (Older Django versions even offered a verify_exists option to check that a URL was reachable, but it was removed in modern Django for security reasons.)

  • Integration: Seamlessly fits within the rest of Django’s validation ecosystem, making it perfect for projects already using Django for forms or models.

  • Community and Documentation: You benefit from a large, active community and thorough documentation—making it easier to troubleshoot or extend.

Drawbacks to Consider:

  • Dependency Bloat: Including Django just for URL validation can be overkill—Django is a robust, full-featured framework, and significantly increases your project’s size and dependencies if you’re not already using it.

  • Complexity: For smaller scripts, microservices, or non-Django projects, a standalone library (such as validators or simple regex-based checks) will keep things lean and more easily maintained.

  • Performance: Extra dependencies sometimes add startup time and potential version conflicts, especially in minimalist environments.

Summary:
If your stack already uses Django, leveraging its URL validator is a solid and hassle-free choice. For lightweight projects or scripts, standalone validation packages or tailored regex rules will keep your footprint minimal and setup easier. Choose based on your project’s needs and existing tech stack!




Use Cases

  • Form Validation: Ensure users submit well-structured URLs in web forms.

  • Data Cleaning: Remove or fix malformed links in large datasets.

  • Crawlers & Scrapers: Verify URLs before crawling or scraping content.

  • Security Filtering: Block suspicious or malformed URLs from being stored or executed.




Categorized Regex Metacharacters 

  • ^ : Matches the start of the string

  • $ : Matches the end of the string

  • . : Matches any character (except newline)

  • + : Matches one or more of the previous token

  • * : Matches zero or more of the previous token

  • ? : Makes the preceding token optional

  • [] : Matches any one character inside the brackets

  • () : Groups patterns

  • | : OR operator

  • \ : Escapes special characters (e.g., \. matches a literal dot)

Pro Tips

  • Always use raw strings (r'') in Python to avoid escaping issues.

  • Add anchors ^ and $ to match the full URL and avoid partial matches.

  • Use non-capturing groups (?:...) for cleaner matching if needed.

  • Test localhost or custom ports using a regex like: localhost:\d{2,5}

  • Combine this validator with IP Address Regex Python Validator for APIs or internal tools.
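The localhost tip above can be checked in a couple of lines:

```python
import re

local = re.compile(r"^localhost:\d{2,5}$")
print(bool(local.fullmatch("localhost:8000")))  # True
print(bool(local.fullmatch("localhost:1")))     # False — the port needs 2-5 digits
```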


More on Domain and Port Validation

When validating URLs, it's important to remember that the regex should handle both the scheme (like http or https) and the domain (or netloc) parts of the URL. The domain section includes everything up to the first slash /, so port numbers (like :8000) are safely included in this part of the match.

For example:
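Here is a simplified sketch (illustrative, not production-grade) where the netloc portion absorbs an optional :port before the first slash:

```python
import re

# [a-zA-Z0-9.-]+ covers the domain or IPv4 host; (?::\d{2,5})? is the optional port
pattern = re.compile(r"^https?://[a-zA-Z0-9.-]+(?::\d{2,5})?(?:/.*)?$")

print(bool(pattern.fullmatch("http://localhost:8000/home")))  # True
print(bool(pattern.fullmatch("https://192.168.0.1/admin")))   # True
```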


This approach ensures that your validator can match standard domains, custom ports, and even IPv4 addresses.


Supporting IPv6 Addresses

If you need to validate URLs containing IPv6 addresses, consider enhancing your regex or integrating a specialized IPv6 validator. A comprehensive IPv6 regex will handle the full range of valid address formats, so incorporate a solution like Markus Jarderot's IPv6 validator for best results. Remember to check both the domain and the IP format when validating.


Examples in Action

  • IPv4 and Alphanumeric Domains:
    Use a regex that matches standard domains and IPv4 addresses. For reference, tools like the Python Regex Tester can help you test and refine your patterns.

  • IPv6 Support:
    With the right regex, you can capture URLs using IPv6 addresses, ensuring your validation routine is robust for any environment—including internal networks or modern APIs.

By combining these approaches, your URL validation will be flexible enough for everything from localhost development to production-grade, multi-protocol endpoints.


Frequently asked questions

Can I match localhost or internal domains?
Yes. Adjust the regex to allow patterns like localhost or .local.