parsedmarc.utils 🔗

Utility functions that might be useful for other projects

DownloadError 🔗

Bases: RuntimeError

Raised when an error occurs while downloading a file

EmailParserError 🔗

Bases: RuntimeError

Raised when an error occurs while parsing an email

MboxIterator 🔗

MboxIterator(path: str)

Class that allows iterating through all messages in an mbox file

Returns tuples of (message_key, message).
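To illustrate the (message_key, message) iteration pattern, here is a minimal sketch built on Python's standard mailbox module. It is not the parsedmarc implementation; the iter_mbox helper and the sample data are hypothetical names used only for this example.

```python
import mailbox
import tempfile
from pathlib import Path

# Two minimal messages in mbox format; each starts with a "From " separator line.
SAMPLE_MBOX = (
    "From alice@example.com Mon Jan  1 00:00:00 2024\n"
    "Subject: first\n"
    "\n"
    "body one\n"
    "\n"
    "From bob@example.com Mon Jan  1 00:00:01 2024\n"
    "Subject: second\n"
    "\n"
    "body two\n"
)


def iter_mbox(path):
    """Yield (message_key, message) tuples for each message in an mbox file."""
    box = mailbox.mbox(path, create=False)
    try:
        for key, message in box.iteritems():
            yield key, message
    finally:
        # Always release the file handle, even if iteration stops early.
        box.close()


with tempfile.TemporaryDirectory() as tmp:
    mbox_path = Path(tmp) / "sample.mbox"
    mbox_path.write_text(SAMPLE_MBOX)
    subjects = [message["Subject"] for _, message in iter_mbox(str(mbox_path))]
```
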

convert_outlook_msg 🔗

convert_outlook_msg(msg_bytes: bytes) -> bytes

Convert an Outlook MSG file to standard RFC 822 format

Requires the msgconvert Perl utility to be installed.

PARAMETER DESCRIPTION
msg_bytes

the content of the .msg file

TYPE: bytes

RETURNS DESCRIPTION
bytes

An RFC 822 message, as bytes

decode_base64 🔗

decode_base64(data: str) -> bytes

Decodes a base64 string, with padding being optional

PARAMETER DESCRIPTION
data

A base64 encoded string

TYPE: str

RETURNS DESCRIPTION
bytes

The decoded bytes
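One common way to make padding optional is to restore the missing "=" characters before decoding. The sketch below uses only the standard library and assumes ASCII input; decode_base64_lenient is a hypothetical name, and the library's own function may handle edge cases differently.

```python
import base64


def decode_base64_lenient(data: str) -> bytes:
    """Decode base64 text whose trailing '=' padding may have been stripped."""
    raw = data.encode("ascii")
    # Base64 input must be a multiple of 4 characters; add back what is missing.
    missing = -len(raw) % 4
    return base64.b64decode(raw + b"=" * missing)


unpadded = decode_base64_lenient("aGVsbG8")
padded = decode_base64_lenient("aGVsbG8=")
```
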

extract_xml 🔗

extract_xml(source: str | bytes | BinaryIO) -> str

Extracts XML from a zip or gzip file at the given path, file-like object, or bytes.

PARAMETER DESCRIPTION
source

A path to a file, a file-like object, or bytes

TYPE: str | bytes | BinaryIO

RETURNS DESCRIPTION
str

The extracted XML
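A plausible way to tell zip from gzip input is to inspect the leading magic bytes. The following sketch handles only in-memory bytes and reads the first zip member; extract_xml_from_bytes is a hypothetical helper, not the library function, and the choice of UTF-8 decoding is an assumption.

```python
import gzip
import io
import zipfile

ZIP_MAGIC = b"PK\x03\x04"   # local file header of a non-empty zip archive
GZIP_MAGIC = b"\x1f\x8b"    # gzip member header


def extract_xml_from_bytes(payload: bytes) -> str:
    """Return the XML text stored in a zip or gzip payload."""
    if payload.startswith(ZIP_MAGIC):
        with zipfile.ZipFile(io.BytesIO(payload)) as archive:
            # Assumption: the report is the first member of the archive.
            first_member = archive.namelist()[0]
            return archive.read(first_member).decode("utf-8", errors="ignore")
    if payload.startswith(GZIP_MAGIC):
        return gzip.decompress(payload).decode("utf-8", errors="ignore")
    raise ValueError("payload is neither a zip nor a gzip archive")


report = "<feedback><version>1.0</version></feedback>"
gzipped = gzip.compress(report.encode("utf-8"))

buffer = io.BytesIO()
with zipfile.ZipFile(buffer, "w") as archive:
    archive.writestr("report.xml", report)
zipped = buffer.getvalue()
```
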

get_base_domain 🔗

get_base_domain(domain: str) -> str

Get the base domain name for the given domain

note

Results are based on a list of public domain suffixes at https://publicsuffix.org/list/public_suffix_list.dat.

PARAMETER DESCRIPTION
domain

A domain or subdomain

TYPE: str

RETURNS DESCRIPTION
str

The base domain of the given domain
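The idea behind suffix-based lookup can be sketched with a tiny hard-coded suffix set. This is a simplification: the real Public Suffix List is much larger and has wildcard and exception rules that this sketch ignores. SAMPLE_SUFFIXES and base_domain are hypothetical names, not part of the library.

```python
# A tiny stand-in for the Public Suffix List, for illustration only.
SAMPLE_SUFFIXES = frozenset({"com", "org", "uk", "co.uk"})


def base_domain(domain: str, suffixes=SAMPLE_SUFFIXES) -> str:
    """Return the registrable domain: the public suffix plus one label."""
    labels = domain.lower().strip(".").split(".")
    # Scanning from the left finds the longest matching suffix first.
    for start in range(len(labels)):
        candidate = ".".join(labels[start:])
        if candidate in suffixes:
            # Keep one extra label to the left of the suffix, if there is one.
            return ".".join(labels[max(start - 1, 0):])
    # Unknown suffix: fall back to the last two labels.
    return ".".join(labels[-2:])
```

With this sample set, "mail.example.co.uk" reduces to "example.co.uk" rather than "co.uk", which is the reason a suffix list is needed in the first place.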

get_filename_safe_string 🔗

get_filename_safe_string(string: str) -> str

Convert a string to a string that is safe for a filename

PARAMETER DESCRIPTION
string

A string to make safe for a filename

TYPE: str

RETURNS DESCRIPTION
str

A string safe for a filename
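As a rough illustration of what "safe for a filename" can mean, the sketch below drops characters that are commonly rejected by file systems and caps the length. The character set, the 100-character limit, and the name filename_safe are assumptions made for this example; the library's exact rules are not stated here.

```python
# Characters that are invalid in file names on at least one common platform.
INVALID_FILENAME_CHARS = '\\/:"*?|<>\n\r'


def filename_safe(text: str, max_length: int = 100) -> str:
    """Remove problematic characters, trailing dots, and excess length."""
    cleaned = "".join(ch for ch in text if ch not in INVALID_FILENAME_CHARS)
    # Trailing dots are stripped because some platforms do not allow them.
    return cleaned.rstrip(".")[:max_length]


safe_name = filename_safe("report: example.com/2024?.")
```
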

get_ip_address_country 🔗

get_ip_address_country(
    ip_address: str, db_path: str | None = None
) -> str | None

Get the ISO code for the country associated with the given IPv4 or IPv6 address

PARAMETER DESCRIPTION
ip_address

The IP address to query for

TYPE: str

db_path

Path to an MMDB file from MaxMind or DBIP

TYPE: str | None DEFAULT: None

RETURNS DESCRIPTION
str | None

An ISO country code associated with the given IP address

get_ip_address_info 🔗

get_ip_address_info(
    ip_address: str,
    ip_db_path: str | None = None,
    cache: ExpiringDict | None = None,
    offline: bool = False,
    nameservers: list[str] | None = None,
    timeout: float = 2.0,
) -> dict[str, Any]

Get reverse DNS and country information for the given IP address

PARAMETER DESCRIPTION
ip_address

The IP address to check

TYPE: str

ip_db_path

Path to an MMDB file from MaxMind or DBIP

TYPE: str | None DEFAULT: None

cache

Cache storage

TYPE: ExpiringDict | None DEFAULT: None

offline

Do not make online queries for geolocation or DNS

TYPE: bool DEFAULT: False

nameservers

A list of one or more nameservers to use (Cloudflare's public DNS resolvers by default)

TYPE: list[str] | None DEFAULT: None

timeout

Sets the DNS timeout in seconds

TYPE: float DEFAULT: 2.0

RETURNS DESCRIPTION
dict[str, Any]

A dictionary with the keys ip_address, country, reverse_dns, and base_domain

get_reverse_dns 🔗

get_reverse_dns(
    ip_address: str,
    cache: ExpiringDict | None = None,
    nameservers: list[str] | None = None,
    timeout: float = 2.0,
) -> str | None

Resolve an IP address to a hostname using a reverse DNS query

PARAMETER DESCRIPTION
ip_address

The IP address to resolve

TYPE: str

cache

Cache storage

TYPE: ExpiringDict | None DEFAULT: None

nameservers

A list of one or more nameservers to use (Cloudflare's public DNS resolvers by default)

TYPE: list[str] | None DEFAULT: None

timeout

Sets the DNS query timeout in seconds

TYPE: float DEFAULT: 2.0

RETURNS DESCRIPTION
str | None

The reverse DNS hostname (if any)
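A reverse DNS query is a PTR lookup on a special name derived from the IP address. The sketch below only builds that name with the standard ipaddress module and performs no network query; reverse_dns_name is a hypothetical helper shown to make the mechanism concrete.

```python
import ipaddress


def reverse_dns_name(ip_address: str) -> str:
    """Return the name that a PTR query for this address would ask about."""
    return ipaddress.ip_address(ip_address).reverse_pointer


ipv4_name = reverse_dns_name("192.0.2.1")
ipv6_name = reverse_dns_name("2001:db8::1")
```

For IPv4 the octets are reversed under in-addr.arpa; for IPv6 each hexadecimal digit becomes its own label under ip6.arpa.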

human_timestamp_to_datetime 🔗

human_timestamp_to_datetime(
    human_timestamp: str, to_utc: bool = False
) -> datetime

Converts a human-readable timestamp into a Python datetime object

PARAMETER DESCRIPTION
human_timestamp

A timestamp string

TYPE: str

to_utc

Convert the timestamp to UTC

TYPE: bool DEFAULT: False

RETURNS DESCRIPTION
datetime

The converted timestamp

human_timestamp_to_timestamp 🔗

human_timestamp_to_timestamp(human_timestamp: str) -> float

Converts a human-readable timestamp into a UNIX timestamp

PARAMETER DESCRIPTION
human_timestamp

A timestamp in YYYY-MM-DD HH:MM:SS format

TYPE: str

RETURNS DESCRIPTION
float

The converted timestamp

is_mbox 🔗

is_mbox(path: str) -> bool

Checks if the file at the given path is an MBOX mailbox file

RETURNS DESCRIPTION
bool

True if the file is an MBOX mailbox file

is_outlook_msg 🔗

is_outlook_msg(content: Any) -> bool

Checks if the given content is an Outlook MSG (OLE) file

PARAMETER DESCRIPTION
content

Content to check

TYPE: Any

RETURNS DESCRIPTION
bool

True if the content is an Outlook MSG file
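Outlook MSG files are OLE compound documents, which begin with a fixed eight-byte signature. A minimal check along those lines might look like the sketch below; looks_like_ole is a hypothetical name, and whether the library applies additional checks is not stated here.

```python
# Signature at the start of every OLE compound file.
OLE_MAGIC = bytes.fromhex("d0cf11e0a1b11ae1")


def looks_like_ole(content) -> bool:
    """Return True if content is bytes starting with the OLE signature."""
    return isinstance(content, bytes) and content.startswith(OLE_MAGIC)
```

Note that this identifies any OLE container, so legacy Office documents would also match.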

load_bytes_from_source 🔗

load_bytes_from_source(source: str | bytes | BinaryIO)

Load bytes from source.

PARAMETER DESCRIPTION
source

A path to a file, a file-like object, or bytes.

TYPE: str | bytes | BinaryIO
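The three accepted source types suggest a simple dispatch on type. The sketch below shows one way to do that with the standard library; load_bytes is a hypothetical stand-in, and the real function may behave differently, for example in how it validates the input.

```python
import io
import os
import tempfile
from pathlib import Path


def load_bytes(source) -> bytes:
    """Return the raw bytes behind a path, a bytes object, or a binary stream."""
    if isinstance(source, bytes):
        return source
    if isinstance(source, str):
        # Assumption: a str is always a filesystem path.
        return Path(source).read_bytes()
    # Anything else is treated as a binary file-like object.
    return source.read()


with tempfile.NamedTemporaryFile(delete=False) as handle:
    handle.write(b"<feedback/>")
    temp_path = handle.name
try:
    from_path = load_bytes(temp_path)
finally:
    os.unlink(temp_path)

from_bytes = load_bytes(b"<feedback/>")
from_stream = load_bytes(io.BytesIO(b"<feedback/>"))
```
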

parse_email 🔗

parse_email(
    data: bytes | str,
    strip_attachment_payloads: bool = False,
) -> dict[str, Any]

A simplified email parser

PARAMETER DESCRIPTION
data

The RFC 822 message string, or MSG binary

TYPE: bytes | str

strip_attachment_payloads

Remove attachment payloads

TYPE: bool DEFAULT: False

RETURNS DESCRIPTION
dict[str, Any]

Parsed email data

parse_email_address 🔗

parse_email_address(
    original_address: str,
) -> dict[str, Any]

Parse an email address into parts
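The parts of an address can be illustrated with email.utils.parseaddr from the standard library. The dictionary keys below (display_name, address, local, domain) and the helper name split_email_address are assumptions for this sketch; the keys actually returned by the library are not documented in this section.

```python
from email.utils import parseaddr


def split_email_address(original_address: str) -> dict:
    """Split 'Display Name <local@domain>' into its components."""
    display_name, address = parseaddr(original_address)
    if "@" in address:
        local, domain = address.rsplit("@", 1)
        domain = domain.lower()
    else:
        local, domain = None, None
    return {
        "display_name": display_name or None,
        "address": address,
        "local": local,
        "domain": domain,
    }


parts = split_email_address("Jane Doe <jane.doe@Example.COM>")
```
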

query_dns 🔗

query_dns(
    domain: str,
    record_type: str,
    cache: ExpiringDict | None = None,
    nameservers: list[str] | None = None,
    timeout: float = 2.0,
) -> list[str]

Make a DNS query

PARAMETER DESCRIPTION
domain

The domain or subdomain to query about

TYPE: str

record_type

The record type to query for

TYPE: str

cache

Cache storage

TYPE: ExpiringDict | None DEFAULT: None

nameservers

A list of one or more nameservers to use (Cloudflare's public DNS resolvers by default)

TYPE: list[str] | None DEFAULT: None

timeout

Sets the DNS timeout in seconds

TYPE: float DEFAULT: 2.0

RETURNS DESCRIPTION
list[str]

A list of answers
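The cache parameter suggests that answers can be reused across calls. The sketch below shows that pattern with a plain dict and a stand-in resolver so that it runs without network access. The cache key format, cached_query, and fake_resolver are all hypothetical; the real function uses an ExpiringDict and performs actual DNS queries.

```python
def cached_query(domain, record_type, resolver, cache=None):
    """Look up a record, consulting the cache before calling the resolver."""
    record_type = record_type.upper()
    key = f"{domain.lower()}_{record_type}"  # assumed key format
    if cache is not None and key in cache:
        return cache[key]
    answers = resolver(domain, record_type)
    if cache is not None:
        cache[key] = answers
    return answers


calls = []


def fake_resolver(domain, record_type):
    """Stand-in for a real DNS lookup; records each call it receives."""
    calls.append((domain, record_type))
    return ["v=spf1 -all"]


cache = {}
first = cached_query("example.com", "txt", fake_resolver, cache)
second = cached_query("example.com", "TXT", fake_resolver, cache)
```

The second call is served from the cache, so the resolver runs only once.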

timestamp_to_datetime 🔗

timestamp_to_datetime(timestamp: int) -> datetime

Converts a UNIX/DMARC timestamp to a Python datetime object

PARAMETER DESCRIPTION
timestamp

The timestamp

TYPE: int

RETURNS DESCRIPTION
datetime

The converted timestamp as a Python datetime object

timestamp_to_human 🔗

timestamp_to_human(timestamp: int) -> str

Converts a UNIX/DMARC timestamp to a human-readable string

PARAMETER DESCRIPTION
timestamp

The timestamp

TYPE: int

RETURNS DESCRIPTION
str

The converted timestamp in YYYY-MM-DD HH:MM:SS format
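The conversions between UNIX timestamps and the YYYY-MM-DD HH:MM:SS format can be sketched with the standard datetime module. This example assumes all values are UTC, which may not match the library's handling of local time; human_to_unix and unix_to_human are hypothetical names.

```python
from datetime import datetime, timezone

HUMAN_FORMAT = "%Y-%m-%d %H:%M:%S"


def human_to_unix(human_timestamp: str) -> float:
    """Parse a YYYY-MM-DD HH:MM:SS string (assumed UTC) into a UNIX timestamp."""
    parsed = datetime.strptime(human_timestamp, HUMAN_FORMAT)
    return parsed.replace(tzinfo=timezone.utc).timestamp()


def unix_to_human(timestamp: int) -> str:
    """Format a UNIX timestamp as a YYYY-MM-DD HH:MM:SS string in UTC."""
    moment = datetime.fromtimestamp(timestamp, tz=timezone.utc)
    return moment.strftime(HUMAN_FORMAT)


epoch_seconds = human_to_unix("2024-01-01 00:00:00")
round_trip = unix_to_human(int(epoch_seconds))
```
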