models
AnalysisGroup
Bases: UUIDTimeStampedModel
Abstract group to assign a record to for purposes of analysis.
Attributes:
Name | Type | Description |
---|---|---|
name | str | Name of the group. |
podcasts | QuerySet[Podcast] | Podcasts explicitly linked to group. |
seasons | QuerySet[Season] | Seasons explicitly linked to group. |
episodes | QuerySet[Episode] | Episodes explicitly linked to group. |
get_all_episodes
Get all episodes, explicit and implied, for this Analysis Group.
Source code in src/podcast_analyzer/models.py
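To illustrate what "explicit and implied" means here, the following minimal sketch combines directly linked episodes with those reached through assigned seasons and podcasts. It uses plain integer IDs and a hypothetical helper, `all_episode_ids`, which is not part of the package; the real method returns a Django QuerySet.

```python
def all_episode_ids(explicit, via_seasons, via_podcasts):
    """Combine the three sources; set semantics drop duplicates."""
    return set(explicit) | set(via_seasons) | set(via_podcasts)


# Episode 2 is linked directly and is also implied by an assigned season.
combined = all_episode_ids(explicit=[1, 2], via_seasons=[2, 3], via_podcasts=[4])
```

An episode reachable by more than one route is only counted once.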
get_all_people
Returns a QuerySet of all People that are associated with this group.
Source code in src/podcast_analyzer/models.py
get_all_podcasts
Returns a QuerySet of all Podcast objects for this group, both explicitly assigned and implied by Season and Episode objects.
Source code in src/podcast_analyzer/models.py
get_all_seasons
Returns a QuerySet of all Season objects for this group, both explicit and implied.
Source code in src/podcast_analyzer/models.py
get_counts_by_release_frequency
Get counts of podcasts by release frequency.
NOTE: This is based on podcasts' current release frequency. We can't reliably calculate this based on isolated seasons and episodes.
Source code in src/podcast_analyzer/models.py
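As a rough illustration of the counting described above, the sketch below tallies podcasts by their current release frequency using plain Python objects. `FakePodcast` and `count_by_release_frequency` are hypothetical stand-ins, not part of the package; the real method works on Django QuerySets and its return shape may differ.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass
class FakePodcast:
    """Hypothetical stand-in for a Podcast record."""

    title: str
    release_frequency: str  # e.g. "daily", "weekly", "unknown"


def count_by_release_frequency(podcasts):
    """Tally podcasts by their current release_frequency value."""
    return dict(Counter(p.release_frequency for p in podcasts))


podcasts = [
    FakePodcast("A", "weekly"),
    FakePodcast("B", "weekly"),
    FakePodcast("C", "monthly"),
]
counts = count_by_release_frequency(podcasts)
```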
get_itunes_categories_with_count
For all associated podcasts, explicit or implicit, return their associated distinct categories with counts.
Source code in src/podcast_analyzer/models.py
get_median_duration_timedelta
Return the median duration of episodes as a timedelta.
Source code in src/podcast_analyzer/models.py
get_num_dormant_podcasts
Get the number of connected podcasts, explicit or implicit, that are dormant.
get_num_podcasts_using_trackers
Feeds that contain what appears to be third-party tracking data.
get_num_podcasts_with_donation_data
Feeds that contain structured donation/funding data.
get_num_podcasts_with_itunes_data
get_num_podcasts_with_podcast_index_data
get_total_duration_seconds
Calculate the total duration of all episodes, explicit and implied for this group.
Source code in src/podcast_analyzer/models.py
median_episode_duration
num_episodes
Returns the number of episodes associated with this group, whether directly or via an assigned season or podcast.
num_people
Returns the total number of people detected from episodes associated with this group.
num_podcasts
Returns the total number of podcasts in this group, both explicit and implied.
num_seasons
Returns the number of seasons associated with this group, both direct associations and implicit associations due to an assigned feed.
ArtUpdate
Bases: Model
Model for capturing art update events. Useful for debugging.
Attributes:
Name | Type | Description |
---|---|---|
podcast | Podcast | Podcast that this update relates to. |
timestamp | datetime | Timestamp when the update was requested. |
reported_mime_type | str | The mime_type returned by the remote server. |
actual_mime_type | str | The actual mime_type of the file. |
valid_file | bool | Whether the file was valid and of the allowed mime types. |
Episode
Bases: UUIDTimeStampedModel
Represents a single episode of a podcast.
Attributes:
Name | Type | Description |
---|---|---|
podcast | Podcast | The podcast this episode belongs to. |
guid | str | GUID of the episode |
title | str \| None | Title of the episode |
ep_type | str | Episode type, e.g. full, bonus, trailer |
season | Season \| None | Season the episode belongs to. |
ep_num | int \| None | Episode number |
release_datetime | datetime \| None | Date and time the episode was released. |
episode_url | str \| None | URL of the episode page. |
mime_type | str \| None | Reported mime type of the episode. |
download_url | str \| None | URL of the episode file. |
itunes_duration | int \| None | Duration of the episode in seconds. |
file_size | int \| None | Size of the episode file in bytes. |
itunes_explict | bool | Does this episode have the explicit flag? |
show_notes | str \| None | Show notes for the episode, if provided. |
cw_present | bool | Did we detect a content warning? |
transcript_detected | bool | Did we detect a transcript? |
hosts_detected_from_feed | QuerySet[Person] | Hosts found in the feed information. |
guests_detected_from_feed | QuerySet[Person] | Guests found in the feed information. |
analysis_group | QuerySet[AnalysisGroup] | Analysis Groups this is assigned to. |
duration
property
Attempts to convert the duration of the episode into a timedelta for better display.
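The conversion described above can be sketched with the standard library alone. The helper name `duration_as_timedelta` is hypothetical, and returning `None` for a missing duration is an assumption rather than confirmed behaviour of the property.

```python
from datetime import timedelta


def duration_as_timedelta(itunes_duration):
    """Convert a duration in seconds to a timedelta, or None if unknown."""
    if itunes_duration is None:
        # Assumption: an episode without a reported duration yields None.
        return None
    return timedelta(seconds=itunes_duration)


one_hour_ish = duration_as_timedelta(3725)
```

A `timedelta` renders in a friendlier form than raw seconds when converted to a string, which is the display benefit the property aims for.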
create_or_update_episode_from_feed
classmethod
create_or_update_episode_from_feed(
podcast: Podcast,
episode_dict: dict[str, Any],
*,
update_existing_episodes: bool = False
) -> bool
Given a dict of episode data from podcastparser, create or update the episode and return a bool indicating if a record was touched.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
podcast | Podcast | The instance of the podcast being updated. | required |
episode_dict | dict[str, Any] | A dict representing the episode as created by | required |
update_existing_episodes | bool | Update data in existing records? Default: False | False |
Returns: True if a record was created or updated, otherwise False.
Source code in src/podcast_analyzer/models.py
get_file_size_in_mb
ItunesCategory
Bases: TimeStampedModel
Itunes categories.
Attributes:
Name | Type | Description |
---|---|---|
name | str | Name of the category |
parent_category | ItunesCategory \| None | Relation to another category as parent. |
Person
Bases: UUIDTimeStampedModel
People detected from structured data in podcast feed. Duplicates are possible if data is tracked lazily.
Attributes:
Name | Type | Description |
---|---|---|
name | str | Name of the person. |
url | str \| None | Reported URL of the person. |
img_url | str \| None | Reported image URL of the person. |
hosted_episodes | QuerySet[Episode] | Episodes this person has hosted. |
guest_appearances | QuerySet[Episode] | Episodes in which this person appeared as a guest. |
distinct_podcasts
Get a count of the number of unique podcasts this person has appeared on.
get_distinct_podcasts
Return a queryset of the distinct podcasts this person has appeared in.
Source code in src/podcast_analyzer/models.py
get_podcasts_with_appearance_counts
Provide podcast appearance data for each distinct podcast they have appeared on.
Source code in src/podcast_analyzer/models.py
get_total_episodes
has_guested
has_hosted
Counts the number of episodes where they have been listed as a host.
Podcast
Bases: UUIDTimeStampedModel
Model for a given podcast feed.
Attributes:
Name | Type | Description |
---|---|---|
title | str | The title of the podcast. |
rss_feed | str | The URL of the RSS feed of the podcast. |
podcast_cover_art_url | str \| None | The remote URL of the podcast cover art. |
podcast_cached_cover_art | File \| None | The cached cover art. |
last_feed_update | datetime \| None | When the podcast feed was last updated. |
dormant | bool | Whether the podcast is dormant or not. |
last_checked | datetime | When the podcast feed was last checked. |
author | str \| None | The author of the podcast. |
language | str \| None | The language of the podcast. |
generator | str \| None | The reported generator of the feed. |
email | str \| None | The email listed in the feed. |
site_url | str \| None | The URL of the podcast site. |
itunes_explicit | bool \| None | Whether the podcast has an explicit tag on iTunes. |
itunes_feed_type | str \| None | The feed type of the podcast feed. |
description | str \| None | The provided description of the podcast. |
release_frequency | str | The detected release frequency. One of: daily, often, weekly, biweekly, monthly, adhoc, unknown. |
feed_contains_itunes_data | bool | Whether the podcast feed contains itunes data. |
feed_contains_podcast_index_data | bool | Whether the podcast feed contains podcast index elements. |
feed_contains_tracking_data | bool | Whether the podcast feed contains third-party tracking data. |
feed_contains_structured_donation_data | bool | Whether the feed contains donation links. |
funding_url | str \| None | Provided URL for donations/support. |
probable_feed_host | str \| None | Current assessment of the feed hosting company. |
itunes_categories | QuerySet[ItunesCategory] | The listed iTunes categories. |
tags | list[str] | The list of keywords/tags declared in the feed. |
analysis_group | QuerySet[AnalysisGroup] | The associated analysis groups. |
median_episode_duration_timedelta
property
Returns the median duration as a timedelta.
total_duration_timedelta
property
Returns the total duration of the podcast as a timedelta object.
ReleaseFrequency
Bases: TextChoices
Choices for release frequency.
afetch_podcast_cover_art
async
Does an async request to fetch the cover art of the podcast.
Source code in src/podcast_analyzer/models.py
alast_release_date
async
Do an async fetch of the last release date.
Source code in src/podcast_analyzer/models.py
analyze_feed
async
Does additional analysis on release schedule, probable host, and whether third-party tracking prefixes appear to be present.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
episode_limit | int | Limit the result to the last n episodes. Zero for no limit. Default 0. | 0 |
full_episodes_only | bool | Exclude bonus episodes and trailers from analysis. Default True. | True |
Source code in src/podcast_analyzer/models.py
analyze_feed_for_third_party_analytics
async
Check if we spot any known analytics trackers.
Source code in src/podcast_analyzer/models.py
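One plausible way to spot trackers is to look for known redirect markers in episode download URLs. The sketch below assumes that approach; the helper `uses_tracker` and the marker list are illustrative only and are not the package's actual implementation or tracker list.

```python
# Illustrative substrings only; NOT the package's actual list of trackers.
EXAMPLE_TRACKER_MARKERS = ("podtrac.com/", "chtbl.com/track/")


def uses_tracker(download_urls, markers=EXAMPLE_TRACKER_MARKERS):
    """Return True if any download URL contains one of the markers."""
    return any(
        marker in url
        for url in download_urls
        if url  # skip episodes without a download URL
        for marker in markers
    )


tracked = uses_tracker(
    ["https://dts.podtrac.com/redirect.mp3/example.com/ep1.mp3"]
)
untracked = uses_tracker(["https://example.com/ep1.mp3", None])
```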
analyze_host
async
Attempt to determine the host for a given podcast based on what information we can see.
Source code in src/podcast_analyzer/models.py
calculate_median_release_difference
async
staticmethod
Given a queryset of episodes, calculate the median difference and return it.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
episodes | QuerySet[Episode] | Episodes to use for calculation. | required |
Returns: A timedelta object representing the median difference between releases.
Source code in src/podcast_analyzer/models.py
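The calculation can be sketched without Django: sort the release datetimes, take the gaps between consecutive releases, and return the median gap. `median_release_difference` is a hypothetical helper that accepts plain datetimes instead of a QuerySet, and raising on fewer than two releases is an assumption.

```python
from datetime import datetime, timedelta
from statistics import median


def median_release_difference(release_datetimes):
    """Return the median gap between consecutive releases as a timedelta."""
    ordered = sorted(release_datetimes)
    if len(ordered) < 2:
        # Assumption: with fewer than two releases there is no gap to measure.
        raise ValueError("need at least two releases")
    gaps = [(b - a).total_seconds() for a, b in zip(ordered, ordered[1:])]
    return timedelta(seconds=median(gaps))


releases = [
    datetime(2024, 1, 1),
    datetime(2024, 1, 8),
    datetime(2024, 1, 15),
    datetime(2024, 1, 29),
]
typical_gap = median_release_difference(releases)
```

Using the median rather than the mean keeps one unusually long break between episodes from skewing the result.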
calculate_next_refresh_time
Given a podcast object, calculate the ideal next refresh time.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
last_release_date | datetime | Provide the last release date of an episode. | required |
Returns: Datetime for next refresh.
Source code in src/podcast_analyzer/models.py
fetch_podcast_cover_art
Does a synchronous request to fetch the cover art of the podcast.
Source code in src/podcast_analyzer/models.py
get_feed_data
Fetch a remote feed and return the rendered dict.
Returns:
Type | Description |
---|---|
dict[str, Any] | A dict from the |
Source code in src/podcast_analyzer/models.py
last_release_date
Return the most recent episode's release datetime.
Source code in src/podcast_analyzer/models.py
median_episode_duration
process_cover_art_data
process_cover_art_data(
cover_art_data: BytesIO,
cover_art_url: str,
reported_mime_type: str | None,
) -> None
Takes the received art from a given art update and then attempts to process it.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
cover_art_data | BytesIO | The received art data. | required |
cover_art_url | str | The file name of the art data. | required |
reported_mime_type | str \| None | Mime type reported by the server to be validated. | required |
Source code in src/podcast_analyzer/models.py
refresh_feed
Fetches the source feed and updates the record. This is best handled as a scheduled task in a worker process.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
update_existing_episodes | bool | Update existing episodes with new data? | False |
Returns:
Type | Description |
---|---|
int | An int representing the number of added episodes. |
Source code in src/podcast_analyzer/models.py
schedule_next_refresh
Given a podcast object, schedule its next refresh in the worker queue.
Source code in src/podcast_analyzer/models.py
set_dormant
async
Check if the latest episode is more than 65 days old, and set dormant to true if so.
Source code in src/podcast_analyzer/models.py
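A minimal sketch of the dormancy check, assuming "dormant" means that no episode has been released within the last 65 days. `is_dormant` and `DORMANT_AFTER` are hypothetical names, and the handling of a missing release date is an assumption.

```python
from datetime import datetime, timedelta, timezone

DORMANT_AFTER = timedelta(days=65)  # threshold taken from the description


def is_dormant(last_release, now=None):
    """Return True if the latest release is older than the threshold."""
    if last_release is None:
        # Assumption: a podcast with no known release is not flagged.
        return False
    now = now or datetime.now(timezone.utc)
    return now - last_release > DORMANT_AFTER


reference = datetime(2024, 6, 1, tzinfo=timezone.utc)
stale = is_dormant(reference - timedelta(days=90), now=reference)
fresh = is_dormant(reference - timedelta(days=10), now=reference)
```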
set_release_frequency
async
Calculate and set the release frequency.
Source code in src/podcast_analyzer/models.py
total_duration_seconds
Returns the total duration of all episodes in seconds.
Source code in src/podcast_analyzer/models.py
total_episodes
update_episodes_from_feed_data
update_episodes_from_feed_data(
episode_list: list[dict[str, Any]],
*,
update_existing_episodes: bool = False
) -> int
Given a list of feed items representing episodes, process them into records.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
episode_list | list[dict[str, Any]] | The | required |
update_existing_episodes | bool | Update existing episodes? | False |
Returns:
Type | Description |
---|---|
int | The number of episodes created or updated. |
Source code in src/podcast_analyzer/models.py
update_podcast_metadata_from_feed_data
Given the parsed feed data, update the podcast channel level metadata in this record.
Source code in src/podcast_analyzer/models.py
PodcastAppearanceData
dataclass
PodcastAppearanceData(
podcast: Podcast,
hosted_episodes: QuerySet[Episode],
guested_episodes: QuerySet[Episode],
)
Dataclass for sending back structured appearance data for an individual on a single podcast.
Attributes:
Name | Type | Description |
---|---|---|
podcast | Podcast | Podcast the data relates to. |
hosted_episodes | QuerySet[Episode] | Episodes hosted by them. |
guested_episodes | QuerySet[Episode] | Episodes where they appeared as a guest. |
Season
Bases: UUIDTimeStampedModel
A season for a given podcast.
Attributes:
Name | Type | Description |
---|---|---|
podcast | Podcast | The podcast the season belongs to. |
season_number | int | The season number. |
analysis_group | QuerySet[AnalysisGroup] | Analysis Groups this is assigned to. |
TimeStampedModel
Bases: Model
An abstract model with created and modified timestamp fields.
UUIDTimeStampedModel
Bases: TimeStampedModel
Base model for all our object records.
Attributes:
Name | Type | Description |
---|---|---|
id | UUIDField | Unique ID. |
created | DateTimeField | Creation time. |
modified | DateTimeField | Modification time. |
cached_properties | list[str] | Names of cached properties that should be dropped on refresh_from_db |
refresh_from_db
Also clear out cached_properties.
Source code in src/podcast_analyzer/models.py
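The idea of dropping cached values on refresh can be shown without Django. The sketch below assumes the cached properties store their value in the instance `__dict__`, as `functools.cached_property` does; `CachedPropertyMixin`, `clear_cached_properties`, and `Example` are hypothetical names, not the package's code.

```python
from functools import cached_property


class CachedPropertyMixin:
    """Hypothetical stand-in showing how cached values can be dropped."""

    cached_properties: list[str] = []

    def clear_cached_properties(self):
        for name in self.cached_properties:
            # cached_property stores its result in the instance __dict__;
            # removing the key forces a recalculation on next access.
            self.__dict__.pop(name, None)


class Example(CachedPropertyMixin):
    cached_properties = ["total"]

    def __init__(self, values):
        self.values = values

    @cached_property
    def total(self):
        return sum(self.values)


example = Example([1, 2])
first = example.total          # computed and cached
example.values.append(3)
still_cached = example.total   # stale cached value
example.clear_cached_properties()
recomputed = example.total     # recalculated after clearing
```

In a model, a call like `clear_cached_properties()` would sit inside an overridden `refresh_from_db`, so reloaded field values and derived values stay in step.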
calculate_median_episode_duration
Given an iterable of episode objects, calculate the median duration.
If not a QuerySet, first convert to a queryset to order and extract values.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
episodes | Iterable[Episode] | An iterable of episode objects, e.g. a list or QuerySet | required |
Returns:
Name | Type | Description |
---|---|---|
int | int | The median duration in seconds. |
Source code in src/podcast_analyzer/models.py
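A minimal sketch of the median calculation on plain duration values rather than Episode objects. `median_duration_seconds` is a hypothetical helper; skipping missing durations, returning 0 for an empty input, and truncating to whole seconds are all assumptions, not confirmed behaviour.

```python
from statistics import median


def median_duration_seconds(durations):
    """Return the median of the known durations in whole seconds."""
    known = [d for d in durations if d is not None]
    if not known:
        # Assumption: no usable durations yields 0.
        return 0
    return int(median(known))


odd_count = median_duration_seconds([600, 1200, None, 1800])
even_count = median_duration_seconds([600, 1200])
```

With an even number of values, `statistics.median` averages the two middle values before the result is truncated to an int.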
podcast_art_directory_path
Used for caching the podcast channel cover art.