dplocki / podcast-downloader

Stars: 109, Watchers: 2, Forks: 15, Size: 1.21 MB

A Python script for downloading new MP3 files from given RSS channels

License: GNU General Public License v3.0

Python 99.74% Dockerfile 0.26%

python3 podcast script automation rss rss-feed-bot no-database json-configuration

podcast-downloader's People

Forkers

bnolet ru-fu podcast-ai 10-15-5 yeye-coder shoz-f bmilde vocacash amriz npow ryanquey lucasjet81 puluo2void hamzafarhan oceanofanythingofficial

podcast-downloader's Issues

The structure of configuration file needs to be redesigned

Is your feature request related to a problem? Please describe.
Currently, every option in the configuration file is podcast data. There is no room for general options.

Describe the solution you'd like
The config file should have a section for general options.
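
One possible shape, sketched under assumptions rather than taken from the project: top-level keys act as general options, a "podcasts" list holds the per-feed entries, and a per-feed value overrides the general one. The keys "if_directory_empty" and "podcasts" appear in later reports on this page; the "downloads_limit" key and the merge_options helper are hypothetical names used only for illustration.

```python
import json

# Hypothetical configuration: general options at the top level,
# per-podcast data inside the "podcasts" list.
CONFIG_TEXT = """
{
    "if_directory_empty": "download_from_4_days",
    "downloads_limit": 5,
    "podcasts": [
        {"name": "First", "rss_link": "http://example.invalid/a", "path": "~/podcasts/a"},
        {"name": "Second", "rss_link": "http://example.invalid/b", "path": "~/podcasts/b",
         "downloads_limit": 1}
    ]
}
"""


def merge_options(config: dict) -> list:
    """Return one settings dict per podcast; podcast values override general ones."""
    general = {key: value for key, value in config.items() if key != "podcasts"}
    return [{**general, **podcast} for podcast in config.get("podcasts", [])]


settings = merge_options(json.loads(CONFIG_TEXT))
```

With this layout a new general option only needs a new top-level key, and existing per-podcast entries keep working unchanged.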

The script output is on stderr

Deal with "gaps" of episodes inside the podcast directory

Final filename can exceed 255 chars

Describe the bug
The final filename can exceed 255 characters if the template string includes another pattern in addition to the title, causing the program to crash.

To Reproduce
Steps to reproduce the behavior:

Set file_name_template to "[%publish_date%] %title%.%file_extension%". Download an episode with title longer than 255 chars.

Expected behavior
The program should not crash. The expanded template needs to be truncated.

Desktop (please complete the following information):

Link to RSS feed: https://lexfridman.com/feed/podcast/

Additional context
I checked the code. It looks like the truncation only applies to the title, not to the expanded template.

def str_to_filename(value: str) -> str:
    value = unicodedata.normalize("NFKC", value)
    value = re.sub(r"[\u0000-\u001F\u007F\*/:<>\?\\\|]", " ", value)

    return value.strip()[:FILE_NAME_CHARACTER_LIMIT]


def file_template_to_file_name(name_template: str, entity: RSSEntity) -> str:
    return (
        name_template.replace("%file_name%", link_to_file_name(entity.link))
        .replace("%publish_date%", time.strftime("%Y%m%d", entity.published_date))
        .replace("%file_extension%", link_to_extension(entity.link))
        .replace("%title%", str_to_filename(entity.title))
    )
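
A minimal sketch of the requested fix, assuming the limit should apply to the whole expanded name and that the extension must survive truncation. The helper limit_file_name is hypothetical, and the value 255 for FILE_NAME_CHARACTER_LIMIT is an assumption based on the issue title. Note that the sketch counts characters; many filesystems count bytes, so non-ASCII titles may need a stricter limit.

```python
FILE_NAME_CHARACTER_LIMIT = 255  # assumed value, taken from the issue title


def limit_file_name(file_name: str, limit: int = FILE_NAME_CHARACTER_LIMIT) -> str:
    """Shorten the part before the last dot so the whole name fits within `limit`."""
    stem, dot, extension = file_name.rpartition(".")
    suffix = dot + extension
    if not dot or len(suffix) >= limit:
        # No usable extension: fall back to a plain cut.
        return file_name[:limit]
    return stem[: limit - len(suffix)] + suffix


# Usage: apply the limit after every pattern in the template has been expanded.
expanded = "[20240511] " + "x" * 300 + ".mp3"
safe_name = limit_file_name(expanded)
```

Applying the cut after expansion means the date prefix and the extension count against the limit, which is the case the title-only truncation misses.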

Add download-n-last-episodes

New feature "download-n-last-episodes" (as part of if_directory_empty)

RSS feed for podcast: file_name_template is ignored

Love your lib.

On macOS I could not get the code to rename the files; the name is always default.mp3:
[2024-05-16 13:05:15] Flophouse: Downloading file: "https://afp-9384.calisto.simplecastaudio.com/fd7fc5f6-2d39-4a19-a56f-c31910966c15/episodes/e744acd8-b92e-4fce-83d2-daa0f9f55ce4/audio/128/default.mp3?awCollectionId=fd7fc5f6-2d39-4a19-a56f-c31910966c15&awEpisodeId=e744acd8-b92e-4fce-83d2-daa0f9f55ce4&nocache"

this is the RSS

<item>
		<title>Episode 424 &#8211; Baby Geniuses, with Linda Holmes</title>
		<link>https://www.flophousepodcast.com/2024/05/episode-424-baby-geniuses-with-linda-holmes/</link>
		
		<dc:creator><![CDATA[flophouse]]></dc:creator>
		<pubDate>Sat, 11 May 2024 12:00:00 +0000</pubDate>
				<category><![CDATA[Episodes]]></category>
		<category><![CDATA[Baby Geniuses]]></category>
		<category><![CDATA[Christopher Lloyd]]></category>
		<category><![CDATA[Dan McCoy]]></category>
		<category><![CDATA[Dom DeLuise]]></category>
		<category><![CDATA[Elliott Kalan]]></category>
		<category><![CDATA[flop flashback]]></category>
		<category><![CDATA[Kathleen Turner]]></category>
		<category><![CDATA[Kim Cattrall]]></category>
		<category><![CDATA[Linda Holmes]]></category>
		<category><![CDATA[Peter MacNicol]]></category>
		<category><![CDATA[Stuart Wellington]]></category>

and my config file

{
    "name": "Flophouse",
    "rss_link": "https://www.flophousepodcast.com/feed/",
    "path": "/Users/me/FlopHouse",
    "file_name_template": "[%publish_date%]-%title%.%file_extension%"
}

error

C:\Users\Filipe Mota>python -m podcast_downloader
[2023-10-15 16:20:41] Loading configuration (from file: "~/.podcast_downloader_config.json")
Traceback (most recent call last):
  File "<frozen runpy>", line 198, in _run_module_as_main
  File "<frozen runpy>", line 88, in _run_code
  File "C:\Users\Filipe Mota\AppData\Roaming\Python\Python312\site-packages\podcast_downloader\__main__.py", line 159, in <module>
    load_configuration_file(os.path.expanduser(CONFIG_FILE)),
    ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Users\Filipe Mota\AppData\Roaming\Python\Python312\site-packages\podcast_downloader\parameters.py", line 21, in load_configuration_file
    return json.load(json_file)
           ^^^^^^^^^^^^^^^^^^^^
  File "C:\Program Files\Python312\Lib\json\__init__.py", line 293, in load
    return loads(fp.read(),
           ^^^^^^^^^^^^^^^^
  File "C:\Program Files\Python312\Lib\json\__init__.py", line 346, in loads
    return _default_decoder.decode(s)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Program Files\Python312\Lib\json\decoder.py", line 337, in decode
    obj, end = self.raw_decode(s, idx=_w(s, 0).end())
               ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Program Files\Python312\Lib\json\decoder.py", line 353, in raw_decode
    obj, end = self.scan_once(s, idx)
               ^^^^^^^^^^^^^^^^^^^^^^
json.decoder.JSONDecodeError: Invalid \escape: line 7 column 23 (char 217)
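
The report does not include the configuration file, so the cause is an assumption: an "Invalid \escape" error from the JSON decoder usually means a string contains a single backslash, as in a Windows path. The sketch below reproduces that situation with made-up paths and shows two ways to write the same path that the standard json module accepts.

```python
import json

# Raw strings keep the backslashes exactly as they would appear in the file.
broken = r'{"path": "C:\Users\me\Podcasts"}'      # single backslashes: invalid JSON
escaped = r'{"path": "C:\\Users\\me\\Podcasts"}'  # doubled backslashes: valid
forward = '{"path": "C:/Users/me/Podcasts"}'      # forward slashes: valid

try:
    json.loads(broken)
    error_message = None
except json.JSONDecodeError as error:
    error_message = error.msg

escaped_path = json.loads(escaped)["path"]
forward_path = json.loads(forward)["path"]
```

Either doubling every backslash or switching to forward slashes in the "path" value should let the file load.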

Limit for download at once should be also present in configuration file

Describe the bug
There is no way to limit the number of downloaded files in the configuration file.

Expected behavior
I can enter the limit value in the configuration file.

Desktop (please complete the following information):

OS general
Python version: general
Version: 0.1.1
Link to RSS feed: general

Is there a timeout restart or timeout skip feature?

Is your feature request related to a problem? Please describe.
When I download many episodes of a podcast, the program sometimes blocks for a long time on a single episode, and I have to kill it manually and restart it, which isn't automated enough.

Describe the solution you'd like
I'm wondering if it's possible to set up a timeout restart or skip mechanism, so that every podcast in the config list ends up downloading smoothly.
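
A rough sketch of a skip-on-timeout download using only the standard library; it is not the project's implementation. The function download_or_skip, the constant DOWNLOAD_TIMEOUT_SECONDS, and the ".part" suffix are hypothetical. The timeout passed to urllib.request.urlopen bounds each blocking socket operation (connect, each read), not the total transfer time. Stand-in openers replace the network in the demonstration.

```python
import io
import os
import shutil
import socket
import tempfile
import urllib.error
import urllib.request

DOWNLOAD_TIMEOUT_SECONDS = 30  # hypothetical setting


def download_or_skip(url, destination, timeout=DOWNLOAD_TIMEOUT_SECONDS,
                     opener=urllib.request.urlopen):
    """Return True when the file was saved, False when the download was skipped."""
    partial = destination + ".part"
    try:
        with opener(url, timeout=timeout) as response, open(partial, "wb") as output:
            shutil.copyfileobj(response, output)
        os.replace(partial, destination)  # only complete files get the final name
        return True
    except (TimeoutError, socket.timeout, urllib.error.URLError) as error:
        if os.path.exists(partial):
            os.remove(partial)
        print(f"Skipping {url}: {error}")
        return False


# Demonstration with stand-in openers, so no network access is needed.
def _stalled(url, timeout):
    raise TimeoutError("timed out")


def _working(url, timeout):
    return io.BytesIO(b"audio bytes")


with tempfile.TemporaryDirectory() as directory:
    target = os.path.join(directory, "episode.mp3")
    skipped = download_or_skip("http://example.invalid/a.mp3", target, opener=_stalled)
    saved = download_or_skip("http://example.invalid/b.mp3", target, opener=_working)
    saved_size = os.path.getsize(target)
```

Returning False instead of raising lets a caller move on to the next episode or podcast in the list.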

Podcast Downloader cannot download podcast from bowuzhi.fm

Describe the bug
I tried to use Podcast Downloader to download podcast from bowuzhi.fm, but got the following error:
urllib.error.HTTPError: HTTP Error 403: Forbidden

Desktop (please complete the following information):

Link to RSS feed
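
One common cause of a 403 response is a server that rejects the default Python-urllib User-Agent. Whether that applies to bowuzhi.fm is not confirmed by the report, so the following is only a sketch of how a custom header can be sent with the standard library. The USER_AGENT value and the build_request helper are hypothetical.

```python
import urllib.request

# Hypothetical value; servers differ in what they accept.
USER_AGENT = "Mozilla/5.0 (compatible; podcast-downloader)"


def build_request(url: str) -> urllib.request.Request:
    """Wrap the URL in a Request that carries an explicit User-Agent header."""
    return urllib.request.Request(url, headers={"User-Agent": USER_AGENT})


request = build_request("https://example.invalid/episode.mp3")
# urllib.request.urlopen(request) would then send the custom header.
```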

Support mp4 file downloads

Is your feature request related to a problem? Please describe.
Some podcasts like this one have both .mp3 and .m4a audio files.

Describe the solution you'd like
It would be cool if the script could download both kinds!

Describe alternatives you've considered
Doing it in a shell command instead 🤷🏻 😅 I prefer the way your script keeps track of files already downloaded though!
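
A small sketch of filtering feed links by file extension, assuming the downloader decides what to fetch from the link itself. The names ALLOWED_EXTENSIONS, extension_of, and is_wanted are hypothetical, and the URLs are placeholders.

```python
import os
from urllib.parse import urlparse

ALLOWED_EXTENSIONS = {".mp3", ".m4a"}  # hypothetical set of accepted audio types


def extension_of(link: str) -> str:
    """Return the lower-cased extension of the URL path, ignoring the query string."""
    return os.path.splitext(urlparse(link).path)[1].lower()


def is_wanted(link: str) -> bool:
    return extension_of(link) in ALLOWED_EXTENSIONS


links = [
    "https://example.invalid/audio/default.mp3?nocache",
    "https://example.invalid/audio/episode.m4a",
    "https://example.invalid/cover.jpg",
]
wanted = [link for link in links if is_wanted(link)]
```

Parsing the URL first keeps query strings such as "?nocache" from hiding the extension.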

New method of checking the last time run

Default location of configuration file

Is your feature request related to a problem? Please describe.
Now that the project has become a Python module, the configuration file needs to be in the home directory.

Describe the solution you'd like
The configuration needs to be placed in the home path, so that it is independent of the directory the script is called from.

Describe alternatives you've considered
I think a script parameter would also be nice.
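
Both ideas can coexist. The sketch below assumes a default of ~/.podcast_downloader_config.json (the path that appears in the logs on this page) and an optional explicit path that takes precedence. The resolve_config_path helper and its cli_value argument are hypothetical.

```python
import os

CONFIG_FILE = "~/.podcast_downloader_config.json"  # default seen in the logs on this page


def resolve_config_path(cli_value=None):
    """Prefer an explicitly given path, else fall back to the home-directory default."""
    return os.path.abspath(os.path.expanduser(cli_value or CONFIG_FILE))


default_path = resolve_config_path()
custom_path = resolve_config_path("~/other/config.json")
```

Expanding "~" before opening the file makes the result independent of the current working directory.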

Package deploy job is reacting on every pull request

Describe the bug
Every opened pull request triggers a deploy.

The download result is NONE for this XML: https://feed.xyzfm.space/jve6gh9jt8vm

Describe the bug
The download result is NONE for this XML: https://feed.xyzfm.space/jve6gh9jt8vm

Screenshots
[2023-05-27 10:54:22] Loading configuration (from file: "D:\AudioProject\data_engineering\podcast-downloader-master\config\config.json")
[2023-05-27 10:54:22] Checking "北海怪兽"
[2023-05-27 10:54:28] Last downloaded file ""
[2023-05-27 10:54:28] 北海怪兽: Nothing new
[2023-05-27 10:54:28] ------------------------------
[2023-05-27 10:54:28] Finished

rss has no attribute 'href'

Describe the bug
This is not the project's fault but the podcast RSS's fault; I wonder if there is a solution.
With an RSS like 'https://feeds.audiomeans.fr/feed/88cf4afb-075f-42e2-b94b-3f3d4ed98f69.xml', the download fails with: "AttributeError: object has no attribute 'href'"

To Reproduce
Steps to reproduce the behavior:
{
    "if_directory_empty": "download_all_from_feed",
    "podcasts": [
        {
            "name": "test",
            "rss_link": "https://feeds.audiomeans.fr/feed/88cf4afb-075f-42e2-b94b-3f3d4ed98f69.xml",
            "path": "~/test"
        }
    ]
}

Non-existing feed does not cause error

Describe the bug

The feed does not exist, but the script acts normally, as if nothing happened.

To Reproduce
Steps to reproduce the behaviour:

{
    "if_directory_empty": "download_from_4_days",
    "podcasts": [
        {
            "name": "Python for dummies",
            "rss_link": "http://python-for-dummies/atom.rss",
            "path": "~/podcasts/PythonForDummies"
        },

Log:

[2024-03-21 21:29:59] Checking "Python for dummies"
[2024-03-21 21:29:59] Last downloaded file "<none>"
[2024-03-21 21:29:59] Python for dummies: Nothing new

Expected behavior
An error?:)

Add wait time before downloading.

Please, add new variables.

Hi, thanks for the excellent work.

1 - I came to ask if you could add new variables.

For example in this RSS I would like to get the description and the author
https://www.omnycontent.com/d/playlist/8c0a4104-a688-4e57-91fd-ad7b00d5dddd/a32cf512-c3ce-4057-8ec8-af3400c547e5/ac708daf-04da-4352-ae6d-af3400ca82ad/podcast.rss

2 - The same RSS gives this error

[2023-05-08 16:42:55] Error: The podcast file "https://traffic.omny.fm/d/clips/8c0a4104-a688-4e57-91fd-ad7b00d5dddd/a32cf512-c3ce-4057-8ec8-af3400c547e5/f789c11e-447f-460d-a89c-af390172e0b3/audio.mp3?utm_source=Podcast&in_playlist=ac708daf-04da-4352-ae6d-af3400ca82ad" could not be saved to disk "C:\Users\Filipe Mota/Downloads/Podcast/A caminho do Catar\[20221027] Portugueses a viver no Catar "É um país muito rico e compensa vir para cá trabalhar".mp3" due to the following error:
[Errno 22] Invalid argument: 'C:\Users\Filipe Mota/Downloads/Podcast/A caminho do Catar\[20221027] Portugueses a viver no Catar "É um país muito rico e compensa vir para cá trabalhar".mp3'

3 - The RSS below gives this error in ep1 and in the trailer.

[2023-05-08 15:43:26] Error: The podcast file "https://traffic.omny.fm/d/clips/b04d3ae5-22c4-41b6-b20a-aa54000ba759/4093b241-20e0-4025-8a00-afba013b2218/29e80dd4-0527-4d7d-85e9-afc401721117/audio.mp3?utm_source=Podcast&in_playlist=b150e14d-4d2e-4c4e-9cf2-afba013f7a91" could not be saved to disk "C:\Users\Filipe Mota/Downloads/Podcast/O Sargento na Cela 7\[20230314] Estreia. "O Sargento na Cela 7". Episódio 1 O Prisioneiro.mp3" due to the following error:
[Errno 22] Invalid argument: 'C:\Users\Filipe Mota/Downloads/Podcast/O Sargento na Cela 7\[20230314] Estreia. "O Sargento na Cela 7". Episódio 1 O Prisioneiro.mp3'

https://www.omnycontent.com/d/playlist/b04d3ae5-22c4-41b6-b20a-aa54000ba759/4093b241-20e0-4025-8a00-afba013b2218/b150e14d-4d2e-4c4e-9cf2-afba013f7a91/podcast.rss

4 - And I wish there was an alternative to the date.

YEARMMDD and YEAR.MM.DD

With the dots on the dates it would make it a lot easier to read

5 - I have a question

Is it possible to have more than one podcast in a JSON file? I tried and couldn't.

6 - Error because of accents

https://rss.podplaystudio.com/3240.xml

Thanks and keep up the great work.
Best regards,
BlackSpirits
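
Both [Errno 22] failures in items 2 and 3 of this report involve a title that contains double quotes, a character Windows does not allow in file names. The sketch below is an assumption about the cause, not a confirmed diagnosis: it is a variant of the str_to_filename function quoted earlier on this page, with the double quote added to the replaced characters. The FORBIDDEN pattern name is hypothetical.

```python
import re
import unicodedata

# Control characters plus the characters Windows rejects in file names:
# " * / : < > ? \ |
FORBIDDEN = re.compile(r'[\u0000-\u001F\u007F"*/:<>?\\|]')


def str_to_filename(value: str) -> str:
    """Normalize the text and replace characters that are unsafe in file names."""
    value = unicodedata.normalize("NFKC", value)
    return FORBIDDEN.sub(" ", value).strip()


title = 'Estreia. "O Sargento na Cela 7". Episódio 1 O Prisioneiro'
cleaned = str_to_filename(title)
```

Accented letters pass through untouched; only the quote characters are replaced with spaces.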

The downloaded audio does not match the audio provided by rss.

To Reproduce
Steps to reproduce the behavior:

Enter configuration
{
    "if_directory_empty": "download_all_from_feed",
    "podcasts": [
        {
            "name": "Thai PBS Podcast",
            "rss_link": "https://www.thaipbspodcast.com/program-rss.php?id=133",
            "path": "xxx",
            "podcast_extensions": {".mp3": "audio/x-m4a"}
        }
    ]
}
See error
The file sizes are all 65KB and there is a read error.


Display list of the files for given feed

If directory is empty an exception is thrown

Describe the bug
If you try to check an empty directory, an exception is thrown.

To Reproduce
Steps to reproduce the behavior:

setup podcast
make sure, that the directory of it is empty
run script

Additional context

Traceback (most recent call last):
  File "/usr/lib/python3.7/runpy.py", line 193, in _run_module_as_main
    "__main__", mod_spec)
  File "/usr/lib/python3.7/runpy.py", line 85, in _run_code
    exec(code, run_globals)
  File "__main__.py", line 79, in <module>
    last_downloaded_file = get_last_downloaded(rss_source_path)
  File "downloaded.py", line 23, in get_last_downloaded
    return next(get_downloaded_files(podcast_directory))
StopIteration
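
The traceback ends in next() being called on a generator that yields nothing. Passing a default to next() avoids the StopIteration. In the sketch below, get_downloaded_files is a stand-in written for this example, not the project's actual helper, and returning None for an empty directory is an assumed convention.

```python
import os
import tempfile


def get_downloaded_files(podcast_directory):
    """Yield file names in the directory, newest first (stand-in helper)."""
    entries = [entry for entry in os.scandir(podcast_directory) if entry.is_file()]
    for entry in sorted(entries, key=lambda e: e.stat().st_mtime, reverse=True):
        yield entry.name


def get_last_downloaded(podcast_directory):
    # next() with a default returns None instead of raising StopIteration.
    return next(get_downloaded_files(podcast_directory), None)


with tempfile.TemporaryDirectory() as directory:
    empty_result = get_last_downloaded(directory)
    open(os.path.join(directory, "episode.mp3"), "w").close()
    filled_result = get_last_downloaded(directory)
```

The caller then has to treat None as "nothing downloaded yet" rather than as a file name.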

Automatic reading the title of the feed from feed itself

OPML support

There is a problem with reading configuration file

Describe the bug
The script cannot find the existing configuration file in the home directory: ~/.podcast_downloader_config.json

To Reproduce
Steps to reproduce the behavior:

Place configuration file: ~/.podcast_downloader_config.json
Run script
See error: "Cannot find configuration file"

Expected behavior
Run without problems

Can't download from this RSS

Describe the bug
Can't download from this RSS: "https://www.omnycontent.com/d/playlist/6dd8413b-ede6-483a-bf4e-ab80014939de/20f4bf02-d62f-40b2-b532-af10011ba71b/2bdbf0f4-e0ca-4343-9fb2-af10011ba729/podcast.rss"

To Reproduce
JSON file:

{
    "if_directory_empty": "download_from_4_days",
    "podcasts": [
        {
            "name": " Listening Time",
            "rss_link": "https://www.omnycontent.com/d/playlist/6dd8413b-ede6-483a-bf4e-ab80014939de/20f4bf02-d62f-40b2-b532-af10011ba71b/2bdbf0f4-e0ca-4343-9fb2-af10011ba729/podcast.rss",
            "path": "./ttt",
            "file_name_template": "[%publish_date%] %title%.%file_extension%"
        }
    ]
}

Command:
python3 -m podcast_downloader

Expected behavior
Download episodes.

Error message

[2023-02-05 14:48:21] Loading configuration (from file: "~/.podcast_downloader_config.json")
[2023-02-05 14:48:21] Checking " Listening Time"
[2023-02-05 14:48:22] Last downloaded file "<none>"
[2023-02-05 14:48:22]  Listening Time: Nothing new
[2023-02-05 14:48:22] ------------------------------
[2023-02-05 14:48:22] Finished


Desktop (please complete the following information):

OS: Ubuntu
Python version: 3.8.10
Version: ??? (I don't know which version this refers to.)
Link to RSS feed: https://www.omnycontent.com/d/playlist/6dd8413b-ede6-483a-bf4e-ab80014939de/20f4bf02-d62f-40b2-b532-af10011ba71b/2bdbf0f4-e0ca-4343-9fb2-af10011ba729/podcast.rss

Run pip3 install podcast_downloader or python3 -m pip install podcast_downloader
Run pip3 show podcast_downloader
Observe version installed is 0.1.1

Expected behavior
Version of install to be 0.2.0 or latest


Desktop (please complete the following information):

OS: macOS
Python version: 3.9
Version: 0.1.1
Link to RSS feed: Not applicable

Additional context
I originally thought this was an issue with the app itself, but then realized I didn't actually have the latest version of the package.
