dplocki / podcast-downloader
The Python script for downloading new mp3 from RSS given channels
License: GNU General Public License v3.0
Is your feature request related to a problem? Please describe.
Currently, every option in the configuration file is podcast data; there is no room for general options.
Describe the solution you'd like
The config file should have a section for general options.
Describe the bug
The final filename can exceed 255 characters if the template string includes another pattern in addition to the title, causing the program to crash.
To Reproduce
Steps to reproduce the behavior:
Set file_name_template to "[%publish_date%] %title%.%file_extension%".
Download an episode with a title longer than 255 chars.
Expected behavior
The program should not crash; the expanded template needs to be truncated.
Additional context
I checked the code. It looks like the truncation applies only to the title, not to the expanded template.
def str_to_filename(value: str) -> str:
    value = unicodedata.normalize("NFKC", value)
    value = re.sub(r"[\u0000-\u001F\u007F\*/:<>\?\\\|]", " ", value)
    return value.strip()[:FILE_NAME_CHARACTER_LIMIT]


def file_template_to_file_name(name_template: str, entity: RSSEntity) -> str:
    return (
        name_template.replace("%file_name%", link_to_file_name(entity.link))
        .replace("%publish_date%", time.strftime("%Y%m%d", entity.published_date))
        .replace("%file_extension%", link_to_extension(entity.link))
        .replace("%title%", str_to_filename(entity.title))
    )
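One possible direction, sketched under assumptions: truncate the fully expanded name instead of only the title, and keep the extension intact. `limit_file_name` and `FILE_NAME_BYTE_LIMIT` are hypothetical names, and the limit is counted here in UTF-8 bytes; the project's own FILE_NAME_CHARACTER_LIMIT may be defined differently.

```python
import os

FILE_NAME_BYTE_LIMIT = 255  # assumption: per-name limit counted in UTF-8 bytes


def limit_file_name(file_name: str, limit: int = FILE_NAME_BYTE_LIMIT) -> str:
    """Shorten the stem of file_name so the whole name fits in `limit` bytes."""
    stem, extension = os.path.splitext(file_name)
    budget = max(limit - len(extension.encode("utf-8")), 0)
    # Cut on a byte boundary, then drop any character that was cut in half.
    short_stem = stem.encode("utf-8")[:budget].decode("utf-8", errors="ignore")
    return short_stem.rstrip() + extension
```

Such a helper would be applied to the result of the template expansion, after every placeholder has been replaced.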
New feature "download-n-last-episodes" (as part of if_directory_empty)
Love your lib.
On macOS, I could not get the code to rename the files; the name is always default.mp3:
[2024-05-16 13:05:15] Flophouse: Downloading file: "https://afp-9384.calisto.simplecastaudio.com/fd7fc5f6-2d39-4a19-a56f-c31910966c15/episodes/e744acd8-b92e-4fce-83d2-daa0f9f55ce4/audio/128/default.mp3?awCollectionId=fd7fc5f6-2d39-4a19-a56f-c31910966c15&awEpisodeId=e744acd8-b92e-4fce-83d2-daa0f9f55ce4&nocache"
This is the RSS:
<item>
<title>Episode 424 – Baby Geniuses, with Linda Holmes</title>
<link>https://www.flophousepodcast.com/2024/05/episode-424-baby-geniuses-with-linda-holmes/</link>
<dc:creator><![CDATA[flophouse]]></dc:creator>
<pubDate>Sat, 11 May 2024 12:00:00 +0000</pubDate>
<category><![CDATA[Episodes]]></category>
<category><![CDATA[Baby Geniuses]]></category>
<category><![CDATA[Christopher Lloyd]]></category>
<category><![CDATA[Dan McCoy]]></category>
<category><![CDATA[Dom DeLuise]]></category>
<category><![CDATA[Elliott Kalan]]></category>
<category><![CDATA[flop flashback]]></category>
<category><![CDATA[Kathleen Turner]]></category>
<category><![CDATA[Kim Cattrall]]></category>
<category><![CDATA[Linda Holmes]]></category>
<category><![CDATA[Peter MacNicol]]></category>
<category><![CDATA[Stuart Wellington]]></category>
And my config file:
{
  "name": "Flophouse",
  "rss_link": "https://www.flophousepodcast.com/feed/",
  "path": "/Users/me/FlopHouse",
  "file_name_template": "[%publish_date%]-%title%.%file_extension%"
}
C:\Users\Filipe Mota>python -m podcast_downloader
[2023-10-15 16:20:41] Loading configuration (from file: "~/.podcast_downloader_config.json")
Traceback (most recent call last):
  File "<frozen runpy>", line 198, in _run_module_as_main
  File "<frozen runpy>", line 88, in _run_code
  File "C:\Users\Filipe Mota\AppData\Roaming\Python\Python312\site-packages\podcast_downloader\__main__.py", line 159, in <module>
    load_configuration_file(os.path.expanduser(CONFIG_FILE)),
    ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Users\Filipe Mota\AppData\Roaming\Python\Python312\site-packages\podcast_downloader\parameters.py", line 21, in load_configuration_file
    return json.load(json_file)
           ^^^^^^^^^^^^^^^^^^^^
  File "C:\Program Files\Python312\Lib\json\__init__.py", line 293, in load
    return loads(fp.read(),
           ^^^^^^^^^^^^^^^^
  File "C:\Program Files\Python312\Lib\json\__init__.py", line 346, in loads
    return _default_decoder.decode(s)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Program Files\Python312\Lib\json\decoder.py", line 337, in decode
    obj, end = self.raw_decode(s, idx=_w(s, 0).end())
               ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Program Files\Python312\Lib\json\decoder.py", line 353, in raw_decode
    obj, end = self.scan_once(s, idx)
               ^^^^^^^^^^^^^^^^^^^^^^
json.decoder.JSONDecodeError: Invalid \escape: line 7 column 23 (char 217)
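An `Invalid \escape` error from the json module typically means a string in the file contains a single backslash, for example in a Windows path, which JSON reads as the start of an escape sequence. The reporter's config is not shown, so the path below is a made-up example; the sketch only demonstrates the behaviour of the standard `json` module.

```python
import json

# Single backslashes: "\U" is not a valid JSON escape sequence.
broken = r'{"path": "C:\Users\me\Podcasts"}'
# Either double every backslash or use forward slashes instead.
escaped = r'{"path": "C:\\Users\\me\\Podcasts"}'
forward = '{"path": "C:/Users/me/Podcasts"}'

try:
    json.loads(broken)
    broken_error = None
except json.JSONDecodeError as error:
    broken_error = error.msg  # message of the decode failure

escaped_path = json.loads(escaped)["path"]
forward_path = json.loads(forward)["path"]
```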
Describe the bug
There is no way to limit the number of downloaded files in the configuration file.
Expected behavior
I can enter a limit value in the configuration file.
Is your feature request related to a problem? Please describe.
When I download many episodes of a podcast, the program sometimes blocks for a long time on a single episode, and I have to kill and restart it manually, which isn't automated enough.
Describe the solution you'd like
I'm wondering if it's possible to set up a timeout with a restart or skip mechanism, so that every podcast in the config ends up downloaded smoothly.
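A skip-on-timeout mechanism could look roughly like the sketch below. It relies on the `timeout` argument of `urllib.request.urlopen`; `download_or_skip` and its `opener` parameter are hypothetical names introduced for illustration and are not part of podcast_downloader.

```python
import shutil
import urllib.request


def download_or_skip(url, target_path, timeout=60.0, opener=urllib.request.urlopen):
    """Return True when the file was saved, False when the download was skipped."""
    try:
        # timeout bounds each blocking socket operation, not the whole transfer.
        with opener(url, timeout=timeout) as response, open(target_path, "wb") as target:
            shutil.copyfileobj(response, target)
        return True
    except OSError:
        # URLError, HTTPError and socket timeouts are all OSError subclasses.
        return False
```

A partially written file can remain after a timeout in the middle of a transfer, so a fuller version would probably download to a temporary name and rename it on success.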
Describe the bug
I tried to use Podcast Downloader to download a podcast from bowuzhi.fm, but got the following error:
urllib.error.HTTPError: HTTP Error 403: Forbidden
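A 403 from a feed host is often caused by the server rejecting the default `Python-urllib` User-Agent; whether that is the cause for this particular site is an assumption. A minimal sketch of sending an explicit header with `urllib.request.Request`; `build_request`, the header value and the example URL are illustrative only.

```python
import urllib.request

# Assumption: any browser-style value; the exact string is arbitrary.
USER_AGENT = "Mozilla/5.0 (compatible; podcast-downloader-example)"


def build_request(url: str) -> urllib.request.Request:
    """Wrap url in a Request that carries an explicit User-Agent header."""
    return urllib.request.Request(url, headers={"User-Agent": USER_AGENT})


request = build_request("https://example.com/feed.xml")
# Passing `request` to urllib.request.urlopen would send the custom header.
```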
Is your feature request related to a problem? Please describe.
Some podcasts, like this one, have both .mp3 and .m4a audio files.
Describe the solution you'd like
It would be cool if the script could download both kinds!
Describe alternatives you've considered
Doing it in a shell command instead 🤷🏻 😅 I prefer the way your script keeps track of files already downloaded though!
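Supporting both kinds could come down to checking each enclosure link against a set of allowed extensions. A rough sketch; `has_allowed_extension`, `ALLOWED_EXTENSIONS` and the example links are hypothetical, and this is not a description of how the script currently selects files.

```python
import os
from urllib.parse import urlparse

ALLOWED_EXTENSIONS = {".mp3", ".m4a"}  # assumption: would come from the config


def has_allowed_extension(link: str, allowed=ALLOWED_EXTENSIONS) -> bool:
    """Check the extension of the URL path, ignoring any query string."""
    path = urlparse(link).path
    return os.path.splitext(path)[1].lower() in allowed


links = [
    "https://example.com/episodes/one.mp3?token=abc",
    "https://example.com/episodes/two.M4A",
    "https://example.com/episodes/cover.jpg",
]
audio_links = [link for link in links if has_allowed_extension(link)]
```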
Is your feature request related to a problem? Please describe.
Since the project became a Python module, the configuration file needs to be in the home directory.
Describe the solution you'd like
The configuration needs to be placed in the home path, so that it is independent of where the script is called from.
Describe alternatives you've considered
I think a script parameter would be nice.
Describe the bug
Every opened pull request starts a deploy.
Describe the bug
The download result is NONE for this XML feed: https://feed.xyzfm.space/jve6gh9jt8vm
Screenshots
[2023-05-27 10:54:22] Loading configuration (from file: "D:\AudioProject\data_engineering\podcast-downloader-master\config\config.json")
[2023-05-27 10:54:22] Checking "北海怪兽"
[2023-05-27 10:54:28] Last downloaded file ""
[2023-05-27 10:54:28] 北海怪兽: Nothing new
[2023-05-27 10:54:28] ------------------------------
[2023-05-27 10:54:28] Finished
Describe the bug
This is not the project's fault but the podcast RSS's fault; I wonder if there's a solution.
Downloading an RSS feed like 'https://feeds.audiomeans.fr/feed/88cf4afb-075f-42e2-b94b-3f3d4ed98f69.xml' returns: "AttributeError: object has no attribute 'href'"
To Reproduce
Steps to reproduce the behavior:
{
  "if_directory_empty": "download_all_from_feed",
  "podcasts": [
    {
      "name": "test",
      "rss_link": "https://feeds.audiomeans.fr/feed/88cf4afb-075f-42e2-b94b-3f3d4ed98f69.xml",
      "path": "~/test"
    }
  ]
}
Describe the bug
The feed does not exist, but the script acts normally, as if nothing happened.
To Reproduce
Steps to reproduce the behaviour:
{
  "if_directory_empty": "download_from_4_days",
  "podcasts": [
    {
      "name": "Python for dummies",
      "rss_link": "http://python-for-dummies/atom.rss",
      "path": "~/podcasts/PythonForDummies"
    },
Log:
[2024-03-21 21:29:59] Checking "Python for dummies"
[2024-03-21 21:29:59] Last downloaded file "<none>"
[2024-03-21 21:29:59] Python for dummies: Nothing new
Expected behavior
An error?:)
Hi, thanks for the excellent work.
1 - I came to ask if you could add new variables.
For example in this RSS I would like to get the description and the author
https://www.omnycontent.com/d/playlist/8c0a4104-a688-4e57-91fd-ad7b00d5dddd/a32cf512-c3ce-4057-8ec8-af3400c547e5/ac708daf-04da-4352-ae6d-af3400ca82ad/podcast.rss
2 - The same RSS gives this error:
[2023-05-08 16:42:55] Error: The podcast file "https://traffic.omny.fm/d/clips/8c0a4104-a688-4e57-91fd-ad7b00d5dddd/a32cf512-c3ce-4057-8ec8-af3400c547e5/f789c11e-447f-460d-a89c-af390172e0b3/audio.mp3?utm_source=Podcast&in_playlist=ac708daf-04da-4352-ae6d-af3400ca82ad" could not be saved to disk "C:\Users\Filipe Mota/Downloads/Podcast/A caminho do Catar\[20221027] Portugueses a viver no Catar "É um país muito rico e compensa vir para cá trabalhar".mp3" due to the following error:
[Errno 22] Invalid argument: 'C:\Users\Filipe Mota/Downloads/Podcast/A caminho do Catar\[20221027] Portugueses a viver no Catar "É um país muito rico e compensa vir para cá trabalhar".mp3'
3 - The RSS below gives this error in ep1 and in the trailer:
[2023-05-08 15:43:26] Error: The podcast file "https://traffic.omny.fm/d/clips/b04d3ae5-22c4-41b6-b20a-aa54000ba759/4093b241-20e0-4025-8a00-afba013b2218/29e80dd4-0527-4d7d-85e9-afc401721117/audio.mp3?utm_source=Podcast&in_playlist=b150e14d-4d2e-4c4e-9cf2-afba013f7a91" could not be saved to disk "C:\Users\Filipe Mota/Downloads/Podcast/O Sargento na Cela 7\[20230314] Estreia. "O Sargento na Cela 7". Episódio 1 O Prisioneiro.mp3" due to the following error:
[Errno 22] Invalid argument: 'C:\Users\Filipe Mota/Downloads/Podcast/O Sargento na Cela 7\[20230314] Estreia. "O Sargento na Cela 7". Episódio 1 O Prisioneiro.mp3'
4 - And I wish there was an alternative to the date.
YEARMMDD and YEAR.MM.DD
With the dots on the dates it would make it a lot easier to read
5 - I have a question
Is it possible to have more than one podcast in a JSON file? I tried and couldn't.
6 - Error because of accents
https://rss.podplaystudio.com/3240.xml
Thanks and keep up the great work.
Best regards,
BlackSpirits
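Both `[Errno 22]` failures reported above involve file names that contain double quotes, which Windows does not allow in file names, and the `str_to_filename` pattern quoted earlier on this page does not list `"`. A sketch of a stricter sanitizer, assuming that is the cause; `str_to_windows_safe_filename` and `FORBIDDEN` are hypothetical names.

```python
import re
import unicodedata

# Same character set as the quoted str_to_filename pattern, plus the double quote.
FORBIDDEN = re.compile(r'[\u0000-\u001F\u007F"*/:<>?\\|]')


def str_to_windows_safe_filename(value: str) -> str:
    """Replace each character matched by FORBIDDEN with a space."""
    value = unicodedata.normalize("NFKC", value)
    return FORBIDDEN.sub(" ", value).strip()


title = 'Estreia. "O Sargento na Cela 7". Episódio 1 O Prisioneiro'
safe_title = str_to_windows_safe_filename(title)
```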
To Reproduce
Steps to reproduce the behavior:
Enter configuration
{
  "if_directory_empty": "download_all_from_feed",
  "podcasts": [
    {
      "name": "Thai PBS Podcast",
      "rss_link": "https://www.thaipbspodcast.com/program-rss.php?id=133",
      "path": "xxx",
      "podcast_extensions": {".mp3": "audio/x-m4a"}
    }
  ]
}
See error
The file sizes are all 65KB and there is a read error.
Describe the bug
If you try to check an empty directory, an exception is thrown.
To Reproduce
Steps to reproduce the behavior:
Additional context
Traceback (most recent call last):
  File "/usr/lib/python3.7/runpy.py", line 193, in _run_module_as_main
    "__main__", mod_spec)
  File "/usr/lib/python3.7/runpy.py", line 85, in _run_code
    exec(code, run_globals)
  File "__main__.py", line 79, in <module>
    last_downloaded_file = get_last_downloaded(rss_source_path)
  File "downloaded.py", line 23, in get_last_downloaded
    return next(get_downloaded_files(podcast_directory))
StopIteration
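The traceback shows `next()` being called on an iterator that yields nothing when the directory is empty. A minimal sketch of the usual guard, passing a default value to `next`; the body of `get_downloaded_files` below is a stand-in, since the real one is not shown here.

```python
import os
from typing import Iterator, Optional


def get_downloaded_files(podcast_directory: str) -> Iterator[str]:
    # Stand-in implementation: .mp3 files, most recently modified first.
    names = (name for name in os.listdir(podcast_directory) if name.endswith(".mp3"))
    paths = [os.path.join(podcast_directory, name) for name in names]
    return iter(sorted(paths, key=os.path.getmtime, reverse=True))


def get_last_downloaded(podcast_directory: str) -> Optional[str]:
    # The second argument is returned instead of raising StopIteration.
    return next(get_downloaded_files(podcast_directory), None)
```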
Describe the bug
The script cannot find the existing configuration file in the home directory: ~/.podcast_downloader_config.json
To Reproduce
Steps to reproduce the behavior:
Expected behavior
Run without problems
Describe the bug
Can't download from this RSS: "https://www.omnycontent.com/d/playlist/6dd8413b-ede6-483a-bf4e-ab80014939de/20f4bf02-d62f-40b2-b532-af10011ba71b/2bdbf0f4-e0ca-4343-9fb2-af10011ba729/podcast.rss"
To Reproduce
JSON file:
{
  "if_directory_empty": "download_from_4_days",
  "podcasts": [
    {
      "name": " Listening Time",
      "rss_link": "https://www.omnycontent.com/d/playlist/6dd8413b-ede6-483a-bf4e-ab80014939de/20f4bf02-d62f-40b2-b532-af10011ba71b/2bdbf0f4-e0ca-4343-9fb2-af10011ba729/podcast.rss",
      "path": "./ttt",
      "file_name_template": "[%publish_date%] %title%.%file_extension%"
    }
  ]
}
Command:
python3 -m podcast_downloader
Expected behavior
Download episodes.
Error message
[2023-02-05 14:48:21] Loading configuration (from file: "~/.podcast_downloader_config.json")
[2023-02-05 14:48:21] Checking " Listening Time"
[2023-02-05 14:48:22] Last downloaded file "<none>"
[2023-02-05 14:48:22] Listening Time: Nothing new
[2023-02-05 14:48:22] ------------------------------
[2023-02-05 14:48:22] Finished
Is your feature request related to a problem? Please describe.
For better organization, it would be useful if the file name could contain the episode title.
Describe the solution you'd like
A new flag in the configuration could be require_title.
Is your feature request related to a problem? Please describe.
The check-in workflow is missing.
Describe the solution you'd like
Add a workflow that checks every new commit by running the tests.
Is your feature request related to a problem? Please describe.
Currently, if the podcast directory is empty, the script downloads all mp3s from the RSS. That is not good for someone who is already up to date with the podcast.
Describe the solution you'd like
An option in the config file that determines how often the script is run (e.g. as a number of days).
Describe the bug
Wrong version of podcast_downloader is installed
To Reproduce
Steps to reproduce the behavior:
pip3 install podcast_downloader
or python3 -m pip install podcast_downloader
pip3 show podcast_downloader
Expected behavior
The installed version should be 0.2.0 or the latest.
Additional context
Originally thought this was an issue with the app itself, but realized I didn't actually have the latest version of the package