amuck-landowner

Download all MP3 in XML file

NodeBytes

Dedi Addict
Hello all,

I have an XML file with links to mp3 files. (RSS feed)

I need to download all the mp3 files that are linked to on this XML file.

Here's a sample item from the RSS feed.


<item>
<title>Simply Christmas, Part 3</title>
<itunes:author>Tom Hughes</itunes:author>
<itunes:summary/>
<enclosure url="http://cachurch.com//podcasttracking/667d51e5-828a-4091-bb86-e74ed94a8c75/3e3ad3c2-8df2-45c8-b155-d3bc89f96d51/SimplyChristmasPt3.mp3" length="51853488" type="audio/mpeg"/>
<guid>
http://cachurch.com//podcasttracking/667d51e5-828a-4091-bb86-e74ed94a8c75/3e3ad3c2-8df2-45c8-b155-d3bc89f96d51/SimplyChristmasPt3.mp3
</guid>
<pubDate>Sun, 22 Dec 2013 12:00:00 GMT</pubDate>
<itunes:duration>33:08</itunes:duration>
<itunes:keywords>
Tom, Hughes, Simply, Christmas, December, 21, 22, 2013, sermon
</itunes:keywords>
</item>

How would you scrape this for just the links to the mp3 files and then download them?

Thanks,

Brendan
 

dannix

New Member
The simplest would be to let wget do it for you. This will, however, download all listed files: wget -F -i your_xml_file (-i reads the links from the file, -F forces it to be parsed as HTML).
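One caveat with the wget approach: the mp3 links sit in the url= attribute of the <enclosure> tags rather than in plain href links, so HTML parsing may not pick them up. A stdlib-only Python sketch that pulls the enclosure URLs out first (the feed string here is just a trimmed-down version of the sample item from the first post; in practice you would read your real XML file):

```python
# Extract every <enclosure url="..."> from the feed with Python's
# standard library, then hand the URLs to wget (or urllib) to download.
import xml.etree.ElementTree as ET

# Trimmed-down sample feed for illustration only.
feed = """<rss><channel><item>
<enclosure url="http://cachurch.com//podcasttracking/667d51e5-828a-4091-bb86-e74ed94a8c75/3e3ad3c2-8df2-45c8-b155-d3bc89f96d51/SimplyChristmasPt3.mp3" length="51853488" type="audio/mpeg"/>
</item></channel></rss>"""

root = ET.fromstring(feed)
urls = [enc.get('url') for enc in root.iter('enclosure')]
print('\n'.join(urls))  # write these to a file, then: wget -i urls.txt
```

root.iter('enclosure') matches the tag at any depth, so the same loop works on a full feed with many items.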
 

fisle

Active Member
In Python:


# -*- coding: utf-8 -*-
import urllib
from bs4 import BeautifulSoup

url = 'insert_url_here'
data = urllib.urlopen(url).read()

soup = BeautifulSoup(data)
songs = soup.find_all('guid')
for song in songs:
    song = song.string.strip()
    filename = song.split('/')
    urllib.urlretrieve(song, filename[-1])
    print '{!s} downloaded'.format(filename[-1])

Needs BeautifulSoup4 (pip install beautifulsoup4)
 

texteditor

Premium Buffalo-based Hosting
Flexget is basically made from magic and is my go-to multitool for grabbing files from RSS. I use it for torrents, but the wiki has an example of scraping mp3s from HTML, and since FlexGet filters can be applied to RSS the same way, it should work just fine:

http://flexget.com/wiki/Plugins/rss

http://flexget.com/wiki/Plugins/html

Once you get your little YAML config set up correctly, just run flexget from cron.

It also keeps a database of files it has seen before so it doesn't accidentally grab the same one twice
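For the FlexGet route, a rough sketch of what such a YAML config might look like. The task name, feed URL, and download path are placeholders, and the exact plugin options should be checked against the wiki pages linked above:

```yaml
tasks:
  podcast-mp3s:
    rss: http://example.com/feed.xml   # placeholder; use the real feed URL
    accept_all: yes                    # grab every entry in the feed
    download: ~/podcasts/              # where the mp3 files end up
```

With accept_all every entry is downloaded; FlexGet's seen database is what keeps reruns from fetching the same file twice.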
 