This might be a bit difficult but is it possible (automation)?

tofumusic

I hope this is possible because I'm going to be stuck on this for a looooooooong time if not.
In a text editor I have a long list of entries from a lot of different web sites, each one linking to a file.
The string goes something like <ASX version = "3.0"> <Entry> <Ref href = "http://www.websitegoeshere.org/dir/filename.ext" /> </Entry> </ASX>
I'd like to go to each individual URL so the files download automatically, but right now the only way to do that is one at a time: I cut and paste each www.website.com/dir/filename.ext from the text editor into Word or something, and then click it so the web browser opens and retrieves the file.
Is there any other way to automate this long process? There are over 1,120 files, so you can imagine why I'm looking very hard for a faster way :) Thanks in advance for any help
-tofumusic
 
I'd do this in a couple of steps - first get a file that contains nothing but urls, maybe one per line. I took the sample you posted, and assuming that the example string is alone on each line, this will do it:

cat filename | cut -f 4 -d \" > otherfilename

I don't know the full format of the file of course, so there might be more to it than that - maybe pass it through grep for the string 'http' before passing it to cut...
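
Something like this, for example (untested, since I'm guessing the lines all look like your sample, and the filenames are just placeholders):

grep http filename | cut -f 4 -d \" > otherfilename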

Then, the downloading part is easy. The csh syntax is:

foreach url ( `cat otherfilename` )
curl -O $url
end

The bash syntax will be different. If you don't want to be bothered to figure it out, just use csh for the downloading loop.
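
For what it's worth, a rough bash equivalent would be something like this (untested):

while read url; do
  curl -O "$url"
done < otherfilename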
 
Use this handy Perl one-liner. If you have a file full of things like "<Ref href = "http://www.websitegoeshere.org/dir/filename.ext" />", this will extract and download them all using curl. Assuming the file is named asxfile:

perl -n -e 'm$<Ref href = "(.*)" />$;`curl -O $1`' < asxfile
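
If you'd like to see what it's going to fetch before kicking off 1,120 downloads, the same match can just print the URLs instead of running curl, something like:

perl -n -e 'print "$1\n" if m$<Ref href = "(.*)" />$' < asxfile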
 
Wow, both of these replies are absolutely genius, but now I'm realizing that I left a couple of important things out, so I'll elaborate now. Sorry for the confusion... on my part, of course.
I've been working at this and have about 800 more to go (I've been doing it manually! lol)
So I have a directory of .wax files, and I'm extracting the URLs by opening about 40 files at a time in a text editor. From there it's cut and paste and then click, or paste the URL (I can't click it straight from the text editor because the HTML code isn't right or something). Is there a way to tailor any of the above commands to fit my situation further? Like:
go to this directory, open all the .wax files, extract the URLs (everything from 'www.' to '.wma'), and put those URLs into a different file (or, even better, open each URL)? Sorry to make this more complicated. Thanks for the solutions; you guys are amazing. I'm inspired to learn a language now... which should I start with?
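In other words, something that would let me do the whole folder in one go, maybe along the lines of this (I'm totally making up the names here, and guessing the fields line up the same way as the sample above):

cat /users/me/desktop/dirwithallthewaxfiles/*.wax | grep http | cut -f 4 -d \" > urls.txt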
 
OK, I'm trying the Perl on the whole directory, but it's not working. The command I'm trying is perl -n -e 'm$<Ref href = "(.*)" />$;`curl -O $1`' < /users/me/desktop/dirwithallthewaxfiles/*
This doesn't work for some (probably simple) reason, but it works flawlessly on individual files, so I'm going to go with that and see how far I get. Thanks again to both of you; you've really saved me a ton of time.
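
For what it's worth, the likely culprit is the "<": the shell won't take input redirection from a whole folder of files at once, but perl will happily read every file listed after the command. Dropping the "<" and pointing it straight at the .wax files should do it (the path is just the example one from above):

perl -n -e '`curl -O $1` if m$<Ref href = "(.*)" />$;' /users/me/desktop/dirwithallthewaxfiles/*.wax

Moving the match in front of the backticks also means curl only runs on lines that actually contain a Ref href.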
 