VicandMeliJvR
Dear All,
Could you please help me find a way or workflow in Automator to cut the first 10 and last 10 lines of a text document?
I would like to use this to remove menu bars, titles, additional links, etc. that I don't need after using "Get Text from Webpage" in Automator.
My situation is that I have a subscription for a year. This gives me 365 pages that I can click on every day to read, but I'd rather extract the text from all the web pages in advance, use "Text to Speech" to create a playlist in iTunes, and add the text of each web page as lyrics for its spoken track.
Ideally I would extract only the main content from the web pages, e.g. what Safari Reader shows. I have searched far and wide, but even curl or the Readability JavaScript still includes the additional links I don't need.
These pages are all based on the same template, so I think deleting the first 10 and last 10 lines from each text file could work.
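To make the idea concrete, here is a minimal Python sketch of that trimming step. It is only an illustration, not an Automator action: the function name `trim_lines` and its `head`/`tail` parameters are made up for this example, and it assumes a Python 3 interpreter is available (for instance, called from Automator's "Run Shell Script" action).

```python
def trim_lines(text, head=10, tail=10):
    """Return `text` without its first `head` and last `tail` lines.

    Hypothetical helper for illustration; the defaults match the
    "first 10 and last 10 lines" described in the post.
    """
    lines = text.splitlines(keepends=True)
    end = len(lines) - tail
    # If the text has no more than head + tail lines, nothing is left over.
    if end <= head:
        return ""
    return "".join(lines[head:end])
```

The text of each saved page would be passed in as a string and the result written back out. Because the counts are fixed, this only works as long as every page really has the same number of unwanted lines at the top and bottom.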
If there is no way to do this, is there possibly a way to automatically delete anything above and below certain keywords, e.g. everything above "Welcome" and everything below "Additional Links"?
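The keyword variant could look roughly like the following Python sketch. The function name `keep_between` is hypothetical, and two choices here are assumptions rather than anything stated in the post: the line containing the start keyword is kept, while the line containing the end keyword is dropped along with everything after it.

```python
def keep_between(text, start_marker="Welcome", end_marker="Additional Links"):
    """Keep the text from the first line containing `start_marker`
    up to (but not including) the next line containing `end_marker`.

    Hypothetical helper for illustration. If a marker is not found,
    that side of the text is left untrimmed.
    """
    lines = text.splitlines(keepends=True)
    # Index of the first line mentioning the start marker, else the top.
    start = next((i for i, line in enumerate(lines) if start_marker in line), 0)
    # Index of the first later line mentioning the end marker, else the bottom.
    end = next(
        (i for i in range(start + 1, len(lines)) if end_marker in lines[i]),
        len(lines),
    )
    return "".join(lines[start:end])
```

Matching on keywords is more forgiving than counting lines if the template occasionally gains or loses a menu entry, but it depends on the two keywords not appearing earlier in the page than expected.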
I would really appreciate any help on this.
Many thanks,
Vic