Script Wizard Help Requested
In a previous query, I asked for help on how to get a passel of files from archive.org. I got some excellent help with that, but now I have a bit of a problem.
Each of the files, spread across several layers of directories, contains a <BASE URL= (the page's original, unarchived URL, on a site that no longer exists)> tag. As a result, any link clicked on the page does not go to the file relative to it in the hierarchy, as it should, but instead tries to reach a website that no longer exists.
Each file also contains several <script= > blocks.
What I'd really appreciate is for someone to write me a Perl or shell script that I could run that would...
Remove everything from the start of <BASE to the next > after it
Remove everything from the start of <script to the next </script> after it. (Disregarding case, as there are some scripts and SCRIPTs in there.)
...in all .htm files in the hierarchy, however far down it goes.
Can anyone out there come up with something like that?
Thanks...
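
For anyone reading along, here is a minimal sketch of the two removals described above, written in Python rather than the Perl or shell asked for. The names strip_tags and clean_tree are made up for illustration. It assumes the tags can be matched with simple patterns (no ">" inside a BASE tag's attributes, no "</script>" inside a script's own text), and it works on raw bytes so that no character encoding has to be guessed.

```python
import re
import sys
from pathlib import Path

# "<base" up to the next ">", any letter case. The \b keeps <basefont> intact.
BASE_TAG = re.compile(rb"<base\b[^>]*>", re.IGNORECASE)

# "<script" up to the next "</script>", any letter case, across line breaks.
SCRIPT_BLOCK = re.compile(rb"<script\b.*?</script\s*>", re.IGNORECASE | re.DOTALL)


def strip_tags(html: bytes) -> bytes:
    """Return html with every script block and every BASE tag removed."""
    html = SCRIPT_BLOCK.sub(b"", html)
    return BASE_TAG.sub(b"", html)


def clean_tree(root: str) -> int:
    """Rewrite every .htm file under root in place; return how many changed."""
    changed = 0
    for path in Path(root).rglob("*"):
        # Only regular files whose extension is .htm (any letter case).
        if not (path.is_file() and path.suffix.lower() == ".htm"):
            continue
        original = path.read_bytes()
        cleaned = strip_tags(original)
        if cleaned != original:
            path.write_bytes(cleaned)
            changed += 1
    return changed


# Only runs when a directory is given on the command line.
if __name__ == "__main__" and len(sys.argv) > 1:
    print(f"{clean_tree(sys.argv[1])} file(s) modified")
```

Saved under a name of your choosing and run with the top directory of the download as its only argument, it walks every level below that directory. It overwrites files in place, so it is probably worth trying on a copy of the tree first.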
