Discussion:
free download website from Wayback & translate, Ubuntu 18.04
Chuck
2020-11-29 17:14:40 UTC
Not sure if you can help. I'm on Ubuntu 18.04 with Firefox; I do have
Chrome as a secondary browser, but don't often use it. There's a website
I want to download in its entirety, but it now appears only on the
Wayback site. Is there a free method I can use to download it? Also, as
it is in Italian, I tried Google Translate on the Wayback link and it
doesn't seem to translate it at all. Any thoughts on that would be
welcome too. Thanks in advance.
Paul
2020-11-29 19:03:14 UTC
They suggest User-Agent spoofing as part of the recipe here. Something
about Chrome generating a cURL command? Dunno.

https://unix.stackexchange.com/questions/280645/accessing-google-translate-via-wget
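
For what it's worth, the User-Agent part of that recipe can be sketched in
Python as well. This is only a rough illustration, not what the linked answer
does: the BROWSER_UA string and the build_request/fetch helper names are made
up for the example, and nothing here is guaranteed to satisfy any site's bot
checks.

```python
import urllib.request

# Hypothetical browser-like User-Agent string (an assumption for this sketch);
# copy the real value from your own browser if you try this.
BROWSER_UA = (
    "Mozilla/5.0 (X11; Ubuntu; Linux x86_64; rv:83.0) "
    "Gecko/20100101 Firefox/83.0"
)


def build_request(url, user_agent=BROWSER_UA):
    """Return a urllib Request that announces itself with the given User-Agent."""
    return urllib.request.Request(url, headers={"User-Agent": user_agent})


def fetch(url, timeout=30):
    """Fetch url and return the body as text (needs network access)."""
    with urllib.request.urlopen(build_request(url), timeout=timeout) as resp:
        charset = resp.headers.get_content_charset() or "utf-8"
        return resp.read().decode(charset, errors="replace")
```

Calling fetch() on a page URL would return the page's HTML as a string, sent
with the browser-like header instead of Python's default one.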

translate.google.com isn't as friendly as it once was. I don't
expect what you're doing to pass whatever they use for bot checks.
Any automated run is probably considered an "attack" by whoever
codes translate.google.com. Your coding has to be better than
your "adversary" :-) translate.google.com could have a page length
limit as well, so if a page comes back with 200 lines of English
followed by 800 lines of Italian still needing translation,
I would not be surprised.
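
If a length limit is the problem, one workaround is to cut the text into
smaller pieces and translate them one at a time. A minimal sketch follows. The
4000-character default is a guess for illustration, not a documented limit,
and split_into_chunks is a name invented for this example.

```python
def split_into_chunks(text, max_chars=4000):
    """Split text into pieces of at most max_chars, breaking at line ends
    where possible. max_chars is an assumed limit, not a documented one."""
    chunks, current = [], ""
    for line in text.splitlines(keepends=True):
        # A single over-long line is cut hard so no chunk exceeds the limit.
        while len(line) > max_chars:
            if current:
                chunks.append(current)
                current = ""
            chunks.append(line[:max_chars])
            line = line[max_chars:]
        # Start a new chunk when adding this line would overflow the current one.
        if len(current) + len(line) > max_chars:
            chunks.append(current)
            current = ""
        current += line
    if current:
        chunks.append(current)
    return chunks
```

Joining the returned pieces gives back the original text unchanged, so each
piece can be pasted into the translator separately and the results put back
together in order.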

It's better to find a way to get the source site to offer its own
translated pages. I'm not saying archive.org offers that, just that
some sites offer, say, English and German pages, and you can make
a language selection on entry to the site. Wikipedia, for example,
has it.wikipedia.org and en.wikipedia.org, and the articles on
the two sites aren't all the same. Sometimes it.wikipedia.org has
a topic that has never been translated for the other sites.
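
On the download half of the question, which the above doesn't cover: the
Wayback Machine has a CDX search endpoint that lists the captures it holds for
a site, and a capture can be requested without the Wayback toolbar by putting
"id_" after the timestamp. A rough sketch, assuming those two behaviours; the
helper names are invented here and example.it below is a placeholder, not the
site in question.

```python
import json
import urllib.parse
import urllib.request

CDX_ENDPOINT = "https://web.archive.org/cdx/search/cdx"


def cdx_query_url(site):
    """Build a CDX query listing one successful capture per URL under site."""
    params = {
        "url": site,
        "matchType": "prefix",        # everything under this prefix
        "output": "json",
        "fl": "timestamp,original",   # only the fields used below
        "filter": "statuscode:200",   # skip redirects and errors
        "collapse": "urlkey",         # one capture per distinct URL
    }
    return CDX_ENDPOINT + "?" + urllib.parse.urlencode(params)


def parse_cdx_rows(rows):
    """Turn CDX JSON rows (first row is the header) into raw snapshot URLs."""
    if not rows:
        return []
    header, *records = rows
    ts, orig = header.index("timestamp"), header.index("original")
    return [f"https://web.archive.org/web/{r[ts]}id_/{r[orig]}" for r in records]


def list_snapshots(site, timeout=60):
    """Query the CDX endpoint and return snapshot URLs (needs network access)."""
    with urllib.request.urlopen(cdx_query_url(site), timeout=timeout) as resp:
        body = resp.read().decode("utf-8")
    return parse_cdx_rows(json.loads(body) if body.strip() else [])
```

Something like list_snapshots("example.it/") would then give a list of URLs
that can be fetched and saved one by one, ideally with a pause between
requests.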

Paul
Chuck
2020-11-29 19:52:28 UTC
I'm sorry that I didn't get a chance to come in here earlier and say to
disregard this post. The website in question was actually viewable
without going through Wayback, which I hadn't realized at first. I then
used Google Translate, which worked fine, and simply printed to a
PDF. Now the PDF is having problems, for which I created a new post.
