The Wayback Machine - https://web.archive.org/web/20140417082107/https://archive.org/post/350738/updated-wayback-machine-in-beta-testing
Poster: gojomo | Date: Mar 3, 2011 5:28pm | Forum: web | Subject: Re: Updated Wayback Machine in Beta Testing
The classic Wayback Machine will eventually be shut down, some time later this year.
We're running them both in parallel for now so the performance can be compared, and so that one or the other is always available if problems are encountered.
All URLs linking to both the classic and new will continue to work; they'll just eventually all show the latest interface.
If you can express in more detail the ways you preferred the old interface, or suggest changes to the new interface, that will help us improve the new interface.
- Gordon @ IA
Poster: Isashi | Date: Mar 11, 2011 9:21am | Forum: web | Subject: Re: Updated Wayback Machine in Beta Testing
The classic alternative should stay; later everyone will miss it, and all this for what? I say from experience that the classic one always works well, with no mistakes or bugs. If it is removed, everyone will start a petition :|
and... why a new design?
- isashi
PS: if something should be changed, it's the forum; its design is confusing, not the classic Wayback Machine design ; )
This post was modified by Isashi on 2011-03-11 17:21:43
Poster: gojomo | Date: Mar 12, 2011 12:38pm | Forum: web | Subject: Re: Updated Wayback Machine in Beta Testing
I'm glad the classic always met your needs, but it had many problems that affected others. Lots of crawled content did not display properly, and under heavy traffic it would often become slow or intermittently unreliable. Without major changes, its capacity could only be increased by adding a lot of hardware.
Also, it was based on an inherited collection of Perl and C code which we did not have permission to share with the public.
Instead, we created the all open-source 'Wayback' project for improved content playback and more efficient and reliable operation. It's been in use for years on smaller collections, and at other institutions, and should now be better in almost all situations.
There are still limitations, especially with regard to rich media and highly interactive web content, but now all fixes can be devised and shared among a larger community of users.
If you can list any specific areas where the new does not meet your needs as well as the classic, we'll work to close the gap before the classic is completely retired.
- Gordon @ IA
Poster: chris_h | Date: Aug 4, 2011 10:33am | Forum: web | Subject: Re: Updated Wayback Machine in Beta Testing
Greetings,
The perl||C scripts were reliable; your new code is not.
If you had issues with copyright/propriety with the old ones, I will re-write them and license them under a BSD/MIT license so that you can use/develop with them in a more open fashion.
Reading the feedback, and in conversations with others, it has become clear that the new UI IA has chosen is NOT popular. Nor do I (or others) find it reliable. If your hardware tends to become overloaded using the old UI, please consider using a "round-robin" approach. This simply requires creating mirrors that can be utilized on a "most available" basis. We have a great many resources available here, and have been "connected" since 1975. We could most probably provide a mirror gratis.
Please do not forget where the Wayback Machine came from -- it is what made it great. It appears that its current charted course will eliminate most, if not all, of that.
Please note that none of the preceding was intended to be insulting, or defamatory in any way. It was simply an observation, accompanied by suggestion(s).
Thank you for all your time and consideration in this matter.
--Chris
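For readers unfamiliar with the mirror idea suggested above, here is a minimal sketch of "round-robin" selection on a "most available" basis. It is only an illustration under stated assumptions, not anything the Internet Archive is known to run: the `Mirror` and `MirrorPool` names and the mirror hostnames are hypothetical, and "availability" is approximated by the number of requests a mirror is currently serving.

```python
from dataclasses import dataclass


@dataclass
class Mirror:
    name: str        # hypothetical mirror hostname
    active: int = 0  # requests currently being served (assumed load metric)


class MirrorPool:
    """Pick the least-loaded mirror, rotating among equally loaded ones."""

    def __init__(self, names):
        self.mirrors = [Mirror(n) for n in names]
        self._next = 0  # rotating start index, used to break ties

    def acquire(self):
        n = len(self.mirrors)
        # Scan in rotated order so equally loaded mirrors are used in turn.
        order = [self.mirrors[(self._next + i) % n] for i in range(n)]
        chosen = min(order, key=lambda m: m.active)
        chosen.active += 1
        self._next = (self.mirrors.index(chosen) + 1) % n
        return chosen

    def release(self, mirror):
        # Call when the request served by this mirror has finished.
        mirror.active -= 1


pool = MirrorPool(["mirror-a.example", "mirror-b.example", "mirror-c.example"])
first = pool.acquire()
second = pool.acquire()
third = pool.acquire()
pool.release(second)      # mirror-b becomes the least loaded
fourth = pool.acquire()
```

With all mirrors idle, requests rotate through them in order; once one mirror frees up, it is preferred for the next request. A real deployment would also need health checks and a shared view of load, which this sketch leaves out.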
Poster: luke d | Date: Mar 15, 2011 10:25pm | Forum: web | Subject: Re: Updated Wayback Machine in Beta Testing
When I use the classic Wayback Machine it doesn't find my site, but when I use the Beta Wayback it has a crawl from 2009. Why is this and what can I do to get my domain age to appear correctly?
Poster: gojomo | Date: Mar 16, 2011 6:48pm | Forum: web | Subject: Re: Updated Wayback Machine in Beta Testing
The Beta of the new Wayback Machine has a more complete and up-to-date index of all crawled materials into 2010, and will continue to be updated regularly. The index driving the classic Wayback Machine only has a little bit of material past 2008, and no further index updates are planned, as it will be phased out this year.
Please consult the Beta for the latest material. If there are purposes for which the new Wayback Machine is not as useful as the old, please let us know; we'll be trying to eliminate any reasons to prefer the old before phasing it out.
- Gordon @ IA
Poster: luke d | Date: Mar 16, 2011 11:55pm | Forum: web | Subject: Re: Updated Wayback Machine in Beta Testing
I don't know why, but Google and some of the other search engines must not know how to read the new information, because they say that my archives from 2009 don't exist. I just wondered because it affects SEO and web optimization that they can't find my archives.
Poster: gojomo | Date: Mar 28, 2011 2:13pm | Forum: web | Subject: Re: Updated Wayback Machine in Beta Testing
Neither the classic nor the new Wayback Machine is open to search engine crawling, so the presence or completeness of archived pages there shouldn't have any effect on search engine results.
- Gordon @ IA
Poster: luke d | Date: Mar 28, 2011 3:27pm | Forum: web | Subject: Re: Updated Wayback Machine in Beta Testing
I understand. However, Google does count the domain age (per first record in archive.org) in their equation for assigning page rank, and since they can't find any records for my domain in archive.org they give me a lower page rank than my competition. It's not something that I expect you to fix, as Google is the one who hasn't learned how to read the new system, but the new system does hurt my web business.
The rumors are that Google is creating their own archive system now and won't be using archive.org or any other systems soon anyway. It's just difficult learning how to do business in this ever-changing environment. :)
Thanks.
Poster: gojomo | Date: Mar 28, 2011 3:48pm | Forum: web | Subject: Re: Updated Wayback Machine in Beta Testing
I strongly doubt Google is looking at our oldest-page for any age-of-domain decisions, at least not for anything in the last few years.
For the last 10 years or so, Google's own internal records of domain-names (ownership and DNS-resolution) and website-lifetimes are likely to be as good as or better than ours. And, if they were scraping our dates-page in an automated fashion, they'd have to be doing it surreptitiously against our robots.txt -- also unlikely.
- Gordon @ IA
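As a rough illustration of the robots.txt point, the sketch below shows how a well-behaved crawler would decide whether it may fetch an archived page. The rules in `ROBOTS_TXT`, the `example.org` URLs, and the `may_fetch` helper are all assumptions made for the example; they are not the actual contents of archive.org's robots.txt. Only Python's standard `urllib.robotparser` module is used.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical rules: every crawler is barred from the archived-page paths.
ROBOTS_TXT = """\
User-agent: *
Disallow: /web/
"""


def may_fetch(robots_txt, user_agent, url):
    """Return True if robots_txt allows user_agent to fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


archived = "https://example.org/web/20090101000000/page.html"
about = "https://example.org/about/"

blocked = may_fetch(ROBOTS_TXT, "Googlebot", archived)  # disallowed path
allowed = may_fetch(ROBOTS_TXT, "Googlebot", about)     # no rule matches
```

Under these assumed rules, a crawler that honors robots.txt never requests anything under `/web/`, so it could not learn capture dates from those pages without ignoring the file.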