2022-04-25 LIVE from NYPL The Internet Archive at 25: Brewster Kahle and Tony Marx
>> Tony Marx: Good evening. Good evening. How's everybody doing? >> Audience: Great. Great. >> Tony Marx: Welcome to the Niarchos Foundation Library. It's good to have you all here and it's especially great to have you here, Brewster. >> Brewster Kahle: Tony, great to be here. Thank you. >> Tony Marx: Always. The -- so it's our pleasure and honor to have Brewster Kahle here to help -- so that we can help to celebrate the Internet Archive's 25th anniversary. Amazing work. [ Applause ] And of course, who better than Brewster to help us think about the Internet Archive as well as all things digital library, the future, as well as the past. You all know that Brewster has been at it for a while. A graduate of MIT in 1982 in artificial intelligence. >> Brewster Kahle: Yes. >> Tony Marx: Worked with Marvin Minsky. Holy moly. For those who -- that's incredible. >> Brewster Kahle: The founder of artificial intelligence. It's turned out that it was a good idea. >> Tony Marx: [Laughter] My favorite line about artificial intelligence is from Stephen Hawking when asked about whether he was worried about the future of artificial intelligence, and his answer effectively was, not as much as I'm worried about the future of human intelligence. Invented a precursor to the World Wide Web, Alexa. You know, just at the formation and for the last 25 years doing fundamental transformative work in ensuring access to information of the most essential kind to the world through the Internet Archive. Look, I'll get out of the way so we can have a conversation, except that I think we're all very familiar with how the internet has gone wrong, at least compared to what we expected, privacy, the effect of social media on the youth, surveillance capitalism, all of it. We're clearly not in the utopia that we imagined or that perhaps we imagined and yet there are still amazing things. 
I mean, we forget how totally transformed our life is, except when we forget our phone somewhere. And lots of amazing, good, and great things. I mean, think about the access to information. Think about Wikipedia. Think about the Internet Archive and everything that it's doing. So, you know, this library, obviously, very much a part of these conversations has, you know, work with everyone that we can to sort of move this forward in the various ways that we can. So we wanted to help celebrate and to hear from Brewster. Brewster, why don't we start -- if you don't mind, let's go back for a minute and just tell me, what did you think the internet was going to do? What did you think the world would look like, you know? And sort of what happened then? [Laughter] >> Brewster Kahle: The internet that got me excited, right? So the vision of the internet is the library of everything, right? It was the -- that was the possibility that there was a transition, it doesn't happen very often, of how people express themselves and pass on knowledge. And it was the print to digital transition. And we knew that this was coming even when we were -- you know, in 1980, we sort of assumed that it already had happened, but it didn't. And so this transition was going to happen from print to digital. And so I looked at what happened when things went from manuscript -- from oral to manuscript and then manuscript to print, and it was kind of chaos. These transitions, whether it's the Protestant Reformation was sort of empowered by this new printing press thing, Socrates was -- >> Tony Marx: Invented to create indulgences, not for Bibles. I love that part. [Laughter] >> Brewster Kahle: Yes, you've written the book on the Reformation, he has. And then Socrates was against writing things down, because we would think differently. So these transitions were extremely important. So I thought, okay, why don't we go and try to do something right with this digital transition? 
And so how do you go and you got to anchor it right in the open. So help work on trying to get the openness to work, the ways, and it became the World Wide Web, which was good. I helped get the New York Times online, the Washington Post -- Wall Street Journal, New York Times, Encyclopedia Britannica. These published -- okay, let's go to this World Wide Web and make that work. And by 1996, we could build the library. We thought, okay, you know, it's going well enough, but this is -- you know, remember, it's, you know, things go back to a wavy screen here. It was 1996 when there's -- you had still lists of what's new on the net, right? That was how you navigated and found things. There's the What's New page in Mosaic. I don't know that there was Netscape yet. There wasn't Google yet. There just was AltaVista. But okay. But it was going this way and it was open. It was good. Let's build the library. Let's make it so that the expression of people that are going and putting these things in this open world could be remembered, that it would basically be a foundation. These are good times. They weren't without their fights and those -- the status quo people that were trying to keep the things like the World Wide Web from happening. But then we started going and archiving the World Wide Web and we were called crazy. You know, it was like either you couldn't do it or why bother? We went and said, no, no, no, it's going to be really important. Right? People are expressing themselves out there. Their words deserve to be in the library. >> Tony Marx: This is a historical record of all websites. >> Brewster Kahle: Yes, the idea is we started crawling, you know, by making a computer robot that basically would download a page and then record it and then look for all of the links on it and add that to the list to crawl. And they would just do another one and capture it and they would just keep going and going and going. 
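The crawl loop Kahle describes here (download a page, record it, look for its links, add them to the list to crawl) can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the Internet Archive's actual crawler: the names `crawl`, `LinkExtractor`, and `TOY_WEB` are hypothetical, and a small in-memory dictionary stands in for real HTTP fetches so the example runs without a network.

```python
from collections import deque
from html.parser import HTMLParser
from urllib.parse import urljoin


class LinkExtractor(HTMLParser):
    """Collects the href value of every <a> tag seen in a page."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)


def crawl(seed_url, fetch, max_pages=100):
    """Breadth-first crawl starting at seed_url.

    `fetch(url)` is an assumed callable that returns the page's HTML,
    or None when the page cannot be retrieved.
    """
    archive = {}                # url -> captured HTML (the "record" step)
    queue = deque([seed_url])   # the list of pages still to crawl
    seen = {seed_url}           # avoids queueing the same URL twice
    while queue and len(archive) < max_pages:
        url = queue.popleft()
        html = fetch(url)
        if html is None:
            continue            # dead link: nothing to record
        archive[url] = html
        parser = LinkExtractor()
        parser.feed(html)
        for href in parser.links:
            link = urljoin(url, href)   # resolve relative links
            if link not in seen:
                seen.add(link)
                queue.append(link)
    return archive


# Hypothetical three-page "web" used instead of real network requests.
TOY_WEB = {
    "http://example.org/": '<a href="/a">A</a> <a href="/b">B</a>',
    "http://example.org/a": '<a href="/">home</a>',
    "http://example.org/b": '<a href="/a">A</a> <a href="/missing">gone</a>',
}

archive = crawl("http://example.org/", TOY_WEB.get)
```

Passing `TOY_WEB.get` as the fetcher keeps the sketch self-contained; a real crawler would also need politeness delays, robots.txt handling, and durable storage, none of which are shown here.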
And I was looking back and the first complete collection of the World Wide Web was 2 terabytes. That was it. It was 2 terabytes in size. And I wanted to give it to the Library of Congress and they -- it took them 18 months to go and say yes. Now you say, why does it take -- you know, why not? And it's like, what, does it deserve to be in the library? I mean, this is the time -- you know, when I look back at the Reformation, libraries didn't collect print, because, you know, all the real stuff was manuscript. >> Tony Marx: Ephemeral. It was too ephemeral. >> Brewster Kahle: It was ephemeral. It's like pamphlets. Those were the blog posts of the time. Okay, they were important ones when they were put out by Luther and Erasmus, but they were just not thought of as library grade. So we wanted to get this copy into the Library of Congress. They said yes and so I commissioned an artist named Alan Rath to make what does it look like? You have this opportunity, right? To, you know, like hand something, a bunch of bits. And so he came up with a sculpture that had four blinking screens that showed web pages from it and he called it 2 terabytes in 63 inches, the World Wide Web, 1997. And because he knew, this artist, Alan Rath, knew it was going to go up and up and up. And so it was in the front of the Madison building for 15 years. When we walked in, there was this beautiful sculpture showing that, you know, Library of Congress had a copy. And then they said, but we have the bits, so do you want it back, Brewster? And so a couple years ago, I guess 10 years ago now, we got it back. So we now have this beautiful sculpture. So that was, you know, to imagine sort of those days. >> Tony Marx: Actually, help us understand. I mean, I remember when I first heard that the Library of Congress was going to be collecting an entire -- the entire corpus of that -- everything on was it Twitter, or Facebook, or one of those? 
And, you know, old folks like me sort of thought, oh my God, that's like too much and not worth it. Right? >> Brewster Kahle: Right. >> Tony Marx: Why is that wrong? >> Brewster Kahle: You don't know what's going to be really important. And libraries in this digital age have got to be much more proactive than they used to be. We used to just kind of buy the new books and then we wait for people to die and then we get their stuff. But in this digital world, the average life of a webpage is only 100 days before it's either changed or deleted, 100 days. So you've got to be proactive to go and get it. And you've got to then -- and you don't know what's actually going to be the valuable things out of it. And one of the real wins for me was the Internet Archive started collecting the World Wide Web, these robots. But we found that actually it was a -- we needed people's help. We needed librarians' help. We needed librarians' help from all over to go and build a great collection. And we -- they needed our tools to help build their collection. And it's really important that the collections interlink, because it's the web, right? And so what was, I thought, kind of a surprising thing in the early days of the Internet Archive was how apt the libraries were for going and collaborating and working together to go and have their librarians go and say, okay, there's an election in Peru and we need to go and collect these websites every two days, because. And on this day, we need to do it every three hours. And so there was just spreading out of just like, okay, let's make sure that we're building these collections of these digital materials. And there are now over 900 different libraries, museums, national libraries. We collect the web for the Library of Congress. Their librarians go and say, these are the important things. It's been this amazing collaborative project of the libraries to go and respond to the challenge and the difficulty of the digital era. 
And you bring up Twitter. So Twitter, actually to their credit, they said, we'll donate a copy to the Library of Congress. That's terrific. They said, yes. And it just started coming in and it was a lot and it was more than they could really handle. And they then went selective, which I don't think was the right idea. So we've worked hard to go and do this, because it's a little hard to kind of imagine the scale of the project. So the Library of Congress, if you just had to take the books in the Library of Congress, a book is about a megabyte. So, you know, this book -- >> Tony Marx: That might be 2 megabytes, that one. >> Brewster Kahle: Yeah, it might be 2 megabytes. It weighs a lot. It's heavy. It's about a megabyte. And Library of Congress says there are about 28 million books in the Library of Congress, 28 million megabytes. Twenty-eight million megabytes is 28 terabytes. So that used to be a lot, but we actually collect that from the World Wide Web before lunch every day. So it's that kind of collection of -- on the amount of information that's now coming out that sets some challenges and actually really encourages people to work together. And we've been working together with the New York Public Library on web collecting, also in digitizing books for years and years and years. New York Public Library, as you may know, has done SimplyE and really worked to go and make it so that this digital revolution is put in place, in large part by the libraries themselves. Not always just playing a pick up the pieces role later. So New York Public Library has played a very supportive collaborative role in the building of our digital ecosystem that I'm not sure is always acknowledged. So I just wanted to make sure that -- yeah [applause] it's happening and it's tremendous. But yes, we've got to be proactive rather than reactive in the digital sphere. So we have all of Donald Trump's tweets, all of them. >> Tony Marx: It is the historical record. 
>> Brewster Kahle: It's the historical record, but it's not on Twitter anymore. So it's not there. Right? It was so important, right? When you're going through all of that, but it's gone. So and Twitter is, you know, not going to, you know -- anyway, so it's -- [laughter] >> Tony Marx: It's in the news today. I have heard that, yeah. >> Brewster Kahle: Yeah. >> Tony Marx: There's something happening, right? >> Brewster Kahle: There's issues with these companies, right? And so libraries have always worked in parallel with the publishers and made it go and sort of stay out of each other's ways and make the whole thing work. And so did you know that there used to be Yahoo Videos? Not anymore. There was Google Videos before they bought YouTube. There were 6 million of them. There is all these expressions of people's creativity that are in these large scale platforms and companies. And then what happens to them? And that's what libraries are for. >> Tony Marx: So a very quick story. When I got the job at the library, the way I remember this, they showed me around and I came upon a room across the street that, as far as I remember, I'm sure this isn't true, looked like a pile of phonebooks, like the biggest pile of phonebooks you've ever seen. And I was like, why the hell we have that? Right? And they said, let us tell you a story. And the story was that after the Second World War, the Polish government sued for a restitution by the Jews, said there were no Jews living here permanently. Prove it, that there were. No one could prove it, until they found one copy of the Warsaw directory across the street with the names and addresses. So the historical record, obviously crucial. Let's switch. And I'll just say the ecosystem of the production of books and of information was built over a long period of time. 
We're all reading and learning with -- you know, and it's taken lots of different actors doing lots of different pieces, sometimes cooperatively, sometimes competitively, but it's worked. Now we're at, you know, a whole different thing. And I think it's fair to say that we are both motivated by a simple aspiration, at least in the world of books, which I'm now turning to, which is everyone on the planet should have access to everything that's been written through their library, for free. And that is, to us in library world, that's a basic human right. It's the foundation of democracy. It's the foundation of an economy that works, of a civilization that works. Tell me about books. Tell me about the Internet Archive, where you are -- where you started and where you are. >> Brewster Kahle: Books. >> Tony Marx: Books. >> Brewster Kahle: For the love of books. [Laughter] So, okay. So we started with the World Wide Web, the most fragile of media. By 2000, we got going at television. Turns out that nobody really had built a large scale collection of television. The Library of Congress was going to and it was going to take them a while. So we hit the record button. Russian, Chinese, Japanese, Iraq, Al Jazeera, BBC, CNN, ABC, Fox, 24 hours a day DVD quality. And that's been going on since the year 2000. So 2001, there's an event you may -- and so we tried to figure out after that event, what do we do? How do we help? And people in the United States were trying to figure out, you know, what's the reactions of others and people in the rest of the world were trying to figure out American reactions. So we put up one week of all of the television news from around the world about September 11th, from just like the hour before the first plane went in, for a week. And what was, I think astonishing, even if it wasn't -- and we put it up one month on October 11th, 2001. And so this is four years before YouTube, right? This was when this was really, really hard to do. 
And -- but it really, I think, helped people kind of be able to see the Russian reaction and the Russian announcers were just breathless. It was like, holy grail, this is going on. And the sympathy and the -- but you could also see during that of, when did it turn into a war on terror? And who did that? Right? It could have been framed in many different ways and it didn't. So anyway, so television and get going really at that. Then in 2001, 2002, we started really getting going on books digitization, the Million Books Project, which was a joint project between the United States government, the Chinese government, and the Indian government to go and do mass digitization of books. And that's now continued to go on. About -- the Internet Archive digitizes books. The Google Books Project was certainly the big thing that everybody was trying to figure out, what does it mean to have this tech company go and do this book collection and how should that all work? And there are just lawsuits that went on and on. And then the judge basically said, yeah, what you're doing is legal, which was the snippet view and let the machines read. But there were a lot of us that wanted to do something more open than that. And New York Public -- >> Tony Marx: Can I just be clear? I mean, so snippets the court said, you can read a little piece. Right? That's not going to sort of get in the way of commerce of somebody buying -- wanting and not buying the whole thing. But of course, it's sort of -- it's a cruel joke in the sense of, you know, if you think about access to information, snippets is sort of the opposite of what we want for quality information and going through it. >> Brewster Kahle: Great. And to educate a population that's more and more online, you know, as it -- >> Tony Marx: At least in snippets. >> Brewster Kahle: And yeah, you know, that the line that if it's not online, it's as if it doesn't exist. I think is true. 
And for those of us that know anything deeply enough, a lot of what we know of -- a lot is not online yet. Or if it is online, it's available only in a very, you know, rarefied way. So how do we -- >> Tony Marx: Explain that. People think that it's all there? >> Brewster Kahle: Oh, no. The 20th century is just not online. So we get going at digitizing books. There are about 500 libraries that have all used the Internet Archive's resources and the scanning to go and make their mostly public domain books available to the world. So 136,000 books that are in the collection of the New York Public Library have been digitized and made freely available to anybody to download, to reuse, to go and think about any which way they want to. Hurray for the New York Public Library, right? That's just -- I think it's important to celebrate the participation that these libraries have done. But that's not all books. Really, the big missing piece is the 20th century. So the 20th century is caught in this copyright zone of where it's still under copyright or maybe, at least enough to be sort of confusing, and -- but it's -- so it's not as available as it can be. So foreign. And so when five years ago we were trying to figure out how to reinforce the information ecosystem against disinformation, because we went through the 2016 election and no matter who you, you know, wanted to be pre -- become president, the whole information ecosystem of the internet suffered. It was -- just didn't work very well. There was just -- we needed to go and do something a lot better. We need a better internet than that to be able to survive as a democracy. So we talked with Wikipedia and Katherine Maher, who's the head of Wikipedia and said, you know, what do we do? How do we make this better? And she said, she was worried that truth might fracture. And I said, guys that sounds horrible. What does that mean? 
She said that Wikipedia is built on this idea that on any subject though, a consensus will arise, that if you -- and if not, there's sort of this push and pull, but it's not two different truths. She was worrying that two different truths would happen about things. I was like, that sounds bad. We can't lose Wikipedia. And so I said, how would this happen? She said something I'd never heard before, citation wars. Behind every Wikipedia page, there's a pushing and shoving of what gets put in the encyclopedia of the internet, Wikipedia. And she said it was based on people would win arguments based on the heft of the source and whether you could click on it. If you couldn't click on it and check the reference, it got less weight. >> Tony Marx: It doesn't sound unreasonable. >> Brewster Kahle: Doesn't sound unreasonable. I said at that time, this was right after the 2016 election, I said, okay, we will do what we can to be the library of the internet. We'll go and reinforce Wikipedia. So we wanted to turn all of the references blue, every -- the references at the bottom, so you can go deeper. And it turned out there were a lot of broken links there. There were a lot of links to books that you couldn't click on and there -- and you -- >> Tony Marx: Couldn't click on because they -- the book is not available for you to click on it. >> Brewster Kahle: Just wasn't available to click on. And so we prioritized fixing the broken links. We fixed 12 million broken links by pointing them into the Wayback Machine. We then went through and started changing the book links so that they -- you could click on them. So if it's got a page number, it goes -- opens right to the right page. And you could see one or two pages and then beyond that, you have to borrow the book or go and buy the book. Okay. But you can get basically -- you can go and find out whether it's right or not. And it turns out -- so we've now fixed 500,000 links to 100,000 different books in Wikipedia. 
We prioritize digitizing those books and we did -- we lent those books using a system that was pioneered by Michelle Wu in the Boston Public Library 10 years before, towards going and having it so that you have a little bit that you can see. But if you want more of it, then you borrow the book and one reader at a time. And it's worked great. So that went -- that started in 2011 with -- and worked along fine. And so we basically did this. So an example, there's a page on Wikipedia for the Holocaust. It's a bibliography of the Holocaust. It lists 140 books. Before we kind of concentrated on it, there weren't any links. Right? There were just dead text. So you could, you know, try to find it, but most of these things were way out of print, really expensive, whatever. So it turned out we had 82 of them. So we wove those into the pages. And also we went out to go and prioritize getting the others. So we've prioritized all the books that are in Wikipedia. We're prioritizing a book drive of the Ukrainian Wikipedia. So there's the Ukrainian Wikipedia. We've taken all of the books, lists of that, and we're prioritizing getting a book drive to go and get those books digitized so that people that are now in diaspora can have access to their own literature, at least prioritized in that way. >> Tony Marx: So this is part of, I assume, a global effort to preserve archives, et cetera, in the Ukraine and -- >> Brewster Kahle: Absolutely. There's over 1,000 volunteers that are doing web archiving and that all of materials go into the Internet Archive, because we've given away free accounts on these things and there's also a Save Page Now. So there's 1,000 people that are working on this and there were hundreds of people or maybe 1,000 people, working during the tsunami in Japan. So every time there are these events, there are people that sort of say, I'm going to help in this. 
Or when Trump was elected, people were worried about the websites, the government websites, having to do with climate and ecology. And so there was enormous groups of people that came together to go and try to archive this. They're all in the Wayback Machine now. >> Tony Marx: Tell me about how you think about the role of publishers in this ecosystem at this point and going forward. >> Brewster Kahle: Oh, publishers, you know, are really there to go and get new stuff out, cultivate authors. Let's have them give some better deals to authors, I would say, to go and keep that whole structure going. And libraries in the United States is about a $12 billion a year industry, 12 billion. And about 3 to $4 billion of that goes to buy publishers' products. So it is a very decentralized support structure for publishers and I would like to see this continue. So what libraries do is they buy books, they preserve them and then they lend them out. Let's have that structure continue even in the e-book world, which right now actually is kind of a problem in that area. >> Tony Marx: I agree. I mean, I think we agree that we need the publishers as part of the ecosystem to -- and we need authors to be paid as part of the ecosystem so we'll have creativity. I think we also agree that, I mean, those aspects of the capitalist system of production of, you know, books is working. The surveillance capitalism side, the sort of the side that's come with digital doesn't feel so good. And as you said, it's one of the reasons why libraries are asserting themselves and creating our own apps and protecting privacy and not letting commercial interests drive, you know, the choices. You've been leading that from the word go. >> Brewster Kahle: I think we're part of a library system that's trying to figure this out. And the Internet Archive plays a role, but it's a cooperative role towards trying to have the system work with many winners. 
How do we have a system where we have many winners so you don't end up with just a few publishers, or you don't just end up with everybody just reading a few authors, or just having a few libraries? How do you have a game with many winners, where you actually have lots of participants that people are compensated and they don't sort of strangle each other or just collapse to -- into too few? The publishing industry, the book publishing industry, is doing better than it ever has before. I mean, it is just going up, up, up. But there are very few gigantic publishers in the trade publishing globally. That's too few voices. So in the academic domain as well, there's just too few. We want lots of people being able to participate in new and different ways and come up with some creative, new approaches. You know, Brick House in Brooklyn is doing some fun new style of publishing and supporting their authors' approach as a cooperative. It's like, great, let's go. And we -- as we libraries, let's go and buy their books, their e-books. And they're actually selling them like really selling them in this sort of traditional way that you can buy a book. But how do you get an e-book without sort of these license agreements that are really pretty strange and they're very -- they don't allow some of the traditional things you would imagine? So how do we go and make a system with many winners? I think that's the internet that we all need. Otherwise, I'm not sure how we as libraries are going to help participate in fighting disinformation, doing our role of long-term preservation. How do we go and do this? The opportunity of the internet was to have lots of participants and we are ending up with, you know, just a Facebook, a Twitter, a Google, an Apple, and an Amazon that are very, very successful, but it's too few players. >> Tony Marx: Right. First, I should say there -- my colleagues have some cards. There are golf pencils. 
If you have some questions, we're going to break for that in about 10 minutes. So please write them down and we'll -- Brewster will handle them as best we can. The -- let's see. I think, Brewster, we fundamentally agree on what is so important, which is we need the diversity of views for this society, this democracy, this economy to work. And, you know, not everything is leading in that way. And the concentration of ownership can get in the way on that, as well as commercial constraints on the sharing of information. The -- instead, what do you think of as the big structural impediments at this point? Thinking about now going forward, to get to where we want to get, help us think about the map of traps -- of sand traps, if you will, that we need to be thinking about. I ask this selfishly. >> Brewster Kahle: Yeah. In the short term, some things just seem like kind of too good to be true. Like, remember when you first got Netflix and it was like, wow, I could get any DVD. You know, it was like -- >> Tony Marx: In the mail. >> Brewster Kahle: In the mail. It would come and it would be this sort of feed of these things. I didn't have to go down to Blockbusters. It was kind of awesome. And then it kind of evolved into this streaming thing and then things started disappearing and it was like, well, you better see this by the end of the month or you can't have it. We kind of gave up our DVD collection and our VHS collection. And it sort of became this sort of shifting sands. And I think they were -- then there's Spotify. Right? You know, the number of people that actually have a CD player, other than in their car, is starting to become small. So it's whatever you kind of can get by asking for it. You can ask your, you know, Google device or Alexa device and it starts playing stuff for you. But is it all of it? Is it kind of the interesting stuff? Is it at the end of the day, are you going to have a record collection to give to your kids? No, you're not. 
So we kind of got this streaming world and there's now this sort of idea of having a Netflix of books, and that's what a library is. We've sort of outsourced our collection criteria to these sort of corporate subscriptions. And these books kind of come and go from -- and they can change what's in that book at any time and they can take it away from your reader at any time. And it's just like, I kind of, you know, okay, Netflix, all right, music, yeah. Books. It's books. It's how we as a society think things through, is books. So I think that we've gotten sort of addicted to these streaming services, where there's very few players. And the idea of having okay, now you've got HBO Max and, you know, a couple others to choose from. That's not the right number of ways to be informed as a citizen, especially in the realm of books. >> Tony Marx: Well, the beauty of the internet, amongst other things, is it allows you to get to the long tail, unless it's not commercially viable and that decides whether you can go onto the long tail. The -- I mean, look, I think we spend a lot of time these days, as we should, talking about book banning. This institution has just unbanned some books. Everybody's trying to figure out what to do, right? But it's interesting, yes, that is a serious threat to access to information, to democracy. But so is the less obvious, less loud day-to-day commercial decisions that take things away or prevent you from having access to them. I'm not saying out of evil necessarily, but just that's how the system works. And that's a problem. I mean, that's, as you said, that's why we decided we needed to create our own app to offer this platform. And simply instead of OverDrive, with all due respect, it's very simple, we want to protect people's privacy. We don't want you surveilled while you're watching it. And as importantly, we want everything there, even if it's not commercially, you know, sort of viable. 
A commercial entity will put the bestsellers forward so that you end up buying more, you know, licenses to more bestsellers. That's great. But that's not the only business we're in, right? We're not, you know, just in the latest bestseller. We're in the corpus of human information. Right? >> Brewster Kahle: And thank you for doing that. I mean, I got to visit someplace completely great this morning. It's called ReCAP. It is down in Princeton, New Jersey, and it is the offsite repository for Harvard, Columbia, and New York Public Library, and -- >> Tony Marx: Princeton. >> Brewster Kahle: In Princeton. And it is huge and it's got 17 million books. It is gigantic. And they basically will turn around a book in 24 hours out of this huge collection and they'll basically go and bring it back in. You know, I look at this and so it's like, thank you, right? For going and doing this. And they're actually about to go and commission the next bunch of it that's got another 4 million volumes that they can take in. It's great that you can get access to it. How many of those are online? And we've only done 6 million books total, right? And there's 17.2 just in that one building. So we have so much more to go. >> Tony Marx: That building, that is amazing. It is the long tail and we do scan on demand and we keep that, obviously. The -- you know, I think -- and I should just say the ReCAP is interesting, because it says even in the analog world, in the, you know, the old world of, you know, storage of massive, millions of books, there's still interesting efficiencies you can find. I mean, we were collecting side-by-side with Princeton and Columbia and us in Princeton, meaning we would buy three copies of a book that probably no one was going to look at for a very long time, sitting there next to each other so we could say we have our own copy. All right? Two things happen, once -- one, Harvard says we want in. This is the richest university in the world. 
And the reason is instead of buying now four copies, we'll buy one copy and we'll share it. And that meant -- and those private collections of the richest universities in the world, which had been off limits to the public are now available to the public. The reverse of privatization. I say these stories just as -- and we doubled the size of our collection overnight. The -- it's -- no, it's -- Brewster, the -- thank you. >> Brewster Kahle: It's important. >> Tony Marx: It wasn't me, but it's -- yes, but it's -- it says you can like take -- I mean, it's sort of obvious things that just say we need to think about this differently. >> Brewster Kahle: Let's do and make sure those 17.2 million are digitized and made available to the Wikipedia generation. >> Tony Marx: Absolutely. >> Brewster Kahle: Let's just go and do that. >> Tony Marx: I want to do that. And if Google scanned 30, 40 million books, I'll just say this clearly, it makes no sense for society to rescan those books. >> Brewster Kahle: Unless they don't make it terribly available. >> Tony Marx: Well, they best, because that's what society needs them to do. Right? And if we need to give them assurances, work with them, we will. But again, we have to find ways through this logjam. You've been the leader in that. >> Brewster Kahle: Well, we're another -- trying to help by working with now 500 libraries to go and try to get the books collections. The number of -- that we've worked with in terms of getting the 78s, which is before my time, you know, the crank, the horn, the dog, those, right? The 78s, you know, [multiple speakers] >> Tony Marx: And we go further back than that. We got cylinders. We got the wax, right? >> Brewster Kahle: We're on. Trying to get all of that available, it's been fantastic. We've worked now with over 100 different collectors and libraries to go and make it so that that treasure trove is available. And it's great. 
I mean, so the real goal, you know, since the Library of Alexandria, the myth of the Library of Alexandria is universal access to all knowledge. Can we make everything ever written, expressed by people ever, available to anybody that could learn from it? Can we do that? Economically, technologically, all which absolutely we can. The internet is a fantastic resource to be able to make that dream happen. And I would go further that if we don't do it, we're going to shortchange the next generation, because they'll learn from whatever they can get ahold of. And if they're not getting ahold of the best, they're going to get brought up on whatever drivel is out there. And right now, we actually have a problem and I think we're seeing some of that problem. Because when you say internet to people now, they kind of reel back as if it's not a good thing and it's because it's kind of become a Twitter, a Facebook, or a really superficial type packaged propaganda machine of either celebrities, or politicians, or state actors that are really pushing certain things to make those freely available. And a lot of the good stuff isn't either available at all, or it's behind some sort of paywall, or you have to be on campus in some expensive university you're not at. This doesn't make any sense. And I think we're seeing a suffering by not putting a layer of good information out there that's referenceable, usable. And that's what we are for. That's why we're paid the big bucks. >> Tony Marx: The -- I totally agree. I will say, because we are the oldest, biggest library in the country or the biggest, anyway, of the public libraries, I mean, we have to do that. We also have to maintain close to 100 facilities, like this amazing new one, because people are human beings and they want to actually be together, the way we all wanted to be back together. We need to keep physical collections, because that's also, you know, essential. And education programs. 
Oh, and we haven't even touched the fact that here in New York, the capital of the information age, we are talking about how to make sure everything's here, but there are 2 million New Yorkers who don't have access to broadband at home. I mean, that's just in New York. I mean, so, you know, we've got some work to do and yet everyone thinks, either it's all already there. Or they think it's yucky, though everyone's addicted to it for buying everything, getting everything, finding friends, finding whatever, you know. I mean, it's incredible. >> Brewster Kahle: But people are using these things in new and different ways. So we digitize all of these books, right? And people go in and out of the books really quickly. So it's often they're only on there for maybe 30 seconds, a minute, a minute and a half, two minutes, in general. It's interesting. So it's being used in a very different way. >> Tony Marx: They're looking at a book for 30 seconds? >> Brewster Kahle: Thirty seconds. They're reading it a lot like webpages, just really good ones. And I think that's actually -- >> Tony Marx: Is that because the way people have learned to think and digest information has changed by the speed and readiness of this? >> Brewster Kahle: I think just by having these screens around, people don't necessarily read in the same way. Or they're using our library, which is -- remember it's old stuff, right? We have the old stuff. So it's not the newest, you know, beach bestseller. We're not -- we don't do those. >> Tony Marx: We have the beach bestsellers. >> Brewster Kahle: So but the way that people go from a Wikipedia article into a book, they don't necessarily, you know, go to the beginning and just start reading it online all the way through. They're using it and putting together their world out of lots of pieces of information. And actually, a lot of libraries are that way. The use in libraries is often different. 
There are public libraries that will lend out a book that they'll -- people will read from beginning to end, but that's not what we see in general in the Internet Archive's book use. >> Tony Marx: Have you seen a -- and sorry, quite -- [inaudible]. The -- have you seen in the 25 years of tracking the changes in behavior of reading, since you can watch that online? >> Brewster Kahle: We see people going through web pages very, very quickly. People -- and what -- and an encouraging thing is people don't suffer fools lightly. They bounce off of crap websites immediately, in our experience. I mean, they get it and they're gone. So actually, I'm really quite encouraged by the sort of summary views that you get of web usage. So people are discerning, people are interesting, people are peculiar. They're interested in what they're interested in. And so I found, in general, the user populations that we serve to be the sort of upstanding people that you want to be serving. >> Tony Marx: But are they looking for less intensely? Are they reading for less time? >> Brewster Kahle: I don't have it over time. >> Tony Marx: Got it. Okay. >> Brewster Kahle: I don't. >> Tony Marx: So, all right. Some questions. So curation. How do you make sure that -- sorry, I'm going to paraphrase. How do you make sure that you're not just storing misinformation and bad ideas? Where's the curation, the validation? >> Brewster Kahle: Good question. Like in the web, what we try to do is go and collect, well, it all, and then try to provide some level of context for it. Often the context comes from who collected it and why. So we will link back to the provenance of the webpages in the Wayback Machine to the organization that requested that it be collected. And that will often be in a collection that has titles on it that are annotated by the librarians. So then there's real sort of, you know, bad stuff that's sometimes uploaded to the archive. I don't know. 
You know, bad stuff and we'll just take it off the -- out of public view, at least and so try to -- or put some sort of level of speed bumps or try to provide sort of context when things have been debunked or the like, to go and help people understand what other people have said about it. So one of the great things about libraries and important things is context. And, you know, it's the card catalogs, it's the -- >> Tony Marx: It's the librarians you trust. >> Brewster Kahle: It's the librarians you trust. It's like, what is this for? And so we try to have a very broad range of materials in the Internet Archive's collections, the web collections, book collections, music collections. And then we try to provide context for people to know what it is that they're looking at. We try. >> Tony Marx: I did notice, I was reading President Obama's remarks at Stanford that he made last week, I guess, on technology, and one of the things he said was that actually, we can solve echo chamber problems if you put other information, if you curate, if you put other information in front of people, you can have adjustment. I thought that was interesting outside, just because a lot of the social science previously, that I'd been reading, had said, if you try to break people's echo chambers, they react negatively in extreme further. And yet Barack Obama was saying, no, actually we can challenge that. Right? With better, more, varied information. >> Brewster Kahle: I like the line, the answer to bad speech is more context. Right? So, you know, basically -- but the web has sort of evolved into this thing that often we take on our phones and we just get to see this one snippet, this one point of view. We don't get all the information around it. >> Tony Marx: Or who produced it or -- right. >> Brewster Kahle: And right. Just like, where's this from? Is this been rattling around the internet and it's, you know -- or is -- am I the first person? Is this -- so what am I looking at? 
And so the Internet Archive's tried to do some of that. But I think we have so much further we can go to try to help people understand. Most people that we see actually want to know what's going on. There's some people that just kind of want to be entertained, just, you know, bring me along. My bias bubble's fine. But most people don't want to feel like they're being lied to or taken. And so why don't we help that? As libraries, as also technology companies, we can go and bake better products that serve more than just advertisers. >> Tony Marx: So this is a flip, a slightly flip of the usual argument, how do you -- and again, paraphrasing, is there a Western bias in your collecting, our collecting, in the sense of, you know, that that's just sort of what we go to and that therefore, ideas from outside of those traditions are sort of being left aside when we may need more varied ideas? >> Brewster Kahle: So the Internet Archive's web collection is biased by the people that are participating in building it and in the sort of the commands the robots. But we're collecting 360 million pages every day, so it's a lot. And we try to follow and do as broad as we can and engage people from around the world, mostly in libraries and museums but also in governments, to go and help figure out what should be in it. Should we do more? Absolutely. Interested in the breadth. The book collection is really quite narrow. We've got a really great collection of Indian materials, because the Indian government invested so much in going and doing digitization and making public their materials. United States. Canada has been doing a lot. Europe. But, you know, but there's sections of the world that are really underrepresented and we're interested in finding more of those libraries, museums, archives to participate and work together with. >> Tony Marx: So here's one. How do you balance historical preservation against people's right to be forgotten, as part of the right to privacy? 
>> Brewster Kahle: Right. Right, right, right. Universal access and privacy are sort of flip sides of the same coin. So how do you deal with this? The Internet Archive with the -- let's just take the web collection. If people don't want to be in the web collection, right? Their blogs or whatever, they write to info@archive.org, and they do all the time. And then we try to make sure that they are the person that they say they, you know, are saying they are on the internet on the site of the website. >> Tony Marx: As versus a competitor saying, take down my competitor. >> Brewster Kahle: Yeah, that kind of thing. And in general, we give quite a latitude towards people to take things down, because a lot of the web was not meant forever and you might want to sort of get over, you know, that part of your life or whatever. And so it's -- so we have sort of a bend-not-break kind of -- it's not absolutist. And this has worked for 25 years now, in terms of sort of working that through. So there's sort of an escape valve, if you will. >> Tony Marx: Here's a classic question of the era, who funds this? And who decides who has access to it? >> Brewster Kahle: Oh, good, good, good, good, good. Yes. So we're funded a lot, like, you know, other libraries. So we get -- well, actually, libraries fund a lot of what it is we do by collecting webpages, including the New York Public Library. Thank you. To go and collect webpages so that the Library of Congress, National Archives, national libraries around the world to go and build these web collections that they get to have and that are also part of the Wayback Machine. Then another large section is people paying to digitize books. Then grants from large philanthropies. And then at -- we have the end-of-the-year donation drive sort of the Wikipedia or the NPR, please. [Laughter] And I'm really quite proud that 100,000 people last year decided to donate to the Internet Archive and it's been going up. 
And it's based on that broad public support that allows us to do -- not just spend the time working on projects that are funded through the grants, but for the broad public. And so I think there's -- things changed, I think, a lot with -- in the United States with the last election and also the 2016 election and also COVID and we're all homeschoolers now. We all have to basically work with the materials that we have available to us online. So there's a real increase in the amount of outpouring towards Wikipedia, the Internet Archive, other Electronic Frontier Foundation, the Public Knowledge, these public-spirited internet era organizations that are trying to build the tools for being an educated digital citizen. Oh, and yeah, we're 501(c)(3) nonprofit. So if anybody [multiple speakers] >> Tony Marx: Anybody has access. >> Brewster Kahle: It's tax deductible, please. Also, oh, people are donating physical materials to the Internet Archive. >> Tony Marx: So then scan. >> Brewster Kahle: We -- well, we preserve it. First of all, we start by preserving these materials and then we digitize them once funding allows. And right now, actually, funding's going very well in that area. So we now physically own 3 or 4 million books, lots of 78 RPM records. We're just in the process of getting 600,000 CDs, audio CDs. So we can't make those available online, except maybe just -- >> Tony Marx: That's one person's collection? [Laughter] >> Brewster Kahle: No, these are institutional collections. The Boston Public Library gave us their sound archives. So that -- and we just finished digitizing all those long playing records. And so you say why? Because you can only make 30 seconds available with the artwork. So why would you do that? Because we actually help because we had this. We were able to help get a law passed last year, thanks to Ron Wyden, actually, to go and make it so the pre-1971 out-of-print recordings could be made available by libraries for free. 
And then we said, great, we've got 20,000 of those. And so we were able to go and put those up, because Boston Public Library's sort of forward-looking nature, our funders' forward-looking nature, even though we could only do some amount with it. In the meantime, things change as -- and the laws change to reflect the new realities. >> Tony Marx: All right. So this is interesting. This is curation and somebody's clearly been impressed by the scale of the amount of information and they ask are there ways to narrow down the vast materials available in the same way physical libraries can offer? In other words, if I understand the question, you know, we've got 65 million physical items here and -- >> Brewster Kahle: Rock and roll. >> Tony Marx: -- but at least as importantly, we have librarians who will help you make sense out of that and say, no, no, no, you want to start here, don't go there, don't trust this, whatever. How does that work and how can it be trusted? >> Brewster Kahle: Not well enough. >> Tony Marx: Right. >> Brewster Kahle: So we really need help in building curated collections. So we have a lot of the curation that came with the collections they came from. So for instance, the books that came from the New York Public Library, you can see all the library books that came from the New York Public Library that brought something to it. There's subject indexes that come with the different materials. That helps, but it's still kind of confusing. But I think what we need is citizen curators. We need people to go and build collections and navigation structures that are new and different towards moving through it. Because frankly, it's kind of confusing and a little daunting. If you go to archive.org, it's just like, holy crow is that a lot. And you're kind of like, don't I have something else to do? I mean, so we need more hooks to get in. Openlibrary.org was a collection that -- was a website that you can build lists on, and people do. 
They build the lists of the books that they think are really important to read together and they pass those lists around and they build on each other. So I think that we need to go and bring people together to go and filter, not just books, but the music and all the different things that are out there to go and help people through it. Let's use our time and expertise to leave something, rather than just bantering on some chat network somewhere. Let's go and have something that we show for. I mean, one of the big differences, I think, between web 1 and web 2, everybody defines it a little bit different, the web 1, where we had the Wikipedias, the Internet Archives, you had blogs and things like that, by participating in the internet, you left something that people could build on. And the web 2, if you define it as kind of these platforms, it's just it scrolls away. It's not called pages anymore, it's called a feed. I mean, a feed. Right? Isn't that what you do to horses? You go [inaudible]. Right? I mean, it is like something is really wrong here. We shouldn't be spending our time just babbling and having it just go off into the ether. We should be building something together. So I think that's curation. That's individual content. That's new and different things. Some of it'll be compensated. Most of it won't be. And it will be a better internet for it, by turning our tools towards let's go and create things that are worthy of being in the library. >> Tony Marx: I couldn't have said it any better. Ladies and gentlemen, Brewster Kahle. [ Applause ] >> Brewster Kahle: Thank you. Thank you, Tony. >> Tony Marx: Really, thank you for the amazing work that you're doing. >> Brewster Kahle: This is an honor to be here. >> Tony Marx: And, you know, it's not easy and you've been at it and inspiring us and guiding us and helping us and we're all working together and we will get there. And thanks to you. So. >> Brewster Kahle: Thank you. 
>> Tony Marx: Thanks to you all to be here. [ Applause ]