
Langley Green & West Green (electoral division)

From Wikipedia, the free encyclopedia

Langley Green & West Green

Shown within West Sussex
District: Crawley
UK Parliament Constituency: Crawley
Ceremonial county: West Sussex
EU Constituency: South East England
Electorate (2009): 9193
County Councillor
Brenda Smith (Lab)

Langley Green & West Green is an electoral division of West Sussex in the United Kingdom, and returns one member to sit on West Sussex County Council.

YouTube Encyclopedic

  • ✪ Collections as Data at the Library of Congress
  • ✪ LBCC - We The People: "Resistance Social Movements" - April 2017
  • ✪ Hedrick Smith May 6 2016

Transcription

>> From the Library of Congress, in Washington D.C. [ Silence ] >> Kate Zwaard: This next session is a bit of a treat for me. It gives me a chance to highlight some Library of Congress staff and partners, whose work I think you'll find interesting and relevant. I'll start us off, and then we'll have Matt Weber, an assistant professor in the school of communication and information and co-director of Rutgers NetSci, Network Science Research Lab. Matt and his colleagues Ian Milligan and Jimmy Lin of Archives Unleashed Incorporated led a datathon here at the Library quite recently. Matt is going to talk about Archives Unleashed and its impact on the web archiving community. And last up we'll have Deborah Thomas, the library's program manager for the National Digital Newspaper program. And Leah Weinryb Grohsgal, a senior program officer at the National Endowment for the Humanities, and coordinator of the National Digital Newspaper program. Deb and Leah will talk about NEH's recent Chronicling America data challenge, in which members of the public submitted projects using historic newspaper data in a national competition. But first, I'll start us off by talking a bit about my team, National Digital Initiatives, who put this on today. Who we are, what we've done and what we're going to do in the future. [ Background Noise ] That's us, National Digital Initiatives. I'm going to wander around, is [inaudible]. Great. So hello. Hello? Great. Hello, and welcome from National Digital Initiatives. Let me introduce you to my team. There's me, I'm Kate, you know, that's my picture and me, my face. And Jaime Mears, Jaime are you nearby? There's Jaime. Everybody, say hi to Jaime. And Mike Ashenfelder. Mike, do you mind standing up and waving? And Abbey Potter. Also, I think, you guys met Jane a little bit earlier and also I think in the crowd are Eugene Flanagan and Colleen Shogan from our executive management team.
They provide the support and structure to make all this possible, so if you like what we're doing, please see them and thank them. So as I mentioned, and I'll talk a little about National Digital Initiatives, our team here, but first I want to tell a quick story, that I think illustrates what the Library of Congress's long and exciting history of technological innovation, and that's the story of Henriette Avram, whose work here at LC replaced ink on paper cataloging with distributed electronic cataloging records. Henriette Avram was born in New York in 1919, she did two years of pre-med at Hunter College and then left to start a family. She was in her 30s when she started learning how to program. And I really like this story because there are people who will still tell you that if you haven't been coding since you were in diapers, you'll never be able to do it. And I'm here to tell you that they're wrong. She did it and she changed the world. Henriette and her team here at the library created MARC, which is a structure to hold bibliographic information for materials of all formats. Does anybody in the room remember card catalogs? There are people like that Colleen's age that do not remember it so don't laugh at me. I really do miss the beauty of them and the smell. Like do you remember how they used to smell? But that's about all I miss. They were kind of hard to use, right? And what Henriette and her team gave us is the power to share that information across oceans and search it at a keystroke. So right now I can sit at my desk and search the holdings of the British Library to decide whether I want to take a plane flight over there. In the past, that would have been impossible. It also gave us the ability to do really good precision recall, you know, like searching with a search string is much easier than looking up beagle and then realizing, well that's not how someone intended to catalog it, you have to look up a different name. It, it also gave us shareability. 
So it's the kind of work that I find really interesting. An application of effort here made everybody's life a little bit better. So we could swap cataloging records, we could have joint teams where we could do the copy cataloging. You know, it really improved all of our lives. I think of Henriette Avram as sort of a hidden hero of computer science. She's a personal hero of mine. She came to libraries from software like I did, and really took her adopted home as a place of her heart like I do. But she did all of this before computers were networked, before relational databases were a thing, before character encoding was really mature. And I really like to just take a moment and reflect on like, what she was able to give us at a time that you know, this kind of sharing of information was really, very limited. And I think also when I think about her story, it shows me the power of combining disciplines, which she did in her own self, and I think that, knowing many of you, that you do, too. Right? You know, you have a data science background and you work in libraries, or you're a software developer who's also a journalist. And she did that herself. But also we do that by having rooms like this where we share our knowledge and get to know each other. And I think that that's where we see really cool sea changes, where we can provide, like cross disciplines. Her work really started the digital revolution in information science and it's something we carry forward today. Here at the Library of Congress but also broadly in the field. I think about her as the person who really started libraries' commitment to open source. So for those of you who work in software in the libraries, we're avid users and contributors to open source but also standards work. You know, we're famous for getting committees together and all working together, right? And I think it really does lift the tide, and that's the kind of work that I'm really excited about.
It also shows, to me, how innovation can look like a dot in the timeline, but it's really not. So people say things like MARC was released to the public in 1968. But really it was ten years of iterative releases, and it only looks like a dot in a timeline in hindsight. And you can, and it still continues to improve today. You know, we have teams of people working on the standard here and internationally. It's an international standard that now contains character support for all currently used languages. You know? So it's this thing that continues to blossom. And I think about that in relation to other work here at the library, like LOC.gov. It's also another dot in the timeline but is also a continuous improvement and we add new collections all the time. My colleagues in OCIO and Library Services put forward an enormous amount of work to release these new treasures to the public. Just this year we released a lot of new collections including George Patton's diaries, the Chicago Ethnic Arts Projects, and Walt Whitman's papers. And I want to zoom down one more level and just take a close look at one collection that we've released. And we could just as easily be looking at another, at any of the other collections like the ballroom dancing instruction manuals which are a hoot, you should really check them out. Or the 2014 election web archives. But I picked this one because it's really special to me and I find it, I think it's really exciting. Rosa Parks' personal papers are on loan to the library for ten years from the Howard G. Buffett Foundation. They contain 7,500 pieces of personal correspondence and papers and 2,500 prints and photographs documenting her private life and public activism on behalf of civil rights for African Americans. The materials came here, they were assessed and stabilized by the LC conservation division. They were cataloged and described by librarians and archivists. Files were moved by people and software.
They were assessed, validated, prepared for access. Metadata was transformed. Websites were created. Search indexes were populated and rights were assessed. That's a few years worth of work in less than 140 characters. And those of you that are librarians in the room, you know that this is just the job, right? A bunch of invisible work to make information findable and usable. But I kind of want to make a big deal out of it because we make a big splash for cool new websites and exciting new visualizations but the sustained and careful work that it takes to get new information out is something I think we should really celebrate. And I think it also speaks to something that I've been thinking a lot about, which is this tension between innovation and sustainability. And I think about this in terms of LOC.gov and the sustained effort it takes in getting new collections ready, and the enduring power of MARC and other bibliographic standards like BIBFRAME, and I think of Henriette Avram as my shiny beacon. Because her, you know, level of software in libraries, but also because she really believed in the power of infrastructure. Which is something that I can get really excited about, like I think infrastructure's really cool. You know? It's the roads that we all drive to get cool places, right? And I think that groups like ours, like National Digital Initiatives, can either become or be perceived as like, the cool new things group. And I really don't want to do that. I want to do things that lift the tide and have a view to the future. So with that all in mind as we were doing our planning for National Digital Initiatives, we came up with a couple of goals that I'd like to share with you today. The first one is to maximize the digital collection, the benefit of the digital collection to the American public and to the world. We have a lot of stuff here at LC, we have a lot of it, and a lot of it's publicly available in digital.
We have 10 million historic digital newspaper pages, we have 1.2 million prints and photographs, we have the personal papers of George Washington, Lincoln, Carl Sagan and Jackie Robinson and many other maps, books and archived websites and other treasures. We have this really awesome team here called educational outreach, I don't know how many of you have heard of it, but they focus on K through 12 educators and students and they prepare resources and programs that help teachers get material out to their students from the library, teaching with primary sources. And I'd like to think about how we can extend the benefit that they give to advanced scholars and to the public and to the curious, and how do we teach a new generation of journalists to come to the Library of Congress for reference help and for resources? I think that's the kind of thing that would be really exciting to work on. We'd also like to enable more creative reuse of the collection. The great mustaches of LOC, I tried to grow one today for an example, but it didn't really work out. And people use these like for mugs and t-shirts, and it's really neat to see that kind of economic benefit from things that are in the public collective, but it also, it's fun and exciting, and it brings people to the collection, too, which is great, but it also services a more important scholarly purpose, too. Back to the Rosa Parks collection for a second, which contains this amazing and powerful handwritten note in which she says "I have been pushed around all my life and felt at that moment that I couldn't take it anymore." This note gives us insight into Parks as an American hero in that moment. But other pieces give us a more fuller picture of her as a human. And that's why I love her pancake recipe, which was picked up by the popular press and food bloggers. From what I hear, it makes great pancakes. Has anybody tried it? >> Yes. >> Kate Zwaard: Yeah? How is it? >> It's very good. >> Kate Zwaard: Oh yeah? Okay.
Peanut butter is her secret ingredient apparently. But it really, the pancake recipe I think, helps us paint a fuller picture of her as a person who had a family, who had friends, you know, who wasn't an icon frozen in time, was a real person. And really inspires us to think about the power we could have just being ourselves and thinking about you know, what we could do. And I think also it brings people into the collection. So people read on their favorite food blog, this thing, and I hope that inspires them to think about what they could learn about Rosa Parks from Rosa Parks and brings them into the collection. The second goal we have at NDI is to incubate, encourage and promote digital innovation. We have a really small staff as you just, as I just demonstrated. Which I like actually, it helps us keep focused, it makes us agile, and it lets us try new things without there being an impact on the critical production work of the library. I think of us as sort of an interface to the outside world. I used to say a semipermeable membrane, but I got a lot of weird looks on that one, so people told me to stop. And in terms of an interface, I think of us as being out in the community, thinking about standards, thinking about projects, that would be really interesting and relevant to people inside the library, but also being involved very closely with our library colleagues and what they're working on, what they're interested in, and helping the communication path between those two. How many of you remember chemistry class? So this is one of my favorite energy diagrams, exothermic reactions. As you know from chemistry, an exothermic reaction happens spontaneously, but sometimes they need activation energy. And I think that's true of many of our projects here, right? Like, we've got great ideas, we've got tools, but we need a little bit of help getting over the hurdle. And I think in that case NDI can provide a catalytic effect. Right?
We can be the catalyst and we're not enzymes, right? We don't have the power of an enzyme. But what we do have the power of is knowing people, knowing technology, being able to connect folks. So sometimes a little bit of application of effort from an outside team at just the right time can get the ball rolling, and I think that we can help provide that. I like to say that libraries are poor, right? We're not Silicon Valley. But we're scrappy and we like each other. And that's our power, right? We're like a stream over a boulder. Our problems may be immovable, but we are many, we are focused, and we have a very long memory. One of our focuses will be building on relationships, will be building relationships that can lead to successful partnerships. So we co-hosted DPLAfest a little while ago, which was an exciting event that brought many librarians from across different libraries to here into the national archives, and we also co-hosted Archives Unleashed a few months ago. In the interest of no spoilers, I won't talk much about that because Matt's up next. But I do wonder if I could tell you just a quick story about that? So we had, the teams were made up of scholars, programmers and librarians, which is kind of unusual, it differs from a classic hackathon which is mostly programmers, and that collaboration ended up being really exciting and awesome to watch. And we had one team that was looking at some of the Supreme Court web archive data, and they were getting some results, that, from a text analysis, that they weren't expecting. They looked a little wacky. And, but luckily there was a law librarian on the team, and he said okay, well I know a little bit about the contours of the data, and the reason you're getting this is because blah, blah, blah. Like he explained that like, there's some weird artifacts that you might expect.
So he helped them construct a query that, that made the results work out the way that they could have expected. And I think that really illustrates the power that libraries have in data science. Right? Where else can you get, where else can you go to get the data? But also to talk with somebody who really understands it intimately and can help you with your research. We're also working on a proof of concept for our digital scholars lab. At the Library of Congress we have the John W. Kluge Center, which hosts scholars from all around the world to be in residence here at the library, to work with our collections, and also to create a community together, and then share the results of their work with policy makers and the public. And we're thinking about how we can help support them with advanced digital scholarship research in partnership with the Kluge Center staff. To that end, we've engaged with two outside experts, Dan Chudnov and Michelle Gallinger, to do a proof of concept implementation. We're thinking about something very lightweight that we can iterate over time, and we will have a report out after that's completed and we hope that we'll have something to share with you soon. I'd like to introduce you to Tong Wang and Chris Adams, our inaugural digital innovation fellows at NDI. They're doing a fellowship that's focused on demonstrating an innovative use of the digital collections, and we hope to broaden the applicant pool in future years, so keep a look out for it. Biking will not be a requirement, that's just a coincidence. So if you don't like biking, don't worry. And lastly, this meeting today, as you could probably tell our theme for this year was collections as data. And this is the capstone of this event where we invite all of you fine folks here to chat together. This is a particular love of mine so I'm really excited about it. But we're also exploring other avenues, like how can we encourage more contributions to open source and shared tooling?
How do we evaluate and help with skill building, technical skill building in libraries? And how do we improve on shared infrastructure? Things like that, where we can apply a little bit of effort to get a large result. So contact us, we're small, we're new, but I'm proud of what we've been able to accomplish already. I hope today that you'll enjoy our, this selection of speakers that we've put together. I'm really excited to hear from them. But I hope you'll also take the opportunity to make friends with each other, get to know each other, and think about Henriette Avram and the power that a determined individual can have over the course of human events. Because I look around the room and I see hundreds of you, and more importantly I see a network of professionals that can really help actualize those ideas. So think big in a sense. Thank you. [ Applause ] [ Background Noise ] >> Matthew Weber: Good morning. I'm Matt Weber, I'm an assistant professor at Rutgers University. I'm in the department of communication in the school of communication and information, and it's really an honor to be here today. Thank you to Kate and to her team for organizing this. Thank you for inviting me back after that chaos that we created when we were here in June. I had the privilege of being here in June with my colleagues Ian Milligan and Jimmy Lin, both of whom are faculty at University of Waterloo, outside of Toronto up in Canada, to host the Archives Unleashed event. It was the second iteration of Archives Unleashed and I want to tell you a little bit about the Archives Unleashed journey, what we aim to accomplish, and why I see this as being such an important movement and important initiative for engaging with collections as data.
So Archives Unleashed is an agenda that Ian, Jimmy and I put together to address what we see as a central problem of web archives being a tremendous untapped resource for scholars, something that has incredible potential for research and yet, scholars are relatively unaware of how archives can be utilized, what we can do with them. We came up with this idea, a little more than a year ago, when Jimmy, Ian and I were at a conference together. We were at a reception afterwards and as we do at these receptions we were standing around a bar table drinking a few beers, and we were talking about the different data sets that we had, and we were talking about how under-utilized these data sets are. And we jokingly said, we've got to do something about this, we have to figure out how we can better educate people and how we can unleash the power of the archives that we have, and Jimmy jokingly said, we have to unleash the archives, come on guys, we've got to do this. And I should have run away screaming at that point, because lo and behold we actually came up with a way to start educating people about how to utilize archives as a research tool. And so we saw this opportunity to develop a forum for researchers to engage with archives and to learn about the ways in which these can be utilized in their research agendas. Web archives are an amazing resource, but access and tools for utilization are often a problem. We're seeking to address that. We're seeking to keep calm and hack and to engage with the data. Our vision behind this is interdisciplinary, and our goal in this as Kate highlighted, is to bring together researchers from different backgrounds because in part, we've realized that in order to create an active research agenda, to bring researchers into the fold to work with web archive data, we need to bring together different disciplines. Point in case, I'm a communications scholar, Ian's a historian, and Jimmy's a computer scientist.
We come from entirely different disciplinary backgrounds, we speak entirely different languages as academics, and yet, together in the same room, we're able to work with data, and create meaning out of that data as we collaborate. I'm going to borrow a story that Ian likes to tell because I think it's particularly salient to the conversation that we're starting to have here today, which is that Ian comes from the world of history, I come from the world of communication research and we often in those disciplines, look at data, look at the data that we're given, as a black box. Look at the way that that data is collected as a black box. It sits in this box, and we go to it, and we say look, here's what I need. I need some data on newspapers in the United States, give me what you've got. And that data's spat out to me and I use it and I create research and I never really ask how that data came to be. And what we've started to realize is that we need to open that black box. And why it's so important to have these different disciplines represented and engaging with one another is that by opening up that black box, we can start to ask some important questions about what the data are, how the data came to be, what some of the underlying assumptions are, and then how that data can better be utilized in research, too. I was actually very excited to see JR's slide on hotels and the map of hotels in the United States, in the world, because I think it underscores that problem of needing to better understand exactly what the data are and how they came to be, what they mean, so that we can better utilize them in research. Okay. So let's put this into action. We came up with this idea that in order to better educate researchers, better educate scholars about the ways that web archives can be utilized in research, we needed to create a venue for scholars to come together to be, to learn, to exchange ideas, to educate one another about the tools that are available.
And we quickly realized it takes an army. We've been very fortunate to have a lot of collaborators in this initiative. We've had support from our universities, from Rutgers University, from the University of Waterloo. We've had support from the Library of Congress, from the National Science Foundation, from the Institute of Museum and Library Services, the Internet Archive, University of Toronto, and I could go on with this, but our list of collaborators has really been never ending and we're always amazed at the people who are willing to support us in this endeavor. The format that we came up with was to host a two to three day event where we brought together an invited group of academics, largely graduate students, faculty, librarians, we start with introductions, we start by getting people networked, engaged with one another, we break everyone out into teams, into work sessions and then we have people come up with these presentations based on what they did. So let me show you a little bit more about what that is and what that looks like, and I'm going to highlight the last two events that we've hosted, the first two events that we've hosted. The first was Archives Unleashed one, which we hosted in Toronto in March, and the second was Archives Unleashed 2.0 which we hosted in June here at the Library of Congress. And so what this plan looks like in action, is academics in the room together are starting to engage with one another, starting to mingle. And what we do is we bring these people together, we bring our 40 to 50 scholars together into a room, we've assembled for each of these Archives Unleashed events, a set of donated curated data sets. In the case of the event that we hosted at the Library of Congress, we had a data set of Supreme Court hearings, web archived pages around Supreme Court hearings. We had archived web pages around the US elections.
We had archived web pages and Twitter data around terrorist activity, archived known terrorists' Twitter accounts. We had donated collections from the Canadian government. We put these collections out there, we put the team, the individuals together in a room, and we start to ask people what they're interested in. And as moderators, Jimmy, Ian and I take on this role of trying to assemble groups into teams, assemble individuals into teams that are focused on particular questions, whether that be sentiment analysis, event detection, and we were able, at both of these events, to formulate these groups of four to five individuals that would then spend the next two days working together on emergent research questions. In order to help to lubricate this process, we believe in the importance of socialization, and it really is critical for these events that it's not just everyone together in a room, that we're getting people out and engaged with one another, socializing, so absolutely the first day, we get everyone out to a bar in the evening, we get people engaged with one another, we do our fun getting to know you games, and really break down some of the barriers that might exist between these individuals otherwise. And then we come back day two, day three, and we put everyone to work. We spin up virtual machines using donated computing resources, we allow the teams to have access to high performance computing needs, that they would need in order to quickly analyze and crunch the data. But we also bring social scientists in the fold. We bring communication scholars, we bring historians to the table, so that we can also be asking critical questions about what the data mean.
We have a lot of applicants who will come to us when we put the calls out for these events, who say look, I don't have any technical skills but I have a lot of interesting questions I want to ask about web archives, these are exactly the people we want in the room, because we want to be asking those critical questions of the data that are being extracted from web archives. We have lightning talks where we offer scholars an opportunity to share the work that they're doing, again, to build the community and to help each other to engage with one another. And then ultimately, at the end, we share the work that we've been doing. We come back together at the end and present on the research that we were able to generate and spin up over two to three days. This is a photo of the group that came together for Library of Congress, in just a couple months since we hosted this event, we have been following up with everyone who participated, and it's been amazing to see how these teams coalesce and how they've continued to stay in touch with each other, and carry on the event, the research agendas that they started at this two to three day workshop. So we share, we engage, and we ask questions of each other based on the research that we generated. We also form connections. I'm an academic, I'm a researcher, so I can't help but study the people that we bring together, and I have a graduate student who's very interested in the way that people form collaborations, and he mapped out socio-grams, a socio-matrix, of the way that people connected at the Archives Unleashed event. 
This is actually based on the Toronto event, think back out to JR's image of the seventh grade classroom, this is our scholars engaging with each other and forming connections just over two to three days, and this was asking people who you had talked to and exchanged ideas with before the event versus who you had exchanged ideas with after the event, and we saw that we formed some very close knit groups out of this, and close knit clusters of researchers, over the duration of this event. We also of course polled individuals to see if they were satisfied with their experience, if they were satisfied with the engagement, and not that we're patting ourselves on the back here, but people walked away generally happy and feeling that they had engaged across barriers. We also have some really cool research coming out of this. And it's only two to three days, but we're able to generate some pretty interesting insights into the data that we granted teams access to. So Kate mentioned the Supreme Court data, this is a word cloud visualization of the URLs that were in the Supreme Court data, and you start to have some interesting head scratching moments, for instance here seeing doubleclick.net popping up as one of the most prominent websites in the Supreme Court data. This is where you have that head scratch moment, you're like well I'm looking at Supreme Court data, why am I seeing DoubleClick in here? And this is where it's so helpful to have not only the computer scientists but humanists and the people who created the collections because we start to realize there's some quirks in the data, clean this out, pull out DoubleClick, which is an advertising hub, an advertising website, and we're able to extract some more valuable meaning around the way that these websites are connected.
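The clean-up step described here (tally the sites that appear in a web archive, notice an advertising domain dominating, and pull it out) can be sketched in a few lines of Python. This is only an illustration, not the team's actual pipeline: the sample URLs are made up, the `NOISE_DOMAINS` set and both helper functions are hypothetical names, and the "last two labels" rule for finding a site's domain is a deliberate simplification.

```python
from collections import Counter
from urllib.parse import urlparse

# Made-up sample for illustration; a real run would read URLs from the
# archive's crawl records instead.
SAMPLE_URLS = [
    "http://www.supremecourt.gov/opinions/",
    "http://ad.doubleclick.net/adj/example",
    "http://ad.doubleclick.net/adj/other",
    "http://www.scotusblog.com/2016/03/",
    "http://www.supremecourt.gov/oral_arguments/",
]

# Assumption: domains a subject expert has flagged as artifacts of the crawl.
NOISE_DOMAINS = frozenset({"doubleclick.net"})


def site_domain(url):
    """Return the last two labels of the hostname (naive; ignores e.g. .co.uk)."""
    host = (urlparse(url).hostname or "").lower()
    labels = host.split(".")
    return ".".join(labels[-2:]) if len(labels) >= 2 else host


def domain_counts(urls, noise=frozenset()):
    """Count URLs per domain, dropping any domain listed in `noise`."""
    counts = Counter(site_domain(u) for u in urls)
    counts.pop("", None)  # discard URLs with no parsable hostname
    for domain in noise:
        counts.pop(domain, None)
    return counts


if __name__ == "__main__":
    print(domain_counts(SAMPLE_URLS).most_common())
    print(domain_counts(SAMPLE_URLS, NOISE_DOMAINS).most_common())
```

The resulting counts are the kind of weights a word cloud would be drawn from; the point of the second call is that the list of domains to drop comes from someone who knows the collection, not from the code.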
We're able to generate word clouds around the confirmation votes to see, to see a little bit as to how people were voting, whether the confirmation candidates were being supported, not supported, but also to see some of the key words that were emerging and the themes that were emerging in the discussion around these candidates based on the text that was in the archived web pages. And again, only two to three days. So this is an initial snapshot here. We're also able to move a little bit further, I mentioned a team looking at terrorist networks and ISIS Twitter data, and they were actually able to take this a little bit further and find that there were particular types of Twitter accounts that served as amplifiers in, on Twitter, to amplify certain messages and certain themes in the conversation, and they were able to identify the characteristics of those Twitter accounts based on the data they had available. We're moving this forward now and we have a third event that's tentatively planned for February of 2017 at the Internet Archive in San Francisco, and we have a fourth event planned at the British Library in June of 2017. And after that, we're focused now on building a sustainable model for this. The work to date on Archives Unleashed has been driven by myself, Ian and Jimmy, and our collaborators and that network that I mentioned at the beginning, but we recognize that going forward we need to create a broader educational model for working with web archive data and for educating researchers and librarians about ways to open up access to these data sets. And so Kate referenced Archives Unleashed, LLC, we're looking to build and we've applied to create a nonprofit organization around this idea of Archives Unleashed, to build a more sustainable educational model for bringing together interdisciplinary teams to engage with web archives and to learn the skills necessary to create research out of these data sets.
And so we have this long-term vision of creating a sustainable model that will move us beyond these Archives Unleashed events and build on the momentum that we've created thus far. If you have questions about any of these events, if you'd like to learn more, I'm happy to serve as a point of contact, so please shoot me an e-mail. The ArchivesUnleashed.com website has all of the information about the event that we hosted at the Library of Congress. If you replace the .com with .ca, you will find a variation about the first event that we hosted. You can also access the Archives Unleashed 1.0 projects, and I believe the 2.0 projects are now up as well, via that bit.ly link. It's also linked via ArchivesUnleashed.com. And thank you again for having me here today, thank you again for listening. We really appreciate your time. [ Applause ] [ Background Noise ] >> Hi, can you hear me? Great. [ Background Noise ] >> Do you have slides for us? Oh. >> Leah Weinryb Grohsgal: Cool. Thank you. Thank you partner. Good morning, yes, a round of applause. Good morning, I'm Leah Weinryb Grohsgal, I'm a senior program officer in the division of preservation and access at the National Endowment for the Humanities, and I'm also the program coordinator for NEH for the National Digital Newspaper program. This is Deb Thomas, Deborah Thomas, she is the program manager for the National Digital Newspaper program at the Library of Congress. So the NDNP is a partnership between NEH and the Library of Congress that represents a long-term effort between these two agencies to develop Chronicling America, an open access searchable database of historic US newspapers. And Kate talked about sustained efforts; this is really sort of the culmination of a sustained effort. This program's been going on for 10 plus years, and it also builds on the US newspaper program, which was the effort to microfilm all those historic newspapers before they fell away to dust.
At this point, we have millions of pages of digitized newspapers and descriptive information contributed by states and territories across the country. One of the best things about NDNP is that Library of Congress has made all of the data open. And they've developed a well documented API to explore it in a number of different ways. It's an incredible collection of data, that's just waiting to be used for scholarship, research, education and public programming. I'm here to tell you about a successful contest I ran at NEH to encourage creative and substantial use of Chronicling America. First, Deb is going to go over the goals of NDNP and explain the data she and her colleagues have made available, and after that, I'll describe the data challenge and the excellent results. Deb. >> Deborah Thomas: Thank you Leah. So I made the magic happen, but now I don't know what I'm doing. [ Background Noise ] Ah ha. Alright, well, so this is newspapers, historic newspapers in particular, and I'm just going to show you a few, oops, that was accident. I'm going to show you a few of examples of newspapers. That may or may not show up on the screen. Anyway, newspapers have something in them for any kind of discipline or humanities studies or social sciences. As you can see from the images in the background there are advertisements, there are news articles on scientific events, there's economic articles, there's genealogy, there's literature, any kind of thing you want to study can be found in historic newspapers. The goals of the National Digital Newspaper Program in particular were primarily to enhance access to American newspapers. We are, our whole goal with the program was to create a national level collection of what is currently today, a distributed level, a distributed collection across the United States. 
The Library of Congress has a very large collection of American historic newspapers, but it has a particular collection focus; it is not the collection of American historic newspapers. The collection of American historic newspapers is actually stored in state libraries and archives around the country. And so using digital technologies, we're able to bring this material back together again through the partnerships with the National Endowment for the Humanities and individual state institutions, representing their state in the program. We have developed a permanent digital resource including selected historical content from each of the states that are currently participating. Eventually we hope to cover all 50 states, as well as as many territories as are able to participate. The structure of the program is focused around shared resources. We share the cost of the program with the National Endowment for the Humanities. The National Endowment for the Humanities provides funding to each state to select and digitize materials from their state newspaper collections, based on some basic criteria that we provide, and they digitize their collections to specifications established by the Library of Congress. The Library of Congress receives the data, aggregates it, sustains it for the long term and makes it available to the public. One of the key elements of the program over the past 10 years and into the future is focusing on phased scalability, so that the Library of Congress and the National Endowment for the Humanities can make this program last into the long term, and to ensure that the data that we get at the end of the program is similar to, and usable with, the data that we get in the beginning of the program. So we have gradually increased the size of the program.
We started in 2005 with six awardees, which seemed like a moderate amount of people to start working with, and now we're up to working with more than 40 states and territories and we have many, many, many millions of newspapers. In addition to all those aspects of actually creating the content and bringing the content together, we focus the structure of the program and the resources on planning ahead for technical change and sustainability, in that we have ensured, through various mechanisms, that the data that we receive is all fairly consistent. We have a very detailed technical specification, which has been described as richly detailed, and we have a fair level of confidence that this material is in fact going to last into the future, and whatever the next form that this digital data needs to take to be usable in the systems of the future, we can move it in that direction as a whole data set and we won't be dealing with individual pieces from 50 different producers. So this is a screenshot of the Chronicling America website. It's LOC.gov/chroniclingamerica, it's chroniclingamerica.loc.gov, or the historic newspapers badge from the Library of Congress home page. It provides access to searching the newspapers and browsing them as well. And this is the user interface available. This is a map of the states participating in the program so far. As you can see, it's happily very green. We're still working on filling in all those other little spots, but this year NEH just announced grants to Alaska, Colorado, Maine and New Jersey. We're very happy to have them join the program and join the other 40 states and territories, including Puerto Rico and the District of Columbia, which the Library of Congress represents, all of which is available in the Chronicling America website.
So currently the status of the website, is today we have 11.3 million pages online, we have, that's approximately 75 terabytes of live storage, and we also have 750 terabytes of archival storage, the master archival files are kept offline for safe keeping. We have 44 states and territories participating in the program actively, we have representation content from 40 states and territories. We also have, in addition to the actual digitized newspaper pages, one of the tasks of the awardees, when they select the digitized, the newspapers they want to digitize, is to create a 500 word essay about the significant historical context of that newspaper. Why was this one particularly important enough to make available? To digitize and make available? And so we have 1,100 of these essays that have been written by scholars and knowledgeable researchers around, and librarians from the country, that really put the newspapers in those, in their place in time, and explain why these were significantly important content. In addition, just the 11 million pages represent 2,100 different titles. But that, I talked a little bit about, we, one of the goals of the program is to create a selected amount of content available through this program and through the Library of Congress website. That's a selected amount of content because at 11 million pages, we're only working with 2,100 titles so far? Out of the 153,000 titles that have been published in the United States. That gives you a little bit of a sense of the scale of newspapers that have been published in the United States. This is always ever going to be a slice off the top, and hopefully it's a very rich and valuable slice to researching scholars. And in addition, for reasons of sustainability in archiving, we have 10,000 reels of microfilm that were created during the digitization process. So just a glimpse into what the site allows you to do. The search has, the site has two main features. 
One is the digitized newspapers, which are full-text searchable at the page level. You can search by place, time and key word, which are really the important things when you're looking at newspapers. You have a certain amount of information about every issue and page, you can navigate within them, and you have visual search highlights that are in red. I don't know if you can see it on this screen or not, but they indicate where your key words occur in the page, and that gives you the chance to zero in visually on articles of interest and where your results may be, where you may be willing to spend the time zooming and panning into the materials. In addition, there's the ability to manipulate the image on the screen and to make it full screen. The other kind of data, which is also part of the Chronicling America website, is the US newspaper directory. This sort of forms the backbone of all of the newspapers that we might want to digitize, and actually represents the program that Leah mentioned, the USNP program, the United States Newspaper Program, which was structured to identify, catalog and selectively microfilm the nation's newspapers. All the records that were cataloged under that program, the 150,000 titles that we know have been published in the United States, are in the directory, noting if they have digitized content elsewhere, and they provide more information for researchers and enhance access to newspapers across the country. So the US newspaper directory is also searchable by place, time and key word, and it includes website links to individual locations where these materials may be found, as well as holdings records. And finally, well not quite finally, but close, what's available and how is it available? Leah mentioned open access to this data. Well, what that means is that the data is available for harvesting for reuse outside of the individual interfaces that we provide through the website.
So we have digitized page images, we have optical character recognition, which is machine-readable text for all the pages, and we have metadata that surrounds every page and issue, which is in a standardized METS format with MODS descriptive characteristics, which describes the place and time of that particular issue, as well as the newspaper directory records. All of these pieces can be taken out of the site and analyzed in different ways by researchers or individuals, in ways that don't involve the actual website. So the way that we make the data available is through the open API as well as through the public website. We provide stable URLs that are helpful for many reasons. In particular, they are human readable, so that you can project and predict what the next URL you might be accessing could be. There are machine-readable views like JSON, which make it a bit easier to play with the stuff. And there are prefab data sets that we have created specifically at the request of researchers who really just wanted to access the OCR, for example, so we have created batched downloads that people can access and take away for use and manipulation later. And now I will turn it over to Leah to tell you about the specific uses that have been made of late. >> Leah Weinryb Grohsgal: Alright. Cool. Thanks Deb. So you just heard about the incredible amount of data that we have available and the number of ways that researchers can access it. Chronicling America is an exceptionally valuable and public-facing product. It's one of the best known and most used resources that's ever been produced by both NEH and the Library of Congress. As a historian myself, I used the site well before I came to NEH for my own research. The site's used heavily through the user interface and there have been a few large scale digital humanities projects done with the open data that are really cool to look at.
But we at NEH wanted to see even more people using the open data. At NEH we decided to try hosting a contest to get people thinking about creating interesting projects. So we used challenge.gov, which is a platform for agencies across the federal government to host competitions. It's been around since 2010 and it's a response to the White House Office of Science and Technology Policy's call for government agencies to promote innovation using prizes and challenges. Our challenge encouraged members of the public to ask the question, how can you use open data to explore history? We purposely left the challenge's parameters broad, and we encouraged entrants to be creative in thinking about what humanities themes really interested them, and how they would get at that through the open data. So projects could create maps, visualizations or tools, and entrants also were allowed to mash up the data in Chronicling America with other data sets in order to get their results. Because the Chronicling America database covers so much in these millions of pages from north to south, east to west, across the country, various political, religious and cultural standpoints, we really think that the sky is the limit. And we had prizes! We got many interesting entries. I don't have too much time to go too deeply into any of them, but I think a short look at each of the six winners will give you a taste of the variety of projects we got. You can find links and more information about the winners and their projects on the NEH website if you're interested. Our first prize winner was Lincoln Mullen, for a site entitled America's Public Bible: Biblical Quotations in US Newspapers. The site tracks biblical quotations in American newspapers, to see how the bible was used for cultural, social, religious or political purposes.
It shows how the bible was a contested yet common text, so you'll find quotations in things like sermons and Sunday school lessons, which were reprinted in newspapers across the country, but you can also find the bible used on every side of almost every issue, including slavery, women's suffrage and capitalism. Next we had a tie for second place. One second prize went to Andrew Bales for American Lynching: Uncovering a Cultural Narrative. This site explores America's long and dark history with lynching, in which newspapers acted as both a catalyst for public killings and a platform for advocating reform. Bales integrated the Chronicling America data with data sets on lynching from Tuskegee University as well as Project HAL, which is based on the Beck-Tolnay [assumed] confirmed inventory. The site is intended to tell the story of lynching in America through maps and other visualizations, as we often do with data, but Bales also wants to connect that data with the stories of the victims. Oops. Sorry, I didn't advance the slide. That's a quick look. The other second prize went to Amy Giroux, Marcy Galbreth and Nathan Giroux for Historical Agricultural News. This is a search tool for exploring information on the farming organizations, technologies and practices of America's past. The group sees farming as a window into all kinds of things about American history, including social, economic, political and cultural. Additionally, they see their project, focusing on one subject area, as sort of a statement about big data. So by narrowing it down into this one subject area, they've created a useful data set that can be extracted from this massive data, in case people maybe don't know what to do with it because there's so much of it. So the challenge's third prize was also a tie. One third prize went to another team, Kristi Palmer, Kaitlin Polick and Ted Polly, for Chronicling Hoosier.
This site tracks the origins of the word Hoosier, its geographic distribution, and its positive and negative connotations over time. The project uses visualizations that are all connected with Chronicling America to tell the story of the word, which is not what you might expect. Additionally, they have documented and made all of their code open, so that anyone inclined to do so can apply it to another word that they're interested in. The other third prize award went to another team, Claudio Saunt and Trevor Goodyear, for Usnewsmap.com. Saunt and Goodyear made a really interesting point when explaining the origins of the project. So many people use the Google Ngram Viewer in order to track public discourse across the United States. But books take time to write and publish, and many times they come out in multiple editions, years apart, potentially skewing the results if you're looking at what people were talking about at any given time. Because of their quick publication schedule, newspapers capture the public discourse better than almost anything else. So this site is built on the Chronicling America data and allows users to discover patterns and watch news go viral, so to speak. For example, you can track terms like miscegenation and scalawag, which quickly gained currency in the 1860s in response to the political situation. And finally, we awarded a K through 12 educational prize, which this year went to teacher Ray Palin and the AP US History students at Sunapee High School in Sunapee, New Hampshire, for Digital APUSH: Revealing History with Chronicling America. So this group of 15 students used word frequency analysis, which is a kind of distant reading, according to them, to discover patterns in news coverage. They broke into teams and researched issues that caught their interest. For example, the case Plessy v. Ferguson, the words secede and secession, Uncle Tom's Cabin, the KKK, and labor unions.
According to Palin, the project was similar to the usual assignment of writing a paper in that students had to identify an important historical question and think about ways to research the answer. Using data, however, allowed them to learn a whole different set of skills than writing a research paper. So I realize I didn't have much time to go into much depth about these projects, but only to provide teasers, just to indicate the vast array of subjects represented. I do think that this contest holds several lessons for those of us interested in collections as data. First, know your collections. I've been interested in the kinds of projects that could come out of Chronicling America since before working at NEH, when as a historian, I used the site for my own research on civil liberties, religious groups and free speech. Additionally, as a librarian, the site is also important to me as one of the only freely accessible sources for historic news. The collection holds so many possibilities as a data set that it was good to see the field respond to this challenge. Many of the winners sheepishly told us that they were happy about the contest because it gave them a deadline for a project that they had been intending to do for a long time. It may not always be possible to hold a contest like this, but it might be something to think about if you want to gather a critical mass of people working on your collection and stir up excitement about it. I want to call attention to how many team projects we had here as well. And I think that it's important to remember that a lot of times we have people working together outside of traditional academic departments and crossing departmental lines, and we should encourage this in any way that we can. I think that the high school teacher Mr. Palin pointed out that the academic value of these projects extends well beyond the lessons in secondary school that he is teaching.
This intellectual work really involves, as he pointed out, similar question identifying to any sort of research and should be valued in the academic community no matter what you're doing. And finally, I think this contest points to the importance with any collection of maintaining contact with the communities using the product. Not just putting the data out there, but really trying to understand how people are inclined to use the collection, and learning from them at the same time that they learn from the collection. Thank you very much, and we're happy to answer any questions, you know, during the course of the meeting. [ Applause ] >> This has been a presentation of the Library of Congress. Visit us at LOC.gov.

Extent

The division covers the neighbourhoods of Langley Green and West Green, which form part of the urban area of the town of Crawley, and also Gatwick Airport.

It falls entirely within the un-parished area of Crawley Borough and comprises the following borough wards: Langley Green Ward and West Green Ward.

Election results

2013 Election

Results of the election held on 2 May 2013:

Langley Green & West Green
Party | Candidate | Votes | % | ±
Labour | Brenda Smith | 1,558 | 58.4 | +12.9
UKIP | Peter Brent | 533 | 20.0 | N/A
Conservative | Vanessa Cumper | 499 | 18.7 | -18.5
Liberal Democrats | Kevin Osborne | 77 | 2.9 | -14.4
Majority | | 1,025 | 38.4 | +30.1
Turnout | | 2,667 | 27.9 | -5.4
Labour hold

2009 Election

Results of the election held on 4 June 2009:

Langley Green & West Green
Party | Candidate | Votes | %
Labour | Brenda Smith | 1,396 | 45.5
Conservative | Lee Gilroy | 1,140 | 37.2
Liberal Democrats | Kevin Osborne | 532 | 17.3
Majority | | 256 | 8.3
Turnout | | 3,068 | 33.4
Labour win (new seat)

This division came into existence as the result of a boundary review recommended by the Boundary Committee for England, the results of which were accepted by the Electoral Commission in March 2009.
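
The percentage, majority and change figures in the two results tables follow from the raw vote counts. The Python sketch below is illustrative only: the function names are hypothetical, and it assumes that shares are taken over the total votes listed for the candidates and that the ± column is the difference between the rounded shares of the two elections, which reproduces the published party figures. It does not attempt the turnout percentage, which depends on the size of the electorate.

```python
"""Recompute share, majority and change figures from raw vote counts."""

# Vote counts copied from the 2013 and 2009 results tables.
VOTES_2013 = {"Labour": 1558, "UKIP": 533, "Conservative": 499, "Liberal Democrats": 77}
VOTES_2009 = {"Labour": 1396, "Conservative": 1140, "Liberal Democrats": 532}


def shares(votes):
    """Return each party's share of the listed votes, in percent to one decimal place."""
    total = sum(votes.values())
    return {party: round(100 * count / total, 1) for party, count in votes.items()}


def majority(votes):
    """Return the winner's lead over the runner-up as (votes, percent of listed votes)."""
    first, second = sorted(votes.values(), reverse=True)[:2]
    lead = first - second
    return lead, round(100 * lead / sum(votes.values()), 1)


def changes(current, previous):
    """Return the change in rounded share for parties that stood in both elections."""
    now, before = shares(current), shares(previous)
    return {party: round(now[party] - before[party], 1) for party in now if party in before}


if __name__ == "__main__":
    print("2013 shares:", shares(VOTES_2013))
    print("2013 majority:", majority(VOTES_2013))
    print("2009 majority:", majority(VOTES_2009))
    print("Change 2009 to 2013:", changes(VOTES_2013, VOTES_2009))
```

UKIP has no change figure because the party did not stand in 2009, which is why the table shows N/A for it.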

This page was last edited on 17 June 2019, at 11:45
Basis of this page is in Wikipedia. Text is available under the CC BY-SA 3.0 Unported License. Non-text media are available under their specified licenses. Wikipedia® is a registered trademark of the Wikimedia Foundation, Inc. WIKI 2 is an independent company and has no affiliation with Wikimedia Foundation.