#CRL_Leviathan Session 2: Libraries and the Information of Governments

Keynote: Approaching Leviathan: The Dangers and Opportunities of “Big Data” 
John S. Bracken, Director, Journalism and Media Innovation, Knight Foundation

  • How to deal with big data is only half the story > we must also focus on organizational culture and adaptation or we lose track of the importance of culture and people
  • There is so much data now > what’s important is the process, what you do with it, and the talent you build around it > we must adapt and create a bridge between traditional skills and new quantitative approaches
  • There is skepticism about technology and our reliance on it, and this is colliding with the emerging culture of “break things” and of focusing on the future and the next challenges
  • How does the civic sector do a better job of adapting, so that it builds the tools that people want and need?
  • “Make something people want or move on” outlook is much harder to accomplish in civil society
  • The biggest cognitive switch we need to make is enabling ourselves to make mistakes
  • The Knight Foundation works in the space of news and journalism, but links it to the community > learn more about the Knight Foundation here: http://www.knightfoundation.org/

Government Records and Information: Real Risks and Potential Losses 
James A. Jacobs, Data Services Librarian Emeritus, UC San Diego, and technical advisor for CRL Certification Advisory Panel

  • There are many gaps in what we know: no list of born-digital government information, no list of all government websites, no list of preserved born-digital government information
  • What we do know: FDLP libraries have preserved millions of volumes of non-digital government information, and most born-digital information is not held, managed, organized, or preserved by libraries
    • Preservation is at the mercy of budgets and social priorities > risk increases if the preserving agency is the creator and doesn’t have preservation as its mission, or if the preserving agency is governed by politicians
  • The production of digital documents is far outpacing what’s being done to preserve these documents
  • Key issues:
    • Versioning
    • The need for persistent URLs
    • The need for temporal context (ex: link to version of document or site that author linked to at time of publication and not updated version)
    • E-government issues (e-gov often hides information behind services > how do we preserve this information?)
    • Relying on government for preservation and free access (most agencies do not have the mandate to preserve indefinitely – this is even the case for GPO)
    • Collections need services to provide important context for interpretation
  • When we create dark archives we’re not creating a value for our community > we need to create immediate value for our users
  • Who should preserve?
    • Option one: the government alone
    • Option two: the government with non-governmental partners (ex: GPO + LOCKSS-USDOCS)
    • Option three: non-governmental organizations without government cooperation (ex: Internet Archive)
  • There are different methods for selecting what needs to be preserved (the solutions should be mixed and the issue should be tackled collaboratively)
    • Broad web harvesting (ex: Internet Archive)
    • Focused selection (ex: by agency or title by title)
    • Digital deposit (ex: deposit by creators to memory institutions)
  • When planning for preservation focus on different user-communities: don’t look at the web and decide what to preserve, look at the web and preserve based on what users will need
  • Every library should participate in digital preservation > it’s about building the value of libraries > collections and services should be reliable and useful > shared collections and services can be built with different contributions – not all libraries have to be data centres
  • Summary of key points:
    • Preserve born digital government information – the technology exists
    • Every library can and should participate
    • We can add value to the information by building collections of use to our user communities
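
The "temporal context" issue above can be made concrete with a small sketch. It assumes the Internet Archive's Wayback Machine URL pattern (a 14-digit UTC timestamp followed by the original URL, which resolves to the capture closest to that moment). The function name, the example URL, and the date are hypothetical, chosen only for illustration; this is one possible approach, not a method described in the talk.

```python
from datetime import datetime, timezone

# Assumption: Wayback Machine addresses look like
# https://web.archive.org/web/<YYYYMMDDhhmmss>/<original URL>
WAYBACK_PREFIX = "https://web.archive.org/web"


def dated_snapshot_url(cited_url: str, cited_at: datetime) -> str:
    """Build a link to the archived capture nearest to the citation time.

    cited_at must be timezone-aware so that the conversion to UTC
    does not depend on the machine's local settings.
    """
    if cited_at.tzinfo is None:
        raise ValueError("cited_at must be timezone-aware")
    # Wayback timestamps are expressed in UTC.
    stamp = cited_at.astimezone(timezone.utc).strftime("%Y%m%d%H%M%S")
    return f"{WAYBACK_PREFIX}/{stamp}/{cited_url}"


# Hypothetical citation: a report linked on 24 April 2014 at 15:30 UTC.
link = dated_snapshot_url(
    "http://www.example.gov/report.html",
    datetime(2014, 4, 24, 15, 30, tzinfo=timezone.utc),
)
print(link)
```

Storing a dated link like this next to the live URL lets a reader see the version the author actually cited, even if the page has since been revised or removed. It only helps if a capture exists for that period.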

The Digital Future of FDsys and the Federal Depository Library Program: A Public Policy Analysis 
R. Eric Petersen, Specialist in American National Government, Congressional Research Service

  • Challenges
    • Access and service (tangible, digital, or both?)
    • Costs (less print distribution, but it still costs libraries to maintain)
    • FDsys – there is no good model for permanent digital retention > we will have to update software and touch digital assets to make sure access continues > ongoing investment and responsibility required > every 8-10 years will require entire overhauls and updates
    • Born digital materials > identification, retention, preservation, service
    • Tangibles > retention, digitization, consolidation, service
  • Lack of consensus around:
    • What is to be captured > how to count – websites / documents vs. records
    • How to capture and by whom > GPO / FDsys, originating agencies, third parties
  • Legislative change is slow without clear agreement regarding the solutions among stakeholders
  • Before Congress will engage, we need clear proposals that are broadly supported and offered by stakeholders and interested parties > they must cover issues such as enduring standards for digital retention, who collects and retains born digital content and tangible content, and how the costs will be managed

Panel Discussion: New Models of Access: The Role of Third Party Aggregators and Publishers

Susan Bokern, VP, Information Solutions, ProQuest

  • We all have different roles to play and there’s enough content to go around
  • ProQuest’s essential role is to add value to content
  • ProQuest is focusing on researchers and the improvement of workflow processes to create new research output > enabling researchers to access content more efficiently, providing tools to improve workflow, visualization and analysis tools, not just about content but also about context
  • The process of adding value begins with market research (surveys, advisory boards, focus groups to identify known and unknown needs) > creating acquisitions strategy to develop collection > preserving content or data > keeping the technology up to date > identifying where and how to obtain the content
  • ProQuest takes preservation seriously > content is stored on their own servers > currently exploring a longer-term storage and preservation solution (ex: Iron Mountain)

Robert Lee, Director of Online Publishing and Strategic Partnerships, East View Information Services

  • East View is an aggregator for academic institutions and a variety of international governments
  • Some example projects: GIS, big data, political rally ephemera
  • Big focus on content from Russia and China > not usually seeking or producing translations, but going after the information and data that’s not always available elsewhere or not the same as what’s provided in English
  • There is an operational risk that the information received could later be reclassified
  • In China, content can be made available and digitized very quickly but it can also disappear or be blocked quickly, too
  • Interested in exploring cross-platform solutions for content

Robert Dessau, CEO, voxgov

  • Voxgov harvests materials from over 10,000 web destinations each day > every 6 minutes the system looks for new URLs > 49 different types of documents (fact sheets, social media, congressional, Federal Register, speeches, etc.)
  • The collection process has evolved rapidly > learned to identify when a website’s format has changed to maintain quality intake of data > 18-22%, depending on the group, falls into the broken link category
  • Interested in tracking conversations from beginning to end to allow a much deeper and more comprehensive level of research
  • The involvement of third parties in the preservation and access process is inevitable
  • The value of mining the text we already have has not yet been realized
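
The harvesting cycle described above (poll a site, look for new URLs, notice links that have gone missing) can be sketched as a comparison between two successive polls. This is a minimal illustration under stated assumptions, not voxgov's actual system: the function names and the sample URLs are hypothetical, and a real harvester would also need fetching, scheduling, and format-change detection.

```python
from __future__ import annotations


def diff_poll(previous: set[str], current: set[str]) -> tuple[set[str], set[str]]:
    """Compare two polls of the same site.

    Returns (new_urls, missing_urls): URLs seen for the first time,
    and URLs that were present last time but are gone now.
    """
    return current - previous, previous - current


def broken_share(total_links: int, broken_links: int) -> float:
    """Percentage of checked links that turned out to be broken."""
    if total_links == 0:
        return 0.0
    return 100.0 * broken_links / total_links


# Hypothetical polls taken a few minutes apart.
earlier = {"http://www.example.gov/a", "http://www.example.gov/b"}
later = {"http://www.example.gov/b", "http://www.example.gov/c"}

new_urls, missing_urls = diff_poll(earlier, later)
print(sorted(new_urls))      # URLs to harvest
print(sorted(missing_urls))  # candidates for a broken-link check
print(broken_share(50, 10))  # share of broken links in a sample of 50
```

Keeping the previous poll as a set makes each comparison cheap, which matters when the same check repeats every few minutes across thousands of sites.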
