Dangerous Digitization?

•November 1, 2006 • 3 Comments

Our discussions last night really stoked alternative ideas to the cry of the “technical republic”: digitize, digitize, digitize!  Comprehensive acts of digitizing, undertaken without thought of the risks for the sake of convenience or that lesser-mentioned reason, money, could cause significant intangible losses: losses of provenance, intellectual curiosity, and scholarship.  I had mentioned the archival term “provenance,” or the original order and arrangement of the records.  There is an unspoken contract of sorts that researchers at the National Archives, and probably most others, enter.  It is an agreement that what they had is what you get (WTHIWYG).  While researching in linearity has its faults, it provides structure for hardened scholarship and intellectual creativity.  For example, when a researcher physically visits the National Archives and retrieves records of interest, that interest is metered out in small loads (nine cubic feet) at a time.  The provenance is naturally reinforced because the researcher can only have one folder, from one box, from one office within one agency at a time.  This acts as a “force of gravity,” focusing the researcher on learning the filing scheme, organizational priorities, and choices of the agencies at a specific point in time without pushing current realities down into their hypotheses about the past.

Intellectual curiosity could be substantially reduced by comprehensive digitization.  Intellectual curiosity, or the creativity of following leads throughout the documented event, is at risk because digitization will naturally become more and more dependent on algorithms and the people programming them.  This puts more distance between the creator and the reader and casts everything you see and analyze in the light of sophisticated programs.  Besides, selling out your entire archives for “30 pieces of bits and bytes” creates alienation and reduces the overall archives to just another data mart.  In my experience at the Archives, I have noticed an intangible activity which acts as a driving force for many researchers.  This force, for lack of a better term, is an intoxication with originality.  For example, among the most reproduced documents are our Charters of Freedom; they are found online, within our site and many others, in high resolution.  But millions of visitors still come to view the originals every day.  Why?  In the same way, many researchers want the experience of handling the original documents, and that experience compels them with a sense of stewardship to expose errors, reveal truths, illuminate injustice, and make a difference in our future society.  Furthermore, researchers are reluctant to reveal their life’s passion to anyone online, nor are they willing to go where everyone has been before.  Also, there are intrinsic differences between the original documents and their digitized counterparts.  I will call these differences the element of freshness.  Many researchers can review, let’s say, 60-year-old documents and tell whether anyone else has ever looked at them since their accession.  For example, you can see if the documents have received any conservation efforts, if the original staples have been removed, or whether the original rubber bands or fasteners are still intact.

I am not all doom and gloom over digitization.  I believe all the finding aids, or information about archives, should be digitized to facilitate searching and accessibility.  The most requested documents should also be digitized.


The National Archives or an Archives?

•October 31, 2006 • Leave a Comment

As an Appraisal Archivist, it is up to me, on behalf of the Archivist of the United States, to ensure the public has records to inspect that document how the government has conducted its business. We ensure the public, lawyers, students, historians, and scholars alike, have available evidence documenting the rights of American citizens, the actions of Federal officials, and the national experience. Unlike many historical websites or so-called “archival” sites, posting a small percentage of our holdings is not sufficient to hold our government accountable to its citizens. While it can be argued that making the billions of documents within our holdings digitally accessible and searchable would have a greater democratizing effect than our present model, we simply must be realistic. Should the government embark on such a global project? The Electronic Records Archives (ERA) will be a comprehensive, systematic, and dynamic means for preserving virtually any kind of electronic record, free from dependence on any specific hardware or software. ERA will change not only the way we preserve digital records; all our traditional business processes will be altered as well. At a staggering cost of $340 million, ERA does not presently include a massive project to digitize all of our holdings now in paper form.

Do we proceed with the more hands-off approach, or introduce more implicit interpretations by making some of our holdings available on the web? Many come to the Archives because of its abstinence policy (freedom from historical interpretation) and its provenance. The National Archives is composed almost entirely of permanent records, those I help appraise as having sufficient value to warrant continued preservation by the Federal Government. These permanent records, determined by appraisal pursuant to legislation, regulation, or administrative procedure, are no longer needed for current business within each agency. Although a “web archives” such as the American Memory site has millions of searchable and accessible primary documents, selective or interpretive publishing, in my opinion, would not preserve the National Archives’ greatest contribution to democracy nor fulfill its responsibility to our government and citizens.

The National Archives and GIS

•October 16, 2006 • Leave a Comment

Open access has in many ways been the benchmark for the internet.  I am not sure collaborative efforts through digital media will transform the archives into one big well from which Google or some other meta-giant can extract all its resources.  However, I do believe GIS holds promising potential for researchers, staff, and management of an archives.  An archives could develop spatial coordinates based on a much smaller geography than the earth’s surface.  Multiple layers, showing, for example, the location of series of records, how often they were referenced, feedback notes from researchers, appraisal reports, processing notes from archivists (e.g., volume), and original notes from the originating agency concerning the records, could be superimposed in a single environment for analysis and research.  Digitizing billions of paper documents does not seem feasible, but visualizing the data about the records for more efficient bibliographic searching, space allocation, and contextual analysis would be extremely beneficial to all within the archives.

I believe Linda Hill’s concept of georeferencing in digital libraries, the “application of georeferencing to all types of information and the integration of geospatial description, searching, and analysis into digital library practices,” is better suited to an archives than to a library.  An archives usually contains a much greater volume of records, which are semi-structured, partially processed, in multiple formats, and housed in several buildings and on several floors within them.  Perhaps geo-archival referencing could relate records (regardless of format) to spatial locations through spatial coordinates.
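To make the idea concrete, here is a minimal sketch of what geo-archival referencing might look like in code, assuming a simple (building, floor, row, shelf) coordinate scheme; the class names, series titles, and locations are all hypothetical, not an actual NARA system:

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class ShelfCoordinate:
    """A spatial coordinate inside an archives: building, floor, row, shelf."""
    building: str
    floor: int
    row: int
    shelf: int

@dataclass
class RecordSeries:
    """A series of records tied to a physical location, with layered metadata."""
    title: str
    location: ShelfCoordinate
    reference_count: int = 0   # layer: how often the series has been pulled
    appraisal_notes: str = ""  # layer: appraisal or processing notes

def series_near(series_list, building, floor):
    """One 'map layer': every series stored on a given floor of a building."""
    return [s for s in series_list
            if s.location.building == building and s.location.floor == floor]

# Usage: two hypothetical series shelved in the same stack area
a = RecordSeries("RG 60 Correspondence", ShelfCoordinate("Archives II", 3, 12, 4), 27)
b = RecordSeries("RG 60 Case Files", ShelfCoordinate("Archives II", 3, 12, 5), 9)
print([s.title for s in series_near([a, b], "Archives II", 3)])
```

Each additional layer (reference counts, researcher feedback, agency notes) is just another attribute queried over the same coordinates, which is the superimposition described above.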

Remix Culture and APIs Oct 3

•October 10, 2006 • Leave a Comment

I have been contemplating the impact of APIs, interoperability, remix culture, folksonomy, and Web 2.0 on archival theory and practice. Reading about such topics is much like sprinting out beside a marathon runner, with all the excitement of keeping up until conditioning and experience leave you far behind. Sustaining a high technical literacy is an enormous task; it challenges (in my opinion) every archival element from appraisal to reference in ways few conceive today. Tomorrow’s historians and archivists will have to keep abreast of searching, indexing and auto-categorization, data mining, and data analysis technology as often as IT specialists keep current with technologies today. Archivists in the future could be threatened by these new fluid boundaries between production and selection. The universe of information available through the infinite reach of APIs is staggering. The disparate efforts, small and large, of folksonomies, learning ecologies, and the researchers of Paul Miller’s interoperability, who require access to information from a wide range of sources, will challenge traditional archivists in ways unforeseen. With much of the technology driven by business, much of the archival holdings could be farmed out or simply converted into just another silo of information, with automated processes which rival the discipline itself.

While few silo historians and archivists have so much as heard the terms interoperability, folksonomy, and Application Programming Interfaces, the technology exists today to allow researchers to span their searches across independent databases, agency archival databases, commercial databases, and museum holdings, all at the same time, from home.
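The mechanics of such a spanning search are simpler than they sound. Here is a minimal sketch of a federated search in which every repository exposes the same `search(term)` interface and a single query fans out across all of them; the repository names and holdings are hypothetical:

```python
def make_repository(name, holdings):
    """Wrap a list of item titles in a simple search function."""
    def search(term):
        return [(name, title) for title in holdings if term.lower() in title.lower()]
    return search

# Three independent, hypothetical sources behind one common interface
repositories = [
    make_repository("agency_archive", ["Treaty correspondence 1921", "Budget memos"]),
    make_repository("museum_catalog", ["Treaty pen and inkstand", "Portrait, 1921"]),
    make_repository("commercial_db", ["Newspaper clippings on the treaty"]),
]

def federated_search(term):
    """Fan one query out across every repository and merge the hits."""
    hits = []
    for search in repositories:
        hits.extend(search(term))
    return hits

print(federated_search("treaty"))
```

In practice each `search` function would call a remote API rather than scan a local list, but the pattern, one interface, many independent sources, is the same.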

I am not trying to sound the alarm, but the inevitable intrusion of technology is coming quickly to the well-established disciplines of the archival profession.

Large Scale Databases

•September 26, 2006 • 1 Comment

The readings this week were quite interesting.

Asking the “so what?” question about the “Methodology for the Infinite Archives,” William Turkel is raising questions about digital history which I believe will one day obliterate the heart of archival functions as we know them. The transformation of providing “instant access to the contents of the world’s libraries and archives” is, I believe, a much bigger history than Mr. Turkel suggests. Archivists, librarians, curators, records managers, and information technologists should all look more “closely and critically” at the very nature of their job functions. I believe the demand for more efficiency and economy is driving the need for new skill sets: digitizing existing sources and exposing repositories through APIs. We are less likely to be driven by the ideals of history, such as widespread literacy and not repeating the ills of past decisions. There is more conversation about migration and metadata than about provenance. Are we pacing the halls of Jorge Luis Borges’s Library of Babel at the speed of business? Finding particular patterns, determining meaningful relationships, categorizing documents, and extracting essential information may presently be the work of information theorists, but they are tomorrow’s prerequisites for knowledge workers (analysts, programmers, records managers, archivists, etc.).
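Even the simplest version of that knowledge work, categorizing documents by their content, can be automated today. Here is a minimal sketch that scores a document against hand-picked keyword lists and assigns the best-matching category; the categories and keywords are hypothetical, chosen only to echo archival work:

```python
from collections import Counter

# Hypothetical categories, each defined by a small keyword set
CATEGORIES = {
    "appraisal": {"value", "retention", "schedule", "disposition"},
    "reference": {"request", "researcher", "access", "copy"},
}

def categorize(text):
    """Assign the category whose keywords appear most often in the text."""
    words = text.lower().split()
    scores = Counter()
    for category, keywords in CATEGORIES.items():
        scores[category] = sum(1 for w in words if w in keywords)
    best, count = scores.most_common(1)[0]
    return best if count > 0 else "uncategorized"

print(categorize("researcher request for access to a copy of the case file"))
```

Real systems replace the keyword lists with statistical models trained on examples, but the shape of the task, text in, category out, with no archivist in the loop, is exactly what the paragraph above anticipates.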

From the SAA Guidelines for the Graduate Program (http://www.archivists.org/prof-education/ed_guidelines.asp#_ftn1): “Archivists, like all professionals, must rely on knowledge, methods, and perspectives from beyond their own discipline. Archivists need to be knowledgeable about significant theories, methods, and practices of some or all of these fields. Archivists help to secure society’s cultural heritage, protect legal rights and privileges, and contribute to the effective management of a wide range of institutions. Without a careful selection of records, our social, cultural, institutional, and individual heritages will be lost.”

I believe archival science is somewhat (LWR) lost without this realization today. Even modest APIs can produce volumes of synthesized knowledge not originally envisioned by many archivists today. Repositories of knowledge exposed to algorithms, ranking schemes, and text analysis challenge every aspect of archival science in terms of who, and how many, professional archivists will be devoted to the craft in the future. The categories of identification, selection, protection, organization, accessibility, and description of archival records at the heart of archival science will not only be radically redefined but replaced by a machine in the future. Of course, there will be human analysts who will oversee and carefully review the more critical material. However, I believe the historians, with their more interpretive functions, will survive with more employment buoyancy.

Inexpensive data warehouses in a collaborative environment could produce mega-archives accessible to the most casual user from around the world. Although it is not the focus of Mr. Cohen in his article “From Babel to Knowledge: Data Mining Large Digital Collections,” one can only wonder: where does the digital history race end?

The U.S. Justice Website Structure

•September 12, 2006 • Leave a Comment

The Federal government web page for the U.S. Department of Justice, www.usdoj.gov, is composed almost entirely of static pages delivering such information as publications, mission statements, and highlights of Justice activities.  The website uses three external style sheets (CSS) to control the look and feel of several sub-sites arranged according to sub-departments (DOJ agencies).  A quick look at the source of index.html reveals the following:

<title>United States Department of Justice</title>

<link rel="stylesheet" href="/css/usdoj.css">

<link rel="stylesheet" href="/css/doj_flyout.css">

<link rel="stylesheet" href="/opa/pr/press_release.css" title="PR specific style">

<script type="text/javascript" src="/scripts/doj_flyout.js"></script>

<script src="/scripts/mouseover.js" language=javascript></script>

<script src="/scripts/slideshow.js" language=javascript></script>

<script src="/scripts/enlarge.js" language=javascript></script>

As the source indicates, JavaScript is the primary scripting language used throughout the site.  The CSS is composed of floating palettes, with the most important and navigational information in the prominent top-left corner.  There are flyout and drop-down menus which help the user navigate by topic and organization.  Ironically, the dysfunctional search engine is easy to find.  There are limited images, usually confined to identifying important officials.  However, there is a JavaScript-run slideshow highlighting photographs of important or recent DOJ activities.  Following a link to a DOJ department website on a different server, www.fbi.gov, leads to a back-end database application soliciting tips about criminal activity.  These tips could be compiled and shared with other law enforcement or intelligence entities.
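The inventory of stylesheets and scripts above can be pulled out of a page programmatically. Here is a minimal sketch using Python’s standard-library HTML parser; the markup it feeds in is an abbreviated fragment of the usdoj.gov source quoted above, not the full page:

```python
from html.parser import HTMLParser

class AssetLister(HTMLParser):
    """Collect the external stylesheets and scripts a page references."""
    def __init__(self):
        super().__init__()
        self.stylesheets, self.scripts = [], []

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "link" and attrs.get("rel") == "stylesheet":
            self.stylesheets.append(attrs.get("href"))
        elif tag == "script" and attrs.get("src"):
            self.scripts.append(attrs["src"])

# Abbreviated fragment of the index.html source quoted above
page = '''<link rel="stylesheet" href="/css/usdoj.css">
<script src="/scripts/doj_flyout.js"></script>'''

lister = AssetLister()
lister.feed(page)
print(lister.stylesheets, lister.scripts)
```

Run against the full source, the same parser would list all three stylesheets and four scripts, which is one quick way to audit how a site’s look and behavior are wired together.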

Class discussion of the HistoryWired Website

•September 5, 2006 • 1 Comment

I analyzed the effectiveness of using map technology within the website HistoryWired: A Few of Our Favorite Things. How does the practice of this new technology inform, educate, and entertain web visitors of the Smithsonian Institution’s national museums?

The adaptation of the popular multi-dimensional Money Map tool is supposed to allow users to view vast amounts of exhibit holdings in a simple, graphical, and interactive format. I think it leaves more questions than answers. It does empower visitors to participate in making a claim for the most-visited objects, but does this allow them to make more efficient and informed decisions about the holdings of the Smithsonian? There seems to be a natural order in allowing Money Map users to measure performance, risk, and other value changes in the stock market. Highlighting patterns and anomalies to make more informed decisions about making money appears to create an audience motivated to use mapping technology. But can the general public formulate any meaningful context to help them understand and experience the Smithsonian’s holdings in a more meaningful way?

In my opinion, the mapping technology does not come across as especially relevant to the historical and curious audience. Even the most ardent web traveler would not spend the time perusing this maze and transcribing its hieroglyphs in order to figure out what more is there, or, for that matter, the graphical representation of the percentage of those who have been there before. The questions which naturally percolate in the minds of many guests visiting a museum are not about how many people have seen the last exhibit. In my opinion, visitors ask, “What more do you have?” and “How can I interact with it?”

There seems to be a loss of mission at the most and a loss of orientation at the very least. The ill-chosen colors, popping text, busy mouseovers, and the lack of a relationship between the timeline and the drop-down categories could cause attention fatigue and perhaps other incalculable losses. My navigation experience was frustrating. It was as though I were within the bowels of a museum, wandering the halls of an unexplored basement. And though I had unprecedented access to the holdings, I was given a limited map indicating the most-visited objects instead of one providing important contextual information, such as the value of the objects or their spatial relationships to other holdings or the building itself.

I believe new dimensions could have been explored, such as graphical interpretations of the volume of objects compared with well-established categories such as Women’s Suffrage. This would provide some evidential features as to institutional effectiveness and direction. Knowing the number, the types, the estimated value, and the representative volume of objects compared with others within the same genre or category adds the multi-dimensional aspect sought after and may quench the thirst of both the novice and the expert.

In short, it is an impressive display of adapting technology with precision to reach an unspecified audience concerning a few of their favorite things. The HistoryWired presentation causes the visitor to “wander” through its holdings rather than wonder about them.