Low Output

I was reading one of the blogs of my coursemates and the way that she spoke about creating a lot of digital photos got me thinking.

I have noticed over the years that compared to most people I do not personally generate a lot of digital content.  I have always been somewhat tech savvy and I love technology and the internet, but that doesn’t seem to have translated well into content generation.

Even writing the course blogs and tweeting has been pretty difficult for me.  It mostly feels like I’m seeding messages out to the void.  Of course, it’s a little different with an audience that is made to read my output as a condition of passing the course!

Mostly, I think the lack of a concrete goal in producing this kind of material has been difficult for me.  I feel that I do better when I have specific criteria to meet.

Is anyone else having a similar experience, or is it just me?

The Bogotá Manhattan recipe + markup

I have to say that I was a little bit incredulous when I first considered using the itemprop attribute to mark up the ingredients of a recipe for search engines to detect. Who would want a clumsy list of recipes compiled by a search engine when they typed “Recipes with Sriracha,” rather than a human-written article or blog post with hand-selected recipes that are representative of the way Sriracha can be put to use? In fact, the first results that pop up on that Google search are exactly that. This probably suits a more direct query like the one above; however, I realized that the real value of marking up ingredients is that it allows Boolean searches for recipes by ingredient.

One of the most interesting implementations of this is the site http://www.supercook.com/, which returns the recipes that most closely match the ingredients that you have on hand.  Not only that, when I sampled the recipes that were returned, the ingredients were all marked up semantically.  So it seems there is a real, tangible benefit to taking the time to think about the metadata encoding on your webpage.
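To make the idea concrete for myself, here is a minimal Python sketch of how a program might pull marked-up ingredients out of a page and check them against what is on hand. Everything in it is an assumption for illustration: the sample HTML is made up, the itemprop value recipeIngredient comes from schema.org's Recipe type, and the function names are my own. It is not how the site above or any search engine actually works.

```python
from html.parser import HTMLParser

# Made-up page fragment for illustration; "recipeIngredient" is the
# schema.org Recipe property for a single ingredient.
SAMPLE_HTML = """
<div itemscope itemtype="https://schema.org/Recipe">
  <h1 itemprop="name">Example Manhattan</h1>
  <ul>
    <li itemprop="recipeIngredient">rye whiskey</li>
    <li itemprop="recipeIngredient">sweet vermouth</li>
    <li itemprop="recipeIngredient">bitters</li>
  </ul>
</div>
"""


class IngredientParser(HTMLParser):
    """Collects the text of every element tagged itemprop="recipeIngredient"."""

    def __init__(self):
        super().__init__()
        self.ingredients = []
        self._capturing = False

    def handle_starttag(self, tag, attrs):
        if dict(attrs).get("itemprop") == "recipeIngredient":
            self._capturing = True
            self.ingredients.append("")

    def handle_data(self, data):
        if self._capturing:
            self.ingredients[-1] += data

    def handle_endtag(self, tag):
        # Simplification: assumes ingredient elements contain no nested tags.
        self._capturing = False


def extract_ingredients(html):
    """Return the set of ingredient names found in the markup."""
    parser = IngredientParser()
    parser.feed(html)
    return {item.strip().lower() for item in parser.ingredients}


def can_make(recipe_ingredients, pantry):
    """Boolean AND over ingredients: true if the pantry covers all of them."""
    return recipe_ingredients <= pantry
```

Calling extract_ingredients(SAMPLE_HTML) and passing the result to can_make along with a set of pantry items gives a simple yes or no answer. A real service would presumably also rank the near misses, which this sketch does not attempt.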

Introduction to XML

XML

At first, I was a little apprehensive about having to become familiar with yet another language, but I was really pleased to find how similar XML is to HTML and how easy it is to interpret the tags.

It seems to me that most of the rub in working with XML is becoming familiar with the particular schema that you are presently working with.  Some of the comfort of learning a language like Python or Ruby is that it is fairly monolithic and that most things that you do in one project can be done in much the same way in another.  It seems to me that this is less the case in XML, but then again, I’m not sure how much work will be done with the form boxes in an editor vs actually tagging elements by hand.
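As a rough illustration of that point, here is a small Python sketch using the standard library's xml.etree.ElementTree. The record and its element names (record, title, date, subject) are hypothetical and do not come from any real schema. The parsing calls would be the same for any XML document, but the names passed to them would have to change for each schema, which is the part that takes the getting used to.

```python
import xml.etree.ElementTree as ET

# Hypothetical record; these element names are invented for illustration
# and do not follow any particular metadata schema.
SAMPLE_XML = """
<record>
  <title>Example Gazette</title>
  <date>2014-02-11</date>
  <subject>Local news</subject>
  <subject>Weather</subject>
</record>
"""


def read_record(xml_text):
    """Pull a few fields out of a record that uses the made-up schema above."""
    root = ET.fromstring(xml_text)
    return {
        "title": root.findtext("title"),
        "date": root.findtext("date"),
        # A repeatable element is collected into a list.
        "subjects": [subject.text for subject in root.findall("subject")],
    }
```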

As an aside, I was pleasantly surprised to see the “Rose” poem used as an example in the above link.  Benjamin Britten set it to music in one of the movements of the wonderful Serenade for tenor, Horn and Strings.  I think it would be interesting to see the differences in the metadata used to describe the poem as a poem and the poem set to music as part of a larger work. 

File Naming

File Naming Tutorials

I really enjoyed this video series.  It has lots of great advice that I thought was common sense before I started working with other people’s digital documents!

Like the video and my favorite webcomic xkcd, I’m also a big advocate of the YYYY-MM-DD date format.  I’m also a big fan of DD-MMM-YYYY, as in 11-Feb-2014, which I think is even less ambiguous, but it has the huge disadvantage of not being machine sortable.  I’m currently in the process of renaming our digitized newspapers so they can conform to these standards.  It’s tedious work, but I’m happy knowing that I’m doing my small part to help preserve these records.
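The sorting problem is easy to see in a few lines of Python. This is only a sketch with made-up dates. The to_iso helper is my own name, and it assumes an English locale so that %b recognizes abbreviations like Feb.

```python
from datetime import datetime


def to_iso(date_text):
    """Convert a DD-MMM-YYYY string such as 11-Feb-2014 to YYYY-MM-DD."""
    return datetime.strptime(date_text, "%d-%b-%Y").strftime("%Y-%m-%d")


# Made-up dates for illustration.
dates = ["11-Feb-2014", "02-Jan-2015", "30-Dec-2013"]

# A plain text sort compares the day digits first, so the years end up scrambled.
text_sorted = sorted(dates)

# Converted to YYYY-MM-DD, the same plain text sort is also chronological.
iso_sorted = sorted(to_iso(d) for d in dates)
```

Here text_sorted comes out as 02-Jan-2015, 11-Feb-2014, 30-Dec-2013, which is exactly backwards, while iso_sorted runs from 2013-12-30 to 2015-01-02 in the right order.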

Coming back now to the other aspects of file naming, I’m always shocked at the people who are perfectly happy with important documents like membership lists having names like untitled.doc or worksheet(1).xls.  I try to gently encourage people to give them more descriptive names and to explain the reasoning behind doing so, such as, “You might have to find it later!”

Metacrap

http://www.well.com/~doctorow/metacrap.htm

This is a great read; the author certainly has the dry type of humor that appeals to me.

I’m glad to see that the article ended with a “Don’t worry too much about it” attitude.  The same factors that keep us from reaching metadata enlightenment are present in just about all human endeavors and it hasn’t been too devastating yet.

It is good, however, to keep these aspects of human nature in mind when reviewing your own or someone else’s metadata.  I’m reminded of the biases that we as a profession have inherited with certain legacy aspects of our field.  Just take a look at how the 300s section of Dewey is broken down, or look at the changes that have been made, and still need to be made, to the Library of Congress Subject Headings.  It would be pretty easy, particularly with more open-ended metadata content such as descriptions, to misrepresent the item being described.

Where it all went wrong

http://nathan.torkington.com/blog/wp-content/uploads/2011/11/Where-It-All-Went-Wrong.pdf

I really enjoyed reading this article, even though, or maybe especially because, it was so negative.

One of the author’s points that stuck with me was how libraries tend to have crap access to our digital collections. While that’s a trend that I have noticed as well, anyone who has priced getting a sleek and modern system can tell you the reason: it can easily add thousands of dollars to the cost of the project.  I can look at the California Digital Newspaper Collection with envious eyes all I want, but a similar system will not be implemented on my current digitization project.  I am not able to afford one and I do not have the technical ability (or the time!) to produce and maintain one.

Perhaps the best and most difficult to address point that the author made is how Google is the only source that many people go to for information.  It is, of course, much easier to access than any library, which would be bad enough for us, but I have, at least anecdotally, noticed another troubling trend: many young people regard Google not as an authoritative source, but THE authoritative source.

While the author made many strong points, after spending an entire afternoon trying to network a Mac to our copier, I must disagree with the idea of them being made for connectivity.