Bringing open tools to public-domain literature

Open Shakespeare presented at NESTA Event

July 8, 2011 in Musings, News

My trip to speak at a ‘digital day’ organised as part of the new ‘Digital Fund for Arts and Culture’ by NESTA (National Endowment for Science Technology and the Arts) was eye-opening, to say the least. I thought I’d put a few of my reflections, general and specific, down in this short post.

About halfway through the day I noticed that little had been said about social media: I mentioned Twitter in my presentation about Open Shakespeare, but Facebook (even in a discussion devoted to ‘social media and user-generated content’) was largely absent. Thinking about why this might be, I imagine several reasons: first, a lack of understanding of quite how important Facebook now is in internet usage; second, the absence of experience in managing a successful Facebook-based fan network; and, in relation to this, third, the peculiar language of ‘likes’ and so on specific to Facebook, and the difficulty of communicating what may be an original artistic project in the standardised vocabulary of such a platform. For a more developed reflection about this point, do have a look at Patrick Hussey’s thoughts on ‘community managers’.

Although people weren’t talking about social media, they were talking about the annotator used on Open Shakespeare. Everyone was agreed that it would almost certainly grow very big, yet also that, before it did, a few things needed to be put in place, namely:

  • Versioning: i.e. a freely annotatable text, from which annotations gradually move to a more established version.
  • Login: crucial to filtering annotations
  • Tagging: for filtering; already in place, but needs to be simplified

If we want to extend the annotator beyond Shakespeare, and really increase its use, one delegate pointed out how well adapted science fiction would be to the tool. First, science fiction readers tend to be more tech savvy; second, science fiction (like fantasy) often teaches its readers about its world as they read, thus providing information for retrospective annotation without too much additional research (as opposed to Shakespeare, who often demands a grip of sixteenth/seventeenth century England); finally, perhaps one of the most famous science fiction writers of all time, H P Lovecraft, is almost completely in the public domain…

Last but not least in this rag-tag post, a point about some of the other things I heard during the day. Andrew Nairne, Director of the Arts at the Arts Council, spoke about how £20m had been allocated for digital/artistic collaborations, for which the NESTA scheme serves as a pilot. He spoke of “digital” as an “operating context” (so both a context in which to operate, and one, I presume, that operates upon the content delivered through it), yet also underlined the ability of technology to serve the arts, “accelerating and enhancing”. Finally, he and several others pointed to the utility of adopting a “gaming” model for online art, partly, I feel, in an effort to overcome one of the many instinctive fears of arts organisations, whose presence resounded through the beautifully modern NESTA suite from time to time throughout the day.

Open Shakespeare at OKCon 2011

July 3, 2011 in Musings, News, Shakespeare, Technical

OKCon 2011, at the Kalkscheune buildings in Berlin, was fantastic, and I thought it would be a good idea to publish a few reflections on some of the stuff that was going on there, both for the benefit of those who neither made it nor watched the live feeds, and for the chance it offers of mapping Open Shakespeare’s position in the wider Open Knowledge community.

Rufus Pollock provided the opening address, pointing out how the convergence of the two phenomena of greater data availability and advanced computing power had created the perfect conditions for openness to flourish. He announced one such flourishing in the form of datacatalogs.org, which came online at the start of the conference. His next point was to argue that the focus of activities in the community was moving from making data accessible to providing tools for and building communities around that data. Of course, the quantity problem is only half solved (a later speaker pointed out the small quantities of open government data in Asia, for example), but the community had still reached a point where data cycles (ecosystems of community, tools and data) could be founded. This last point fits neatly with Open Shakespeare, since the project is slowly forming just such a cycle: early editions of Shakespeare’s plays are open data, and a small community is either building tools (like the annotator) or using them to create more content about Shakespeare’s works, which in turn offers new programming challenges and so completes the circle.

Glyn Moody’s keynote talk, immediately following Rufus’, approached the topic of Open Knowledge from a different angle, by analysing the current situation in terms of a new abundance which placed pressure on systems, such as the UK’s copyright law, designed for eighteenth-century conditions of scarcity. Although Moody did not mention it, Shakespeare himself was something of a forerunner in this domain: the “fourteen years plus fourteen more” model of copyright established in 1710 was the result of bookseller lobbying, not least that of Jacob Tonson, eager to protect his monopoly on the works of Shakespeare and others (notably Milton, and Dryden’s translations of Virgil). Having sketched out his model of abundance and scarcity, Moody concluded with the provocative question of how open projects would function without copyright, pointing out that many in fact depend upon restrictive legislation as their raison d’être. The only answer that I can give is that open projects would perhaps continue as the first models of communities where exchange and collaboration are well established (as in Open Shakespeare), that is to say, as those “data cycles” and “ecosystems” that Pollock had described as the successors to the victories of open data availability.

Later on in the conference, in the second track of talks, a panel on ‘Data Journalism: What Next?’ provided considerable food for thought on the topic of communities, much of it served up by the Guardian’s Simon Rogers. It was he, for example, who questioned the merits of crowd-sourcing, arguing that it did not provide objective data, since its contributors could be extremely biased, an MP participating, for instance, in the crowd-sourced analysis of his own expenses. This point was backed up by Stefan Candea, with both him and Simon Rogers emphasising the important labour that remained for the journalist when it came to looking over crowd-sourced responses and shaping them into a story. A neat example of this was the Guardian’s exploration of Sarah Palin’s emails, where users were directed to a random email and then asked to signal anything of interest. Although not flawless (one imagines a Palin aide slaving away to hide significant correspondence), its randomness nevertheless provided an even coverage of the files. This randomness might be an important tool for Open Shakespeare’s own crowd-sourcing of annotations, as a way of directing users to annotate less-appreciated works. As regards the verifiability of these annotations, Open Shakespeare has the problematic luxury of considering subjective opinion on the Bard’s art as valid as objective facts about it, since these opinions map the contours of contemporary attitudes to Shakespeare. Further, the intense subjectivity of responses to art means that such subjective annotations do not suffer from the problem of verifiability, because no such critical response has ever been verifiable (for those interested, this line of argument is behind Kant’s description of “universal subjective validity” in his Critique of the Power of Judgment).
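To make the idea of random direction a little more concrete, here is a minimal sketch in Python of how a visitor might be pointed towards a work to annotate. It is not how the Guardian's tool or the Open Shakespeare annotator actually works: the function name `pick_work`, the weighting rule and the example counts are all assumptions made for illustration. Where the Guardian (as described above) sent users to a uniformly random email, this variation weights the draw so that works with fewer annotations come up more often.

```python
import random


def pick_work(annotation_counts, rng=random):
    """Pick a work at random, favouring those with fewer annotations.

    annotation_counts maps a work's title to its current number of
    annotations. Each work gets weight 1 / (count + 1), so an untouched
    work is the most likely choice but every work can still be drawn.
    (The weighting rule is an assumption chosen for this sketch.)
    """
    titles = list(annotation_counts)
    weights = [1.0 / (annotation_counts[t] + 1) for t in titles]
    return rng.choices(titles, weights=weights, k=1)[0]


# Illustrative counts only; these are not real figures from the site.
example_counts = {"Hamlet": 300, "Troilus and Cressida": 4, "Pericles": 0}
suggestion = pick_work(example_counts, random.Random(0))
```

Passing a seeded `random.Random` makes the draw repeatable for testing; in ordinary use the default module-level generator would do.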

It is on this idea of subjective annotation, the generation of subjective data, that I would like to bring this summary to a close. The conference was on Open Knowledge, but it is significant that I found the adjective to have been discussed far more often than the noun. Open Shakespeare’s annotation system, the tool that generates its data cycle, provides both verifiable information (“mirth in funeral” is an example of “synoeciosis” in Hamlet) and subjective opinion (“Words, words, words” is, for one user, “one of the most human lines in the play”). Is the second still data? I would argue that it is, but it is of a kind rarely discussed in Berlin. After all, what are we to do with it in order to integrate it back into the system of open data? Such opinion does not atomise easily, just as Shakespeare’s own words resist, with their context and their double meanings, computerised analysis. We can count the instances of the word “prune”, but it takes an article on the subject to bring out the humour from the information generated by the open-source tool. That article itself is data and can be itself the launch pad for new responses, but it moves the axis of the cycle away from developers’ tools and their data and towards the perspective of the user and, more broadly, that of the community. Rufus Pollock was right to argue for the existence of ecosystems of open data, but the case of Open Shakespeare shows that they can only be fully functional if all three elements are given their full weight: tools, data, and users together.

“Time travels in diverse paces”: An Update on Open Shakespeare

June 26, 2011 in Community, Musings, News, Shakespeare

May and a month that has only belatedly met the standard of what Shakespeare calls “hot Junes” have passed since last I wrote an update about Open Shakespeare. As ever, quite a bit has been done on the project, and there remains much more to do in the future.

If one word could sum up the work of May and June, it would be ‘users’. These two months have seen our online presence, especially on Twitter, grow: over four hundred and twenty annotations have now been written, and we have been followed by, amongst others, a Tory MP and the artistic director of the Boston Actors’ Shakespeare Project. In order to provide a regular stream of new content for our followers, weekly articles on Shakespeare’s words have been posted over the last eight weeks, those on “dawn” and “drawer” attracting the most interest.

There is no single word with which to encompass our plans for the future. A study of how people use the website, and especially the annotator, is currently underway, the conclusions of which will soon be presented at OKCON 2011, and – if all goes well – in journal format also. One recommendation will be to establish ready-made categories for annotations, in order to make organisation of the comments much easier. Whilst studying the data, it also occurred to me that the website could be extended with the incorporation of famous past annotations, such as those comments made by Johnson and Pope when they each edited Shakespeare’s works in the eighteenth century.

Of course, we need not only incorporate the annotations of Johnson and Pope into Open Shakespeare: we could also expand Open Shakespeare to Open Literature and include their creative work too. Indeed, just such an expansion is likely to take place over the summer, and we would love to hear about any ideas people have for Open Literature: whether, for example, there is a particular (out of copyright) author you would like to see uploaded soon or whether you simply have some thoughts about the layout of it all. As ever, you can get in touch through the website, post to the open literature mailing list, or best of all, add to the new Open Literature Wiki.

Open Shakespeare: March and April

April 30, 2011 in Community, Minutes, Musings, News, Shakespeare

Annotation Sprint II

Our second annotation sprint, taking place at the end of the Cambridge University term, attracted contributions from all over the internet, particularly from the States. In Cambridge itself, our volunteers continued working on Hamlet, bringing the total number of annotations on this text to nearly 300.

Since this sprint, we have overhauled the aesthetics of the annotator, and added the ability to tag annotations. Work has also begun on other plays by Shakespeare, including: Henry IV pt 1, Much Ado about Nothing, Troilus and Cressida, and more.

Outreach

The project continues to appear at various events in and around Cambridge. Upcoming appearances include:

  • ‘Humanities Research: the future might be digital’, 11am – 4pm 10th May 2011, CRASSH (Centre for Research in the Arts, Social Sciences and Humanities).
  • ‘Food for Thought’, 2pm – 5pm 27 June 2011, English Faculty Library, Cambridge.

We have also begun collaboration with local schools in Cambridge in order to test the utility of the annotator tool for Key Stage 3 students of Macbeth.

Online Editions of Shakespeare

January 15, 2011 in Community, Musings, Technical, Texts

The story of Shakespeare on the internet is a tangled tale, and this post is an attempt to unravel it. In expounding the advantages and shortcomings of online editions, I hope also to explain a few of the problems Open Shakespeare faces.

Editions Used by Open Shakespeare

Every work on the Open Shakespeare website has three possible texts, and it is worth explaining their provenance here in detail:

GUTENBERG FOLIO – These are drawn from Project Gutenberg, with the editorial prefaces removed. Nothing else has been changed. The Gutenberg scanner claims that the text “is as close as I can come in ASCII to the printed text”; however, it is important to record here several features of his methodology.
- Some spelling “mistakes” have been corrected according to a dictionary created from the spellings of the Geneva Bible and Shakespeare’s First Folio.
- Typos and abbreviations have also been “corrected”
- “Elongated S’s have been changed to small s’s and the conjoined ae have been changed to ae.”
- The actual text itself is composite, made from “30 different First Folio editions’ best pages”

GUTENBERG – Again taken from Project Gutenberg, this time from a more fully edited edition, with a cleaner layout, and the inclusion of 18th century stage directions. Open Shakespeare, as is usual for us, has removed all the prefatory material but kept the edited text as is. Unfortunately, nothing is disclosed about the process of editing or the source texts used except for the single phrase “This etext was prepared by the PG Shakespeare Team, a team of about twenty Project Gutenberg volunteers.”

MOBY – This text comes from the most widely available online edition of Shakespeare, of whose advantages and shortcomings there is a useful summary on the Open Source Shakespeare website.

Other Online Editions: ISE and Wordhoard

ISE

The principal website for online editions of Shakespeare is ISE (Internet Shakespeare Editions) where the following are offered, taking their entry for Hamlet as an example:

TEXT EDITIONS – These cover modern spelling and unmodified spelling versions based on the first folio and quarto 1 and 2, all of which have been edited. In the case of Hamlet this editing has been done by David Bevington, a scholar of some note. For other editions, the editors are less well known, and in many cases there has not yet been a peer review.

FACSIMILES – This is perhaps the real strength of ISE: several different First Folios have been scanned, and the results are very impressive. They also have facsimiles of the 1603 and 1604 quartos of Hamlet.

ANNOTATED EDITIONS – One of these does not yet exist for Hamlet, but David Bevington has again produced a useful peer-reviewed edition of As You Like It, on which one can toggle his annotations and record of collations.

COPYRIGHT – Everything on the ISE is under a variety of copyrights. The copyright for the edited texts is owned by the editor, and the images that make up the facsimiles have a rather ambiguous copyright situation, depending on their source. Although ISE state that “All items published on the site of the Internet Shakespeare Editions…may in all cases…be used for educational, non-profit purposes”, quite where an Open License website like our own fits in is deeply ambiguous, since material published on our website could feasibly be used for commercial purposes.

Wordhoard

Provided by Northwestern University, this website provides a set of texts worthy to serve as definitive online editions of Shakespeare. Along with other authors’ works, one can download two versions of Shakespeare’s writings: one encoded in TEI, the other linguistically annotated – which is to say every word in the text is associated with a lemma and part of speech.

For me, the most exciting part of this project is the way in which these lemmatized texts can be manipulated. Northwestern University gives one example: a short program written to answer the question ‘Does Shakespeare use mostly the same vocabulary in each of his works, or does he use different vocabulary?’. I recommend visiting the website for the answer, and for a wealth of other little bits of information about Shakespeare’s vocabulary.
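As a rough illustration of the kind of question a lemmatized text makes easy to ask, the sketch below compares the vocabularies of two works by measuring how much their sets of lemmas overlap (the Jaccard similarity). This is not Northwestern's program, and it does not use Wordhoard's file format: the function name `vocabulary_overlap` and the toy lemma lists are assumptions, and the sketch simply supposes that each work has already been reduced to a list of lemmas.

```python
def vocabulary_overlap(lemmas_a, lemmas_b):
    """Return the Jaccard similarity of two works' vocabularies.

    Each argument is a list of lemmas, one per word of the work.
    The result is the number of distinct lemmas the works share,
    divided by the number of distinct lemmas found in either:
    1.0 means identical vocabularies, 0.0 means nothing in common.
    """
    vocab_a, vocab_b = set(lemmas_a), set(lemmas_b)
    if not vocab_a and not vocab_b:
        return 0.0  # two empty texts: define the overlap as zero
    return len(vocab_a & vocab_b) / len(vocab_a | vocab_b)


# Toy lemma lists, invented for illustration.
work_one = ["be", "or", "not", "be", "question"]
work_two = ["be", "not", "proud", "death"]
overlap = vocabulary_overlap(work_one, work_two)
```

A single figure like this says nothing about which words differ, so in practice one would probably also want to inspect the lemmas unique to each work.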

The copyright position of the Wordhoard project is complicated. However, the website’s stance is far more ‘open’ than that of the ISE, so collaboration between Wordhoard and Open Shakespeare may be a possibility in the future.

Shakespeare and Media

July 29, 2010 in Community, Musings, News, Publicity, Texts

I spent much of this afternoon perusing the materials available at Shakespeare’s Staging, after its director got in touch with Open Shakespeare. Amongst all the images of past productions, my favourite was one of the earliest: a drawing of Edward Kean as Bertram in All’s Well that Ends Well. I find you get a real sense of Bertram at a perhaps more unguarded moment, mouth closed, eyes set, yet also a little forlorn against the grey backdrop.

These pictures and videos got me thinking about something I said about Open Shakespeare’s annotation tool at OKCON, that by allowing people to digitally annotate we would collect and preserve a continuously evolving catalogue of responses to Shakespeare’s works. Shakespeare’s Staging has done something similar, but, whereas Open Shakespeare is concerned with the text, this site records the response of actors and directors to what Shakespeare wrote. Each performance is, after all, its own unique (re)presentation and interpretation of the text.

The overlap between our work is obvious, and the next step of the process seems clear. If we accept that Open Shakespeare should allow anyone to contribute and share their responses to Shakespeare, and if we decide that performance of a play is itself a response to Shakespeare, then our website should expand to allow records of performances to be included. Such records can exist in written form (I think of that Swiss doctor’s description of a performance of Julius Caesar in 1599), but also as images or videos. Each medium in turn brings its own problems. A video recaptures the experience of one spectator, but is one spectator’s view representative of the whole audience’s experience? An image captures a moment, a mood, but gains its force through exclusion. Text can only appeal to the eyes and the ears via the brain.

Given the weaknesses of each medium as a record of responses to Shakespeare, the only reasonable conclusion is to adopt a composite approach. Discussion has begun on how best to do this given the current framework of Open Shakespeare, and if anyone reading this has anything to contribute, please do not hesitate to get in touch.

And because I cannot write a blog post without quoting Shakespeare, please allow me to point out one exquisite exchange between the Clown and the Countess worried about her son Bertram, lines which serve as hints for an actor’s behaviour, as much as recognition of the limitations of the written text.

> CLOWN Why, he will look upon his boot and sing; mend the ruff and sing; ask questions and sing; pick his teeth and sing. I know a man that had this trick of melancholy sold a goodly manor for a song.
>
> COUNTESS Let me see what he writes, and when he means to come.

Shakespeare’s Staging, and Open Shakespeare too, should let us see what Shakespeare writes in more ways than one.

Open Shakespeare Out of Hibernation

June 4, 2010 in Musings, News, Publicity, Releases, Uncategorized

Exam season is finishing, our free time is returning, and Open Shakespeare is coming back to life. We held a short meeting yesterday evening, and can now announce what we intend to do in the near future:

EXPAND: there will be an Open Shakespeare Party in Emmanuel Fellows’ Garden, Cambridge at 3pm on 14th June. Be there if you can, and if you can’t, visit our newly refined ‘Get Involved’ page.

WRITE: the first round of introductions will soon be completed, but we want to welcome more submissions, especially if they build upon the work of previous writers.

BLOG: the Word of the Day feature will be back with us very soon, and will hopefully expand in terms of both writers and articles. The blog itself has already had a little bit of an overhaul, and some out-of-date material will be replaced over the coming weeks.

TEACH: following suggestions made at OKCON, we are proposing the use of Open Shakespeare as a classroom aid. Through this we help to raise the profile of the project, and offer a new way for school children to collaboratively engage with Shakespeare.

These are the main points of the meeting, whose minutes are available for perusal. It remains only for me to quote Nestor, in Troilus and Cressida, and say that this post is only a hint of what’s ahead, and yet…

in such indexes, although small pricks
To their subsequent volumes, there is seen
The baby figure of the giant mass
Of things to come at large.

Open Shakespeare at OKCON

April 27, 2010 in Community, Musings, News, Texts

Last weekend was OKCON, and I delivered a 15 minute introduction to Open Shakespeare there. Little of what I said was new, and the real interest for me came from the discussions I had with other conference-goers during the day. A few of these discussions, and one or two presentations, have given me several ideas for Open Shakespeare, which I shall outline briefly here.

Sören Auer, speaking on ‘Linked Open Data’, mentioned the beneficial effect that a ‘pingback’ service had provided to the blogosphere, helping to foster conversations and build networks of opinion. This made me wonder at the benefit such a tracking service would have for Open Shakespeare: if you were told when text you had annotated was annotated by someone else, you would have the chance both to share in the new contribution and to discuss it. The system could also cover the critical introductions and would foster a more personal involvement in the site, which can only be a good thing. There is one downside: such ‘pingback’ services are vulnerable to spam, and Sören Auer was unable to sketch out a suitable response to this threat.
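For anyone curious about what such a tracking service might involve, here is a minimal sketch in Python of the core check: given a new annotation, find the authors of earlier annotations on overlapping text. The `Annotation` class, its fields and the function `authors_to_notify` are hypothetical names invented for this example; the real annotator's data model may look quite different, and the sketch does nothing about the spam problem mentioned above.

```python
from dataclasses import dataclass


@dataclass
class Annotation:
    """A hypothetical annotation on a span of text."""
    author: str
    start: int  # character offset where the annotated span begins
    end: int    # character offset just past the end of the span


def authors_to_notify(new, existing):
    """Return the authors whose annotations overlap the new one.

    Two spans overlap when each begins before the other ends. The new
    annotation's own author is left out, and the names are returned
    sorted and without duplicates.
    """
    return sorted({
        a.author
        for a in existing
        if a.author != new.author
        and a.start < new.end
        and new.start < a.end
    })


# Invented example data.
earlier = [
    Annotation("ann", 0, 10),
    Annotation("bob", 20, 30),
    Annotation("cat", 5, 8),
]
to_tell = authors_to_notify(Annotation("cat", 8, 22), earlier)
```

How the notification itself is delivered (an email, a message on the site) is a separate question that this sketch leaves open.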

Tom Morris gave a presentation on ‘Citizendium’, whose modus operandi may have something to teach us when it comes to the writing of critical introductions. On Citizendium there is a fixed front article, behind which is a more fluid draft text. Such an arrangement allows both a space for rapid alterations and heated discussion at the same time as it protects the front matter from too extreme a modification, well-meaning or otherwise.

Away from the presentations, I had long discussions about printing the Open Shakespeare Editions with Ben O’Steen. One suggestion was that the problem of incorporating the annotations into the printed text could be solved with a script similar to that which converts blog comments into a printable format. Whatever the solution, some kind of tagging and annotation management system would probably be a prerequisite.

The last idea to come from OKCON (so far…) concerned widening the audience for Open Shakespeare. Several people recommended that we try and get school children involved, since the website could be a useful teaching tool, and encourage a new engagement with Shakespeare. Again, one hesitates to open the website to such a large audience without more means of managing annotations in place…but, still, a trial with just one class and one scene of a play seems to me something we could try right away…

Cardenio or Double Falsehood

April 15, 2010 in Community, Musings, News, Releases

There’s been a bit of a stir in the Shakespearian community recently, what with the release of a new play by the Bard. To be fair, it is not quite so sensational as it sounds: the possibility that part of Cardenio or, as the Arden edition entitles it, Double Falsehood might be by Shakespeare goes back to at least the 18th Century.

What’s new is that textual and historical evidence is now available that confirms this play to be from some time in the early 17th century. It contains, for example, the word “absonant”, which is found only in texts by Shakespeare…and by his successor as writer for the King’s Men, Fletcher. Thus the play is most likely a collaborative work between the two, as was perfectly normal for the period. Other Shakespeare/Fletcher collaborations include King Henry VIII, and possibly parts of Pericles.

I post this news here because such a claim was only made possible thanks to advances in technology dealing with texts. New databases of texts make searches for references to a play far faster and easier, whilst new stylometric algorithms make the most of such databases to pick up minute differences in vocabulary usage that allow an author’s DNA to be distinguished. For the curious, Shakespeare uses “thee” and “hath”, whilst Fletcher, being fifteen years his junior, uses the more modern “ye”.
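To give a flavour of what such an algorithm measures, the sketch below counts how often a few marker words occur per thousand words of a passage. It is a deliberately crude illustration, not the method used in the research on Double Falsehood: real stylometric studies draw on far more features and on statistical testing. The function name `marker_rates`, the choice of markers and the sample sentence (which is invented, not a quotation) are all assumptions.

```python
import re
from collections import Counter

# Marker words taken from the paragraph above; a real study would use many more.
MARKERS = ("thee", "hath", "ye")


def marker_rates(text, markers=MARKERS):
    """Return each marker word's frequency per 1,000 words of text.

    The text is lower-cased and split into runs of letters and
    apostrophes, so "ye" is counted only as a whole word and not
    as part of a longer word such as "eye".
    """
    words = re.findall(r"[a-z']+", text.lower())
    if not words:
        return {m: 0.0 for m in markers}
    counts = Counter(words)
    return {m: 1000.0 * counts[m] / len(words) for m in markers}


# An invented sentence, used only to exercise the function.
sample = "I do beseech thee, hear what he hath said; ye know it well."
rates = marker_rates(sample)
```

Comparing these rates across scenes of a disputed play, and against plays of known authorship, is the sort of evidence the paragraph above describes, though on its own a handful of words would prove very little.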

Perhaps one day, The Open Shakespeare Project will contribute to such breakthroughs. Until then, we have a separate issue to deal with: do we add Cardenio / Double Falsehood to our site?

What do you think? Could you write an introduction to it?

Shakespeare Quarterly part II

April 6, 2010 in Community, Musings, Publicity, Technical, Texts

Here, for those interested, is my response to Professor Andrew Murphy’s article in the Shakespeare Quarterly:

“I am a member of the Open Shakespeare Project (www.openshakespeare.org – not to be confused with Open Source Shakespeare) and found this article extremely interesting. I feel that your conclusion points towards many of the approaches to Shakespeare that our project incorporates, and that are part of a more ‘social’ approach to Shakespeare.

It occurs to me that as well as spreading Shakespeare to a far larger audience, cheap editions of Shakespeare are also a godsend for students, who may write their thoughts all over their pages without fear of ruining something expensive. If all these scribbles were collected, a formidable body of knowledge of Shakespeare would be available, as would an evolving record of responses to this writer.

Our site has recently acquired the ability for anyone to annotate Shakespeare’s works, and soon will add the capacity to attribute, tag, sort, and hide the annotations made. With this we hope to create an ‘open’ edition of Shakespeare’s plays that would grow along similar lines to Wikipedia, harnessing the power of the internet to bring many minds to bear upon a single subject.

Such problems as found with the OSS still pose difficulties for us: we have to use Moby as a source text since all others, including (lamentably) the wordhoard text, are under copyrights that conflict with our Open license. Nevertheless, just as textual problems are flagged up in a critical edition with a footnote, so too could such problems be drawn to the reader’s attention through annotation. As Whitney Trettien’s article points out, the web comes into its own when it is an ‘expressive medium’ itself, and not one which, like the OSS, unthinkingly delivers content.

Essentially, ISE already has this kind of thinking process, displaying an editor’s annotation on each text right down to the textual variants. It even has the ability to sort such annotations. However, the problems you identify – different kinds of editing, slow progress, uneven quality – all inevitably result, I feel, from the fact that each text only has a single editor. More editors would speed progress but it is not, of course, a given that more editors would improve quality. Wikipedia is still notorious for its occasional inaccuracies.

Nevertheless, such inaccuracies can be resolved by the same process that generates them. If anyone can annotate, so anyone can also review annotation and improve it. I realise that this is a rather utopian position and that people can as easily vandalise as beautify, but I feel it to be a more tenable one than that held by the websites here. The internet allows for unprecedented levels of input as well as appreciation, and such potential is not exploited by the sites reviewed in this article.

Talking of input and appreciation brings me to one further aspect of these sites that interests me, namely how easily one can print from them. The OSS shines in this respect, but attempting to print an ISE facsimile is rather more difficult. I must also admit that printing from an annotated text at The Open Shakespeare Project is currently impossible: the tool only went live fairly recently, and the site is still very much under construction. One day we hope to harness the accumulated and peer-reviewed annotations of many to produce a printed text, and thus complete a cycle between internet and ‘real world’ Shakespeare.

Such a cycle is ignored at the peril of digital scholarship, for it is the mix of real events and online responses to them that makes Facebook so addictive. Other addictive qualities, such as the relatively small time commitment and the chance to interact with other users, could be profitably replicated by internet Shakespeare projects. After all, anything capable of sustaining those involved in the long task of making productive use of Shakespeare is always welcome and need not be to the detriment of academic rigour.”

Here is the author’s reply:

James: thanks very much for this thoughtful and very interesting response to the review. I’ve had a quick look at your site and think it’s very interesting. It seems to me that you really are pushing forward with a Web 2.0 approach to things, making your site a good deal more interactive than the three I review here. I like the idea of building up a ‘database’ of annotations, and you’re right, of course: textual annotation might be a way round the problems of having to use an outdated source text. I still tend to worry about Wikipedia as a model, however. I always like to tell my students stories of humorous examples of deliberate tampering with Wikipedia, as a way of warning them off using it in their research (perhaps you may know what happened to Thierry Henry’s page, after France put Ireland out of the World Cup?). Will OSP be entirely ‘user governed’, or will you have some sort of ‘top down’ quality control mechanisms? Andy

The discussion raises some interesting issues. How bitesize and user friendly is our website? To what extent should ‘Open Shakespeare’ be user-governed? Any comments and suggestions you may have will be very welcome.