Understanding Wikimedia, or, the Heavy Metal Umlaut, one decade on

It has been nearly a full decade since Jon Udell’s classic screencast about Wikipedia’s article on the Heavy Metal Umlaut (current text; Jan. 2005). In this post, written for Paul Jones’ “living and working online” class, I’d like to use the last decade’s changes to the article to illustrate some points about the modern Wikipedia. ((I still haven’t found a decent screencasting tool that I like, so I won’t do proper homage to the original, sorry Jon!))

Measuring change

At the end of 2004, the article had been edited 294 times. As we approach the end of 2014, it has now been edited 1,908 times by 1,174 editors. ((Numbers courtesy X’s edit counter.))

This graph shows edits by year: the blue bars are the number of edits in each year; the dotted line is the overall length of the article (which has remained roughly constant since a large pruning of band examples in 2007).

Edits-by-year
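
For anyone who wants to rebuild a graph like this for another article, here is a minimal sketch in Python. It assumes you already have the revision timestamps as ISO 8601 strings (the form the MediaWiki API returns for `prop=revisions` with `rvprop=timestamp`); the `edits_by_year` helper and the sample timestamps are made up for illustration and are not the article's real history.

```python
from collections import Counter


def edits_by_year(timestamps):
    """Count revisions per calendar year from ISO 8601 timestamp strings."""
    # The first four characters of an ISO 8601 timestamp are the year.
    return Counter(int(ts[:4]) for ts in timestamps)


# Hypothetical sample data, not real revisions of the article.
sample = [
    "2004-12-30T10:00:00Z",
    "2005-01-22T08:15:00Z",
    "2005-01-22T09:40:00Z",
    "2007-06-01T12:00:00Z",
]

counts = edits_by_year(sample)
for year in sorted(counts):
    print(year, counts[year])
```

Fetching the full history is a separate step (the API pages its results), so this sketch only covers the tallying.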

 

The dropoff in edits is not unusual — it reflects both a mature article (there isn’t that much more you can write about metal umlauts!) and an overall slowing in edits in English Wikipedia (from a peak of about 300,000 edits/day in 2007 to about 150,000 edits/day now). ((It is important, when looking at Wikipedia statistics, to distinguish between stats about Wikipedia in English, and Wikipedia globally — numbers and trends will differ vastly between the two.))

The overall edit count — 2000 edits, 1000 editors — can be hard to get your head around, especially if you write for a living. Implications include:

  • Style is hard. Getting this many authors on the same page, stylistically, is extremely difficult, and it shows in inconsistencies small and large. If not for the deeply acculturated Encyclopedic Style we all have in our heads, I suspect it would be borderline impossible.
  • Most people are good, most of the time. Something like 3% of edits are “reverted”; i.e., about 97% of edits are positive steps forward in some way, shape, or form, even if imperfect. This is, I think, perhaps the single most amazing fact to come out of the Wikimedia experiment. (We reflect and protect this behavior in one of our guidelines, where we recommend that all editors Assume Good Faith.)

The name change, tools, and norms

In December 2008, the article lost the “heavy” from its name and became, simply, “metal umlaut” (explanation, aka “edit summary”, highlighted in yellow):

Name change

A few takeaways:

  • Talk pages: The screencast explained one key tool for understanding a Wikipedia article – the page history. This edit summary makes reference to another key tool – the talk page. Every Wikipedia article has a talk page, where people can discuss the article, propose changes, etc. In this case, this user discussed the change (in November) and then made the change in December. If you’re reporting on an article for some reason, make sure to dig into the talk page to fully understand what is going on.
  • Sources: The user justifies the name change by reference to sources. You’ll find little reference to them in 2005, but by 2008, finding an old source using a different term is now sufficient rationale to rename the entire page. Relatedly…
  • Footnotes: In 2008, there was talk of sources, but still no footnotes. (Compare the story about Mötley Crüe in Germany in 2005 and now.) The emphasis on footnotes (and the ubiquitous “citation needed”) was still a growing thing. In fact, when Jon did his screencast in January 2005, the standardized/much-parodied way of saying “citation needed” did not yet exist, and would not until June of that year! (It is now used in a quarter of a million English Wikipedia pages.) Of course, the requirement to add footnotes (and our baroque way of doing so) may also explain some of the decline in editing in the graphs above.

Images, risk aversion, and boldness

Another highly visible change is to the Motörhead art, which was removed in November 2011 and replaced with a Mötley Crüe image in September 2013. The addition and removal present quite a contrast. The removal is explained like this:

remove File:Motorhead.jpg; no fair use rationale provided on the image description page as described at WP:NFCC content criteria 10c

This is clear as mud, combining legal issues (“no fair use rationale”) with Wikipedian jargon (“WP:NFCC content criteria 10c”). To translate it: the editor felt that the “non-free content” rules (abbreviated WP:NFCC) prohibited copyrighted content unless there was a strong explanation of why the content might be permitted under fair use.

This is both great, and sad: as a lawyer, I’m very happy that the community is pre-emptively trying to Do The Right Thing and take down content that could cause problems in the future. At the same time, it is sad that the editors involved did not try to provide the missing fair use rationale themselves. Worse, a rationale was added to the image shortly thereafter, but the image was never added back to the article.

So where did the new image come from? Simply:

boldly adding image to lead

“boldly” here links to another core guideline: “be bold”. Because we can always undo mistakes, as the original screencast showed about spam, it is best, on balance, to move forward quickly. This is in stark contrast to traditional publishing, which has to live with printed mistakes for a long time and so places heavy emphasis on Getting It Right The First Time.

In brief

There are a few other changes worth pointing out, even in a necessarily brief summary like this one.

  • Wikipedia as a reference: At one point, in discussing whether or not to use the phrase “heavy metal umlaut” instead of “metal umlaut”, an editor makes the point that Google has many search results for “heavy metal umlaut”, and another editor points out that all of those search results refer to Wikipedia. In other words, unlike in 2005, Wikipedia is now so popular, and so widely referenced, that editors must be careful not to (indirectly) be citing Wikipedia itself as the source of a fact. This is a good problem to have—but a challenge for careful authors nevertheless.
  • Bots: Careful readers of the revision history will note edits by “ClueBot NG”. Vandalism of the sort noted by Jon Udell has not gone away, but it now is often removed even faster with the aid of software tools developed by volunteers. This is part of a general trend towards software-assisted editing of the encyclopedia.
  • Translations: The left-hand side of the article shows that it is in something like 14 languages, including a few that use umlauts unironically. This is not useful for this article, but for more important topics, it is always interesting to compare the perspective of authors in different languages.

Other thoughts?

I look forward to discussing all of these with the class, and to any suggestions from more experienced Wikipedians for other lessons from this article that could be showcased, either in the class or (if I ever get to it) in a one-decade anniversary screencast. :)

Wikis and law school

The excellent Eric Goldman had a good post Tuesday about giving students grades for wikipedia content. This reminded me that ages ago I’d written that two of my classes were going to use wikis, but never followed up on it.

Image

picture: UC Berkeley Law School Quote, by ingridtaylar, used under CC-BY

The classes I used wikis for were different than Eric’s- he actually assigned students to create Wikipedia articles, whereas the four classes I ended up taking with wikis all used school-hosted wikis for a wide variety of purposes:

  • Three designated note-takers taking notes into the wiki, allowing the banning of laptops for other students.
  • Note-taking rotating among all students, with wiki gnoming being (if I recall correctly) an ill-defined grade component, but no non-note-taking articles assigned.
  • Creation of articles in a class wiki being the primary grade for the class, and with some interaction with other students’ work expected, but with no significant intent that the articles written would become a permanent resource for the public. Essays were capped at 1,000 words- which drove many students nuts but led to some fine writing.
  • Creation of articles in a class wiki being the primary grade, with the intent that the class website would build up over the course of repeated class offerings to become an authoritative web asset for the scholarly community working in that area. ((This separate class wiki had a lot of benefits, most notably being that student articles are never targeted for deletion as irrelevant, but obviously the segregation from the main wiki community has drawbacks too. Maybe the equivalent of the class prize for best essay should be that the best article is ‘promoted’ to main wikipedia… ))

(All of these classes except the last were in technology-related courses.)

Despite this widely different set of approaches, several pieces of Eric’s commentary rang very true for me.

First, basic wiki concepts were tough. Partially, this reflects poor technology- the average wiki is needlessly hard to use. ((I think real-time wiki/wysiwyg tools like Wave and Etherpad will help fix this once they mature.)) Eric saw this in his students (“it took students a substantial amount of time to format their entries into Wikipedia’s format”) and I think it was true in my classmates as well.

But it isn’t just about the technology. Eric says “[m]ost students did not intuitively understand how to approach writing an encyclopedic treatment of a topic.” That does not ring perfectly true for me- lots of my classmates read enough of wikipedia that the format was relatively familiar- but it isn’t insane, especially given the very wide variability in the treatment of legal topics in wikipedia. It would almost certainly help to provide a sort of ‘model’ article, much like the model memos used in writing classes. Since most of the articles will be about specific statutes or cases, writing two model/template articles should suffice for many classes.

Other wiki concepts, like extensive linking, or publishing drafts to the world in wiki-style, were apparently even more strange to most of my classmates. None of the four class wikis were deeply interlinked or cross-referenced, outside of what was necessary to create a table of contents and occasional outlinks to wikipedia. Similarly, few students were willing to post works-in-progress to the wiki and refine them there- most students preferred to work privately and then put a final text into the wiki. I’m not sure that law school is the right place to teach wiki nature, and indeed Prof. Goldman seems nervous about publishing student work while it is still a work-in-progress ((It might make sense to ‘incubate’ student posts in a separate wiki, so that their classmates can see and participate in each other’s work, before publishing it to Wikipedia.)), but still- I was surprised so few of my classmates appeared to be into the wiki way of creating iteratively edited, interlinked content. ((Tangentially, focusing on linking may also provide the solution to Prof. Goldman’s problem that the school requires seminar papers to be 20 pages long- one article is unlikely to be of equivalent length, but an interlinked network of articles on related cases, statutes, and topics could easily grow to that size.))

Collaboration was another angle that was difficult. Prof. Goldman says “I gave students the option of working together on a topic, but none ended up pursuing that.” This is not surprising- law schools are essentially designed to teach anti-collaboration- but it is a shame, since collaboration is a (the?) crucial skill in legal practice. Some mandatory wiki collaboration (every student required to substantively edit and fact-check another student’s work, as well as their own writing?) might be a small step in the right direction- and might also help alleviate Eric’s concern about the amount of time he spent editing and fact-checking. As a bonus, the wiki nature of the project should make it easy to grade this student editing- the edits will all be right there ((One could imagine giving 40% credit for the article and 10% credit for the quantity and quality of edits made to other students’ articles, if you had an incubator wiki)).

All these issues make it hard to write good informative wiki-articles in a class context, but surprisingly, they also made the class-notes-in-wiki strategy fall far short of its potential. I would have thought that the lower barrier to entry (no need for perfection) and the stronger incentive for students to delve into them (so that they’d be prepared for exams) would have encouraged these wikis to become ongoing demonstrations in improvement. But instead people just had other things to do, so they tended to languish, untended, until right before exams. I think some ‘live’ wiki technologies like Wave, Etherpad, etc., will help improve that in the future (by allowing more than one editor while the class is actually happening) but until then I’m afraid wiki class notes might not get very far.

In the one class I had that was truly article-oriented, the professor provided a set of suggested questions to research and address. Prof. Goldman seems to regret not doing this from the start, but unfortunately this seems like an inevitable requirement. At the time you want students to start researching and writing they just can’t know the subject area well enough to know what is ‘missing’ from the wiki, so you almost certainly have to provide pointers for all but the most driven students. Note here that this class was in a purely scholarly area (no one was going to treat our work on English property law of the 1300s as legal advice) so we did not have some of the constraints that he felt he had with regards to making sure it was right before it was published. It would be interesting to delve into this question more- given that articles do not identify their authors as lawyers, and given that people come to wikipedia with an expectation that it is imperfect, I wonder if students can be encouraged to publish more work in earlier forms than they might otherwise.

Prof. Goldman concludes that “[i]t is unrealistic to expect that most law students can produce useful entries without supervision.” I’m not sure I’d be so harsh; I think most of my classmates were capable of doing this if prodded to, and it seems like most of Eric’s were too (after more supervision than he expected, admittedly.) But if he is right, this is a pretty sad statement to make. We’re a profession which is necessarily grounded in our ability to communicate, and we should be a profession grounded in our ability to communicate clearly and concisely to a legally unsophisticated public- that is to say, to our clients. If our students can’t write a simple encyclopedia entry, we’re in trouble.

Despite this pessimism, I think the piece gets the most important part exactly right:

I think a wiki entry might be a useful alternative to the traditional seminar paper. I have never been a huge fan of requiring students to write law review-style seminar paper in a semester-long course. Ultimately I think it’s nearly impossible for a novice to come up with a good topic and write a coherent and well-researched paper in a 4 month semester from a cold start. (I expand on that point a little here). As a result, in practice, many student seminar papers devolve into quasi-encyclopedic treatments of a topic with a paragraph of student commentary tacked onto the end. Instead of going through that charade, the professor could channel the student’s research and writing effort into an expressly encyclopedic treatment. This would reduce the pressure students feel to come up with a novel topic, and it would allow the world at large to benefit from the student’s work rather than the effort going into a desk drawer (or worse, the circular file) at the semester’s end.

In my experience, wiki writing- whether the goal is inclusion in Wikipedia or not- really should be part of the law school curriculum. It is better than traditional papers for teaching basic research and scholarship, and if done well, can also teach collaboration, editing, and other writing skills. There is still a lot to learn about the ‘done well’ part, but I hope Prof. Goldman and others continue to experiment with it. They’re doing the right thing even if their students don’t realize it yet :)

Durham, higher education, and me

I’ll be in Durham this weekend; I’m not sure how much time I have free, but drop me a note if you’d like to grab a meal or beverage of choice.

The formal reason I’ll be in Durham is to give a guest appearance at CompSci 82, Technical and Social Foundations of the Internet. I have been involved in academia previously (TAing a couple times, lecturing at the Sloan School of Management at MIT, and moderating a panel at Harvard Law School) but I’m pretty sure this is the first time anyone has been quizzed on the content of my blog. I am terrified at what this indicates for the future of our civilization. :)

Before anyone asks, I don’t know whether guests are welcome to sit in, and I also don’t actually know the time and location of the class. But if I figure those out, I’ll post here.

(Informally, I have tickets to blue/white and the football game and extensive eating plans. But really, the class appearance was scheduled first.)

what I have been up to

Lots of people I saw in Boston were asking ‘what have you been up to’ instead of the usual ‘sounds like things are good from your blog’ :) I guess I’ve been a little quiet here about me, personally. So some updates:

  • School is generally good; the first two years ended up being very successful (low honors first year; high honors last year.) This year I planned to throttle back to have more time for outside projects, so I am taking fewer credits than ever. Unfortunately, I seem to have chosen those credits poorly so I am doing more work than ever. Hence, not much time for outside projects :/
  • Will spend the summer studying for the bar; location TBD (since Columbia throws me out of housing a few days after graduation.) Yes, the bar is hard. Not that hard- Columbia alums pass the California bar at a 90+% rate. But obviously no one wants to be in that 6-8% so of course everyone studies like crazy. That will be me.
  • Have accepted a job at Orrick Herrington Sutcliffe starting early fall ’09 in their Silicon Valley office. I look forward to it- excellent firm, excellent people, probably will not implode in the next year. :) Current plan is to work about 50-50 on startups and technology licensing, but obviously the economy may dictate a different balance. Silver lining of the economy may be more time for pro bono projects, of which I obviously have a long list I’d like to work on.
  • Am not getting married at GUADEC. ;) Probably a low-key family-only affair followed by big, fun parties in Miami and Boston (or New York?)
  • Krissa and I are trying to enjoy NY as much as possible before leaving, which includes lots of live performances (Jazz at Lincoln Center, ‘In The Heights’, Deblois, Nutcracker), lots of eating (Caracas Arepas, dinner at a not-so-expensive place with Steve Martin and a guy who looked a lot like Paul Simon at the next table), and lots of family visits and East Coast travel (I’m in week two of a six week stretch with family or travel every weekend- all four-plus parents, Summit, and a lecture at Duke.)
  • Krissa is good- loving her job still; enjoying NY; looking forward to going home to California. Currently in Turkey biking with her mom, else she’d have been in Boston and in Durham next weekend.

So yeah, life is good. Crazy, but good. Not sure I’d have it any other way.

Apologies to everyone who I said I’d see this morning at Summit; unfortunately I had to change my train to a pretty early train and overslept, so pretty much ran from hotel to train. Next year…

journaling in internet time?

Given that virtually all journal articles are published on SSRN, and read and discussed, before they hit actual journals, could journals seek to substantially shorten the amount of time between submission and publication, so that authors feel that journals are active contributors while the article is ‘hot’, rather than feeling that they are the finishing polishers of an already-cooling project? In particular, is journal work ‘parallelizable’? In other words, if you put four times as many people on it, would it get done four times as fast? Columbia Law Review publishes on the order of 40 pieces a year, and takes around 12 weeks off for summer break. So it averages out to a piece a week, but it is lumped together into eight issues. Could they be publishing one piece a week, and turning pieces around in 1-2 weeks, instead of publishing an issue every 5 weeks or so and taking longer than that to turn each piece around? I think that might be a bit ambitious- some parts of the publishing process do not get faster the more people you put on them- but it might not be. I’m curious if any journal is trying this model.
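
One rough way to frame the ‘parallelizable’ question is Amdahl’s law: if a fraction p of the work can be split among n people and the rest has to happen in sequence, the best-case speedup is 1 / ((1 - p) + p / n). Here is a small sketch in Python; the 12-week baseline and the fractions tried are made-up illustrative numbers, not data about Columbia Law Review or any other journal.

```python
def amdahl_speedup(parallel_fraction, workers):
    """Best-case speedup when only part of a job can be split among workers."""
    if not 0.0 <= parallel_fraction <= 1.0:
        raise ValueError("parallel_fraction must be between 0 and 1")
    if workers < 1:
        raise ValueError("workers must be at least 1")
    serial_fraction = 1.0 - parallel_fraction
    return 1.0 / (serial_fraction + parallel_fraction / workers)


# Hypothetical baseline: one piece takes 12 weeks with the current staff.
baseline_weeks = 12

# Try a few guesses at how much of the editing work can be done in parallel.
for p in (0.5, 0.8, 1.0):
    speedup = amdahl_speedup(p, workers=4)
    print(f"p={p:.1f}: {speedup:.2f}x faster, about {baseline_weeks / speedup:.1f} weeks")
```

Under these assumptions, four times as many people only gets you four times the speed if every step can be split up; if even a fifth of the process is inherently sequential, the gain drops to roughly 2.5x.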

Speed

Speed by José Juan Figueroa. License: Image

I also have some thoughts on journaling at internet attention span (which pre-date, but are similar to, Berkman’s Publius Project) but they aren’t quite ready for prime time yet. (Caveat: they aren’t really my thoughts; something a friend shared with me instead, but I love them.)

second worst dialog I saw during a recent Ubuntu upgrade

Image

This dialog gets points for being graphical, and loses many, many, many points for presenting no information that any reasonable user could possibly get any use from unless they already understand (1) what FUSE is (2) how to get FUSE plugins (3) who the ‘first user’ is (4) what the ‘fuse group’ is and (5) how to add users to the ‘fuse group.’ And if you know all those things, you didn’t need the dialog, so kudos for being both useless and intimidating.

The worst dialog was actually a terminal wrapped in the upgrader GUI which stalled my entire upgrade in order to ask me what my terminal encoding was, helpfully presenting a list of 28 possible encodings, of which UTF-8 was 27th and the default was some obscure encoding I’d never previously heard of. (The other times the upgrader stalled the upgrade to ask for input it told me I’d modified config files I’d never previously heard of, much less modified, but at least those had basically the same useful-ish debian config file dialog I’ve been used to for ages.)

Linux has come a long way (the upgrader helpfully offered to do a partial upgrade instead of complaining and dying like previous debian/ubuntu upgrades), but still has a long way to go too.

(These weren’t the only problems I saw; Gerv has a good list of some of the other ones, though I didn’t see all of the ones he did.)

Voting With Your Feet and Other Freedoms

This Post In A Nutshell (aka, the Murray Version)

No one should be surprised that social network users can’t ‘vote with their feet,’ because most users give up a portion of their autonomy when they choose to use web services. This post will suggest that protecting autonomy is desirable and should be designed in to software, and outline five qualities that such software would have.

[The rest of the post will not be brief; it is in part a draft of an essay for my class in ‘Law in the Internet Society’.]

Continue reading “Voting With Your Feet and Other Freedoms”

my blog: the Q&A for law firms and other interested parties

Blogging About

the executive summary:

Nutshell: if you’re a law firm considering hiring me, and you stumble across this blog, please don’t get nervous. Instead, talk to me, and/or read the rest of this post. I’m eager to explain why I blog, and why I think it may make me a better lawyer and a good addition to your firm.

[Image by Hugh Macleod of Gaping Void fame; used with permission under the Creative Commons BY-NC-ND 1.0 license. For more on why Hugh licenses his images this way, see here.]

the full story:

Continue reading “my blog: the Q&A for law firms and other interested parties”

Law Study Systems

[cross-posted from First Movers]

I suppose it was almost inevitable- you can study for LSATs and bar exams online, and you can invest piles of money into various law school study aids, so it was only a matter of time before someone created an online study system for the typical 1L legal curriculum.

And here it is, or at least, the first one that I’ve heard of. The company calls itself Law Study Systems, and right now appears to have coverage of Contracts, Torts, and Criminal Law- not much, but a good start for lots of 1Ls.

I’ve only skimmed the materials, since they don’t cover classes I’m currently taking, but the material appears to be fairly solid and comprehensive. It isn’t the kind of thing that you’d want to learn Contracts from (that is after all what class is for), but it is probably pretty nice for a pre-exam refresher at the end of the semester, or perhaps as an intro to use over the summer before classes start (as I know some classmates did.) And it has fairly broad coverage- 22 ‘tutorials’ on remedies in contracts alone.

Of course, it has the same problems as most review materials- for example, while it has great coverage of remedies, my contracts course did relatively little on remedies, so that material probably wouldn’t have helped me very much. Of course, no review materials (unless your professor happens to write review materials as well as teach) will be perfect in this sense, but the difficulty of skimming this material to find what is relevant may be a slight disadvantage during last minute cramming. A search function (currently missing, as far as I can see) might help alleviate that, and make it more useful for targeted last-minute review.

Schools appear to have the option to work with LSS to customize the materials- which is an interesting idea. To the best of my knowledge, no law school is emulating MIT’s Open Courseware and putting their course materials online in an organized fashion, and perhaps this might be the start of that for some schools. (Someone will have to eventually- the publicity of making your materials the standard reference for everyone on the web will be too big a lure to ignore.)

Software-wise, this is not terribly sophisticated yet- for example, the online LSAT prep I did was much more interactive, with music and animations, both of which this lacks. Despite the lack of sophistication, the important parts look like they are here and would get the job done- for example, while it is slide-based, it also quizzes you during the slides, so you have to pay some attention (and recall previous lectures). And it works in Linux, so it is likely to work on the Mac as well- something my LSAT prep could not do, in part because of the audio requirements.

Given how little I’ve used this, and given how much different people’s study needs vary, I can’t say that I can recommend this without reservation- but it certainly merits looking at if you’re a 1L who wants to look at online options for studying for your core classes.

[Disclaimer: LSS contacted me and listed me as a ‘blog they like’, and gave me free access to their premium materials for this review, but otherwise I’ve received no compensation nor do I have any relationship with them. I reviewed them instead of ignoring them, like I do most linking/review requests, because of my long-standing interest in online education.]