Saturday, April 21, 2007

xkcd rocks my socks

Anyways. No one reads this blog, I know, so I'm going to do some Google bombing. (If you want to know what/why, go here.) But basically, xkcd is my favorite webcomic ever and y'all should check it out. All one of you.

THE ALGORITHM CONSTANTLY FINDS JESUS
THE ALGORITHM KILLED JEEVES
THE ALGORITHM IS BANNED IN CHINA

THE ALGORITHM IS FROM JERSEY

Sunday, April 8, 2007

Books for non-religious people

So I have a lot of fun with LibraryThing's UnSuggester, just finding what kinds of books don't go with other kinds of books. And I thought, well, in honor of the fact that I went to my family's church for the first time since, let's see, Easter last year (Christmas was at my cousins'), I'd post about all these Christian book results that keep popping up. The idea of the UnSuggester is that it finds books that, statistically, are very unlikely to be found in the libraries of owners of the books you search. The thing's addictive . . . (If you'd like an explanation of how the thing works, better than I can put it, try this post on the LibraryThing blog. Which quotes another blog I read, actually, but Neil Gaiman's posts are so long that if I linked right to him *I*'d have to pull out the quote . . .)

But anyhow, the optimal anti-Christian library (if it were composed entirely of books I own):

Obviously, it would include the blatantly irreverent, like Good Omens, by Neil Gaiman and Terry Pratchett. Terry Pratchett's Discworld books, like Thud! and Guards! Guards!, although that might have something to do with the exclamation points. Where's My Cow?, sadly, does not make the cut; it does include some religious books, but its UnSuggestions are much more of a mixed bag.

And the nonfiction section of this anti-Christian library would feature, quite prominently, Jared Diamond's Guns, Germs, and Steel. If we expand the scope to not only books that I own, but books that I have read and enjoyed, we would *not* be able to include Richard Dawkins' works on evolution, despite the atheism professed throughout; these UnSuggestions are instead rife with YA, chick lit, and popular fiction that haunt "intellectual" books. Even Dawkins' atheist manifesto, The God Delusion, is firmly not in the anti-Christian camp.

If we were trying to create a library for the anti-Christian anti-programmer (or the anti-library for the Christian programmer?), it would feature rather prominently Lemony Snicket, for example The Bad Beginning, and, to a lesser extent, The Miserable Mill.

And if we look at the six most owned books on LibraryThing, that is, the Harry Potter books, we discover that for some mysterious reason suggestions have not been generated for Sorcerer's/Philosopher's Stone, or Prisoner of Azkaban. But for the other four books, we discover that these are the most anti-Christian books of all -- virtually every UnSuggestion is a Christian book. The "Rowling-is-the-devil" crowd have done their work well . . .

Another fun UnSuggester game is finding the "contradictions" in one's own library. The most statistically unlikely property of my own library is the combination of YA chick-lit and fairly serious intellectual books. The best pair I can find is The Princess Diaries and How the Mind Works, by Steven Pinker, 5th and 7th on each other's lists, respectively. The second-best I can find is also Pinker-related. Although the UnSuggestions for The Language Instinct contain more books that I haven't heard of than books that I own, two other chick-lits, Knocked out by My Nunga-Nungas and Dancing in My Nuddy-Pants, feature it at number 6 and number 10, respectively.

Reading the full list for How the Mind Works is like reading a list of my favorite books circa 7th and 8th grade, which is rather a hoot, and full of nostalgia. (But does that mean that I am, statistically, the opposite at age 17 of what I was at age 13?) I used to own so many Tamora Pierce books, and although now they have found good homes at a used book shop or the children's hospital, depending on when I decided to get rid of them, sometimes I still read them from the library when I'm feeling excessively nostalgic for my childhood, and I'll read her new books, to catch up on old friends more than because they're wonderfully written. Sometimes I do wish I'd held on to some of them, although I don't know where I'd put them -- my shelves overflow[eth]. (What's an appropriately archaic 3rd-plural marker?)

And of course, the beauty of quests like this -- finding out one's own statistical discrepancies on LibraryThing -- is that because this is data about members, if I notice contradictions like these, then enter the contradictory books into my collection, then not only have I found a motivation to enter more books (always a good thing!), but I have actually changed the recommendations, made them a little more accurate, a little more receptive to the diversity that exists among book collections.

. . . And, of course, the next time the suggestions are regenerated, my library will have fewer of these coincidences, because the mere fact of my owning How the Mind Works will push it down about 20 places on the Princess Diaries list.

Saturday, April 7, 2007

LibraryThing and linguists and linguists who blog about LibraryThing

So I was reading this post by Arnold Zwicky over at Language Log, about LibraryThing and linguists. Well, more specifically about the LibraryThing group I Survived the Great Vowel Shift. He analyzes this list, available on the front page:
Top shared books (weighted):
  1. The great Eskimo vocabulary hoax, and other irreverent essays… by Geoffrey K. Pullum (23)
  2. The name of the rose by Umberto Eco (77)
  3. The World's major languages (23)
  4. A course in phonetics by Peter Ladefoged (25)
  5. The language instinct by Steven Pinker (49)
  6. The Odyssey by Homer (78)
  7. The Silmarillion by J.R.R. Tolkien (72)
  8. The complete works by William Shakespeare (72)
  9. Guns, Germs, and Steel: the fates of human societies by Jared Diamond (69)
  10. The Hobbit by J.R.R. Tolkien (86)
If you're wondering about the ordering, it's because of LibraryThing's weighting technique. Although I poked around the site a good bit, I wasn't able to find an explanation of weighting, but I had a hypothesis and I tested it with my trusty graphing calculator, and it held. As far as I can gather, the ordering reflects how many group members own a given work relative to how many people own it throughout LibraryThing. For example, only 23 members of the group own The Great Eskimo Vocabulary Hoax, but only 50 people on LibraryThing own it -- so 46% of people who have this book in their LibraryThing catalogs are members of this group! By contrast, 77 people in the group own The Name of the Rose, which seems like rather a lot more, until you realize that this is only about 1.9% of users who own the book. By the time you get to number 10, The Hobbit, you're down to 0.86%. The group zeitgeist also includes this contrasting list:
Top Shared Books (unweighted)
  1. The Hobbit by J.R.R. Tolkien (86)
  2. Harry Potter and the Half-Blood Prince by J.K. Rowling (86)
  3. (83)
  4. (82)
  5. Harry Potter and the Chamber of Secrets by J.K. Rowling (82)
  6. Harry Potter and the Order of the Phoenix by J.K. Rowling (79)
  7. Harry Potter and the goblet of fire by J.K. Rowling (78)
  8. The Odyssey by Homer (78)
  9. The name of the rose by Umberto Eco (77)
  10. 1984 by George Orwell (73)
From this list, we discover little interesting. Apparently lots of people own The Hobbit! And they read Harry Potter! (Also there are apparently two different works whose title is merely a spacebar . . .) (And a sidenote: I would suspect that the reason that The Hobbit beats out The Lord of the Rings so soundly in terms of ownership, no matter where one looks on LibraryThing, is the concept of what defines a "work". An omnibus Lord of the Rings and a boxed set Lord of the Rings are obviously the same thing, but there are also three different works involved -- The Fellowship of the Ring, The Two Towers, and The Return of the King, and LibraryThing doesn't currently have a way to acknowledge that. No, really, I was just reading about it.)

So, returning to the post that got me thinking about this: Arnold-with-a-difficult-last-name says that Shakespeare and Homer beat out technical works of linguistics, and technically he's right -- but then, Rowling and a spacebar beat out Shakespeare and Homer! And if you look at the numbers in a slightly different way (a way with division in it! Man, computers rock), technical works about linguistics dominate 4 out of the top 5 slots, which isn't so bad.
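
For the curious, here is a minimal sketch of that division in Python. It assumes the weighting is nothing more than group owners divided by site-wide owners, which is only my hypothesis and not anything LibraryThing documents. The function name group_share is made up for illustration, and the only raw counts used are the ones quoted above (23 of 50 for the Pullum book); the other two shares are the percentages already mentioned in this post.

```python
def group_share(group_owners: int, total_owners: int) -> float:
    """Fraction of a work's site-wide owners who are also group members.

    Hypothetical helper: this is a guess at the weighting, not a
    documented LibraryThing formula.
    """
    if total_owners <= 0:
        raise ValueError("total_owners must be positive")
    return group_owners / total_owners


# Shares taken from the figures quoted in the post.
quoted_shares = {
    "The Great Eskimo Vocabulary Hoax": group_share(23, 50),  # 23 of 50 owners
    "The Name of the Rose": 0.019,  # "about 1.9%" per the post
    "The Hobbit": 0.0086,           # 0.86% per the post
}

# Sorting by share, largest first, gives the "weighted" ordering.
weighted_order = sorted(quoted_shares, key=quoted_shares.get, reverse=True)

for title in weighted_order:
    print(f"{quoted_shares[title]:.2%}  {title}")
```

Sorted this way, a book owned by only 23 group members lands above one owned by 86, which matches the relative order of these three books on the weighted list.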

P.S. Another part of the analysis that stood out to me was
Tolkien and Pinker are no surprise to me; when I talk with young linguists about what got them into linguistics, Tolkien's invented languages and Pinker's Language Instinct figure prominently in their stories.
And speaking not as a young linguist, but a budding linguist (I used to use that term all the time, but had forgotten about it entirely until I ran into this group), I can definitively say that if I were to tell the entire story of what pulled me into linguistics, Pinker would figure rather prominently, and Tolkien would certainly figure somewhere.

Friday, April 6, 2007

This is my new blog . . . innit shiny?

Hello! Probably no one will read this post, but I feel compelled to make it anyways.

This is a rather anonymous blog; I have a rather sprawling internet identity, but I have been finding increasingly that "blogosphere"-type analysis is really fun, and sometimes stuff doesn't fit in comments, or wants a wider audience than an e-mail to the blog's author might get, and I'd rather not start putting public posts in my friends-only livejournal again.

So, this is my new blog. I'm young, I live in the states, I'm something of a nerd. And now I'm off to dinner! LibraryThing-type-post coming soon . . .