Apophenia: IM Presence Vs. Communication

There is a “culture divide” in the instant messaging world and I think that this post by Zaphoria is close to pinning it down. Being online for chat has a tendency to signal different things to different people.

One specific tiny annoying thing I experience in this “presence vs. communication” aspect of the IM world is that when I get up in the morning (or middle of night or whenever my sleeping schedule reaches the “wake up” stage) I will wake my computer and check email and maybe read the news or flip through RSS feeds. Adium, which is often open, automatically logs me on to AIM and MSN. Sometimes I get bombarded with messages…only seconds after gaining consciousness. I think I need a feature in my chat program which delays the log-in-when-waking-computer feature by about 15 minutes to give me time to let the world come into focus.

I think the solution to much of the cultural difference in the IM world is to be found in 3 things:

1) Allow us to control which group of people can view us online at any time. That means we can filter out our “log on only for conversation” friends from our “always on and conveniently reachable for quick questions etc.” types – and be able to do this without them knowing it so they don’t get hurt that we don’t always let them see us online.

2) Ditch the simple binary “Here” vs. “Away” status in favor of a richer range of statuses. You can do this now in many chat programs with text in the status line, but the problem is that people usually ignore it completely even if their program can show this status line text.

3) Propagate as a universally tolerated practice the sensible idea in my “IM culture” that instant messaging isn’t always (or usually) about a sustained conversation where immediate answers are forthcoming (like a phone conversation) but can involve significant delays in reply and be incorporated into our multi-tasking. Thus I can read a book, read a page or two or three, then reply to an incoming IM (or several queued incoming IM messages) and go back to reading a while – or whatever task I was doing. Too many people I know think that if I say something to them on IM, then I’m doing nothing but IM…which may be true for them but certainly not for me.

These will all contribute, I think, to incorporating IM productively into our lives.

WordPress 1.5

I just upgraded to 1.5 of WordPress, which is the blogging software I use to run this site. Not sure when I will get the “theme” more personalized. As those of you visiting the site may notice, I’m currently using the default “theme” that came with the upgrade. I’m hoping this new version will handle comment/trackback spam better and already I like some of its small changes.

New JSTOR Search

For those blessed with access to JSTOR, there is a new search interface which is much more powerful than before. I would provide a link to their news page about it, but it seems to be beyond their wall. I wish we all could have access to databases like this. I remember what it was like to be left out in the cold between my connection with Columbia U and Waseda (and Waseda only provided access while on campus).

Viking History Game

Try the Viking History game over at BBC. It is a great idea to use games like this (this one using Flash) to teach history. I learned some things about Viking ship building and the routes ships took to the British Isles.

However, the point of the game seemed to be to kill lots of monks and collect as much treasure as possible. I guess that is realistic enough (though I wish it accounted for the trading and settlement aspects of the Vikings, perhaps if you successfully complete the first mission)…but it leaves a bad taste in the mouth for a game of educational value…

The concept, however, I think is great and can be applied to all sorts of things for children learning history. I remember I learnt a lot of my geography from games like Where in the World is Carmen Sandiego etc.

Comment Spam

After installing multiple anti-spam methods (I may try to add an “are you human?” feature again) I got bombarded by a massive onslaught of comment spam. However, I can’t even call it spam. It was just a malicious attack, since all the URLs and emails were just random characters strung together, making it impossible to find anything they had in common (IPs were all different too)…

I’m turning off comments until I get an even stronger anti-spam mechanism in place…this is so sad…

UPDATE: I was hit again today. Looks like I don’t even know how to turn all comments off so I have turned on moderation for all comments, so your submitted comment will have to be approved. No need to resubmit, I’ll approve it when I get it…sorry for the inconvenience.

Harvard Pilot Project with Google

I just got a university-wide email regarding a pilot project that Harvard is starting with Google. It looks like Google will also be joining with other universities in this project, which will begin the work of digitizing, and in the case of public domain works providing public access to, the contents of the Harvard library system. The email included a short summary of the initial pilot and didn’t ask me to keep this confidential so I will reproduce the description of the project below:

Harvard University is embarking on a collaboration with Google that could harness Google’s search technology to provide to both the Harvard community and the larger public a revolutionary new information location tool to find materials available in libraries. In the coming months, Google will collaborate with Harvard’s libraries on a pilot project to digitize a substantial number of the 15 million volumes held in the University’s extensive library system. Google will provide online access to the full text of those works that are in the public domain. In related agreements, Google will launch similar projects with Oxford, Stanford, the University of Michigan, and the New York Public Library. As of 9 am on December 14, an FAQ detailing the Harvard pilot program with Google will be available at http://hul.harvard.edu.

The Harvard pilot will provide the information and experience on which the University can base a decision to launch a large-scale digitization program. Any such decision will reflect the fact that Harvard’s library holdings are among the University’s core assets, that the magnitude of those holdings is unique among university libraries anywhere in the world, and that the stewardship of these holdings is of paramount importance. If the pilot is deemed successful, Harvard will explore a long-term program with Google through which the vast majority of the University’s library books would be digitized and included in Google’s searchable database. Google will bear the direct costs of digitization in the pilot project.

By combining the skills and library collections of Harvard University with the innovative search skills and capacity of Google, a long-term program has the potential to create an important public good. According to Harvard President Lawrence H. Summers, “Harvard has the greatest university library in the world. If this experiment is successful, we have the potential to provide the world’s greatest system for dissemination as well.”

In addition, there would be special benefits to the Harvard community. Plans call for the eventual development of a link allowing Google users at Harvard to connect directly to the online HOLLIS (Harvard Online Library Information System) catalog (http://holliscatalog.harvard.edu) for information on the location and availability at Harvard of works identified through a Google search. This would merge the search capacity of the Internet with the deep research collections at Harvard into one seamless resource - a development especially important for undergraduates who often see the library and the Internet as alternative and perhaps rival sources of information.

Eventually, Harvard users would benefit from far better access to the 5 million books located at the Harvard Depository (HD). If the University undertakes the long-term program, Harvard users would gain online access to the full text of out-of-copyright books stored at HD. For books still in copyright, Harvard users could gain the ability to search for small snippets of text and, possibly, to view tables of contents. In short, the Harvard student or faculty member would gain some of the advantages of browsing that remote storage of books at HD cannot currently provide.

According to Sidney Verba, Carl H. Pforzheimer University Professor and Director of the University Library, “The possibility of a large-scale digitization of Harvard’s library books does not in any way diminish the University’s commitment to the collection and preservation of books as physical objects. The digital copy will not be a substitute for the books themselves. We will continue actively to acquire materials in all formats and we will continue to conserve them. In fact, as part of the pilot we are developing criteria for identifying books that are too fragile for digitizing and for selecting them out of the project.

“It is clear,” Verba continued, “that the new century presents unparalleled challenges and opportunities to Harvard’s libraries. Our pilot program with Google can prove to be a vital and revealing first step in a lengthy and rewarding process that will benefit generations of scholars and others.”

When Harvard or Google make their official announcement, I’ll link to whatever I can find. I personally think this is really big news. I hope that after other major universities join this movement the Library of Congress will follow. Of course, I’m not happy about Google having a monopoly on this sort of thing but I suspect that these universities will not consent to any kind of exclusive licensing and we will see competing services emerge soon. This is a great day for scholarship and, in my opinion, for democratizing access to knowledge.

UPDATES: 21:00 – The first hit for this on Google News is a KOTV article about this here which just got posted. I’m guessing this will be big news tomorrow. In the article Harvard comes out as one of the least cooperative of the libraries involved in the project. 22:00 – The New York Times has just now posted an article on this. Notice the article is dated December 14th so it is probably in tomorrow’s print edition. The article also had this to add:

Last night the Library of Congress and a group of international libraries from the United States, Canada, Egypt, China and the Netherlands announced a plan to create a publicly available digital archive of one million books on the Internet. The group said it planned to have 70,000 volumes online by next April.

It looks like Harvard is only allowing some 40,000 volumes to start and is being very protective about its collection. I don’t believe for a second Harvard President Summers’s quote in the NYT article saying that Harvard has always held its library to be a “global resource,” especially when, judging only from these initial articles, it seems to be one of the least enthusiastic participants in this new project. Michigan and Stanford are leading the way by committing millions of books in their collections. I’m really excited about this, and I really hope the movement will spread quickly to Japan’s National Diet Library and countless other libraries around the world. This is truly an exciting time!

Obsolete Kana and Other Wacky Combos

Matt had a good question: How do we type some of those obsolete kana like ゐ and ゑ? I found the answer in the Apple ことえり help file which, unlike most Apple help files, was surprisingly helpful. For the above characters you type WYI and WYE, respectively. Japanese is not the only language with this problem. I occasionally forget how I type nü using the pinyin input method for Chinese (the answer is a wonderfully intuitive nv).

UPDATE: In a comment to this posting, my mom added a great link to a page listing various special Roman characters and how to input them. I guess I could also add a plug for my own Pinyin to Unicode Convertor website which you can use to create Unicode pinyin with tone marks.
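For anyone curious what turning typed pinyin into Unicode involves, here is a minimal Python sketch. It is an illustration only and not the code behind the converter site: the names `mark_syllable` and `numbered_to_unicode` are made up for this example, it assumes lowercase input with a tone number after each syllable, and it uses the common placement rule (mark an a or e if there is one, the o in ou, otherwise the last vowel). It also treats a typed v as ü, as in the nv example above.

```python
import re
import unicodedata

# Combining accents for tones 1-4 (hypothetical helper table for this sketch).
TONE_ACCENTS = {1: "\u0304", 2: "\u0301", 3: "\u030c", 4: "\u0300"}
VOWELS = "aeiou\u00fc"  # \u00fc is the letter u with diaeresis

def mark_syllable(syllable: str, tone: int) -> str:
    """Put a tone mark on one lowercase pinyin syllable."""
    syllable = syllable.replace("v", "\u00fc")  # "nv" is typed for nü
    if tone not in TONE_ACCENTS:
        return syllable  # 0 or 5 means neutral tone: no mark
    if "a" in syllable:
        pos = syllable.index("a")
    elif "e" in syllable:
        pos = syllable.index("e")
    elif "ou" in syllable:
        pos = syllable.index("o")
    else:
        positions = [i for i, ch in enumerate(syllable) if ch in VOWELS]
        if not positions:
            return syllable  # nothing to mark
        pos = positions[-1]
    # Add the combining accent after the chosen vowel, then compose
    # the pair into a single precomposed character.
    marked = syllable[:pos + 1] + TONE_ACCENTS[tone] + syllable[pos + 1:]
    return unicodedata.normalize("NFC", marked)

def numbered_to_unicode(text: str) -> str:
    """Convert every 'syllable + digit' run in text, e.g. 'nv3'."""
    return re.sub(
        r"([a-z\u00fc]+)([0-5])",
        lambda m: mark_syllable(m.group(1), int(m.group(2))),
        text,
    )

if __name__ == "__main__":
    print(numbered_to_unicode("nv3"))
    print(numbered_to_unicode("zhang1 zuo4lin2"))
```

Capital letters and apostrophes between syllables are left out to keep the sketch short.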

UPDATE: Adamu pointed out that this list doesn’t have the ヱ from Yebisu beer! The list below is all hiragana. If, while typing Japanese you type “wye” you get ゑ but if you type “wye” with the “shift” key down, you can get the desired ヱ that is familiar to beer fans in Japan.
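The rule above is easy to picture as a tiny lookup. The Python sketch below is only an illustration and not how ことえり works internally: `OBSOLETE_KANA` and `romaji_to_kana` are names invented here, and the `shift` flag simply stands in for holding the shift key while typing. The one outside fact it relies on is that each katakana sits 0x60 code points above its hiragana counterpart in Unicode.

```python
# Keystrokes for the two obsolete hiragana mentioned in this post.
OBSOLETE_KANA = {
    "wyi": "\u3090",  # ゐ
    "wye": "\u3091",  # ゑ
}

# Distance between a hiragana and the matching katakana in Unicode.
KATAKANA_OFFSET = 0x60

def romaji_to_kana(keys: str, shift: bool = False) -> str:
    """Return the kana for the typed keys; shift gives katakana."""
    kana = OBSOLETE_KANA[keys.lower()]
    if shift:
        return chr(ord(kana) + KATAKANA_OFFSET)  # ゑ becomes ヱ
    return kana

if __name__ == "__main__":
    print(romaji_to_kana("wye"))
    print(romaji_to_kana("wye", shift=True))
```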

Kana Chart

Google Print

Google has a “search-in-book” feature now which it appears to be testing. The feature is similar to those offered by Amazon.com, its A9.com search engine, and the commercial online library Questia. It seems to have a limited number of books but I found Louise Young’s Japan’s Total Empire. Just search for the book by title in the normal Google search engine and, if the book turns up, you can enter the feature, where you can see the book synopsis, back cover, reviews, table of contents, and a few sample pages, and use a search engine which shows you all the pages which use a particular search term. You can then view each of these pages and those surrounding it. For Young’s book, search, for example, for “Zhang Zuolin” to see all the pages in which she mentions my favorite bad boy warlord.

Nature: Green and Gold Roads to Open Access

I have been following the progress of the Open Access movement in academic journals as closely as my time allows. I gave a presentation to a number of professors and students at Waseda University which talked a lot about the OA movement and I could tell that others became interested when they heard about it. This movement aims to provide more open access to research articles that are usually only archived in expensive online databases or not online at all. It is making the most progress and getting the most discussion amongst scientists. There is a great blog I may have mentioned before, Open Access News, and there is also a great series of articles in Nature magazine (they also have an RSS feed for the series).

One recent article in this series mentions the fact (often discussed in these articles) that open access articles are cited more often than those only accessible in subscription databases. The authors add more evidence for this from their own research. However, they also point out that some of these open access articles come from subscription-only journals but were “self-archived” and put online by their authors.

One way to estimate [the access problem] is to compare citation counts for Open Access articles with pay-to-access articles. Lawrence4 found that in computer science citations were three times higher for Open Access articles than for papers only available for payment in print or online. Kurtz et al. have since reported similar estimates in astrophysics, and Odlyzko in mathematics.

We are carrying out a much larger study across all disciplines, using a 10-year sample of 14 million articles from the Institute for Scientific Information (ISI)’s database; initial results, for the field of physics, show Open Access articles being cited 2.5 to 5 times more than articles that users’ institutions must pay to access online, with this advantage peaking within about 3 years of an article’s publication.

All these articles were published in subscription-based journals, but some were made accessible because authors had ‘self-archived’ copies on the Web - see http://www.eprints.org/self-faq/. Physicists have been self-archiving in growing numbers since 1991 in a central archive called ArXiv. Computer scientists have been self-archiving on their own websites, which are then harvested by Citeseer.

They go on to discuss the “green” and “gold” (the latter meaning fully open online access) approaches to Open Access. It is a good read. The original article on Nature is online as is a more extensive article by its authors on their findings.

Chronicle: E-Learning Disappointments

The Chronicle had a recent article on “Why the E-Learning Boom Went Bust.” I think one of their points was interesting:

What’s the reality? For the most part, faculty members use the electronics to simplify tasks, not to fundamentally change how they teach their subjects. They readily translate lecture notes into PowerPoint presentations. They use course-management tools like Blackboard and WebCT to distribute class materials, grades, and assignments. But the materials are simply scanned, and the assignments neither look nor feel different. Even when the textbook comes with an interactive CD-ROM, or when the publisher makes the same material available on a proprietary Web site, most faculty members do not assign it. Only modest breakthroughs have occurred — in the use of e-mail to communicate rapidly and directly with students and in the adoption of computerized testing materials.

Indeed, many people believe that the rapid introduction of course-management tools has actually reduced e-learning’s impact on the way most faculty members teach. Blackboard and WebCT make it almost too easy for professors to transfer their standard teaching materials to the Web. While Blackboard’s promotional materials talk about enabling faculty members to use a host of new applications, the specific promises that the software makes to potential users are less dramatic: the ability for them “to manage their own Internet-based file space on a central system and to collect, share, discover, and manage important materials from articles and research papers to presentations and multimedia files.” All that professors need to use the product are the rudimentary electronic-library skills that most have already mastered. Blackboard and WebCT allow the faculty users, when asked, “Are you involved in e-learning?” to respond, “Yes, my courses are already online!”1

If you don’t have database access to the Chronicle, you can read more on their observations by downloading their report “Thwarted Innovation.”

1. Why the E-Learning Boom Went Bust, by Robert Zemsky and William F. Massy, Chronicle of Higher Education (ISSN 0009-5982), 7/9/2004, Vol. 50, Issue 44 (on EBSCO)