Revision as of 17:56, 22 July 2007 by Gavinbaker (talk | contribs) (adding some stuff)

This is the page for the (as yet unnamed) project to create search tools for students assigned public domain texts for class, and promote the public domain and open educational resources (OERs).



What the Project is

  • Databases containing public domain or open-licensed texts that might be assigned in a (college) class (not run by us)
  • API to search those databases and return results with a high signal-to-noise ratio for our purposes
    • Social searching? ("x% of users who searched for 'moby dick' found this useful...")
  • Scripts to format output for various formats and devices
    • Read online
    • Download (plain text, HTML, PDF, other formats?)
    • Print
    • Print-on-demand?
  • End-user interfaces
    • Web site
    • Facebook application
    • Firefox extension?
    • Plug-in to online course management systems (Blackboard, etc.)?
  • Promotional campaign
    • Partners
    • Online outreach
    • Media outreach
    • Chapters & campus outreach


Everything needs a catchy name. What's our idea?

Databases to search

What archives do we want to search?

  • Must contain public domain or open-licensed texts that might be assigned in a (college) class
  • The higher the signal-to-noise ratio, the better

Project Gutenberg


  • How many books?
  • License? (iirc PG's license isn't quite PD...)

The Gutenberg license basically says that if you redistribute a PD title for free and keep their PG trademark on it, you cannot alter the ebook. They also have copyrighted titles (marked as such), which you cannot redistribute without permission. For more, see here.

  • API for searching?

I contacted the site admin once about this, when I wanted to add PG searching to a book inventory system, but never received a response, if I remember correctly. Things could be different now. Scripting their own search forms should not be too difficult, but asking would probably be nicer...

    • Should we just store a copy of every book? We can figure that out later.
  • Existing code for making books look nice?

Internet Archive


I think the concern was that not all of their links were good. Additionally, a good part of their archive is from Project Gutenberg, so there'd be an overlap (somewhat annoying). However, if they have books Project Gutenberg does, but from another source, that's a good thing, and makes it more likely that they're going to have the stuff students are looking for.
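
One way to handle the overlap described above is to group results by work and keep one copy per archive. The following is only a minimal sketch: the `Result` fields, the source labels, and the example.org URLs are hypothetical, and matching on a normalized title and author is an assumption that would miss harder cases (subtitles, differing editions, Internet Archive items that are themselves copies of Project Gutenberg files).

```python
import re
from dataclasses import dataclass


@dataclass(frozen=True)
class Result:
    """One search hit. All field names here are hypothetical."""
    title: str
    author: str
    source: str  # illustrative label such as "gutenberg" or "archive"
    url: str


def _normalize(text):
    """Lowercase and collapse punctuation/whitespace to single spaces."""
    return re.sub(r"[^a-z0-9]+", " ", text.lower()).strip()


def work_key(result):
    """Key that treats 'Moby Dick' and 'Moby-Dick' as the same work."""
    return (_normalize(result.title), _normalize(result.author))


def merge_results(*result_lists):
    """Group hits from several archives by work, one copy per source.

    Returns {work_key: {source: Result}} so that an alternative copy
    from another archive is kept rather than discarded.
    """
    merged = {}
    for results in result_lists:
        for result in results:
            by_source = merged.setdefault(work_key(result), {})
            by_source.setdefault(result.source, result)  # first hit wins
    return merged


# Illustrative data only; the URLs are placeholders.
pg_hits = [
    Result("Moby Dick", "Herman Melville", "gutenberg",
           "https://example.org/pg/moby-dick"),
]
ia_hits = [
    Result("Moby-Dick", "Herman Melville", "archive",
           "https://example.org/ia/moby-dick"),
    Result("Walden", "Henry David Thoreau", "archive",
           "https://example.org/ia/walden"),
]
merged = merge_results(pg_hits, ia_hits)
```

Grouping rather than dropping duplicates keeps the second copy available as a fallback if one archive's link turns out to be bad.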

A small political consideration: Linda Frueh at the IA was very excited about the program, and it might seem rude to design a whole program that they really like yet exclude their archive.



Search API
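
The "social searching" idea from the project outline ("x% of users who searched for 'moby dick' found this useful") could be prototyped roughly as follows. This is a sketch under stated assumptions, not a design decision: the `HelpfulnessTracker` class, its method names, and the in-memory storage are all hypothetical, and a real service would need persistent storage and some protection against vote stuffing.

```python
from collections import defaultdict


class HelpfulnessTracker:
    """Tracks, per (query, url) pair, how many users found a result useful."""

    def __init__(self):
        # (normalized query, url) -> [useful votes, total votes]
        self._votes = defaultdict(lambda: [0, 0])

    @staticmethod
    def _normalize(query):
        return query.strip().lower()

    def record(self, query, url, useful):
        """Store one user's feedback on one result for one query."""
        counts = self._votes[(self._normalize(query), url)]
        counts[0] += int(useful)
        counts[1] += 1

    def percent_useful(self, query, url):
        """Return the share of positive votes as an integer percent,
        or None when nobody has voted on this pair yet."""
        useful, total = self._votes.get((self._normalize(query), url), (0, 0))
        if total == 0:
            return None
        return round(100 * useful / total)

    def rank(self, query, urls):
        """Order urls by percent useful, with unrated results last."""
        def score(url):
            pct = self.percent_useful(query, url)
            return -1 if pct is None else pct
        return sorted(urls, key=score, reverse=True)


tracker = HelpfulnessTracker()
for vote in (True, True, True, False):
    tracker.record("Moby Dick", "https://example.org/a", vote)
tracker.record("moby dick", "https://example.org/c", True)

print(tracker.percent_useful("moby dick", "https://example.org/a"))  # 75
```

Ranking by raw percentage favors results with very few votes; weighting by the number of votes would be a natural refinement once there is real usage data.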

Output formats

Read online


Download

  • Plain text
  • HTML
  • PDF
  • Other formats?


Print

  • PDF?
  • CSS media-type: print?


Audio recording of computer reading (vocoder)

  • Would this be useful?
    • I don't think so. (Who wants to listen to a vocoder for multiple hours?) But we could include open-licensed audio recordings of human readings in the databases we search. --Gavin 02:56, 23 July 2007 (JST)


Student PIRGs

  • (link)
  • Student PIRGs could use this tool in their arsenal in their pro-OER activities

Public Knowledge

  • [5]
  • almost certainly would offer server space

Internet Archive

  • [6]
  • might be interested in some form of collaboration

End-user interfaces

Web site

Brendan made a mockup. If it was online, we could link to it!

Facebook application

  • (link)

Firefox extension

  • (link)

Plug-in to online course management systems

  • Blackboard

Promoting it


What needs to be done? Who will do it when?

The Future

What will happen to the project in one year's time? (If the answer is "No plans (yet)" that's fine, but we still should think about this.)