|
|
Kenneth R. Abbott and Sunil K. Sarin.
Experiences with workflow management: Issues for the next generation.
In Richard Furuta and Christine Neuwirth, editors, CSCW '94,
New York, 1994. ACM.
Workflow management is a technology that is considered
strategically important by many businesses, and its
market growth shows no signs of abating. It is,
however, often viewed with skepticism by the
research community, conjuring up visions of
oppressed workers performing rigidly-defined tasks
on an assembly line. Although the potential for
abuse no doubt exists, workflow management can
instead be used to help individuals manage their
work and to provide a clear context for performing
that work. A key challenge in the realization of
this ideal is the reconciliation of workflow process
models and software with the rich variety of
activities and behaviors that comprise ``real''
work. Our experiences with the InConcert workflow
management system are used as a basis for outlining
several issues that will need to be addressed in
meeting this challenge. This is intended as an
invitation to CSCW researchers to influence this
important technology in a constructive manner by
drawing on research and experience.
|
|
|
Tarek F. Abdelzaher and Nina Bhatti.
Web content adaptation to improve server overload behavior.
In Proceedings of the Eighth International World-Wide Web
Conference, 1999.
This paper presents a study of Web content
adaptation to improve server overload
performance, as well as an implementation of a Web content adaptation
software prototype. When the request rate on a
Web server increases beyond server capacity, the server becomes
overloaded and unresponsive. The TCP listen
queue of the server's socket overflows, exhibiting drop-tail behavior. As a
result, clients experience service outages.
Since clients typically issue multiple requests over the duration of a
session with the server, and since requests are
dropped indiscriminately, all clients
connecting to the server at overload are likely
to experience connection failures, even though
there may be enough capacity on the server to deliver all responses properly
for a subset of clients. In this paper, we
propose to resolve the overload problem by adapting delivered content to load
conditions to alleviate overload. The premise
is that successful delivery of less resource-intensive content under overload
is more desirable to clients than connection
rejection or failures.
|
|
|
Serge Abiteboul, Sophie Cluet, and Tova Milo.
Querying and updating the file.
In Proceedings of the Nineteenth International Conference on
Very Large Databases, pages 73-84, Dublin, Ireland, 1993. VLDB Endowment,
Saratoga, Calif.
|
|
|
Serge Abiteboul, Sophie Cluet, and Tova Milo.
Correspondence and translation for heterogeneous data.
In Proceedings of the 6th International Conference on Database
Theory, Delphi, Greece, 1997. Springer, Berlin.
|
|
|
Marc Abrams, Constantinos Phanouriou, Alan L. Batongbacal, Stephen M. Williams,
and Jonathan E. Shuster.
UIML: An appliance-independent XML user interface language.
In Proceedings of the Eighth International World-Wide Web
Conference, 1999.
Today's Internet appliances feature user
interface technologies almost unknown a few
years ago: touch screens, styli,
handwriting and voice recognition, speech
synthesis, tiny screens, and more. This
richness creates problems. First, different
appliances use different languages: WML for
cell phones; SpeechML, JSML, and VoxML for
voice enabled devices such as phones; HTML and XUL for desktop computers, and
so on. Thus, developers must maintain multiple
source code families to deploy interfaces to one information system
on multiple appliances. Second, user interfaces
differ dramatically in complexity (e.g., PC versus cell phone
interfaces). Thus, developers must also manage
interface content. Third, developers
risk writing appliance-specific interfaces for
an appliance that might not be on the market
tomorrow. A solution is to build
interfaces with a single, universal language
free of assumptions about appliances and
interface technology. This paper
introduces such a language, the User Interface
Markup Language (UIML), an XML-compliant
language. UIML insulates the interface designer from the peculiarities
of different appliances through style sheets. A
measure of the power of UIML is that it can replace hand-coding of Java AWT
or Swing user interfaces.
|
|
|
Mark S. Ackerman.
Providing social interaction in the digital library.
In Proceedings of the First Annual Conference on the Theory and
Practice of Digital Libraries, 1994.
Format: HTML Document (12K).
Audience: Non-technical, digital library researchers/funders.
References: 13.
Links: 2.
Relevance: Low-medium.
Abstract: Argues that social aspects of collaboration must be included
in a Digital Library for the informal, organizational things that aren't always
available in information sources. Mentions a TCL based system called CAFE
that adds functionality of messages, bulletin boards, and talk.
|
|
|
Mark S. Ackerman and Roy T. Fielding.
Collection maintenance in the digital library.
In Proceedings of the Second Annual Conference on the Theory and
Practice of Digital Libraries, 1995.
Format: HTML Document(39K + pictures) .
Audience: Librarians, web masters.
References: 27.
Links: 2.
Relevance: Low.
Abstract: Discusses the problem of collection maintenance in the digital
domain, and argues that while some traditional practices will carry over, new
methods will have to be created, esp. for dynamic and informal resources.
Suggests that some maintenance can be done automatically by agents, and gives 2
examples: MOMSpider, which checks to make sure links are still current and
Web:Lookout which notifies user when interesting changes are made to a watched
page.
|
|
|
Michael J. Ackerman.
Accessing the visible human project.
D-Lib Magazine, October 1995.
Format: HTML Document(11K).
Audience: Medical professionals.
References: 1.
Links: 5.
Relevance: None.
Abstract: Describes the Visible Human Project (1 mm
cross sections of two cadavers), how to obtain the
images, how large they are, what IP agreements need
to be signed.
|
|
|
R. Acuff, L. Fagan, T. Rindfleisch, B. Levitt, and P. Ford.
Lightweight, mobile e-mail for intra-clinic communication.
In Proceedings of the 1997 AMIA Annual Fall Symposium, pages
729-33, Oct 1997.
|
|
|
N. Adam, Y. Yesha, B. Awerbuch, K. Bennet, B. Blaustein, A. Brodsky, R. Chen,
O. Dogramaci, B. Grossman, R. Holowczak, J. Johnson, K. Kalpakis, C. McCollum,
A.-L. Neches, B. Neches, A. Rosenthal, J. Slonim, H. Wactlar, and O. Wolfson.
Strategic directions in electronic commerce and digital libraries:
towards a digital agora.
ACM Computing Surveys, 28(4):818-35, December 1996.
The paper examines the research requirements of electronic
commerce and digital libraries in six key areas. It provides
case studies that describe three electronic commerce research
projects (USC-ISI, CommerceNet, First Virtual) and six
digital libraries projects sponsored by an NSF/ARPA/NASA
initiative. The paper focuses on the following common areas
of EC and DL research: acquiring and storing information;
finding and filtering information; securing information and
auditing access; universal access; cost management and
financial instruments; and socio-economic impact.
|
|
|
Anne Adams and Ann Blandford.
Digital libraries’ support for the user’s ‘information journey’.
In Proceedings of the Fifth ACM/IEEE-CS Joint Conference on
Digital Libraries, 2005.
The temporal elements of users’ information requirements are a continually confounding aspect of digital library design. No sooner have users’ needs been identified and supported than they change. This paper evaluates the changing information requirements of users through their ‘information journey’ in two different domains (health and academia). In-depth analysis of findings from interviews, focus groups and observations of 150 users has identified three stages to this journey: information initiation, facilitation (or gathering) and interpretation. The study shows that, although digital libraries are supporting aspects of users’ information facilitation, there are still requirements for them to better support users’ overall information work in context. Users are poorly supported in the initiation phase, as they recognize their information needs, especially with regard to resource awareness; in this context, interactive press-alerts are discussed. Some users (especially clinicians and patients) also required support in the interpretation of information, both satisfying themselves that the information is trustworthy and understanding what it means for a particular individual.
|
|
|
Eytan Adar and Jeremy Hylton.
On-the-fly hyperlink creation for page images.
In Proceedings of the Second Annual Conference on the Theory and
Practice of Digital Libraries, 1995.
Format: HTML Document () .
Audience: Digital library researchers.
References: 9.
Links: 0.
Relevance: Low.
Abstract: Store pages as bitmaps, and retrieve a cite when user clicks on
it, by doing OCR, then passing relevant line to library catalog, as 12 queries
of 3 words each (randomly selected from the line) and returning the best scoring
results. Somewhat robust to typos in cites, but not too slow.
|
|
|
Paul S. Adler and Terry Winograd, editors.
Usability: Turning technologies into tools.
Oxford University Press, 1992.
|
|
|
Eugene Agichtein and Luis Gravano.
Snowball: Extracting relations from large plain-text collections.
In Proceedings of the Fifth ACM International Conference on
Digital Libraries, 2000.
Text documents often contain valuable structured data
that is hidden in regular English sentences. This data is best
exploited if available as a relational table that we could use for
answering precise queries or for running data mining tasks. We
explore a technique for extracting such tables from
document collections that requires only a handful of training examples
from users. These examples are used to generate extraction patterns,
that in turn result in new tuples being extracted from the document
collection. We build on this idea and present our Snowball system.
Snowball introduces novel strategies for generating patterns and
extracting tuples from plain-text documents. At each iteration of the
extraction process, Snowball evaluates the quality of these patterns
and tuples without human intervention, and keeps only the most
reliable ones for the next iteration. In this paper we also develop a
scalable evaluation methodology and metrics for our task, and present
a thorough experimental evaluation of Snowball and comparable techniques
over a collection of more than 300,000 newspaper documents.
|
|
|
Maristella Agosti, Nicola Ferro, and Nicola Orio.
Annotating illuminated manuscripts: an effective tool for research
and education.
In Proceedings of the Fifth ACM/IEEE-CS Joint Conference on
Digital Libraries, 2005.
The aim of this paper is to report the research results of an ongoing project that deals with the exploitation of a digital archive of drawings and illustrations of historic documents for research and education purposes. Based on the results of a study of user requirements, we designed tools to provide researchers with novel ways for accessing the digital manuscripts, sharing, and transferring knowledge in a collaborative environment. Annotations are proposed for making explicit the results of scientific research on the relationships between images belonging to manuscripts produced in a time span of centuries. For this purpose, a taxonomy for linking annotation is proposed, together with a conceptual schema for representing annotations and for linking them to digital objects.
|
|
|
Rakesh Agrawal, Tomasz Imielinski, and Arun Swami.
Mining association rules between sets of items in large databases.
In Proceedings of the International Conference on Management of
Data, pages 207-216. ACM Press, 1993.
|
|
|
Alfred Aho, John Hopcroft, and Jeffrey Ullman.
Data Structures and Algorithms.
Addison-Wesley, 1983.
|
|
|
T. Alanko, M. Kojo, M. Liljeberg, and K. Raatikainen.
Mowgli: improvements for internet applications using slow wireless
links.
In Waves of the Year 2000+ PIMRC '97. The 8th IEEE International
Symposium on Personal, Indoor and Mobile Radio Communications. Technical
Program, Proceedings (Cat. No.97TH8271), volume 3, pages 1038-42, 1997.
Modern cellular telephone systems extend the usability of
portable personal computers enormously. A nomadic user can be
given ubiquitous access to remote information stores and
computing services. However, the behavior of wireless links
creates severe inconveniences within the traditional data
communication paradigm. We give an overview of the problems
related to wireless mobility. We also present a new software
architecture for mastering the problems and discuss a new
paradigm for designing mobile distributed applications. The
key idea in the architecture is to place a mediator, a
distributed intelligent agent, between the mobile node and
the wireline network.
|
|
|
Reka Albert, Albert-Laszlo Barabasi, and Hawoong Jeong.
Diameter of the World Wide Web.
Nature, 401(6749), September 1999.
|
|
|
Alexa internet inc.
http://www.alexa.com.
|
|
|
R. B. Allen.
Interface issues for interactive multimedia documents.
In Advances in Digital Libraries '95, 1995.
Format: Not Yet Online.
|
|
|
Robert B. Allen.
Navigating and searching in hierarchical digital library catalogs.
In Proceedings of the First Annual Conference on the Theory and
Practice of Digital Libraries, 1994.
Format: HTML Document (21K) .
Audience: non technical, users.
References: 15.
Links: 2.
Relevance: Low.
Abstract: Describes a particular user interface based on a book shelf
metaphor. Tries to use an a priori classification (Dewey Decimal System) as an
organization tool (in addition to results of electronic searches).
|
|
|
Robert B. Allen.
Two digital library interfaces which exploit hierarchical structure.
In DAGS '95, 1995.
Format: HTML Document(33K + pictures) .
Audience: General computer scientists, HCI.
References: 22.
Links: 1.
Relevance: Low-Medium.
Abstract: Uses metaphor of hierarchical Dewey Decimal system or
faceted (implying a DAG) ACM literature categories to aid
UI. Shows graphically where in the hierarchy hits were
found for a search.
|
|
|
Robert B. Allen.
A query interface for an event gazetteer.
In Proceedings of the Fourth ACM/IEEE-CS Joint Conference on
Digital Libraries, 2004.
We introduce the idea of an ``event gazetteer''
that stores and presents locations in time. Each event is coded as a
schema with attributes of event type, location, actor, and beginning
and ending times. Sets of events can be collected as timelines and the
events on these timelines can be linked by annotations. The system has
been built with JSP and Oracle. Systematic metadata is essential for
effective interaction with this system. For instance, the actors may be
described by the roles in which they participate. In this paper, we
focus on the construction of queries for this complex metadata.
Ultimately, we envision a flexible, broad-based service that is a
resource for users ranging from students to genealogists interested in
events.
|
|
|
Robert B. Allen.
A multi-timeline interface for historical newspapers.
In Proceedings of the Fifth ACM/IEEE-CS Joint Conference on
Digital Libraries, 2005.
Events are best understood in the context of other events. Because of the temporal ordering, we can call a set of related events a timeline. Even such timelines are best understood in the context of other timelines. To facilitate the exploration of a collection of timelines and events, a visualization tool has been developed that structures the user's browsing. In this model, each event is accompanied by a text description and links to related resources. In particular, this system can provide a browsing interface of digitized historical newspapers.
|
|
|
Robert B. Allen and Jane Acheson.
Browsing the structure of multimedia stories.
In Proceedings of the Fifth ACM International Conference on
Digital Libraries, 2000.
Stories may be analyzed as sequences of causally-related events
and reactions to those events by the characters. We employ a notation of plot
elements, similar to one developed by Lehnert, and we extend that by forming
higher level story threads. This notation requires that events and
reactions be linked and that the chains of links be terminated back to the
beginning of the story. Furthermore, we have built a browser for the plot
elements, the story threads, and associated multimedia. We apply the browser
to Corduroy, a children's short feature which was analyzed in detail. We
provide additional illustrations with analysis of Kiss of Death, a Film
Noir classic. Effectively, the browser provides a framework for
interactive summaries of the narrative.
|
|
|
Open Mobile Alliance.
Wireless application protocol.
http://www.openmobilealliance.org/tech/affiliates/wap/wapindex.html#wap20,
2001.
The WAP Web site, from which the specs are available.
|
|
|
Virgilio Almeida, Azer Bestavros, Mark Crovella, and Adriana de Oliveira.
Characterizing reference locality in the WWW.
In Proceedings of PDIS'96: The IEEE Conference on Parallel and
Distributed Information Systems, 1996.
|
|
|
Virgilio A.F. Almeida, Wagner Meira Jr., Victor F. Ribeiro, and Nivio Ziviani.
Efficiency analysis of brokers in the electronic marketplace.
In Proceedings of the Eighth International World-Wide Web
Conference, 1999.
In this paper we analyze the behavior of e-commerce
users based on actual logs from two large non-English e-brokers.
We start by presenting a quantitative study of the behavior of e-brokers and
discuss the influence of regional and cultural
issues on them. We then discuss a model that
quantifies the efficiency of the results
provided by brokers in the electronic
marketplace. This model is a function of
factors such as server response time and
regional factors. Our findings clearly
indicate that e-commerce is strongly tied to
local language, national customs and
regulations, currency conversion and
logistics, and Internet infrastructure. We
found that the behavior of customers of online
bookstores is strongly affected
by brand and regional factors. Music CD
shoppers show a different behavior that might
stem from the fact that music is
universal and not so language dependent.
|
|
|
Altavista incorporated.
http://www.altavista.com.
|
|
|
Amazon inc.
http://www.amazon.com.
|
|
|
Jose-Luis Ambite and Craig A. Knoblock.
Reconciling distributed information sources.
In AAAI Spring Symposium on Information Gathering, 1995.
Format: Compressed PostScript().
|
|
|
B. Amento, L. Terveen, and W. Hill.
Does authority mean quality? Predicting expert quality ratings of
web documents.
In Proceedings of the Twenty-Third Annual International ACM
SIGIR Conference on Research and Development in Information Retrieval. ACM,
2000.
Evaluates different link-based ranking techniques.
|
|
|
Einat Amitay, Nadav Har'El, Ron Sivan, and Aya Soffer.
Web-a-where: geotagging web content.
In SIGIR '04: Proceedings of the 27th annual international
conference on Research and development in information retrieval, pages
273-280. ACM Press, 2004.
|
|
|
E. Amoroso.
Fundamentals of Computer Security Technology.
Prentice Hall, Englewood Cliffs, NJ, 1994.
|
|
|
H. Anan, X. Liu, K. Maly, M. Nelson, M. Zubair, J. C. French, E. Fox, and
P. Shivakumar.
Preservation and transition of NCSTRL using an OAI-based
architecture.
In Proceedings of the Second ACM/IEEE-CS Joint Conference on
Digital Libraries, 2002.
NCSTRL (Networked Computer Science Technical Reference Library) is
a federation of digital libraries providing computer science materials. The
architecture of the original NCSTRL was based largely on the Dienst software.
It was implemented and maintained by the digital library group at Cornell
University until September 2001. At that time, we had an immediate goal of
preserving the existing NCSTRL collection and a long-term goal of providing a
framework where participating organizations could continue to disseminate
technical publications. Moreover, we wanted the new NCSTRL to be based on
OAI (Open Archives Initiative) principles that provide a framework to facilitate
the discovery of content in distributed archives. In this paper, we describe our
experience in moving towards an OAI-based NCSTRL.
|
|
|
Dan Ancona, Jim Frew, Greg Janée, and Dave Valentine.
Accessing the Alexandria Digital Library from geographic information
systems.
In Proceedings of the Fourth ACM/IEEE-CS Joint Conference on
Digital Libraries, 2004.
We describe two experimental desktop library
clients that offer improved access to geospatial data via the
Alexandria Digital Library (ADL): ArcADL, an extension to ESRI's
ArcView GIS, and vtADL, an extension to the Virtual Terrain Project's
Enviro terrain visualization package. ArcADL provides a simplified user
interface to ADL's powerful underlying distributed geospatial search
technology. Both clients use the ADL Access Framework to access library
data that is available in multiple formats and retrievable by multiple
methods. Issues common to both clients and future scenarios are also
considered.
|
|
|
Kenneth M. Anderson, Aaron Andersen, Neet Wadhwani, and Laura M. Bartolo.
Metis: Lightweight, flexible, and web-based workflow services for
digital libraries.
In Proceedings of the Third ACM/IEEE-CS Joint Conference on
Digital Libraries, 2003.
The Metis project is developing workflow technology
designed for use in digital libraries by avoiding the assumptions
made by traditional workflow systems. In particular, digital
libraries have highly distributed sets of stakeholders who
nevertheless must work together to perform shared activities.
Hence, traditional assumptions that all members of a workflow belong
to the same organization, work in the same fashion, or have access to
similar computing platforms are invalid. The Metis approach makes use
of event-based workflows to support the distributed nature of digital
library workflow and employs techniques to make the resulting
technology lightweight, flexible, and integrated with the Web.
This paper describes the conceptual framework behind the Metis approach
as well as a prototype which implements the framework. The prototype
is evaluated based on its ability to model and execute a workflow drawn
from a real-world digital library. After describing related work, the
paper concludes with a discussion of future research opportunities in
the area of digital library workflow and outlines how Metis is being
deployed to a small set of digital libraries for additional evaluation.
|
|
|
R. Anderson and M. Kuhn.
Tamper resistance - a cautionary note.
In Proceedings of the Second USENIX Workshop on Electronic
Commerce, Berkeley, CA, USA, 1996. USENIX Assoc.
An increasing number of systems, from pay-TV to electronic
purses, rely on the tamper resistance of smartcards and other
security processors. We describe a number of attacks on such
systems, some old, some new and some that are simply little
known outside the chip testing community. We conclude that
trusting tamper resistance is problematic; smartcards are
broken routinely, and even a device that was described by a
government signals agency as `the most secure processor
generally available' turns out to be vulnerable. Designers of
secure systems should consider the consequences with care.
|
|
|
R. Anderson, C. Manifavas, and C. Sutherland.
Netcard - a practical electronic cash system.
In Fourth Cambridge Workshop on Security Protocols, 1996.
|
|
|
R.C. Angell, G.E. Freund, and P. Willett.
Automatic spelling correction using a trigram similarity measure.
Information Processing and Management, 19(4):255-261, 1983.
|
|
|
ANSI/NISO.
Information Retrieval: Application Service Definition and
Protocol Specification, April 1995.
Available at http://lcweb.loc.gov/z3950/agency/document.html.
|
|
|
Vinod Anupam, Alain Mayer, Kobbi Nissim, Benny Pinkas, and Michael K. Reiter.
On the security of pay-per-click and other web advertising schemes.
In Proceedings of the Eighth International World-Wide Web
Conference, 1999.
We present a hit inflation attack on pay-per-
click Web advertising schemes. Our attack is
virtually impossible for the
program provider to detect conclusively,
regardless of whether the provider is a third-party
``ad network'' or the target of
the click itself. If practiced widely, this
attack could accelerate a move away from
pay-per-click programs, and toward
programs in which referrers are paid only if
the referred user subsequently makes a
purchase (pay-per-sale) or engages in
other substantial activity at the target site
(pay-per-lead). We also briefly discuss the
lack of auditability inherent in these
schemes.
|
|
|
Kyoichi Arai, Teruo Yokoyama, and Yutaka Matsushita.
A window system with leafing through mode: BookWindow.
In Proceedings of the Conference on Human Factors in Computing
Systems CHI'92, 1992.
|
|
|
Avi Arampatzis, Marc van Kreveld, Iris Reinbacher, Paul Clough, Hideo Joho,
Mark Sanderson, Christopher B. Jones, Subodh Vaid, Marc Benkert, and
Alexander Wolff.
Web-based delineation of imprecise regions.
In Proceedings of the Workshop on Geographic Information
Retrieval, 2004.
|
|
|
Arvind Arasu, Junghoo Cho, Hector Garcia-Molina, Andreas Paepcke, and Sriram
Raghavan.
Searching the web.
ACM Transactions on Internet Technology, 2001.
Submitted for publication. Available at
http://dbpubs.stanford.edu/pub/2000-37.
We offer an overview of current Web search engine
design. After introducing a generic search engine
architecture, we examine each engine component in
turn. We cover crawling, local Web page storage,
indexing, and the use of link analysis for boosting
search performance. The most common design and
implementation techniques for each of these components
are presented. We draw for this presentation from the
literature, and from our own experimental search engine
testbed. Emphasis is on introducing the fundamental
concepts, and the results of several performance
analyses we conducted to compare different designs.
|
|
|
William Y. Arms.
Key concepts in the architecture of the digital library.
D-Lib Magazine, Jul 1995.
Format: HTML Document(18K + pictures).
Audience: computer scientists, digital library
researchers.
References: 1.
Links: 3.
Relevance: Medium-low.
Abstract: Outlines 8 principles that are important to
DLs, a combination of social/economic issues (avoid
using words like ``copy'' and ``publish'') and
technical ones (basically a sales pitch for the
Kahn/Wilensky model of handles, maintenance, and
access control.)
|
|
|
William Y. Arms.
Key concepts in the architecture of the digital library.
D-Lib Magazine, July 1995.
|
|
|
R. Armstrong, D. Freitag, T. Joachims, and T. Mitchell.
WebWatcher: A learning apprentice for the world wide web.
In AAAI Spring Symposium on Information Gathering, 1995.
We describe an information seeking assistant for the world
wide web. This agent, called WebWatcher, interactively helps users locate
desired information by employing learned knowledge about which hyperlinks
are likely to lead to the target information.
|
|
|
Robert Armstrong, Dayne Freitag, Thorsten Joachims, and Tom Mitchell.
WebWatcher: A learning apprentice for the world wide web.
In AAAI Spring Symposium on Information Gathering, 1995.
Format: Compressed PostScript().
|
|
|
Kenneth Arnold.
The body in the virtual library: Rethinking scholarly communication.
In JEP.
Format: HTML Document (41K) .
Audience: Scholars, publishers (esp. university press), librarians.
References: 10.
Links: 1.
Relevance: Low-Medium.
Abstract: Discusses the future of university presses, in pretty grim
terms. Suggests that they lack the capital, staff, and quick reaction time to
survive in an electronic world. Considers the Mellon report on scholarly
communication (which suggests universities get copyrights on books their faculty
produce) unreasonable. Thinks that relying on commercial network providers
(esp. cable, telecom) would be disastrous. Advocates a non-profit distribution
network for scholarly publication.
|
|
|
Kenneth Arnold.
The electronic librarian is a verb/the electronic library is not a
sentence.
In JEP, 1994.
Format: HTML Document (49K) .
Audience: Librarians, policy makers.
References: 10.
Links: 1.
Relevance: Low.
Abstract: A vision of the networked library. Sees the real value of
librarians as creating attention structures which anticipate the way clients
search.
|
|
|
Dennis S. Arnon.
Scrimshaw: a language for document queries and transformations.
Electronic Publishing: Origination, Dissemination and Design,
6(4):361-372, December 1993.
|
|
|
J. Ashley, M. Flickner, J. Hafner, D. Lee, W. Niblack, and D. Petkovic.
The query by image content (QBIC) system.
In Proceedings of the International Conference on Management of
Data (SIGMOD). ACM Press, 1995.
|
|
|
N. Asokan, P.A. Janson, M. Steiner, and M. Waidner.
The state of the art in electronic payment systems.
Computer, 30(9):28-35, September 1997.
The exchange of goods conducted face-to-face between two
parties dates back to before the beginning of recorded
history. Traditional means of payment have always had
security problems, but now electronic payments retain the
same drawbacks and add some risks. Unlike paper, digital
documents can be copied perfectly and arbitrarily often,
digital signatures can be produced by anybody who knows the
secret cryptographic key, and a buyer's name can be
associated with every payment, eliminating the anonymity of
cash. Without new security measures, widespread electronic
commerce is not viable. On the other hand, properly designed
electronic payment systems can actually provide better
security than traditional means of payments, in addition to
flexibility. This article provides an overview of electronic
payment systems, focusing on issues related to security.
|
|
|
Active Server Pages technology.
http://msdn.microsoft.com/workshop/server/asp/aspfeat.asp.
|
|
|
R. Atkinson, A. Demers, C. Hauser, C. Jacobi, P. Kessler, and M. Weiser.
Experiences creating a Portable Cedar.
SIGPLAN Notices, 24(7):322-8, 1989.
The authors have recently re-implemented the Cedar language to
make it portable across many different architectures. The
strategy was, first, to use machine-dependent C code as an
intermediate language, second, to create a
language-independent layer known as the Portable Common
Runtime, and third, to write a relatively large amount of
Cedar-specific runtime code in a subset of Cedar itself. The
paper presents a brief description of the Cedar language, the
portability strategy for the compiler and runtime, the manner
of making connections to other languages and the Unix
operating system, and some performance measures of the
Portable Cedar.
|
|
|
Neal Audenaert, Richard Furuta, Eduardo Urbina, Jie Deng, Carlos Monroy, Rosy
Sáenz, and Doris Careaga.
Integrating collections at the Cervantes Project.
In Proceedings of the Fifth ACM/IEEE-CS Joint Conference on
Digital Libraries, 2005.
Unlike many efforts that focus on supporting scholarly research by developing large-scale, general resources for a wide range of audiences, we at the Cervantes Project have chosen to focus more narrowly on developing resources in support of ongoing research about the life and works of a single author, Miguel de Cervantes Saavedra (1547-1616). This has led to a group of hypertextual archives, tightly integrated around the narrative and thematic structure of Don Quixote. This project is typical of many humanities research efforts and we discuss how our experiences inform the broader challenge of developing resources to support humanities research.
|
|
|
Cyrus Azarbod and William Perrizo.
Building concept hierarchies for schema integration in HDDBS using
incremental concept formation.
In B. Bhargava, T. Finin, and Y. Yesha, editors, CIKM 93.
Proceedings of the Second International Conference on Information and
Knowledge Management, pages 732-734, Washington, D.C., November 1993. ACM.
|
|
|
Sulin Ba, Aimo Hinkkanen, and Andre B. Whinston.
Digital library as a foundation for decision support systems.
In Proceedings of the First Annual Conference on the Theory and
Practice of Digital Libraries, 1994.
Format: HTML Document (43K).
Audience: Semi-technical, business slant, funding proposal.
References: 14.
Links: 1.
Relevance: Low.
Abstract: Sees a DL as an enterprise wide collection of
*executable* documents. SGML and Mathematica suggested
as integration tools. Search for data representation
which will allow automatic combination of separate
documents to solve problems.
|
|
|
D. Bachiochi, M. Berstene, E. Chouinard, N. Conlan, M. Danchak, T. Furey,
C. Neligon, and D. Way.
Usability studies and designing navigational aids for the World Wide
Web.
In Proceedings of the Sixth International World-Wide Web
Conference, 1997.
|
|
|
B. R. Badrinath.
Distributed computing in mobile environments.
Computers & Graphics, 20(5):615-17, 1996.
Rapid progress in hardware has led to the availability of
portable personal computers ranging from laptops to hand-held
computers (PDAs and Internet terminals). The presence of
wireless connectivity gives these hand-held units the
capability of accessing information anywhere, at any time.
These mobile units can be considered to be part of a
worldwide distributed information system. Distributed
computing in mobile environments faces new challenges as more
and more mobile hosts become an integral part of a
distributed system. Problems in distributed computing in
mobile environments are due to: (1) mobility, (2) wireless
and (3) resource constraints at the mobile host. In this
paper, we discuss the impact of these factors and research
issues that need to be addressed in mobile distributed
systems.
|
|
|
Ricardo Baeza-Yates and Berthier Ribeiro-Neto.
Modern Information Retrieval.
Addison-Wesley-Longman, May 1999.
The chapters of the book are:
Introduction
Modeling
Retrieval Evaluation
Query Languages (with Gonzalo Navarro)
Query Operations
Text and Multimedia Languages and Properties
Text Operations (with Nivio Ziviani)
Indexing and Searching (with Gonzalo Navarro)
Parallel and Distributed IR (by Eric Brown)
User Interfaces and Visualization (by Marti Hearst)
Multimedia IR: Models and Languages
(by Elisa Bertino, Barbara Catania and Elena Ferrari)
Multimedia IR: Indexing and Searching (by Christos Faloutsos)
Searching the Web
Libraries and Bibliographic Systems (by Edie Rasmussen)
Digital Libraries (by Edward Fox and Ohm Sornil)
Appendix: Porter's Algorithm
Glossary
References (more than 800)
Index
More information can be found in:
http://www.sims.berkeley.edu/~hearst/irbook
|
|
|
David Bainbridge, Craig G. Nevill-Manning, Ian H. Witten, Lloyd A. Smith, and
Rodger J. McNab.
Towards a digital library of popular music.
In Proceedings of the Fourth ACM International Conference on
Digital Libraries, 1999.
Digital libraries of music have the potential to capture popular
imagination in ways that more scholarly libraries cannot.
We are working towards a comprehensive digital library of
musical material, including popular music. We have developed
new ways of collecting musical material, accessing it
through searching and browsing, and presenting the results
to the user. We work with different representations of music:
facsimile images of scores, the internal representation of a
music editing program, page images typeset by a music editor,
MIDI files, audio files representing sung user input, and
textual metadata such as title, composer and arranger, and
lyrics. This paper describes a comprehensive suite of tools that we
have built for this project. These tools gather musical material,
convert between many of these representations, allow
searching based on combined musical and textual criteria,
and help present the results of searching and browsing. Although
we do not yet have a single fully-blown digital music
library, we have built several exploratory prototype collections
of music, some of them very large (100,000 tunes), and
critical components of the system have been evaluated.
|
|
|
David Bainbridge, John Thompson, and Ian H. Witten.
Assembling and enriching digital library collections.
In Proceedings of the Third ACM/IEEE-CS Joint Conference on
Digital Libraries, 2003.
People who create digital libraries need to gather together the raw material,
add metadata as necessary, and design and build new collections. This paper sets out the
requirements for these tasks and describes a new tool that supports them interactively, making
it easy for users to create their own collections from electronic files of all types. The process
involves selecting documents for inclusion, coming up with a suitable metadata set, assigning
metadata to each document or group of documents, designing the form of the collection in terms
of document formats, searchable indexes, and browsing facilities, building the necessary indexes
and data structures, and putting the collection in place for others to use. All these tasks are
supported within a modern point-and-click interaction paradigm. Although the tool is specific to the
Greenstone digital library software, the underlying ideas should prove useful in more general contexts.
|
|
|
M. Baker.
Changing communication environments in MosquitoNet.
In Proceedings of the IEEE Workshop on Mobile Computing Systems
and Applications, Dec 1994.
|
|
|
M. Baker, X. Zhao, S. Cheshire, and J. Stone.
Supporting mobility in MosquitoNet.
In Proceedings of the 1996 USENIX Conference, Jan 1996.
|
|
|
Scott Baker and John H. Hartman.
The Gecko NFS Web proxy.
In Proceedings of the Eighth International World-Wide Web
Conference, 1999.
The World-Wide Web provides remote access to
pages using its own naming scheme (URLs),
transfer protocol (HTTP),
and cache algorithms. Not only does using these
special-purpose mechanisms have performance
implications, but they
make it impossible for standard Unix
applications to access the Web. Gecko is a
system that provides access to the Web
via the NFS protocol. URLs are mapped to Unix
file names, providing unmodified applications
access to Web pages; pages
are transferred from the Gecko server to the
clients using NFS instead of HTTP,
significantly improving performance; and
NFS's cache consistency mechanism ensures that
all clients have the same version of a page.
Applications access pages as
they would Unix files. A client-side proxy
translates HTTP requests into file accesses,
allowing existing Web applications
to use Gecko. Experiments performed on our
prototype show that Gecko is able to provide
this additional functionality at a
performance level that exceeds that of HTTP.
|
|
|
Scott M. Baker and Bongki Moon.
Distributed cooperative web servers.
In Proceedings of the Eighth International World-Wide Web
Conference, 1999.
Traditional techniques for a distributed web
server design rely on manipulation of
central resources, such as routers or
DNS services, to distribute requests
designated for a single IP address to
multiple web servers. The goal of the
distributed
cooperative Web server (DCWS) system
development is to explore application-level
techniques for distributing web
content. We achieve this by dynamically
manipulating the hyperlinks stored within
the web documents themselves. The
DCWS system effectively eliminates the
bottleneck of centralized resources, while
balancing the load among distributed
web servers. DCWS servers may be located in
different networks, or even different
continents and still balance load
effectively. DCWS system design is fully
compatible with existing HTTP protocol
semantics and existing web client
software products.
|
|
|
M. Balabanovic and Y. Shoham.
Learning information retrieval agents: Experiments with automated web
browsing.
In AAAI spring symposium on Information Gathering, 1995.
The current exponential growth of the Internet precipitates
a need for new tools to help people cope with the volume of
information. To complement recent work on creating searchable indexes of
the World-Wide Web and systems for filtering incoming e-mail and Usenet
news articles, we describe a system which helps users keep abreast of new
and interesting information. Every day it presents a selection of
interesting web pages. The user evaluates each page, and given this
feedback the system adapts and attempts to produce better pages the
following day. We present some early results from an AI programming class
to whom this was set as a project, and then describe our current
implementation. Over the course of 24 days the output of our system was
compared to both randomly-selected and human-selected pages. It
consistently performed better than the random pages, and was better than
the human-selected pages half of the time.
|
|
|
M. Balabanovic and Y. Shoham.
Fab: content-based collaborative recommendation.
Communications of the ACM, 40(3):66-72, March 1997.
Online readers are in need of tools to help them cope with the
mass of content that is available on the World Wide Web. In
traditional media, readers are provided assistance in making
selections. This includes both implicit assistance in the
form of editorial oversight and explicit assistance in the
form of recommendation services such as movie reviews and
restaurant guides. The electronic medium offers new
opportunities to create recommendation services, ones that
adapt over time to track users' evolving interests. Fab is
such a recommendation system for the Web, and has been
operational in several versions since December 1994. By
combining both collaborative and content-based filtering
systems, Fab may eliminate many of the weaknesses found in
each approach.
|
|
|
M. Balabanovic, Y. Shoham, and Y. Yun.
An adaptive agent for automated web browsing.
Journal of Visual Communication and Image Representation, 6(4),
December 1995.
|
|
|
Marko Balabanovic.
An adaptive web page recommendation service.
In Proceedings of the First International Conference on
Autonomous Agents, pages 378-385, February 1997.
|
|
|
Marko Balabanovic.
Exploring versus exploiting when learning user models for text
recommendation.
User Modeling and User-Adapted Interaction (to appear), 8(1),
1998.
|
|
|
Marko Balabanovic.
An interface for learning multi-topic user profiles from implicit
feedback.
Technical Report SIDL-WP-1998-0089, Stanford University, 1998.
|
|
|
Marko Balabanovic.
The ``slider'' interface.
IBM interVisions, 11, February 1998.
|
|
|
Marko Balabanovic, Lonny L. Chu, and Gregory J. Wolff.
Storytelling with digital photographs.
In CHI '00: Proceedings of the SIGCHI conference on Human
factors in computing systems, pages 564-571, New York, NY, USA, 2000. ACM
Press.
|
|
|
Marko Balabanovic and Yoav Shoham.
Learning information retrieval agents: Experiments with automated web
browsing.
In Proceedings of the AAAI Spring Symposium on Information
Gathering from Heterogeneous, Distributed Resources, 1995.
Format: Compressed PostScript
|
|
|
Marko Balabanovic and Yoav Shoham.
Combining content-based and collaborative recommendation.
Communications of the ACM, 40(3), March 1997.
|
|
|
Marko Balabanovic, Yoav Shoham, and Yeogirl Yun.
An adaptive agent for automated web browsing.
Journal of Visual Communication and Image Representation, 6(4),
December 1995.
You give the agent a profile. It looks at the Web for things of
interest and reports back. You give feedback.
|
|
|
Michelle Baldonado.
Searching, browsing, and metasearching with SenseMaker.
Web Techniques Magazine, May 1997.
|
|
|
Michelle Baldonado, Chen-Chuan K. Chang, Luis Gravano, and Andreas Paepcke.
Metadata for digital libraries: Architecture and design rationale.
Technical Report SIDL-WP-1997-0055; 1997-26, Stanford University,
1997.
Accessible at http://dbpubs.stanford.edu/pub/1997-26.
In a distributed, heterogeneous, proxy-based digital
library, autonomous services and collections are accessed
indirectly via proxies. To facilitate metadata
compatibility and interoperability in such a digital
library, we have designed a metadata architecture that
includes four basic component classes: attribute model
proxies, attribute model translators, metadata facilities
for search proxies, and metadata repositories. Attribute
model proxies elevate both attribute sets and the
attributes they define to first-class objects. They also
allow relationships among attributes to be captured.
Attribute model translators map attributes and attribute
values from one attribute model to another (where
possible). Metadata facilities for search proxies provide
structured descriptions both of the collections to which
the search proxies provide access and of the search
capabilities of the proxies. Finally, metadata repositories
accumulate selected metadata from local instances of the
other three component classes in order to facilitate global
metadata queries and local metadata caching. In this paper,
we outline further the roles of these component classes,
discuss our design rationale, and analyze related work.
|
|
|
Michelle Baldonado, Chen-Chuan K. Chang, Luis Gravano, and Andreas Paepcke.
Metadata for digital libraries: Architecture and design rationale.
In Proceedings of the Second ACM International Conference on
Digital Libraries, pages 47-56, 1997.
At http://dbpubs.stanford.edu/pub/1997-26.
In a distributed, heterogeneous, proxy-based digital
library, autonomous services and collections are accessed
indirectly via proxies. To facilitate metadata compatibility
and interoperability in such a digital library, we have
designed a metadata architecture that includes four basic
component classes: attribute model proxies, attribute model
translators, metadata facilities for search proxies, and
metadata repositories. Attribute model proxies elevate both
attribute sets and the attributes they define to first-class
objects. They also allow relationships among attributes to
be captured. Attribute model translators map attributes and
attribute values from one attribute model to another (where
possible). Metadata facilities for search proxies provide
structured descriptions both of the collections to which the
search proxies provide access and of the search capabilities
of the proxies. Finally, metadata repositories accumulate
selected metadata from local instances of the other three
component classes in order to facilitate global metadata
queries and local metadata caching. In this paper, we
outline further the roles of these component classes,
discuss our design rationale, and analyze related work.
|
|
|
Michelle Baldonado, Chen-Chuan K. Chang, Luis Gravano, and Andreas Paepcke.
The Stanford Digital Library metadata architecture.
International Journal of Digital Libraries, 1(2), February
1997.
See also http://dbpubs.stanford.edu/pub/1997-56.
|
|
|
Michelle Baldonado, Steve Cousins, B. Lee, and Andreas Paepcke.
Notable: An annotation system for networked handheld devices.
In Proceedings of the Conference on Human Factors in Computing
Systems CHI'99, pages 210-211, 1999.
|
|
|
Michelle Baldonado, Seth Katz, Andreas Paepcke, Chen-Chuan K. Chang, Hector
Garcia-Molina, and Terry Winograd.
An extensible constructor tool for the rapid, interactive design of
query synthesizers.
In Proceedings of the Third ACM International Conference on
Digital Libraries, 1998.
Accessible at http://dbpubs.stanford.edu/pub/1998-48.
We describe an extensible constructor tool that helps
information experts (e.g., librarians) create
specialized query synthesizers for heterogeneous
digital-library environments. A query synthesizer
provides a graphical user interface in which a
digital-library patron can specify a high-level,
fielded, multi-source query. Furthermore, a query
synthesizer interacts with a query translator and an
attribute translator to transform high-level queries
into sets of source-specific queries. We discuss how
the constructor can facilitate discovery of available
attributes (e.g., title), collation of schemas from
different sources, selection of input widgets for a
synthesizer (e.g., a text box or a drop-down list
widget to support input of controlled vocabulary), and
other design aspects. We also describe a prototype
constructor we implemented, based on the Stanford
InfoBus and metadata architecture.
|
|
|
Michelle Q Wang Baldonado and Steve B. Cousins.
Addressing heterogeneity in the networked information environment.
New Review of Information Networking, 2:83-102, 1996.
Several ongoing Stanford University Digital Library projects address the
issue of heterogeneity in networked information environments. A networked
information environment has the following components: users, information
repositories, information services, and payment mechanisms. This paper
describes three of the heterogeneity-focused Stanford projects: InfoBus,
REACH, and DLITE. The InfoBus project is at the protocol level, while the
REACH and DLITE projects are both at the conceptual model level. The
InfoBus project provides the infrastructure necessary for accessing
heterogeneous services and utilizing heterogeneous payment mechanisms. The
REACH project sets forth a uniform conceptual model for finding
information in networked information repositories. The DLITE project
presents a general task-based strategy for building user interfaces to
heterogeneous networked information services.
|
|
|
Michelle Q Wang Baldonado and Terry Winograd.
Techniques and tools for making sense out of heterogeneous search
service results.
Technical Report SIDL-WP-1995-0019; 1995-59, Stanford University,
1995.
|
|
|
Michelle Q Wang Baldonado and Terry Winograd.
A user interaction model for browsing based on category-level
operations.
Technical Report SIDL-WP-1996-0029; 1996-75, Stanford University,
1996.
We propose a user interaction model for browsing based on iterative
category-level operations. The motivation comes from two observations:
1) people naturally think in terms of categories, and 2) in browsing, the
types of categories that are salient to users change as they browse. We
define a set of category-level operations that lets users iteratively
view and find results in terms of these changing category types. We also
show that we can express some standard IR operations as iteratively
applied sequences of a fundamental category-level operation (thus
unifying them). Finally, we describe SenseMaker, a prototype interface
for browsing heterogeneous sources.
|
|
|
Michelle Q Wang Baldonado and Terry Winograd.
SenseMaker: An information-exploration interface supporting the
contextual evolution of a user's interests.
In Proceedings of the Conference on Human Factors in Computing
Systems CHI'97, pages 11-18, Atlanta, Ga., March 1997. ACM Press, New York.
|
|
|
Sujata Banerjee and Vibhu O. Mittal.
On the use of linguistic ontologies for accessing and indexing
distributed digital libraries.
In Proceedings of the First Annual Conference on the Theory and
Practice of Digital Libraries, 1994.
Format: HTML Document.
Audience: Non-technical, on-line searchers.
References: 16.
Links: 1.
Relevance: Low.
Abstract: Addresses problem of finding correct keywords to search for
by using WordNet. If a search doesn't turn up the hits needed, it modifies
query by using synonyms, generalizing, or replacing with a set of more specific
words. Searcher is asked to approve modified queries, which are then re-sent to
content providers.
|
|
|
Gaurav Banga, Fred Douglis, and Michael Rabinovich.
Optimistic deltas for WWW latency reduction.
In Proceedings of USENIX Technical Conference, pages 289-303,
1997.
|
|
|
Ziv Bar-Yossef, Alexander Berg, Steve Chien, Jittat Fakcharoenphol, and Dror
Weitz.
Approximating aggregate queries about web pages via random walks.
In Proceedings of the Twenty-sixth International Conference on
Very Large Databases, 2000.
|
|
|
Ziv Bar-Yossef, Andrei Z. Broder, Ravi Kumar, and Andrew Tomkins.
Sic transit gloria telae: towards an understanding of the web's
decay.
In WWW '04: Proceedings of the 13th international conference on
World Wide Web, pages 328-337, New York, NY, USA, 2004. ACM Press.
The rapid growth of the web has been noted and tracked
extensively. Recent studies have however documented
the dual phenomenon: web pages have small half
lives, and thus the web exhibits rapid death as
well. Consequently, page creators are faced with an
increasingly burdensome task of keeping links
up-to-date, and many are falling behind. In addition
to just individual pages, collections of pages or
even entire neighborhoods of the web exhibit
significant decay, rendering them less effective as
information resources. Such neighborhoods are
identified only by frustrated searchers, seeking a
way out of these stale neighborhoods, back to more
up-to-date sections of the web; measuring the decay
of a page purely on the basis of dead links on the
page is too naive to reflect this frustration. In
this paper we formalize a strong notion of a decay
measure and present algorithms for computing it
efficiently. We explore this measure by presenting a
number of validations, and use it to identify
interesting artifacts on today's web. We then
describe a number of applications of such a measure
to search engines, web page maintainers,
ontologists, and individual users.
|
|
|
Albert-Laszlo Barabasi and Reka Albert.
Emergence of scaling in random networks.
Science, 286(5439):509-512, October 1999.
|
|
|
David Bargeron, Anoop Gupta, Jonathan Grudin, and Elizabeth Sanocki.
Annotations for streaming video on the web: System design and usage
studies.
In Proceedings of the Eighth International World-Wide Web
Conference, 1999.
Streaming video on the World Wide Web is being widely
deployed, and workplace training and distance education
are key applications. The ability to annotate video on
the Web can provide significant added value in these and
other areas. Written and spoken annotations can provide
`in context' personal notes and can enable asynchronous
collaboration among groups of users. With annotations,
users are no longer limited to viewing content passively
on the Web, but are free to add and share commentary and
links, thus transforming the Web into an interactive
medium. We discuss design considerations in constructing
a collaborative video annotation system, and we
introduce our prototype, called MRAS. We present
preliminary data on the use of Web-based annotations
for personal note-taking and for sharing notes in a
distance education scenario. Users showed a strong
preference for MRAS over pen-and-paper for taking notes,
despite taking longer to do so. They also indicated that
they would make more comments and questions with MRAS
than in a `live' situation, and that sharing added
substantial value.
|
|
|
Bruce R. Barkstrom, Melinda Finch, Michelle Ferebee, and Calvin Mackey.
Adapting digital libraries to continual evolution.
In Proceedings of the Second ACM/IEEE-CS Joint Conference on
Digital Libraries, 2002.
In this paper, we describe five investment streams (data
storage infrastructure, knowledge management, data production control,
data transport and security, and personnel skill mix) that need to be balanced
against short-term operating demands in order to maximize the probability of
long-term viability of a digital library. Because of the rapid pace of
information technology change, a digital library cannot be a static institution.
Rather, it has to become a flexible organization adapted to continuous evolution
of its infrastructure.
|
|
|
Kobus Barnard, Pinar Duygulu, David Forsyth, Nando de Freitas, David M. Blei,
and Michael I. Jordan.
Matching words and pictures.
J. Mach. Learn. Res., 3:1107-1135, 2003.
We present a new approach for modeling multi-modal data sets,
focusing on the specific case of segmented images with associated text.
Learning the joint distribution of image regions and words has many applications.
We consider in detail predicting words associated with whole images
(auto-annotation) and corresponding to particular image regions (region
naming). Auto-annotation might help organize and access large collections
of images. Region naming is a model of object recognition as a process of
translating image regions to words, much as one might translate from one
language to another. Learning the relationships between image regions and
semantic correlates (words) is an interesting example of multi-modal data
mining, particularly because it is typically hard to apply data mining
techniques to collections of images. We develop a number of models for the
joint distribution of image regions and words, including several which
explicitly learn the correspondence between regions and words. We study
multi-modal and correspondence extensions to Hofmann's hierarchical
clustering/aspect model, a translation model adapted from statistical
machine translation (Brown et al.), and a multi-modal extension to mixture
of latent Dirichlet allocation (MoM-LDA). All models are assessed using a
large collection of annotated images of real scenes. We study in depth the
difficult problem of measuring performance. For the annotation task, we
look at prediction performance on held out data. We present three alternative
measures, oriented toward different types of task. Measuring the performance
of correspondence methods is harder, because one must determine whether
a word has been placed on the right region of an image. We can use annotation
performance as a proxy measure, but accurate measurement requires hand labeled
data, and thus must occur on a smaller scale. We show results using both an
annotation proxy, and manually labeled data.
|
|
|
Kobus Barnard and David A. Forsyth.
Learning the semantics of words and pictures.
In Proceedings of the IEEE International Conference on Computer
Vision, July 2001.
|
|
|
Rob Barrett, Paul P. Maglio, and Daniel C. Kellem.
How to personalize the web.
In Proceedings of the Conference on Human Factors in Computing
Systems CHI'97, 1997.
|
|
|
Laura M. Bartolo, Cathy S. Lowe, Adam C. Powell IV, Donald R. Sadoway, Jorges
Vieyra, and Kyle Stemen.
Use of MatML with software applications for e-learning.
In Proceedings of the Fourth ACM/IEEE-CS Joint Conference on
Digital Libraries, 2004.
This pilot project investigates facilitating the
development of the Semantic Web for e-learning through a practical
example, using Materials Property Data Markup Language (MatML) to
provide materials property data to a web-based application program.
Property data for 100 materials is marked up with MatML and used as an
input format for an application program. Students use the program to
generate graphs showing selected properties for different materials.
Selected graphs are submitted to the Materials Digital Library (MatDL)
so that successive classes may be informed by earlier work to encourage
new discoveries.
|
|
|
C. Batini, M. Lenzerini, and S. Navathe.
A comparative analysis of methodologies for database schema
integration.
ACM Computing Surveys, 18(4), 1986.
|
|
|
Patrick Baudisch and Ruth Rosenholtz.
Halo: a technique for visualizing off-screen objects.
In CHI '03: Proceedings of the SIGCHI conference on Human
factors in computing systems, pages 481-488, New York, NY, USA, 2003. ACM
Press.
|
|
|
E. Bauer, D. Koller, and Y. Singer.
Update rules for parameter estimation in Bayesian networks.
In Proceedings of the 13th Annual Conference on Uncertainty in
AI (UAI), 1997.
|
|
|
M. Bearman.
ODP-Trader.
Open Distributed Processing, 2:19-33, 1994.
|
|
|
Herb Becker.
The role of the Library of Congress in the National Digital Library.
In Proceedings of DL'96, 1996.
Format: Not yet online.
|
|
|
Benjamin B. Bederson.
PhotoMesa: a zoomable image browser using quantum treemaps and
bubblemaps.
In Proceedings of the 14th annual ACM symposium on User
interface software and technology, pages 71-80. ACM Press, 2001.
|
|
|
Benjamin B. Bederson, Ben Shneiderman, and Martin Wattenberg.
Ordered and quantum treemaps: Making effective use of 2D space to
display hierarchies.
ACM Transactions on Graphics, 21(4):833-854, 2002.
|
|
|
Doug Beeferman, Adam Berger, and John D. Lafferty.
Statistical models for text segmentation.
Machine Learning, 34(1-3):177-210, 1999.
|
|
|
Alireza Behreman.
Generic electronic payment services.
In The Second USENIX Workshop on Electronic Commerce
Proceedings, 1996.
|
|
|
Alireza Behreman and Rajkumar Narayanaswamy.
Payment method negotiation service.
In The Second USENIX Workshop on Electronic Commerce
Proceedings, 1996.
|
|
|
M. Beigl and R. Rudisch.
System support for mobile computing.
Computers & Graphics, 20(5):619-625, 1996.
Today a mobile user wants to connect his portable
computer: remotely to the central database at home,
locally to the printer on the spot and globally to
the world-wide-web. To achieve this, different
connection lines are available: wireless networks
for connecting out in the fields, ISDN or analogue
telephone lines when residing in a hotel, Ethernet
access at the customer's site. But this
connectivity raises a lot of questions, about
technical, security or accounting issues. This
paper presents the architecture of an environment
aiming to support mobile users and dealing with the
given problems.
|
|
|
N.J. Belkin and W. Bruce Croft.
Information filtering and information retrieval: two sides of the same
coin?
Communications of the ACM, 35(12):29-38, December 1992.
A comparison is made between information retrieval and
information filtering. The authors determine that information
filtering is a well defined process. By examining its
foundations and comparing it to the foundations of the IR
enterprise, the authors find there is very little difference
between filtering and retrieval at an abstract level. They
conclude that the two enterprises have the same goal; namely
they are both concerned with getting information to people
who need it. However, the authors emphasize that IR research
has ignored some aspects of the general problem which both IR
and information filtering address, and that these aspects are
precisely those which are especially relevant to the specific
contexts of filtering.
|
|
|
Timothy C. Bell, Alistair Moffat, and Ian H. Witten.
Compressing the digital library.
In Proceedings of the First Annual Conference on the Theory and
Practice of Digital Libraries, 1994.
Format: HTML Document (32K).
Audience: Semi-technical, general computer scientists.
References: 8.
Links: 1.
Relevance: Medium (but not mainstream DL).
Abstract: Discusses the interaction of compression and indexing.
Suggests a Huffman encoding applied to words & non-words. Inverted bitmap for
indexing, enhanced with Golomb encoding. Compressed 266 Mb Wall Street Journal
article database by 50%, including creating the index. Queries were processed in less than .1 sec.
|
|
|
M. Bellare, J.A. Garay, R. Hauser, A. Herzberg, H. Krawczyk, M. Steiner,
G. Tsudik, and M. Waidner.
iKP: a family of secure electronic payment protocols.
In Proceedings of the First USENIX Workshop on Electronic
Commerce, Berkeley, CA, USA, 1995. USENIX Assoc.
This paper proposes a family of protocols, iKP (i=1,2,3), for
secure electronic payments over the Internet. The protocols
implement credit card-based transactions between the customer
and the merchant while using the existing financial network
for clearing and authorization. The protocols can be extended
to apply to other payment models, such as debit cards and
electronic checks. They are based on public-key cryptography
and can be implemented in either software or hardware.
Individual protocols differ in key management complexity and
degree of security. It is intended that their deployment be
gradual and incremental. The iKP protocols are presented
herein with the intention to serve as a starting point for
eventual standards on secure electronic payment.
|
|
|
Jezekiel Ben-Arie, Purvin Pandit, and ShyamSundar Rajaram.
Design of a digital library for human movement.
In Proceedings of the First ACM/IEEE-CS Joint Conference on
Digital Libraries, 2001.
This paper is focused on a central aspect in the design of our
planned digital library for human movement, i.e. on the aspect of representation
and recognition of human activity from video data. The method of representation
is important since it has a major impact on the design of all the other building
blocks of our system such as the user interface/query block or the activity
recognition/storage block. In this paper we evaluate a representation method
for human movement that is based on sequences of angular poses and angular
velocities of the human skeletal joints, for storage and retrieval of human
actions in video databases. The choice of a representation method plays an
important role in the database structure, search methods, storage efficiency,
etc. For this representation, we develop a novel approach for complex human
activity recognition by employing multidimensional indexing combined with
temporal or sequential correlation. This scheme is then evaluated with respect
to its efficiency in storage and retrieval.
For the indexing we use postures of humans in videos that are decomposed into
a set of multidimensional tuples which represent the poses/velocities of human
body parts such as arms, legs and torso. Three novel methods for human activity
recognition are theoretically and experimentally compared. The methods require
only a few sparsely sampled human postures. We also achieve speed invariant
recognition of activities by eliminating the time factor and replacing it with
sequence information. The indexing approach also provides robust recognition
and an efficient storage/retrieval of all the activities in a small set of hash
tables.
|
|
|
Israel Ben-Shaul, Michael Herscovici, Michal Jacovi, Yoelle S. Maarek, Dan
Pelleg, Menachem Shtalhaim, Vladimir Soroka, and Sigalit Ur.
Adding support for dynamic and focused search with fetuccino.
In Proceedings of the Eighth International World-Wide Web
Conference, 1999.
This paper proposes two enhancements to
existing search services over the Web. One
enhancement is the addition
of limited dynamic search around results
provided by regular Web search services, in
order to correct part of the
discrepancy between the actual Web and its
static image as stored in search repositories.
The second enhancement is
an experimental two-phase paradigm that allows
the user to distinguish between a domain query
and a focused query
within the dynamically identified domain. We
present Fetuccino, an extension of the
Mapuccino system that implements
these two enhancements. Fetuccino provides an
enhanced user-interface for visualization of
search results, including
advanced graph layout, display of structural
information and support for standards (such as
XML). While Fetuccino
has been implemented on top of existing search
services, its features could easily be
integrated into any search engine
for better performance. A light version of
Fetuccino is available on the Internet at
http://www.ibm.com/java/fetuccino.
|
|
|
Tamara L. Berg, Alexander C. Berg, Jaety Edwards, Michael Maire, Ryan White,
Yee-Whye Teh, Erik Learned-Miller, and D.A. Forsyth.
Names and faces in the news.
In CVPR 2004: Conference on Computer Vision and Pattern
Recognition. IEEE Computer Society, 2004.
|
|
|
Donna Bergmark.
Collection synthesis.
In Proceedings of the Second ACM/IEEE-CS Joint Conference on
Digital Libraries, 2002.
The invention of the hyperlink and the HTTP transmission protocol
caused an amazing new structure to appear on the Internet - the World Wide Web.
With the Web, there came spiders, robots, and Web crawlers, which go from one link
to the next checking Web health, ferreting out information and resources, and
imposing organization on the huge collection of information (and dross)
residing on the net. This paper reports on the use of one such crawler to
synthesize document collections on various topics in science, mathematics,
engineering and technology. Such collections could be part of a digital
library.
|
|
|
Howard Besser.
Mesl project description.
In Proceedings of DL'96, 1996.
Format: Not yet online.
|
|
|
Krishna Bharat and Andrei Broder.
Mirror, mirror on the web: A study of host pairs with replicated
content.
In Proceedings of the Eighth International World-Wide Web
Conference, 1999.
Two previous studies, one done at Stanford in 1997 based on data
collected by the Google
search engine, and one done at Digital in 1996 based on AltaVista data,
revealed that almost a third of the Web consists of duplicate pages. Both
studies
identified mirroring, that is, the systematic
replication of content over a pair of hosts, as
the principal cause of duplication, but did not further investigate this
phenomenon. The main aim of this paper is to
present a clearer picture of mirroring
on the Web. As input we used a set of 179
million URLs found during a Web crawl done in
the summer of 1998. We looked at all hosts with more than 100 URLs in
our input (about 238,000), and discovered that
about 10the prevalence of mirroring based on a
mirroring classification scheme that we define. There are numerous reasons for
mirroring: technical (e.g., to improve access
time), commercial (e.g., different intermediaries offering the same products),
cultural (e.g., same content in two languages),
social (e.g., sharing of research data), and so forth. Although we have not done
an exhaustive study of the causes of replication, we discuss and provide
examples for several representative cases. Our
technique for detecting mirrored hosts from
large sets of collected URLs depends mostly on the syntactic analysis of URL
strings, and requires retrieval and content
analysis only for a small number of pages. We are able to detect both
partial and total mirroring, and handle cases
where the content is not byte-wise identical. Furthermore, our technique is
computationally very efficient and does not
assume that the initial set of URLs gathered from each host is comprehensive.
Hence, this approach has practical uses beyond our study, and can be applied in
other settings. For instance, for Web crawlers
and caching proxies, detecting mirrors can be
valuable to avoid redundant fetching, and knowledge of mirroring can be
used to compensate for broken links.
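The idea of finding candidate mirrors from URL strings alone can be sketched as below. This is a much simplified illustration and not the authors' algorithm: it merely compares the sets of paths seen on each host, and the function name, the Jaccard score, and the threshold are all assumptions made for this example.

```python
from collections import defaultdict
from itertools import combinations
from urllib.parse import urlsplit


def candidate_mirrors(urls, threshold=0.5):
    """Return (host_a, host_b, score) for host pairs whose path sets overlap.

    The score is the Jaccard similarity of the two hosts' path sets, so no
    page content is fetched; only the URL strings are inspected.
    """
    paths_by_host = defaultdict(set)
    for url in urls:
        parts = urlsplit(url)
        if parts.hostname:
            paths_by_host[parts.hostname].add(parts.path or "/")

    pairs = []
    # Quadratic in the number of hosts: fine for a sketch, not for a Web crawl.
    for a, b in combinations(sorted(paths_by_host), 2):
        shared = len(paths_by_host[a] & paths_by_host[b])
        total = len(paths_by_host[a] | paths_by_host[b])
        score = shared / total
        if score >= threshold:
            pairs.append((a, b, score))
    return pairs
```

A crawler could use such candidate pairs to decide which hosts deserve the more expensive content comparison, in the spirit of the abstract's remark that content analysis is needed only for a small number of pages.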
|
|
|
Krishna Bharat, Andrei Broder, Monika Henzinger, Puneet Kumar, and Suresh
Venkatasubramanian.
The connectivity server: Fast access to linkage information on the
web.
In Proceedings of the Seventh International World-Wide Web
Conference, April 1998.
|
|
|
B. Bhushan et al.
Managing heterogeneous networks - integrator-based approach.
In IFIP Transactions C (Communication Systems), 1993.
The authors discuss an object oriented approach to
network management. Their goal is to briefly explain
a real example of an integrated network management
(INM) system. One of the major requirements when
looking at information transfer between the managed
network and the management system is to mask the
heterogeneity of the underlying resources. As an
example of the unification of heterogeneous
networks, a software system called the Integrator has been
designed and implemented. The Integrator is a
mechanism that provides an object oriented interface
to the user (human or network management application
programs) to offer a homogeneous view of a world
(set of heterogeneous domains) through a model
(depicting a formal information view). The
Integrator uses two agents to communicate with
underlying network elements: an SNMP agent accessing
TCP/IP parameters for an Ethernet network through an
SNMP agent, and an X.25 interface program doing the
same for X.25 parameters through proprietary
management software. The concepts of the Integrator
have been applied in the EC project PEMMON.
|
|
|
Timothy W. Bickmore and Bill N. Schilit.
Digestor: Device-independent access to the world wide web.
In Proceedings of the Sixth International World-Wide Web
Conference, 1997.
|
|
|
Eric Bier, Lance Good, Kris Popat, and Alan Newberger.
A document corpus browser for in-depth reading.
In Proceedings of the Fourth ACM/IEEE-CS Joint Conference on
Digital Libraries, 2004.
Software tools, including Web browsers, e-books,
electronic document formats, search engines, and digital libraries are
changing the way that people read, making it easier for them to find
and view documents. However, while these tools provide significant help
with short-term reading projects involving small numbers of documents,
they fall short of supporting readers engaged in longer-term reading
projects, in which a topic is to be understood in-depth by reading many
documents. Such readers need to find and manage many documents and
citations, remember what they have read, and prioritize what to read
next. In this paper, we describe three integrated software tools that
facilitate in-depth reading. A first tool extracts citation information
from documents. A second finds on-line documents from their citations.
The last is a document corpus browser that uses a zoomable user
interface to show a corpus at multiple granularities while supporting
reading tasks that take days, weeks, or longer. We describe these tools
and the design principles that motivated them.
|
|
|
Eric A. Bier and Adam Perer.
Icon abacus and ghost icons.
In Proceedings of the Fifth ACM/IEEE-CS Joint Conference on
Digital Libraries, 2005.
We present two techniques that make document collection visualizations more informative. Icon abacus uses the horizontal position of icon groups to communicate document attributes. Ghost icons show linked documents by adding temporary icons and by highlighting or dimming existing ones.
|
|
|
William P. Birmingham.
An agent-based architecture for digital libraries.
D-Lib Magazine, July 1995.
Format: HTML Document().
|
|
|
William P. Birmingham, Karen M. Drabenstott, Carolyn O. Frost, Amy J. Warner,
and Katherine Willis.
The university of michigan digital library: This is not your father's
library.
In Proceedings of the First Annual Conference on the Theory and
Practice of Digital Libraries, 1994.
Format: HTML Document (36K).
Audience: slightly technical, generalist comfortable with technology,
funders.
References: 13.
Links: 1.
Relevance: Medium-High.
Abstract: Describes the UMichigan Digital Libraries proposal, including
some detail about their agent architecture. User agents, Collection-interface
agents, and mediators all play a role. Network resources are allocated on
a market-based mechanism, and proposal mentions need to protect intellectual
property & handle payment issues.
|
|
|
William P. Birmingham, Edmund H. Durfee, Tracy Mullen, and Michael P. Wellman.
The distributed agent architecture of the university of michigan
digital library (extended abstract).
In AAAI Spring Symposium on Information Gathering, 1995.
Format: Compressed PostScript().
|
|
|
Ann Peterson Bishop.
Working towards an understanding of digital library use: A report on
the user research efforts of the nsf/arpa/nasa dli projects.
D-Lib Magazine, October 1995.
Format: HTML Document().
|
|
|
Ann Peterson Bishop.
Making digital libraries go: Comparing use across genres.
In Proceedings of the Fourth ACM International Conference on
Digital Libraries, 1999.
A new federal initiative called Information Technology for the
Twenty-First Century (IT2) recognizes the need to bridge
research across domains in order to bring computing benefits to
society at large. One implication for digital library (DL)
research is that we should start looking at projects that span the
spectrum from basic computer science to the implementation of
working systems and consider links among findings on
information system use from a variety of arenas in life. In this
paper, I integrate findings from my research on people's
encounters with DLs in two different arenas: academia and low-income
neighborhoods. The point is to see how concepts and
conclusions related to use do, in fact, cross these arenas. The
paper also aims to help bring results from studies of local
community information practices into the realm of DLs, since
community networking represents one particular genre and
audience that has not yet received a great deal of attention from
those engaged in DL research. Beginning with a discussion of
DL use as an assemblage of infrastructure, norms, knowledge,
and practice, the paper explores a number of insights gleaned
from user studies associated with two separate research projects:
1) the recently completed University of Illinois Digital
Libraries Initiative (DLI) project; and 2) the Community
Networking Initiative (CNI) currently in progress under the
auspices of the University of Illinois, the Urban League of
Champaign County and Prairienet, the community network
serving East Central Illinois. Insights about DL use discussed
in this paper include: the way in which trivial barriers are
magnified until they effectively cut off use on a large scale; the
difficulties faced by outsiders whose information worlds are
impoverished; the primacy of comfort and relevant content in
encouraging use; and the importance of informal social
networks for providing help related to system use.
|
|
|
Barclay Blair and John Boyer.
Xfdl: Creating electronic commerce transaction records using xml.
In Proceedings of the Eighth International World-Wide Web
Conference, 1999.
In the race to transform the World Wide Web
from a medium for information presentation to a
medium for information
exchange, the development of practices for
ensuring the security, auditability, and
non-repudiation of transactions that are
well established in the paper-based world has
not kept pace in the digital world. Existing
Internet technology provides
no easy way to create a valid `digital receipt'
that meets the requirements of both complex
distributed networks and the
business community. In addition, an improved
articulation of digital signatures is needed.
Extensible Forms Description
Language (XFDL), developed by UWI.Com and Tim
Bray, is an application of XML that allows
organizations to move
their paper-based forms systems to the Internet
while maintaining the necessary attributes of
paper-based transaction
records. XFDL was designed for implementation
in business-to-business electronic commerce and
intra-organizational
information transactions.
|
|
|
Catherine Blake.
Information synthesis: A new approach to explore secondary
information in scientific literature.
In Proceedings of the Fifth ACM/IEEE-CS Joint Conference on
Digital Libraries, 2005.
Advances in both technology and publishing practices continue to increase the quantity of scientific literature that is available electronically. In this paper, we introduce the Information Synthesis process, a new approach that enables scientists to visualize, explore, and resolve contradictory findings that are inevitable when multiple empirical studies explore the same natural phenomena. Central to the Information Synthesis approach is a cyber-infrastructure that provides a scientist with both secondary information from an article and structured information resources. To demonstrate this approach, we have developed the Multi-User, Information Extraction for Information Synthesis (METIS) System. METIS is an interactive system that automates critical tasks within the Information Synthesis process. We provide two case-studies that demonstrate the utility of the Information Synthesis approach.
|
|
|
J.A. Blakeley, W.J. McKenna, and G. Graefe.
Experiences building the open oodb query optimizer.
In Proceedings of the International Conference on Management of
Data, 1993.
The authors report their experiences building the query
optimizer for TI's Open OODB system. It is probably the
first working object query optimizer to be based on a
complete extensible optimization framework including logical
algebra, execution algorithms, property enforcers, logical
transformation rules, implementation rules, and selectivity
and cost estimation. Their algebra incorporates a new
materialize operator with its corresponding logical
transformation and implementation rules that enable the
optimization of path expressions. The Open OODB query
optimizer was constructed using the Volcano Optimizer
Generator, demonstrating that this second-generation
optimizer generator enables rapid development of efficient
and effective query optimizers for non-standard data models
and systems.
|
|
|
Ann Blandford, Suzette Keith, Iain Connell, and Helen Edwards.
Analytical usability evaluation for digital libraries: a case study.
In Proceedings of the Fourth ACM/IEEE-CS Joint Conference on
Digital Libraries, 2004.
There are two main kinds of approach to
considering usability of any system: empirical and analytical.
Empirical techniques involve testing systems with users, whereas
analytical techniques involve usability personnel assessing systems
using established theories and methods. We report here on a set of
studies in which four different techniques were applied to various
digital libraries, focusing on the strengths, limitations and scope of
each approach. Two of the techniques, Heuristic Evaluation and
Cognitive Walkthrough, were applied in text-book fashion, because there
was no obvious way to contextualize them to the Digital Libraries (DL)
domain. For the third, Claims Analysis, it was possible to develop a
set of re-usable scenarios and personas that relate the approach
specifically to DL development. The fourth technique, CASSM, relates
explicitly to the DL domain by combining empirical data with an
analytical approach. We have found that Heuristic Evaluation and
Cognitive Walkthrough only address superficial aspects of interface
design (but are good for that), whereas Claims Analysis and CASSM can
help identify deeper conceptual difficulties (but demand greater skill
of the analyst). However, none fit seamlessly within the fragmented
function-oriented design practices that typify much digital library
development, highlighting an important area for further work to support
improved usability.
|
|
|
Ann Blandford, Hanna Stelmaszewska, and Nick Bryan-Kinns.
Use of multiple digital libraries: A case study.
In Proceedings of the First ACM/IEEE-CS Joint Conference on
Digital Libraries, 2001.
The aim of the work reported here was to better understand
the usability issues raised when digital libraries are used in a natural
setting. The method used was a protocol analysis of users working on a
task of their own choosing to retrieve documents from publicly available
digital libraries. Various classes of usability difficulties were found.
Here, we focus on use in context - that is, usability concerns that arise
from the fact that libraries are accessed in particular ways, under
technically and organisationally imposed constraints, and that use of
any particular resource is discretionary. The concepts from an Interaction
Framework, which provides support for reasoning about patterns of
interaction between users and systems, are applied to understand interaction
issues.
|
|
|
R. Boisvert, S. Browne, J. Dongarra, and E. Grosse.
Digital software and data repositories for support of scientific
computing.
In Advances in Digital Libraries '95, 1995.
Format: Not Yet Online.
|
|
|
Kurt D. Bollacker, Steve Lawrence, and C. Lee Giles.
A system for automatic personalized tracking of scientific literature
on the web.
In Proceedings of the Fourth ACM International Conference on
Digital Libraries, 1999.
We introduce a system as part of the CiteSeer digital library
project for automatic tracking of scientific literature that is
relevant to a user's research interests. Unlike previous systems
that use simple keyword matching, CiteSeer is able to
track and recommend topically relevant papers even when
keyword based query profiles fail. This is made possible
through the use of a heterogeneous profile to represent user
interests. These profiles include several representations, including
content based relatedness measures. The CiteSeer
tracking system is well integrated into the search and browsing
facilities of CiteSeer, and provides the user with great
flexibility in tuning a profile to better match his or her interests.
The software for this system is available, and a sample
database is online as a public service.
|
|
|
Leslie Bondaryk.
Calculus modules online: An internet multimedia application.
In DAGS'95, 1995.
Format: HTML Document (21K + pictures).
Audience: Calculus Instructors.
References: 13.
Links: 16.
Abstract: Discusses an architecture for a system that aids in the
teaching of calculus.
|
|
|
J. Bonigk and A. Lubinski.
A basic architecture for mobile information access.
Computers & Graphics, 20(5):683-91, 1996.
As the development of 'pen computing' continues, more and
more of today's computers are likely gradually to move away from
people's desktops and into their pockets. The development of
personal digital assistants (PDAs) has initiated this move.
As these devices move into people's pockets, they need the
ability to access information on the move. This article
describes a generic view of a client server mobile computing
architecture. It also sheds some light on the basic network
topologies that have been considered previously for such
systems. The scenario used is a hospital ward. Each doctor is
equipped with a PDA and each ward or a group of wards with a
server providing patient records. As a doctor visits a
patient in a ward, the patient's record is accessed from the
server onto the PDA. The doctor updates the record and sends
the update back to the server.
|
|
|
José Borbinha, Nuno Freire, and João Neves.
Bnd: A national digital library as a jigsaw.
In Proceedings of the Fourth ACM/IEEE-CS Joint Conference on
Digital Libraries, 2004.
This paper describes the architecture and
components of the infrastructure in construction for the National
Digital Library in Portugal. The requirements emerged from the
definition of the services to support, with a special focus on
scalability, and from the decision to give special attention to
community building standards, open solutions, and reusable and cost
effective components. The generic bibliographic metadata format in this
project is UNIMARC, and the structural metadata is METS. The URN
identifiers are processed and resolved as simple but very effective
PURL identifiers, and the storage is provided by the new emerging
LUSTRE file system, for immediate access, and by a locally developed
GRID architecture, ARCO, for long term preservation. All these
components run on Linux servers, as does the middleware for access,
which is based on the FEDORA framework.
|
|
|
N. Borenstein and N. Freed.
MIME (Multipurpose Internet Mail Extensions) Part One:
Mechanisms for specifying and describing the format of Internet message
bodies, September 1993.
Internet RFC 1521.
|
|
|
Nathaniel Borenstein.
Cooperative work in the andrew message system.
In Proceedings of the Conference on Computer-Supported
Cooperative Work, CSCW'88, 1988.
Describes collaboration-related aspects of Andrew.
|
|
|
Christine L. Borgman, Gregory Leazer, Anne Gilliland-Swetland, Kelli Millwood,
Leslie Champeny, Jason Finley, and Laura J. Smart.
How geography professors select materials for classroom lectures:
Implications for the design of digital libraries.
In Proceedings of the Fourth ACM/IEEE-CS Joint Conference on
Digital Libraries, 2004.
A goal of the Alexandria Digital Earth Prototype
(ADEPT) project is to make primary resources in geography useful for
undergraduate instruction in ways that will promote inquiry learning.
The ADEPT education and evaluation team interviewed professors about
their use of geography information as they prepare for class lectures,
as compared to their research activities. We found that professors
desired the ability to search by concept (erosion, continental drift,
etc.) as well as geographic location, and that personal research
collections were an important source of instructional materials.
Resources in geo-spatial digital libraries are typically described by
location, but are rarely described by concept or educational
application. This paper presents implications for the design of an
educational digital library from our observations of the lecture
preparation process. Findings include functionality requirements for
digital libraries and implications for the notion of digital libraries
as a shared information environment. The functional requirements
include definitions and enhancements of searching capabilities, the
ability to contribute and to share personal collections of resources,
and the capability to manipulate data and images.
|
|
|
Katy Borner, Ying Feng, and Tamara McMahon.
Collaborative visual interfaces to digital libraries.
In Proceedings of the Second ACM/IEEE-CS Joint Conference on
Digital Libraries, 2002.
This paper argues for the design of collaborative visual interfaces
to digital libraries that support social navigation. As an illustrative example
we present work in progress on the design of a three-dimensional document space
for a scholarly community - namely faculty, staff, and students at the School of
Library and Information Science, Indiana University. We conclude with a set of
research challenges.
|
|
|
C. Mic Bowman, Peter B. Danzig, Darren R. Hardy, Udi Manber, Michael F.
Schwartz, and Duane P. Wessels.
Harvest: A scalable, customizable discovery and access system.
Technical Report CU-CS-732-94, Dept. of Computer Science, Univ. of
Colorado, Boulder, Colo., August 1994.
Accessible at http://harvest.transarc.com/.
|
|
|
C.M. Bowman, Peter B. Danzig, Darren R. Hardy, Udi Manber, and Michael F.
Schwartz.
The harvest information discovery and access system.
Computer Networks and ISDN Systems, 28(1-2):119-125, December
1995.
It is increasingly difficult to make effective use of
Internet information, given the rapid growth in data
volume, user base, and data diversity. We introduce
Harvest, a system that provides a scalable, customizable
architecture for gathering, indexing, caching,
replicating, and accessing Internet information.
|
|
|
Claus Brabrand, Anders Moller, Anders Sandholm, and Michael I. Schwartzbach.
A runtime system for interactive web services.
In Proceedings of the Eighth International World-Wide Web
Conference, 1999.
Interactive Web services are increasingly
replacing traditional static Web pages.
Producing Web services seems to
require a tremendous amount of laborious
low-level coding due to the primitive nature
of CGI programming. We present
ideas for an improved runtime system for
interactive Web services built on top of CGI
running on virtually every
combination of browser and HTTP/CGI server.
The runtime system has been implemented and
used extensively in
<bigwig>, a tool for producing interactive
Web services.
|
|
|
Onn Brandman, Junghoo Cho, Hector Garcia-Molina, and Narayanan Shivakumar.
Crawler-friendly web servers.
In Proceedings of the Workshop on Performance and Architecture
of Web Servers (PAWS), Santa Clara, California, June 2000.
Held in conjunction with ACM SIGMETRICS 2000. Available at
http://dbpubs.stanford.edu/pub/2000-25.
In this paper we study how to make web servers (e.g.,
Apache) more crawler friendly. Current web servers
offer the same interface to crawlers and regular web
surfers, even though crawlers and surfers have very
different performance requirements. We evaluate simple
and easy-to-incorporate modifications to web servers so
that there are significant bandwidth
savings. Specifically, we propose that web servers
export meta-data archives describing their content.
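The bandwidth saving from exported meta-data can be illustrated with a small sketch. It assumes, purely for illustration, that the archive is a text file with one line per page holding a URL path and a last-modified timestamp separated by a tab; the archive layout and both function names are hypothetical and not the format proposed in the paper.

```python
def export_archive(pages: dict[str, int]) -> str:
    """Server side: list every page with its last-modified timestamp."""
    return "\n".join(f"{url}\t{modified}" for url, modified in sorted(pages.items()))


def urls_to_fetch(archive: str, last_crawled: dict[str, int]) -> list[str]:
    """Crawler side: pick only pages that are new or changed since the last visit."""
    stale = []
    for line in archive.splitlines():
        url, modified = line.split("\t")
        # Unknown pages get -1 so that they always count as stale.
        if last_crawled.get(url, -1) < int(modified):
            stale.append(url)
    return stale
```

With one small download the crawler learns which pages changed, instead of issuing a request per page to find out.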
|
|
|
Onn Brandman, Hector Garcia-Molina, and Andreas Paepcke.
Where have you been? A comparison of three web tracking technologies.
Submitted for publication, 1999.
Available at http://dbpubs.stanford.edu/pub/1999-61.
Web searching and browsing can be improved if browsers and search
engines know which pages users frequently visit. 'Web
tracking' is the process of gathering that
information. The goal for Web tracking is to obtain a
database describing Web page download times and users'
page traversal patterns. The database can then be used
for data mining or for suggesting popular or relevant
pages to other users. We implemented three Web tracking
systems, and compared their performance. In the first
system, rather than connecting directly to Web sites, a
client issues URL requests to a proxy. The proxy
connects to the remote server and returns the data to
the client, keeping a log of all transactions. The
second system uses sniffers to log all HTTP traffic
on a subnet. The third system periodically collects
browser log files and sends them to a central
repository for processing. Each of the systems differs
in its advantages and pitfalls. We present a comparison
of these techniques.
|
|
|
Jack Brassil.
September - secure electronic publishing trial.
In Proceedings of DL'96, 1996.
Format: Not yet online.
|
|
|
Lee Breslau, Pei Cao, Li Fan, Graham Phillips, and Scott Shenker.
Web caching and zipf-like distributions: Evidence and implications.
In Proceedings of Infocom, 1999.
|
|
|
Allen Brewer, Wei Ding, Karla Hahn, and Anita Komlodi.
The role of intermediary services in emerging digital libraries.
In Proceedings of DL'96, 1996.
Format: Not yet online.
|
|
|
M.W. Bright, A.R. Hurson, and S. Pakzad.
Automated resolution of semantic heterogeneity in multidatabases.
ACM Transactions on Database Systems, 19(2):212-253, June 1994.
|
|
|
M.W. Bright, A.R. Hurson, and Simin H. Pakzad.
A taxonomy and current issues in multidatabase systems.
IEEE Computer, 25(3):51-60, March 1992.
This article presents a taxonomy of global
information-sharing systems and discusses where
multidatabase systems fit in the spectrum of
solutions. The authors use this taxonomy as a basis
for defining multidatabase systems, then discuss the
issues associated with them. In particular, the
paper focuses on two major design approaches:
global schema systems and multidatabase language systems.
|
|
|
Brightplanet.com.
http://www.brightplanet.com.
|
|
|
The Deep Web: Surfacing Hidden Value.
http://www.completeplanet.com/Tutorials/DeepWeb/.
|
|
|
S. Brin and L. Page.
The anatomy of a large-scale hypertextual web search engine.
In Proceedings of 7th World Wide Web Conference, 1998.
In this paper, we present Google, a prototype of a
large-scale search engine which makes heavy use of
the structure present in hypertext. Google is designed
to crawl and index the Web efficiently and produce
much more satisfying search results than existing
systems. The prototype with a full text and hyperlink
database of at least 24 million pages is available at
http://google.stanford.edu/
To engineer a search engine is a challenging task.
Search engines index tens to hundreds of millions of
web pages involving a comparable number of distinct
terms. They answer tens of millions of queries every
day. Despite the importance of large-scale search
engines on the web, very little academic research has
been done on them. Furthermore, due to rapid advance
in technology and web proliferation, creating a web
search engine today is very different from three years
ago. This paper provides an in-depth description of
our large-scale web search engine - the first such
detailed public description we know of to date.
Apart from the problems of scaling traditional
search techniques to data of this magnitude, there are
new technical challenges involved with using the
additional information present in hypertext to
produce better search results. This paper addresses
this question of how to build a practical large-scale
system which can exploit the additional information
present in hypertext. Also we look at the problem of
how to effectively deal with uncontrolled hypertext
collections where anyone can publish anything they
want.
|
|
|
Sergey Brin, James Davis, and Hector Garcia-Molina.
Copy detection mechanisms for digital documents.
SIGMOD, pages 398-409, 1995.
In a digital library system, documents are available in
digital form and therefore are more easily copied and
their copyrights are more easily violated. This is a
very serious problem, as it discourages owners of
valuable information from sharing it with authorized
users. There are two main philosophies for addressing
this problem: prevention and detection. The former
actually makes unauthorized use of documents difficult
or impossible while the latter makes it easier to
discover such activity. We propose a system for
registering documents and then detecting copies, either
complete copies or partial copies. We describe
algorithms for such detection, and metrics required for
evaluating detection mechanisms (covering accuracy,
efficiency, and security). We also describe a working
prototype, called COPS, describe implementation issues,
and present experimental results that suggest the proper
settings for copy detection parameters.
|
|
|
Sergey Brin.
Extracting patterns and relations from the world wide web.
In WebDB Workshop at 6th International Conference on Extending
Database Technology, EDBT'98, 1998.
Available at http://www-db.stanford.edu/~sergey/extract.ps.
Seed a search with examples of a pattern, such as
citations to books. Let the engine run over Web pages
and learn. Get back more books.
|
|
|
Sergey Brin and Lawrence Page.
The anatomy of a large-scale hypertextual web search engine.
In Proceedings of the Seventh International World-Wide Web
Conference, 1998.
Shows architecture of Google.
|
|
|
Sergey Brin and Lawrence Page.
Dynamic data mining: A new architecture for data with high
dimensionality.
Technical report, Stanford University, 1998.
Describes a new architecture for data mining. It
makes use of some of the dynamic itemset counting
technology.
|
|
|
Andrei Broder, Ravi Kumar, Farzin Maghoul, Prabhakar Raghavan, Sridhar
Rajagopalan, Raymie Stata, Andrew Tomkins, and Janet Wiener.
Graph structure in the web: experiments and models.
In Proceedings of the Ninth International World-Wide Web
Conference, 2 |