Request for Question Clarification by pafalafa-ga on 16 Feb 2004 16:30 PST
Hello Edward,
The designers of the Halibot system have three records at the Patent
Office related to the design and operation of an "e-mail answering
agent" as they call it. The records do not include source code, but
do have a great deal of information on the architecture of the Halibot
system. I've attached a chunk from one of the records, so you can get
the flavor of it (unfortunately, I can't attach flow diagrams and other
visuals that accompany the text).
The patent records also identify the key people involved in the design
of the Halibot system, whom it may be possible to track down.
If this info seems like it would be of interest to you, let me know
how you would like me to proceed.
And good luck with your quest.
pafalafa-ga
==========
SUMMARY OF THE INVENTION
[0009] Briefly, one embodiment of the present invention comprises an
answering agent accessible at an e-mail destination address. The
answering agent waits for e-mail messages to be delivered that have a
"To:" header and a "Subject:" header. The "To:" header is filled-in by
a user and has two halves, a topic half and a domain name half in the
form of "topic@domain-name". The answering agent logically resides at
the corresponding domain-name address on a network, e.g., the
Internet. It extracts the source addresses of e-mail messages it
receives so that it knows where to return answers and where a database
of preferences might be indexed locally. The topic refers to an area
of information that the user has a question about. The "Subject:"
header is filled-in by the user with a qualifier that helps narrow
down the breadth of the user's inquiry. A finite set of topical areas
are accessible to the user through the answering agent. A database and
the Internet itself are data-mined for current information and the
locations of information that could be used to answer users'
questions. The answering agent, in effect, converts e-mail format
queries for information into standard web-based and database-based
searches and collects the answers to the questions. The questions can
be anticipated and the answers placed in a cache, or the questions can
be researched automatically in real-time.
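
As a rough illustration of the header handling described in this summary, here is a minimal Python sketch that splits a message into topic, domain name, qualifier, and return address. The patent record does not include source code, so the function name parse_request and the dictionary keys are assumptions made for this example only.

```python
from email import message_from_string
from email.utils import parseaddr


def parse_request(raw_message):
    """Split an incoming message into the parts the summary describes.

    Hypothetical helper: the names and return shape are illustrative,
    not taken from the patent record.
    """
    msg = message_from_string(raw_message)
    # parseaddr returns (display name, address); only the address is needed.
    _, to_addr = parseaddr(msg.get("To", ""))
    _, from_addr = parseaddr(msg.get("From", ""))
    # "topic@domain-name": the topic half and the domain name half.
    topic, _, domain = to_addr.partition("@")
    return {
        "topic": topic.lower(),
        "domain": domain.lower(),
        "qualifier": (msg.get("Subject") or "").strip(),
        "reply_to": from_addr,
    }


raw = (
    "From: user@somewhere.net\n"
    "To: flightstatus@halibot.com\n"
    "Subject: United 2507\n"
    "\n"
)
request = parse_request(raw)
```

The source address is kept under reply_to because the summary says it is used both to return the answer and to look up stored preferences.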
BRIEF DESCRIPTION OF THE DRAWINGS
[0010] FIG. 1 is a functional block diagram of an e-mail answering
agent embodiment of the present invention;
[0011] FIG. 2 is a functional block diagram of an e-mail, HTTP, and
WAP answering agent embodiment of the present invention;
[0012] FIGS. 3A-3C are flowcharts describing a composer embodiment of
the present invention as can be used in FIGS. 1 and 2;
[0013] FIG. 4 is a flowchart describing a scheduler embodiment of the
present invention as can be used in FIGS. 1 and 2;
[0014] FIGS. 5A-5E are flowcharts describing a receiver embodiment of
the present invention as can be used in FIGS. 1 and 2;
[0015] FIGS. 6A and 6B are flowcharts that represent a topic server
embodiment of the present invention as can be used in FIGS. 1 and 2;
and
[0016] FIG. 7 is a diagram representing a way to organize database
embodiments of the present invention.
DETAILED DESCRIPTION OF THE INVENTION
[0017] FIG. 1 represents an e-mail answering agent embodiment of the
present invention, and is referred to herein by the general reference
numeral 100. The answering agent 100 comprises a system for answering
informational queries included in an incoming e-mail message 102. A
simple mail transfer protocol (SMTP) network 104 is used to deliver
these to a post-office protocol (POP) mailbox 106. From there, a
receiver 110 monitors the (POP) mailbox through use of POP3 system
108. The key information is parsed and saved in a database 112 for
processing. The receiver determines if the response should be plain
text or can be HTML, depending on the e-mail application detected. A
scheduler 114 continuously queues new requests in the database for
pre-created, scheduled queries in parallel with ad-hoc queries coming
from the receiver. A composer 116 polls the queue in the database for
pending requests. The composer makes requests through an analyzer/call
router, which passes the request to a topic server 124. The topic
server returns the answer. The composer formulates the answer as an
e-mail message that is sent out on an SMTP system 118. A discrete
e-mail message 120 with a responsive answer in the message body is
sent back to the corresponding user.
[0018] Each answer can have an advertisement included by an ad server
122. In a business model embodiment of the present invention,
advertisers pay a fee to a service provider to deliver ads to the
users along with the answers to the queries. The answers themselves
are obtained from the Web or databases by a topic server 124. In one
embodiment, the Web is used as a real-time reference library of facts.
A group of data sources 126 includes HTML and other kinds of documents
on the Internet and in local databases. A webserver 128 is used to
serve HTTP queries coming in via the Web.
[0019] FIG. 2 represents a second e-mail answering agent embodiment of
the present invention, and is referred to herein by the general
reference numeral 200. The answering agent 200 can answer queries from
SMTP e-mail, HTTP Web, and wireless access protocol (WAP). An incoming
e-mail request 202 is carried by an SMTP system 204 to a POP-mailbox
206. A receiver 210 is connected to a POP3 server 208. A database 212
holds queries in a queue waiting for service. A helper 214 provides
automated help responses to the user. A scheduler 216 inserts jobs
that have been scheduled to be processed into the database queue for
work by a composer 218. An SMTP server 220 handles outgoing traffic in
the form of outgoing e-mail messages 222. An ad server 224 adds
commercial paid advertisements to the outgoing answers and responses
in a business model embodiment. A plurality of topic servers 226 are
each specialized to research particular topics from a variety of data
sources 228. A webserver 230 allows an Internet presence that can
receive and respond to HTTP requests 232. Incoming jobs can be
received from a web request 234 and also a WAP request 236. The
webserver is connected to an analyzer and topic server like the
composer, and sends answers to questions received from the Web and WAP
via an outgoing Web response 238 and a WAP response 240.
[0020] The way a topic server 226 derives information varies from
topic to topic, and is typically a six-step process. A first step
dispatches requests to one of several built-in topic "modules" each
appointed to handle one discrete topic, e.g., flight status, airfare,
area code, movies, dictionary, etc. A second step parses and validates
the query parameters. Each particular topic module knows exactly what
kind of input it needs. In the case of flight status, an airline and a
flight number are expected. In the case of travel directions, a
starting and an ending address are necessary. These parameters are
dissected using complex, flexible interpretations. For example, the
entry of a physical address has seemingly infinite variations, all of
which the system 100 and 200 must be able to interpret successfully. A
third step starts with a webpage or other given data source and the
parameters. It constructs a URL and the posted variables.
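
The first two steps (dispatch to a topic module, then parse and validate its parameters) might look something like the Python sketch below. It is only a guess at the shape: the registry TOPIC_MODULES, the function names, and the simple airline-plus-number pattern are all hypothetical, and a real module would need far more flexible parsing than this.

```python
import re


def parse_flight_status(qualifier):
    """Expect an airline name followed by a flight number, e.g. 'United 2507'.

    Hypothetical validator; the pattern is a deliberate simplification.
    """
    match = re.fullmatch(r"\s*([A-Za-z][A-Za-z ]*?)\s+(\d{1,4})\s*", qualifier)
    if match is None:
        raise ValueError("expected '<airline> <flight number>'")
    return {"airline": match.group(1), "flight": match.group(2)}


# Step one: each topic name maps to the one module that handles it.
# Only a single illustrative topic is registered here.
TOPIC_MODULES = {"flightstatus": parse_flight_status}


def dispatch(topic, qualifier):
    """Route a request to its topic module, which validates the parameters."""
    try:
        module = TOPIC_MODULES[topic]
    except KeyError:
        raise LookupError(f"unsupported topic: {topic}") from None
    return module(qualifier)


params = dispatch("flightstatus", "United 2507")
```

Keeping validation inside each module matches the statement that every topic module knows exactly what kind of input it needs.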
[0021] For example, once a physical address has been parsed from the
query parameters, variables such as
"addr=836+Green+Street&city=San+Francisco&state=CA" may be appended
to a standard URL. A fourth step fetches/posts to the URL and reads a
resulting HTML page, e.g., over a standard HTTP connection with the
data source's web site. A step five crops the resulting "raw" HTML to
the bare essential information. Typically, there is only a very small
section of the resulting web page that is useful to the user. The rest
consists of navigation links, advertising, and general aesthetic
layout. Only the raw results are needed, so the rest is stripped off.
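
The URL construction in the third step can be sketched with Python's standard library. The base URL below is a placeholder invented for this example (the patent text does not name the data source), and build_query_url is a hypothetical helper.

```python
from urllib.parse import urlencode

# Placeholder data source; not a real endpoint from the patent record.
BASE_URL = "https://maps.example.com/lookup"


def build_query_url(base_url, params):
    """Append form-encoded variables to a standard URL."""
    # urlencode escapes each value and encodes spaces as '+'.
    return base_url + "?" + urlencode(params)


url = build_query_url(
    BASE_URL,
    {"addr": "836 Green Street", "city": "San Francisco", "state": "CA"},
)
```

With these inputs the query string comes out in the same form as the variables quoted in paragraph [0021].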
[0022] A sixth and last step parses the HTML and reformats the results
depending on the requested output format. Such is a complex process
that must usually be custom built for each discrete topic that uses a
web site as its data source. Not every HTML page looks the same, so
each topic module must "know" the format of its respective data
source. When the response comes back, the topic module must interpret
the HTML, like a browser, to present it to the user in a meaningful
way. For example, if the requested output format is text, and the HTML
results contain a table of information, the table tags must be parsed
so that the rows and columns of information can be logically
redisplayed in plain text. There is a generalized method for doing
this that is shared among many topic modules, although no two are
purely identical. For example, the HTML "</TD>" tag signifies the end
of a column, which aids in separating informational tokens. The
"</TR>" tag signifies the end of a row in a table, and indicates a
logical place to put a line break.
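
To make the table handling concrete, here is a small Python sketch that treats the end of a cell as a column boundary and the end of a row as a line break, as the paragraph describes. It is an assumption-laden illustration built on the standard html.parser module, not the patent's actual "generalized method"; the class and function names are invented.

```python
from html.parser import HTMLParser


class TableToText(HTMLParser):
    """Collect table cells into rows (hypothetical helper class)."""

    def __init__(self):
        super().__init__()
        self.rows = []
        self._row = []
        self._cell = []
        self._in_cell = False

    def handle_starttag(self, tag, attrs):
        # HTMLParser lowercases tag names, so <TD> arrives as "td".
        if tag in ("td", "th"):
            self._in_cell = True
            self._cell = []

    def handle_data(self, data):
        if self._in_cell:
            self._cell.append(data)

    def handle_endtag(self, tag):
        if tag in ("td", "th") and self._in_cell:
            # End of a column: store the cell with whitespace collapsed.
            self._row.append(" ".join("".join(self._cell).split()))
            self._in_cell = False
        elif tag == "tr":
            # End of a row: a logical place for a line break.
            if self._row:
                self.rows.append(self._row)
            self._row = []


def table_to_text(html, separator="  "):
    """Redisplay the rows and columns of an HTML table as plain text."""
    parser = TableToText()
    parser.feed(html)
    parser.close()
    return "\n".join(separator.join(row) for row in parser.rows)


sample = (
    "<TABLE><TR><TD>Departing</TD><TD>5:38pm</TD></TR>"
    "<TR><TD>Arriving</TD><TD>1:11am</TD></TR></TABLE>"
)
text = table_to_text(sample)
```

A per-topic module would still need its own cropping rules before this step, since the paragraph notes that no two data sources are formatted identically.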
[0023] In an example of the operation of one embodiment of the present
invention, a user wants to know what time a particular airline flight
is supposed to land. The user sends an e-mail message to
"flightstatus@halibot.com" and includes the airline and flight number
in the subject field, for example,
From: user@somewhere.net
To: flightstatus@halibot.com
Subject: United 2507
[0024] When the message arrives in POP mailbox 206 on halibot.com, the
receiver 210 detects the new request. It parses out the topic based on
the address to which the message was sent and the request parameters
from the subject. It also determines the most appropriate output
format based on the e-mail application used to compose and send the
message. It then queues-up a new request in the database 212. The
composer 218 constantly polls the database 212 to detect any queued
requests. It connects to the topic server 226 and conveys the topic
name, parameters, and output format. The composer 218 takes any
results returned from the topic server 226 and sends a message back to
the user, for example,
From: flightstatus@halibot.com
To: user@somewhere.net
Subject: Re: United 2507

Flight information last updated less than 1 minute ago.
United Airlines 2507
Departing San Francisco Intl, CA 5:38pm
In Flight 329 mi SW of Chicago, IL 33000' 475 mph B744
Arriving Newark Intl, NJ 1:11am
[0025] FIG. 3A represents a composer main loop 300. A process 302
selects queued requests from the database. In each iteration of the
main loop, the composer 116 and 218 examines a "mail_queue" database
table to see if there are any requests that need to be processed. A
"fresh" request is identified when a "being_processed_by" column is
null, a "began_processing" column is null, and a "completed" column is
null. All requests that conform to these criteria are selected by an
SQL statement. Such sorts first by priority, and then by the time at
which the request was created. The highest priority is given to the
oldest requests. A typical SQL statement that can be used is in the
following table.
select * from mail_queue
where being_processed_by is null
  and began_processing is null
  and completed is null
order by priority, created
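
For anyone who wants to try the queue selection, the sketch below runs that query against an in-memory SQLite table from Python. Only the table name and the five columns named in paragraph [0025] come from the patent text; the id column, the column types, the sample rows, and the reading that a lower priority number sorts first are assumptions for this example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Assumed schema: only the column names used in the query are from the source.
conn.execute(
    """
    create table mail_queue (
        id integer primary key,
        priority integer,
        created text,
        being_processed_by text,
        began_processing text,
        completed text
    )
    """
)
# Invented sample rows; row 4 is already claimed by a composer.
conn.executemany(
    "insert into mail_queue (id, priority, created, being_processed_by) "
    "values (?, ?, ?, ?)",
    [
        (1, 2, "2004-02-16 10:00:00", None),
        (2, 1, "2004-02-16 10:05:00", None),
        (3, 1, "2004-02-16 10:01:00", None),
        (4, 1, "2004-02-16 09:00:00", "composer-1"),
    ],
)


def select_fresh(connection):
    """Return ids of fresh requests, by priority and then oldest first."""
    rows = connection.execute(
        "select id from mail_queue "
        "where being_processed_by is null "
        "and began_processing is null "
        "and completed is null "
        "order by priority, created"
    ).fetchall()
    return [row[0] for row in rows]


fresh_ids = select_fresh(conn)  # [3, 2, 1] for the sample rows above
```

Within one priority level the oldest request comes first, which is how the sketch interprets the statement that the oldest requests get the highest priority.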
...and so on.