Subject: Text/Language processing & Categorization
Category: Computers
Asked by: knowledgeseeker-ga
List Price: $10.00
Posted: 27 Jul 2003 12:10 PDT
Expires: 26 Aug 2003 12:10 PDT
Question ID: 235698
Does anyone know of, or can anyone help me find, software or an algorithm that takes a short piece of text as input and outputs the set of categories that the text belongs to? A relevance index for each category would also help. What I am looking for is a piece of code or an algorithm that classifies a short piece of text and possibly also extracts relevant keywords.

Example 1: "What is the best French restaurant in New York City?" The output would be: Categories - Restaurant, French; or Category - Restaurant, Sub-Category - French.

Example 2: "Give me a list of available books in Visual Basic". Result: Category - Books, Sub-Category - Visual Basic.

It would be nice if the database of categories and sub-categories used for classification were expandable. I'm looking for something that would let me programmatically select the relevant Category/Sub-Category to post under here on Google Answers, based on the question.

Thank you.

There is no answer at this time.

Subject: Re: Text/Language processing & Categorization
From: hailstorm-ga on 25 Aug 2003 23:33 PDT
knowledgeseeker,

This was a very interesting question that I would have liked to answer. However, though simple in nature, this type of question is very difficult to solve reliably in code, since it requires the "intelligence" to parse the important information out of an English-language statement.

One thought I had was to use the Google API to extract directory classification information through a Google query. My results for your two sample questions, plus one more of my own creation, were:

What is the best French restaurant in New York City?
- Top/Regional/North_America/United_States/New_York/Localities/N/New_York_City/Manhattan/Business_and_Economy/Restaurants_and_Bars/Guides_and_Directories

Give me a list of available books in Visual Basic
- Top/Computers/Programming/Languages/Visual_Basic/Resources

What is the most popular car made in America?
- Top/Arts/Literature/Genres/Cyberpunk

The first two queries provide all the classification information we want. Actually, they provide too much information, but it may be possible to whittle that down with further programming. Unfortunately, the result for the third query is completely wrong. So if we can't rely on Google and its multi-petabyte storehouse of information, it may prove difficult to find any solution that can adequately address your needs.
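
As a rough illustration of the "whittle that down" idea, here is a minimal Python sketch that reduces a directory path like the ones above to a category and sub-category. The function name `whittle`, the `GENERIC_SEGMENTS` set, and the rule of keeping the last two informative segments are all assumptions made for this example, not part of any Google API, and the sketch does nothing about wrong results like the third one.

```python
# Segments assumed (for this sketch only) to be too generic to serve as categories.
GENERIC_SEGMENTS = {"Top", "Resources", "Guides_and_Directories"}


def whittle(path, depth=2):
    """Return the last `depth` informative segments of a directory path."""
    # Split on "/" and drop empty or generic segments.
    segments = [s for s in path.split("/") if s and s not in GENERIC_SEGMENTS]
    # Keep only the tail of the path and make it human-readable.
    return [s.replace("_", " ") for s in segments[-depth:]]


if __name__ == "__main__":
    category, sub_category = whittle(
        "Top/Computers/Programming/Languages/Visual_Basic/Resources"
    )
    print("Category:", category, "/ Sub-Category:", sub_category)
```

A real version would probably need a hand-maintained mapping from directory segments to the desired category names, since the tail of the path is not always the most useful part.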

Subject: Re: Text/Language processing & Categorization
From: bio-ga on 26 Aug 2003 11:35 PDT
Hi Knowledgeseeker,

I think a Bayesian classification algorithm can be designed for this task. But you will first have to "train" it (possibly using all the past questions asked in GA). Search Google for "bayesian text classification" for more information.

Bio
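
To make the Bayesian suggestion concrete, here is a minimal sketch of a multinomial naive Bayes text classifier with Laplace smoothing, written in plain Python. The class name, the tokenizer, and the four training questions are invented for illustration; a usable classifier would need to be trained on a large set of real, already-categorized questions. The normalized scores it returns could serve as a rough relevance index, and new categories can be added simply by training on examples labeled with them.

```python
import math
import re
from collections import Counter, defaultdict


def tokenize(text):
    """Lowercase the text and split it into alphabetic word tokens."""
    return re.findall(r"[a-z]+", text.lower())


class NaiveBayesClassifier:
    def __init__(self):
        self.word_counts = defaultdict(Counter)  # category -> word -> count
        self.doc_counts = Counter()              # category -> number of training texts
        self.vocabulary = set()                  # every word seen during training

    def train(self, text, category):
        self.doc_counts[category] += 1
        for word in tokenize(text):
            self.word_counts[category][word] += 1
            self.vocabulary.add(word)

    def scores(self, text):
        """Return {category: normalized posterior probability}."""
        words = [w for w in tokenize(text) if w in self.vocabulary]
        total_docs = sum(self.doc_counts.values())
        log_scores = {}
        for category, n_docs in self.doc_counts.items():
            counts = self.word_counts[category]
            denominator = sum(counts.values()) + len(self.vocabulary)
            log_p = math.log(n_docs / total_docs)  # prior P(category)
            for word in words:
                # Laplace (add-one) smoothing avoids zero probabilities.
                log_p += math.log((counts[word] + 1) / denominator)
            log_scores[category] = log_p
        # Convert log scores to probabilities that sum to 1.
        peak = max(log_scores.values())
        weights = {c: math.exp(s - peak) for c, s in log_scores.items()}
        norm = sum(weights.values())
        return {c: w / norm for c, w in weights.items()}

    def classify(self, text):
        scores = self.scores(text)
        return max(scores, key=scores.get)


# Hypothetical training data; real use would draw on many past questions.
classifier = NaiveBayesClassifier()
classifier.train("Which Italian restaurant serves the best pizza?", "Restaurants")
classifier.train("Recommend a sushi restaurant open late", "Restaurants")
classifier.train("Which books explain Java programming?", "Books")
classifier.train("Recommend a list of books about Python", "Books")

if __name__ == "__main__":
    question = "What is the best French restaurant in New York City?"
    print(classifier.classify(question), classifier.scores(question))
```

With so little training data the probabilities mean very little; the point is only to show the shape of the train-then-classify loop. Sub-categories could be handled the same way, for instance by training a second classifier per top-level category.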