Facto - The Generic Fact Lookup Engine

Xiaoxin Yin, Wenzhao Tan

Internet Services Research Center (ISRC), Microsoft Research

Access it at http://lepton.research.microsoft.com/facto/ 

Facto is a fact lookup engine that aims to answer user questions about facts concerning entities. It has three distinguishing features:

       Fully automated: No human labeling is needed.

       Domain independent: Facto handles data from all over the web.

       Self-curated: Facto uses data from different web sites to predict the trustworthiness of the data.

Facto Technologies

Data collection pipeline:

Facto identifies attribute-value tables on the web and extracts information from them. It also extracts the main entity of each web page; these entities are combined with the attribute-value pairs to form a large database. Equivalent entities and equivalent attributes are then identified from the data.
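The collection step can be pictured with a minimal sketch. It assumes each page has already been reduced to its main entity plus a list of attribute-value rows, and it stands in for equivalence detection with two small hand-written synonym maps. The function names, the synonym maps, and the sample pages are all hypothetical; the page does not describe how Facto actually extracts tables or identifies equivalents.

```python
from collections import defaultdict

# Hypothetical synonym maps. Facto identifies equivalent entities and
# attributes from the data itself; these hand-written maps only stand in
# for that step.
ATTRIBUTE_SYNONYMS = {"birthdate": "birthday", "date of birth": "birthday"}
ENTITY_SYNONYMS = {"mt. rainier": "mount rainier"}


def normalize(name, synonyms):
    """Lowercase, collapse whitespace, and map known synonyms to one form."""
    key = " ".join(name.lower().split())
    return synonyms.get(key, key)


def build_database(pages):
    """Combine each page's main entity with its attribute-value rows.

    pages: iterable of (url, main_entity, [(attribute, value), ...]).
    Returns a dict mapping (entity, attribute) to a list of (value, url).
    """
    db = defaultdict(list)
    for url, main_entity, rows in pages:
        entity = normalize(main_entity, ENTITY_SYNONYMS)
        for attribute, value in rows:
            attr = normalize(attribute, ATTRIBUTE_SYNONYMS)
            db[(entity, attr)].append((value.strip(), url))
    return db


# Illustrative input only; these are not pages from Facto's crawl.
pages = [
    ("http://a.example/rainier", "Mount Rainier", [("Height", "14,411 ft")]),
    ("http://b.example/rainier", "Mt. Rainier", [("height", "14,411 ft")]),
    ("http://c.example/lawless", "Lucy Lawless", [("Date of Birth", "March 29, 1968")]),
]
db = build_database(pages)
```

Keeping the source URL next to every value means that later stages can still compare what different sites say about the same entity and attribute.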

Query answering pipeline:

When it receives a user query, Facto decomposes it into all possible combinations of entity name and attribute and tries to match each combination against its database. It retrieves the answers for each combination and aggregates them to select the best answer.
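A rough sketch of this decompose, match, and aggregate flow is given here. It assumes a database keyed by (entity, attribute) pairs, drops a small hypothetical stop-word list, and aggregates by simple majority vote. The stop words, the function names, the toy database, and the voting rule are all assumptions made for illustration; the page does not specify how Facto scores or aggregates candidate answers.

```python
from collections import Counter

# Hypothetical stop words stripped before splitting the query.
STOP_WORDS = {"what", "is", "the", "of", "how", "when", "was"}


def decompose(query):
    """Return every (entity, attribute) split of the query tokens."""
    tokens = [
        t for t in query.lower().replace("'s", "").split()
        if t not in STOP_WORDS
    ]
    combos = []
    for i in range(1, len(tokens)):
        left, right = " ".join(tokens[:i]), " ".join(tokens[i:])
        combos.append((left, right))  # entity first, e.g. "mount rainier height"
        combos.append((right, left))  # attribute first, e.g. "height of mount rainier"
    return combos


def answer(query, db):
    """Look up every combination and return the most frequent value, or None."""
    votes = Counter()
    for entity, attribute in decompose(query):
        for value, _source in db.get((entity, attribute), []):
            votes[value] += 1  # majority vote is an assumed aggregation rule
    if not votes:
        return None
    return votes.most_common(1)[0][0]


# Toy database for illustration: (entity, attribute) -> [(value, source), ...]
db = {
    ("mount rainier", "height"): [
        ("14,411 ft", "http://a.example"),
        ("14,411 ft", "http://b.example"),
        ("14,410 ft", "http://c.example"),
    ],
}

best = answer("height of mount rainier", db)
```

Trying both orders of each split lets the same lookup serve queries that put the entity first and queries that put the attribute first. A query whose attribute is phrased differently from the stored one (for example "how long" for a length attribute) would additionally need the attribute equivalences built during data collection.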

Facto Examples

{Britney Spears height}

{height of mount rainier}

{Renee O'Connor birthday}

{what is the net worth of bill gates}

{how long is yangtze river}

{lucy lawless birthdate}

 

{microsoft number of employees}

{tom cruise's mother}

{ticketmaster headquarters}

{twilight cast}

{when was pixar founded}

 

{seattle longitude}

{mass of mars}