Ricardo Baeza-Yates, vice president of Yahoo! Research for Europe, Middle East and Latin America, will present "Towards a Distributed Web Search Engine" on Monday, April 26, 2010.
The event, which is free and open to the public, will take place from 11 a.m. to 12:15 p.m. in room 0105 of Murphey Hall on the University of North Carolina at Chapel Hill campus.
In the ocean of Web data, Web search engines are the primary way to access content. As the data is on the order of petabytes, current search engines are very large centralized systems based on replicated clusters. Web data, however, is always evolving. The number of Web sites continues to grow rapidly (230 million at the end of 2009), and there are currently more than 20 billion indexed pages. At the same time, there are more than one billion Internet users, and hundreds of millions of queries are issued each day. In the near future, centralized systems are likely to become less effective against such a data-query load, suggesting the need for fully distributed search engines. Such engines need to maintain high-quality answers, fast response time, high query throughput, high availability and scalability, in spite of network latency and scattered data. In this talk we present the main challenges behind the design of a distributed Web retrieval system and our research in all the components of a search engine: crawling, indexing, and query processing.
About Ricardo Baeza-Yates
Ricardo Baeza-Yates is vice president of Yahoo! Research for Europe, Middle East and Latin America, leading the labs in Barcelona, Spain, and Santiago, Chile.
Until 2005 he was the director of the Center for Web Research at the Dept. of Computer Science of the Engineering School of the University of Chile, and ICREA Professor at the Dept. of Technology of the University of Pompeu Fabra in Barcelona, Spain. He is co-author of the book Modern Information Retrieval, published in 1999 by Addison-Wesley (a second edition will appear in 2010); co-author of the 2nd edition of the Handbook of Algorithms and Data Structures, Addison-Wesley, 1991; and co-editor of Information Retrieval: Algorithms and Data Structures, Prentice-Hall, 1992, among more than 150 other publications. He received the Organization of American States award for young researchers in exact sciences (1993) and, with two Brazilian colleagues, the COMPAQ prize for the best Brazilian CS research article (1997). In 2003 he was the first computer scientist to be elected to the Chilean Academy of Sciences, and in 2009 he became an ACM Fellow.