Code : TPIM314
Course instructor : Assoc. Prof. Dr. M. Cosulschi
Form of study : Master's
Cycle : 2 Year : 2
Semester : 1, Lecture : 2h, Laboratory : 2h
Credits : 6
Field : Computer Science
Specialization : Advanced Techniques for Information Processing
Course type : compulsory
Formative category : specialized
Objectives:
- Learning the principles of the Semantic Web;
- Understanding how a search engine works;
- Modeling data on the Web;
- Learning the MapReduce programming paradigm.
Course content:
- Web architecture;
- The RDF model (Resource Description Framework);
- RDF data management. Querying RDF data with SPARQL;
- Architecture of Semantic Web applications. Linked Data;
- Automatic data extraction from Web pages;
- Cloud Computing;
- MapReduce - developing distributed applications with MapReduce;
- Apache Hadoop - architecture;
- Pig, Cassandra, etc.
Assessment : exam
Bibliography:
- S. Abiteboul, I. Manolescu, P. Rigaux, M.-C. Rousset, P. Senellart: Web Data Management, Cambridge University Press, 2011.
- C. D. Manning, P. Raghavan and H. Schütze, Introduction to Information Retrieval, Cambridge University Press, 2008.
- J. Lin and C. Dyer, Data-Intensive Text Processing with MapReduce, Morgan & Claypool Publishers, 2010.
- T. Heath and C. Bizer, Linked Data: Evolving the Web into a Global Data Space (1st edition), Synthesis Lectures on the Semantic Web: Theory and Technology, Morgan & Claypool, 2011.
- T. White, Hadoop: The Definitive Guide. Storage and Analysis at Internet Scale, 3rd Edition, O'Reilly Media / Yahoo Press, 2012.
- S. Abiteboul, R. Hull, V. Vianu, Foundations of Databases, Addison-Wesley, 1995.
- S. Abiteboul, P. Buneman, D. Suciu, Data on the Web: From Relations to Semistructured Data and XML, Morgan Kaufmann, 1999.
Teaching materials:
- Above the Clouds: A Berkeley View of Cloud Computing
- Resource Description Framework (RDF)
- Introducere catre RDF
- Introduction to RDF
- Linked data
- Inside search
- Datasets Available for Linked Open Data Initiatives
- PatchR Repository
- The Dark Face of Google.
- Introduction to RDF and the Semantic Web for the life sciences.
- Sindice.com: A Document-oriented Lookup Index for Open Linked Data
Software packages:
- Lucene - a free software library for information retrieval;
- SIREn - Efficient semi-structured Information Retrieval for Lucene;
- AllegroGraph - a scalable database for RDF data;
- Swoogle - a search engine for semantic data;
- Applications - that use open data.
Presentations:
- Data Sciences: From First Order Logic to the Web - Serge Abiteboul;
Other courses:
- Web Application Development
- CS 276: Information Retrieval and Web Search
- CSE 591: Semantic Web Mining
- CS561 Web Data Management (Spring 2017)
- Web and Social Information Extraction
- Web Data Management