nutch
Spark-Crawler: an Apache Nutch-like crawler that runs on Apache Spark.
An OCR search engine built with Tesseract, Nutch, Solr, and PHP.
MapReduce project using Hadoop, Nutch, AWS EMR, Pig, Tez, and Hive.
Link ranking with Apache Giraph for Apache Nutch
A simple web crawler inside a Docker container using Apache Nutch 1 and Solr.
A very simple search engine "specialised" in searching financial news.
Quickly and easily launch Apache Solr linked with Apache Nutch in separate Docker containers.
Simple crawler using Apache Nutch and Elasticsearch.
Search Engine project for Information Retrieval class.
A spatial search website that allows users to search documents from the FBI Vault website. It extracts the most frequently occurring location in each document, loads the geo-tagged data into Apache Solr to index the documents, and visualizes search results using the Google Maps API.
Search engine knowledge system.
A distributed web crawler.