Site Mapper

Look at Your Website the Way Google Does

How it works

Crawl a website, rank the pages, and extract content

Tools used

  • Get Links – examines a webpage and returns links to other pages
  • Page Rank – a simple implementation of the PageRank algorithm
  • URL2Text – extracts the main content from a webpage
  • Summarizer – creates a summary by extracting key topic sentences
  • Auto-Tag – generates keywords via Latent Dirichlet Allocation
  • D3 – a JavaScript library for generating graphs and visualizations
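
To make the ranking step concrete, here is a minimal sketch of PageRank by power iteration in plain Python. It is not the code behind the Page Rank microservice; the function name `pagerank`, the `damping` and `iterations` parameters, and the dict-of-lists input format are all assumptions made for illustration.

```python
def pagerank(links, damping=0.85, iterations=50):
    """Illustrative PageRank; links maps each page to the pages it links to."""
    # Collect every page that appears as a source or as a target.
    pages = set(links)
    for targets in links.values():
        pages.update(targets)
    n = len(pages)
    if n == 0:
        return {}

    # Start with rank spread evenly across all pages.
    rank = {page: 1.0 / n for page in pages}
    for _ in range(iterations):
        # Rank held by pages with no outgoing links is shared by everyone.
        dangling = sum(rank[p] for p in pages if not links.get(p))
        base = (1.0 - damping) / n + damping * dangling / n
        new_rank = {p: base for p in pages}
        # Each page passes an equal share of its rank to every page it links to.
        for page, targets in links.items():
            if targets:
                share = damping * rank[page] / len(targets)
                for target in targets:
                    new_rank[target] += share
        rank = new_rank
    return rank


if __name__ == "__main__":
    example = {"a": ["b", "c"], "b": ["c"], "c": ["a"]}
    for page, score in sorted(pagerank(example).items(), key=lambda kv: -kv[1]):
        print(page, round(score, 3))
```

Pages that many well-ranked pages link to end up with the highest scores, which is what drives the ranked list.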

Source Data

We examine the provided URL, then follow the links found on that page and on the pages it links to, up to a set depth. From the links between those pages we generate a visual map and a ranked list. For each page, we also provide a summary and a list of tags.
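
The depth-limited traversal described above can be sketched as a breadth-first crawl. This is an assumption about how such a crawler might be structured, not the demo's actual code: `crawl`, `get_links`, and `max_depth` are hypothetical names, and `get_links` stands in for whatever returns the links on a page.

```python
from collections import deque


def crawl(start_url, get_links, max_depth=2):
    """Breadth-first crawl; returns {page: [pages it links to]}.

    get_links is any callable that takes a URL and returns an iterable of URLs.
    """
    graph = {}
    seen = {start_url}
    queue = deque([(start_url, 0)])
    while queue:
        url, depth = queue.popleft()
        links = list(get_links(url))
        graph[url] = links
        # Links found on pages at the depth limit are recorded but not followed.
        if depth >= max_depth:
            continue
        for link in links:
            if link not in seen:
                seen.add(link)
                queue.append((link, depth + 1))
    return graph


if __name__ == "__main__":
    # A tiny in-memory "site" so the sketch runs without network access.
    site = {"/": ["/a", "/b"], "/a": ["/c"], "/b": ["/"], "/c": []}
    print(crawl("/", lambda url: site.get(url, []), max_depth=1))
```

In a real run, `get_links` could wrap a call to the Get Links microservice shown in the sample input further down, and the resulting graph could be handed to a ranking step.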

Takeaway

Algorithmia's easy-to-use microservices make it possible to quickly traverse and examine a large number of pages without the need to set up any infrastructure. By combining these services with readily available visualization tools, a developer can rapidly assemble a site crawler without ever leaving the browser environment.

Built For Developers

A simple, scalable API for machine intelligence

SAMPLE INPUT

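import Algorithmia

# Replace 'API KEY' with your own Algorithmia API key
client = Algorithmia.client('API KEY')
algo = client.algo('web/GetLinks/0.1.5')
# pipe() returns a response object; the list of links is in .result
print(algo.pipe("https://algorithmia.com").result)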

SAMPLE OUTPUT

[
  "http://developers.algorithmia.com",
  "https://algorithmia.com/terms",
  ...
]

Join the thousands of developers already building intelligent apps

Get 10,000 additional credits when you sign up using the code "demos"

SIGN UP FOR FREE

Algorithms as a Microservice

Leverage an ever-growing library of more than 2,200 algorithmic microservices via an intuitive API. We provide the tools and manage the cloud infrastructure needed to run them at scale.

Web Services for Business Logic

Instantly deploy your backend code as an API for public or private consumption. Every algorithm runs as its own microservice, making each composable, interoperable, and secure.

Hosted Trained Models

Have a trained machine learning or deep learning model? Turn it into a serverless microservice in minutes. We'll show you how to get started for free, and scale with ease.
