Gegenfeld

How Search Engines Work

You have a question, Google has the answer — and that's no coincidence. Search engines like Google largely determine how we access information. But how do search engines actually work?

Summary: Search engines crawl the public internet using programs called crawlers (also known as robots or spiders). Crawled websites are added to an index, which can be thought of as a library catalog listing all known websites. When a search query is made, the search engine serves matching entries from its index, selected and ordered by an algorithm.

Before diving into the technical details, you should first understand what search engines are, how we use them in everyday life, and how your business can benefit from them.

Search Engine Basics

We focus here on search engines for websites on the World Wide Web, not on software-internal or intranet search engines. Let's start with the definition of a search engine.

What are search engines?

Search engines are programs that crawl content on the internet, store it in databases, and evaluate it using algorithms to enable users to search for information in a targeted way.

Search engines consist of three key components: crawlers (programs that search the public internet for content), a search index (a database containing information about websites and their content), and search algorithms (programs that match search queries to relevant content).

A user enters a search query into the search engine's input field using keywords. The search engine follows its algorithms and returns the most relevant results.

Search engines consist of three units that crawl, store, and deliver content from the public internet to users based on algorithms.
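
The interplay of index and search algorithm can be pictured with a toy example. The following is a minimal sketch, assuming a handful of hypothetical pages held in memory in place of a real crawl. The names `PAGES`, `build_index`, and `search` are made up for illustration, and real search engines use far more elaborate data structures and ranking.

```python
from collections import defaultdict

# Hypothetical "crawled" pages: URL -> page text (stand-in for a real crawl).
PAGES = {
    "https://example.com/seo": "seo basics for search engines",
    "https://example.com/crawl": "how crawlers crawl the web",
    "https://example.com/index": "search engines store pages in an index",
}

def build_index(pages):
    """Build an inverted index: word -> set of URLs containing that word."""
    index = defaultdict(set)
    for url, text in pages.items():
        for word in text.lower().split():
            index[word].add(url)
    return index

def search(index, query):
    """Return URLs ordered by how many query words they contain."""
    scores = defaultdict(int)
    for word in query.lower().split():
        for url in index.get(word, ()):
            scores[url] += 1
    # Sort by score (highest first), then by URL for a stable order.
    return sorted(scores, key=lambda url: (-scores[url], url))

index = build_index(PAGES)
results = search(index, "search engines index")
print(results)
```

Here the page that contains all three query words comes first, and the page that contains none of them is not returned at all. Counting matching words is only a stand-in for the many ranking factors discussed later.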

What is the goal of search engines?

Search engines aim to deliver the best possible results for a user's query so that they continue using that search engine in the future. For a user to make many searches, they must be satisfied with the organic (unpaid) results.

To ensure this, search engines rely on smart algorithms that evaluate websites and their content based on topical relevance and serve them to users. The assessment of topical relevance depends on many SEO factors and is re-evaluated for every search query.

How do search engines generate revenue?

Search engines are often operated by profit-oriented companies, such as Google, Microsoft (Bing), or Yahoo. These companies make money by placing advertisements in search results. The more often people use a search engine, the more ads they see, and more ads mean more revenue.

This logically means there are two types of search results: organic (unpaid) results, which are websites from the search engine's index, and paid results, where advertisers pay for the placement of their website.

When a user clicks on an organic result, the website owner gains a visitor for free. When a user clicks on a paid result, the advertiser pays a fee to the search engine. This billing model is known as PPC (pay-per-click), and the cost per click varies depending on the search query.

Why do you benefit from search engines?

Search engines give you the opportunity to rank your website in organic results for relevant topics and keywords. If you rank for high-demand keywords, you receive more traffic and page views at no cost. The foundation for this is appearing high up on search engine results pages (also called SERPs).

Your position in the SERPs is directly linked to the percentage of visitors you receive for a keyword. Simply put: higher ranking = more visitors.

Your goal is therefore to appear as prominently as possible in search results.

How Search Engines Index Content

You know Google, and Google probably knows you. That's thanks to the unimaginably large databases that search engines refer to as their "index," which can contain hundreds of billions of individual pages. Each page is stored together with metadata and evaluated by algorithms.

The indexing process is broken down into several steps.

URLs and waiting for crawlers

An index consists of individual web pages, and each page has its own URL (not to be confused with the domain). A domain is the address of the website as a whole (e.g. gegenfeld.com), while a URL is the exact location of a single resource on that website (e.g. https://gegenfeld.com/seo).
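
The difference is easy to see by taking a URL apart. The following short sketch uses `urlparse` from Python's standard library on the example URL from above:

```python
from urllib.parse import urlparse

parts = urlparse("https://gegenfeld.com/seo")

print(parts.scheme)  # protocol: "https"
print(parts.netloc)  # host (here, the domain): "gegenfeld.com"
print(parts.path)    # location of the resource on the site: "/seo"
```

Every page on the same website shares the domain, while the path differs from page to page.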

There are three main ways search engines discover new URLs:

Submitting URLs manually — Many search engines allow you to manually submit your URLs. You log in, confirm ownership of your domain, and submit your URLs directly to the search engine.

This process can take a few minutes.

The URL then enters the crawler's queue.

Sitemaps

A sitemap is like a family tree or hierarchical mind map of your website. It contains all the URLs you want indexed.

Search engines use your sitemap as a guide to find relevant URLs — like a map of your site that leads them to the interesting places. Sitemaps can be created manually or automatically using software. Many website builders and CMS platforms like WordPress generate them automatically. You can typically find your sitemap at example.com/sitemap.xml. After creating it, link to it in your website's footer and submit it to Google and other search engines.
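
To give an idea of what such a file contains, here is a minimal sketch that builds a sitemap with Python's standard library. The URLs are hypothetical placeholders, and `build_sitemap` is a made-up helper name. The `urlset`, `url`, and `loc` elements and the namespace follow the sitemaps.org protocol.

```python
import xml.etree.ElementTree as ET

# Hypothetical URLs you want indexed.
urls = [
    "https://example.com/",
    "https://example.com/seo",
    "https://example.com/contact",
]

# Namespace defined by the sitemaps.org protocol.
NS = "http://www.sitemaps.org/schemas/sitemap/0.9"

def build_sitemap(urls):
    """Return a sitemap XML document (as a string) listing the given URLs."""
    urlset = ET.Element("urlset", xmlns=NS)
    for url in urls:
        entry = ET.SubElement(urlset, "url")
        ET.SubElement(entry, "loc").text = url
    return ET.tostring(urlset, encoding="unicode")

sitemap_xml = build_sitemap(urls)
print(sitemap_xml)
```

In practice you would rarely write this by hand, since most CMS platforms generate the file for you, but the structure is the same: one `url` entry with a `loc` per page.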

Backlinks

Search engines read the HTML code of pages in their index. When Google finds a link on a page, it follows that link and may add the linked page to the index. This applies to both internal links and backlinks. If Website A links to a page on Website B, the search engine follows that link. On the new page, the same process repeats: HTML is read, links are followed, and over time a kind of "spider's web" of interconnected pages is formed — and these pages make up the search engine's index.
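
Reading links out of HTML can be sketched in a few lines. The example below is a simplified illustration, assuming a hypothetical page on "Website A" that contains one internal link and one backlink to "Website B". `LinkExtractor` is a made-up class name built on Python's standard `html.parser` module, not the code any search engine actually runs.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin

class LinkExtractor(HTMLParser):
    """Collect the targets of all <a href="..."> links on a page."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        for name, value in attrs:
            if name == "href" and value:
                # Resolve relative links against the page's own URL.
                self.links.append(urljoin(self.base_url, value))

# Hypothetical page on Website A: one internal link, one link to Website B.
html = '<p><a href="/about">About</a> <a href="https://website-b.example/page">B</a></p>'

extractor = LinkExtractor("https://website-a.example/home")
extractor.feed(html)
print(extractor.links)
```

Each URL collected this way is a candidate for the next visit, which is how the "spider's web" grows.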

Crawling

Websites are crawled by crawlers (also called robots, web crawlers, or spiders), such as Googlebot.

Crawlers follow links found on websites, analyze newly discovered pages, and repeat this process. The analysis considers various factors including the rate of change, age, and PageRank of a URL.

Crawlers are computer programs that automatically search the public internet by visiting URLs and analyzing websites. Crawling is the term for this process, carried out primarily by search engines.
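
The follow-links-and-repeat loop can be sketched as a breadth-first traversal. To stay runnable without network access, this example assumes a hypothetical link graph held in a dictionary (`LINKS`) in place of real web pages. A real crawler would download and analyze each page where the comment indicates.

```python
from collections import deque

# Hypothetical link graph: each URL maps to the URLs it links to.
LINKS = {
    "a.example/": ["a.example/blog", "b.example/"],
    "a.example/blog": ["a.example/"],
    "b.example/": ["c.example/"],
    "c.example/": [],
}

def crawl(start, links, max_pages=100):
    """Visit pages breadth-first, following links, and return the visit order."""
    queue = deque([start])
    seen = {start}
    visited = []
    while queue and len(visited) < max_pages:
        url = queue.popleft()
        visited.append(url)  # a real crawler would fetch and analyze the page here
        for target in links.get(url, []):
            if target not in seen:
                seen.add(target)
                queue.append(target)
    return visited

order = crawl("a.example/", LINKS)
print(order)
```

The `seen` set keeps the crawler from visiting the same URL twice, which matters because pages frequently link back to each other.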

Since this process requires significant resources, each website has a certain "crawl budget," meaning Google and other search engines allocate only limited resources to crawling and indexing your site. The higher your PageRank, the more Google invests in crawling your website. You should therefore only let pages that offer genuine value be crawled and indexed. For small websites this is less critical, but if your site has thousands of indexable URLs, it's worth paying attention to.
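
The effect of a limited budget can be illustrated with a small sketch. It assumes that every URL has some priority score and that only the top few are crawled. The scores, the URLs, and the function `pick_urls_to_crawl` are invented for illustration. How search engines actually allocate crawl budget is not public.

```python
import heapq

def pick_urls_to_crawl(candidates, budget):
    """Return up to `budget` URLs, highest priority first.

    `candidates` maps URL -> priority score (hypothetical; higher = more important).
    """
    # heapq.nlargest returns the `budget` entries with the highest scores.
    best = heapq.nlargest(budget, candidates.items(), key=lambda item: item[1])
    return [url for url, _score in best]

# Hypothetical scores for the pages of one site.
candidates = {
    "/": 0.9,
    "/seo": 0.7,
    "/tag/misc?page=37": 0.1,
    "/contact": 0.4,
}

print(pick_urls_to_crawl(candidates, budget=2))
```

With a budget of two, the low-value tag page never gets crawled. On a large site, thousands of such pages can compete with the pages you actually care about.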

Processing and rendering

To serve your website to the right users, it must first be processed. Google renders your website's code and tries to understand its content — including metadata, text, images, and links.

Processing is increasingly handled by machine-learning algorithms and artificial intelligence. Google also uses NLP (Natural Language Processing) to better understand the meaning of written content, for example by identifying the entities (such as people, places, or organizations) that a text mentions.
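
A first, very basic processing step is separating metadata from text. The sketch below pulls the title, the meta description, and the text out of a hypothetical HTML page using Python's standard `html.parser`. `PageSummary` is a made-up class name. The sketch does not render JavaScript or apply any NLP, both of which a real search engine would do.

```python
from html.parser import HTMLParser

class PageSummary(HTMLParser):
    """Pull the title, meta description, and text out of an HTML page."""

    def __init__(self):
        super().__init__()
        self.title = ""
        self.description = ""
        self.text_parts = []
        self._in_title = False

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "title":
            self._in_title = True
        elif tag == "meta" and attrs.get("name") == "description":
            self.description = attrs.get("content", "")

    def handle_endtag(self, tag):
        if tag == "title":
            self._in_title = False

    def handle_data(self, data):
        if self._in_title:
            self.title += data
        elif data.strip():
            self.text_parts.append(data.strip())

# Hypothetical page used only for this example.
html = (
    "<html><head><title>SEO Guide</title>"
    '<meta name="description" content="Learn the basics of SEO."></head>'
    "<body><h1>SEO</h1><p>Search engines index pages.</p></body></html>"
)

page = PageSummary()
page.feed(html)
print(page.title, page.description, page.text_parts)
```

The extracted pieces are the raw material that later steps, such as entity recognition, would work on.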

How Search Engines Rank

Now that you understand how search engines find and index pages, it's important to understand how they evaluate and rank websites.

Search engines use AI and algorithms to assess a large number of so-called ranking factors that determine when a website is shown for which search query. The exact workings of these algorithms are not publicly known, and because machine learning is involved, arguably even Google cannot list every factor. That said, SEO professionals broadly agree on some of the most important ones, including backlinks, Core Web Vitals (CWVs), structured data, search intent, mobile-first optimization, E-E-A-T (Experience, Expertise, Authoritativeness, Trustworthiness), dwell time, brand awareness, internal links, UX design, and SERP snippet optimization.

Backlinks

Backlinks have been among the most important ranking factors for years. However, not every backlink carries the same value. A backlink from a suspicious website, or from a site that hands out hundreds of different backlinks, carries less weight. Backlinks can also have a negative impact, for example if spam is suspected. Google's PageRank algorithm scores a page by how many links point to it and by how highly the linking pages score themselves. The scores can be computed with repeated matrix multiplications.
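
The idea behind PageRank can be sketched with the classic power-iteration method, in which the update loop plays the role of the repeated matrix multiplication. This is a simplified illustration of the originally published algorithm on a hypothetical three-page graph, not a description of what Google runs today. The damping value of 0.85 is the commonly cited default, and the sketch assumes every linked page also appears as a key in `links`.

```python
def pagerank(links, damping=0.85, iterations=50):
    """Approximate PageRank by power iteration.

    `links` maps each page to the list of pages it links to.
    """
    pages = list(links)
    n = len(pages)
    rank = {page: 1.0 / n for page in pages}
    for _ in range(iterations):
        new_rank = {page: (1.0 - damping) / n for page in pages}
        for page, targets in links.items():
            if targets:
                # A page splits its rank evenly among the pages it links to.
                share = damping * rank[page] / len(targets)
                for target in targets:
                    new_rank[target] += share
            else:
                # A page without outgoing links spreads its rank over all pages.
                for other in pages:
                    new_rank[other] += damping * rank[page] / n
        rank = new_rank
    return rank

# Hypothetical link graph: A and C both link to B, B links back to A.
links = {"A": ["B"], "B": ["A"], "C": ["B"]}
ranks = pagerank(links)
print(ranks)
```

In this graph, B ends up with the highest score because two pages link to it. A comes second because its single backlink comes from the highly ranked B, and C comes last because nothing links to it. This mirrors the point that a backlink's value depends on the page it comes from.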

Search intent

At least as important as backlinks is matching the user's search intent — meaning you publish content that delivers exactly what the user is looking for.

This factor can also be described as "topical relevance." The better your page solves your visitors' problems, the more search engines will recommend it. Google measures how users interact with results and whether a page meets their expectations.

Mobile-first

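The mobile-first approach means optimizing websites for mobile devices (smartphones, tablets, and so on) at least as carefully as for the desktop view. Google predominantly crawls and indexes websites with a smartphone version of Googlebot (mobile-first indexing), so the mobile version of a page is the one that counts, and usability issues on mobile are detected along the way.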

This reflects the fact that over 50% of all searches come from a smartphone or tablet on average. Mobile-first optimization includes adapted image and font sizes as well as technical factors like fast load times and Core Web Vitals compliance.

How Search Results Vary

Every user receives personalized search results from Google, adjusted based on various factors.

Location — A key factor in what content is shown is the user's location, which can be determined via GPS, IP address, Wi-Fi connection, or mobile network.

Language — Even more important than location is the language in which a search is conducted. The language is usually identified from the keywords used. If a query contains words used across multiple languages, the user's language may be inferred from their location, device language, browser language, or Google account settings.

Search history — Previous searches, activity on other websites, and Google account settings can also influence the SERPs. For a more neutral view of search results, you can open a guest window in your browser, install a Chrome plugin (such as MozBar), or use Google through a third-party service like Startpage.

Written by

Gegenfeld Team