The Internet: How Search Works

Hi, my name is John. I lead the search and machine learning teams at Google. I think it's amazingly inspiring that people all over the world turn to search engines to ask trivial questions and incredibly important questions, so it's a huge responsibility to give them the best answers that we can.

Hi, my name is Akshaya and I work on the Bing search team. There are many times where we'll start looking into artificial intelligence and machine learning, but we have to address how the users are going to use this, because at the end of the day we want to make an impact on society.

Let's ask a simple question.
"How long does it take to travel to Mars?" Where did these results come from and
why was this listed before the other one? Okay, let's dive in and see how the
search engine turned your request into a result. The first thing you need to know
is when you do a search, the search engine isn't actually going out to the
World Wide Web to run your search in real time. That's because there are over a billion websites on the internet and hundreds more being created every
single minute, so if the search engine had to look through every single site to
find the one you wanted, it would just take forever. So to make your search faster,
search engines are constantly scanning the web in advance to record the
information that might help with your search later. That way when you search
about "travel to Mars" the search engine already has what it needs to give you an
answer in real time.

Here's how it works: The internet is a web of pages connected to each other by hyperlinks. Search engines are constantly running a program called a "spider" that crawls through these web pages to collect information about them. Each time it finds a hyperlink, it follows it until it has visited every page it can find on the entire internet. For each page the spider visits, it records any information it might need for a search by adding it to a special database called a search index.

Now, let's go back to that search from earlier and see if we can figure out how the search engine came up with the results. When you ask "how long does it take to travel to Mars", the search engine looks up each of those words in the search index to immediately get a list of all the pages on the internet containing those words, but just looking for these
search terms could return millions of pages so the search engine needs to be
able to determine the best matches to show you first. This is where it gets tricky because the search engine may need to guess what you're looking for. Each search engine uses its own algorithm to rank the pages based on what it thinks you want. The search engine's ranking algorithm might check if
your search term shows up in the page title, it might check if all of the words
show up next to each other, or it might run any number of other calculations that help it better determine which pages you'll want to see and which you won't. Google invented
the most famous algorithm for choosing the most relevant results for a search, by taking into account how many other web pages link to a given page. The idea is that if lots of websites think that a web page is interesting, then it's
probably the one you're looking for. This algorithm was called "PageRank", not because it ranks web pages, but because it was named after its inventor, Larry
Page, who is one of the founders of Google.

Because a website often makes money
when you visit it, spammers are constantly trying to find ways to game the search algorithm so that their pages are listed higher in the results. Search engines
regularly update their algorithms to prevent fake or untrustworthy sites from
reaching the top. Ultimately, it's up to you to keep an eye out for these pages that are untrustworthy by looking at the web address and making sure it's
a reliable source.

Search programs are always evolving to improve their algorithms so they return better, faster results than their competitors. Today's search engines even use information that you haven't explicitly provided to help you narrow down your search. So, for example, if you did a search for dog parks, many search engines would give you results for all the dog parks nearby even though you didn't type in your location.

Modern search engines also understand not just the words on a page, but what they actually mean, in order to find the best one that matches what you're looking for. For example, if you search for a "fast pitcher" it will know you're looking for an athlete, but if you search for "large pitcher" it will find you options for your kitchen.

To understand the words better, we use something called machine learning, a type of artificial intelligence. It enables search algorithms
to search not just individual letters or words in the page, but understand the
underlying meaning of the words. The internet is growing exponentially, but if the
teams that design search engines do our jobs right, the information you want should
always be just a few keystrokes away.
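
The crawl, index, and lookup steps described in the transcript can be sketched in a few lines of Python. This is only a toy illustration under stated assumptions: the four-page `TOY_WEB` and its contents are made up, and the function names `crawl`, `build_index`, and `lookup` are hypothetical. A real spider fetches pages over the network and a real search index stores far more than single words.

```python
from collections import deque

# Hypothetical toy "web": page name -> (page text, outgoing links).
# The pages and their contents are invented for illustration.
TOY_WEB = {
    "home": ("welcome to the space site", ["mars", "moon"]),
    "mars": ("how long does it take to travel to mars", ["home", "rockets"]),
    "moon": ("travel to the moon takes about three days", ["home", "mars"]),
    "rockets": ("rockets make space travel possible", ["mars"]),
}


def crawl(web, start):
    """Follow hyperlinks breadth-first from `start`, visiting each page once."""
    visited = []
    seen = {start}
    queue = deque([start])
    while queue:
        page = queue.popleft()
        visited.append(page)
        _text, links = web[page]
        for link in links:
            if link in web and link not in seen:
                seen.add(link)
                queue.append(link)
    return visited


def build_index(web, pages):
    """Build the search index: map each word to the set of pages containing it."""
    index = {}
    for page in pages:
        text, _links = web[page]
        for word in text.lower().split():
            index.setdefault(word, set()).add(page)
    return index


def lookup(index, query):
    """Return the pages that contain every word of the query."""
    words = query.lower().split()
    if not words:
        return set()
    results = set(index.get(words[0], set()))
    for word in words[1:]:
        results &= index.get(word, set())
    return results


# The crawl and the index are built in advance, before anyone searches.
pages = crawl(TOY_WEB, "home")
index = build_index(TOY_WEB, pages)

# At search time only the index is consulted, not the web itself.
print(lookup(index, "travel to mars"))  # {'mars'}
```

Because the index is prepared ahead of time, answering a query is a handful of dictionary lookups and set intersections rather than a fresh scan of every page.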

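The idea of ranking pages by how many other pages link to them can be sketched as a simplified, textbook-style PageRank iteration. This is an illustration under assumptions, not the ranking any real search engine runs today: the link graph `TOY_LINKS` is made up, the function name `rank_by_links` is hypothetical, the damping value 0.85 is the conventionally cited default, and every page is assumed to have at least one outgoing link.

```python
def rank_by_links(links, damping=0.85, iterations=50):
    """Simplified PageRank over a small link graph.

    `links` maps each page to the list of pages it links to.
    Assumption: every page has at least one outgoing link, and every
    link target is itself a key of `links`.
    """
    n = len(links)
    scores = {page: 1.0 / n for page in links}
    for _ in range(iterations):
        # Every page starts each round with a small baseline score.
        new_scores = {page: (1.0 - damping) / n for page in links}
        # Each page then passes its current score on, split evenly
        # across the pages it links to.
        for page, outgoing in links.items():
            share = damping * scores[page] / len(outgoing)
            for target in outgoing:
                new_scores[target] += share
        scores = new_scores
    return scores


# Hypothetical link graph, invented for illustration.
TOY_LINKS = {
    "home": ["mars", "moon"],
    "mars": ["home", "rockets"],
    "moon": ["home", "mars"],
    "rockets": ["mars"],
}

scores = rank_by_links(TOY_LINKS)

# Suppose these three pages all matched the search terms;
# show the best-linked page first.
matches = ["mars", "moon", "rockets"]
ranked = sorted(matches, key=lambda page: scores[page], reverse=True)
print(ranked)  # ['mars', 'rockets', 'moon']
```

In this toy graph "mars" comes first because three of the four pages link to it, and "rockets" beats "moon" because its single inbound link comes from the highly ranked "mars" page.
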
21 thoughts on “The Internet: How Search Works”

  1. Interesting, but I thought Google took advantage of a node system much like Dijkstra's algorithm, with nodes as used for GPS navigation systems in cars. That way each node has data, or a dataset, associated with it, which can make the process of a search smaller and faster, and then by ranking nodes you get a list of data back that is ordered for each search… Having a dataset about a dataset about a dataset seems quite like a very sad SQL database with a lot of foreign keys and normalisation, when NoSQL datasets like Mongo just get straight to the data and increase speed…

    In terms of machine learning, how do you make a machine always associate the word pitcher with a sports player and not a jug? Without using word combinations? So for example an array of words, and if word[0] and word[1] are in a sentence then it is dataset A that is selected maybe? I always think the fast and stupid ways work best, which means a neural network isn't necessary unless maybe each word is an input to a neuron… although it is easier to make each website a node (OOP) and, if similar, cluster them together and monitor each cluster and have a dataset that searches are referred to and that is updated by the search engine… that way the data is prepared and exists before a search is made…

    oh wait, Google does do this and charges us to get better rankings on their databases lol gosh I'm such a pathetic fool!

  2. Just a little correction, at @1:37 "The Internet is NOT a web of pages connected with each other". The correct explanation should be "The World Wide Web is a web of pages connected with each other", because the Internet is a web of computers connected with each other. Using the terms Internet & Web interchangeably has created so much confusion that people think they are both the same.

  3. We need to make some other technology which will change the way we are browsing data, by images or some quicker way to get what we need.

  4. The video was effective and it works somehow by spider… And another thing I found was that the girl was beautiful… When I see her it's like love at first sight <3 <3

  5. Awesome Information.
    Thanks for the valuable insights. I follow your steps for better results.
    Here I want to share some information about Check SEO Tool
    There is a Check SEO tool available in the market that helps in increasing your website rank. On-page SEO must be the foundation of a website; it helps individuals boost the website rank and check the status of the website against SEO standards. Use the Check SEO tool and maintain key factors such as content to boost your website ranking.

  6. Nice explanation of how search engines work. I have made an attempt to explain the same thing in Hindi. So if you want to understand it in Hindi then please do check the video:

  7. Awesome graphic design and presentation. Honestly, you guys ask exactly what most people would think to ask …… BUT your answers to the questions are far away from the truth!!

  8. @1:21 He said "….to make your search faster search engines are constantly scanning the web in advance to record information that may help with your search later". What in God's name does this mean? How the fuck does Google know what I'm going to do tomorrow if I didn't GOOGLE IT??????

  9. Hi, could someone tell me where this video fits in the 18-19 Computer Science Principles Course? I do not see it as part of any of the lessons. Thank you.
