Robots.txt
Kelly Murphy
Lessons
1. Intro - Technical SEO (01:40)
2. Definition and History (03:25)
3. Why It Matters (01:24)
4. Core Components (01:45)
5. Results + KPIs (01:26)
6. Goal Setting (00:39)
7. What is Technical SEO? - Quiz
8. Overview + Getting Ready (00:41)
9. Google Search Console (04:20)
10. Screaming Frog (06:11)
11. Tools - Quiz
12. Questions + Brief (01:05)
13. Competitive Overview (03:53)
14. Mini Audit (02:24)
15. How to Approach a Client - Quiz
16. Sitemaps (04:59)
17. Robots.txt (02:44)
18. HTTPS (02:08)
19. Broken Pages (05:55)
20. Duplicate Content (02:10)
21. Mobile Friendliness (07:44)
22. Performing an Audit (PART 1) - Quiz
23. Metadata + Header Tags (02:39)
24. Image Optimization (01:37)
25. Schema Markup (04:57)
26. Social Tags (01:57)
27. Performing an Audit (PART 2) - Quiz
28. Summary - SEO Course (00:33)
29. Final Quiz
Lesson Info
Robots.txt
Next is the robots.txt file, and this is very important. The robots.txt file can tell Google not to crawl certain pages, and it can also tell Google where the sitemaps are located on the site. The only requirements of a robots.txt file are the lines User-agent and Disallow. In the example we're looking at here for IBM, you can see those right here. For any robots.txt file, you know that it's valid if you can see that there's a User-agent line with a user-agent string, and then a Disallow line; you don't even have to have all of these extra Disallow rules down here, you just have to have a Disallow line in general. And below the Disallow line is the heart of the robots.txt file, because Disallow indicates which parts of the site the client is instructing crawlers not to crawl. So here you can see IBM telling Google not to crawl anything at /link, not to crawl anything at /common/error, and so on.
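To make that concrete, here is a minimal sketch of a valid robots.txt file along the lines just described. The user-agent and the Disallow paths are illustrative placeholders, not IBM's actual rules:

    User-agent: *
    Disallow: /common/error
    Disallow: /email/

    Sitemap: https://www.example.com/sitemap.xml

The User-agent line says which crawlers the rules apply to (* means all of them), each Disallow line blocks a path, and the optional Sitemap line is how the file points crawlers to the XML sitemaps mentioned earlier in the course.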
The first step you'll want to take is to go to your client's sitename.com/robots.txt. Again, we're looking at the example IBM.com/robots.txt URL to see whether they have that file uploaded. If they don't, it can usually be auto-created by most CMS platforms, similar to the XML sitemaps. Once you've got the file, you first want to check that the root domain of the site and any important pages are not listed under Disallow. We can check here and see that IBM.com, which would appear as just a single slash (/), is not under Disallow. This is a pretty common mistake after a new site launch or refresh, as developers will sometimes put URLs there that they don't want the public to access until those pages are ready to go live, and subsequently removing the Disallow rule is easy to forget. But we're in good shape here.

After that, you'll want to review the rest of the content under Disallow and make sure everything looks like extraneous pages, for example email content or third-party advertising, things that don't need to be crawled. Everything IBM has under Disallow here is presumably a part of their site that isn't important for the standard user to see in search results. If there are any important pages under Disallow that you think should be included in Google's index, you want to flag those to the client.
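The course walks through this check by eye, but if you want to script a first pass, here is a hedged sketch using Python's standard urllib.robotparser. The domain and the list of important pages are placeholders you'd swap for your client's site:

    # Sketch: check a client's robots.txt the way described above.
    # The domain and page list below are placeholders, not from the course.
    from urllib.robotparser import RobotFileParser

    robots_url = "https://www.example.com/robots.txt"

    parser = RobotFileParser()
    parser.set_url(robots_url)
    parser.read()  # fetches and parses the file; a missing (404) file is treated as allow-all

    # The root of the site and any important pages should not be disallowed.
    important_pages = [
        "https://www.example.com/",           # the root domain (the single slash)
        "https://www.example.com/products/",  # ...plus any pages the client cares about
    ]
    for url in important_pages:
        if parser.can_fetch("Googlebot", url):
            print(url, "- OK, crawlable")
        else:
            print(url, "- FLAG: blocked by robots.txt, raise with the client")

    # Bonus: list any sitemaps the file declares (available in Python 3.8+).
    print("Sitemaps:", parser.site_maps())

Note that Python's parser doesn't perfectly replicate Google's own rule-matching behavior, so treat this as a quick first pass before reviewing the Disallow list yourself, not a substitute for it.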