SMFHacks.com

Topic: Archiver Mod info and robot.txt

GameSocket (Jr. Member)
Archiver Mod info and robot.txt
« on: June 22, 2006, 04:06:56 am »
I think I have found an advantage of installing this mod: today my site was crawled by "ia_archiver".
On investigating, this is what I found.

The crawler is Alexa's crawler (robot), which identifies itself as ia_archiver.
Whenever ia_archiver lands on the top level of a Web site, it looks for a file called "robots.txt". Robots.txt is a file website administrators can place at the top level of a site to direct the behavior of web crawling robots.

A crawler will always pick up a copy of the robots.txt file prior to its crawl of the site.
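
To make the "top level" point concrete, here is a minimal Python sketch of how a crawler could work out where robots.txt lives for any page on a site. The function name robots_txt_url and the example.com address are only illustrations, not anything Alexa documents.

```python
from urllib.parse import urlsplit, urlunsplit


def robots_txt_url(page_url: str) -> str:
    """Return the top-level robots.txt URL for the site that hosts page_url."""
    parts = urlsplit(page_url)
    # Keep the scheme and host, replace the path, drop query and fragment.
    return urlunsplit((parts.scheme, parts.netloc, "/robots.txt", "", ""))


if __name__ == "__main__":
    # Hypothetical forum URL, used only for illustration.
    print(robots_txt_url("http://www.example.com/forum/index.php?topic=1.0"))
```

Whatever page the crawler lands on, the file it asks for is always the one at the root of the host.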

To exclude all robots, the robots.txt file should look like this:

User-agent: *
Disallow: /

To exclude just one directory (and its subdirectories), say, the /images/ directory, the file should look like this:

User-agent: *
Disallow: /images/
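
If you want to check rules like these before uploading them, Python's standard urllib.robotparser module can evaluate them locally. This is only a rough sketch: the helper name is_allowed, the bot name "SomeBot" and the example.com URLs are made up for illustration, and real crawlers may interpret rules slightly differently from this parser.

```python
from urllib.robotparser import RobotFileParser


def is_allowed(rules: str, user_agent: str, url: str) -> bool:
    """Parse robots.txt text and report whether user_agent may fetch url."""
    parser = RobotFileParser()
    parser.parse(rules.splitlines())
    return parser.can_fetch(user_agent, url)


# The two rule sets discussed above.
EXCLUDE_ALL = """\
User-agent: *
Disallow: /
"""

EXCLUDE_IMAGES = """\
User-agent: *
Disallow: /images/
"""

if __name__ == "__main__":
    # Hypothetical paths, chosen to cover a page, an image and a subdirectory.
    for path in ("/index.php", "/images/logo.png", "/images/avatars/me.png"):
        url = "http://www.example.com" + path
        print(path,
              is_allowed(EXCLUDE_ALL, "SomeBot", url),
              is_allowed(EXCLUDE_IMAGES, "SomeBot", url))
```

The first rule set should refuse every path, while the second should refuse only paths under /images/, including its subdirectories.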

Web site administrators can allow or disallow specific robots from visiting part or all of their site. Alexa's crawler identifies itself as ia_archiver, and so to allow ia_archiver to visit (while preventing all others), your robots.txt file should look like this:

User-agent: ia_archiver
Disallow:

User-agent: *
Disallow: /

To prevent ia_archiver from visiting (while allowing all others), your robots.txt file should look like this:

User-agent: ia_archiver
Disallow: /
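
The per-robot rules can be checked the same way with Python's standard urllib.robotparser. Treat this as a sketch under assumptions: the helper name can_visit, the "Googlebot" stand-in for "some other robot" and the example.com URL are only illustrations. One thing it shows is that an ia_archiver group with an empty Disallow does not, on its own, keep other robots out; that takes a catch-all group as well.

```python
from urllib.robotparser import RobotFileParser


def can_visit(rules: str, user_agent: str, url: str) -> bool:
    """Return True if the robots.txt text lets user_agent fetch url."""
    parser = RobotFileParser()
    parser.parse(rules.splitlines())
    return parser.can_fetch(user_agent, url)


# Allow ia_archiver and block everyone else (note the catch-all group).
ONLY_IA_ARCHIVER = """\
User-agent: ia_archiver
Disallow:

User-agent: *
Disallow: /
"""

# Without the catch-all group, other robots are allowed by default.
IA_ARCHIVER_GROUP_ALONE = """\
User-agent: ia_archiver
Disallow:
"""

# Block ia_archiver and leave every other robot alone.
NO_IA_ARCHIVER = """\
User-agent: ia_archiver
Disallow: /
"""

if __name__ == "__main__":
    url = "http://www.example.com/forum/index.php"  # hypothetical URL
    for name, rules in (("only ia_archiver", ONLY_IA_ARCHIVER),
                        ("ia_archiver group alone", IA_ARCHIVER_GROUP_ALONE),
                        ("no ia_archiver", NO_IA_ARCHIVER)):
        print(name,
              can_visit(rules, "ia_archiver", url),
              can_visit(rules, "Googlebot", url))
```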

For more information regarding robots, crawling, and robots.txt visit the Web Robots Pages at http://www.robotstxt.org, an excellent source for the latest information on the Standard for Robots Exclusion.

In any event, if you simply visit your site with the Alexa Toolbar open, Alexa will learn of the site and add it to its list of sites to visit, thus ensuring your inclusion in the Alexa service and in the Alexa archive.
If you are the type of person who won't be satisfied until you get to click a button that says "Crawl My Site," then Alexa has just the form for you:

http://pages.alexa.com/help/webmasters/index.html#crawl_site


I had not been crawled by Alexa before installing this mod.

SMFHacks (Administrator)
Re: Archiver Mod info and robot.txt
« Reply #1 on: June 22, 2006, 06:42:26 am »
Good news. I think it is better, since it lets the search engines find the boards and threads more easily without going through all the other links they find.