2. Once a page has been found, its content must be crawlable.
A URL that has been discovered must also be crawlable by the spider. Database-generated dynamic URLs with many parameters, session IDs, pages built entirely in Flash, frame structures, large numbers of redirects, and large amounts of duplicated content can all turn spiders away at the door. These are the things to watch for.
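Two of the URL-level problems listed above, too many parameters and session IDs in the URL, can be checked mechanically. The following is only a minimal sketch: the threshold `MAX_PARAMS`, the parameter names in `SESSION_HINTS`, and the function `url_warnings` are assumptions made for illustration, not rules published by any search engine.

```python
from urllib.parse import urlsplit, parse_qsl

# Assumed threshold: more query parameters than this is treated as "a lot".
MAX_PARAMS = 3
# Assumed parameter names that often carry a session ID.
SESSION_HINTS = {"sessionid", "sid", "phpsessid", "jsessionid"}

def url_warnings(url):
    """Return a list of possible spider-trap warnings for one URL."""
    query = parse_qsl(urlsplit(url).query, keep_blank_values=True)
    warnings = []
    if len(query) > MAX_PARAMS:
        warnings.append("many parameters")
    if any(name.lower() in SESSION_HINTS for name, _ in query):
        warnings.append("session ID in URL")
    return warnings

print(url_warnings("http://example.com/list?a=1&b=2&c=3&d=4&sid=xyz"))
print(url_warnings("http://example.com/about"))
```

A check like this only looks at the URL itself. Flash-only pages, frames, and redirects have to be checked by looking at the page and the server's responses.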
1. If you don't know what frames are, you can skip this step, because you have already avoided this spider trap.
4. Redirects
1. Using Flash in part of a web page to enhance the visual effect is perfectly normal; many advertisements and icons use it today. Because this is only one part of an HTML page, it has little impact.
2. Frame-based page design was common in the early days, but websites rarely use frames now, so there is not much to say. Whether or not you use them, keep this trap in mind.
Hello, this is the first time I have published an article here. If anything in it is lacking, I hope the experts will point it out.
The use of
1. Some sites use a session ID to track user visits. Each visit generates a separate ID, which is then added to the URL. As a result, every time the spider crawls the site it is treated as a new user and receives a different URL for the same page, so it cannot crawl the site normally. This is a spider trap.
1. For a search engine to find a website's home page, there must be good external links pointing to it. Once the home page has been found, the spider crawls onward along its links.
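To show why a session ID in the URL makes one page look like many, here is a minimal sketch that removes session-ID parameters so that every visit maps back to the same address. The parameter names in `SESSION_PARAMS` and the function `strip_session_id` are assumptions made for illustration; a real site would use whatever name its own software puts in the URL.

```python
from urllib.parse import urlsplit, urlunsplit, parse_qsl, urlencode

# Assumed names of query parameters that carry a session ID.
SESSION_PARAMS = {"sessionid", "sid", "phpsessid", "jsessionid"}

def strip_session_id(url):
    """Return the URL with any session-ID query parameters removed."""
    parts = urlsplit(url)
    kept = [
        (name, value)
        for name, value in parse_qsl(parts.query, keep_blank_values=True)
        if name.lower() not in SESSION_PARAMS
    ]
    return urlunsplit(
        (parts.scheme, parts.netloc, parts.path, urlencode(kept), parts.fragment)
    )

# Two visits receive different session IDs but point at the same page.
first = strip_session_id("http://example.com/page?id=5&sessionID=abc123")
second = strip_session_id("http://example.com/page?id=5&sessionID=zzz999")
print(first == second)
```

Without this kind of clean-up the spider sees two distinct URLs with identical content. Keeping the session in a cookie instead of the URL avoids the problem at the source.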
Some sites use sessionID
2. Some websites, however, consist of a single large Flash file, which constitutes a spider trap: the spider has only one Flash link to follow and finds no other content, so this design should be avoided.
1. The search engine must be able to find the web pages.