Gary Illyes of Google stated in the Search Off The Record podcast on crawling, which was briefly covered on Friday, that he is looking into methods for Google to better handle URL parameters.
Google’s John Mueller asked Gary Illyes, “What other types of crawling optimizations do you see happening?”
Gary suggested, “Maybe better URL parameter handling.”
URL parameters are the key-value pairs that appear after the question mark at the end of a URL. They are commonly used for tracking, but also for dynamic page building, among other things. They can lead Google to crawl an effectively unlimited URL space, as with calendar pages that continue indefinitely.
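To make the duplication concrete, here is a minimal Python sketch of how several parameterized URLs can collapse to one page once content-neutral parameters are dropped. The `IGNORED_PARAMS` set and the example.com URLs are assumptions chosen for illustration. This is not how Google decides which parameters to ignore.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit

# Hypothetical set of parameters assumed not to change the page content.
IGNORED_PARAMS = {"utm_source", "utm_medium", "utm_campaign", "sessionid"}


def strip_ignored_params(url: str) -> str:
    """Drop ignored query parameters, sort the rest, and discard the fragment."""
    parts = urlsplit(url)
    kept = [
        (key, value)
        for key, value in parse_qsl(parts.query, keep_blank_values=True)
        if key not in IGNORED_PARAMS
    ]
    kept.sort()
    return urlunsplit((parts.scheme, parts.netloc, parts.path, urlencode(kept), ""))


variants = [
    "https://example.com/shoes?color=red&utm_source=newsletter",
    "https://example.com/shoes?utm_campaign=spring&color=red",
    "https://example.com/shoes?color=red&sessionid=abc123",
]

# All three variants reduce to the same URL.
unique = {strip_ignored_params(url) for url in variants}
print(unique)
```

A crawler that does not know the ignore list in advance would have to fetch all three variants, which is the cost the article describes.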
John then asked Gary what he meant: “Something like the URL Parameter handling tool that we used to have, more in a protocol structure where you say, ‘This parameter is optional’?” Gary answered, “Oh, that’s a good idea.”
The problem is that Google must crawl all of the URL variants to determine which parameters can be canonicalized to the main URL. Gary commented, “We have to crawl first to know that anything is different, and we need a big sample of URLs to determine that, ‘Oh, these parameters are meaningless.’”
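The sampling idea can be illustrated with a small Python sketch. It assumes each crawled URL comes with a content fingerprint (for example, a hash of the page body), and flags a parameter as meaningless when changing only that parameter never changes the fingerprint. The function name, the fingerprints, and the heuristic itself are hypothetical. Google has not described its actual method at this level of detail.

```python
from collections import defaultdict
from urllib.parse import parse_qsl, urlsplit


def meaningless_params(samples):
    """Return parameters whose value never affected the content fingerprint.

    samples: iterable of (url, fingerprint) pairs from already-crawled pages.
    """
    parsed = []
    all_params = set()
    for url, fingerprint in samples:
        parts = urlsplit(url)
        query = dict(parse_qsl(parts.query))
        all_params.update(query)
        parsed.append((parts.path, query, fingerprint))

    result = set()
    for param in all_params:
        # Group URLs that are identical except for this one parameter.
        groups = defaultdict(lambda: (set(), set()))
        for path, query, fingerprint in parsed:
            rest = frozenset((k, v) for k, v in query.items() if k != param)
            values, fingerprints = groups[(path, rest)]
            values.add(query.get(param))  # None means the parameter is absent
            fingerprints.add(fingerprint)
        # Only groups where the parameter actually varied tell us anything.
        varied = [group for group in groups.values() if len(group[0]) > 1]
        if varied and all(len(fps) == 1 for _, fps in varied):
            result.add(param)
    return result


samples = [
    ("https://example.com/shoes?color=red", "A"),
    ("https://example.com/shoes?color=red&utm_source=mail", "A"),
    ("https://example.com/shoes?color=red&utm_source=ads", "A"),
    ("https://example.com/shoes?color=blue", "B"),
    ("https://example.com/shoes?color=blue&utm_source=mail", "B"),
]
print(meaningless_params(samples))
```

Note that every fingerprint in `samples` requires a fetch, so the conclusion is only available after the variants have been crawled.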
Google formerly had a URL parameter tool in Search Console, but it was deactivated in April 2022. Google got rid of it “because it was not used,” Gary explained.
So, what improvements can Google make? Gary stated, “If someone complains that we are over-crawling them because they have one of these weird URL spaces with an infinite number of URL parameters, we could simply tell them, ‘Okay, use this approach to block that URL space.’”
Gary noted that controls such as robots.txt could be used for this, as could rules such as “ignore everything following this symbol,” or a mix of the two.
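As a rough illustration of blocking a URL space with robots.txt-style rules, the Python sketch below matches paths against `Disallow` patterns using the `*` and `$` wildcards that Google's robots.txt parser supports. The rules in `DISALLOW` are hypothetical, and the matcher is deliberately simplified: it ignores `Allow` rules, user-agent groups, and rule precedence.

```python
import re

# Hypothetical Disallow patterns for an endless calendar and a session parameter.
DISALLOW = ["/calendar/", "/*?*sessionid="]


def rule_to_regex(pattern: str):
    """Translate a robots.txt path pattern with * and $ into a compiled regex."""
    anchored_end = pattern.endswith("$")
    if anchored_end:
        pattern = pattern[:-1]
    # Escape literal pieces and let each * match any run of characters.
    body = ".*".join(re.escape(piece) for piece in pattern.split("*"))
    return re.compile(body + ("$" if anchored_end else ""))


def is_blocked(path_and_query: str) -> bool:
    """Return True if any Disallow pattern matches from the start of the path."""
    return any(rule_to_regex(rule).match(path_and_query) for rule in DISALLOW)


print(is_blocked("/calendar/2031/05"))
print(is_blocked("/shoes?color=red&sessionid=abc"))
print(is_blocked("/shoes?color=red"))
```

Rules like these stop the crawl before any variant is fetched, which is the difference from detecting meaningless parameters after the fact.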
Source: Search Engine Roundtable