Good Day alangrah-ga,
Before I answer your question, please keep in mind:
There are very few people in this world who KNOW any search engine's
algorithms and ways. We Google researchers are not included in that
small group, even when it comes to Google itself. This means that my
answer will be based on what the SEO (search engine optimisation)
community has learned about the ways of Google and other search
engines (SEs), rather than on scientific information derived from the
algorithms or behind-the-scenes knowledge.
DOMAIN NAMES: WITH WWW vs. WITHOUT WWW
--------------------------------------
At present Google considers a domain name with and without the WWW
prefix as two separate websites. This fact can carry over even beyond
the www prefix and into other areas:
http://nettica.com
http://nettica.com/index.html
http://www.nettica.com/index.html
The above should all point to the same location, but inside the Google
index they are three separate entries. Looking ahead, you should
always use your full URL, and specifically request that any links to
your website include your chosen URL format (I personally recommend
http://www.nettica.com/).
To minimize any potential damage for anything already done, 301
permanent forwarding is the answer. While there are many ways to
forward a URL, the 301 method is the safest and best recognized by
Google and other search engines. To utilize this method you need to
create a .htaccess file in your root directory, or add code to an
existing .htaccess file.
Here is a sample .htaccess code for the first URL listed above:
# Permanently (301) redirect nettica.com to www.nettica.com
RewriteEngine On
RewriteCond %{HTTP_HOST} ^nettica\.com$ [NC]
RewriteRule ^(.*)$ http://www.nettica.com/$1 [L,R=301]
Use that formula to add forwards from the other possible URL formats.
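To see which aliases need a forward and where each one should land, the mapping can be sketched in Python. This is only an illustration under assumptions: the helper name canonical_url and the constants are hypothetical, and index.html is assumed to be the only default document name. It performs no redirect itself; the 301 still has to be set up on the server.

```python
from urllib.parse import urlsplit, urlunsplit

PREFERRED_HOST = "www.nettica.com"  # the recommended URL format
BARE_HOST = "nettica.com"           # alias without the www prefix
DEFAULT_DOCS = ("index.html",)      # assumption: the only default document


def canonical_url(url):
    """Return the preferred form of a URL that may be an alias."""
    scheme, host, path, query, fragment = urlsplit(url)
    host = host.lower()
    if host == BARE_HOST:
        host = PREFERRED_HOST
    if not path:
        path = "/"
    # Treat .../index.html as the same resource as the directory URL.
    head, _, tail = path.rpartition("/")
    if tail in DEFAULT_DOCS:
        path = head + "/"
    return urlunsplit((scheme, host, path, query, fragment))


for alias in ("http://nettica.com",
              "http://nettica.com/index.html",
              "http://www.nettica.com/index.html"):
    print(alias, "->", canonical_url(alias))
```

Each alias that maps to a different string than itself is a candidate for its own 301 forward.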
Make sure that you check your server for an existing .htaccess file
before uploading a new one. If you have an existing .htaccess file
and upload a new file, the old settings will be deleted. In this case
you would want to download the file first, and then make modifications
to it, thus preserving all other settings.
I have heard rumours that Google is working on a system modification
that will recognize the same URL simply displayed in a different
format. Until such a system is implemented, the 301 permanent redirect
is your best option for a solid PageRank (explained further down)
without leaks, and the best way to avoid spammer status.
Once you have the 301 forwarding in place, make sure that any links on
your site follow your preferred URL format. Failing to do so can
confuse the search engine, and will potentially result in a spammer
flag being set off at some point. Avoid using URL formats which point
to the forwarded alias, and link directly to the correct location.
This will keep you off the "mirroring black list".
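One way to check your own pages for links that still point at a forwarded alias is a small script like the one below. It is a minimal sketch, not a complete crawler: the names LinkCollector and links_to_aliases are hypothetical, the sample page and its file names (products.aspx, support.aspx) are made up for illustration, and only the aliases discussed above are checked.

```python
from html.parser import HTMLParser
from urllib.parse import urlsplit

PREFERRED_HOST = "www.nettica.com"
ALIAS_HOSTS = {"nettica.com"}  # hosts that are forwarded to the preferred one


class LinkCollector(HTMLParser):
    """Collect the href value of every <a> tag."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.hrefs.append(value)


def links_to_aliases(html):
    """Return hrefs that use an alias host or an explicit index.html."""
    collector = LinkCollector()
    collector.feed(html)
    flagged = []
    for href in collector.hrefs:
        parts = urlsplit(href)
        host = parts.netloc.lower()
        last_segment = parts.path.rsplit("/", 1)[-1]
        if host in ALIAS_HOSTS:
            flagged.append(href)
        elif host in ("", PREFERRED_HOST) and last_segment == "index.html":
            flagged.append(href)
    return flagged


# Hypothetical page content, for illustration only.
SAMPLE_PAGE = """
<a href="http://www.nettica.com/">Home</a>
<a href="http://nettica.com/products.aspx">Products</a>
<a href="http://www.nettica.com/index.html">Start</a>
<a href="/support.aspx">Support</a>
"""

for href in links_to_aliases(SAMPLE_PAGE):
    print("fix this link:", href)
```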
SPIDERING PAGES BEYOND THE HOMEPAGE
-----------------------------------
When the Google spider finds your website, it will try to follow some
links on the page. The Google robot is a pretty simple being, and
doesn't like too much fancy stuff. In fact, JavaScript links, Flash
content of any kind, dynamically generated pages based on user input,
and many other types of links will not be followed by Google. It
simply cannot read such content.
Your pages are all in ASPX format, which is acceptable for the most
part. There are ASPX pages in Google, but some people are having the
same problem you are experiencing. ASPX files seem to go unnoticed in
some cases.
While there is no easy answer, here are a couple of recommendations:
- Google likes the lowest possible ratio of code to content. Place
your CSS items in a separate file and load it in the header. This
will result in cleaner code, improved load time, reduced bandwidth,
and a better content-to-code ratio.
- You've taken the wonderful step of creating a site map. Google
loves site maps. Still, to make sure it is fully utilized by the
Google spider, make every link on the site map page a text link:
no images, just text. Google will love you even more.
Make sure you include all the pages that exist on your website, even
those below the second layer.
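The site map advice above can be checked mechanically. The sketch below separates plain text links from image-only and JavaScript links. It is an illustration under assumptions: the class name SiteMapAudit and the sample page names are hypothetical, and the rules only approximate what the spider is said to follow.

```python
from html.parser import HTMLParser


class SiteMapAudit(HTMLParser):
    """Split <a> tags into plain text links and problem links."""

    def __init__(self):
        super().__init__()
        self.text_links = []     # (href, label) pairs that look followable
        self.problem_links = []  # image-only, JavaScript, or empty links
        self._href = None        # href of the <a> currently open, if any
        self._text = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            self._href = dict(attrs).get("href") or ""
            self._text = []

    def handle_data(self, data):
        if self._href is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag == "a" and self._href is not None:
            label = "".join(self._text).strip()
            href = self._href
            if label and href and not href.lower().startswith("javascript:"):
                self.text_links.append((href, label))
            else:
                self.problem_links.append(href)
            self._href = None


# Hypothetical site map fragment, for illustration only.
SITE_MAP = """
<a href="/dns.aspx">Managed DNS</a>
<a href="/support.aspx"><img src="support.gif"></a>
<a href="javascript:openPage('pricing')">Pricing</a>
"""

audit = SiteMapAudit()
audit.feed(SITE_MAP)
print("text links:", audit.text_links)
print("needs attention:", audit.problem_links)
```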
Please note that sometimes Google will not follow all the links at the
time it finds the home page. Other times it will, but the new content
takes a while to make it into the search database. Give it some time
before you put too much effort into investigating why your other
pages are not listed yet.
LISTING PROCESS FOR DMOZ.ORG
----------------------------
As you might already know, DMOZ submissions are reviewed by humans
before any listing is added to the online index. Some categories don't
have an editor, and it might be months before an editor is available
and the backlog is cleared.
A couple of things you can do here: contrary to popular belief, you can
have two listings at DMOZ. Each website is permitted an industrial
listing categorized by its content and topic, and a separate
geographical listing based on your regional location. If you are
having no luck getting listed for the correct industry, go ahead and
submit a listing by region.
Still no luck? There is one more thing you can try: the great
volunteer folks from DMOZ have a common hang out place on the web at
http://resource-zone.com/ .
The address is a link to the DMOZ.ORG open directory public forum.
Sign up for the forum, and see if someone there can help you. I've
heard many success stories from individuals who used this as their
last resort, and got better results than with any other method.
LISTING AT YAHOO
-----------------
Each incoming link to your website helps your PageRank (PR). The more
links, the higher your PR. What is PR, and how is it calculated?
PageRank Explained
"PageRank relies on the uniquely democratic nature of the web by using
its vast link structure as an indicator of an individual page's value.
In essence, Google interprets a link from page A to page B as a vote,
by page A, for page B. But, Google looks at more than the sheer volume
of votes, or links a page receives; it also analyses the page that
casts the vote. Votes cast by pages that are themselves "important"
weigh more heavily and help to make other pages "important."
Source: Google Technology at ://www.google.com/technology/
Not only does your PR improve with any additional quality back links,
but each link also gives Google a better chance of finding your pages.
Links to secondary pages can result in those pages getting recognized
by Google much quicker.
If you are paying for a premium listing at Yahoo, I would recommend
withdrawing it and putting your money towards other marketing efforts.
If the listing is costing you nothing, know that it is there, forget
about it, and move on to the next step in your marketing. :)
FINAL NOTES AND ADDITIONAL INFORMATION
--------------------------------------
Back links and search content are not updated at the same time. Back
links are updated a lot less frequently than search content databases.
From the looks of it, backlinks were last updated about a month ago,
and if I am not mistaken that update came four months after the
previous one. No one seems to be able to figure out why, but
considering how heavily the Google system weighs backlinks, everyone
is surprised that they are updated so seldom.
Technicalities aside, you can gain back links by going to forums that
interest you, and starting to contribute. Most forums allow for a
signature file with a back link to your site, as long as you are a
valuable member. Make sure that the forum messages are listed on
Google before you do that, otherwise the links could be doing nothing
for your PR and back link count.
Last but not least, gateway pages are not always bad: a gateway can
hurt you if it is in place only to boost your rankings. If Google
somehow determines that the page is never intended to be seen by a
human, it might decide that the page's sole purpose is to boost
rankings. So, make sure that your gateway pages are relevant to the
average user and not in place just for the benefit of the search
engines. If you are still constructing a gateway page, but already
have a link to it, you should consider preventing Google from
spidering it. This can be achieved via the robots.txt file.
A sample robots.txt file might look something like this:
User-agent: *
Disallow: /temp/newproduct
Disallow: /temp/newservice
The above would keep Google from crawling
http://www.nettica.com/temp/newproduct and
http://www.nettica.com/temp/newservice. Just place the robots.txt file
in the root directory of your website, and all search engines that
honour robots.txt will stay out of the listed directories.
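Before uploading, the rules can be tested locally with Python's standard urllib.robotparser module. This is a small sketch: the helper name allowed is hypothetical, and the parser shows how a well-behaved robot would read the file, which is not a guarantee of how any particular search engine behaves.

```python
from urllib.robotparser import RobotFileParser

# The sample robots.txt from above.
ROBOTS_TXT = """\
User-agent: *
Disallow: /temp/newproduct
Disallow: /temp/newservice
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())


def allowed(url, agent="Googlebot"):
    """Return True if the given robot may fetch the URL under these rules."""
    return parser.can_fetch(agent, url)


for url in ("http://www.nettica.com/",
            "http://www.nettica.com/temp/newproduct",
            "http://www.nettica.com/temp/newservice/page.html"):
    print(url, "allowed" if allowed(url) else "blocked")
```

Note that Disallow lines match by prefix, so a path such as /temp/newproducts would be blocked by the first rule as well.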
For more information on robots.txt...
Website: The Web Robots Pages
URL: http://www.robotstxt.org/wc/robots.html
RESOURCES
---------
All provided information is a compilation based on information
acquired through hundreds of hours of following SEO forums and
articles, testing theories through my own website, and results from
providing search engine optimisation to my clients.
Here are some of my favourite online resources for information on the
subject of SEO, Google, and SEO copy writing:
Website: Webmaster World
URL: http://www.webmasterworld.com
Website: Internet Marketing Research
URL: http://www.internet-marketing-research.net/
Website: Search Engine World
URL: http://www.searchengineworld.com/
I encourage you to ask for a clarification if you find any part of my
answer unclear, or inadequate in any way. I wish you success with your
website.
Regards,
slawek-ga
Request for Answer Clarification by alangrah-ga on 17 Nov 2003 22:35 PST
Ok, a couple of clarifications:
YAHOO
-----
I paid the $299 standard fee for being listed in Yahoo's directory.
As far as I can tell, that's the least you can pay for a business
listing. My main concern here is the fact that the Yahoo URL in the
main directory points to:
http://srd.yahoo.com/S=600409:D1/CS=600409/SS=96388461/*http://www.nettica.com/
instead of:
http://www.nettica.com/
It's certainly not the end of the world, because there are direct
links in Yahoo's other global directories (Australia, for example),
but it seems to me that the URL above would make it practically
impossible for Google to count this as a backlink. Am I right about
this? Considering Google recommends getting listed in Yahoo, I
seriously would expect the main listing to make some sort of
difference.
SPIDERING PAGES BEYOND THE HOMEPAGE
-----------------------------------
This is one of the hardest problems to judge. The spider, on
occasion, will find the .ASPX files. On the last update (just a few
days ago) the pages were crawled and present in the index for
approximately 24 hours before they were summarily removed. I'd love
to know how to make those pages "stick" or why they may have been
removed. I'm amazed my sitemap survived the last round. Perhaps it
was because I had manually added the URL through Google? This is
actually what prompted me to post to this forum.
GATEWAY PAGES
-------------
Maybe I wasn't clear on this one, or we're thinking of two different
things. If you look in the omitted results for nettica you'll see a
bunch of additional pages (www.gramnet.com, www.hackthenet.com, etc)
which are beta sites for our Portal software. (hackthenet.com has
been a great tool for testing security; every wannabe on the net
has hit it.) It is used to ensure our customers will have easy-to-use
and completely secure software. We are intending to help our PageRank
by having a backlink to our site on their homepage (which is
configurable, of course). To Google these sites may look like
duplicate content, but when customers start adding their own rich
content, our ratings should skyrocket. Am I correct in this
assumption?
-----
Thanks for your tip on DMOZ.ORG, I will give it a try. I am also well
acquainted with the robots.txt file, permanent redirections, etc. I
have used a combination of both to minimize the effects of the bad
links. I have also been working pretty hard to optimize the pages
(moving the important content to the top of the HTML, the 'templated'
content to the bottom). If you could give me a 1-10 rating on my
"google friendliness" I would appreciate it. I need to know whether
it's better for me to continue trying to optimize the site, or to just
leave it for a while and let nature take its course. I know that
sometimes there's nothing you can really do but wait. But with only
10-12 chances a year, you have to make the most of each opportunity.
Thanks,
Alan
Clarification of Answer by slawek-ga on 18 Nov 2003 00:15 PST
YAHOO
-----
My website has been listed in Yahoo for close to a year now, and it
has never shown up as a back link on Google.
I did read before that having a paid listing in Yahoo can boost your
PR on Google. I suspect that the reason is simply a matter of the
quality control having been done by a human, very similar to DMOZ.ORG.
Since Yahoo is powered by Google, there are a lot of systems that are
interlaced in one way or another when it comes to those two search
engines.
When you pay for a listing in Yahoo, chances are that Google knows
about it, and respects your site much more. In my opinion, the $299
can be used better elsewhere, where you can purchase more links that
will amount to more than the Yahoo paid listing. No professional I
have talked to has ever paid for a Yahoo inclusion or recommends doing
so. It is a great way to kick-start things if you can't wait a long
time for results. Beyond that, it is really of little benefit, or at
the very least less benefit than spending that money on a "broader
campaign" vs. just Yahoo.
I have checked my listing, and it appears the same way as yours (the
long complex looking address, vs. a direct link). The same goes for
everyone else's listing on the same search result page where my site
appears. This seems to be normal, and has nothing to do with
forwarding, or wrong-doing of any sort. This is probably why after a
year in Yahoo, there is no back link indexed from Google to my site.
Last I saw, Google requested that only top level domain addresses be
added to the directories. Other pages should be picked up
automatically by the spider, and adding them individually can be
perceived as spamming... Based on that information, your manual
addition of the site map to the directory is an unlikely reason for it
"sticking". Your submission of the home page should be the only one.
From there, just make sure that all your links use an HREF or SRC.
Those are the only two types of links that Google can follow (
://www.google.com/bot.html#whatlinks ).
SPIDERING PAGES BEYOND THE HOMEPAGE
-----------------------------------
Being listed and then dropped is not that uncommon. I thought I
included a link about that, but I cannot see it in my answer.
://www.google.com/bot.html has a few question and answer points,
but ://www.google.com/bot.html#notinindex specifically might be of
more interest to you. See below for a quote from the mentioned link:
"The documents will be indexed and entered into the search database
soon after being crawled. Occasionally, documents fetched by Googlebot
will end up not being included in the index, for a variety of reasons
(e.g. they appear to be duplicates of other pages on the web, etc.)"
I would recommend not letting Google spider all of the pages until
their content is a little more diversified and unique between the
domain names. Allow Google to index your main site, and keep it from
the other pages until you have more user contributions, or make them
more unique yourself for now. Taking into account that those pages
show up under "similar pages", and the quick disappearance of your
pages after they get indexed, you are risking being blacklisted for
mirroring the content. Google has obviously already recognized the two
additional domains as very similar...
The pages can result in a PR boost to your main site, but there are a
couple of roadblocks to this being a "golden nugget" solution:
- PR gets diluted. Regardless of the PR of the site linking to you,
there is only so much PR that can be "handed out". If the link to your
page is one of a hundred, don't expect much of a boost in PR even if
the referring page is a PR7 or PR8. The higher the PR of the referring
page, the more can be handed out, but there is always a limit on how
much PR can be passed on.
- The PR gained from a back link should be a side bonus, and not the
main reason for the referring link or page itself. Keep your eye on
value for the visitor, and much of everything else will fall in place.
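The dilution described in the first point can be illustrated with the PageRank formula from the original published paper, PR(A) = (1 - d) + d * (PR(T1)/C(T1) + ... + PR(Tn)/C(Tn)), where C(T) is the number of outbound links on page T and d is a damping factor (0.85 in the paper). The sketch below is a simplification under assumptions: the function name share_per_link is hypothetical, the scores are made-up raw values rather than toolbar PR numbers, and what Google actually runs today is not public.

```python
def share_per_link(page_score, outbound_links, damping=0.85):
    """Score one outbound link passes on, per the published formula."""
    if outbound_links < 1:
        raise ValueError("a page must have at least one outbound link")
    return damping * page_score / outbound_links


# Made-up raw scores, for illustration only.
strong_page_many_links = share_per_link(page_score=100.0, outbound_links=100)
strong_page_few_links = share_per_link(page_score=100.0, outbound_links=10)
modest_page_few_links = share_per_link(page_score=10.0, outbound_links=10)

print("strong page, 100 links:", strong_page_many_links)
print("strong page, 10 links: ", strong_page_few_links)
print("modest page, 10 links: ", modest_page_few_links)
```

In this toy model, being one of a hundred links on a strong page is worth about the same as being one of ten links on a page with a tenth of its score.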
I am concerned that you are thinking too much about PR. Take good care
of the visitors first, and you are taking a higher road which is much
safer, and in the long run much more successful.
Your assumption of getting a nice kick back from the other sites when
"customers start adding their own rich content" is only true if the
content does not include a heap of links to other sites, and the pages
that the link is on achieve a high PR themselves.
RATING OF GOOGLE FRIENDLINESS
-----------------------------
I have looked at your website in 3 different web browsers: Netscape,
Internet Explorer, and Opera. The only web browser under which the
page appears correctly is Opera. Under Netscape and Internet Explorer
layout problems are quite visible. In Netscape there are two main
problems. First, the login prompt at the top is almost unusable: it is
smack in the middle of the page in a column, with the user prompt,
entry field, password prompt, entry field, and the login icon all
stacked for a total of 5 lines, and part of the upper menu is covered
by the Login button. Second, the side menu is under all the body text:
the menu starts only after the body text is over, making for a very
long page with nothing beside the menu. I suspect this is just a table
size problem.
Under Internet Explorer the login prompts are fine (all lined up
nicely across the top), but the menu has the same problem as in
Netscape. Why do I mention all this? Google likes clean code.
Something is wrong with your code, and it could be hurting your
rankings.
Also, the CSS code should be in a separate file to improve the content
to code ratio. You should put your first paragraph in H1 tags, as this
increases the value of the text. Simply modify your H1 tag with CSS,
and no one but Google will know that it is a "heading 1 text". Between
these and a few other little tricks, there is some room for
improvement. Without looking at the actual ASPX code, I would say
your page is a 5.5/10 when it comes to Google friendliness. As you can
see I took you up on "being blunt". :) I realize that after putting
in so much hard work hearing this might hurt, but I am placing your
business ahead of your feelings. I suspect that is the way you wanted
it, otherwise you would not have asked the tough questions that you
did.
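As a rough way to see what moving CSS into a separate file does to the content-to-code ratio, the visible text of a page can be compared with its total size. This is only a sketch under assumptions: the names TextExtractor and content_to_code_ratio are hypothetical, the sample pages are made up, and nobody outside Google knows whether or how such a ratio is measured.

```python
from html.parser import HTMLParser


class TextExtractor(HTMLParser):
    """Collect text outside <style> and <script> blocks."""

    SKIP = {"style", "script"}

    def __init__(self):
        super().__init__()
        self.chunks = []
        self._skip_depth = 0

    def handle_starttag(self, tag, attrs):
        if tag in self.SKIP:
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in self.SKIP and self._skip_depth:
            self._skip_depth -= 1

    def handle_data(self, data):
        if not self._skip_depth:
            self.chunks.append(data)


def content_to_code_ratio(html):
    """Return visible text length divided by total page length."""
    if not html:
        return 0.0
    extractor = TextExtractor()
    extractor.feed(html)
    extractor.close()
    text = " ".join("".join(extractor.chunks).split())
    return len(text) / len(html)


# Made-up pages: same body, CSS inline versus in an external file.
BODY = "<body><h1>Managed DNS hosting</h1><p>Reliable DNS for your domain.</p></body>"
INLINE_CSS = "<html><head><style>" + "p { margin: 0; } " * 40 + "</style></head>" + BODY + "</html>"
EXTERNAL_CSS = '<html><head><link rel="stylesheet" href="site.css"></head>' + BODY + "</html>"

print("inline CSS:  ", round(content_to_code_ratio(INLINE_CSS), 3))
print("external CSS:", round(content_to_code_ratio(EXTERNAL_CSS), 3))
```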
Should you require further information, once again please do not
hesitate to ask for a clarification.
Regards,
slawek-ga