Revenue Quality & Leverage

The coronavirus issue is likely to linger for some time.

Up to 70% of Germany could become infected & some countries like the UK are even considering herd immunity as a strategy:

“I’m an epidemiologist. When I heard about Britain’s ‘herd immunity’ coronavirus plan, I thought it was satire”
– William Hanage

What if their models are broken?

Many companies like WeWork or Oyo have played fast and loose chasing growth, while slower-growing companies have been levering up to fund share buybacks. Airlines spent 96% of their free cash flow on share buybacks & are now seeking a $50 billion bailout package.

There are knock-on effects from Boeing to TripAdvisor to Google all the way down to the travel affiliate blogger, the local restaurant closing its doors & the over-levered bus company going through bankruptcy with bondholders eating a loss on the debt.

Companies are going to let a lot of skeletons out of the closet as literally anything and everything bad gets attributed to coronavirus: layoffs, renegotiating contracts, pausing ad budgets, renegotiating debts, requesting bailouts, etc. The Philippine stock market was recently trading at 2012 levels & then closed indefinitely.

Brad Geddes mentioned advertisers have been aggressively pulling PPC budgets over the past week: “If you have to leave the house to engage in the service, it just seems like it’s not converting right now.”

During the prior recession Google repriced employee options to retain talent.

More time online might mean search engines & social networks capture a greater share of overall ad spend, but if large swaths of the economy do not convert & how people live changes for an extended period of time, it will take time for new categories to create the economic engines that replace the old out-of-favor categories.


As advertisers pause ad budgets, Google will get more aggressive about keeping users on its own site & displacing organic click flows with additional ad clicks for the remaining advertisers.

When Google or Facebook see a 5% or 10% pullback other industry players might see a 30% to 50% decline as the industry pulls back broadly, focuses more resources on the core, and the big attention merchants offset their losses by clamping down on other players.

At its peak TripAdvisor was valued at about $14 billion & it is now valued at about $2 billion.

TripAdvisor announced layoffs. As did Expedia. As did Booking.com. As did many hotels. And airlines. etc. etc. etc.

I am not suggesting people should be fearful or dominated by negative emotions. Rather one should live as though many others will be living that way.

In times of elevated uncertainty it is best in business not to be led by emotions unless they are positive ones. Spend a bit more time playing if you can afford to & work more on things you love.

Right now we might be living through the flu pandemic of 1918 and the Great Depression of 1929 while having constant access to social media updates. And that’s awful.

Consume less but deeper. Less Twitter, less news, fewer big decisions, read more books.

It is better to be more pragmatic & logic-based in determining opportunity cost & the best strategy to use than to be led by extreme fear.

  • If you have sustainable high-margin revenue, treasure it.
  • If you have low-margin revenue, it might quickly turn into negative-margin revenue unless something changes.
  • If you have sustainable low-margin revenue which has under-performed less stable high-margin revenue, it might be worth putting a bit more effort into those sorts of projects as they are more likely to endure.

On a positive note, we might soon get a huge wave of innovation:

“Take the Great Depression. Economist Alexander Field writes that “the years 1929–1941 were, in the aggregate, the most technologically progressive of any comparable period in U.S. economic history.” Productivity growth was twice as fast in the 1930s as it was in the decade prior. The 1920s were the era of leisure because people could afford to relax. The 1930s were the era of frantic problem solving because people had no other choice. The Great Depression brought unimaginable financial pain. It also brought us supermarkets, microwaves, sunscreen, jets, rockets, electron microscopes, magnetic recording, nylon, photocopying, teflon, helicopters, color TV, plexiglass, commercial aviation, most forms of plastic, synthetic rubber, laundromats, and countless other discoveries.”

The prior recession led to trends like Groupon. The McJobs recovery led to services like Uber & DoorDash. Food delivery has been trending south recently, though perhaps the stay-at-home economy will give it a boost.

I have been amazed at how fast affiliates moved with pushing N95 face masks online over the past couple of months. Seeing how fast that stuff spun up really increases the perceived value of any sustainable high-margin business.

Amazon.com is hiring another 100,000 warehouse workers as people shop from home. Amazon banned new face masks and hand sanitizer listings. One guy had to donate around 18,000 cleaning products he couldn’t sell.

I could see online education becoming far more popular as people aim to retrain while stuck at home.

What sorts of new industries will current & new technologies lead to as more people spend time working from home?

from SEO Book http://www.seobook.com/revenue-quality

Subscription Fatigue

Subscription Management

I have active subscriptions with about a half-dozen different news & finance sites along with about a half-dozen software tools. Sometimes using a VPN or web proxy across different web browsers makes logging in to all of them & clearing cookies for some paywall sites a real pain.

If you don’t subscribe to any outlets then subscribing to an aggregator like Apple News+ can make a lot of sense, but it is very easy to end up with dozens of forgotten subscriptions.

Winner-take-most Market Stratification

The news business is coming to resemble other tech-enabled businesses where a winner takes most. The New York Times stock, for instance, is trading at 15-year highs & they recently announced they are raising subscription prices:

The New York Times is raising the price of its digital subscription for the first time, from $15 every four weeks to $17 — from about $195 to $221 a year.

With a Trump re-election all but assured after the Russia, Russia, Russia garbage, the party-line impeachment (less private equity plunderer Mitt Romney) & the ridiculous Iowa primary, many NYT readers will pledge their #NeverTrumpTwice dollars with the New York Times.

If you think politics looks ridiculous today, wait until you see some of the China-related ads in half a year as the novel coronavirus spreads around the world.

Outside of a few core winners, the news business online has been so brutal that even Warren Buffett is now a seller. As the economics get uglier news sites get more extreme with ad placements, user data sales, and pushing subscriptions. Some of these aggressive monetization efforts make otherwise respectable news outlets look like part of a very downmarket subset of the web.

Users Fight Back

Users have thus taken to blocking ads & are also starting to ramp up blocking paywall notifications.

Each additional layer of technological complexity is another cost center publishers have to fund, often through making the user experience of their sites worse, which in turn makes their own sites less differentiated & inferior to the copies they have left across the web (via AMP, via Facebook Instant Articles, syndication in Apple News or on various portal sites like MSN or Yahoo!).

A Web Browser For Every Season

Google Chrome is spyware, so I won’t recommend installing that.

Here is Google’s official guide on how to remove the spyware.

The easiest & most basic solution, which works across many sites using metered paywalls, is to have multiple web browsers installed on your computer. Keep a couple of browsers exclusively for reading news articles that won’t show up in your main browser & set those browsers to delete cookies on close. Or open the browsers in private mode and search Google for the URL of the page to see if that allows access.

  • If you like Firefox there are other iterations from other players like Pale Moon, Comodo IceDragon or Waterfox using their core.
  • If you like Google Chrome then Chromium is the parallel version of it without the spyware baked in. The Chromium project is also the underlying source used to build about a dozen other web browsers including: Opera, Vivaldi, Brave, Cliqz, Blisk, Comodo Dragon, SRWare Iron, Yandex Browser & many others. Even Microsoft recently switched their Edge browser to being powered by the Chromium project. Browsers based on Chromium allow you to install extensions from the Chrome web store.
  • Some web browsers monetize users by setting affiliate links on the home screen and/or by selling the default search engine recommendation. You can change those once and they’ll typically stick with whatever settings you use.
  • For the browsers I use for regular day-to-day web use, I set them up to restore the session on restart, and I have a session manager plugin like this one for Firefox or this one for Chromium-based browsers. For browsers used exclusively for reading paywall-blocked articles, I set them up to clear cookies on restart.
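As a sketch of the throwaway-browser idea above, the snippet below builds the command line for opening a page in a disposable session: a fresh temporary profile plus private/incognito mode means cookies & metered-paywall counters vanish when the window closes. The flag names are the standard Firefox & Chromium ones, but verify them against the browser version you actually run.

```python
import subprocess
import tempfile

def private_browse_command(url, browser="chromium"):
    """Build a command that opens `url` in a throwaway session:
    a fresh temporary profile plus private/incognito mode means
    cookies & metered-paywall counters vanish on close."""
    profile = tempfile.mkdtemp(prefix="throwaway-profile-")
    if browser == "firefox":
        return ["firefox", "--profile", profile, "--private-window", url]
    # Chromium-based browsers (Chromium, Brave, Vivaldi, ...) share these flags.
    return [browser, f"--user-data-dir={profile}", "--incognito", url]

cmd = private_browse_command("https://example.com/article")
print(cmd)
# subprocess.run(cmd)  # uncomment to actually launch the browser
```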

Bypassing Paywalls

There are a couple solid web browser plugins built specifically for bypassing paywalls.

Academic Journals

Unpaywall is an open database of around 25,000,000 free scholarly articles. They provide extensions for Firefox and Chromium based web browsers on their website.

News Articles

There is also one for news publications called Bypass Paywalls.

  • Mozilla Firefox: To install the Firefox version go here.
  • Chrome-like web browsers: To install the Chrome version of the extension in Opera or Chromium or Microsoft Edge you can download the extension here, enter developer mode inside the extensions area of your web browser & install extension. To turn developer mode on, open up the drop down menu for the browser, click on extensions to go to the extension management area, and then slide the “Developer mode” button to the right so it is blue.

Regional Blocking

If you travel internationally some websites like YouTube or Twitter or news sites will have portions of their content restricted to only showing in some geographic regions. This can be especially true for new sports content and some music.

These can be bypassed by using a VPN service like NordVPN, ExpressVPN, Witopia or IPVanish. Some VPN providers also sell pre-configured routers. If you buy a pre-configured router you can use an ethernet switch or wifi to switch back and forth between the regular router and the VPN router.

You can also buy web proxies & enter them into the Foxy Proxy web browser extension (Firefox or Chromium-compatible) with different browsers set to default to different country locations, making it easier to see what the search results show in different countries & cities quickly.

If you use a variety of web proxies you can configure some of them to work automatically in an open source rank tracking tool like Serposcope.
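To illustrate the per-country proxy idea, here is a minimal Python sketch that routes requests through a country-specific proxy using only the standard library. The proxy hostnames below are placeholders; substitute the endpoints your proxy provider gives you.

```python
import urllib.request

# Hypothetical proxy endpoints -- replace with the host:port pairs
# supplied by your own proxy provider.
COUNTRY_PROXIES = {
    "us": "http://us.proxy.example.com:8080",
    "uk": "http://uk.proxy.example.com:8080",
}

def opener_for_country(country):
    """Return a urllib opener that routes HTTP & HTTPS traffic through
    the proxy for the given country, so the same query can be checked
    from several geographic locations."""
    proxy = COUNTRY_PROXIES[country]
    handler = urllib.request.ProxyHandler({"http": proxy, "https": proxy})
    return urllib.request.build_opener(handler)

opener = opener_for_country("uk")
# opener.open("https://www.google.com/search?q=hotels")  # network call
```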

The Future of Journalism

I think the future of news is going to be a lot more sites like Ben Thompson’s Stratechery or Jessica Lessin’s TheInformation & far fewer broad/horizontal news organizations. A friend of mine named Terry Godier launched a conversion-oriented email newsletter named Conversion Gold which has done quite well right out of the gate, leading him to launch IndieMailer, a community for paid newsletter creators.

The model which seems to be working well for those sorts of news sites is…

  • stick to a tight topic range
  • publish regularly at a somewhat decent frequency like daily or weekly, though with a strong preference for quality & originality over quantity
  • have a single author or a small core team which does most of the writing and expand editorial hiring slowly
  • offer original insights & much more depth of coverage than you would typically find in the mainstream news
  • rely on WordPress or a low-cost CMS & billing technology partner like Substack, Memberful, or, with a bit more technical chops, install aMember on your own server. One of the biggest mistakes I made when I opened up a membership site about a decade back was hand rolling custom code for membership management. At one point we shut down the membership site for a while in order to rip out all that custom code & replace it with aMember.
  • accept user comments on pieces or integrate a user forum using something like Discourse on a subdomain or a custom Slack channel. Highlight or feature the best comments. Update readers on new features via email.
  • invest much more into obtaining unique data & sources to deliver new insights, without spending aggressively to syndicate onto other platforms using graphical content layouts which would require significant design, maintenance & updating expenses
  • heavily differentiate your perspective from other sources
  • maintain a low technological maintenance overhead
  • charge a low-cost monthly subscription with a solid discount for annual pre-payment
  • instead of using a metered paywall, set some content to require payment to read & periodically publish full-feature free content (perhaps weekly) to keep up awareness of the offering in the broader public & help offset churn

Some also work across multiple formats with complementary offerings. The Ringer has done well with podcasts & Stratechery also has the Exponent podcast.

There are a number of other successful online-only news subscription sites like TheAthletic & Bill Bishop’s Sinocism newsletter about China, but I haven’t subscribed to them yet. Many people support a wide range of projects on platforms like Patreon.

from SEO Book http://www.seobook.com/bypass-paywall

Favicon SEO

Google recently copied their mobile result layout over to desktop search results. The three big pieces which changed as part of that update were:

  • URLs: In many cases Google will now show breadcrumbs in the search results rather than showing the full URL. The layout no longer differentiates between HTTP and HTTPS. And the URLs shifted from an easily visible green color to a much easier to miss black.
  • Favicons: All listings now show a favicon next to them.
  • Ad labeling: the ad label sits in the same spot as favicons do for organic search results, but the labels are a black which sort of blends into the URL line. Over time expect the black ad label to become a lighter color, paralleling how Google made ad background colors lighter over time.

One could expect this change to boost the CTR on ads while lowering the CTR on organic search results, at least up until users get used to seeing favicons and not thinking of them as being ads.

The Verge panned the SERP layout update. Some folks on Reddit hate this new layout as it is visually distracting, the contrast on the URLs is worse, and many people think the organic results are ads.

I suspect a lot of phishing sites will use subdomains patterned off the brand they are arbitraging coupled with bogus favicons to try to look authentic. I wouldn’t reconstruct an existing site’s structure based on the current search result layout, but if I were building a brand new site I might prefer to put it at the root instead of on www so the words were that much closer to the logo.

Google provides the following guidelines for favicons:

  • Both the favicon file and the home page must be crawlable by Google (that is, they cannot be blocked to Google).
  • Your favicon should be a visual representation of your website’s brand, in order to help users quickly identify your site when they scan through search results.
  • Your favicon should be a multiple of 48px square, for example: 48x48px, 96x96px, 144x144px and so on. SVG files, of course, do not have a specific size. Any valid favicon format is supported. Google will rescale your image to 16x16px for use in search results, so make sure that it looks good at that resolution. Note: do not provide a 16x16px favicon.
  • The favicon URL should be stable (don’t change the URL frequently).
  • Google will not show any favicon that it deems inappropriate, including pornography or hate symbols (for example, swastikas). If this type of imagery is discovered within a favicon, Google will replace it with a default icon.

In addition to the above, I thought it would make sense to provide a few other tips for optimizing favicons.

  • Keep your favicons consistent across sections of your site if you are trying to offer a consistent brand perception.
  • In general, less is more. 16×16 is a tiny space, so if you try to convey a lot of information inside of it, you’ll likely end up creating a blob that almost nobody but you recognizes.
  • It can make sense to include the first letter from a site’s name or a simplified logo widget as the favicon, but it is hard to include both in a single favicon without it looking overdone & cluttered.
  • A colored favicon on a white background generally looks better than a white icon on a colored background, as having a colored background means you are eating into some of the scarce pixel space for a border.
  • Using a square shape versus a circle gives you more surface area to work with.
  • Even if your logo has italics on it, it might make sense to avoid using italics in the favicon to make the letter look cleaner.

Here are a few favicons I like & why I like them:

  • Citigroup – manages to get the word Citi in there while looking memorable & distinctive without looking overly cluttered
  • Nerdwallet – the N makes a great use of space, the colors are sharp, and it almost feels like an arrow that is pointing right
  • Inc – the bold I with a period is strong.
  • LinkedIn – very memorable using a small part of the word from their logo & good color usage.

Some of the other memorable ones that I like include: Twitter, Amazon, eBay, Paypal, Google Play & CNBC.

Here are a few favicons I dislike & why

  • Wikipedia – the W is hard to read.
  • USAA – they included both the logo widget and the 4 letters in a tiny space.
  • Yahoo! – they used inconsistent favicons across their sites & used italics on them. Some of the favicons have the whole word Yahoo in them while others are the Y! in italics.

If you do not have a favicon, Google will show a dull globe next to your listing. Real Favicon Generator is a good tool for creating favicons in various sizes.
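If you want to sanity-check a PNG favicon against Google’s square, multiple-of-48px guideline, a short standard-library script can read the dimensions straight from the file header. This is an illustrative sketch (PNG only); a generator tool handles the full range of formats.

```python
import struct

def png_size(data: bytes):
    """Read width & height from a PNG's IHDR chunk (bytes 16-24)."""
    if data[:8] != b"\x89PNG\r\n\x1a\n":
        raise ValueError("not a PNG file")
    width, height = struct.unpack(">II", data[16:24])
    return width, height

def valid_favicon_size(data: bytes) -> bool:
    """True when the image is square and a non-zero multiple of 48px,
    per Google's favicon guideline."""
    width, height = png_size(data)
    return width == height and width > 0 and width % 48 == 0
```

Usage: `valid_favicon_size(open("favicon.png", "rb").read())` returns True for a 96x96 or 144x144 icon and False for, say, a 100x100 one.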

What favicons do you really like? Which big sites do you see that are doing it wrong?

from SEO Book http://www.seobook.com/favicon-seo

Brands vs Ads

About 7 years ago I wrote about how the search relevancy algorithms were placing heavy weighting on brand-related signals after Vince & Panda on the (half correct!) presumption that this would lead to excessive industry consolidation which in turn would force Google to turn the dials in the other direction.

My thesis was Google would need to increasingly promote some smaller niche sites to make general web search differentiated from other web channels & minimize the market power of vertical leading providers.

The reason my thesis was only half correct (and ultimately led to the absolutely wrong conclusion) is Google has the ability to provide the illusion of diversity while using sort of eye candy displacement efforts to shift an increasing share of searches from organic to paid results.

As long as any market has at least 2 competitors in it Google can create a “me too” offering that they hard code front & center and force the other 2 players (along with other players along the value chain) to bid for marketshare. If competitors are likely to complain about the thinness of the me too offering & it being built upon scraping other websites, Google can buy out a brand like Zagat or a data supplier like ITA Software to undermine criticism until the artificially promoted vertical service has enough usage that it is nearly on par with other players in the ecosystem.

Google need not win every market. They only need to ensure there are at least 2 competing bids left in the marketplace while dialing back SEO exposure. They can then run other services to redirect user flow and force the ad buy. They can insert their own bid as a sort of shill floor bid in their auction. If you bid below that amount they’ll collect the profit through serving the customer directly, if you bid above that they’ll let you buy the customer vs doing a direct booking.

Where this gets more than a bit tricky is if you are a supplier of third party goods & services where you buy in bulk to get preferential pricing for resale. If you buy 100 rooms a night from a particular hotel based on the presumption of prior market performance & certain channels effectively disappear you have to bid above market to sell some portion of the rooms because getting anything for them is better than leaving them unsold.

Dipping a bit back into history here, but after Groupon said no to Google’s acquisition offer Google promptly partnered with players 2 through n to ensure Groupon did not have a lasting competitive advantage. In the fullness of time most of those companies died, LivingSocial was acquired by Groupon for nothing & Groupon is today worth less than the amount they raised in VC & IPO funding.

Most large markets will ultimately consolidate down to a couple players (e.g. Booking vs Expedia) while smaller players lack the scale needed to have the economic leverage to pay Google’s increasing rents.

This sort of consolidation was happening even when the search results were mostly organic & relevancy was driven primarily by links. As Google has folded in usage data & increased ad load on the search results it becomes harder for a generically descriptive domain name to build brand-related signals.

It is not only generically descriptive sorts of sites that have faded though. Many brand investments turned out to be money losers after the search result set was displaced by more ads (& many brand-related search result pages also carry ads above the organic results).

The ill informed might write something like this:

Since the Motorola debacle, it was Google’s largest acquisition after the $676 million purchase of ITA Software, which became Google Flights. (Uh, remember that? Does anyone use that instead of Travelocity or one of the many others? Neither do I.)

The reality is brands lose value as the organic result set is displaced. To make the margins work they might desperately outsource just about everything but marketing to a competitor / partner, which will then later acquire them for a song.

Travelocity had roughly 3,000 people on the payroll globally as recently as a couple of years ago, but the Travelocity workforce has been whittled to around 50 employees in North America with many based in the Dallas area.

The best relevancy algorithm in the world is trumped by preferential placement of inferior results which bypasses the algorithm. If inferior results are hard coded in placements which violate net neutrality for an extended period of time, they can starve other players in the market from the vital user data & revenues needed to reinvest into growth and differentiation.

Value plays see their stocks crash as growth slows or goes in reverse. With the exception of startups funded by SoftBank, growth plays are locked out of receiving further investment rounds as their growth rate slides.

Startups like Hipmunk disappear. Even an Orbitz or Travelocity become bolt on acquisitions.

The viability of TripAdvisor as a stand alone business becomes questioned, leading them to partner with Ctrip.

TripAdvisor has one of the best link profiles of any commercially oriented website outside of perhaps Amazon.com. But ranking #1 doesn’t count for much if that #1 ranking is below the fold.

TripAdvisor shifted their business model to allow direct booking to better monetize mobile web users, but as Google has eaten screen real estate and grown Google Travel into a $100 billion business, other players have seen their stocks sag.

Google sits at the top of the funnel & all other parts of the value chain are complements to be commoditized.

  • Buy premium domain names? Google’s SERPs test replacing domain names with words & make the domain name gray.
  • Improve conversion rates? Your competitor almost certainly did as well, now you both can bid more & hand over an increasing economic rent to Google.
  • Invest in brand awareness? Google shows ads for competitors on your brand terms, forcing you to buy to protect the brand equity you paid to build.

Search Metrics mentioned Hotels.com was one of the biggest losers during the recent algorithm updates: “I’m going to keep on this same theme there, and I’m not going to say overall numbers, the biggest loser, but for my loser I’m going to pick Hotels.com, because they were literally like neck and neck, like one and two with Booking, as far as how close together they were, and the last four weeks, they’ve really increased that separation. … I’m going to give a winner. The fire department that’s fighting the fires in Northern California.”

As Google ate the travel category the value of hotel-related domain names has fallen through the floor.

Most of the top selling hotel-related domain names were sold about a decade ago:

On August 8th HongKongHotels.com sold for $4,038. And the buyer may have overpaid for it!

Google consistently grows their ad revenues 20% a year in a global economy growing at under 4%.

There are only about 6 ways they can do that:

  • growth of web usage (though many of those who are getting online today have a far lower disposable income than those who got on a decade or two ago did)
  • gain marketshare (very hard in search given that they effectively are the market in most markets outside of China & Russia)
  • create new inventory (new ad types on Google Maps & YouTube)
  • charge more for clicks
  • improve at targeting by better surveillance of web users (getting harder after GDPR & similar efforts from some states in the next year or two)
  • shift click streams away from organic toward paid channels (through larger ads, more interactive ad units, less appealing organic result formatting, etc.)
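A quick compounding check (using the 20% and sub-4% figures above) shows why that last lever keeps getting pulled: revenue growing 20% a year doubles roughly four times faster than the economy it sells into.

```python
import math

google_growth = 0.20   # annual ad revenue growth (figure from the text)
economy_growth = 0.04  # global economic growth (upper bound from the text)

# Years for a quantity to double at rate r: ln(2) / ln(1 + r)
double_google = math.log(2) / math.log(1 + google_growth)
double_economy = math.log(2) / math.log(1 + economy_growth)

print(f"Google doubles in ~{double_google:.1f} years")    # ~3.8 years
print(f"The economy doubles in ~{double_economy:.1f} years")  # ~17.7 years
```

That gap cannot persist forever inside a single category, which is the pressure driving the layout changes described below.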

Wednesday both Expedia and TripAdvisor reported earnings after hours & both fell off a cliff: “Both Okerstrom and Kaufer complained that their organic, or free, links are ending up further down the page in Google search results as Google prioritizes its own travel businesses.”

Losing 20% to 25% of your market cap in a single day is an extreme move for a company worth billions of dollars.

Thursday Google hit fresh all time highs.

“Google’s old motto was ‘Don’t Be Evil’, but you can’t be this big and profitable and not be evil. Evil and all-time highs pretty much go hand in hand.” – Howard Lindzon

Booking held up much better than TripAdvisor & Expedia as they have a bigger footprint in Europe (where antitrust is a thing) and they have a higher reliance on paid search versus organic.

The broader SEO industry is to some degree frozen by fear. Roughly half of SEOs claim to have not bought *ANY* links in a half-decade.

Long after most of the industry has stopped buying links, some people still run the “paid links are a potential FTC guideline violation” line as though it is insightful and/or useful.

Ask the people carrying Google’s water what they think of the official FTC guidance on poor ad labeling in search results and you will hear the beautiful sound of crickets chirping.

Where is the ad labeling in this unit?

Does small gray text in the upper right corner stating “about these results” count as legitimate ad labeling?

And then when you scroll over that gray text and click on it you get “Some of these hotel search results may be personalized based on your browsing activity and recent searches on Google, as well as travel confirmations sent to your Gmail. Hotel prices come from Google’s partners.”

Zooming out a bit further on the above ad unit to look at the entire search result page, we can now see the following:

  • 4 text ad units above the map
  • huge map which segments demand by price tier, current sales, luxury, average review, geographic location
  • organic results below the above wall of ads, and the number of organic search results has been reduced from 10 to 7

How many scrolls does one need to do to get past the above wall of ads?

If one clicks on one of the hotel prices the follow up page is … more ads.

Check out how the ad label is visually overwhelmed by a bright blue pop over.

Worth noting Google Chrome has a built-in ad blocking feature which allows them to strip all ads from displaying on third party websites if they follow Google’s best practices layout used in the search results.

You won’t see ads on websites that have poor ad experiences, like:

  • Too many ads
  • Annoying ads with flashing graphics or autoplaying audio
  • Ad walls before you can see content

When these ads are blocked, you’ll see an “Intrusive ads blocked” message. Intrusive ads will be removed from the page.

Hotels have been at the forefront of SEO for many years. They drive massive revenues & were perhaps the only vertical ever referenced in the Google rater guidelines which stated all affiliate sites should be labeled as spam even if they are helpful to users.

Google has won most of the profits in the travel market & so they’ll need to eat other markets to continue their 20% annual growth.

Some people who market themselves as SEO experts not only recognize this trend but even encourage this sort of behavior:

Zoopla, Rightmove and On The Market are all dominant players in the industry, and many of their house and apartment listings are duplicated across the different property portals. This represents a very real reason for Google to step in and create a more streamlined service that will help users make a more informed decision. … The launch of Google Jobs should not have come as a surprise to anyone, and neither should its potential foray into real estate. Google will want to diversify its revenue channels as much as possible, and any market that allows it to do so will be in its sights. It is no longer a matter of if they succeed, but when.

We are nearing many inflection points in many markets where markets that seemed somewhat disconnected by search will still end up being dominated by search. Google is investing heavily in quantum computing. Google Fiber was a nothingburger to force competing ISPs into accelerating expensive network upgrades, but beaming in internet services from satellites will allow Google to bypass local politics, local regulations & heavy network infrastructure construction costs. A startup named Kepler recently provided high-bandwidth connectivity to the Arctic. When Google launches a free ISP there will be many knock-on effects.

from SEO Book http://www.seobook.com/brands-vs-ads

Internet Wayback Machine Adds Historical TextDiff

The Wayback Machine has a cool new feature for looking at the historical changes of a web page.

The color scale shows how much a page has changed since it was last cached.

You can then select between any two documents to see a side-by-side comparison of the documents.

That quickly gives you an at-a-glance view of how they’ve changed their:

  • web design
  • on-page SEO strategy
  • marketing copy & sales strategy
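A similar side-by-side view can be reproduced locally with Python’s difflib, for instance to diff two saved snapshots of a page. The snapshots below are made-up examples, not real cached data.

```python
import difflib

# Two hypothetical snapshots of the same page from different crawl dates.
snapshot_2018 = """<title>Acme Widgets - Best Widgets</title>
<h1>Welcome to Acme</h1>
<p>Free shipping on all orders.</p>""".splitlines()

snapshot_2020 = """<title>Acme Widgets | Buy Widgets Online</title>
<h1>Welcome to Acme</h1>
<p>Holiday sale: 20% off sitewide.</p>""".splitlines()

diff = list(difflib.unified_diff(snapshot_2018, snapshot_2020,
                                 fromfile="2018 snapshot",
                                 tofile="2020 snapshot", lineterm=""))
print("\n".join(diff))
```

Changed title tags and sales copy show up as paired `-`/`+` lines, while unchanged markup appears as context.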

For sites that conduct seasonal sales & rely heavily on holiday themed ads you can also look up the new & historical ad copy used by large advertisers using tools like Moat, WhatRunsWhere & Adbeat.

from SEO Book http://www.seobook.com/internet-wayback-machine-adds-historical-textdiff

Dofollow, Nofollow, Sponsored, UGC

A Change to Nofollow

Last month Google announced they were going to change how they treated nofollow, moving it from a directive to a hint. As part of that they also announced the release of parallel attributes rel="sponsored" for sponsored links & rel="ugc" for user-generated content in areas like forums & blog comments.
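In markup terms the new attributes sit alongside nofollow on ordinary anchor tags. A sketch of how each might be used, per Google's announcement (the URLs are placeholders):

```html
<!-- paid or affiliate placement -->
<a href="https://example.com/offer" rel="sponsored">Partner offer</a>

<!-- link dropped into a blog comment or forum post -->
<a href="https://example.com/site" rel="ugc">Commenter's site</a>

<!-- link you don't want treated as an endorsement -->
<a href="https://example.com/page" rel="nofollow">Unvetted link</a>

<!-- the attributes can also be combined -->
<a href="https://example.com/ad" rel="sponsored nofollow">Ad link</a>
```

All three now act as hints rather than directives, so Google may still choose to count such links.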

Why not completely ignore such links, as had been the case with nofollow? Links contain valuable information that can help us improve search, such as how the words within links describe content they point at. Looking at all the links we encounter can also help us better understand unnatural linking patterns. By shifting to a hint model, we no longer lose this important information, while still allowing site owners to indicate that some links shouldn’t be given the weight of a first-party endorsement.

In many emerging markets the mobile web is effectively the entire web. Few people create HTML links on the mobile web outside of social networks, where links are typically nofollow by default. That shrinks the pool of link signals available, pushing Google toward either tracking what people do directly and/or shifting how the nofollow attribute is treated.

Google shifting how nofollow is treated is a blanket admission that Penguin & other elements of “the war on links” were perhaps a bit too effective and have started to take valuable signals away from Google.

Google has suggested the shift in how nofollow is treated will not lead to any additional blog comment spam. When they announced nofollow they suggested it would lower blog comment spam. Blog comment spam remains a growth market long after the gravity of the web has shifted away from blogs onto social networks.

Changing how nofollow is treated only makes any sort of external link analysis that much harder. Those who specialize in link audits (yuck!) have historically ignored nofollow links, but now that is one more set of things to look through. And the good news for professional link auditors is that this increases the effective rates they can charge clients for the service.

Some nefarious types will notice when competitors get penalized & then fire up XRumer to help promote the penalized site, ensuring that the link auditor bankrupts the competing business even faster than Google.

Links, Engagement, or Something Else…

When Google was launched they didn’t own Chrome or Android. They were not yet pervasively spying on billions of people:

If, like most people, you thought Google stopped tracking your location once you turned off Location History in your account settings, you were wrong. According to an AP investigation published Monday, even if you disable Location History, the search giant still tracks you every time you open Google Maps, get certain automatic weather updates, or search for things in your browser.

Thus Google had to rely on external signals as their primary ranking factor:

The reason that PageRank is interesting is that there are many cases where simple citation counting does not correspond to our common sense notion of importance. For example, if a web page has a link on the Yahoo home page, it may be just one link but it is a very important one. This page should be ranked higher than many pages with more links but from obscure places. PageRank is an attempt to see how good an approximation to “importance” can be obtained just from the link structure. … The definition of PageRank above has another intuitive basis in random walks on graphs. The simplified version corresponds to the standing probability distribution of a random walk on the graph of the Web. Intuitively, this can be thought of as modeling the behavior of a “random surfer”.
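The random-surfer model described above fits in a few lines of Python: with probability d the surfer follows a random outlink, otherwise they jump to a random page, and the standing distribution of that walk is the score. This is an illustrative toy, not Google's implementation:

```python
def pagerank(links, damping=0.85, iterations=50):
    """links: dict mapping each page to the list of pages it links to."""
    pages = list(links)
    n = len(pages)
    rank = {p: 1.0 / n for p in pages}
    for _ in range(iterations):
        # every page keeps the "random jump" share of rank
        new = {p: (1.0 - damping) / n for p in pages}
        for page, outlinks in links.items():
            if not outlinks:
                # dangling page: spread its rank evenly across the graph
                for p in pages:
                    new[p] += damping * rank[page] / n
            else:
                share = damping * rank[page] / len(outlinks)
                for target in outlinks:
                    new[target] += share
        rank = new
    return rank

# The "hub" accumulates rank from all three pages & each page in turn
# inherits a third of the hub's rank, so importance flows through links
# rather than being a raw citation count.
ranks = pagerank({
    "hub": ["a", "b", "c"],
    "a": ["hub"],
    "b": ["hub"],
    "c": ["hub"],
})
```

Turning each page's score into the currency of rankings is exactly what made links worth buying, which is where the trouble started.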

Google’s reliance on links turned links into a commodity, which led to all sorts of fearmongering, manual penalties, nofollow and the Penguin update.

As Google collected more usage data those who overly focused on links often ended up scoring an own goal, creating sites which would not rank.

Google no longer invests heavily in fearmongering because it is no longer needed. Search is so complex most people can’t figure it out.

Many SEOs have reduced their link building efforts as Google dialed up weighting on user engagement metrics, though it appears the tide may now be heading in the other direction. Some sites which had decent engagement metrics but little in the way of link building slid on the update late last month.

As much as Google desires relevancy in the short term, they also prefer a system complex enough that reverse engineering feels impossible to external onlookers. If they discourage investment in SEO they increase AdWords growth while gaining greater control over algorithmic relevancy.

Google will soon collect even more usage data by routing Chrome users through their DNS service: “Google isn’t actually forcing Chrome users to only use Google’s DNS service, and so it is not centralizing the data. Google is instead configuring Chrome to use DoH connections by default if a user’s DNS service supports it.”

If traffic is routed through Google that is akin to them hosting the page in terms of being able to track many aspects of user behavior. It is akin to AMP or YouTube in terms of being able to track users and normalize relative engagement metrics.

Once Google is hosting the end-to-end user experience they can create a near infinite number of ranking signals given their advancement in computing power: “We developed a new 54-qubit processor, named “Sycamore”, that is comprised of fast, high-fidelity quantum logic gates, in order to perform the benchmark testing. Our machine performed the target computation in 200 seconds, and from measurements in our experiment we determined that it would take the world’s fastest supercomputer 10,000 years to produce a similar output.”

Relying on “one simple trick to…” sorts of approaches is frequently going to come up empty.

EMDs Kicked Once Again

I was one of the early promoters of exact match domains when the broader industry did not believe in them. I was also quick to mention when I felt the algorithms had moved in the other direction.

Google’s mobile layout, which they are now testing on desktop computers as well, replaces green domain names with gray words which are easy to miss. And the favicon icons sort of make the organic results look like ads. Any boost a domain name like CreditCards.ext might have garnered in the past due to matching the keyword has certainly gone away with this new layout that further depreciates the impact of exact-match domain names.

At one point in time CreditCards.com was viewed as a consumer destination. It is now viewed … below the fold.

If you have a memorable brand-oriented domain name the favicon can help offset the above impact somewhat, but matching keywords is becoming a much more precarious approach to sustaining rankings as the weight on brand awareness, user engagement & authority increase relative to the weight on anchor text.

from SEO Book http://www.seobook.com/dofollow-nofollow-sponsored-ugc

New Keyword Tool

Our keyword tool is updated periodically. We recently updated it once more.

For comparison’s sake, the old keyword tool looked like this:

Whereas the new keyword tool looks like this:

The upsides of the new keyword tool are:

  • fresher data from this year
  • more granular data on ad bids vs click prices
  • lists ad clickthrough rate
  • more granular estimates of Google AdWords advertiser ad bids
  • more emphasis on commercial oriented keywords

With the new columns of [ad spend] and [traffic value] here is how we estimate those.

  • paid search ad spend: search ad clicks * CPC
  • organic search traffic value: ad impressions * 0.5 * (100% – ad CTR) * CPC

The first of those two is rather self-explanatory. The second is a bit more complex. It starts with the assumption that about half of all searches do not get any clicks, then it subtracts the paid clicks from the total remaining pool of clicks & multiplies that by the cost per click.
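In code form, with hypothetical inputs (the function names & sample numbers are mine, not the tool's):

```python
def paid_search_ad_spend(ad_clicks, cpc):
    """Estimated ad spend for a keyword: paid clicks * price per click."""
    return ad_clicks * cpc

def organic_traffic_value(ad_impressions, ad_ctr, cpc):
    """Estimated value of a keyword's organic clicks.

    Roughly half of searches produce no click at all; of the clicks that
    remain, the paid share (ad CTR) is removed & the organic remainder
    is valued at the going cost per click.
    """
    return ad_impressions * 0.5 * (1.0 - ad_ctr) * cpc

# e.g. 10,000 monthly ad impressions, a 10% ad CTR & a $2.50 CPC
spend = paid_search_ad_spend(10_000 * 0.10, 2.50)
value = organic_traffic_value(10_000, 0.10, 2.50)
```

With those sample inputs the keyword is worth roughly $2,500 in paid clicks & about $11,250 in organic clicks per month.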

The new data also has some drawbacks:

  • Rather than listing search counts specifically it lists relative ranges like low, very high, etc.
  • Since it tends to tilt more toward keywords with ad impressions, it may not have coverage for some longer tail informational keywords.

For any keyword where there is insufficient coverage we re-query the old keyword database for data & merge it across. You will know the data came from the new database if the first column says something like low or high, & that it came from the older database if there are specific search counts in the first column.
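That fallback merge is simple to sketch. A hedged illustration, assuming each database is keyed by keyword (the record shapes here are my assumption, not the tool's actual schema):

```python
def merge_keyword_data(new_db, old_db, keywords):
    """Prefer rows from the new database; fall back to the old one.

    new_db / old_db: dicts mapping a keyword to its data row, where a
    missing key (or None) means that database has no coverage for it.
    """
    merged = {}
    for kw in keywords:
        row = new_db.get(kw)
        merged[kw] = row if row is not None else old_db.get(kw)
    return merged

# The new database reports ranges; the old one reports exact counts.
merged = merge_keyword_data(
    new_db={"credit cards": {"volume": "very high"}},
    old_db={"credit cards": {"volume": 550_000},
            "obscure long tail query": {"volume": 90}},
    keywords=["credit cards", "obscure long tail query"],
)
```

A reader can then tell which source a row came from exactly as described above: a range like low or high means the new database, a specific search count means the old one.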

For a limited time we are still allowing access to both keyword tools, though we anticipate removing access to the old keyword tool in the future once we have collected plenty of feedback on the new keyword tool. Please feel free to leave your feedback in the below comments.

One of the cool features of the new keyword tool worth highlighting further is the difference between estimated bid prices & estimated click prices. In the following screenshot you can see how Amazon is estimated as having a much higher bid price than actual click price, largely because, due to low keyword relevancy, entities other than the official brand are required to bid much higher to appear on the popular trademark terms being arbitraged by Google.

Historically, this difference between bid price & click price was a big source of noise on lists of the most valuable keywords.

Recently some advertisers have started complaining about the “Google shakedown” from how many brand-driven searches are simply leaving the .com part off of a web address in Chrome & then being forced to pay Google for their own pre-existing brand equity.

from SEO Book http://www.seobook.com/new-keyword-tool

AMP’d Up for Recaptcha

Beyond search Google controls the leading distributed ad network, the leading mobile OS, the leading web browser, the leading email client, the leading web analytics platform, the leading free video hosting site.

They win a lot.

And they take winnings from one market & leverage them into manipulating adjacent markets.

Embrace. Extend. Extinguish.

AMP is an utterly unnecessary invention designed to further shift power to Google while disenfranchising publishers. From the very start it had many issues with basic things like supporting JavaScript, double counting unique users (no reason to fix broken stats if they drive adoption!), not supporting third party ad networks, not showing publisher domain names, and just generally being a useless layer of sunk cost technical overhead that provides literally no real value.

Over time they have corrected some of these catastrophic deficiencies, but if it provided real value, they wouldn’t have needed to force adoption with preferential placement in their search results. They force the bundling because AMP sucks.

Absurdity knows no bounds. Googlers suggest: “AMP isn’t another “channel” or “format” that’s somehow not the web. It’s not a SEO thing. It’s not a replacement for HTML. It’s a web component framework that can power your whole site. … We, the AMP team, want AMP to become a natural choice for modern web development of content websites, and for you to choose AMP as framework because it genuinely makes you more productive.”

Meanwhile some newspapers have about a dozen employees who work on re-formatting content for AMP.

Feeeeeel the productivity!

Some content types (particularly user generated content) can be unpredictable & circuitous. For many years forum websites would use keywords embedded in the search referral to highlight relevant parts of the page. Keyword (not provided) largely destroyed that & then it became a competitive feature for AMP: “If the Featured Snippet links to an AMP article, Google will sometimes automatically scroll users to that section and highlight the answer in orange.”

That would perhaps be a single area where AMP was more efficient than the alternative. But it is only so because Google destroyed the alternative by stripping keyword referrers from search queries.

The power dynamics of AMP are ugly:

“I see them as part of the effort to normalise the use of the AMP Carousel, which is an anti-competitive land-grab for the web by an organisation that seems to have an insatiable appetite for consuming the web, probably ultimately to it’s own detriment. … This enables Google to continue to exist after the destination site (eg the New York Times) has been navigated to. Essentially it flips the parent-child relationship to be the other way around. … As soon as a publisher blesses a piece of content by packaging it (they have to opt in to this, but see coercion below), they totally lose control of its distribution. … I’m not that smart, so it’s surely possible to figure out other ways of making a preload possible without cutting off the content creator from the people consuming their content. … The web is open and decentralised. We spend a lot of time valuing the first of these concepts, but almost none trying to defend the second. Google knows, perhaps better than anyone, how being in control of the user is the most monetisable position, and having the deepest pockets and the most powerful platform to do so, they have very successfully inserted themselves into my relationship with millions of other websites. … In AMP, the support for paywalls is based on a recommendation that the premium content be included in the source of the page regardless of the user’s authorisation state. … These policies demonstrate contempt for others’ right to freely operate their businesses.

After enough publishers adopted AMP Google was able to turn their mobile app’s homepage into an interactive news feed below the search box. And inside that news feed Google gets to distribute MOAR ads while 0% of the revenue from those ads finds its way to the publishers whose content is used to make up the feed.

Appropriate appropriation. 😀

Each additional layer of technical cruft is another cost center. Things that sound appealing at first blush may not be:

The way you verify your identity to Let’s Encrypt is the same as with other certificate authorities: you don’t really. You place a file somewhere on your website, and they access that file over plain HTTP to verify that you own the website. The one attack that signed certificates are meant to prevent is a man-in-the-middle attack. But if someone is able to perform a man-in-the-middle attack against your website, then he can intercept the certificate verification, too. In other words, Let’s Encrypt certificates don’t stop the one thing they’re supposed to stop. And, as always with the certificate authorities, a thousand murderous theocracies, advertising companies, and international spy organizations are allowed to impersonate you by design.

Anything that is easy to implement & widely marketed often has costs added to it in the future as the entity moves to monetize the service.

This is a private equity firm buying up multiple hosting control panels & then adjusting prices.

This is Google Maps drastically changing their API terms.

This is Facebook charging you for likes to build an audience, giving your competitors access to those likes as an addressable audience to advertise against, and then charging you once more to boost the reach of your posts.

This is Grubhub creating shadow websites on your behalf and charging you for every transaction created by the gravity of your brand.

Shivane believes GrubHub purchased her restaurant’s web domain to prevent her from building her own online presence. She also believes the company may have had a special interest in owning her name because she processes a high volume of orders. … it appears GrubHub has set up several generic, templated pages that look like real restaurant websites but in fact link only to GrubHub. These pages also display phone numbers that GrubHub controls. The calls are forwarded to the restaurant, but the platform records each one and charges the restaurant a commission fee for every order

Settling for the easiest option drives a lack of differentiation, embeds additional risk & once the dominant player has enough marketshare they’ll change the terms on you.

Small gains in short term margins for massive increases in fragility.

“Closed platforms increase the chunk size of competition & increase the cost of market entry, so people who have good ideas, it is a lot more expensive for their productivity to be monetized. They also don’t like standardization … it looks like rent seeking behaviors on top of friction” – Gabe Newell

The other big issue is platforms that run out of growth space in their core market may break integrations with adjacent service providers as each want to grow by eating the other’s market.

Those who look at SaaS business models through the eyes of a seasoned investor will better understand how markets are likely to change:

“I’d argue that many of today’s anointed tech “disruptors” are doing little in the way of true disruption. … When investors used to get excited about a SAAS company, they typically would be describing a hosted multi-tenant subscription-billed piece of software that was replacing a ‘legacy’ on-premise perpetual license solution in the same target market (i.e. ERP, HCM, CRM, etc.). Today, the terms SAAS and Cloud essentially describe the business models of every single public software company.

Most platform companies are initially required to operate at low margins in order to buy growth of their category & own their category. Then when they are valued on that, they quickly need to jump across to adjacent markets to grow into the valuation:

Twilio has no choice but to climb up the application stack. This is a company whose ‘disruption’ is essentially great API documentation and gangbuster SEO spend built on top of a highly commoditized telephony aggregation API. They have won by marketing to DevOps engineers. With all the hype around them, you’d think Twilio invented the telephony API, when in reality what they did was turn it into a product company. Nobody had thought of doing this let alone that this could turn into a $17 billion company because simply put the economics don’t work. And to be clear they still don’t. But Twilio’s genius CEO clearly gets this. If the market is going to value robocalls, emergency sms notifications, on-call pages, and carrier fee passed through related revenue growth in the same way it does ‘subscription’ revenue from Atlassian or ServiceNow, then take advantage of it while it lasts.

Large platforms offering temporary subsidies to ensure they dominate their categories & companies like SoftBank spraying capital across the markets is causing massive shifts in valuations:

I also think if you look closely at what is celebrated today as innovation you often find models built on hidden subsidies. … I’d argue the very distributed nature of microservices architecture and API-first product companies means addressable market sizes and unit economics assumptions should be even more carefully scrutinized. … How hard would it be to create an Alibaba today if someone like SoftBank was raining money into such a greenfield space? Excess capital would lead to destruction and likely subpar returns. If capital was the solution, the 1.5 trillion that went into telcos in late ’90s wouldn’t have led to a massive bust. Would a Netflix be what it is today if a SoftBank was pouring billions into streaming content startups right as the experiment was starting? Obviously not. Scarcity of capital is another often underappreciated part of the disruption equation. Knowing resources are finite leads to more robust models. … This convergence is starting to manifest itself in performance. Disney is up 30% over the last 12 months while Netflix is basically flat. This may not feel like a bubble sign to most investors, but from my standpoint, it’s a clear evidence of the fact that we are approaching a something has got to give moment for the way certain businesses are valued.”

Circling back to Google’s AMP, it has a cousin called Recaptcha.

Recaptcha is another AMP-like trojan horse:

According to tech statistics website Built With, more than 650,000 websites are already using reCaptcha v3; overall, there are at least 4.5 million websites use reCaptcha, including 25% of the top 10,000 sites. Google is also now testing an enterprise version of reCaptcha v3, where Google creates a customized reCaptcha for enterprises that are looking for more granular data about users’ risk levels to protect their site algorithms from malicious users and bots. … According to two security researchers who’ve studied reCaptcha, one of the ways that Google determines whether you’re a malicious user or not is whether you already have a Google cookie installed on your browser. … To make this risk-score system work accurately, website administrators are supposed to embed reCaptcha v3 code on all of the pages of their website, not just on forms or log-in pages.
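The site-wide embed those researchers describe looks roughly like this, following Google's documented v3 integration (SITE_KEY is a placeholder & the backend scoring call is not shown):

```html
<!-- loaded on every page, not just forms, so the risk score has context -->
<script src="https://www.google.com/recaptcha/api.js?render=SITE_KEY"></script>
<script>
  grecaptcha.ready(function () {
    grecaptcha.execute('SITE_KEY', {action: 'pageview'}).then(function (token) {
      // the token is sent to the site's own backend, which asks Google
      // for a 0.0-1.0 risk score & decides how to treat the visitor
    });
  });
</script>
```

Note that every page running this script is also a page reporting visitor behavior back to Google.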

About a month ago when logging into Bing Ads I saw Recaptcha on the login page & couldn’t believe they’d give Google control at that access point. I think they got rid of that, but lots of companies are perhaps shooting themselves in the foot through a combination of over-reliance on Google infrastructure AND sloppy implementation.

Today when making a purchase on Fiverr, after converting, I got some of this action

Hmm. Maybe I will enable JavaScript and try again.

Oooops.

That is called snatching defeat from the jaws of victory.

My account is many years old. My payment type on record has been used for years. I have ordered from the particular seller about a dozen times over the years. And suddenly because my web browser had JavaScript turned off I was deemed a security risk of some sort for making an utterly ordinary transaction I have already completed about a dozen times.

On AMP, JavaScript was the devil. And on desktop, the absence of JavaScript was the devil.

Pro tip: Ecommerce websites that see substandard conversion rates from using Recaptcha can boost their overall ecommerce revenue by buying more Google AdWords ads.

As more of the infrastructure stack is driven by AI software there is going to be a very real opportunity for many people to become deplatformed across the web on an utterly arbitrary basis. That tech companies like Facebook also want to create digital currencies on top of the leverage they already have only makes the proposition that much scarier.

If the tech platforms host copies of our sites, process the transactions & even create their own currencies, how will we know what level of value they are adding versus what they are extracting?

Who measures the measurer?

And when the economics turn negative, what will we do if we are hooked into an ecosystem we can’t spend additional capital to get out of when things head south?

from SEO Book http://www.seobook.com/amped-recaptcha

The Fractured Web

Anyone can argue about the intent of a particular action & the outcome that is derived from it. But when the outcome is known, at some point the intent is inferred if the outcome flows from a source of power & the outcome doesn’t change.

Or, put another way, if a powerful entity (government, corporation, other organization) disliked an outcome which appeared to benefit them in the short term at great lasting cost to others, they could spend resources to adjust the system.

If they don’t spend those resources (or, rather, spend them on lobbying rather than improving the ecosystem) then there is no desired change. The outcome is as desired. Change is unwanted.

News is a stock vs flow market where the flow of recent events drives most of the traffic to articles. News that is more than a couple days old is no longer news. A news site which stops publishing news stops being a habit & quickly loses relevancy. Algorithmically an abandoned archive of old news articles doesn’t look much different than eHow, in spite of having a much higher cost structure.

According to SEMrush’s traffic rank, ampproject.org gets more monthly visits than Yahoo.com.

Traffic Ranks.

That actually understates the prevalence of AMP because AMP is generally designed for mobile AND not all AMP-formatted content is displayed on ampproject.org.

Part of how AMP was able to get widespread adoption was because in the news vertical the organic search result set was displaced by an AMP block. If you were a news site either you were so differentiated that readers would scroll past the AMP block in the search results to look for you specifically, or you adopted AMP, or you were doomed.

Some news organizations like The Guardian have a team of about a dozen people reformatting their content to the duplicative & proprietary AMP format. That’s wasteful, but necessary: “In theory, adoption of AMP is voluntary. In reality, publishers that don’t want to see their search traffic evaporate have little choice. New data from publisher analytics firm Chartbeat shows just how much leverage Google has over publishers thanks to its dominant search engine.”

It seems more than a bit backward that low margin publishers are doing duplicative work to distance themselves from their own readers while improving the profit margins of monopolies. But it is what it is. And that no doubt drew the ire of many publishers across the EU.

And now there are AMP Stories to eat up even more visual real estate.

If you spent a bunch of money to create a highly differentiated piece of content, why would you prefer that high-spend flagship content appear on a third party website rather than your own?

Google & Facebook have done such a fantastic job of eating the entire pie that some are celebrating Amazon as a prospective savior to the publishing industry. That view – IMHO – is rather suspect.

Where any of the tech monopolies dominate they cram down on partners. The New York Times acquired The Wirecutter in Q4 of 2016. In Q1 of 2017 Amazon adjusted their affiliate fee schedule.

Amazon generally treats consumers well, but they have been much harder on business partners with tough pricing negotiations, counterfeit protections, forced ad buying to have a high enough product rank to be able to rank organically, ad displacement of their organic search results below the fold (even for branded search queries), learning suppliers & cutting out the partners, private label products patterned after top sellers, in some cases running pop over ads for the private label products on product level pages where brands already spent money to drive traffic to the page, etc.

They’ve made things tougher for their partners in a way that mirrors the impact Facebook & Google have had on online publishers:

“Boyce’s experience on Amazon largely echoed what happens in the offline world: competitors entered the market, pushing down prices and making it harder to make a profit. So Boyce adapted. He stopped selling basketball hoops and developed his own line of foosball tables, air hockey tables, bocce ball sets and exercise equipment. The best way to make a decent profit on Amazon was to sell something no one else had and create your own brand. … Amazon also started selling bocce ball sets that cost $15 less than Boyce’s. He says his products are higher quality, but Amazon gives prominent page space to its generic version and wins the cost-conscious shopper.”

Google claims they have no idea how happy content publishers are with the trade-off between themselves & the search engine, but every quarter Alphabet publishes the share of ad spend occurring on owned & operated sites versus the share spent across the broader publisher network. And in almost every quarter for over a decade straight that ratio has grown worse for publishers.

The aggregate numbers for news publishers are worse than shown above, as Google is ramping up ads in video games quite hard. They’ve partnered with Unity & promptly took away the ability to block ads from appearing in video games using the googleadsenseformobileapps.com exclusion (hello flat thumb misclicks, my name is budget & I am gone!).

They will also track video game player behavior & alter game play to maximize revenues based on machine learning tied to surveillance of the user’s account: “We’re bringing a new approach to monetization that combines ads and in-app purchases in one automated solution. Available today, new smart segmentation features in Google AdMob use machine learning to segment your players based on their likelihood to spend on in-app purchases. Ad units with smart segmentation will show ads only to users who are predicted not to spend on in-app purchases. Players who are predicted to spend will see no ads, and can simply continue playing.”

And how does the growth of ampproject.org square against the following wisdom?

Literally only yesterday did Google begin supporting instant loading of self-hosted AMP pages.

China has a different set of tech leaders than the United States. Baidu, Alibaba, Tencent (BAT) instead of Facebook, Amazon, Apple, Netflix, Google (FAANG). China tech companies may have won their domestic markets in part based on superior technology or better knowledge of the local culture, though those same companies have largely gone nowhere fast in most foreign markets. A big part of winning was governmental assistance in putting a foot on the scales.

Part of the US-China trade war is about who controls the virtual “seas” upon which value flows:

it can easily be argued that the last 60 years were above all the era of the container-ship (with container-ships getting ever bigger). But will the coming decades still be the age of the container-ship? Possibly not, for the simple reason that things that have value increasingly no longer travel by ship, but instead by fiberoptic cables! … you could almost argue that ZTE and Huawei have been the “East India Company” of the current imperial cycle. Unsurprisingly, it is these very companies, charged with laying out the “new roads” along which “tomorrow’s value” will flow, that find themselves at the center of the US backlash. … if the symbol of British domination was the steamship, and the symbol of American strength was the Boeing 747, it seems increasingly clear that the question of the future will be whether tomorrow’s telecom switches and routers are produced by Huawei or Cisco. … US attempts to take down Huawei and ZTE can be seen as the existing empire’s attempt to prevent the ascent of a new imperial power. With this in mind, I could go a step further and suggest that perhaps the Huawei crisis is this century’s version of Suez crisis. No wonder markets have been falling ever since the arrest of the Huawei CFO. In time, the Suez Crisis was brought to a halt by US threats to destroy the value of sterling. Could we now witness the same for the US dollar?

China maintains Huawei is an employee-owned company. But that proposition is suspect. Broadly stealing technology is vital to the growth of the Chinese economy & they have no incentive to stop unless their leading companies pay a direct cost. Meanwhile, China is investigating Ericsson over licensing technology.

India has taken notice of the success of Chinese tech companies & thus began to promote “national champion” company policies. That, in turn, has also meant some of the Chinese-styled laws requiring localized data, antitrust inquiries, foreign ownership restrictions, requirements for platforms to not sell their own goods, promoting limits on data encryption, etc.

The secretary of India’s Telecommunications Department, Aruna Sundararajan, last week told a gathering of Indian startups in a closed-door meeting in the tech hub of Bangalore that the government will introduce a “national champion” policy “very soon” to encourage the rise of Indian companies, according to a person familiar with the matter. She said Indian policy makers had noted the success of China’s internet giants, Alibaba Group Holding Ltd. and Tencent Holdings Ltd. … Tensions began rising last year, when New Delhi decided to create a clearer set of rules for e-commerce and convened a group of local players to solicit suggestions. Amazon and Flipkart, even though they make up more than half the market, weren’t invited, according to people familiar with the matter.

Amazon vowed to invest $5 billion in India & they have done some remarkable work on logistics there. Walmart acquired Flipkart for $16 billion.

Other emerging markets also have many local ecommerce leaders like Jumia, MercadoLibre, OLX, Gumtree, Takealot, Konga, Kilimall, BidOrBuy, Tokopedia, Bukalapak, Shopee, Lazada. If you live in the US you may have never heard of *any* of those companies. And if you live in an emerging market you may have never interacted with Amazon or eBay.

It makes sense that ecommerce leadership would be more localized since it requires moving things in the physical economy, dealing with local currencies, managing inventory, shipping goods, etc. whereas information flows are just bits floating on a fiber optic cable.

If the Internet is primarily seen as a communications platform it is easy for people in some emerging markets to think Facebook is the Internet. Free communication with friends and family members is a compelling offer & as the cost of data drops web usage increases.

At the same time, the web is incredibly deflationary. Every free form of entertainment which consumes time is time that is not spent consuming something else.

Add the technological disruption to the wealth polarization that happened in the wake of the great recession, then combine that with algorithms that promote extremist views, & the result is clearly increasing conflict.

If you are a parent and you think your child has no shot at a brighter future than your own life, it is easy to be full of rage.

Empathy can radicalize otherwise normal people by giving them a more polarized view of the world:

Starting around 2000, the line starts to slide. More students say it’s not their problem to help people in trouble, not their job to see the world from someone else’s perspective. By 2009, on all the standard measures, Konrath found, young people on average measure 40 percent less empathetic than my own generation … The new rule for empathy seems to be: reserve it, not for your “enemies,” but for the people you believe are hurt, or you have decided need it the most. Empathy, but just for your own team. And empathizing with the other team? That’s practically a taboo.

A complete lack of empathy could allow a psychopath (hi Chris!) to commit extreme crimes while feeling no guilt, shame or remorse. Extreme empathy can have the same sort of outcome:

“Sometimes we commit atrocities not out of a failure of empathy but rather as a direct consequence of successful, even overly successful, empathy. … They emphasized that students would learn both sides, and the atrocities committed by one side or the other were always put into context. Students learned this curriculum, but follow-up studies showed that this new generation was more polarized than the one before. … [Empathy] can be good when it leads to good action, but it can have downsides. For example, if you want the victims to say ‘thank you.’ You may even want to keep the people you help in that position of inferior victim because it can sustain your feeling of being a hero.” – Fritz Breithaupt

News feeds will be read. Villages will be razed. Lynch mobs will become commonplace.

Many people will end up murdered by algorithmically generated empathy.

As technology increases absentee ownership & financial leverage, a society led by morally agnostic algorithms is not going to become more egalitarian.

When politicians throw fuel on the fire it only gets worse:

It’s particularly odd that the government is demanding “accountability and responsibility” from a phone app when some ruling party politicians are busy spreading divisive fake news. How can the government ask WhatsApp to control mobs when those convicted of lynching Muslims have been greeted, garlanded and fed sweets by some of the most progressive and cosmopolitan members of Modi’s council of ministers?

Mark Zuckerberg won’t get caught downstream from platform blowback as he spends $20 million a year on his security.

The web is a mirror. Engagement-based algorithms reinforce our perceptions & identities.

And every important story has at least 2 sides!

Some may “learn” vaccines don’t work. Others may learn the vaccines their own children took did not fully protect them from measles & other medieval diseases spread by people who absorbed the antivax content amplified by Facebook & Google.

Passion drives engagement, which drives algorithmic distribution: “There’s an asymmetry of passion at work. Which is to say, there’s very little counter-content to surface because it simply doesn’t occur to regular people (or, in this case, actual medical experts) that there’s a need to produce counter-content.”

As the costs of “free” become harder to hide, social media companies which currently sell emerging markets as their next big growth area will end up with embedded regulatory compliance costs exceeding any revenue they could hope to generate in those markets.

The Pinterest S1 shows almost all their growth is in emerging markets, yet almost all their revenue is inside the United States.

As governments around the world see the real-world cost of the foreign tech companies & view some of them as piggy banks, eventually the likes of Facebook or Google will pull out of a variety of markets they no longer feel worth serving. It will be like Google did in mainland China with search after discovering pervasive hacking of activist Gmail accounts.

Lower friction & lower cost information markets will face more junk fees, hurdles & even some legitimate regulations. Information markets will start to behave more like physical goods markets.

The tech companies presume they will be able to use satellites, drones & balloons to beam in Internet while avoiding messy local issues tied to real world infrastructure, but when a local wealthy player is betting against them they’ll probably end up losing those markets: “One of the biggest cheerleaders for the new rules was Reliance Jio, a fast-growing mobile phone company controlled by Mukesh Ambani, India’s richest industrialist. Mr. Ambani, an ally of Mr. Modi, has made no secret of his plans to turn Reliance Jio into an all-purpose information service that offers streaming video and music, messaging, money transfer, online shopping, and home broadband services.”

Publishers do not have “their mojo back” because the tech companies have been so good to them, but rather because the tech companies have been so aggressive they’ve earned broad blowback, which will in turn lead publishers to opt out of future deals, which will eventually lead more people back to the trusted brands of yesterday.

Publishers feeling guilty about taking advertorial money from the tech companies to spread their propaganda will offset its publication with opinion pieces pointing in the other direction: “This is a lobbying campaign in which buying the good opinion of news brands is clearly important. If it was about reaching a target audience, there are plenty of metrics to suggest his words would reach further – at no cost – on Facebook. Similarly, Google is upping its presence in a less obvious manner via assorted media initiatives on both sides of the Atlantic. Its more direct approach to funding journalism seems to have the desired effect of making all media organisations (and indeed many academic institutions) touched by its money slightly less questioning and critical of its motives.”

When Facebook goes down direct visits to leading news brand sites go up.

When Google penalizes a no-name me-too site almost nobody realizes it is missing. But if a big publisher opts out of the ecosystem people will notice.

The reliance on the tech platforms is largely a mirage. If enough key players were to opt out at the same time people would quickly reorient their information consumption habits.

If the platforms can change their focus overnight then why can’t publishers band together & choose to dump them?

In Europe there is GDPR, which aimed to protect user privacy, but ultimately acted as a tax on innovation by local startups while being a subsidy to the big online ad networks. They also have Article 11 & Article 13, which passed in spite of Google’s best efforts on the scaremongering anti-SERP tests, lobbying & propaganda fronts: “Google has sparked criticism by encouraging news publishers participating in its Digital News Initiative to lobby against proposed changes to EU copyright law at a time when the beleaguered sector is increasingly turning to the search giant for help.”

Remember the Eric Schmidt comment about how brands are how you sort out (the non-YouTube portion of) the cesspool? As it turns out, he was allegedly wrong as Google claims they have been fighting for the little guy the whole time:

Article 11 could change that principle and require online services to strike commercial deals with publishers to show hyperlinks and short snippets of news. This means that search engines, news aggregators, apps, and platforms would have to put commercial licences in place, and make decisions about which content to include on the basis of those licensing agreements and which to leave out. Effectively, companies like Google will be put in the position of picking winners and losers. … Why are large influential companies constraining how new and small publishers operate? … The proposed rules will undoubtedly hurt diversity of voices, with large publishers setting business models for the whole industry. This will not benefit all equally. … We believe the information we show should be based on quality, not on payment.

Facebook claims there is a local news problem: “Facebook Inc. has been looking to boost its local-news offerings since a 2017 survey showed most of its users were clamoring for more. It has run into a problem: There simply isn’t enough local news in vast swaths of the country. … more than one in five newspapers have closed in the past decade and a half, leaving half the counties in the nation with just one newspaper, and 200 counties with no newspaper at all.”

Google is so for the little guy that for their local news experiments they’ve partnered with a private equity backed newspaper roll up firm & another newspaper chain which did overpriced acquisitions & is trying to act like a PE firm (trying to not get eaten by the PE firm).

Does the above stock chart look in any way healthy?

Does it give off the scent of a firm that understood the impact of digital & rode it to new heights?

If you want good market-based outcomes, why not partner with journalists directly versus operating through PE chop shops?

If Patch is profitable & Google were a neutral ranking system based on quality, couldn’t Google partner with journalists directly?

Throwing a few dollars at a PE firm in some nebulous partnership sure beats the sort of regulations coming out of the EU. And the EU’s regulations (and prior link tax attempts) are in addition to the three multi-billion-Euro fines the European Union has levied against Alphabet for shopping search, Android & AdSense.

Google was also fined in Russia over Android bundling. The fine was tiny, but after consumers gained a search engine choice screen (much like Google pushed for in Europe on Microsoft years ago) Yandex’s share of mobile search grew quickly.

The UK recently published a white paper on online harms. In some ways it is a regulation just like the tech companies might offer to participants in their ecosystems:

Companies will have to fulfil their new legal duties or face the consequences and “will still need to be compliant with the overarching duty of care even where a specific code does not exist, for example assessing and responding to the risk associated with emerging harms or technology”.

If web publishers should monitor inbound links to look for anything suspicious then the big platforms sure as hell have the resources & profit margins to monitor behavior on their own websites.

Australia passed the Sharing of Abhorrent Violent Material bill which requires platforms to expeditiously remove violent videos & notify the Australian police about them.

There are other layers of fracturing going on in the web as well.

Programmatic advertising shifted revenue from publishers to adtech companies & the largest ad sellers. Ad blockers further lower the ad revenues of many publishers. If you routinely use an ad blocker, try surfing the web for a while without one & you will notice overlay welcome AdSense ads on sites as you browse the web – the very type of ad they were allegedly against when promoting AMP.

Tracking protection in browsers & ad blocking features built directly into browsers leave publishers more uncertain. And who even knows who visited an AMP page hosted on a third party server, particularly when things like GDPR are mixed in? Those who lack first party data may end up having to make large acquisitions to stay relevant.

Voice search & personal assistants are now ad channels.

App stores are removing VPNs in China, removing Tiktok in India, and keeping female tracking apps in Saudi Arabia. App stores are centralized chokepoints for governments. Every centralized service is at risk of censorship. Web browsers from key state-connected players can also censor messages spread by developers on platforms like GitHub.

Microsoft’s newest Edge web browser is based on Chromium, the source of Google Chrome. While Mozilla Firefox gets most of their revenue from a search deal with Google, Google has still gone out of its way to use its services to both promote Chrome with pop-overs AND break competing web browsers:

“All of this is stuff you’re allowed to do to compete, of course. But we were still a search partner, so we’d say ‘hey what gives?’ And every time, they’d say, ‘oops. That was accidental. We’ll fix it in the next push in 2 weeks.’ Over and over. Oops. Another accident. We’ll fix it soon. We want the same things. We’re on the same team. There were dozens of oopses. Hundreds maybe?” – former Firefox VP Jonathan Nightingale

As phone sales fall & app downloads stall, a hardware company like Apple is pushing hard into services while quietly raking in utterly fantastic ad revenues from search & ads in their app store.

Part of the reason people are downloading fewer apps is so many apps require registration as soon as they are opened, or only let a user engage with them for seconds before pushing aggressive upsells. And then many apps which were formerly one-off purchases are becoming subscription plays. As traffic acquisition costs have jumped, many apps must engage in sleight of hand behaviors (free but not really, we are collecting data totally unrelated to the purpose of our app & oops we sold your data, etc.) in order to get the numbers to back out. This in turn causes app stores to slow down app reviews.

Apple acquired the news subscription service Texture & turned it into Apple News Plus. Not only is Apple keeping half the subscription revenues, but soon the service will only work for people using Apple devices, leaving nearly 100,000 other subscribers out in the cold: “if you’re part of the 30% who used Texture to get your favorite magazines digitally on Android or Windows devices, you will soon be out of luck. Only Apple iOS devices will be able to access the 300 magazines available from publishers. At the time of the sale in March 2018 to Apple, Texture had about 240,000 subscribers.”

Apple is also going to spend over a half-billion dollars exclusively licensing independently developed games:

Several people involved in the project’s development say Apple is spending several million dollars each on most of the more than 100 games that have been selected to launch on Arcade, with its total budget likely to exceed $500m. The games service is expected to launch later this year. … Apple is offering developers an extra incentive if they agree for their game to only be available on Arcade, withholding their release on Google’s Play app store for Android smartphones or other subscription gaming bundles such as Microsoft’s Xbox game pass.

Verizon wants to launch a video game streaming service. It will probably be almost as successful as their Go90 OTT service was. Microsoft is pushing to make Xbox games work on Android devices. Amazon is developing a game streaming service to complement Twitch.

The hosts on Twitch, some of whom sign up exclusively with the platform in order to gain access to its moneymaking tools, are rewarded for their ability to make a connection with viewers as much as they are for their gaming prowess. Viewers who pay $4.99 a month for a basic subscription — the money is split evenly between the streamers and Twitch — are looking for immediacy and intimacy. While some hosts at YouTube Gaming offer a similar experience, they have struggled to build audiences as large, and as dedicated, as those on Twitch. … While YouTube has made millionaires out of the creators of popular videos through its advertising program, Twitch’s hosts make money primarily from subscribers and one-off donations or tips. YouTube Gaming has made it possible for viewers to support hosts this way, but paying audiences haven’t materialized at the scale they have on Twitch.

Google, having a bit of Twitch envy, is also launching a video game streaming service which will be deeply integrated into YouTube: “With Stadia, YouTube watchers can press “Play now” at the end of a video, and be brought into the game within 5 seconds. The service provides “instant access” via button or link, just like any other piece of content on the web.”

Google will also launch their own game studio making exclusive games for their platform.

When consoles don’t use discs or cartridges so they can sell a subscription access to their software library it is hard to be a game retailer! GameStop’s stock has been performing like an ICO. And these sorts of announcements from the tech companies have been hitting stock prices for companies like Nintendo & Sony: “There is no doubt this service makes life even more difficult for established platforms,” Amir Anvarzadeh, a market strategist at Asymmetric Advisors Pte, said in a note to clients. “Google will help further fragment the gaming market which is already coming under pressure by big games which have adopted the mobile gaming business model of giving the titles away for free in hope of generating in-game content sales.”

The big tech companies which promoted everything in adjacent markets being free are now erecting paywalls for themselves, balkanizing the web by paying for exclusives to drive their bundled subscriptions.

How many paid movie streaming services will the web have by the end of next year? 20? 50? Does anybody know?

Disney alone will operate Disney+, ESPN+ as well as Hulu.

And then the tech companies are not only licensing exclusives to drive their subscription-based services, but we’re going to see more exclusionary policies like YouTube not working on Amazon Echo, Netflix dumping support for Apple’s Airplay, or Amazon refusing to sell devices like Chromecast or Apple TV.

The good news in a fractured web is a broader publishing industry that contains many micro markets will have many opportunities embedded in it. A Facebook pivot away from games toward news, or a pivot away from news toward video won’t kill third party publishers who have a more diverse traffic profile and more direct revenues. And a regional law blocking porn or gambling websites might lead to an increase in demand for VPNs or free to play points-based games with paid upgrades. Even the rise of metered paywalls will lead to people using more web browsers & more VPNs. Each fracture (good or bad) will create more market edges & ultimately more opportunities. Chinese enforcement of their gambling laws created a real estate boom in Manila.

So long as there are 4 or 5 game stores, 4 or 5 movie streaming sites, etc. … they have to compete on merit or use money to try to buy exclusives. Either way is better than the old monopoly strategy of take it or leave it ultimatums.

The publisher wins because there is a competitive bid. There won’t be an arbitrary 30% tax on everything. So long as there is competition from the open web there will be means to bypass the junk fees & the most successful companies that do so might create their own stores with a lower rate: “Mr. Schachter estimates that Apple and Google could see a hit of about 14% to pretax earnings if they reduced their own app commissions to match Epic’s take.”

As the big media companies & big tech companies race to create subscription products they’ll spend many billions on exclusives. And they will be training consumers that there’s nothing wrong with paying for content. This will eventually lead to hundreds of thousands or even millions of successful niche publications which have incentives better aligned than all the issues the ad supported web has faced.


from SEO Book http://www.seobook.com/fractured

Keyword Not Provided, But it Just Clicks

When SEO Was Easy

When I got started on the web over 15 years ago I created an overly broad & shallow website that had little chance of making money because it was utterly undifferentiated and crappy. In spite of my best (worst?) efforts while being a complete newbie, sometimes I would go to the mailbox and see a check for a couple hundred or a couple thousand dollars come in. My old roommate & I went to Coachella & when the trip was over I returned to a bunch of mail to catch up on & realized I had made way more while not working than what I spent on that trip.

What was the secret to a total newbie making decent income by accident?

Horrible spelling.

Back then search engines were not as sophisticated with their spelling correction features & I was one of 3 or 4 people in the search index that misspelled the name of an online casino the same way many searchers did.

The high minded excuse for why I did not scale that would be claiming I knew it was a temporary trick that was somehow beneath me. The more accurate reason would be thinking in part it was a lucky fluke rather than thinking in systems. If I were clever at the time I would have created the misspeller’s guide to online gambling, though I think I was just so excited to make anything from the web that I perhaps lacked the ambition & foresight to scale things back then.

In the decade that followed I had a number of other lucky breaks like that. One time one of the original internet bubble companies that managed to stay around put up a sitewide footer link targeting the concept that one of my sites made decent money from. This was just before the great recession, before Panda existed. The concept they targeted had 3 or 4 ways to describe it. 2 of them were very profitable & if they targeted either of the most profitable versions with that page the targeting would have sort of carried over to both. They would have outranked me if they targeted the correct version, but they didn’t so their mistargeting was a huge win for me.

Search Gets Complex

Search today is much more complex. In the years since those easy-n-cheesy wins, Google has rolled out many updates which aim to feature sought after destination sites while diminishing the sites which rely on “one simple trick” to rank.

Arguably the quality of the search results has improved significantly as search has become more powerful, more feature rich & has layered in more relevancy signals.

Many quality small web publishers have gone away due to some combination of increased competition, algorithmic shifts & uncertainty, and reduced monetization as more ad spend was redirected toward Google & Facebook. But the impact as felt by any given publisher is not the impact as felt by the ecosystem as a whole. Many terrible websites have also gone away, while some formerly obscure though higher-quality sites rose to prominence.

There was the Vince update in 2009, which boosted the rankings of many branded websites.

Then in 2011 there was Panda as an extension of Vince, which tanked the rankings of many sites that published hundreds of thousands or millions of thin content pages while boosting the rankings of trusted branded destinations.

Then there was Penguin, which was a penalty that hit many websites which had heavily manipulated or otherwise aggressive-appearing link profiles. Google felt there was a lot of noise in the link graph, which was their justification for Penguin.

There were updates which lowered the rankings of many exact match domains. Then increased ad load in the search results, along with the above ranking shifts, further lowered the ability to rank keyword-driven domain names. If your domain is generically descriptive then there is a limit to how differentiated & memorable you can make it if you are targeting the core market the keywords are aligned with.

There is a reason eBay is more popular than auction.com, Google is more popular than search.com, Yahoo is more popular than portal.com & Amazon is more popular than a store.com or a shop.com. When that winner take most impact of many online markets is coupled with the move away from using classic relevancy signals, the economics shift to where it makes a lot more sense to carry the heavy overhead of establishing a strong brand.

Branded and navigational search queries could be used in the relevancy algorithm stack to confirm the quality of a site & verify (or dispute) the veracity of other signals.

Historically relevant algo shortcuts become less appealing as they become less relevant to the current ecosystem & even less aligned with the future trends of the market. Add in negative incentives for pushing on a string (penalties on top of wasting the capital outlay) and a more holistic approach certainly makes sense.

Modeling Web Users & Modeling Language

PageRank was an attempt to model the random surfer.
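The random-surfer model can be sketched as a simple power iteration over a toy link graph. This is an illustrative toy, not Google's production system; the page names, damping factor & iteration count here are all assumptions:

```python
# Minimal PageRank power iteration over a toy link graph.
# The damping factor 0.85 matches the original paper's suggested value.

def pagerank(links, damping=0.85, iterations=50):
    """links: dict mapping each page to the list of pages it links to."""
    pages = list(links)
    n = len(pages)
    ranks = {p: 1.0 / n for p in pages}  # the surfer starts anywhere
    for _ in range(iterations):
        # Base probability of the surfer teleporting to a random page.
        new_ranks = {p: (1.0 - damping) / n for p in pages}
        for page, outlinks in links.items():
            if not outlinks:
                # Dangling page: spread its rank evenly across all pages.
                for p in pages:
                    new_ranks[p] += damping * ranks[page] / n
            else:
                share = damping * ranks[page] / len(outlinks)
                for target in outlinks:
                    new_ranks[target] += share
        ranks = new_ranks
    return ranks

# Hypothetical three-page site: "home" collects the most inbound links.
graph = {
    "home": ["about", "blog"],
    "about": ["home"],
    "blog": ["home", "about"],
}
ranks = pagerank(graph)
```

On this toy graph the most-linked-to page ("home") ends up with the highest rank, and the ranks always sum to 1 since they form a probability distribution over where the surfer is.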

When Google is pervasively monitoring most users across the web they can shift to directly measuring their behaviors instead of using indirect signals.

Years ago Bill Slawski wrote about the long click, in which he opened by quoting Steven Levy’s In the Plex: How Google Thinks, Works, and Shapes our Lives:

“On the most basic level, Google could see how satisfied users were. To paraphrase Tolstoy, happy users were all the same. The best sign of their happiness was the “Long Click” — This occurred when someone went to a search result, ideally the top one, and did not return. That meant Google had successfully fulfilled the query.”

Of course, there’s a patent for that. In Modifying search result ranking based on implicit user feedback they state:

user reactions to particular search results or search result lists may be gauged, so that results on which users often click will receive a higher ranking. The general assumption under such an approach is that searching users are often the best judges of relevance, so that if they select a particular search result, it is likely to be relevant, or at least more relevant than the presented alternatives.

If you are a known brand you are more likely to get clicked on than a random unknown entity in the same market.

And if you are something people are specifically seeking out, they are likely to stay on your website for an extended period of time.

One aspect of the subject matter described in this specification can be embodied in a computer-implemented method that includes determining a measure of relevance for a document result within a context of a search query for which the document result is returned, the determining being based on a first number in relation to a second number, the first number corresponding to longer views of the document result, and the second number corresponding to at least shorter views of the document result; and outputting the measure of relevance to a ranking engine for ranking of search results, including the document result, for a new search corresponding to the search query. The first number can include a number of the longer views of the document result, the second number can include a total number of views of the document result, and the determining can include dividing the number of longer views by the total number of views.
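Stripped of patent language, that measure reduces to a simple ratio: long views divided by total views for a given query-URL pair. A hypothetical sketch — the 30-second dwell threshold is my assumption, the patent does not specify one:

```python
# Hypothetical sketch of the patent's relevance measure:
# the share of "long" views among all views of a search result.

LONG_CLICK_SECONDS = 30  # assumed threshold, not from the patent

def long_click_ratio(view_durations):
    """view_durations: dwell times in seconds for one query-URL pair."""
    if not view_durations:
        return 0.0
    long_views = sum(1 for d in view_durations if d >= LONG_CLICK_SECONDS)
    return long_views / len(view_durations)
```

For example, `long_click_ratio([5, 45, 120, 2])` returns 0.5 — two of the four visitors stuck around, two bounced straight back to the results.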

Attempts to manipulate such data may not work.

safeguards against spammers (users who generate fraudulent clicks in an attempt to boost certain search results) can be taken to help ensure that the user selection data is meaningful, even when very little data is available for a given (rare) query. These safeguards can include employing a user model that describes how a user should behave over time, and if a user doesn’t conform to this model, their click data can be disregarded. The safeguards can be designed to accomplish two main objectives: (1) ensure democracy in the votes (e.g., one single vote per cookie and/or IP for a given query-URL pair), and (2) entirely remove the information coming from cookies or IP addresses that do not look natural in their browsing behavior (e.g., abnormal distribution of click positions, click durations, clicks_per_minute/hour/day, etc.). Suspicious clicks can be removed, and the click signals for queries that appear to be spammed need not be used (e.g., queries for which the clicks feature a distribution of user agents, cookie ages, etc. that do not look normal).
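The "one single vote per cookie and/or IP for a given query-URL pair" safeguard amounts to collapsing duplicate clicks before counting them. A minimal sketch, with made-up cookie IDs & URLs:

```python
# Sketch of the "one vote per cookie per query-URL pair" safeguard:
# repeated clicks from the same user on the same pair count once.

def deduplicated_votes(clicks):
    """clicks: iterable of (cookie_id, query, url) tuples."""
    return len({(cookie, query, url) for cookie, query, url in clicks})

clicks = [
    ("cookie1", "tennis court hire", "example.com/courts"),
    ("cookie1", "tennis court hire", "example.com/courts"),  # repeat: ignored
    ("cookie2", "tennis court hire", "example.com/courts"),
]
```

Here `deduplicated_votes(clicks)` is 2, not 3 — hammering the same result from one browser adds nothing.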

And just like Google can make a matrix of documents & queries, they could also choose to put more weight on search accounts associated with topical expert users based on their historical click patterns.

Moreover, the weighting can be adjusted based on the determined type of the user both in terms of how click duration is translated into good clicks versus not-so-good clicks, and in terms of how much weight to give to the good clicks from a particular user group versus another user group. Some users’ implicit feedback may be more valuable than others’ due to the details of a user’s review process. For example, a user that almost always clicks on the highest ranked result can have his good clicks assigned lower weights than a user who more often clicks results lower in the ranking first (since the second user is likely more discriminating in his assessment of what constitutes a good result). In addition, a user can be classified based on his or her query stream. Users that issue many queries on (or related to) a given topic T (e.g., queries related to law) can be presumed to have a high degree of expertise with respect to the given topic T, and their click data can be weighted accordingly for other queries by them on (or related to) the given topic T.
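The topical expertise idea in that passage can be sketched as a per-topic click weight: users who issue many queries on a topic get their clicks on that topic counted more heavily. The threshold & weight values here are illustrative assumptions, not anything from the patent:

```python
# Sketch of topical expertise weighting: clicks from users who issue
# many queries on a topic count more for that topic.

from collections import Counter

EXPERT_QUERY_THRESHOLD = 20  # assumed cutoff for "high degree of expertise"

def click_weight(user_topic_queries, topic):
    """user_topic_queries: Counter mapping topic -> queries the user issued."""
    if user_topic_queries[topic] >= EXPERT_QUERY_THRESHOLD:
        return 2.0  # presumed expert: weight their clicks up
    return 1.0      # everyone else gets the baseline weight

# Hypothetical users: one issues many law queries, one only a couple.
law_buff = Counter({"law": 35, "travel": 3})
casual = Counter({"law": 2})
```

The law buff's clicks on law-related queries would be weighted double, while their travel clicks — and the casual searcher's law clicks — stay at the baseline.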

Google was using click data to drive their search rankings as far back as 2009. David Naylor was perhaps the first person who publicly spotted this. Google was ranking Australian websites for [tennis court hire] in the UK & Ireland, in part because that is where most of the click signal came from. That phrase was most widely searched for in Australia. In the years since Google has done a better job of geographically isolating clicks to prevent things like the problem David Naylor noticed, where almost all search results in one geographic region came from a different country.

Whenever SEOs mention using click data to search engineers, the search engineers quickly respond that while they might consider any signal, clicks would be a noisy signal. But if a signal has noise, an engineer would work around it by filtering the noise out or combining multiple signals. To this day Google states they are still working to filter noise from the link graph: “We continued to protect the value of authoritative and relevant links as an important ranking signal for Search.”

The site with millions of inbound links, few intentional visits & those who do visit quickly click the back button (due to a heavy ad load, poor user experience, low quality content, shallow content, outdated content, or some other bait-n-switch approach)…that’s an outlier. Preventing those sorts of sites from ranking well would be another way of protecting the value of authoritative & relevant links.

Best Practices Vary Across Time & By Market + Category

Along the way, concurrent with the above sorts of updates, Google also improved their spelling auto-correct features, auto-completed search queries for many years through a feature called Google Instant (though they later undid forced query auto-completion while retaining automated search suggestions), and then rolled out a few other algorithms that further allowed them to model language & user behavior.

Today it would be much harder to get paid above median wages explicitly for sucking at basic spelling or scaling some other individual shortcut to the moon, like pouring millions of low quality articles into a (formerly!) trusted domain.

Nearly a decade after Panda, eHow’s rankings still haven’t recovered.

Back when I got started with SEO the phrase Indian SEO company was associated with cut-rate work where people were buying exclusively based on price. Sort of like a “I got a $500 budget for link building, but can not under any circumstance invest more than $5 in any individual link.” Part of how my wife met me was she hired a hack SEO from San Diego who outsourced all the work to India and marked the price up about 100-fold while claiming it was all done in the United States. He created reciprocal links pages that got her site penalized & it didn’t rank until after she took her reciprocal links page down.

With that sort of behavior widespread (hack US firm teaching people working in an emerging market poor practices), it likely meant many SEO “best practices” which were learned in an emerging market (particularly where the web was also underdeveloped) would be more inclined to being spammy. Considering how far ahead many Western markets were on the early Internet & how India has so many languages & how most web usage in India is based on mobile devices where it is hard for users to create links, it only makes sense that Google would want to place more weight on end user data in such a market.

If you set your computer location to India, Bing’s search box lists 9 different languages to choose from.

The above is not to state anything derogatory about any emerging market, but rather that various signals are stronger in some markets than others. And competition is stronger in some markets than others.

Search engines can only rank what exists.

“In a lot of Eastern European – but not just Eastern European markets – I think it is an issue for the majority of the [bream? muffled] countries, for the Arabic-speaking world, there just isn’t enough content as compared to the percentage of the Internet population that those regions represent. I don’t have up to date data, I know that a couple years ago we looked at Arabic for example and then the disparity was enormous. so if I’m not mistaken the Arabic speaking population of the world is maybe 5 to 6%, maybe more, correct me if I am wrong. But very definitely the amount of Arabic content in our index is several orders below that. So that means we do not have enough Arabic content to give to our Arabic users even if we wanted to. And you can exploit that amazingly easily and if you create a bit of content in Arabic, whatever it looks like we’re gonna go you know we don’t have anything else to serve this and it ends up being horrible. and people will say you know this works. I keyword stuffed the hell out of this page, bought some links, and there it is number one. There is nothing else to show, so yeah you’re number one. the moment somebody actually goes out and creates high quality content that’s there for the long haul, you’ll be out and that there will be one.” – Andrey Lipattsev – Search Quality Senior Strategist at Google Ireland, on Mar 23, 2016


Impacting the Economics of Publishing

Now search engines can certainly influence the economics of various types of media. At one point some otherwise credible media outlets were pitching the Demand Media IPO narrative that Demand Media was the publisher of the future & what other media outlets will look like. Years later, after heavily squeezing the partner network & promoting programmatic advertising that reduces CPMs by the day, Google is funding partnerships with multiple news publishers like McClatchy & Gatehouse to try to revive the news dead zones even Facebook is struggling with.

“Facebook Inc. has been looking to boost its local-news offerings since a 2017 survey showed most of its users were clamoring for more. It has run into a problem: There simply isn’t enough local news in vast swaths of the country. … more than one in five newspapers have closed in the past decade and a half, leaving half the counties in the nation with just one newspaper, and 200 counties with no newspaper at all.”

As mainstream newspapers continue laying off journalists, Facebook’s news efforts are likely to continue failing unless they include direct economic incentives, as Google’s programmatic ad push broke the banner ad:

“Thanks to the convoluted machinery of Internet advertising, the advertising world went from being about content publishers and advertising context—The Times unilaterally declaring, via its ‘rate card’, that ads in the Times Style section cost $30 per thousand impressions—to the users themselves and the data that targets them—Zappo’s saying it wants to show this specific shoe ad to this specific user (or type of user), regardless of publisher context. Flipping the script from a historically publisher-controlled mediascape to an advertiser (and advertiser intermediary) controlled one was really Google’s doing. Facebook merely rode the now-cresting wave, borrowing outside media’s content via its own users’ sharing, while undermining media’s ability to monetize via Facebook’s own user-data-centric advertising machinery. Conventional media lost both distribution and monetization at once, a mortal blow.”

Google is offering news publishers audience development & business development tools.

Heavy Investment in Emerging Markets Quickly Evolves the Markets

As the web grows rapidly in India, they’ll have a thousand flowers bloom. In 5 years the competition in India & other emerging markets will be much tougher as those markets continue to grow rapidly. Media is much cheaper to produce in India than it is in the United States. Labor costs are lower & they never had the economic albatross that is the ACA adversely impact their economy. At some point the level of investment & increased competition will mean early techniques stop having as much efficacy. Chinese companies are aggressively investing in India.

“If you break India into a pyramid, the top 100 million (urban) consumers who think and behave more like Americans are well-served,” says Amit Jangir, who leads India investments at 01VC, a Chinese venture capital firm based in Shanghai. The early stage venture firm has invested in micro-lending firms FlashCash and SmartCoin based in India. The new target is the next 200 million to 600 million consumers, who do not have a go-to entertainment, payment or ecommerce platform yet— and there is gonna be a unicorn in each of these verticals, says Jangir, adding that it will be not be as easy for a player to win this market considering the diversity and low ticket sizes.

RankBrain

RankBrain appears to be based on using user clickpaths on head keywords to help bleed rankings across into related searches which are searched less frequently. A Googler didn’t state this specifically, but it is how they would be able to use models of searcher behavior to refine search results for keywords which are rarely searched for.

In a recent interview in Scientific American a Google engineer stated: “By design, search engines have learned to associate short queries with the targets of those searches by tracking pages that are visited as a result of the query, making the results returned both faster and more accurate than they otherwise would have been.”

Now a person might go out and try to search for something a bunch of times or pay other people to search for a topic and click a specific listing, but some of the related Google patents on using click data (which keep getting updated) mentioned how they can discount or turn off the signal if there is an unnatural spike of traffic on a specific keyword, or if there is an unnatural spike of traffic heading to a particular website or web page.
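The spike-discounting idea those patents describe can be sketched with ordinary anomaly detection. This is a toy illustration under my own assumptions (a simple z-score against a keyword's recent daily click counts), not Google's actual method:

```python
# Illustrative sketch of discounting a click signal when a keyword sees an
# unnatural traffic spike: compare today's click volume against the
# keyword's recent daily history using a z-score. The cutoff is arbitrary.

from statistics import mean, stdev

def discount_factor(daily_clicks, today, z_cutoff=3.0):
    """Return 1.0 for normal traffic, 0.0 when today's volume is anomalous."""
    if len(daily_clicks) < 2:
        return 1.0  # not enough history to judge
    mu, sigma = mean(daily_clicks), stdev(daily_clicks)
    if sigma == 0:
        return 0.0 if today != mu else 1.0
    z = (today - mu) / sigma
    return 0.0 if z > z_cutoff else 1.0

history = [100, 110, 95, 105, 98, 102, 99]
print(discount_factor(history, 104))   # 1.0 -- normal day, signal kept
print(discount_factor(history, 5000))  # 0.0 -- spike, signal turned off
```

A real system would work with many more signals at once, but the shape is the same: once traffic to a keyword or page departs from its own baseline, the click data stops counting.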

And, since Google is tracking the behavior of end users on their own website, anomalous behavior is easier to detect than it would be across the broader web, where signals are more indirect. Google can take advantage of the wide distribution of Chrome & Android, where users are regularly logged into Google & pervasively tracked, to place more weight on users who have credit card data on file, a long account history with regular normal search behavior, heavy Gmail usage, etc.

Plus there is a huge gap between the cost of traffic & the ability to monetize it. You might have to pay someone a dime or a quarter to search for something & there is no guarantee it will work on a sustainable basis even if you paid hundreds or thousands of people to do it. Any of those experimental searches will have no lasting value unless they influence rank, and even if they do influence rankings the effect might only be temporary. If you bought a bunch of traffic into something genuine Google searchers didn’t like, then even if it started to rank better temporarily the rankings would quickly fall back as real end-user searchers disliked the site relative to other sites which already rank.

This is part of the reason why so many SEO blogs mention brand, brand, brand. If people are specifically looking for you in volume & Google can see that thousands or millions of people specifically want to access your site then that can impact how you rank elsewhere.

Even looking at something inside the search results for a while (dwell time) or quickly skipping over it to have a deeper scroll depth can be a ranking signal. Some Google patents mention how they can use mouse pointer location on desktop or scroll data from the viewport on mobile devices as a quality signal.

Neural Matching

Last year Danny Sullivan mentioned how Google rolled out neural matching to better understand the intent behind a search query.

The above Tweets capture what the neural matching technology intends to do. Google also stated:

we’ve now reached the point where neural networks can help us take a major leap forward from understanding words to understanding concepts. Neural embeddings, an approach developed in the field of neural networks, allow us to transform words to fuzzier representations of the underlying concepts, and then match the concepts in the query with the concepts in the document. We call this technique neural matching.

To help people understand the difference between neural matching & RankBrain, Google told SEL: “RankBrain helps Google better relate pages to concepts. Neural matching helps Google better relate words to searches.”
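The "fuzzier representations" quote above can be made concrete with a toy example. The 3-dimensional vectors below are fabricated for illustration — real embedding models learn hundreds of dimensions from data — but the mechanics (average word vectors into a concept vector, then match query to document by cosine similarity) are the standard embedding approach the quote describes:

```python
# Toy sketch of the neural embedding idea: words become vectors, a query
# and a document are each reduced to a fuzzy "concept" vector, and the two
# are matched by cosine similarity even when they share no words.

import math

EMBEDDINGS = {
    "change":  [0.9, 0.1, 0.0],
    "replace": [0.8, 0.2, 0.1],
    "bulb":    [0.1, 0.9, 0.2],
    "light":   [0.2, 0.8, 0.3],
    "tv":      [0.0, 0.1, 0.9],
}

def concept_vector(words):
    """Average the word vectors into one fuzzy concept representation."""
    vecs = [EMBEDDINGS[w] for w in words if w in EMBEDDINGS]
    return [sum(v[i] for v in vecs) / len(vecs) for i in range(3)]

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb)

query = concept_vector(["change", "bulb"])
doc1 = concept_vector(["replace", "light"])   # same concept, different words
doc2 = concept_vector(["tv"])                 # unrelated concept

print(cosine(query, doc1) > cosine(query, doc2))  # True
```

Note the query and doc1 share zero words, yet match strongly at the concept level — which is exactly the leap "from understanding words to understanding concepts" Google describes.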

There are a couple research papers on neural matching.

The first one was titled A Deep Relevance Matching Model for Ad-hoc Retrieval. It mentioned using Word2vec & here are a few quotes from the research paper:

  • “Successful relevance matching requires proper handling of the exact matching signals, query term importance, and diverse matching requirements.”
  • “the interaction-focused model, which first builds local level interactions (i.e., local matching signals) between two pieces of text, and then uses deep neural networks to learn hierarchical interaction patterns for matching.”
  • “according to the diverse matching requirement, relevance matching is not position related since it could happen in any position in a long document.”
  • “Most NLP tasks concern semantic matching, i.e., identifying the semantic meaning and inferring the semantic relations between two pieces of text, while the ad-hoc retrieval task is mainly about relevance matching, i.e., identifying whether a document is relevant to a given query.”
  • “Since the ad-hoc retrieval task is fundamentally a ranking problem, we employ a pairwise ranking loss such as hinge loss to train our deep relevance matching model.”

The paper mentions how semantic matching falls down when compared against relevancy matching because:

  • semantic matching relies on similarity matching signals (some words or phrases with the same meaning might be semantically distant), compositional meanings (matching sentences more than meaning) & a global matching requirement (comparing things in their entirety instead of looking at the best matching part of a longer document); whereas,
  • relevance matching can put significant weight on exact matching signals (weighting an exact match higher than a near match), adjust weighting based on query term importance (one word or phrase in a search query might have a far higher discrimination value & deserve far more weight than the next) & leverage diverse matching requirements (allowing relevancy matching to happen in any part of a longer document).
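The DRMM paper's way of preserving exact matching signals is a "matching histogram": each query term's similarity to every document term is bucketed, with exact matches kept in their own bin so the model can weight them separately from near matches. Here is a rough sketch under simplifying assumptions (the bin count and the similarity scores are invented for illustration):

```python
# Rough sketch of a DRMM-style matching histogram: bucket the similarity
# scores between one query term and every document term, reserving a
# dedicated final bin for exact matches (similarity == 1.0) so exact and
# near matches can be weighted differently downstream.

def matching_histogram(similarities, n_bins=5):
    """Bucket similarity scores in [-1, 1); exact matches get a final bin."""
    hist = [0] * (n_bins + 1)
    for s in similarities:
        if s == 1.0:
            hist[-1] += 1                          # dedicated exact-match bin
        else:
            hist[int((s + 1.0) / 2.0 * n_bins)] += 1  # map [-1, 1) to bins
    return hist

# Similarities of one query term against a six-term document:
sims = [1.0, 0.95, 0.2, -0.3, 0.6, 1.0]
print(matching_histogram(sims))  # [0, 1, 0, 1, 2, 2]
```

Because the histogram is position-free, a match counts the same wherever it occurs in a long document — the "diverse matching requirement" from the quotes above.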

Here are a couple images from the above research paper.

And then the second research paper is Deep Relevance Ranking Using Enhanced Document-Query Interactions:

“interaction-based models are less efficient, since one cannot index a document representation independently of the query. This is less important, though, when relevancy ranking methods rerank the top documents returned by a conventional IR engine, which is the scenario we consider here.”

That same sort of re-ranking concept is being better understood across the industry. There are ranking signals that earn some base level ranking, and then results get re-ranked based on other factors like how well a result matches the user intent.
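A minimal sketch of that two-stage idea: a cheap base score selects the top candidates, then a second pass re-orders only those few by an additional factor. The scoring functions below (link counts and a stand-in "intent match" score) are illustrative placeholders, not actual ranking signals:

```python
# Two-stage ranking sketch: rank everything by a cheap base score, then
# re-rank only the top k by a more expensive relevance/intent factor.

def rerank(docs, base_score, intent_score, k=3):
    """Rank all docs by base_score, then re-rank only the top k."""
    top = sorted(docs, key=base_score, reverse=True)[:k]
    return sorted(top, key=lambda d: (intent_score(d), base_score(d)),
                  reverse=True)

docs = [
    {"id": "a", "links": 900, "intent": 0.2},
    {"id": "b", "links": 800, "intent": 0.9},
    {"id": "c", "links": 700, "intent": 0.5},
    {"id": "d", "links": 100, "intent": 1.0},  # never reaches stage two
]
order = rerank(docs, lambda d: d["links"], lambda d: d["intent"])
print([d["id"] for d in order])  # ['b', 'c', 'a']
```

Note how document "d" matches the intent perfectly but never gets re-ranked because its base signals were too weak to make the first cut — which is why earning some base level of ranking signals still matters.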

Here are a couple images from the above research paper.

For those who hate the idea of reading research papers or patent applications, Martinibuster also wrote about the technology here. About the only part of his post I would debate is this one:

“Does this mean publishers should use more synonyms? Adding synonyms has always seemed to me to be a variation of keyword spamming. I have always considered it a naive suggestion. The purpose of Google understanding synonyms is simply to understand the context and meaning of a page. Communicating clearly and consistently is, in my opinion, more important than spamming a page with keywords and synonyms.”

I think one should always consider user experience over other factors, however a person could still use variations throughout the copy & pick up a bit more traffic without coming across as spammy. Danny Sullivan mentioned the super synonym concept was impacting 30% of search queries, so there are still a lot which may only be available to those who use a specific phrase on their page.

Martinibuster also wrote another blog post tying more research papers & patents to the above. You could probably spend a month reading all the related patents & research papers.

The above sort of language modeling & end user click feedback complement links-based ranking signals in a way that makes it much harder to luck one’s way into any form of success by being a terrible speller or just bombing away at link manipulation without much concern toward any other aspect of the user experience or the market you operate in.

Pre-penalized Shortcuts

Google was even issued a patent for predicting site quality based upon the N-grams used on the site & comparing those against the N-grams used on other established sites where quality has already been scored via other methods: “The phrase model can be used to predict a site quality score for a new site; in particular, this can be done in the absence of other information. The goal is to predict a score that is comparable to the baseline site quality scores of the previously-scored sites.”
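The patent's phrase-model idea can be sketched simply: score a new site by the n-grams it shares with already-scored sites. The tiny phrase model below is fabricated for illustration — in the patent, the model is learned from baseline quality scores across many previously-scored sites:

```python
# Hedged sketch of the phrase-model idea: predict a new site's quality from
# the average quality observed on already-scored sites using the same
# n-grams. The phrase scores here are invented for the example.

def ngrams(text, n=2):
    words = text.lower().split()
    return [" ".join(words[i:i + n]) for i in range(len(words) - n + 1)]

# Hypothetical average quality of scored sites containing each phrase:
PHRASE_MODEL = {
    "in conclusion": 0.4,
    "click here": 0.2,
    "our methodology": 0.8,
    "peer reviewed": 0.9,
}

def predict_site_quality(text, default=0.5):
    """Average the known phrase scores found in the site's text."""
    scores = [PHRASE_MODEL[g] for g in ngrams(text) if g in PHRASE_MODEL]
    return sum(scores) / len(scores) if scores else default

spun = "click here now and click here again in conclusion click here"
solid = "our methodology was peer reviewed before publication"
print(predict_site_quality(spun) < predict_site_quality(solid))  # True
```

The point for the PLR shortcut below: a site assembled from phrases that historically appear on low-quality sites can score badly before it has any links or user data at all.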

Have you considered using a PLR package to generate the shell of your site’s content? Good luck with that as some sites trying that shortcut might be pre-penalized from birth.

Navigating the Maze

When I started in SEO one of my friends had a dad who is vastly smarter than I am. He advised me that Google engineers were smarter, had more capital, had more exposure, had more data, etc etc etc … and thus SEO was ultimately going to be a malinvestment.

Back then he was at least partially wrong because influencing search was so easy.

But in the current market, 16 years later, we are near the inflection point where he would finally be right.

At some point the shortcuts stop working & it makes sense to try a different approach.

The flip side of all the above changes is that as the algorithms have become more complex, they have gone from being a headwind to people ignorant about SEO to being a tailwind for those who do not focus excessively on SEO in isolation.

If one is a dominant voice in a particular market, if they break industry news, if they have key exclusives, if they spot & name the industry trends, if their site becomes a must-read & amounts to a habit … then they perhaps become viewed as an entity. Entity-related signals help them, and the same signals that work against people who merely lucked into a bit of success become a tailwind rather than a headwind.

If your work defines your industry, then any efforts to model entities, user behavior or the language of your industry are going to boost your work on a relative basis.

This requires sites to publish frequently enough to be a habit, or publish highly differentiated content which is strong enough that it is worth the wait.

Those which publish frequently without being particularly differentiated are almost guaranteed to eventually walk into a penalty of some sort. And each additional person who reads marginal, undifferentiated content (particularly if it has an ad-heavy layout) is one additional visitor that site is closer to eventually getting whacked. Success becomes self regulating. Any short-term success becomes self defeating if one has a highly opportunistic short-term focus.

Those who write content that only they could write are more likely to have sustained success.

from SEO Book http://www.seobook.com/keyword-not-provided-it-just-clicks