These are my links for January 17th through January 20th:
- PG&E Electrical System Outage Map – This map shows the current outages in our 70,000-square-mile service area. To see more details about an outage, including the cause and estimated time of restoration, click on the color-coded icon associated with that outage.
- Twitter.com vs The Twitter Ecosystem – Fred Wilson comments on some data from John Borthwick indicating Twitter ecosystem use = 3-5x Twitter.com directly.
"John's chart estimates that Twitter.com is about 20mm uvs a month in the US (comScore has it at 60mm uvs worldwide) and the Twitter ecosystem at about 60mm uvs in the US.
That says that across all web services, not just AVC, the Twitter ecosystem is about 3x Twitter.com. And on this blog, whose audience is certainly power users, that ratio is 5x."
- Chris Walshaw :: Research :: Partition Archive – Welcome to the University of Greenwich Graph Partitioning Archive. The archive consists of the best partitions found to date for a range of graphs and its aim is to provide a benchmark, against which partitioning algorithms can be tested, and a resource for experimentation.
The partition archive has been in operation since the year 2000 and includes results from most of the major graph partitioning software packages. Researchers developing experimental partitioning algorithms regularly submit new partitions for possible inclusion.
Most of the test graphs arise from typical partitioning applications, although the archive also includes results computed for a graph-colouring test suite [Wal04] contained in a separate annex.
The archive was originally set up as part of a research project into very high quality partitions and authors wishing to refer to the partitioning archive should cite the paper [SWC04].
- Twitter’s Crawl « The Product Guy – "A list of incidents that affected the Page Load Time of the Twitter product, distinguishing between total downtime, and partial downtime and information inaccessibility, based upon the public posts on Twitters blog.
I did my best to not double count any problems, but it was difficult since many of the problems occur so frequently, and it is often difficult to distinguish, from these status blog posts alone, between a persisting problem being experienced or fixed, from that of a new emergence of a similar or same problem. Furthermore, I also excluded the impact on Page Load Time arising from scheduled maintenance/downtime – periods of time over which the user expectation would be most aligned with the product’s promise of Page Load Time. "
- Soundboard.com – Soundboard.com is the web's largest catalog of free sounds and soundboards – in over 20 categories, for mobile or PC. 252,858 free sounds on 17,171 soundboards from movies to sports, sound effects, television, celebrities, history and travel. Or build, customize, embed and manage your own
These are my links for April 28th from 05:35 to 14:24:
- Official Google Blog: Adding search power to public data – Interesting. Wonder if the underlying public data sets will eventually become available on Google App Engine as well, sort of like the public data sets available for use with Amazon EC2 applications.
- MySQL And Search At Craigslist – Jeremy Zawodny's slides on MySQL, Sphinx, and free text search implementation at Craigslist, from last week's MySQL conference.
- Skew, The Frontend Engineer’s Misery @ Irrational Exuberance – For mashups and the like, the distinction between a FE engineer and web dev is rather small in terms of technical skills; they are both using the same skillset, they are both interacting with APIs, and so on. However, there are important distinctions between the two: 1. web developers tend to move in small groups or as individuals, whereas fe engineers work in larger groups, 2. web developers tend to design a product on top of an existing backend service (api, etc), while fe engineers are usually working in parallel with the backend being developed.
- Study: Twitter Audience Does Not Have A Return Policy – Over 60 percent of people who sign up to use the popular (and tremendously discussed) micro-blogging platform do not return to using it the following month, according to new data released by Nielsen Online. In other words, Twitter currently has just a 40 percent retention rate, up from just 30 percent in previous months–indicating an “I don’t get it factor” among new users that is reminiscent of the similarly-over hyped Second Life from a few years ago.
- Hey Americans, Appreciate Your Freedom Of Speech : NPR – Firoozeh Dumas on the underappreciated freedoms of speech and expression we have in the US vs journalists and bloggers in Iran.
These are my links for February 21st from 13:59 to 21:55:
- Non Sequitur — Gocomics.com – "Hi. My name is Bob, and I'm a Twitter addict…"
- A Tutorial on Support Vector Machines for Pattern Recognition – Christopher J.C. Burges (PDF) – Appeared in: Data Mining and Knowledge Discovery 2, 121-167, 1998. The tutorial starts with an overview of the concepts of VC dimension and structural risk minimization. We then describe linear Support Vector Machines (SVMs) for separable and non-separable data, working through a non-trivial example in detail. We describe a mechanical analogy, and discuss when SVM solutions are unique and when they are global. We describe how support vector training can be practically implemented, and discuss in detail the kernel mapping technique which is used to construct SVM solutions which are nonlinear in the data. We show how Support Vector machines can have very large (even infinite) VC dimension by computing the VC dimension for homogeneous polynomial and Gaussian radial basis function kernels. While very high VC dimension would normally bode ill for generalization performance, there are several arguments which support the observed high accuracy of SVMs, which we review.
- Data Mining Research – dataminingblog.com: Data Miners on Twitter – A list of data mining people on twitter.
- YouTube – The Crisis of Credit Visualized – Part 1 – Nice animated video attempting to present a simplified explanation of the credit crisis and the relationship between home mortgage lending, bank leverage, and risk.
- “10 Obstacles to Cloud Computing” by UC Berkeley & How GoGrid Hurdles Them | GoGrid Blog – Another commentary on the recent UCB cloud computing overview paper
This evening I’m rolling out a long overdue update to the blogging platform. It’s been a little complicated, because I’ve been running a heavily customized WordPress 1.5.2 for a long time, and there have been a lot of changes since then to WordPress, various plugins, and the underlying database (the current release is 2.7.1).
The new version is based on Atahualpa, which has many customizable options. The Recent Posts, Tag Cloud, Recent Links, Twitter status, and permalinks are all working as before. The new template doesn’t have a place for the randomly selected banner thumbnail images from my Flickr account, but does incorporate a larger random image at the top, which currently selects from a few photos I picked out of my snapshot collection. I may figure out some other way of sharing some photos here. I’ve also added a random quote widget. You have to provide your own collection of quotes, so there aren’t many in there yet.
It might be a little slower than the old platform for a while until I get the caching set up; all those customizable options use a lot of database queries.
Let me know what you think, and if you have any suggestions or are having problems viewing things. I’ve mostly been looking at this with Firefox 3, so people with other browsers may have a different experience.
I haven’t been posting here in a while, but think I will try picking up the keyboard here a little more frequently. I added a twitter box on the sidebar a while back, as I have been experimenting with that more, along with friendfeed, facebook, etc. I like the brevity and immediacy of twitter, but not everything fits in 140 characters. You can find me on twitter and friendfeed as “hjl”, also on Facebook.
Hello, dear readers. I had lunch with some friends the other day and they mentioned that I hadn’t posted in a while. Sorry I haven’t been paying much attention to this site lately, other than knocking back comment and link spam. I recently saw that Google Reader is starting to report subscription statistics, which prompted me to take a look. It’s been a while since I looked over the server logs, and I was surprised at the number of RSS subscriptions that have accumulated (i.e. it’s more than I can account for by friends, family, and random acquaintances). I didn’t know you were out there, but now that you’re decloaked and I can see you, I wanted to say hello.
I ended up taking a break from posting for a few weeks (since the beginning of the year). Not by coincidence, I’ve also ramped up my running since the beginning of the year, prepping for this year’s Big Sur Marathon, while holding other obligations roughly constant.
Anyway, I think I’ll try some different approaches to posting here and see how it works out.
I’m amazed by the volume of discussion about Amanda Congdon, Andrew Baron, and the history and future (or not) of Rocketboom. I’m looking forward to seeing what either or both of them do going forward, like everyone else, but have nothing to add to the discussion other than best wishes.
However…the flap is also having the side effect of showing that just about everyone I “know” online has been watching Rocketboom. Check out the Technorati search for pointers to Amanda’s departure post and see how many names you recognize. Who knew?
Update 07-06-2006 16:20 PDT: Rocketboom the comic
Update 07-12-2006 21:42 PDT: Rocketboom is back on the air with new host Joanne Colan, here’s her debut.
My quick notes on trying out Google Reader:
- The AJAX user interface is whizzy and fun, and is similar to an e-mail reader.
- Importing feeds is really slow.
- Keyboard navigation shortcuts are great.
- Searching through your own feeds or for new feeds is convenient using Google
- I hate having a single item displayed at a time.
- “Blog This” action is handy, if you use Blogger. They could easily make this go to other blogging services later.
- This could be a good “starter” service for introducing someone to feed readers, but
- No apparent subscription export mechanism
- Doesn’t deal well with organizing a large number of feeds.
I started importing the OPML subscription file from Bloglines into Google Reader on Friday evening. I have around 500 subscriptions in that list, and I’m not sure how long it ended up taking to import. It was more than 15 minutes, which was when I headed off to bed, and completed sometime before this afternoon.
I love having keyboard navigation shortcuts. The AJAX-based user interface is zippy and “fun”. Unfortunately, Google Reader displays articles one at a time, a little like reading e-mail. I’m in the habit of scanning sections of the subscription lists to see which sections I want to look at, then scanning and scrolling through lists of articles in Bloglines. Even though this requires mousing and clicking, it’s a lot faster than flashing one article at a time in Google Reader.
I don’t think the current feed organization system works on Google Reader, at least for me. My current (bad) feed groupings from Bloglines show up on Google Reader as “Labels” for groups of feeds, which is nice. It’s hard to just read a set of feeds, though. Postings show up in chronological order, or by relevance. This is totally unusable for a large set of feeds, especially when several of them are high-traffic, low-priority (e.g. Metafilter, del.icio.us, USGS earthquakes). If I could get the “relevance” tuned by context (based on label or tag?) it might be useful.
When you add a new feed, it starts out empty, and appears to add articles only as they are posted. It would be nice to have them start out with whatever Google has cached already. I’m sure I’m not the first subscriber to most of the feeds on my list.
On the positive side, this seems like a good starting point for someone who’s new to feed readers and wants a web-based solution. It looks nice, people have heard of Google, and the default behaviors probably play better with a modest number of feeds. Up to this point, I’ve been steering people to Bloglines, and more recently pointing them at Rojo.
I wish the Bloglines user interface could be revised to make it quicker to get around. I really like keyboard navigation. I can also see some potential in the Google Reader’s listing by “relevance” rather than date listing, and improved search and blogging integration. I’m frequently popping up another window to run searches while reading in Bloglines.
Google Reader doesn’t seem like it’s quite what I’m looking for just now, but I’ll keep an eye on it.
I think I want something to manage even more feeds than I have now, but where I’m reading a few regularly, a few articles from a pool of feeds based on “relevance”, and articles from the “neighborhood” of my feeds when they hit some “relevance” criteria. I’d also like to search my pool of identified / tagged feeds, along with some “neighborhood” of feeds and other links. I think a lot of this is about establishing context, intent, and some sort of “authoritativeness”, to augment the usual search keyword matching.
Looks like Google Blog Search took out the redirects that were breaking the referrer headers.
Now the search keywords are visible again. Here’s a typical log entry:
xxx.xxx.xxx.xxx – - [15/Sep/2005:15:58:13 -0700]
HTTP/1.1″ 200 26981 “http://blogsearch.google.com/blogsearch?hl=en&q=odeo&btnG=Search+Blogs&scoring=d”
“Mozilla/5.0 (Macintosh; U; PPC Mac OS X Mach-O; en-US; rv:1.7.10) Gecko/20050716
Blogger Buzz says the redirect was in place during development to help keep the project under wraps.
Feature request to Google Blog Search team: please add search query info to the referrer string.
Lots of coverage this morning from people trying out Google Blog Search. (Search Engine Watch, Anil Dash, lots more)
I’m seeing some traffic from Google Blog Search overnight, but it looks like they don’t send the search query in the referrer. Here’s a sample log entry:
xxx.xxx.xxx.xxx – - [14/Sep/2005:00:51:09 -0700] “GET /weblog/archives/2005/09/14/google-blog-search-launches/ HTTP/1.1″ 200 22964 “http://www.google.com/url?sa=D&q=http://www.hojohnlee.com/weblog/archives/2005/09/14/google-blog-search-launches/” “Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.7.8) Gecko/20050511 Firefox/1.0.4″
So there’s no way to know the original search query. I have a pretty good idea how the overnight traffic looking for the Google post got here, but there are also people landing on fairly obscure pages here and I’m always curious how they found them. I’m sure the SEO crowd will be all over this shortly.
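For the curious, here is a rough sketch of how I would pull search keywords out of a referrer string. It assumes the query travels in a `q` parameter, as it does in the two Google referrers that appear in my logs; the function name `search_query_from_referrer` is my own invention for illustration, not part of any log analysis package.

```python
from urllib.parse import urlparse, parse_qs


def search_query_from_referrer(referrer):
    """Return the value of the `q` parameter in a referrer URL, or None."""
    params = parse_qs(urlparse(referrer).query)
    values = params.get("q")
    return values[0] if values else None


# Referrer sent while the redirect was in place: `q` holds the target URL.
redirect_referrer = (
    "http://www.google.com/url?sa=D&q=http://www.hojohnlee.com/weblog/"
    "archives/2005/09/14/google-blog-search-launches/"
)

# Referrer sent straight from a results page: `q` holds the search keywords.
results_referrer = (
    "http://blogsearch.google.com/blogsearch?hl=en&q=odeo"
    "&btnG=Search+Blogs&scoring=d"
)

for ref in (redirect_referrer, results_referrer):
    print(search_query_from_referrer(ref))
```

With the redirect in the way, the only thing in `q` is the address of the page being visited, which I already know. A referrer from the results page itself would carry the keywords.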
There have been a number of comments that Google Blog Search is sort of boring, but I’m finding that there’s good novelty value in having really fast search result pages. Haven’t used it enough to get a sense of how good the coverage is, or how fast it updates, but it will be a welcome alternative to Technorati and the others.
Update 09-14-2005 14:01 PDT: These guys think Google forgot to remove some redirect headers.
Update 09-14-2005 23:25 PDT: Over at Blogger Buzz, Google says they left the redirect in by accident, will be taking them out shortly:
“After clicking on a result in Blog Search, I’m being passed through a redirect. Why?”
Sadly, this wasn’t part of an overly clever click-harvesting scheme. We had the redirects in place during testing to prevent referrer-leaking and simply didn’t remove them prior to launch. But they should be gone in the next 24 hours … which will have the advantage of improving click-through time.
Google’s entry into blog search launched this evening, go try it out or read their help page.
This will be interesting competition for the existing blog search companies. It definitely responds fast at the moment, let’s see how it holds up when the next flash news crowd turns up…
via Niall Kennedy and Kevin Burton
We’re using the WordPress WP-ContactForm plugin by Ryan Duff and Firas Durri on some of our sites. During the past few weeks, there has been an increasing volume of attempted spam e-mail through the contact form. The latest update (1.3) has additional validation on the form input to prevent the injection of MIME enclosures, additional mail header fields, etc.
Here’s a recent discussion thread on the WordPress support forum. Firas says:
For those curious, the spamming/attaching is done via injecting extra headers alongwith the ‘From’ field. It’s not done using the actual html interface, but via other agents posting to the script.
The update announcement is here; the latest version is available on the plugin project page.
If you’re running an earlier version of the WordPress Contact Form plugin, this update should block the latest round of spam agents attempting to abuse the older version.
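To illustrate the kind of check involved, here is a minimal sketch in Python. This is not the plugin's code (the plugin is written in PHP, and I haven't reproduced its validation here); it only shows the general idea that a value destined for a mail header should never contain a line break. The names `is_safe_header_value` and `build_from_header` are hypothetical.

```python
import re

# A CR or LF inside a header value lets a client start a new header line,
# which is how extra fields such as Bcc or Content-Type get injected.
_HEADER_BREAK = re.compile(r"[\r\n]")


def is_safe_header_value(value):
    """Return True if `value` fits on a single mail header line."""
    return _HEADER_BREAK.search(value) is None


def build_from_header(name, email):
    """Build a From header, refusing input that contains line breaks."""
    if not (is_safe_header_value(name) and is_safe_header_value(email)):
        raise ValueError("line break in header field; possible header injection")
    return f"From: {name} <{email}>"


print(build_from_header("Ann", "ann@example.com"))
```

A posted "From" value such as `ann@example.com\r\nBcc: someone@example.com` would be rejected here rather than passed along to the mailer.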
The past couple of days I’ve received a few hundred comment spams from “Kelly Ronald”, “John Reed”, “Nicholas Truman”, “Peter Back”, and “Alexander Kolt”, from IP addresses in Mexico, Taiwan, France, Australia, and California, among others. Most of them are tagged by the stopword list, but it’s a reminder that I should revisit the antispam implementation while I’m reworking the site. For now, I’m making good use of the bulk comment edit feature in WordPress.
Jeff Clavier appears to have gotten the same treatment:
If you are like me, you got blasted by “friendly” comments from Alexander Kolt, Nicolas Trumen, John Reed, Peter Back, and Kelly Ronald – all praising your blog, your posts and yourself.
This new generation of comment spam is more clever than previous but for one thing – the fact that spammers are picking old posts that are not commented upon anymore. Otherwise they use legit blogs/blog posts and in a few cases, it is not even clear which web site they are “pimping”
Jeff also turned up a security blog with additional info:
We have experienced a “massive attack” of SPAM on our blogging system from various hosts all pointing to two websites:
http://www.cosmicbuddha.com/blog/archives/ 001169.html (I have broken the URL intentionally)
http://anthony.ianniciello.net/blog/archives/ 000079.html (I have again broken the URL intentionally)
The comments contained very brief sentences and links to the above web sites.
From what it looks like it was an act of an attack against automatic blacklisting and un-moderated comments, probably not conducted by authors’ of the above blogs.
The author of at least one of the sites linked to in this spam run doesn’t seem to be responsible; he’s got a comment on the post linked above, and one of his posts has effectively been taken over by the discussion about how he ended up as one of the two target links in the posted spam comments.
This batch of spam seems a bit random. The typical spam postings I see here try to link to spamblogs and commercial sites. None of the linked sites in this set appear to benefit from the spam. So perhaps this is a test run for something in development. Wonderful thought.
Separately, I’ve also seen a number of attempts to send spam e-mail through a hard coded PHP mail form. Bill Lazar mentions seeing some similar traffic on his site:
In the last few days, though, somebody or someone’s script has found the form and is filling it out repeatedly. I guess the idea is that a useful percentage of web forms will trigger an automated response that’s of interest to the programmer though just what isn’t clear to me. The script fills in the form fields with the same data, an email address of a four or five character random group of letters (such as xtpku) at this domain.
The bad formmail posts are originating from 22.214.171.124 and 126.96.36.199, among others. I don’t think it’s actually succeeding in getting mail sent anywhere, but it’s clogging up the administrative mailbox with failure messages.
Update 09-14-2005 16:20 PDT: Updating to WP-Contact Form 1.3 seems to help. Still seeing attempted spam from new IP addresses, including 188.8.131.52, 184.108.40.206, 220.127.116.11, 18.104.22.168, and 22.214.171.124. Hopefully they’ll figure out that it’s not working and move on.
The main sessions on Thursday and Friday were in the larger hall downstairs at the Palace Hotel. This event was pitched as a “Business” blogging event, and the audience seemed to be predominantly PR, marketing, and advertising folks. The general mind set was something like “what exactly is this blog stuff and what do I need to do about it?” In a show of hands, a significant fraction (more than half?) of the attendees were not blogging, either for their business or personally, but more than half were occasionally reading blogs.
A lot of business (and human) behavior can be attributed to a combination of fear and greed. In this case, some of the “fear” would be:
- Losing control or being blindsided by negative PR. The Kryptonite bike lock hack was frequently cited in discussions.
- Legal exposure if my employees are blogging, or PR exposure if negative comments or hate speech are left by commenters.
On the “greed” front:
- Blogging is new, and could become a competitive advantage (or disadvantage, if the competition is doing it) for existing products and services. Ford vs GM was cited several times, also Clip and Seal.
- Opportunities to recruit new customers, influence consumers through more authentic word-of-mouth vs mass advertising.
Assuming that this crowd is representative of the interest and awareness of businesses, there’s a long way to go in educating companies about the changing opportunity, risks, and characteristics of blogs and syndicated web publishing. There’s also a usability / explainability issue for the software and services vendors. I’m not fond of Microsoft’s “Web Feeds” push, but it’s representative of the sort of changes that will be needed to get out of technology-focused discussions and into conversation about potential business value among the mainstream, vs early-adopter market.
WordPress demo and announcement of wordpress.com (hosted WordPress, like TypePad)
Movable Type 3.2 demo and release
The wireless service on Thursday was extremely unstable, probably due to the large number of users. On Friday, the Anchorfree team turned off the RADIUS authentication which seemed to improve the availability of the connection.
Lastly, Microsoft came up with some nice Ogio messenger bags. One of my old bags just bit the dust a couple of weeks ago, and I’d just started looking for one, so I think I’ll give this one a try for a while.
See also: BBS05 – Wednesday
The blog outsourcing topic has rolled along while I’ve been spending the day at the Blog Business Summit, listening to discussions on commercializing blogs. There’s now a post about it (Outsourcing bloggers in China) at CNET, which turned up a few other skeptics, and it’s looking like the Blogoriented guys are probably a hoax.
Despite that, I also think it’s inevitable that we’ll see at least a couple of real projects along these lines within a year, not aimed at simulating teenaged girls, but rather at building blog networks, filled and buzzed by creating inexpensive original content and editing search feeds that target specific niches.
David Sifry at Technorati has a good summary on the growing problems of spam blogs and fake blogs, and all the search engines are likely to make progress against what are essentially the next generation of link farms. Unfortunately, as discussed in this afternoon’s sessions on web advertising and affiliate models, if you can get traffic, there’s potential for a lot of money to be made by simple manipulations of the system, at least until the search engines improve. Content picked up by the blog search engines gets indexed immediately, leaving a way around some of the sandboxing and other mechanisms used by Google and others, and makes profitable links visible immediately.
It’s cheap and apparently effective to implement spam and fake blogs. I’ve noticed the volume of junk e-mail is decreasing, while the number of spam blogs in search results seems to be increasing. It’s going to take cooperation among multiple parties to fix this, but everyone recognizes this as a problem, so it’s going to get better. (Here’s Mark Cuban’s take.)
I think that a follow on issue is that genuinely “original” content, in the “first author” sense, rather than in the “new idea” sense, can probably be reliably cranked out through a well defined process. Think of something like an Indian call center or coding shop crossed with a daily news bureau, supervised by an editor who picked topics with some guidance from Wordtracker, Google and others. You’d get low cost, original writing, around an editorially consistent, topically relevant set of themes, and perhaps even with some interesting domain expertise, all tuned to be informative and keyworded to be search engine friendly.
Many of the same processes used at Wipro, Infosys, and other software and BPO outsourcers could be adapted to this application. Why cheat the search engine rankings when you can just reduce the cost of production and actually receive ranking benefit when the search engines get better at filtering for contextually better results and get rid of the “really fake” blogs? The Weblogs Inc blog network model seems to be working so far – Jason Calacanis says they’ve just hit a $1M annual ad revenue rate. Reducing the content production costs can’t hurt. I’m sure they could apply some of these ideas, if they haven’t already, and if they don’t, some other new blog network will certainly try.
This approach to farming out the process-oriented writing tasks should apply equally to a number of periodicals, such as magazines and newspapers. The difference between the news content in many newspapers is already often just the local editor’s preferences on the AP or Reuters newsfeeds and what fit in between the committed ad inches.
I don’t think this sort of blog or content outsourcing would be “bad” or “evil” in the sense of creating lower quality content, at least in some topic domains, since a pool of skilled professionals already exists offshore, and is growing rapidly. If you got a good editor in place, it might even improve the overall quality of online content. It’s not misrepresentation, unless you tried to pass off your authors as being something they’re not. But I wouldn’t even bother with attempting the nuances of local US culture with a staff of offshore bloggers, despite the availability of cultural indoctrination programs they run call center trainees through. That would work about as well as having US bloggers cover cricket or Bollywood gossip or Korean K-pop singers for their respective local audiences.
This seems to leave American pop culture as a secure niche for a while. Unfortunately, I’m incredibly bad at celebrity gossip. Although, now that I think about it, I did meet Cher once at her house in Malibu…
Putting on my evil genius hat, here’s a hypothetical approach for building an astroturfing blog empire, filled with posts from simulated teenaged (18-35) girls. Start by extracting common phrases, topics, and contexts from some LiveJournal and MySpace blogs. Next, build some auto-blogging agents resembling Weizenbaum’s Eliza program crossed with some modern chatterbots. Finally, set it loose on LiveJournal, Xanga, and MySpace and have it start forming its own blogrings and online cliques, responding to filtered inputs from comments, selected feeds, and topical news, biased for the current hot keywords and with statistically plausible content and linkage…any Emacs Lisp and SQL hackers want to take this on?
See also: Outsource your Blog, Reasons I Still Read Newspapers
Update 08-19-2005 12:32 – some discussion at My Heart’s in Accra
Update 08-27-2005 00:10 – See also Goofy algorithm generates web page about “Prostitute Phobia” (at BoingBoing), which comments on this site, which is one of a collection of automatically generated pages.
The Blog Business Summit is actually on Thursday and Friday, but this afternoon there was an introductory session on blogging for business, led by Dave Taylor.
I’m not in the core target audience for this session, since I’m already involved in various blogging projects, but thought it would be interesting to talk with people and to hear their questions, concerns, and goals with respect to blogging.
It’s also useful to hear someone else try to explain blogs, RSS, web services, et al. I regularly find myself searching for a common starting context when talking about these topics with people who aren’t already somewhat involved in internet and web culture, especially if they’re from non-technology businesses. It’s remarkable that the tools have become as widespread as they are, given the impenetrable names.
I made good use of the free wireless service provided by AnchorFree. They’re running a captive portal that requires registration, so you’ll need to sign up for an account, but it’s nice to have. My notebook picked up three access points, all at high signal strength, probably installed in the room somewhere. Logged the location in Plazes.
Wireless performance was okay to sluggish, I’m sure it’s a bit overloaded; something like half the people in the room had notebook computers. My session got dropped a few times, which reset my SSH sessions and required logging in on AnchorFree again using the browser. Lots of continuous partial attention going on in that room. Plus a few fully distracted people trying to get their wireless connections going. Perhaps they should hire those blog outsourcing guys.
This post is tagged (bbs05). Dave mentioned in his talk that he doesn’t like tags, and thinks they’ll go away as search engines improve. I partially agree. User tags don’t scale well and in their present incarnation are highly vulnerable to spam, but within relatively small communities, they can be an effective supplement to normal search engines. (Example – I could tag a collection of poetry as “haiku”, or “cinquain”, making it visible where the raw text might otherwise be difficult to locate through search.)
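Here is a toy sketch of the poetry example. The posts, their text, and the helper names `build_tag_index` and `text_search` are all made up for illustration; the point is only that a tag lookup can surface something a plain keyword match on the body would miss.

```python
from collections import defaultdict

# Hypothetical posts: the body text never mentions the name of the verse form.
posts = {
    "post-1": {"text": "Morning fog lifts slowly over the bay", "tags": ["haiku"]},
    "post-2": {"text": "Notes from the conference hallway", "tags": ["bbs05"]},
}


def build_tag_index(posts):
    """Map each lowercased tag to the set of post ids that carry it."""
    index = defaultdict(set)
    for post_id, post in posts.items():
        for tag in post["tags"]:
            index[tag.lower()].add(post_id)
    return index


def text_search(posts, word):
    """Return ids of posts whose body text contains `word` (case-insensitive)."""
    word = word.lower()
    return {pid for pid, post in posts.items() if word in post["text"].lower()}


tag_index = build_tag_index(posts)
print(text_search(posts, "haiku"))   # keyword match on the body text
print(tag_index["haiku"])            # lookup by tag
```

The body search comes back empty for "haiku", while the tag lookup finds the post. In a small community where people tag honestly, that is a useful supplement; at web scale the same index is an open invitation to spammers.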
The coffee largely ran out after the break, hopefully they’ll have a larger supply tomorrow.
I had been speculating on something like this after reading an article last month about outsourcing personal website maintenance to India.
via Marginal Revolution, Content to Go
As I write this entry my partner Jeff is in the air on the way to our office in Shanghai. What Jeff and I are doing is simple but as far as I know we are the first. We are outsourcing blogs to China.
Our general business model is a two tiered effort to hire Chinese citizens to write blogs en masse for us at a valued wage. The first tier is to create original blogs. These blogs will pop up in various areas of the net and appear to the unknowing reader to be written by your standard American. Our short term goal for these original blogs is to generate a steady stream of revenue through traditional blog advertising like google adwords. We estimate that our current blogforce of 25 can support around 500 unrelated blogs. Hopefully a few of those will be hits. The long term goal is to generate a large untraceable astroturfing mechanism for launching of various products. When a vendor needs to promote a new product to the internet demographic we will be able to create a believable buzz across hundreds of “reputable” blogs and countless message boards. We can offer a legitimacy to advertisers that doesen’t exist anywhere else.
The second tier of our plan is a blog vacation service where our employees fill in for established bloggers who need to take a break from regular posting. As all bloggers know, an unupdated blog is quickly forgotten. For a nominal fee we can provide seamless integration of filler.
I’m not entirely sure that the project is real; they claim to have raised $5 million US, yet the domain was registered just 3 days ago. Still, this caught my eye because I think there are some real possibilities for something like this.
Personally, I don’t have a problem with commercial blogging or professional blogging. However…their plan calls for deliberate misrepresentation of commercial interests as personal ones, on a large scale. This could be blog spam taken to the next level.
If they’re really heading off to put together an offshored blog content network, I think it could be done without heading straight for the “astroturf” market, which might give it a slower start but longer legs.
In my quick take on this idea, I’d probably choose India or the Philippines over China for basic English language skills, since the target audience is in the US, and have content editors with actual domain knowledge working with lower cost writers. This might not work for simulating teen LiveJournal sites, but should fit pretty well for topical blogs of most sorts. Hmm. That sounds like the direction the newspaper and magazine business is already heading…
Update 08-19-2005 – Followed up with more comments, plus ideas on how to build the evil astroturfing network in a new post.
I’m doing a little experimenting with AdSense. So far most of my pages come up with ads for “Start your blog now” or “Sexy Girls & Sexy Guys”. It’s interesting to see which posts trigger a keyword match. I have observed a few posts that have switched from generic blog ads to a topical ad after a followup visit from the Mediapartners-Google crawler. You’d think that a post on the Blackdog Linux Server, the Yahoo-Alibaba deal, or visiting the Mona Lisa at the Louvre would trip a keyword or two.
The banners are only on the single post templates at the moment, so you’ll need to click on a post to see them. There’s also a set of vertical text ads at the bottom of the sidebar. I can tell I’m probably going to end up starting on a round of site revisions by the time I’m done with this, although I’m just interested in getting a better handle on the advertising and affiliate space at the moment.
Update: 08-15-2005 23:58 – At least this post has gotten tagged with Adsense ads. It will be interesting to see which pages actually trigger clickthroughs, vs which pages get reasonable keyword tags from Adsense.
Later this week I’ll be at the Blog Business Summit in San Francisco. A discounted registration for WordPress users is available.
There’s also a WordPress update released, 1.5.2, with bug and security fixes since 126.96.36.199. It’s not a platform for everyone, but I’ve been very pleased with the high level of support, technical flexibility, and the active developer and user communities that have evolved around WordPress in the past couple of years.
I enjoy the option of changing whatever I like in the system, but also enjoy not needing to do so most of the time.
Update 2005-08-14 17:48 – A bigger discount is available for Blogger users! The WordPress discount is $400, the Blogger discount is $500. Hmm.