Coming soon – one click from SpiralFrog to iPod?

Today SpiralFrog announced a free subscription-based music service. Subscribers will be able to download music to their music playing devices, but will need to listen to advertising presented on the SpiralFrog site periodically, to keep the music authorized. It sounds like the downloaded music would be WMA files, using Microsoft Windows Media DRM.

A couple of days ago, Engadget pointed at FairUse4WM, a Windows Media DRM 10 and 11 removal utility with a user friendly interface.

This FT article says that iPod+iTunes has the largest market share for legally authorized music at 80%. At the same time, it notes the growing number of non-iPod MP3 and other music players coming to market. I suspect it won’t be long before there’s a one-click utility to remove the Windows Media DRM, transcode the WMA files to MP3, and import them into iTunes so subscribers can listen on their iPod or whatever other device they own. It probably won’t be from SpiralFrog, though.

The upcoming Zune music / video players from Microsoft are likely to have similar issues, whatever their online media network turns out to be.

I think it’s great that the music publishers are trying different business models, in this case advertising. On the other hand, I find I use services like Pandora for casual listening and finding new music, then buy the actual CDs of music I want rather than purchasing from iTunes, just so I have a clean, portable DRM-free audio file that can be shipped around the house and across whatever device happens to be convenient. I’d rather just buy clean, portable bits, without needing the physical CD. Where’s the service for that? (Other than allofmp3.com.)

More on SpiralFrog from BoingBoing, TechCrunch

Update Tuesday 08-29-2006 21:16 PDT – I see that Microsoft is working on patching WinDRM to block FairUse4WM. (Good luck with that.) And on the iPod front, it looks like jHymn has been getting updates so it can work with iTunes 6 to remove the FairPlay DRM, making those files portable to non-iPod devices.

US housing and the stock market


Over the past ten years (1996-2006), the NAHB homebuilders index has tended to lead the performance of the S&P 500 by 12 months. The index is based on a survey of homebuilding companies’ views on current sales, the outlook for the next 6 months, and the current level of prospective buyer traffic. This month was the 7th monthly drop in a row, and the index is now at a 15-year low.

In the period from 1985 to 1996, there was no correlation between the housing and stock markets, so this could optimistically be viewed as a temporary coincidence. On the other hand, asset class correlations have been going up for a while. Draw your own conclusions, but real estate prices, home builders, and mortgage lenders are clearly having a difficult time recently.
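For anyone who wants to check a lead/lag claim like this against real numbers, here is a minimal sketch of a lagged correlation in Python. The function names and the sample series are my own made-up illustration (not NAHB or S&P 500 data), and the sample uses a 3-step lag rather than 12 months just to keep it short.

```python
from statistics import mean


def pearson(xs, ys):
    """Pearson correlation of two equal-length numeric sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equal-length sequences with at least 2 points")
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("a constant series has no defined correlation")
    return cov / (var_x * var_y) ** 0.5


def lagged_correlation(leader, follower, lag):
    """Correlate leader[t] with follower[t + lag] (lag in periods, >= 0)."""
    if len(leader) != len(follower):
        raise ValueError("series must cover the same periods")
    if lag < 0 or lag >= len(leader) - 1:
        raise ValueError("lag must leave at least 2 overlapping points")
    if lag == 0:
        return pearson(leader, follower)
    return pearson(leader[:-lag], follower[lag:])


# Made-up sample data: "stocks" simply echoes "housing" three periods later.
LAG = 3
housing = [50, 55, 60, 58, 52, 45, 40, 42, 48, 54, 60, 62]
stocks = [100, 100, 100] + [2 * h + 10 for h in housing[:-LAG]]

same_period = lagged_correlation(housing, stocks, 0)
shifted = lagged_correlation(housing, stocks, LAG)
print(f"lag 0: {same_period:.2f}, lag {LAG}: {shifted:.2f}")
```

With monthly data you would pass lag=12 and compare the result against other lags. A high value at one lag over a ten-year window still doesn’t rule out coincidence, which is the point of the 1985 to 1996 comparison.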

John Mauldin has pulled together some observations on the housing market in his newsletter this week, leading with the graph above from David Rosenberg at Merrill Lynch. (Free e-mail registration required, but worthwhile reading.)

Here in the Bay Area, real estate prices are chronically high, ranging from “insanely high” to just “overpriced”. Here’s the pessimistic view. At least you can live in your overpriced house if you have enough cash to support it. I know of at least one dot-com zillionaire who lucked out by overpaying for his house in cash before the crash. The house went from something like $6M to $4M, but his stock went from something like $100M to $2M.

More on the housing market from Barry Ritholtz.

Venn diagram humor


“Indexed” features sketches on 3×5 index cards, heavy on graphs and Venn diagrams. If you’re reading this, there’s probably something in there for you. (via Korean Jurist)

See also Gaping Void (blogging / tech cartoons drawn on business cards) and xkcd (math / grad student humor).

Back to school 2006

Today is the first day of school in Palo Alto. It feels like we just started summer vacation, but it’s fun seeing everyone after the break. I’m always surprised by how much the kids grow in just a few weeks.

Amazon aStore – custom storefronts for Amazon affiliates

Amidst the speculation about the Amazon Unbox video download service, Amazon has quietly launched aStores, a service providing custom online storefronts for Amazon affiliates. (You may not be able to view the link unless you’re an Amazon affiliate.)

aStore by Amazon is a new Associates product that gives you the power to create a professional online store, in minutes and without the need for programming skills, that can be embedded within or linked to from your website.

Here’s a link to their demo store.

You get to pick up to nine “featured items” to put on the home page of the store, choose product categories, and add reviews and editorial content. The shopping cart and fulfillment are handled by Amazon, with standard referral fees going back to the affiliate. There’s a browser based interface for building a store on the Amazon Affiliates site. The resulting store can be hosted by Amazon or on your own site.

This sort of functionality has been available for a while for those willing and able to customize their site using Amazon’s web services API, but the aStores program will make custom stores broadly accessible to the whole Amazon affiliate base (just in time for the holiday shopping season). I suspect we’ll see an explosion of niche shopping sites in short order, since it looks pretty easy to set one up.

Dell recalls notebook batteries – who’s next?

Dell is recalling several models of notebook batteries, due to several incidents of spontaneous combustion. The batteries in question were manufactured by Sony, which also supplies batteries to other notebook vendors. Lithium-ion batteries are widely used today, so I’m expecting to see additional recalls from other notebook vendors, or at least a raft of press releases verifying that they do not have a problem. Dell has already set up their own web site for battery recall information.

I haven’t heard of any episodes other than various spontaneously combusting Dell notebooks and exploding Powerbooks in recent weeks, but I’m keeping an eye out for news about my Thinkpad’s battery.

The battery issue is compounded by the recent changes to airline security screening. It would be unfortunate if this got all lithium-ion batteries banned from the cabin. On the other hand I don’t see any way to create a completely accident-/terrorist-proof high density energy storage device, which is going to make some people unhappy now that they’ve noticed the issue.

An open source Internet Imaging Protocol implementation

Data Compression Blog notes an updated release of the open source IIPImage Server.

There are some fun demos of interactive browsing through very high resolution satellite and Hubble telescope imagery of Earth, the Orion Nebula, and M101.

I haven’t kept up with IIP since wrapping up work on it many years ago (back at HP), so it’s interesting to see it still has a bit of life. The original implementation was done around 1996, before XML-RPC, SOAP, REST, and all of today’s Web 2.0 underpinnings. It’s amazing to look back and see how much internet software infrastructure has become widely deployed in the past ten years. I suspect we’d consider an implementation approach similar to Google or Yahoo Maps if the effort were started today, but I’m glad to see people find IIP useful.

Consequences of new air travel restrictions – removable drives, portable user profiles?

I’m quite pleased that the British authorities managed to foil the attempt to blow up multiple airliners last week. On the other hand, I’m probably not alone in wondering how long-haul business air travel is going to work out.

If a ban on all liquids, gels, and personal electronics stands, a lot of air carriers will need to start competing on in-flight service again. In recent years, I normally bring my own water, food, work, entertainment, and a change of clothes for air travel to China and India. On a trip to India, it’s about 30 hours in transit, which is a lot of time to watch the 6 movies that United usually rotates each month, along with putting in a full day or so of work. I usually fly United since their Asian routes are all based here, but I wouldn’t want to rely on them for food, water, and entertainment. Might be time to book on Singapore Airlines, which flies with a huge video- and audio-on-demand library and Nintendo video games, never seems to run out of food or water, and consistently provides attentive cabin service.

Given the growing number of data theft cases, I’m also hesitant to put my Thinkpad in a checked bag which I’m not allowed to lock (per TSA). Some people are suggesting that airlines rent computers onboard, but this isn’t going to help much until either

  • You can remove your data and applications and carry them with you
  • You can connect to your data and applications online from the cabin

Putting the risk of using someone else’s hardware aside for a moment (sort of like an internet cafe in the sky), you might need a convenient, security-screenable medium to carry the bulk of your personal data with you. Perhaps flash memory in another year or two. I know of people who carry portable environments on USB flash memory keys, but you have to be fairly motivated to deal with it at the moment. If notebook computers get pushed into checked luggage, I’m certain we’ll see at least one more high profile data leak, in which someone steals a notebook that turns out to hold data that was never supposed to be on it.

The other direction would be to use web services for applications, files, and storage. Some people already work that way, but it usually fails badly if you don’t have a reliable and relatively fast network connection. A permutation of this might be to have the airlines become a sort of internet service provider, and cache copies of your data onto the airplane’s local network server for in-flight use, which get pushed back to the primary server when you land.

I’m glad I don’t have any overseas travel scheduled for a while.

Update Sunday 08-13-2006 22:18 PDT: more on the prospects for air travel from Michael Parekh, Jeff Jarvis, and Fred Wilson.

Three years in 3 minutes – a self portrait movie


This popped up on YouTube this afternoon – filmmaker Ahree Lee shot an image of her face every day for three years starting in November 2001, then concatenated them into a fascinating short movie called “Me”. The project is set to music by Nathan Melsted, which gives it a hypnotic, X-Files-ish feeling.

A slightly longer and sharper version is posted at AtomFilms. There’s also a short related project built on lots of different faces, called “Everyone”.

Heathrow closed, terror plot disrupted

Overnight, British authorities arrested 21 suspected terrorists planning to blow up several airliners on Continental, United, and American by mixing liquid explosives carried onboard in hand luggage.

At the moment, all liquids are banned from hand luggage, except for baby formula and medicine.

All in all, it sounds like great work by the UK authorities, although this quote leaves me wondering a little (since they’ve only arrested 21 so far):

“A senior U.S. counterterrorism official said authorities believe dozens of people — possibly as many as 50 — were involved in the plot.”

More from Counterterrorism Blog here, here, and here

More on the America Online search query data

The search query data that America Online posted over the weekend has been removed from their site following a blizzard of posts regarding the privacy issues. AOL officially regards this as “a screw up”, according to spokesperson Andrew Weinstein, who responded in comments on several sites:

All –

This was a screw up, and we’re angry and upset about it. It was an innocent enough attempt to reach out to the academic community with new research tools, but it was obviously not appropriately vetted, and if it had been, it would have been stopped in an instant.

Although there was no personally-identifiable data linked to these accounts, we’re absolutely not defending this. It was a mistake, and we apologize. We’ve launched an internal investigation into what happened, and we are taking steps to ensure that this type of thing never happens again.

I pulled down a copy of the data last night before the link went down, but didn’t get around to actually looking it over until this evening. In a casual glance at random sections of the data, I see a surprising (to me) number of people typing in complete URLs, a range of sex-related queries (some of which I don’t actually understand), shopping-related queries, celebrity-related queries, and a lot of what looks like homework projects by high school or college students.

In the meantime, many other people have found interesting / problematic entries among the data, including probable social security numbers, driver’s license numbers, addresses, and other personal information. Here’s a list of queries about how to kill your wife from Paradigm Shift.

More samples culled from the data here, here, and here.

#479 Looks like a student at Prairie State University who like playing EA Sports Baseball 2006, is a White Sox fan, and was planning going to Ozzfest. When nothing else is going on, he likes to watch Nip/Tuck.

#507 likes to bargain on eBay, is into ghost hunting, currently drives a 2001 Dodge, but plans on getting a Mercedes. He also lives in the Detroit area.

#1021 is unemployed and living in New Jersey. But that didn’t get him down because with his new found time, he’s going to finally get to see the Sixers.

#1521 like the free porn.

Based on my own eclectic search patterns, I’d be reluctant to infer specific intent based only on a series of search queries, but it’s still interesting, puzzling, and sometimes troubling to see the clusters of queries that appear in the data.

Up to this point, in order to have a good data set of user query behavior, you’d probably need to work for one of the large search engines such as Google or Yahoo (or perhaps a spyware or online marketing company). I still think sharing the data was well-intentioned in spirit (albeit a massive business screwup).

Sav, commenting over at TechCrunch (#67) observes:

The funny part here is that the researchers, accustomed to looking at data like this every day, didn’t realize that you could identify people by their search queries. (Why would you want to do that? We’ve got everyone’s screenname. We’ll just hide those for the public data.) The greatest discoveries in research always happen by accident…

A broader issue in the privacy context is that all this information and more is already routinely collected by search engines, search toolbars, assorted desktop widget/pointer/spyware downloads, online shopping sites, etc. I don’t think most people have internalized how much personal information and behavioral data is already out there in private data warehouses. Most of the time you have to pay something to get at it, though.

I expect to see more interesting nuggets mined out of the query data, and some vigorous policy discussion regarding the collection and sharing of personal attention gestures such as search queries and link clickthroughs in the coming days.

See also: AOL Research publishes 20 million search queries

Update Tuesday 08-08-2006 05:58 PDT – The first online interface for exploring the AOL search query data is up at www.aolsearchdatabase.com (via TechCrunch).

Update Tuesday 08-08-2006 14:18 PDT – Here’s another online interface at dontdelete.com (via Infectious Greed)

Update Wednesday 08-09-2006 19:14 PDT – A profile of user 4417749, Thelma Arnold, a 62-year-old widow who lives in Lilburn, GA, along with a discussion of the AOL query database in the New York Times.

AOL Research publishes 20 million search queries

More raw data for search engineers and SEOs, and fodder for online privacy debates – AOL Research has released a collection of roughly 20 million search queries which include all searches done by a randomly selected set of around 500,000 users from March through May 2006.

This should be a great data set to work with if you’re doing research on search engines, but seems problematic from a privacy perspective. The data is anonymized, so AOL user names are replaced with a numerical user ID:

The data set includes {UserID, Query, QueryTime, ClickedRank, DestinationDomainUrl}.

I suspect it may be possible to reverse engineer some of the query clusters to identify specific users or other personal data. If nothing else, I occasionally observe people accidentally typing in user names or passwords into search boxes, so there are likely to be some of those in the mix. “Anonymous” in the comments over at Greg Linden’s blog thinks there will be a lot of those. The destination URLs have apparently been clipped as well, so you won’t be able to see the exact page that resulted in a click-through.
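To make the query-cluster concern concrete, here is a minimal sketch of grouping every query under its numeric user ID. It assumes tab-separated lines with the five fields listed above in that order; the file layout, the function name, and the sample rows are my own invention for illustration, not taken from the AOL release.

```python
import csv
from collections import defaultdict
from io import StringIO

# Field names as listed in the announcement; the tab-separated layout is assumed.
FIELDS = ["UserID", "Query", "QueryTime", "ClickedRank", "DestinationDomainUrl"]


def queries_by_user(lines):
    """Group queries by user ID from an iterable of tab-separated lines."""
    reader = csv.DictReader(
        lines, fieldnames=FIELDS, delimiter="\t", quoting=csv.QUOTE_NONE
    )
    clusters = defaultdict(list)
    for row in reader:
        clusters[row["UserID"]].append(row["Query"])
    return dict(clusters)


# Made-up sample rows (not real log data). Empty trailing fields mean no click.
SAMPLE = (
    "1001\tcheap flights to denver\t2006-03-01 10:15:00\t1\thttp://www.example.com\n"
    "1001\tdenver weather\t2006-03-01 10:20:00\t\t\n"
    "1002\tchocolate cake recipe\t2006-03-02 18:02:10\t2\thttp://www.example.org\n"
)

clusters = queries_by_user(StringIO(SAMPLE))
for user_id, queries in clusters.items():
    print(user_id, queries)
```

Once all of a user’s queries sit next to each other, a name, an address, or a hometown showing up in just one of them is enough to color everything else in the cluster, even though the ID itself is only a number.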

Haven’t taken a look at the actual data yet, but I’m glad I’m not an AOL user.

Adam D’Angelo says:

This is the same data that the DOJ wanted from Google back in March. This ruling allowed Google to keep all query logs secret. Now any government can just go download the data from AOL.

On the search application side, this is a rare look at actual user search behavior, which would be difficult to obtain without access to a high traffic search engine or possibly through a paid service.

Plentyoffish sees an opportunity for PPC and Adsense spammers:

Google/ AOL have just given some of the worlds biggest spammers a breakdown of high traffic terms its just a matter of weeks now until google gets mega spammed with made for adsense sites and other kind of spam sites targetting keywords contained in this list.

I think it’s great that AOL is trying to open up more and engage with the research community, and it looks like there are some other interesting data collections on the AOL Research site — but I suspect they’re about to take a lot of heat on the privacy front, judging from the mix of initial reactions on Techmeme. Hope it doesn’t scare them away and they find a way to publish useful research data without causing a privacy disaster.

More on the privacy angle from SiliconBeat, Zoli Erdos

See also: Coming soon to DVD – 1,146,580,664 common five-word sequences

Update – Sunday 08-06-2006 20:31 PDT – AOL Research appears to have taken down the announcement and the log data in the past few hours in response to a growing number of blog posts, mostly critical, and mostly focused on privacy. Markus at Plentyoffish has also used the data to generate a list of ringtone search keywords which users clicked through to a ringtone site as an example of how this data can be used by SEO and spam marketers. Looks like the privacy issues are going to get the most airtime right now, but I think the keyword clickthrough data is going to have the most immediate effect.
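Here is a rough sketch of the kind of keyword list described above: count the queries whose click-through landed on a domain matching some term. The function name, the (query, destination) pair layout, and the sample records are hypothetical, made up purely for illustration.

```python
from collections import Counter


def keywords_for_domain(records, needle):
    """Count queries whose clicked destination contains `needle`.

    `records` is an iterable of (query, destination_url) pairs, with an
    empty destination when the user did not click through.
    Returns (query, count) pairs, most frequent first.
    """
    counts = Counter()
    for query, destination in records:
        if destination and needle in destination.lower():
            counts[query.lower()] += 1
    return counts.most_common()


# Made-up sample records (not real log data).
RECORDS = [
    ("free ringtones", "http://ringtones.example.com"),
    ("Free Ringtones", "http://www.ringtone-site.example"),
    ("download ringtone", "http://ringtones.example.com"),
    ("denver weather", "http://weather.example.org"),
    ("free ringtones", ""),  # searched but did not click
]

ringtone_keywords = keywords_for_domain(RECORDS, "ringtone")
print(ringtone_keywords)
```

A dozen lines like this are all it takes to turn the logs into a targeting list, which is why I expect the clickthrough data to matter sooner than the privacy debate.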

Update Monday 08-07-2006 08:02 PDT: Some mirrors of the AOL data

The investor sentiment cycle, tech and web 2.0


Silicon Valley is built on optimism and entrepreneurship, but lately, most tech companies can do no good in the eyes of public market investors, who are presently in a mood to sell on no news, bad news, or even good news.

At the same time, private market sentiment toward investing in Web 2.0, online services, and consumer media and publishing appears to be positive.

Barry Ritholtz posted this investor sentiment graph, on which I’ve marked roughly where I think we are for “new web” startups versus public tech companies.

A lot of the problem with public tech company share prices is due to uncertainty about future prospects – a slowing economy, growing competition, increasing costs, and a general cloud of unknowable liability from options accounting issues. The actual businesses are often doing OK or even great, but investor sentiment has shifted, bringing down the share price. See BRCM, RACK, or NVDA for a few examples of what happens when you report a decent quarter without boosting forward guidance. Fewer people are willing to pay a 30 multiple for growth that may never come, or that may never have existed in the first place.

In contrast, there is still a lot of investor love for Web 2.0 startups and other “new” online services. Part of this reflects supply and demand — there are a lot of investable funds around, and it’s hard for a fund to invest a lot of money in small chunks.

There’s still a lot of excitement about the future of Web 2.0 et al, but it’s been feeling overdone for the past few months (“Digg is worth $60M”), without necessarily being over. On the other hand, I get the impression that people here in Silicon Valley are still somewhere between denial (“It will go back up”) and fear (“What if it doesn’t”) with respect to the medium term prospects for tech stock share prices.

This investor sentiment graph and other graphs of economic cycles can also be found on the forecast page at Now and Futures.

Anyone think we’ve already hit a peak for Web 2.0 investments or a bottom for tech stocks? (I don’t.)

Coming soon to DVD – 1,146,580,664 common five-word sequences

Google Research is publishing a huge n-gram dataset distilled from trillions of words perused by Google’s vast search spidering effort:

We processed 1,011,582,453,213 words of running text and are publishing the counts for all 1,146,580,664 five-word sequences that appear at least 40 times. There are 13,653,070 unique words, after discarding words that appear less than 200 times.

This looks like just the thing for developing some interesting predictive text applications, or just random data mining. The 6-DVD set will be distributed by the Linguistic Data Consortium, which collects and distributes interesting speech and text databases and training sets. Some other items in their collection include transcribed speech from 3000 speakers, a mapping between Chinese and English place, organization, and corporate names, and a transcription of colloquial Levantine Arabic speech.
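As a toy illustration of what the data set contains and how it might feed a predictive text application, here is a minimal sketch that counts n-grams, drops the rare ones, and guesses the next word. The function names and sample sentence are my own, and this is nothing like how Google would process a trillion words. The defaults mirror the five-word, 40-occurrence cutoffs quoted above.

```python
from collections import Counter


def ngram_counts(words, n=5, min_count=40):
    """Count n-word sequences, keeping those seen at least min_count times."""
    grams = Counter(tuple(words[i:i + n]) for i in range(len(words) - n + 1))
    return {gram: count for gram, count in grams.items() if count >= min_count}


def predict_next(counts, prefix):
    """Return the most frequent word following `prefix`, or None if unseen."""
    prefix = tuple(prefix)
    candidates = {
        gram[-1]: count for gram, count in counts.items() if gram[:-1] == prefix
    }
    if not candidates:
        return None
    return max(candidates, key=candidates.get)


# Made-up sample text, using bigrams and low cutoffs so the toy data survives.
words = "the cat sat the cat ran the dog sat".split()

frequent_bigrams = ngram_counts(words, n=2, min_count=2)
all_bigrams = ngram_counts(words, n=2, min_count=1)
print(frequent_bigrams)
print(predict_next(all_bigrams, ["the"]))
```

The published counts would play the role of all_bigrams here, just with five-word keys and vastly more of them.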

Update Sunday 08-06-2006 16:41 PDT: See also AOL Research publishes 20 million search queries

West Coaster at Santa Monica Pier

Get ready Whee!

The West Coaster is a small roller coaster going around and above Pacific Park, a small amusement park at the Santa Monica Pier. The ride isn’t too scary, provides great views of the beach, and goes around twice each time you board. The mid-week lines are short, so we went around and around until we lost count.

Apple says “OK, now you can worry about those options”


A couple of weeks ago, Apple Computer announced a better-than-expected quarter, and also mentioned that they didn’t expect any “material adjustments” to result from their stock options accounting investigation. Since then, the share price has been rocketing from the low 50s up to touch 70 today.

After the close today, they mentioned that they will restate earnings back as far as 2002 and will delay filing their 10-Q. Oops.

Reuters News, August 3, 2006

Apple Computer Inc. (NASDAQ:AAPL – News) said on Thursday it would likely need to restate earnings and will delay filing its quarterly report because of additional irregularities it found in its accounting of stock options and its shares fell 6.6 percent.

Apple also said in a SEC regulatory filing Thursday that all financial communications issued since September 29, 2002, should not be relied upon. The irregularities are related to the issuance of stock option grants made between 1997 and 2001.

For reference, here’s the Apple 3Q06 press release, July 19, 2006:

As previously announced, an internal investigation discovered irregularities related to the issuance of certain stock option grants made between 1997 and 2001. A special committee of Apple’s outside directors has hired independent counsel to perform an investigation and the Company has informed the SEC. At this time, based upon the irregularities identified to date, management does not anticipate any material adjustment to the financial results included in this earnings release. However, if additional irregularities are identified by the independent investigation, a material adjustment to the financial information could be required.

At one point I used to believe in buy-and-hold, longer term investing based on fundamentals. Unfortunately, it can be difficult to discern what the fundamentals actually are, and how the stock markets will react to them.

In the current market conditions, I’m doing much better as a short term trader than when I try to hold longer term positions.

Update Friday 08-04-2006 14:33 PDT – Interesting anecdotes on Apple financial culture from Applepeels.

How much is two plus two?

It can be difficult to have a lot of confidence in financial statements published by publicly traded companies. Companies built on intellectual property or financial instruments are especially prone to varying interpretation of their finances, but accounting issues seem to plague companies from every industry these days.

This tale was posted by Wavesmash in the comments at Bill Cara’s site today:

Re: how can we ever trust any financial statements at all? An old joke an accountant once told me…

A business man was interviewing applicants for the position of divisional manager. He devised a simple test to select the most suitable person for the job. He asked each applicant the question, “What is two and two?”

The first interviewee was a journalist. His answer was “Twenty-two.”

The second was a social worker. She said, “I don’t know the answer but I’m glad we had time to discuss this important question.”

The third applicant was an engineer. He pulled out a slide rule and showed the answer to be between 3.999 and 4.001.

The next person was a lawyer. He stated that in the case of Jenkins v Commr of Stamp Duties (Qld), two and two was proven to be four.

The last applicant was an accountant. The business man asked him, “How much is two and two?”

The accountant got up from his chair, went over to the door and closed it then came back and sat down. He leaned across the desk and said in a low voice, “How much do you want it to be?”

He got the job.

Back from the mobile office

At the mobile office Ocean kayaking at Malibu

Spent most of the past weekend on the beach in Malibu. Emily and I tried a little surfing, ocean kayaking, and also got a good look at some dolphins while we were paddling around.

I brought the Thinkpad, but left the charger at home, the idea being to limit my computer use while on vacation. We decided to stay a couple extra days, so I was effectively offline after running on batteries for 5 hours or so. Next time I’ll bring the charger anyway.

If you’ve been having trouble getting at this site while I’ve been away, Dreamhost posted a narrative of their recent adventures in data hosting, some of which have been power-related, and some not.