site admin | May 21st, 2009 | Comments are closed
These are my links for May 21st from 06:07 to 22:34:
site admin | May 12th, 2009 | Comments are closed
These are my links for May 8th through May 12th:
site admin | May 5th, 2009 | Comments are closed
These are my links for May 4th through May 5th:
- Influential Nodes in a Diffusion Model for Social Networks (icalp05-inf.pdf) – Kempe, Kleinberg, Tardos. Greedy algorithm for approximating the most influential nodes in a social network (within about 63%, i.e. 1 − 1/e, of optimal) under various conditions.
- Maximizing the Spread of Influence through a Social Network (kdd03-inf.pdf) – Kempe, Kleinberg, Tardos. Selecting the most influential nodes to maximize propagation is NP-hard, but a greedy approximation can work well (within about 63%, i.e. 1 − 1/e, of optimal) under various conditions.
- Notification Strategies for Social Networks – Discussion on approaches to maximizing use of a limited number of notifications within social networks e.g. Facebook
- James Smith • loopj.com » Blog Archive » jQuery Plugin: Tokenizing Autocomplete Text Entry – Looks handy – "This is a jQuery plugin to allow users to select multiple items from a predefined list, using autocompletion as they type to find each item. You may have seen a similar type of text entry when filling in the recipients field sending messages on facebook."
- Google Code FAQ – Using cURL to interact with Google data services – Step by step tutorial on using curl with Google data APIs.
- Behind The Business Plan Of Pirates Inc. : NPR – It takes around $250K to fund a Somali pirate operation. About 20 percent goes to pay off officials who look the other way. About 50 percent is for expenses and payroll. The leader of an attack makes $10,000 to $20,000 (the average Somali family lives on $500 a year). The initial investor — who put in $250,000 of seed capital — gets 30 percent, sometimes up to $500,000.
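For the two Kempe, Kleinberg, Tardos links above, here is a minimal sketch of the greedy idea: repeatedly add the node that gives the largest estimated gain in spread. It assumes an independent cascade model with a single activation probability for every edge and estimates spread by simple Monte Carlo simulation. The function names, the graph format (a dict of adjacency lists), and the default parameters are my own illustration, not taken from the papers.

```python
import random


def simulate_cascade(graph, seeds, prob, rng):
    """Run one independent-cascade simulation; return the set of activated nodes."""
    active = set(seeds)
    frontier = list(seeds)
    while frontier:
        next_frontier = []
        for node in frontier:
            for neighbor in graph.get(node, ()):
                # A newly active node gets one chance to activate each inactive neighbor.
                if neighbor not in active and rng.random() < prob:
                    active.add(neighbor)
                    next_frontier.append(neighbor)
        frontier = next_frontier
    return active


def expected_spread(graph, seeds, prob, trials, rng):
    """Estimate the average number of activated nodes over several simulations."""
    total = 0
    for _ in range(trials):
        total += len(simulate_cascade(graph, seeds, prob, rng))
    return total / trials


def greedy_seeds(graph, k, prob=0.1, trials=200, seed=0):
    """Pick up to k seed nodes, each time adding the node with the best estimated spread."""
    rng = random.Random(seed)
    nodes = set(graph)
    for targets in graph.values():
        nodes.update(targets)
    chosen = []
    for _ in range(k):
        best_node, best_spread = None, -1.0
        for node in sorted(nodes - set(chosen)):
            spread = expected_spread(graph, chosen + [node], prob, trials, rng)
            if spread > best_spread:
                best_node, best_spread = node, spread
        if best_node is None:
            break  # fewer than k nodes available
        chosen.append(best_node)
    return chosen
```

With a toy graph such as `{"hub": ["a", "b", "c"], "x": ["y"]}` and `prob=1.0`, the first pick is `"hub"` and the second is `"x"`. This brute-force version re-simulates every candidate on every round, so it is only practical for small graphs.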
site admin | May 2nd, 2009 | Comments are closed
These are my links for April 30th through May 2nd:
- FusionCharts Free – Animated Flash Charts and Graphs for ASP, PHP, ASP.NET, JSP, RoR and other web applications – Flash charting component that can be used to render data-driven & animated charts for your web applications and presentations. It is a cross-browser and cross-platform solution that can be used with PHP, Python, Ruby on Rails, ASP, ASP.NET, JSP, ColdFusion, simple HTML pages or even PowerPoint Presentations to deliver interactive and powerful flash charts. You do NOT need to know anything about Flash to use FusionCharts. All you need to know is the language you're programming in.
- Raphaël—JavaScript Library – Raphaël is a small JavaScript library that should simplify your work with vector graphics on the web. If you want to create your own specific chart or image crop and rotate widget, for example, you can achieve it simply and easily with this library. Raphaël uses the SVG W3C Recommendation and VML as a base for creating graphics. This means every graphical object you create is also a DOM object, so you can attach JavaScript event handlers or modify them later. Raphaël’s goal is to provide an adapter that will make drawing vector art compatible cross-browser and easy.
- A Really Gentle Introduction to Data Mining | Regular Geek – List of data mining blogs and related resources.
- BlackBerry SSH Tutorial: Connect to Unix Server using MidpSSH for Mobile Devices – Notes on using MidpSSH on BlackBerry for remote access to servers. Seems to work, although there is a lot of network lag on my BlackBerry Bold / AT&T.
- Country Reports on Terrorism 2008 – U.S. law requires the Secretary of State to provide Congress, by April 30 of each year, a full and complete report on terrorism with regard to those countries and groups meeting criteria set forth in the legislation. This annual report is entitled Country Reports on Terrorism. Beginning with the report for 2004, it replaced the previously published Patterns of Global Terrorism.
- DIY: How To Find Authoritative Twitter Users Plus 100 To Get You Started | Ignite Social Media – Some comments on recommendation metrics for Twitter, trying to use the "favorites" mark as an indicator.
- SIGUSR2 > The Power That is GNU Emacs – "If you've never been convinced before that Emacs is the text editor in which dreams are made from, or that inside Emacs there are unicorns manipulating your text, don't expect me to convince you."
site admin | April 29th, 2009 | Comments are closed
These are my links for April 28th through April 29th:
- With YQL Execute, the Internet becomes your database (Yahoo! Developer Network Blog) – Use Yahoo to query and assemble data from around the internet, and manipulate the resulting XML recordsets with server-side JavaScript.
- Glimmer: a jQuery Interactive Design Tool – Articles – MIX Online – "Makes jQuery accessible through a visual tool. The objective for Glimmer is pretty simple: to enable the power of jQuery through an interactive design surface. If jQuery is the “write less, do more” JavaScript library, then Glimmer is the “write none, do more” jQuery design tool. Glimmer has three core audiences: power users, designers and developers."
- Inside Facebook Reports: Why Hasn’t Facebook Grown More in China? – A look at Chinese consumer internet and social media usage, QQ, 51, Xiaonei, Kaixin, and some reasons why there are only around 300,000 Facebook users in China today.
- Facebook maps the swine flu hysteria | The Web Services Report – CNET News – Visualizing interest in swine flu by mapping percentages of mentions on Facebook wall pages, using data from Lexicon.
- Develop Twitter API application in django and deploy on Google App Engine — The Uswaretech Blog – Django Web Development – Walkthrough of a sample Twitter application on Google App Engine, using Django/Python.
site admin | April 27th, 2009 | Comments are closed
These are my links for April 24th through April 27th:
site admin | April 23rd, 2009 | Comments are closed
These are my links for April 20th through April 23rd:
- What I’ve Learned from Hacker News – Paul Graham on social dynamics and managing Hacker News, user submitted comments and ranking (voting up/down), editorial intervention and moderators, project goals.
- SEOmoz | Reddit, Stumbleupon, Del.icio.us and Hacker News Algorithms Exposed! – Looking at variations on algorithms for ranking items on social news aggregators
- NGINX + PHP-FPM + APC = Awesome – Walkthrough on setting up a cached PHP web server on nginx with APC.
- Particletree » PHP Quick Profiler – Lightweight tool for profiling PHP code.
- MySQL’s Full-Text Formulas – Database Journal
- http://www.acapela-group.com/text-to-speech-interactive-demo.html – Online text-to-speech demo, with various male and female speakers, plus a few translations.
- Dealing with Duplicate Person Data – Proud to Use Perl – Classifying likely duplicate entries in name/address contact data using Levenshtein distance and tables of nickname synonym and assigned distance weights.
- Web Security Horror Stories: The Director’s Cut at <head> – Presentation slides from a talk by Simon Willison on cross site scripting, SQL injection, referer forgery, and clickjacking attacks on web applications.
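For the duplicate person data link above, a minimal sketch of the general approach: normalize first names through a nickname table, then compare names by Levenshtein distance. The nickname table, the threshold, and the function names are hypothetical. The original article is in Perl and uses assigned distance weights, which this simplified version replaces with a plain lookup.

```python
def levenshtein(a, b):
    """Edit distance where insert, delete, and substitute each cost 1."""
    previous = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        current = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            current.append(min(previous[j] + 1,          # deletion
                               current[j - 1] + 1,       # insertion
                               previous[j - 1] + cost))  # substitution
        previous = current
    return previous[-1]


# Hypothetical nickname synonyms; a real table would be much larger.
NICKNAMES = {"bob": "robert", "bill": "william", "liz": "elizabeth"}


def canonical(first_name):
    """Lowercase a first name and map known nicknames to a canonical form."""
    name = first_name.strip().lower()
    return NICKNAMES.get(name, name)


def likely_duplicate(first_a, last_a, first_b, last_b, max_distance=2):
    """Flag two contacts as likely duplicates if their names are within max_distance edits."""
    first_distance = levenshtein(canonical(first_a), canonical(first_b))
    last_distance = levenshtein(last_a.strip().lower(), last_b.strip().lower())
    return first_distance + last_distance <= max_distance
```

For example, `likely_duplicate("Bob", "Smith", "Robert", "Smyth")` treats the two first names as the same and finds one edit between the last names, so the pair is flagged. The threshold of 2 is arbitrary and would need tuning against real data.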
site admin | April 19th, 2009 | Comments are closed
These are my links for April 18th through April 19th:
- Why Programmers Suck at CSS Design – Stefano’s Linotype – A practical approach to CSS for non-designers (programmers).
- The Art & Science of Seductive Interactions – Presentation slides on improving application user experience through game-like mechanics (points, levels, scarcity), social interaction, and other ideas.
- Stephen Marsland – Python code from "Machine Learning: An Algorithmic Perspective", assorted clustering and estimation algorithms.
- Firediff – In Case of Stairs – Firediff implements a change monitor that records all of the changes made by firebug and the application itself to CSS and the DOM. This provides insight into the functionality of the application as well as provide a record of the changes that were required to debug and tweak the page’s display.
- Crowdsourcing the semantic web | lexanderA – "Currently, all attempts at providing semantic metadata require server-side changes which means that we need to rely on page authors to implement them. This, of course, is a major obstacle. But what if we could change that? What if we could bypass page authors and have the crowd add semantic metadata to existing pages?"
- Just How Important is the Valley? Let’s Look at some Data. – Tony Wright dot com – Is the Silicon Valley entrepreneurship model specific to Silicon Valley? List of acquisitions in 2007 and 2008.
site admin | April 15th, 2009 | Comments are closed
These are my links for April 13th through April 15th:
site admin | April 12th, 2009 | Comments are closed
These are my links for April 12th from 17:02 to 19:13:
site admin | April 9th, 2009 | Comments are closed
These are my links for April 9th from 08:07 to 17:53:
- IP address geolocation SQL database – IP address geolocation with MySQL by Marc-Andre Caron. He's done all the necessary legwork to solve this problem, putting together a free, monthly-updated MySQL dataset that will allow you to derive country, region, city, zip, latitude, and longitude from an IP address.
- Del.icio.us Finally Gets Some Respect from Yahoo – Probably Too Late – ReadWriteWeb
- In the Event That You Have Accidentally Swallowed the Higgs Boson by Michael Rottman – The Morning News – "7. Do you feel protons decaying? Grand Unification may be occurring near your vital organs. "
- FT.com / Companies / UK companies – Dotcom veterans in Twitter ‘brains trust’ – "Mr Read has brought together a “brains trust” of advisers to Twitter Partners, including Brent Hoberman and Martha Lane Fox, founders of Lastminute.com; Saul Klein, a partner at Index Ventures, the London venture capitalists; and Toby Coppel, the former European vice-president at Yahoo."
- byteonic.com » What you cannot do using Java in Google App Engine – List of some restrictions on Java code running on GAE
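For the IP geolocation link above, a small sketch of how a range-based lookup typically works: convert the address to an integer, then find the range that contains it. The row layout (range start, range end, country, city) and the sample rows are assumptions for illustration, not the actual schema or data of Marc-Andre Caron's dataset, and the lookup runs in memory rather than against MySQL.

```python
import bisect
import ipaddress


def ip_to_int(ip_text):
    """Convert a dotted IPv4 address to its integer value."""
    return int(ipaddress.ip_address(ip_text))


# Hypothetical rows (range_start, range_end, country, city), sorted by range_start.
# The addresses come from the reserved documentation ranges.
RANGES = [
    (ip_to_int("192.0.2.0"), ip_to_int("192.0.2.255"), "AA", "Alpha City"),
    (ip_to_int("198.51.100.0"), ip_to_int("198.51.100.255"), "BB", "Beta Town"),
]
STARTS = [row[0] for row in RANGES]


def locate(ip_text):
    """Return (country, city) for the range containing ip_text, or None if no range matches."""
    value = ip_to_int(ip_text)
    # Find the last range whose start is <= value.
    index = bisect.bisect_right(STARTS, value) - 1
    if index < 0:
        return None
    start, end, country, city = RANGES[index]
    if value <= end:
        return country, city
    return None
```

In SQL the same idea is a query for the row whose start and end columns bracket the integer form of the address. The binary search here plays the role of the index.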
site admin | February 25th, 2009 | Comments are closed
These are my links for February 24th through February 25th:
- The C10K problem – On techniques for scaling to a large number of network clients (e.g. >10000).
- Yodel Anecdotal » Blog Archive » Hello, (twitter) world – List of official Yahoo Twitter handles for various activities including research, geo, search, and YUI.
- New AWS Public Data Sets – Economics, DBpedia, Freebase, and Wikipedia – AWS adds Freebase, DBpedia, Wikipedia extract, and US Transportation data sets.
- eigenclass – Related document discovery, without algebra – Another approach to simple related document discovery, based on tags, should work ok for small data sets.
- SVD Recommendation System in Ruby – igvita.com – A 50 line SVD recommendation / collaborative filtering system for a Rails app, with the help of some simple linear algebra.
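For the tag-based related document link above, a minimal sketch of one simple way to do it without linear algebra: score documents by the overlap of their tag sets. This uses Jaccard similarity as a stand-in. I have not checked that it matches the eigenclass approach, and the document names and tags are made up.

```python
def jaccard(tags_a, tags_b):
    """Share of tags in common: |A ∩ B| / |A ∪ B|, or 0.0 if both sets are empty."""
    a, b = set(tags_a), set(tags_b)
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


def related(documents, target, limit=3):
    """Return up to `limit` document names most similar to `target`, best first."""
    target_tags = documents[target]
    scored = []
    for name, tags in documents.items():
        if name == target:
            continue
        score = jaccard(target_tags, tags)
        if score > 0:
            scored.append((score, name))
    # Highest score first; ties broken alphabetically for stable output.
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [name for _, name in scored[:limit]]


# Hypothetical documents and tags.
DOCS = {
    "svd-post": ["ruby", "recommendation", "linear-algebra"],
    "cf-post": ["recommendation", "ruby"],
    "c10k": ["networking", "scaling"],
    "tags-post": ["recommendation", "tags"],
}
```

Calling `related(DOCS, "svd-post")` ranks `"cf-post"` ahead of `"tags-post"` and leaves out `"c10k"`, which shares no tags. Comparing every pair is fine for small data sets, which is the case the link description has in mind.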
The automatic nightly link posts from del.icio.us stopped working properly sometime last year. The links would get posted, but had extra “\n” inserted at every line break. Here’s an example. An unexpected side effect of having “ugly” link posts is that I mostly stopped posting links to del.icio.us for a while.
As part of the recent blog platform update, I’ve switched from the del.icio.us “experimental” nightly blog posting to Postalicious, which seems to be working nicely. You can see the new link post style (and the old ones too, unless I get around to cleaning them up) here.

Here is a visualization of my del.icio.us tags, by Kunal Anand, who’s been collecting del.icio.us tags and turning them into interesting pictures. Here’s the short explanation he sent along:
1. Each dot represents a tag (aka a node)
2. Each line represents an intersection between tags
3. The center of the visualization (denoted by a colored gradient), represents the heavy set of intersections
It appears that I have a fairly consistent set of regularly used tags, and a fairly even distribution of less used tags that intersect with the most common ones.
For comparison, see visualizations of tags from Brad Feld, Tom Coates, and Pete Freitag.
Del.icio.us is testing out private bookmarks now.
I’ve been playing with a private instance of Scuttle ever since del.icio.us was purchased by Yahoo a few months back, but have continued using del.icio.us for posting public links anyway.
My del.icio.us links are automatically posted here (except when one end or the other is out of service for some reason), don’t know if that would include the private ones or not. Also don’t know exactly where the private bookmarks might be visible, aside from in one’s own account. I’ll have to give it a try.

Tagnautica is a fun and interesting Flash user interface for exploring and navigating among tags, in this case on Flickr. After keying in an initial tag, related tags are displayed in a circle, with a sample image from each tag category displayed in a representative size.
When you move the cursor over a tag bubble, it temporarily becomes larger so you can get a look at it. The other bubbles keep resizing as well, giving the interface a very fluid appearance. When you find something you like, you can click on the Tagnautica bubble to view the tag page over at Flickr.
I always enjoy these sorts of user interfaces for semi-random exploration. I’ve noticed that I don’t really use any of the cool visualization tools when I actually want to find something, though. Not sure if that’s because they don’t represent a useful set of questions as implemented yet, or simply because my brain doesn’t work that way.
I find I experience these interfaces more as pleasant interactive art than as useful data navigation tools. One of these days I’m sure something is going to click, though.
Ho John Lee | December 11th, 2005 | 2 comments
Last Friday’s announcement that Yahoo is buying del.icio.us has probably got more than a few people thinking about the future of the service and whether they want to keep using it. In any case, as with all of the interesting and useful web services out there, it’s good to take time now and then to back up your personal data, in case something goes sideways and the service becomes unavailable or unusable for whatever reason.
I’m personally planning on continuing to use del.icio.us, although there are a number of interesting tagged bookmarking alternatives out there, including running your own.
The first step is to get your personal bookmark data, which can be obtained through the del.icio.us API. You can retrieve all your saved bookmarks at del.icio.us/api/posts/all, which will return an XML file that can be saved to your local system and used as a backup or to import your bookmarks into another web application elsewhere.
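Once the XML file is saved, turning it into something reusable is straightforward. Here is a minimal sketch that parses a backup into a list of dictionaries. It assumes the file has a `posts` root element containing `post` elements with `href`, `description`, `tag` (space-separated), and `time` attributes. The sample data is made up, and the actual fetch (an authenticated HTTP request) is left out.

```python
import xml.etree.ElementTree as ET

# Made-up sample in the assumed backup format.
SAMPLE = """<posts user="example">
  <post href="http://example.com/a" description="First link"
        tag="python tagging" time="2005-12-09T10:00:00Z"/>
  <post href="http://example.com/b" description="Second link"
        tag="search" time="2005-12-10T11:30:00Z"/>
</posts>
"""


def parse_backup(xml_text):
    """Parse bookmark XML into a list of dicts with url, title, tags, and time."""
    root = ET.fromstring(xml_text)
    bookmarks = []
    for post in root.iter("post"):
        bookmarks.append({
            "url": post.get("href"),
            "title": post.get("description", ""),
            "tags": post.get("tag", "").split(),  # tags are assumed space-separated
            "time": post.get("time"),
        })
    return bookmarks


bookmarks = parse_backup(SAMPLE)
```

From here the list can be written out in whatever format the next bookmarking tool imports.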
The next step is to decide what you want to do with the data. Some alternative tagged bookmarking solutions include:
The following services are based on open source projects, so you can (or in some cases have to) run your own bookmarking system.
Yahoo already runs MyWeb2.0, which presumably will begin to merge with del.icio.us at some point. It has a lot of interesting features, but hasn’t had enough to get me to switch over up to this point. I’ve been wanting private bookmarks and tags on del.icio.us for a while, although I think I’ll be moving those off my desktop onto a roll-your-own server solution.
Any more suggestions? Reply in the comments and I’ll pull them up to the main post.
Here’s an extensive list of free bookmark managers at lights.com (via David Beisel)
Ho John Lee | December 9th, 2005 | 6 comments
Yahoo continues down the path of more tagging and more collaborative content. Having already purchased Flickr, this morning they’re acquiring del.icio.us (terms undisclosed):
From Joshua Schachter at the del.icio.us blog:
We’re proud to announce that del.icio.us has joined the Yahoo! family. Together we’ll continue to improve how people discover, remember and share on the Internet, with a big emphasis on the power of community. We’re excited to be working with the Yahoo! Search team – they definitely get social systems and their potential to change the web. (We’re also excited to be joining our fraternal twin Flickr!)
From Jeremy Zawodny at Yahoo Search Blog:
And just like we’ve done with Flickr, we plan to give del.icio.us the resources, support, and room it needs to continue growing the service and community. Finally, don’t be surprised if you see My Web and del.icio.us borrow a few ideas from each other in the future.
From Lisa McMillan, an enthusiastic user of all 3 services (comment on the del.icio.us blog):
Yahoo that’s delicious! I live here. I live in flickr. I live at yahoo. This is insane. You deserve this success dude. Just please g-d don’t let me lose my bookmarks I’m practically my own search engine. LOL
Tagged bookmarking sites such as del.icio.us can provide a rich source of input data for developing contextual and topical search. The early adopters that have used del.icio.us up to this point are unlikely to bookmark spam or very uninteresting pages, and the aggregate set of bookmarks and tags is likely to expose clustering of links and related tags which can be used to refine search results by improving estimates of user intent. Individuals are becoming their own search engine in a very personal, narrow way, which could be coupled to general purpose search engines such as Yahoo or Google.
I think Google needs to identify resources it can use to incorporate more user feedback into search results. Looking over the users’ shoulders via AdSense is interesting but inadequate on its own because there are a lot of sites that will never be AdSense publishers. Explicit input capturing the user’s intent, whether through tagging, voting, posting, or publishing, is a strong indication of relevance and interest by that user. I think the basic Google philosophy of letting the algorithm do everything is much more scalable, but it looks like time to capture more human input into the algorithms.
In a recent post, I pointed out some work at Yahoo on computing conditional search ranking based on user intent. The range of topics on del.icio.us tends to be predictably biased, but for the areas that it covers well, I’d be looking for some opportunities to improve search results based on what humans thought was interesting. As far as I know, Google doesn’t have any assets in this space. Maybe Blogger or Orkut, but those are very noisy inputs.
This seems like a great move by Yahoo on multiple fronts, and I am very interested to see how this plays out.
See also:
Update 12-12-2005 12:30 PST: No hard numbers, but something like $10-15MM with earnouts looks plausible. More posts, analysis, and reader comments: Om Malik, John Battelle, Paul Kedrosky.

I’ve added a local tag cosmos, which shows a tag cloud for posts on this site. Unfortunately, I’m also using tags and bookmarks scattered across del.icio.us, Flickr, Technorati, and other services, which aren’t integrated into the cloud, but this provides a different view of what’s been posted here since I’ve started tagging things.
I’m still evolving my personal use of tags. You can see that I’ve started tagging some posts with “web2.0”, although I’ve been reluctant to turn it into a site category. I don’t like the label, but I recognize that it’s the most popular tag for a lot of “new” stuff at the moment. So exposing the tag makes it more findable.
I’ve been debating reducing the number of post categories in favor of using frequently occurring tags for site navigation, so that recurring topics automatically make themselves more visible. It can be difficult to find things here, partly because I’m posting about a lot of different topics and partly because the categories don’t always organize the posts very well.
Tagging on this site is currently implemented using Jerome’s Keywords plugin for WordPress to apply tags to posts and for generating the tag cloud.
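For anyone curious how a tag cloud gets built, here is a minimal sketch of the usual idea: count how often each tag appears across posts, then scale the counts to font sizes. This is not how Jerome’s Keywords does it internally (I haven’t looked). The function name, size range, and sample tags are made up for illustration.

```python
from collections import Counter


def tag_cloud(posts, min_size=10, max_size=28):
    """Map each tag to a font size scaled linearly between min_size and max_size."""
    counts = Counter(tag for tags in posts for tag in tags)
    if not counts:
        return {}
    low, high = min(counts.values()), max(counts.values())
    span = high - low
    sizes = {}
    for tag, count in counts.items():
        if span == 0:
            # Every tag is equally frequent, so use the smallest size for all.
            sizes[tag] = min_size
        else:
            sizes[tag] = round(min_size + (count - low) * (max_size - min_size) / span)
    return sizes


# Each inner list holds the tags for one hypothetical post.
POSTS = [["web2.0", "tagging"], ["tagging", "search"], ["tagging"]]
sizes = tag_cloud(POSTS)
```

Here “tagging” appears in all three posts and gets the largest size, while the tags used once get the smallest. Frequently used tags floating to the top this way is what would make them usable for site navigation.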
Ho John Lee | October 9th, 2005 | 1 comment
My quick notes on trying out Google Reader:
Summary:
- The AJAX user interface is whizzy and fun, and is similar to an e-mail reader.
- Importing feeds is really slow.
- Keyboard navigation shortcuts are great.
- Searching through your own feeds or for new feeds is convenient using Google
- I hate having a single item displayed at a time.
- “Blog This” action is handy, if you use Blogger. They could easily make this go to other blogging services later.
- This could be a good “starter” service for introducing someone to feed readers, but there is no apparent subscription export mechanism, and it doesn’t deal well with organizing a large number of feeds.
More notes:
I started importing the OPML subscription file from Bloglines into Google Reader on Friday evening. I have around 500 subscriptions in that list, and I’m not sure how long it ended up taking to import. It was more than 15 minutes, which was when I headed off to bed, and completed sometime before this afternoon.
I love having keyboard navigation shortcuts. The AJAX-based user interface is zippy and “fun”. Unfortunately, Google Reader displays articles one at a time, a little like reading e-mail. I’m in the habit of scanning sections of the subscription lists to see which sections I want to look at, then scanning and scrolling through lists of articles in Bloglines. Even though this requires mousing and clicking, it’s a lot faster than flashing one article at a time in Google Reader.
I don’t think the current feed organization system works on Google Reader, at least for me. My current (bad) feed groupings from Bloglines show up on Google Reader as “Labels” for groups of feeds, which is nice. It’s hard to just read a set of feeds, though. Postings show up in chronological order, or by relevance. This is totally unusable for a large set of feeds, especially when several of them are high-traffic, low-priority (e.g. Metafilter, del.icio.us, USGS earthquakes). If I could get the “relevance” tuned by context (based on label or tag?) it might be useful.
When you add a new feed, it starts out empty, and appears to add articles only as they are posted. It would be nice to have them start out with whatever Google has cached already. I’m sure I’m not the first subscriber to most of the feeds on my list.
On the positive side, this seems like a good starting point for someone who’s new to feed readers and wants a web-based solution. It looks nice, people have heard of Google, and the default behaviors probably play better with a modest number of feeds. Up to this point, I’ve been steering people to Bloglines, and more recently pointing them at Rojo.
I wish the Bloglines user interface could be revised to make it quicker to get around. I really like keyboard navigation. I can also see some potential in Google Reader’s listing by “relevance” rather than by date, and in improved search and blogging integration. I’m frequently popping up another window to run searches while reading in Bloglines.
Google Reader doesn’t seem like it’s quite what I’m looking for just now, but I’ll keep an eye on it.
Wishful thinking:
I think I want something to manage even more feeds than I have now, but where I’m reading a few regularly, a few articles from a pool of feeds based on “relevance”, and articles from the “neighborhood” of my feeds when they hit some “relevance” criteria. I’d also like to search my pool of identified / tagged feeds, along with some “neighborhood” of feeds and other links. I think a lot of this is about establishing context, intent, and some sort of “authoritativeness”, to augment the usual search keyword matching.