Ho John Lee | January 23rd, 2010 | Comments are closed
These are my links for January 20th through January 23rd:
- Data.gov – Featured Datasets: Open Government Directive Agency – Datasets required under the Open Government Directive through the end of the day, January 22, 2010. Freedom of Information Act request logs, Treasury TARP and derivative activity logs, crime, income, agriculture datasets.
- All Your Twitter Bot Needs Is Love – The bot’s name? Jason Thorton. He’s been humming along for months now, sending out over 1250 tweets to some 174 followers. His tweets, while not particularly creative, manage to be both believable and timely. And he’s powered by a single word: Love.
Thorton is the creation of developer Ryan Merket, who built him as a side project in around three hours. Merket has just posted the code that powers him, and has also divulged how he made Thorton seem somewhat realistic: the bot looks for tweets with the word “love” in them and tweets them as its own.
- Building a Twitter Bot – "Meet Jason Thorton. To people who know Jason, he is a successful entrepreneur in San Francisco who tweets 4-5 times a day. But Jason has a secret, he’s not really a human, he’s the product of my simple algorithm in PHP
Jason tweets A LOT about the word “love” – that’s because Jason actually steals tweets from the public timeline that contain the word “love” and posts them as his own
Jason also @replies to people who use the word “love” in their tweets, and asks them random questions or says something arbitrary
It took me about 3 hours to code Jason, imagine what a real engineer could do with real AI algorithms? Now realize that it’s already a reality. Sites like Twitter are full of side projects, company initiatives, spambots and AI robots. When the free flow of information becomes open, the amount of disinformation increases. There’s a real need for someone to vet the people we ‘meet’ on social sites – will be interesting to see how this market grows in the next year."
- Website monitoring status – Public API Status – Health monitor for 26 APIs from popular Web services, including Google Search, Google Maps, Bing, Facebook, Twitter, SalesForce, YouTube, Amazon, eBay and others
- PG&E Electrical System Outage Map – This map shows the current outages in our 70,000-square-mile service area. To see more details about an outage, including the cause and estimated time of restoration, click on the color-coded icon associated with that outage.
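The keyword trick described in the two Twitter bot links above (find tweets containing the word "love", repost them, and @reply to their authors) can be sketched in a few lines. This is only an illustration in Python, not Merket's PHP code: the timeline is a plain list of dicts instead of a live Twitter API call, and the names `contains_word`, `plan_actions`, and the canned reply text are all hypothetical.

```python
import random
import re


def contains_word(text, word="love"):
    """True if `word` appears in `text` as a whole word, ignoring case."""
    pattern = r"\b" + re.escape(word) + r"\b"
    return re.search(pattern, text, re.IGNORECASE) is not None


def plan_actions(timeline, word="love",
                 replies=("What do you mean by that?",), rng=random):
    """Decide what the bot would post for a batch of public tweets.

    `timeline` is a list of {"user": ..., "text": ...} dicts (an assumed
    shape, not a real API response). For every matching tweet the bot
    reposts the text as its own and sends the author a canned reply.
    """
    actions = []
    for tweet in timeline:
        if contains_word(tweet["text"], word):
            actions.append(("repost", tweet["text"]))
            reply = "@%s %s" % (tweet["user"], rng.choice(replies))
            actions.append(("reply", reply))
    return actions


sample = [
    {"user": "alice", "text": "I love coffee"},
    {"user": "bob", "text": "gloves are warm"},  # "love" is not a whole word here
]
for action in plan_actions(sample):
    print(action)
```

Matching on a whole word rather than a substring is a choice made for this sketch; the posts only say the bot looks for tweets with the word "love" in them.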
Ho John Lee | January 20th, 2010 | Comments are closed
These are my links for January 17th through January 20th:
- PG&E Electrical System Outage Map – This map shows the current outages in our 70,000-square-mile service area. To see more details about an outage, including the cause and estimated time of restoration, click on the color-coded icon associated with that outage.
- Twitter.com vs The Twitter Ecosystem – Fred Wilson comments on data from John Borthwick indicating that use of the Twitter ecosystem is 3-5x that of Twitter.com itself.
"John's chart estimates that Twitter.com is about 20mm uvs a month in the US (comScore has it at 60mm uvs worldwide) and the Twitter ecosystem at about 60mm uvs in the US.
That says that across all web services, not just AVC, the Twitter ecosystem is about 3x Twitter.com. And on this blog, whose audience is certainly power users, that ratio is 5x."
- Chris Walshaw :: Research :: Partition Archive – Welcome to the University of Greenwich Graph Partitioning Archive. The archive consists of the best partitions found to date for a range of graphs and its aim is to provide a benchmark, against which partitioning algorithms can be tested, and a resource for experimentation.
The partition archive has been in operation since the year 2000 and includes results from most of the major graph partitioning software packages. Researchers developing experimental partitioning algorithms regularly submit new partitions for possible inclusion.
Most of the test graphs arise from typical partitioning applications, although the archive also includes results computed for a graph-colouring test suite [Wal04] contained in a separate annex.
The archive was originally set up as part of a research project into very high quality partitions and authors wishing to refer to the partitioning archive should cite the paper [SWC04].
- Twitter’s Crawl « The Product Guy – "A list of incidents that affected the Page Load Time of the Twitter product, distinguishing between total downtime, and partial downtime and information inaccessibility, based upon the public posts on Twitter’s blog.
http://status.twitter.com/archive
I did my best to not double count any problems, but it was difficult since many of the problems occur so frequently, and it is often difficult to distinguish, from these status blog posts alone, between a persisting problem being experienced or fixed, from that of a new emergence of a similar or same problem. Furthermore, I also excluded the impact on Page Load Time arising from scheduled maintenance/downtime – periods of time over which the user expectation would be most aligned with the product’s promise of Page Load Time. "
- Soundboard.com – Soundboard.com is the web's largest catalog of free sounds and soundboards – in over 20 categories, for mobile or PC. 252,858 free sounds on 17,171 soundboards from movies to sports, sound effects, television, celebrities, history and travel. Or build, customize, embed and manage your own
site admin | May 31st, 2009 | Comments are closed
These are my links for May 30th through May 31st:
- Scaling Twitter: Making Twitter 10000 Percent Faster | High Scalability – Collection of links to presentations and interviews regarding Twitter's architecture, implementation plans, and performance issues, from spring 2009.
- The Last Psychiatrist: The Difference Between An Amateur, A Scientist, And A Genius – An amateur is full of wonder and speculation, tinkering towards the truth but suffering from a lack of knowledge and idleness; he's not even sure if someone else has already made these discoveries. "Is this a worthwhile pursuit?"
A scientist performs experiments to confirm or disprove a hypothesis, and in that way he grinds out the truth.
A genius has three abilities, which are actually the union of amateur and scientist: 1. to know the state of the art, what is known and what is not known. 2. To be able to think "out of the box". 3. To be disciplined enough to concentrate on the tedium of a formal investigation of his wondrous speculations.
- PatchMatch: A Randomized Correspondence Algorithm for Structural Image Editing – Research paper on a sort of "super healing brush" for manipulating digital images. It allows splicing together different sections of an image and automatically selects similar textures so the seam transitions work better.
- Light Blue Touchpaper » Blog Archive » Attack of the Zombie Photos – Social networking and sharing sites have challenges implementing and managing access control policies at large scale, and content delivery networks add another wrinkle.
- Map of all Google data center locations | Royal Pingdom – Where in the world is your search being served from? An attempt to assemble a list of known Google data centers worldwide.
site admin | March 8th, 2009 | Comments are closed
These are my links for March 6th through March 8th:
- Wolfram Blog : Wolfram|Alpha Is Coming!
- Wolfram Alpha is Coming — and It Could be as Important as Google | Twine
- Wolfram Alpha — it’s like plugging into an electronic brain » VentureBeat
- If browsers were women – Sharenator.org – "[Chrome] Extremely skinny, but very cool and friendly. However, when it comes to the bedroom, she is very inexperienced and has little to offer. [IE] For most, she's the first woman they tried. She's really easy but can get you infected." etc etc
- Rough Type: Nicholas Carr’s Blog: The coming of the megacomputer – Nick Carr commentary on Rick Rashid's statement that 20% of servers were going to major cloud data centers. Also some interesting discussion in comments.
- FT.com | Tech Blog | How many computers does the world need? – According to Microsoft research chief Rick Rashid, around 20 per cent of all the servers sold around the world each year are now being bought by a small handful of internet companies – he named Microsoft, Google, Yahoo and Amazon.
- The New Hot Cuisine: Korean – WSJ.com – Korean food is slowly making its way into mainstream awareness, both high end (French Laundry, Le Bernardin) and everyday (CPK, Kogi BBQ).
- WriteOnIt – Fake pictures – Build fake magazine covers, newspapers, and photos.
site admin | March 1st, 2009 | Comments are closed
These are my links for February 28th through March 1st:
- Community Data – Swivel – User contributed datasets, for visualization and graphs with Swivel
- Obamameter – Map visualization of economic stimulus outlays. "Keep tabs on the US economy, the global economy and the stimulus through our dashboard for the economy."
- recovery.gov.pdf – Slide presentation on the data sources and construction of the initial Recovery.gov site in Jan 2009, from a talk at Transparency Camp.
- Virtual Hoff : DoxPara Research – Slides from Dan Kaminsky's talk at CloudCamp Seattle on network and application security issues in cloud and virtualized computing environments.
- Can You Buy a Silicon Valley? Maybe. – from Paul Graham – "If you could get startups to stick to your town for a million apiece, then for a billion dollars you could bring in a thousand startups. That probably wouldn't push you past Silicon Valley itself, but it might get you second place. For the price of a football stadium, any town that was decent to live in could make itself one of the biggest startup hubs in the world."
- Berkshire Hathaway 2008 shareholders letter (PDF) – Warren Buffett reviews the state of the financial markets, his worst year ever, and the outlook for 2009.
- White House 2: Where YOU set the nation’s priorities – Not the actual White House, but an interesting experiment in collaborative input for setting government agenda.
- Python for Lisp Programmers – Peter Norvig examines Python. "(Although it wasn't my intent, Python programers have told me this page has helped them learn Lisp.) Basically, Python can be seen as a dialect of Lisp with "traditional" syntax (what Lisp people call "infix" or "m-lisp" syntax). One message on comp.lang.python said "I never understood why LISP was a good idea until I started playing with python." Python supports all of Lisp's essential features except macros, and you don't miss macros all that much because it does have eval, and operator overloading, and regular expression parsing, so you can create custom languages that way. "
site admin | February 26th, 2009 | Comments are closed
These are my links for February 25th through February 26th:
site admin | February 25th, 2009 | Comments are closed
These are my links for February 24th through February 25th:
- The C10K problem – On techniques for scaling to a large number of network clients (e.g. more than 10,000).
- Yodel Anecdotal » Blog Archive » Hello, (twitter) world – List of official Yahoo Twitter handles for various activities including research, geo, search, and yui.
- New AWS Public Data Sets – Economics, DBpedia, Freebase, and Wikipedia – AWS adds Freebase, DBpedia, Wikipedia extract, and US Transportation data sets.
- eigenclass – Related document discovery, without algebra – Another approach to simple related document discovery, based on tags, should work ok for small data sets.
- SVD Recommendation System in Ruby – igvita.com – A 50-line SVD recommendation / collaborative filtering system for a Rails app, built with the help of some simple linear algebra.
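As a rough illustration of the tag-based idea in the eigenclass entry above, related documents can be ranked by how much their tag sets overlap. The sketch below uses Jaccard similarity, which is an assumption chosen for simplicity and not necessarily the measure used in the linked article; `jaccard`, `related`, and the sample tags are hypothetical names and data.

```python
def jaccard(a, b):
    """Jaccard similarity of two tag collections: |A & B| / |A | B|."""
    a, b = set(a), set(b)
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


def related(doc_id, tags_by_doc, top_n=3):
    """Return up to `top_n` other documents, most tag overlap first.

    Documents that share no tags with `doc_id` are left out. Ties are
    broken by document id so the ordering is deterministic.
    """
    target = tags_by_doc[doc_id]
    scored = []
    for other, tags in tags_by_doc.items():
        if other == doc_id:
            continue
        score = jaccard(target, tags)
        if score > 0:
            scored.append((score, other))
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [other for _, other in scored[:top_n]]


docs = {
    "a": ["ruby", "svd"],
    "b": ["ruby", "rails"],
    "c": ["python"],
    "d": ["ruby", "svd", "math"],
}
print(related("a", docs))
```

Comparing every pair of documents like this is quadratic in the number of documents, which fits the entry's caveat that the approach suits small data sets.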
site admin | February 24th, 2009 | Comments are closed
These are my links for February 23rd through February 24th:
site admin | February 21st, 2009 | Comments are closed
These are my links for February 20th through February 21st:
- xkcd – A Webcomic – Online Communities – A map of online communities (circa 2007?)
- State of OpenSocial – weekend Apps Feb 20 2009 – Google Docs – Kevin Marks's overview of OpenSocial as of February 2009.
- Massive Scrape of Twitter’s Friend Graph « blog.infochimps.org – Sample dataset for research on social graphs. "The infochimps have gathered a massive scrape of the Twitter friend graph. Right now it weighs in at about 2.7M users, 10M tweets, 58M edges."
- getting theinfo: data sets (theinfo) – Another list of publicly accessible data collections online
- Some Datasets Available on the Web » Data Wrangling Blog – List of many research datasets and resources related to data analysis available online, last updated February 2009.
- ICWSM 2009 – International AAAI Conference on Weblogs and Social Media – May 17 – 20, 2009, San Jose, California. This interdisciplinary conference brings together researchers and industry leaders interested in creating and analyzing social media. Past conferences have included technical papers from areas such as computer science, linguistics, psychology, statistics, sociology, multimedia and semantic web technologies.
site admin | February 17th, 2009 | Comments are closed
These are my links for February 16th through February 17th:
- Top 100 Network Security Tools – Many, many security testing and hacking tools.
- FRONTLINE: inside the meltdown: watch the full program – "On Thursday, Sept. 18, 2008, the astonished leadership of the U.S. Congress was told in a private session by the chairman of the Federal Reserve that the American economy was in grave danger of a complete meltdown within a matter of days. "There was literally a pause in that room where the oxygen left," says Sen. Christopher Dodd"
- The Dark Matter of a Startup – "Every successful startup that I have seen has someone within their ranks that just kinda “does stuff.” No one really knows specifically what they do, but it’s vital to the success of the startup."
- Why I Hate Frameworks – "A hammer?" he asks. "Nobody really buys hammers anymore. They're kind of old fashioned…we started selling schematic diagrams for hammer factories, enabling our clients to build their own hammer factories, custom engineered to manufacture only the kinds of hammers that they would actually need."
- Mining The Thought Stream – Lots of comments around what is Twitter good for and how will it make money, revolving around real/near-time search, analytics, marketing, etc.
- Understanding Web Operations Culture – the Graph & Data Obsession … – Comparison of traffic at Flickr, Google, Twitter, last.fm during the Obama inauguration. "One of the most interesting parts of running a large website is watching the effects of unrelated events affecting user traffic in aggregate."