|
|
Ho John Lee | February 11th, 2010 | Comments are closed
These are my links for February 4th through February 11th:
- Schneier on Security: Interview with a Nigerian Internet Scammer – "We had something called the recovery approach. A few months after the original scam, we would approach the victim again, this time pretending to be from the FBI, or the Nigerian Authorities. The email would tell the victim that we had caught a scammer and had found all of the details of the original scam, and that the money could be recovered. Of course there would be fees involved as well. Victims would often pay up again to try and get their money back."
- xkcd – Frequency of Strip Versions of Various Games – n = Google hits for "strip <game name>" / Google hits for "<game name>"
- PeteSearch: How to split up the US – Visualization of social network clusters in the US. "information by location, with connections drawn between places that share friends. For example, a lot of people in LA have friends in San Francisco, so there's a line between them.
Looking at the network of US cities, it's been remarkable to see how groups of them form clusters, with strong connections locally but few contacts outside the cluster. For example Columbus, OH and Charleston WV are nearby as the crow flies, but share few connections, with Columbus clearly part of the North, and Charleston tied to the South."
- Redis: Lightweight key/value Store That Goes the Extra Mile | Linux Magazine – Sort of like memcache. "Calling redis a key/value store doesn’t quite due it justice. It’s better thought of as a “data structures” server that supports several native data types and operations on them. That’s pretty much how creator Salvatore Sanfilippo (known as antirez) describes it in the documentation. Let’s dig in and see how it works."
- Op-Ed Contributor – Microsoft’s Creative Destruction – NYTimes.com – Unlike other companies, Microsoft never developed a true system for innovation. Some of my former colleagues argue that it actually developed a system to thwart innovation. Despite having one of the largest and best corporate laboratories in the world, and the luxury of not one but three chief technology officers, the company routinely manages to frustrate the efforts of its visionary thinkers.
Ho John Lee | January 20th, 2010 | Comments are closed
These are my links for January 17th through January 20th:
- PG&E Electrical System Outage Map – This map shows the current outages in our 70,000-square-mile service area. To see more details about an outage, including the cause and estimated time of restoration, click on the color-coded icon associated with that outage.
- Twitter.com vs The Twitter Ecosystem – Fred Wilson comments on some data from John Borthwick indicating Twitter ecosystem use = 3-5x Twitter.com directly.
"John's chart estimates that Twitter.com is about 20mm uvs a month in the US (comScore has it at 60mm uvs worldwide) and the Twitter ecosystem at about 60mm uvs in the US.
That says that across all web services, not just AVC, the Twitter ecosystem is about 3x Twitter.com. And on this blog, whose audience is certainly power users, that ratio is 5x."
- Chris Walshaw :: Research :: Partition Archive – Welcome to the University of Greenwich Graph Partitioning Archive. The archive consists of the best partitions found to date for a range of graphs and its aim is to provide a benchmark, against which partitioning algorithms can be tested, and a resource for experimentation.
The partition archive has been in operation since the year 2000 and includes results from most of the major graph partitioning software packages. Researchers developing experimental partitioning algorithms regularly submit new partitions for possible inclusion.
Most of the test graphs arise from typical partitioning applications, although the archive also includes results computed for a graph-colouring test suite [Wal04] contained in a separate annex.
The archive was originally set up as part of a research project into very high quality partitions and authors wishing to refer to the partitioning archive should cite the paper [SWC04].
- Twitter’s Crawl « The Product Guy – "A list of incidents that affected the Page Load Time of the Twitter product, distinguishing between total downtime, and partial downtime and information inaccessibility, based upon the public posts on Twitters blog.
http://status.twitter.com/archive
I did my best to not double count any problems, but it was difficult since many of the problems occur so frequently, and it is often difficult to distinguish, from these status blog posts alone, between a persisting problem being experienced or fixed, from that of a new emergence of a similar or same problem. Furthermore, I also excluded the impact on Page Load Time arising from scheduled maintenance/downtime – periods of time over which the user expectation would be most aligned with the product’s promise of Page Load Time. "
- Soundboard.com – Soundboard.com is the web's largest catalog of free sounds and soundboards – in over 20 categories, for mobile or PC. 252,858 free sounds on 17,171 soundboards from movies to sports, sound effects, television, celebrities, history and travel. Or build, customize, embed and manage your own
Ho John Lee | January 18th, 2010 | Comments are closed
These are my links for December 31st through January 17th:
- Khan Academy – The Khan Academy is a not-for-profit organization with the mission of providing a high quality education to anyone, anywhere.
We have 1000+ videos on YouTube covering everything from basic arithmetic and algebra to differential equations, physics, chemistry, biology and finance which have been recorded by Salman Khan.
- StarCraft AI Competition | Expressive Intelligence Studio – AI bot warfare competition using a hacked API to run StarCraft, will be held at AIIDE2010 in October 2010.
The competition will use StarCraft Brood War 1.16.1. Bots for StarCraft can be developed using the Broodwar API, which provides hooks into StarCraft and enables the development of custom AI for StarCraft. A C++ interface enables developers to query the current state of the game and issue orders to units. An introduction to the Broodwar API is available here. Instructions for building a bot that communicates with a remote process are available here. There is also a Forum. We encourage submission of bots that make use of advanced AI techniques. Some ideas are:
* Planning
* Data Mining
* Machine Learning
* Case-Based Reasoning
- Measuring Measures: Learning About Statistical Learning – A "quick start guide" for statistical and machine learning systems, good collection of references.
- Berkowitz et al : The use of formal methods to map, analyze and interpret hawala and terrorist-related alternative remittance systems (2006) – Berkowitz, Steven D., Woodward, Lloyd H., & Woodward, Caitlin. (2006). Use of formal methods to map, analyze and interpret hawala and terrorist-related alternative remittance systems. Originally intended for publication in updating the 1988 volume, eds., Wellman and Berkowitz, Social Structures: A Network Approach (Cambridge University Press). Steve died in November, 2003. See Barry Wellman’s “Steve Berkowitz: A Network Pioneer has passed away,” in Connections 25(2), 2003. It has not been possible to add the updating of references or of the quality of graphics that might have been possible if Berkowitz were alive. An early version of the article appeared in the Proceedings of the Session on Combating Terrorist Networks: Current Research in Social Network Analysis for the New War Fighting Environment. 8th International Command and Control Research and Technology Symposium. National Defense University, Washington, D.C June 17-19, 2003
- SSH Tunneling through web filters | s-anand.net – Step by step tutorial on using Putty and an EC2 instance to set up a private web proxy on demand.
- PyDroid GUI automation toolkit – GitHub – What is Pydroid?
Pydroid is a simple toolkit for automating and scripting repetitive tasks, especially those involving a GUI, with Python. It includes functions for controlling the mouse and keyboard, finding colors and bitmaps on-screen, as well as displaying cross-platform alerts.
Why use Pydroid?
* Testing a GUI application for bugs and edge cases
o You might think your app is stable, but what happens if you press that button 5000 times?
* Automating games
o Writing a script to beat that crappy flash game can be so much more gratifying than spending hours playing it yourself.
* Freaking out friends and family
o Well maybe this isn't really a practical use, but…
- Time Series Data Library – More data sets – "This is a collection of about 800 time series drawn from many different fields.Agriculture Chemistry Crime Demography Ecology Finance Health Hydrology Industry Labour Market Macro-Economics Meteorology Micro-Economics Miscellaneous Physics Production Sales Simulated series Sport Transport & Tourism Tree-rings Utilities"
- How informative is Twitter? » SemanticHacker Blog – "We undertook a small study to characterize the different types of messages that can be found on Twitter. We downloaded a sample of tweets over a two-week period using the Twitter streaming API. This resulted in a corpus of 8.9 million messages (”tweets”) posted by 2.6 million unique users. About 2.7 million of these tweets, or 31%, were replies to a tweet posted by another user, while half a million (6%) were retweets. Almost 2 million (22%) of the messages contained a URL."
- Gremlin – a Turing-complete, graph-based programming language – GitHub – Gremlin is a Turing-complete, graph-based programming language developed in Java 1.6+ for key/value-pair multi-relational graphs known as property graphs. Gremlin makes extensive use of the XPath 1.0 language to support complex graph traversals. This language has applications in the areas of graph query, analysis, and manipulation. Connectors exist for the following data management systems:
* TinkerGraph in-memory graph
* Neo4j graph database
* Sesame 2.0 compliant RDF stores
* MongoDB document database
The documentation for Gremlin can be found at this location. Finally, please visit TinkerPop for other software products.
- The C Programming Language: 4.10 – by Kernighan & Ritchie & Lovecraft – void Rlyeh
(int mene[], int wgah, int nagl) {
int Ia, fhtagn;
if (wgah>=nagl) return;
swap (mene,wgah,(wgah+nagl)/2);
fhtagn = wgah;
for (Ia=wgah+1; Ia<=nagl; Ia++)
if (mene[Ia]<mene[wgah])
swap (mene,++fhtagn,Ia);
swap (mene,wgah,fhtagn);
Rlyeh (mene,wgah,fhtagn-1);
Rlyeh (mene,fhtagn+1,nagl);
} // PH'NGLUI MGLW'NAFH CTHULHU!
- How to convert email addresses into name, age, ethnicity, sexual orientation – This is so Meta – "Save your email list as a CSV file (just comma separate those email addresses). Upload this file to your facebook account as if you wanted to add them as friends. Voila, facebook will give you all the profiles of all those users (in my test, about 80% of my email lists have facebook profiles). Now, click through each profile, and because of the new default facebook settings, which makes all information public, about 95% of the user info is available for you to harvest."
- Microsoft Security Development Lifecycle (SDL): Tools Repository – A collection of previously internal-only security tools from Microsoft, including anti-xss, fuzz test, fxcop, threat modeling, binscope, now available for free download.
- Analytics X Prize – Home – Forecast the murder rate in Philadelphia – The Analytics X Prize is an ongoing contest to apply analytics, modeling, and statistics to solve the social problems that affect our cities. It combines the fields of statistics, mathematics, and social science to understand the root causes of dysfunction in our neighborhoods. Understanding these relationships and discovering the most highly correlated variables allows us to deploy our limited resources more effectively and target the variables that will have the greatest positive impact on improvement.
- PeteSearch: How to find user information from an email address – FindByEmail code released as open-source. You pass it an email address, and it queries 11 different public APIs to discover what information those services have on the user with that email address.
- Measuring Measures: Beyond PageRank: Learning with Content and Networks – Conclusion: learning based on content and network data is the current state of the art There is a great paper and talk about personalization in Google News they use content for this purpose, and then user click streams to provide personalization, i.e. recommend specific articles within each topical cluster. The issue is content filtering is typically (as we say in research) "way harder." Suppose you have a social graph, a bunch of documents, and you know that some users in the social graph like some documents, and you want to recommend other documents that you think they will like. Using approaches based on Networks, you might consider clustering users based on co-visitaion (they have co-liked some of the documents). This scales great, and it internationalizes great. If you start extracting features from the documents themselves, then what you build for English may not work as well for the Chinese market. In addition, there is far more data in the text than there is in the social graph
- mikemaccana’s python-docx at master – GitHub – MIT-licensed Python library to read/write Microsoft Word docx format files. "The docx module reads and writes Microsoft Office Word 2007 docx files. These are referred to as 'WordML', 'Office Open XML' and 'Open XML' by Microsoft. They can be opened in Microsoft Office 2007, Microsoft Mac Office 2008, OpenOffice.org 2.2, and Apple iWork 08. The module was created when I was looking for a Python support for MS Word .doc files, but could only find various hacks involving COM automation, calling .net or Java, or automating OpenOffice or MS Office."
site admin | May 31st, 2009 | Comments are closed
These are my links for May 30th through May 31st:
- Scaling Twitter: Making Twitter 10000 Percent Faster | High Scalability – Collection of links to presentations and interviews regarding Twitter's architecture, implementation plans, and performance issues, from spring 2009.
- The Last Psychiatrist: The Difference Between An Amateur, A Scientist, And A Genius – An amateur is full of wonder and speculation, tinkering towards the truth but suffering from a lack of knowledge and idleness; he's not even sure if someone else has already made these discoveries. "Is this a worthwhile pursuit?"
A scientist performs experiments to confirm or disprove a hypothesis, and in that way he grinds out the truth.
A genius has three abilities, which are actually the union of amateur and scientist: 1. to know the state of the art, what is known and what is not known. 2. To be able to think "out of the box". 3. To be disciplined enough to concentrate on the tedium of a formal investigation of his wondrous speculations.
- PatchMatch: A Randomized Correspondence Algorithm for Structural Image Editing – Research paper on sort of "super healing brush" for manipulating digital images, allows splicing together different sections of the image and automatically selecting similar textures to make the seam transitions work better.
- Light Blue Touchpaper » Blog Archive » Attack of the Zombie Photos – Social networking and sharing sites have challenges implementing and managing access control policies at large scale, and content delivery networks add another wrinkle.
- Map of all Google data center locations | Royal Pingdom – Where in the world is your search being served from? An attempt to assemble a list of known Google data centers worldwide.
site admin | May 28th, 2009 | Comments are closed
These are my links for May 24th through May 27th:
- Formulas and game mechanics – WoWWiki – Your guide to the World of Warcraft – Formulas and game mechanics rules and guidelines for developing role playing games
- Manchester United’s Park Has the Endurance to Persevere – NYTimes.com – Korean soccer player Park Ji-Sung – On Wednesday night in Rome, Park is expected to become the first Asian player to participate in the European Champions League final when Manchester United faces Barcelona.
- mloss.org – Machine Learning Open Source Software – Big collection of open source packages for machine learning, data mining, statistical analysis
- The Datacenter as Computer – Luiz André Barroso and Urs Hölzle 2009 (PDF) – 120 pages on large scale computing lessons from Google. "These new large datacenters are quite different from traditional hosting facilities of earlier times and cannot be viewed simply as a collection of co-located servers. Large portions of the hardware and software resources in these facilities must work in concert to efficiently deliver good levels of Internet service performance, something that can only be achieved by a holistic approach to their design and deployment. In other words, we must treat the datacenter itself as one massive warehouse-scale computer (WSC). We describe the architecture of WSCs, the main factors influencing their design, operation, and cost structure, and the characteristics of their software base."
- Geeking with Greg: The datacenter is the new mainframe – Pointer to a paper by Googlers Luiz Andre Barroso and Urs Holzle on the evolution of warehouse scale computing and the management and use of computing resources in a contemporary datacenter.
site admin | May 15th, 2009 | Comments are closed
These are my links for May 14th through May 15th:
- Congratulations, Google staff: $210k in profit per head in 2008 | Royal Pingdom – Google had $209,624 in profit per employee in 2008, which beats all the other large tech companies we looked at, including big hitters like Microsoft ($194K), Apple ($151K), Intel ($64K) and IBM ($30K).
- Statistical Data Mining Tutorials – A nice collection of presentations reviewing topics in data mining and machine learning. e.g. "HillClimbing, Simulated Annealing and Genetic Algorithms. Some very useful algorithms, to be used only in case of emergency." These include classification algorithms such as decision trees, neural nets, Bayesian classifiers, Support Vector Machines and cased-based (aka non-parametric) learning. They include regression algorithms such as multivariate polynomial regression, MARS, Locally Weighted Regression, GMDH and neural nets. And they include other data mining operations such as clustering (mixture models, k-means and hierarchical), Bayesian networks and Reinforcement Learning.
- Dare Obasanjo aka Carnage4Life – Why Twitter’s Engineers Hate the @replies feature – Looking at the infrastructure overhead required for Twitter's attempted change to @reply behavior.
- Scratch Helps Kids Get With the Program – Gadgetwise Blog – NYTimes.com – On my candidate list for 7th grade introductory programming and analysis. "Scratch, an M.I.T.-developed computer-programming language for children, is the focus of worldwide show-and-tell sessions this Saturday. "
- jLinq – Javascript Query Language – For manipulating data sets in Javascript, sort of like jQuery
site admin | May 5th, 2009 | Comments are closed
These are my links for May 4th through May 5th:
- Influential Nodes in a Diffusion Model for Social Networks (icalp05-inf.pdf) – Kempe, Kleinberg, Tardos. Algorithm for greedy approximation of most influential nodes in social network (63% of optimal) under various conditions.
- Maximizing the Spread of Influence through a Social Network (kdd03-inf.pdf) – Kempe, Kleinberg, Tardos. Maximizing propagation by selecting most influential nodes is NP-hard, but a greedy approximation can work well (63% of optimal) under various conditions.
- Notification Strategies for Social Networks – Discussion on approaches to maximizing use of a limited number of notifications within social networks e.g. Facebook
- James Smith • loopj.com » Blog Archive » jQuery Plugin: Tokenizing Autocomplete Text Entry – Looks handy – "This is a jQuery plugin to allow users to select multiple items from a predefined list, using autocompletion as they type to find each item. You may have seen a similar type of text entry when filling in the recipients field sending messages on facebook."
- Google Code FAQ – Using cURL to interact with Google data services – Step by step tutorial on using curl with Google data APIs.
- Behind The Business Plan Of Pirates Inc. : NPR – It takes around $250K to fund a Somali pirate operation. About 20 percent goes to pay off officials who look the other way. About 50 percent is for expenses and payroll. The leader of an attack makes $10,000 to $20,000 (the average Somali family lives on $500 a year). The initial investor — who put in $250,000 of seed capital — gets 30 percent, sometimes up to $500,000.
site admin | April 27th, 2009 | Comments are closed
These are my links for April 24th through April 27th:
site admin | April 12th, 2009 | Comments are closed
These are my links for April 12th from 17:02 to 19:13:
site admin | April 10th, 2009 | Comments are closed
These are my links for April 9th through April 10th:
site admin | April 9th, 2009 | Comments are closed
These are my links for April 9th from 08:07 to 17:53:
- IP address geolocation SQL database – IP address geolocation with MySQL by Marc-Andre Caron. He's done all the necessary legwork to solve this problem, putting together a free, monthly-updated MySQL dataset that will allow you to derive country, region, city, zip, latitude, and longitude from an IP address.
- Del.icio.us Finally Gets Some Respect from Yahoo – Probably Too Late – ReadWriteWeb –
- In the Event That You Have Accidentally Swallowed the Higgs Boson by Michael Rottman – The Morning News – "7. Do you feel protons decaying? Grand Unification may be occurring near your vital organs. "
- FT.com / Companies / UK companies – Dotcom veterans in Twitter ‘brains trust’ – "Mr Read has brought together a “brains trust” of advisers to Twitter Partners, including Brent Hoberman and Martha Lane Fox, founders of Lastminute.com; Saul Klein, a partner at Index Ventures, the London venture capitalists; and Toby Coppel, the former European vice-president at Yahoo."
- byteonic.com » What you cannot do using Java in Google App Engine – List of some restrictions on Java code running on GAE
site admin | April 9th, 2009 | Comments are closed
These are my links for April 7th through April 9th:
site admin | March 8th, 2009 | Comments are closed
These are my links for March 6th through March 8th:
- Wolfram Blog : Wolfram|Alpha Is Coming! –
- Wolfram Alpha is Coming — and It Could be as Important as Google | Twine –
- Wolfram Alpha — it’s like plugging into an electronic brain » VentureBeat –
- If browsers were women – Sharenator.org – "[Chrome] Extremely skinny, but very cool and friendly. However, when it comes to the bedroom, she is very inexperienced and has little to offer. [IE] For most, she's the first woman they tried. She's really easy but can get you infected." etc etc
- Rough Type: Nicholas Carr’s Blog: The coming of the megacomputer – Nick Carr commentary on Rick Rashid's statement that 20% of servers were going to major cloud data centers. Also some interesting discussion in comments.
- FT.com | Tech Blog | How many computers does the world need? – According to Microsoft research chief Rick Rashid, around 20 per cent of all the servers sold around the world each year are now being bought by a small handful of internet companies – he named Microsoft, Google, Yahoo and Amazon.
- The New Hot Cuisine: Korean – WSJ.com – Korean food is slowly making its way into mainstream awareness, both high end (French Laundry, Le Bernardin) and everyday (CPK, Kogi BBQ).
- WriteOnIt – Fake pictures – Build fake magazine covers, newspapers, and photos.
site admin | March 6th, 2009 | Comments are closed
These are my links for March 4th through March 6th:
- Welcome to VIPERdb – Scripps – VIPERdb is a database for icosahedral virus capsid structures . The emphasis of the resource is on providing data from structural and computational analyses on these systems, as well as high quality renderings for visual exploration.
- Virus images at VIPERdb – If you have ever wanted to make beautiful images of viruses, in colors of your choice, then go to VIPERdb, the virus particle explorer.
- Reverse HTTP – IETF draft-lentczner-rhttp-00.txt – Formal description of the reverse HTTP proposal for initiating connections through firewalls then reversing server and client roles.
- Reverse HTTP – Second Life Wiki – Experimental protocol which takes advantage of the HTTP/1.1 Upgrade: header to turn one HTTP socket around. When a client makes a request to a server with the Upgrade: PTTH/0.9 header, the server may respond with an Upgrade: PTTH/1.0 header, after which point the server starts using the socket as a client, and the client starts using the socket as a server.
- WTFs/m – The only valid measurement of code quality, WTFs/min
site admin | February 26th, 2009 | Comments are closed
These are my links for February 26th from 10:39 to 20:05:
site admin | February 25th, 2009 | Comments are closed
These are my links for February 24th through February 25th:
- The C10K problem – On techniques for scaling to large number of network clients (e.g. >10000).
- Yodel Anecdotal » Blog Archive » Hello, (twitter) world – List of official Yahoo twitter handles for various activities including research, geo, search, and yui.
- New AWS Public Data Sets – Economics, DBpedia, Freebase, and Wikipedia – AWS adds Freebase, DBPedia, Wikipedia extract, and US Transportation data sets.
- eigenclass – Related document discovery, without algebra – Another approach to simple related document discovery, based on tags, should work ok for small data sets.
- SVD Recommendation System in Ruby – igvita.com – A 50 line SVD recommendation / collaborative filtering system for a Rails app. with the help of some simple linear algebra.
site admin | February 21st, 2009 | Comments are closed
These are my links for February 21st from 13:59 to 21:55:
- Non Sequitur — Gocomics.com – "Hi. My name is Bob, and I'm a Twitter addict…"
- A Tutorial on Support Vector Machines for Pattern Recognition – Christopher J.C. Burges (PDF) – Appeared in: Data Mining and Knowledge Discovery 2, 121-167, 1998. The tutorial starts with an overview of the concepts of VC dimension and structural risk
minimization. We then describe linear Support Vector Machines (SVMs) for separable and non-separable
data, working through a non-trivial example in detail. We describe a mechanical analogy, and discuss
when SVM solutions are unique and when they are global. We describe how support vector training can
be practically implemented, and discuss in detail the kernel mapping technique which is used to construct
SVM solutions which are nonlinear in the data. We show how Support Vector machines can have very large
(even infinite) VC dimension by computing the VC dimension for homogeneous polynomial and Gaussian
radial basis function kernels. While very high VC dimension would normally bode ill for generalization
performance, there are several arguments which support the observed high accuracy of SVMs,
which we review.
- Data Mining Research – dataminingblog.com: Data Miners on Twitter – A list of data mining people on twitter.
- YouTube – The Crisis of Credit Visualized – Part 1 – Nice animated video attempting to present a simplified explanation of the credit crisis and the relationship between home mortgage lending, bank leverage, and risk.
- “10 Obstacles to Cloud Computing” by UC Berkeley & How GoGrid Hurdles Them | GoGrid Blog – Another commentary on the recent UCB cloud computing overview paper
site admin | February 21st, 2009 | Comments are closed
These are my links for February 20th through February 21st:
- xkcd – A Webcomic – Online Communities – A map of online communities (circa 2007?)
- State of OpenSocial – weekend Apps Feb 20 2009 – Google Docs – Kevin Marks overview of OpenSocial as of February 2009.
- Massive Scrape of Twitter’s Friend Graph « blog.infochimps.org – Sample dataset for research on social graphs. "The infochimps have gathered a massive scrape of the Twitter friend graph. Right now it weighs in at about 2.7M users, 10M tweets, 58M edges."
- getting theinfo: data sets (theinfo) – Another list of publicly accessible data collections online
- Some Datasets Available on the Web » Data Wrangling Blog – List of many research datasets and resources related to data analysis available online, last updated February 2009.
- ICWSM 2009 – International AAAI Conference on Weblogs and Social Media – May 17 – 20, 2009, San Jose, California. This interdisciplinary conference brings together researchers and industry leaders interested in creating and analyzing social media. Past conferences have included technical papers from areas such as computer science, linguistics, psychology, statistics, sociology, multimedia and semantic web technologies.
site admin | February 19th, 2009 | Comments are closed
These are my links for February 18th through February 19th:
- Single Google Query uses 1000 Machines in 0.2 seconds – Google Fellow Jeff Dean says from 1999-2009, while both search queries and processing power have gone up by a factor of 1000, latency has gone down from around 1000ms to 200ms. Crawler updates now take minutes compared to months in 1999. 1000 machines handle a single query, all in memory.
- Government 2.0: Tweeting the Talk, Walking the Walk « Adriel Hampton – List of twitter users in various government organizations.
- The Absurdly Artificial Divide Between Pure and Applied Research – Olivia Judson – NYTimes.com – I used to explain myself as an "applied research" guy, small "r", not big "R" pure research. Love theory and analysis but want to see it get used for something eventually.
- Amazon Web Services Developer Community : Load data into S3 via hard drives? – Amazon asks for feedback regarding the FedEx option for bulk data transfer. "We have heard a number of requests about sending hard drives to AWS to load into S3. If such a service would benefit your business, we’d like to learn more about your use case."
- Local Media in a Postmodern World, Part XCI, Advertising Loses Its Balance – On the shifts in supply and demand, buyers and sellers in advertising markets as media moves from 1-to-many to niche-oriented, many-to-many and sellers take control of their own online media and advertising campaigns
site admin | February 16th, 2009 | Comments are closed
These are my links for February 15th through February 16th:
- Berkeley cloud report gets mixed reviews | The Wisdom of Clouds – CNET News – James Urqhardt commentary on UCB paper, "The paper begins by setting a definition of Cloud Computing that will be considered controversial by many, as it is firmly in the "there is no cloud computing inside enterprise data centers" camp."
- Above the Clouds: Above the Clouds Released – UC Berkeley RAD Lab starts a new blog and publishes their take on the state of cloud computing.
- Forget Dunbar’s Number, Our Future Is in Scoble’s Number « I’m Not Actually a Geek – A look at changing interaction styles enabled by growing use of online social networks and applications. "If Dunbar’s Number is defined at 150 connections, perhaps we can term the looser connection of thousands as Scoble’s Number. "
- What really happened at Ma.gnolia and lessons learned – Video podcast with Larry Halff describing how Ma.gnolia was implemented (Ruby on Rails), its ongoing operation leading up to the failure of the (1/2 TB) MySQL database a few weeks ago.
- Infrastructure for Modern Web Sites « random($foo) – An overview of packages, services, and approaches for building web systems, circa January 2009. With assorted comments.
- Online Mind Mapping – MindMeister – Web-based, embeddable mind mapping software, sort of like MindJet, wiki-style collaborative editing.
- Jean-Lou Dupont’s WEBlog: Cloud Computing Mind Map – A mind map of companies and projects in the cloud computing space.
|
|