site admin | January 16th, 2010 | Comments are closed
These are my links for June 13th through January 16th:
- StarCraft AI Competition | Expressive Intelligence Studio – AI bot warfare competition using a hacked API to run StarCraft, will be held at AIIDE2010 in October 2010.
The competition will use StarCraft Brood War 1.16.1. Bots for StarCraft can be developed using the Broodwar API, which provides hooks into StarCraft and enables the development of custom AI for StarCraft. A C++ interface enables developers to query the current state of the game and issue orders to units. An introduction to the Broodwar API is available here. Instructions for building a bot that communicates with a remote process are available here. There is also a Forum. We encourage submission of bots that make use of advanced AI techniques. Some ideas are:
* Planning
* Data Mining
* Machine Learning
* Case-Based Reasoning
- Measuring Measures: Learning About Statistical Learning – A "quick start guide" for statistical and machine learning systems, good collection of references.
- Berkowitz et al : The use of formal methods to map, analyze and interpret hawala and terrorist-related alternative remittance systems (2006) – Berkowitz, Steven D., Woodward, Lloyd H., & Woodward, Caitlin. (2006). Use of formal methods to map, analyze and interpret hawala and terrorist-related alternative remittance systems. Originally intended for publication in updating the 1988 volume, eds., Wellman and Berkowitz, Social Structures: A Network Approach (Cambridge University Press). Steve died in November, 2003. See Barry Wellman’s “Steve Berkowitz: A Network Pioneer has passed away,” in Connections 25(2), 2003. It has not been possible to add the updating of references or of the quality of graphics that might have been possible if Berkowitz were alive. An early version of the article appeared in the Proceedings of the Session on Combating Terrorist Networks: Current Research in Social Network Analysis for the New War Fighting Environment. 8th International Command and Control Research and Technology Symposium. National Defense University, Washington, D.C., June 17-19, 2003.
- SSH Tunneling through web filters | s-anand.net – Step-by-step tutorial on using PuTTY and an EC2 instance to set up a private web proxy on demand.
- PyDroid GUI automation toolkit – GitHub – What is Pydroid?
Pydroid is a simple toolkit for automating and scripting repetitive tasks, especially those involving a GUI, with Python. It includes functions for controlling the mouse and keyboard, finding colors and bitmaps on-screen, as well as displaying cross-platform alerts.
Why use Pydroid?
* Testing a GUI application for bugs and edge cases
o You might think your app is stable, but what happens if you press that button 5000 times?
* Automating games
o Writing a script to beat that crappy flash game can be so much more gratifying than spending hours playing it yourself.
* Freaking out friends and family
o Well maybe this isn't really a practical use, but…
- Time Series Data Library – More data sets – "This is a collection of about 800 time series drawn from many different fields. Agriculture, Chemistry, Crime, Demography, Ecology, Finance, Health, Hydrology, Industry, Labour Market, Macro-Economics, Meteorology, Micro-Economics, Miscellaneous, Physics, Production, Sales, Simulated series, Sport, Transport & Tourism, Tree-rings, Utilities"
- How informative is Twitter? » SemanticHacker Blog – "We undertook a small study to characterize the different types of messages that can be found on Twitter. We downloaded a sample of tweets over a two-week period using the Twitter streaming API. This resulted in a corpus of 8.9 million messages ('tweets') posted by 2.6 million unique users. About 2.7 million of these tweets, or 31%, were replies to a tweet posted by another user, while half a million (6%) were retweets. Almost 2 million (22%) of the messages contained a URL."
- Gremlin – a Turing-complete, graph-based programming language – GitHub – Gremlin is a Turing-complete, graph-based programming language developed in Java 1.6+ for key/value-pair multi-relational graphs known as property graphs. Gremlin makes extensive use of the XPath 1.0 language to support complex graph traversals. This language has applications in the areas of graph query, analysis, and manipulation. Connectors exist for the following data management systems:
* TinkerGraph in-memory graph
* Neo4j graph database
* Sesame 2.0 compliant RDF stores
* MongoDB document database
The documentation for Gremlin can be found at this location. Finally, please visit TinkerPop for other software products.
- The C Programming Language: 4.10 – by Kernighan & Ritchie & Lovecraft –
void Rlyeh (int mene[], int wgah, int nagl) {
    int Ia, fhtagn;
    if (wgah>=nagl) return;
    swap (mene,wgah,(wgah+nagl)/2);
    fhtagn = wgah;
    for (Ia=wgah+1; Ia<=nagl; Ia++)
        if (mene[Ia]<mene[wgah])
            swap (mene,++fhtagn,Ia);
    swap (mene,wgah,fhtagn);
    Rlyeh (mene,wgah,fhtagn-1);
    Rlyeh (mene,fhtagn+1,nagl);
} // PH'NGLUI MGLW'NAFH CTHULHU!
- How to convert email addresses into name, age, ethnicity, sexual orientation – This is so Meta – "Save your email list as a CSV file (just comma separate those email addresses). Upload this file to your facebook account as if you wanted to add them as friends. Voila, facebook will give you all the profiles of all those users (in my test, about 80% of my email lists have facebook profiles). Now, click through each profile, and because of the new default facebook settings, which makes all information public, about 95% of the user info is available for you to harvest."
- Microsoft Security Development Lifecycle (SDL): Tools Repository – A collection of previously internal-only security tools from Microsoft, including anti-xss, fuzz test, fxcop, threat modeling, binscope, now available for free download.
- Analytics X Prize – Home – Forecast the murder rate in Philadelphia – The Analytics X Prize is an ongoing contest to apply analytics, modeling, and statistics to solve the social problems that affect our cities. It combines the fields of statistics, mathematics, and social science to understand the root causes of dysfunction in our neighborhoods. Understanding these relationships and discovering the most highly correlated variables allows us to deploy our limited resources more effectively and target the variables that will have the greatest positive impact on improvement.
- PeteSearch: How to find user information from an email address – FindByEmail code released as open-source. You pass it an email address, and it queries 11 different public APIs to discover what information those services have on the user with that email address.
- Measuring Measures: Beyond PageRank: Learning with Content and Networks – Conclusion: learning based on content and network data is the current state of the art. There is a great paper and talk about personalization in Google News; they use content for this purpose, and then user click streams to provide personalization, i.e. recommend specific articles within each topical cluster. The issue is content filtering is typically (as we say in research) "way harder." Suppose you have a social graph, a bunch of documents, and you know that some users in the social graph like some documents, and you want to recommend other documents that you think they will like. Using approaches based on Networks, you might consider clustering users based on co-visitation (they have co-liked some of the documents). This scales great, and it internationalizes great. If you start extracting features from the documents themselves, then what you build for English may not work as well for the Chinese market. In addition, there is far more data in the text than there is in the social graph.
- mikemaccana’s python-docx at master – GitHub – MIT-licensed Python library to read/write Microsoft Word docx format files. "The docx module reads and writes Microsoft Office Word 2007 docx files. These are referred to as 'WordML', 'Office Open XML' and 'Open XML' by Microsoft. They can be opened in Microsoft Office 2007, Microsoft Mac Office 2008, OpenOffice.org 2.2, and Apple iWork 08. The module was created when I was looking for Python support for MS Word .doc files, but could only find various hacks involving COM automation, calling .net or Java, or automating OpenOffice or MS Office."
- Handy one-liners for SED – Sed expressions are powerful, but somewhat obscure and easy to screw up. A handy cheat sheet for common tasks.
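A footnote on the Kernighan & Ritchie & Lovecraft entry above: with the names translated back, Rlyeh is the textbook recursive quicksort, but the excerpt calls a swap helper that it never defines, so it will not compile as posted. Here is a minimal sketch that fills the gap. The swap body is my own assumption of the obvious three-line exchange, and the comments are mine rather than part of the linked page.

```c
/* swap: exchange mene[i] and mene[j].
   Assumed helper: the linked excerpt calls swap() but does not show it. */
static void swap(int mene[], int i, int j)
{
    int temp = mene[i];
    mene[i] = mene[j];
    mene[j] = temp;
}

/* Rlyeh: sort mene[wgah] .. mene[nagl] into increasing order (quicksort). */
void Rlyeh(int mene[], int wgah, int nagl)
{
    int Ia, fhtagn;

    if (wgah >= nagl)                     /* fewer than two elements: done */
        return;
    swap(mene, wgah, (wgah + nagl) / 2);  /* move the middle element (pivot) to the front */
    fhtagn = wgah;                        /* last index known to hold a value below the pivot */
    for (Ia = wgah + 1; Ia <= nagl; Ia++) /* partition the rest around the pivot */
        if (mene[Ia] < mene[wgah])
            swap(mene, ++fhtagn, Ia);
    swap(mene, wgah, fhtagn);             /* drop the pivot into its final position */
    Rlyeh(mene, wgah, fhtagn - 1);        /* sort the part below the pivot */
    Rlyeh(mene, fhtagn + 1, nagl);        /* sort the part above the pivot */
}
```

Calling Rlyeh(v, 0, n - 1) on an array v of n ints sorts it in place. Both bounds are inclusive indexes, which is why the empty or single-element case is the wgah >= nagl test.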
site admin | June 10th, 2009 | Comments are closed
These are my links for June 9th through June 10th:
- Announcing the Yahoo! Distribution of Hadoop (Hadoop and Distributed Computing at Yahoo!) – Yahoo releases its internal version of Hadoop, a source-only distribution of Apache Hadoop tested and used in production at Yahoo.
- Google Fusion Tables FAQ – Sort of like extra-large Google Docs spreadsheets, up to 100MB per table, 250MB per user. One interesting wrinkle is that it doesn't actually delete your dataset when you "delete" it, so the data is still available for derived tables that other users have built.
- Filesystem Performance from a Database Perspective – Presentation of performance benchmarks for Linux filesystems (ext2, ext3, reiserfs, xfs, etc.).
- What Assumptions Make: Filesystem I/O from a database perspective – Slide presentation comparing Linux filesystem performance across various formats (ext2, ext3, etc.), RAID configurations, and readahead buffer sizes.
- MySQL – Common Queries Tree – A collection of common queries implemented in MySQL
site admin | June 4th, 2009 | Comments are closed
These are my links for June 3rd through June 4th:
site admin | May 2nd, 2009 | Comments are closed
These are my links for April 30th through May 2nd:
- FusionCharts Free – Animated Flash Charts and Graphs for ASP, PHP, ASP.NET, JSP, RoR and other web applications – Flash charting component that can be used to render data-driven & animated charts for your web applications and presentations. It is a cross-browser and cross-platform solution that can be used with PHP, Python, Ruby on Rails, ASP, ASP.NET, JSP, ColdFusion, simple HTML pages or even PowerPoint Presentations to deliver interactive and powerful flash charts. You do NOT need to know anything about Flash to use FusionCharts. All you need to know is the language you're programming in.
- Raphaël—JavaScript Library – Raphaël is a small JavaScript library that should simplify your work with vector graphics on the web. If you want to create your own specific chart or image crop and rotate widget, for example, you can achieve it simply and easily with this library. Raphaël uses the SVG W3C Recommendation and VML as a base for creating graphics. This means every graphical object you create is also a DOM object, so you can attach JavaScript event handlers or modify them later. Raphaël’s goal is to provide an adapter that will make drawing vector art compatible cross-browser and easy.
- A Really Gentle Introduction to Data Mining | Regular Geek – List of data mining blogs and related resources.
- BlackBerry SSH Tutorial: Connect to Unix Server using MidpSSH for Mobile Devices – Notes on using MidpSSH on a BlackBerry for remote access to servers. Seems to work, although there is a lot of network lag on my BlackBerry Bold / AT&T.
- Country Reports on Terrorism 2008 – U.S. law requires the Secretary of State to provide Congress, by April 30 of each year, a full and complete report on terrorism with regard to those countries and groups meeting criteria set forth in the legislation. This annual report is entitled Country Reports on Terrorism. Beginning with the report for 2004, it replaced the previously published Patterns of Global Terrorism.
- DIY: How To Find Authoritative Twitter Users Plus 100 To Get You Started | Ignite Social Media – Some comments on recommendation metrics for Twitter, trying to use the "favorites" mark as an indicator of authority.
- SIGUSR2 > The Power That is GNU Emacs – "If you've never been convinced before that Emacs is the text editor in which dreams are made from, or that inside Emacs there are unicorns manipulating your text, don't expect me to convince you."
site admin | February 28th, 2009 | Comments are closed
These are my links for February 27th through February 28th:
site admin | February 17th, 2009 | Comments are closed
These are my links for February 16th through February 17th:
- Top 100 Network Security Tools – Many, many security testing and hacking tools.
- FRONTLINE: inside the meltdown: watch the full program – "On Thursday, Sept. 18, 2008, the astonished leadership of the U.S. Congress was told in a private session by the chairman of the Federal Reserve that the American economy was in grave danger of a complete meltdown within a matter of days. "There was literally a pause in that room where the oxygen left," says Sen. Christopher Dodd"
- The Dark Matter of a Startup – "Every successful startup that I have seen has someone within their ranks that just kinda 'does stuff.' No one really knows specifically what they do, but it's vital to the success of the startup."
- Why I Hate Frameworks – "A hammer?" he asks. "Nobody really buys hammers anymore. They're kind of old fashioned…we started selling schematic diagrams for hammer factories, enabling our clients to build their own hammer factories, custom engineered to manufacture only the kinds of hammers that they would actually need."
- Mining The Thought Stream – Lots of comments around what is Twitter good for and how will it make money, revolving around real/near-time search, analytics, marketing, etc.
- Understanding Web Operations Culture – the Graph & Data Obsession … – Comparison of traffic at Flickr, Google, Twitter, last.fm during the Obama inauguration. "One of the most interesting parts of running a large website is watching the effects of unrelated events affecting user traffic in aggregate."
I stopped by the BlackDog booth at Linux World today, initially drawn in by the spectacle of Tux the Linux Penguin riding on BlackDog’s mechanical bull. Not something you see every day.
The whole scene at the BlackDog booth had sort of an early dot-com boom, circa 1996, feel to it. Here’s a company I’ve never heard of, with a relatively huge booth and lots of happy staffers recruiting riders for the mechanical bull, but almost no one bothering to mention what they were actually selling, other than large posters announcing “The World’s First Linux Server that will take You for a walk”. It took me a bit of effort to find a person who could explain what they were selling.
The BlackDog server turns out to be an interesting hybrid of putting Linux on a USB flash device and putting embedded Linux into a USB device. The actual device is around the size and weight of a pack of cards, and runs Debian Linux on a 400MHz PowerPC, drawing power from the USB interface. The announced ship date is September 1, 2005, at $199 for a 256MB model or $239 for a 512MB model. Both models include a fingerprint scanner, 64MB RAM, and an SD/MMC expansion slot.
Unlike the SoulPad, BlackDog is intended for use with a Windows or Linux system that’s already running. In their booth demo, when the device is plugged in, it launches an X server on the WinXP host system, which is then used as the display for applications residing on the BlackDog server.
A few considerations come to mind:
- Since there’s no network interface, this can’t be used as a Linux server in the typical sense, i.e. plugged into the network on its own. It could probably be connected to a powered USB hub for power and a network connection, but this doesn’t appear to be its design target.
- Fast startup time – in their booth demo, the environment hosted on the device came up a few seconds after plugging in the USB cable. I’m guessing that the WinXP autoplay was previously configured to run the X server from the USB flash file system on hotplug detection. In any case, it’s a lot faster than cold-booting Linux or Windows.
- Since it relies on the host computer for human interface (display, keyboard, and mouse), it’s not quite as secure as it might look. One issue I worry about in public internet cafes and other shared-computer environments is the growing presence of keyloggers. Spyware-infested public computers are fairly common in my unscientific poll (i.e. places I’ve stopped, mostly in Asia). So my working assumption is that anything I type on a public computer is visible. That would still apply to applications hosted on the BlackDog server, since it can’t do anything about securing the human interfaces on the host system. This is one of the reasons I’m mostly considering bootable Linux environments for use in unsecured environments.
It seems like a neat gadget. I’ll have to think a bit more about what it’s actually good for.
The BlackDog team is apparently looking for ideas as well. They’re starting a developer contest in September when the product ships, with a $50,000 prize. That's a lot of money for a product that appears to be in the “interesting-linux-hacker-widget” category. I suspect the total revenue for some products in this space is less than $50K.
Turns out they’re part of Realm Systems, which received $8.5 million in funding last January, which is why they can afford a huge booth that doesn’t tell anyone what they’re doing, and offer $50K for a developer contest.
Hint to Realm’s marcom team: The mechanical bull was a lot of fun, but it would be good to mention what you’re selling once in a while…
See also: SoulPad, Rabbit Ethernet, SSV Embedded, PicoTux, Engadget, discussion at Slashdot
Ho John Lee | December 30th, 2004 | 2 comments
This looks interesting, if it works.
http://www.nvu.com/
From the web site:
Nvu (pronounced N-view, for a “new view”) is an Open Source project started by Linspire, Inc. Linspire is committed exclusively to bringing Desktop Linux to the masses, and realized that an easy-to-use web authoring system was needed for Linux to continue its expansion to the Desktop. Linspire contributes significant capital, expertise, servers, bandwidth, marketing, and other resources to guarantee the continuation and success of the Nvu project.