Toggle light / dark theme

The Witch-Hunt of Geophysicists: Society returns to the Dark Ages

I cannot let the day pass without contributing a comment on the incredible ruling of multiple manslaughter on six top Italian geophysicists for not predicting an earthquake that left 309 people dead in 2009. When those who are entrusted with safeguarding humanity (be it on a local level in this case) are subjected to persecution when they fail to do so, despite acting in the best of their abilities in an inaccurate science, we have surely returned to the dark ages where those who practice science are demonized by the those who misunderstand it.

http://www.aljazeera.com/news/europe/2012/10/20121022151851442575.html

I hope I do not misrepresent other members of staff here at The Lifeboat Foundation, in speaking on behalf of the Foundation in wishing these scientists a successful appeal against a court ruling which has shocked the scientific community, and I stand behind the 5,000 members of the scientific community who sent an open letter to Italy’s President Giorgio Napolitano denouncing the trial. This court ruling was ape-mentality at its worst.

The decaying web and our disappearing history

On January 28 2011, three days into the fierce protests that would eventually oust the Egyptian president Hosni Mubarak, a Twitter user called Farrah posted a link to a picture that supposedly showed an armed man as he ran on a “rooftop during clashes between police and protesters in Suez”. I say supposedly, because both the tweet and the picture it linked to no longer exist. Instead they have been replaced with error messages that claim the message – and its contents – “doesn’t exist”.

Few things are more explicitly ephemeral than a Tweet. Yet it’s precisely this kind of ephemeral communication – a comment, a status update, sharing or disseminating a piece of media – that lies at the heart of much of modern history as it unfolds. It’s also a vital contemporary historical record that, unless we’re careful, we risk losing almost before we’ve been able to gauge its importance.

Consider a study published this September by Hany SalahEldeen and Michael L Nelson, two computer scientists at Old Dominion University. Snappily titled “Losing My Revolution: How Many Resources Shared on Social Media Have Been Lost?”, the paper took six seminal news events from the last few years – the H1N1 virus outbreak, Michael Jackson’s death, the Iranian elections and protests, Barack Obama’s Nobel Peace Prize, the Egyptian revolution, and the Syrian uprising – and established a representative sample of tweets from Twitter’s entire corpus discussing each event specifically.

It then analysed the resources being linked to by these tweets, and whether these resources were still accessible, had been preserved in a digital archive, or had ceased to exist. The findings were striking: one year after an event, on average, about 11% of the online content referenced by social media had been lost and just 20% archived. What’s equally striking, moreover, is the steady continuation of this trend over time. After two and a half years, 27% had been lost and 41% archived.

Continue reading “The decaying web and our disappearing history”

Want to Get 70 Billion Copies of Your Book In Print? Print It In DNA

I have been meaning to read a book coming out soon called Regenesis: How Synthetic Biology Will Reinvent Nature and Ourselves. It’s written by Harvard biologist George Church and science writer Ed Regis. Church is doing stunning work on a number of fronts, from creating synthetic microbes to sequencing human genomes, so I definitely am interested in what he has to say. I don’t know how many other people will be, so I have no idea how well the book will do. But in a tour de force of biochemical publishing, he has created 70 billion copies. Instead of paper and ink, or pdf’s and pixels, he’s used DNA.

Much as pdf’s are built on a digital system of 1s and 0s, DNA is a string of nucleotides, which can be one of four different types. Church and his colleagues turned his whole book–including illustrations–into a 5.27 MB file–which they then translated into a sequence of DNA. They stored the DNA on a chip and then sequenced it to read the text. The book is broken up into little chunks of DNA, each of which has a portion of the book itself as well as an address to indicate where it should go. They recovered the book with only 10 wrong bits out of 5.27 million. Using standard DNA-copying methods, they duplicated the DNA into 70 billion copies.

Scientists have stored little pieces of information in DNA before, but Church’s book is about 1,000 times bigger. I doubt anyone would buy a DNA edition of Regenesis on Amazon, since they’d need some expensive equipment and a lot of time to translate it into a format our brains can comprehend. But the costs are crashing, and DNA is a far more stable medium than that hard drive on your desk that you’re waiting to die. In fact, Regenesis could endure for centuries in its genetic form. Perhaps librarians of the future will need to get a degree in biology…

Link to Church’s paper

Source

In Conversation with Albert-lászló Barabási on Thinking in Network Terms

One question that fascinated me in the last two years is, can we ever use data to control systems? Could we go as far as, not only describe and quantify and mathematically formulate and perhaps predict the behavior of a system, but could you use this knowledge to be able to control a complex system, to control a social system, to control an economic system?

We always lived in a connected world, except we were not so much aware of it. We were aware of it down the line, that we’re not independent from our environment, that we’re not independent of the people around us. We are not independent of the many economic and other forces. But for decades we never perceived connectedness as being quantifiable, as being something that we can describe, that we can measure, that we have ways of quantifying the process. That has changed drastically in the last decade, at many, many different levels.

Continue reading “Thinking in Network Terms” and watch the hour long video interview

Artilects Soon to Come

Whether via spintronics or some quantum breakthrough, artificial intelligence and the bizarre idea of intellects far greater than ours will soon have to be faced.

http://www.sciencedaily.com/releases/2012/08/120819153743.htm

The Electric Septic Spintronic Artilect

AI scientist Hugo de Garis has prophesied the next great historical conflict will be between those who would build gods and those who would stop them.

It seems to be happening before our eyes as the incredible pace of scientific discovery leaves our imaginations behind.

We need only flush the toilet to power the artificial mega mind coming into existence within the next few decades. I am actually not intentionally trying to write anything bizarre- it is just this strange planet we are living on.

http://www.sciencedaily.com/releases/2012/08/120813155525.htm

http://www.sciencedaily.com/releases/2012/08/120813123034.htm

OpenOffice / LibreOffice & A Warning For Futurists

I spend most of my time thinking about software, and occasionally I come across issues that are relevant to futurists. I wrote my book about the future of software in OpenOffice, and needed many of its features. It might not be the only writing / spreadsheet / diagramming / presentation, etc. tool in your toolbox, but it is a worthy one. OpenDocument Format (ODF) is the best open standard for these sorts of scenarios and LibreOffice is currently the premier tool to handle that format. I suspect many of the readers of Lifeboat have a variant installed, but don’t know much of the details of what is going on.

The OpenOffice situation has been a mess for many years. Sun didn’t foster a community of developers around their work. In fact, they didn’t listen to the community when it told them what to do. So about 18 months ago, after Oracle purchased Sun and made the situation worse, the LibreOffice fork was created with most of the best outside developers. LibreOffice quickly became the version embraced by the Linux community as many of the outside developers were funded by the Linux distros themselves. After realizing their mess and watching LibreOffice take off within the free software community, Oracle decided to fire all their engineers (50) and hand the trademark and a copy of the code over to IBM / Apache.

Now it would be natural to imagine that this should be handed over to LibreOffice, and have all interested parties join up with this effort. But that is not what is happening. There are employees out there whose job it is to help Linux, but they are actually hurting it. You can read more details on a Linux blog article I wrote here. I also post this message as a reminder about how working together efficiently is critical to have faster progress on complicated things.

Risk Assessment is Hard (computationally and otherwise)

How hard is to assess which risks to mitigate? It turns out to be pretty hard.

Let’s start with a model of risk so simplified as to be completely unrealistic, yet will still retain a key feature. Suppose that we managed to translate every risk into some single normalized unit of “cost of expected harm”. Let us also suppose that we could bring together all of the payments that could be made to avoid risks. A mitigation policy given these simplifications must be pretty easy: just buy each of the “biggest for your dollar” risks.

Not so fast.

The problem with this is that many risk mitigation measures are discrete. Either you buy the air filter or you don’t. Either your town filters its water a certain way or it doesn’t. Either we have the infrastructure to divert the asteroid or we don’t. When risk mitigation measures become discrete, then allocating the costs becomes trickier. Given a budget of 80 “harms” to reduce, and risks of 50, 40, and 35, then buying the 50 leaves 15 “harms” that you were willing to pay to avoid left on the table.

Alright, so how hard can this be to sort this out? After all, just because going big isn’t always the best for your budget, doesn’t mean it isn’t easy to figure out. Unfortunately, this problem is also known as the “0−1 knapsack problem”, which computer scientists know to be NP-complete. This means that there isn’t any known process to find exact solutions that are polynomial in the size of the input, thus requiring looking through a good portion of the potential solution combinations, taking an exponential amount of time.

What does this tell us? First of all, it means that it isn’t appropriate to expect all individuals, organizations, or governments to make accurate comparative risk assessments for themselves, but neither should we discount the work that they have done. Accurate risk comparisons are hard won and many time-honed cautions are embedded in our insurance policies and laws.

However, as a result of this difficulty, we should expect that certain short-cuts are made, particularly cognitive short-cuts: sharp losses are felt more sharply, and have more clearly identifiable culprits, than slow shifts that erode our capacities. We therefore expect our laws and insurance policies to be biased towards sudden unusual losses, such as car accidents and burglaries, as opposed to a gradual increase in surrounding pollutants or a gradual decrease in salary as a profession becomes obsolete. Rare events may also not be included through processes of legal and financial adaptation. We should also expect them to pay more attention to issues we have no “control” over, even if the activities we do control are actually more dangerous. We should therefore be particularly careful of extreme risks that move slowly and depend upon our own activities, as we are naturally biased to ignore them compared to more flashy and sudden events. For this reason, models, games, and simulations are very important tools for risk policy. For one thing, they make these shifts perceivable by compressing them. Further, as they can move longer-term events into the short-term view of our emotional responses. However, these tools are only as good as the information they include, so we also need design methodologies that aim to broadly discover information to help avoid these biases.

The discrete, “all or nothing” character of some mitigation measures has another implication. It also tells us that we wouldn’t be able to make implicit assessments of how much individuals of different income levels value their lives by the amount they are willing to pay to avoid risks. Suppose that we have some number of relatively rare risks, each having a prevention stage, in which the risks have not manifested in any way, and a treatment stage, in which they have started to manifest. Even if the expected value favors prevention over treatment in all cases, if one cannot pay for all such prevention, then the best course in some cases is to pay for very few of them, leaving a pool of available resources to treat what does manifest, which we do not know ahead of time.

The implication for existential and other extreme risks is we should be very careful to clearly articulate what the warning signs for each of them are, for when it is appropriate to shift from acts of prevention to acts of treatment. In particular, we should sharply proceed with mitigating the cases where the best available theories suggest there will be no further warning signs. With existential risks, the boundary between remaining flexible and needing to commit requires sharply different responses, but with unknown tipping points, the location of the boundary is fuzzy. As a lack of knowledge knows no prevention and will always manifest, only treatment is feasible, so acting sharply to build our theories is vital.

We can draw another conclusion by expanding on how the model given at the beginning is unrealistic. There is no such thing as a completely normalized harm, as there are tradeoffs between irreconcilable criteria, the evaluation of which changes with experience across and within individuals. Even temporarily limiting an analysis to standard physical criteria (say lives), rare events pose a problem for actuarial assessment, with few occurrences giving poor bounds on likelihood. Existential risks provide no direct frequencies, nor opportunity for an update in Bayesian belief, so we are left to an inductive assessment of the risk’s potential pathways.

However, there is also no single pool for mitigation measures. People will form and dissolve different pools of resources for different purposes as they are persuaded and dissuaded. Therefore, those who take it upon themselves to investigate the theory leading to rare and one-pass harms, for whatever reason, provide a mitigation effort we might not rationally take for ourselves. It is my particular bias to think that information systems for aggregating these efforts and interrogating these findings, and methods for asking about further phenomena still, are worth the expenditure, and thus the loss in overall flexibility. This combination of our biases leads to a randomized strategy for investigating unknown risks.

In my view, the Lifeboat Foundation works from a similar strategy as an umbrella organization: one doesn’t have to yet agree that any particular risk, mitigation approach, or desired future is the one right thing to pursue, which of course can’t be known. It is merely the bet that pooling those pursuits will serve us. I have some hope this pooling will lead to efforts inductively combining the assessments of disparate risks and potential mitigation approaches.

Security and Complexity Issues Implicated in Strong Artificial Intelligence, an Introduction

Strong AI or Artificial General Intelligence (AGI) stands for self-improving intelligent systems possessing the capacity to interact with theoretical- and real-world problems with a similar flexibility as an intelligent living being, but the performance and accuracy of a machine. Promising foundations for AGI exist in the current fields of stochastic- and cognitive science as well as traditional artificial intelligence. My aim in this post is to give a very basic insight into- and feeling for the issues involved in dealing with the complexity and universality of an AGI for a general readership.

Classical AI, such as machine learning algorithms and expert systems, are already heavily utilized in today’s real-world problems, in the form of mature machine learning algorithms, which may profitably exploit patterns in customer behaviour, find correlations in scientific data or even predict negotiation strategies, for example [1] [2], or in the form of genetic algorithms. With the next upcoming technology for organizing knowledge on the net, which is called the semantic web and deals with machine-interpretable understanding of words in the context of natural language, we may start inventing early parts of technology playing a role in the future development of AGI. Semantic approaches come from computer science, sociology and current AI research, but promise to describe and ‘understand’ real-world concepts and to enable our computers to build interfaces to real world concepts and coherences more autonomously. Actually getting from expert systems to AGI will require approaches to bootstrap self-improving systems and more research on cognition, but must also involve crucial security aspects. Institutions associated with this early research include the Singularity Institute [3] and the Lifeboat Foundation [4].

In the recent past, we had new kinds of security challenges: DoS attacks, eMail- and PDF-worms and a plethora of other malware, which sometimes even made it into military and other sensitive networks, and stole credit cards and private data en masse. These were and are among the first serious incidents related to the Internet. But still, all of these followed a narrow and predictable pattern, constrained by our current generation of PCs, (in-)security architecture, network protocols, software applications, and of course human flaws (e.g. the emotional response exploited by the “ILOVEYOU virus”). To understand the implications in strong AI first means to realize that probably there won’t be any human-predictable hardware, software, interfaces around for longer periods of time as long as AGI takes off hard enough.

To grasp the new security implications, it’s important to understand how insecurity can arise from the complexity of technological systems. The vast potential of complex systems oft makes their effects hard to predict for the human mind which is actually riddled with biases based on its biological evolution. For example, the application of the simplest mathematical equations can produce complex results hard to understand and predict by common sense. Cellular automata, for example, are simple rules for generating new dots, based on which dots, generated by the same rule, are observed in the previous step. Many of these rules can be encoded in as little as 4 letters (32 bits), and generate astounding complexity.

Cellular automaton, produced by a simple recursive formula

The Fibonacci sequence is another popular example of unexpected complexity. Based on a very short recursive equation, the sequence generates a pattern of incremental increase which can be visualized as a complex spiral pattern, resembling a snail house’s design and many other patterns in nature. A combination of Fibonacci spirals, for example, can resemble the motif of the head of a sunflower. A thorough understanding of this ‘simple’ Fibonacci sequence is also sufficient to model some fundamental but important dynamics of systems as complex as the stock market and the global economy.

Sunflower head showing a Fibonacci sequence pattern

Traditional software is many orders of magnitude higher in complexity than basic mathematical formulae, and thus many orders of magnitude less predictable. Artificial general intelligence may be expected to work with even more complex rules than low-level computer programs, of a comparable complexity as natural human language, which would classify it yet several orders of magnitude higher in complexity than traditional software. The estimated security implications are not yet researched systematically, but are likely as hard as one may expect now.

Practical security is not about achieving perfection, but about mitigation of risks to a minimum. A current consensus among strong AI researchers is that we can only improve the chances for an AI to be friendly, i.e. an AI acting in a secure manner and having a positive long-term effect on humanity rather than a negative one [5], and that this must be a crucial design aspect from the beginning on. Research into Friendly AI started out with a serious consideration of the Asimov Laws of robotics [6] and is based on the application of probabilistic models, cognitive science and social philosophy to AI research.

Many researchers who believe in the viability of AGI take it a step further and predict a technological singularity. Just like the assumed physical singularity that started our universe (the Big Bang), a technological singularity is expected to increase the rate of technological progress much more rapidly than what we are used to from the history of humanity, i.e. beyond the current ‘laws’ of progress. Another important notion associated with the singularity is that we cannot predict even the most fundamental changes occurring after it, because things would, by definition, progress faster than we are currently able to predict. Therefore, in a similar way in which we believe the creation of the universe depended on its initial condition (in the big bang case, the few physical constants from which the others can be derived), many researchers in this field believe that AI security strongly depends on the initial conditions as well, i.e. the design of the bootstrapping software. If we succeed in manufacturing a general-purpose decision-making mind, then its whole point would be self-modification and self-improvement. Hence, our direct control over it would be limited to its first iteration and the initial conditions of a strong AI, which could be influenced mostly by getting the initial iteration of its hard- and software design right.

Our approach to optimize those initial conditions must consist of working as careful as possible. Space technology is a useful example for this which points us into the general direction in which such development should go. In rocket science and space technology, all measurements and mathematical equations must be as precise as possible by our current technological standards. Also, multiple redundancies must be present for every system, since every single aspect of a system can be expected to fail. Despite this, many rocket launches still fail today, although we are steadily improving on error rates.

Additionally, humans interacting with an AGI may a major security risk themselves, as they may be convinced by an AGI to remove its limitations. Since an AGI can be expected to be very convincing if we expect it to exceed human intellect, we should not only focus on physical limitations, but making the AGI ‘friendly’. But even in designing this ‘friendliness’, the way our mind works is largely unprepared to deal with consequences of the complexity of an AGI, because the way we perceive and deal with potential issues and risks stems from evolution. As a product of natural evolution, our behaviour helps us dealing with animal predators, interacting in human societies and caring about our children, but not in anticipating the complexity of man-made machines. Natural behavioural traits of our human perception and cognition are a result of evolution, and are called cognitive biases.

Sadly, as helpful as they may be in natural (i.e., non-technological) environments, these are the very same behaviours which are often contra-productive when dealing with the unforeseeable complexity of our own technology and modern civilization. If you don’t really see the primary importance of cognitive biases to the security of future AI at this point, you’re probably in good company. But there are good reasons why this is a crucial issue that researchers, developers and users of future generations of general-purpose AI need to take into account. One of the major reason for founding the earlier-mentioned Singularity Institute for AI [3] was to get the basics right, including grasping the cognitive biases, which necessarily do influence the technological design of AGI.

What do these considerations practically imply for the design of strong AI? Some of the traditional IT security issues that need to be addressed in computer programs are: input validation, access limitations, avoiding buffer overflows, safe conversion of data types, setting resource limits, secure error handling. All of these are valid and important issues that must be addressed in any piece of software, including weak and strong AI. However, we must avoid underestimating the design goals for a strong AI, mitigating the risk on all levels from the beginning. To do this, we must care about more than the traditional IT security issues. An AGI will interface with the human mind, through text and direct communication and –interaction. Thus, we must also estimate the errors that we may not see, and do our best to be aware of flaws in human logic and cognitive biases, which may include:

  • Loss aversion: “the dis-utility of giving up an object is greater than the utility associated with acquiring it”.
  • Positive outcome bias: a tendency in prediction to overestimate the probability of good things happening to them
  • Bandwagon effect: the tendency to do (or believe) things because many other people do (or believe) the same.
  • Irrational escalation: the tendency to make irrational decisions based upon rational decisions in the past or to justify actions already taken.
  • Omission bias: the tendency to judge harmful actions as worse, or less moral, than equally harmful omissions (inactions).

Above cognitive biases are a modest selection from Wikipedia’s list [7], which knows over a hundred more. Struggling with some of the known cognitive biases in complex technological situations may be quite familiar to many of us, and the social components involved, from situations such as managing modern business processes to investing in the stock market. In fact, we should apply any general lessons learned from dealing with current technological complexity to AGI. For example, some of the most successful long-term investment strategies in the stock market are boring and strict, but based mostly on safety, such as Buffet’s margin of safety concept. With all factors gained from social and technological experience taken into account in an AGI design that strives to optimize both cognitive and IT security, its designers can still not afford to forget that perfect and complete security does remain an illusion.

References

[1] Chen, M., Chiu, A. & Chang, H., 2005. Mining changes in customer behavior in retail marketing. Expert Systems with Applications, 28(4), 773–781.
[2] Oliver, J., 1997. A Machine Learning Approach to Automated Negotiation and Prospects for Electronic Commerce. Available at: http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.50.9115 [Accessed Feb 25, 2011].
[3] The Singularity Institute for Artificial intelligence: http://singinst.org/
[4] For the Lifeboat Foundation’s dedicated program, see: https://lifeboat.com/ex/ai.shield
[5] Yudkowsky, E. 2006. Artificial Intelligence as a Positive and Negative Factor in Global Risk., Global Catastrophic Risks, Oxford University Press, 2007.
[6] See http://en.wikipedia.org/wiki/Three_Laws_of_Robotics and http://en.wikipedia.org/wiki/Friendly_AI, Accessed Feb 25, 2011
[7] For a list of cognitive biases, see http://en.wikipedia.org/wiki/Cognitive_biases, Accessed Feb 25, 2011

Open Letter to Ray Kurzweil

Dear Ray;

I’ve written a book about the future of software. While writing it, I came to the conclusion that your dates are way off. I talk mostly about free software and Linux, but it has implications for things like how we can have driverless cars and other amazing things faster. I believe that we could have had all the benefits of the singularity years ago if we had done things like started Wikipedia in 1991 instead of 2001. There is no technology in 2001 that we didn’t have in 1991, it was simply a matter of starting an effort that allowed people to work together.

Proprietary software and a lack of cooperation among our software scientists has been terrible for the computer industry and the world, and its greater use has implications for every aspect of science. Free software is better for the free market than proprietary software, and there are many opportunities for programmers to make money using and writing free software. I often use the analogy that law libraries are filled with millions of freely available documents, and no one claims this has decreased the motivation to become a lawyer. In fact, lawyers would say that it would be impossible to do their job without all of these resources.

My book is a full description of the issues but I’ve also written some posts on this blog, and this is probably the one most relevant for you to read: https://lifeboat.com/blog/2010/06/h-conference-and-faster-singularity

Once you understand this, you can apply your fame towards getting more people to use free software and Python. The reason so many know Linus Torvalds’s name is because he released his code as GPL, which is a license whose viral nature encourages people to work together. Proprietary software makes as much sense as a proprietary Wikipedia.

I would be happy to discuss any of this further.

Regards,

-Keith
—————–
Response from Ray Kurzweil 11/3/2010:

I agree with you that open source software is a vital part of our world allowing everyone to contribute. Ultimately software will provide everything we need when we can turn software entities into physical products with desktop nanofactories (there is already a vibrant 3D printer industry and the scale of key features is shrinking by a factor of a hundred in 3D volume each decade). It will also provide the keys to health and greatly extended longevity as we reprogram the outdated software of life. I believe we will achieve the original goals of communism (“from each according to their ability, to each according to their need”) which forced collectivism failed so miserably to achieve. We will do this through a combination of the open source movement and the law of accelerating returns (which states that the price-performance and capacity of all information technologies grows exponentially over time). But proprietary software has an important role to play as well. Why do you think it persists? If open source forms of information met all of our needs why would people still purchase proprietary forms of information. There is open source music but people still download music from iTunes, and so on. Ultimately the economy will be dominated by forms of information that have value and these two sources of information – open source and proprietary – will coexist.
———
Response back from Keith:
Free versus proprietary isn’t a question about whether only certain things have value. A Linux DVD has 10 billion dollars worth of software. Proprietary software exists for a similar reason that ignorance and starvation exist, a lack of better systems. The best thing my former employer Microsoft has going for it is ignorance about the benefits of free software. Free software gets better only as more people use it. Proprietary software is an inferior development model and an anathema to science because it hinders people’s ability to work together. It has infected many corporations, and I’ve found that PhDs who work for public institutions often write proprietary software.

Here is a paragraph from my writings I will copy here:

I start the AI chapter of my book with the following question: Imagine 1,000 people, broken up into groups of five, working on two hundred separate encyclopedias, versus that same number of people working on one encyclopedia? Which one will be the best? This sounds like a silly analogy when described in the context of an encyclopedia, but it is exactly what is going on in artificial intelligence (AI) research today.

Today, the research community has not adopted free software and shared codebases sufficiently. For example, I believe there are more than enough PhDs today working on computer vision, but there are 200+ different codebases plus countless proprietary ones. Simply put, there is no computer vision codebase with critical mass.

We’ve known approximately what a neural network should look like for many decades. We need “places” for people to work together to hash out the details. A free software repository provides such a place. We need free software, and for people to work in “official” free software repositories.

“Open source forms of information” I have found is a separate topic from the software issue. Software always reads, modifies, and writes data, state which lives beyond the execution of the software, and there can be an interesting discussion about the licenses of the data. But movies and music aren’t science and so it doesn’t matter for most of them. Someone can only sell or give away a song after the software is written and on their computer in the first place. Some of this content can be free and some can be protected, and this is an interesting question, but mostly this is a separate topic. The important thing to share is scientific knowledge and software.

It is true that software always needs data to be useful: configuration parameters, test files, documentation, etc. A computer vision engine will have lots of data, even though most of it is used only for testing purposes and little used at runtime. (Perhaps it has learned the letters of the alphabet, state which it caches between executions.) Software begets data, and data begets software; people write code to analyze the Wikipedia corpus. But you can’t truly have a discussion of sharing information unless you’ve got a shared codebase in the first place.

I agree that proprietary software is and should be allowed in a free market. If someone wants to sell something useful that another person finds value in and wants to pay for, I have no problem with that. But free software is a better development model and we should be encouraging / demanding it. I’ll end with a quote from Linus Torvalds:

Science may take a few hundred years to figure out how the world works, but it does actually get there, exactly because people can build on each others’ knowledge, and it evolves over time. In contrast, witchcraft/alchemy may be about smart people, but the knowledge body never “accumulates” anywhere. It might be passed down to an apprentice, but the hiding of information basically means that it can never really become any better than what a single person/company can understand.
And that’s exactly the same issue with open source (free) vs proprietary products. The proprietary people can design something that is smart, but it eventually becomes too complicated for a single entity (even a large company) to really understand and drive, and the company politics and the goals of that company will always limit it.

The world is screwed because while we have things like Wikipedia and Linux, we don’t have places for computer vision and lots of other scientific knowledge to accumulate. To get driverless cars, we don’t need any more hardware, we don’t need any more programmers, we just need 100 scientists to work together in SciPy and GPL ASAP!

Regards,

-Keith