|When:||Tuesday, 28 April 2009, 16:20–17:50|
Suppose we are given a list of numbers and we wish to determine whether it is sorted. That problem obviously requires reading the entire list. But Ergun, Kannan, Kumar, Rubinfeld and Viswanathan showed in 1998 that if we know in advance that our list is either sorted or far from sorted, we can perform the test by examining only a small portion of the list. Testing if a list is sorted is thus an example of a task that can be performed with a small number of samples into the input. As data of all types gets easier to obtain and cheaper to store, datasets are becoming increasingly large, and we are faced with a need to perform computational tasks on massive datasets. Can we still compute something useful about a dataset when reading all of it is prohibitively expensive? In other words, what can we solve with a sublinear number of samples from the input? This is the main question addressed by the newly emerging area, called Sublinear Algorithms. In this talk, we will give a few examples of specific problems that can be solved with a sublinear number of samples, starting with the sorting example and moving on to simple properties of images, edit distance between strings, and approximating compressibility of a string. We will also explain a framework called “property testing” that was defined by Rubinfeld and Sudan in 1996 to formalize the kind of approximation that sublinear algorithms are good at. We will present a partial classification of what can and cannot be solved in that model when an algorithm can query the input only in a sublinear number of places.