Big data is getting bigger. It has moved from limited use by retailers such as Amazon.com and Walmart, which analyse customer behaviour, to just about every sector.
Scientists have always crunched large amounts of data. The genomes of species and repositories of chemical structures and properties are two sources of big data for scientists. Billions of base pairs in the human genome are analysed to understand human evolution or to pinpoint genes. Chemical repositories are used in screening for drug activity. As N. Szlezák et al. argue in the 10 February 2014 issue of ‘Nature’, “the advent of ‘big data’ and the availability of advanced analytical methods and technologies used to interpret such big data are leading to smarter and more effective discovery, development, and commercialization of innovative bio-pharmaceutical drugs.”
An interesting tangential use comes from Hampton Creek, a private company focused on providing healthier food for everyone by exploiting the diversity of the plant world. Take their product ‘Just Mayo’. The mayonnaise produced by Hellmann’s, for example, is called real mayo because, as they claim, it contains three simple ingredients: oil, vinegar and eggs. ‘Just Mayo’ eliminates the need for eggs by substituting a pea protein and other additives from plant sources. How did they find the right pea protein and other ingredients to get the binding power of the egg and the resulting smoothness of mayonnaise? According to a recent article in TechCrunch, Hampton Creek whittled down 4,000 different plants to just 13 species with the ideal traits needed for better consistency, better taste and lower cost in their current products. Take the case of the pea protein. There are apparently over 2,000 varietals of Canadian yellow peas alone. Hampton Creek has done a lot of searching to find the Canadian yellow pea with the right molecular weight and other characteristics to make a perfect mayonnaise.
At the moment, finding just the right plant ingredient that yields a quality product is like looking for a needle in a haystack. If Canadian yellow peas alone have more than 2,000 varietals, imagine the size of a full plant protein database: now we are talking about big data. According to TechCrunch, Hampton Creek hired a top data scientist from Google to build the world’s largest plant database. The database will be used to sort out which plants have the traits needed to create food products that are healthier, cheaper and have a lower impact on animals and the environment; in other words, big data for feeding the world more sustainably.
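
To make the idea of sorting plants by their traits a little more concrete, here is a minimal Python sketch of such a screening step. It is purely illustrative and is not Hampton Creek's actual method: the `PlantProtein` record, its trait fields, the thresholds and every number in the sample data are invented for the example.

```python
from dataclasses import dataclass


@dataclass
class PlantProtein:
    """One hypothetical record in a plant protein database."""
    name: str
    molecular_weight_kda: float  # assumed trait, in kilodaltons
    emulsifying_score: float     # assumed 0-1 score for binding power
    cost_per_kg: float           # assumed cost, arbitrary currency


def screen(candidates, mw_range, min_emulsifying, max_cost):
    """Keep only the candidates that satisfy every trait threshold."""
    low, high = mw_range
    return [
        c for c in candidates
        if low <= c.molecular_weight_kda <= high
        and c.emulsifying_score >= min_emulsifying
        and c.cost_per_kg <= max_cost
    ]


# Invented sample data; a real database would hold thousands of entries.
candidates = [
    PlantProtein("yellow pea varietal A", 48.0, 0.82, 3.10),
    PlantProtein("yellow pea varietal B", 150.0, 0.90, 2.50),
    PlantProtein("sorghum isolate", 52.0, 0.40, 1.90),
    PlantProtein("yellow pea varietal C", 55.0, 0.75, 6.00),
]

shortlist = screen(
    candidates,
    mw_range=(40.0, 60.0),
    min_emulsifying=0.7,
    max_cost=4.0,
)
print([c.name for c in shortlist])
```

With these made-up thresholds only one of the four candidates survives. The same kind of filter, applied to thousands of plants and many more traits, is how a large database could turn the needle-in-a-haystack search into a short list worth testing in the kitchen.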