
Taming the data storm to shape a human-centered data revolution: contextualizing and announcing digitising Europe’s stakeholders’ dialogues


While the Library of Alexandria was once believed to contain the sum of human knowledge around the third century B.C.E., it is now nearly impossible to quantify, and increasingly difficult to store, the volume of data generated by humans as a result of the advent of digital technologies and societies. More pieces of data travel per second through the Internet today than were stored on the entire Internet twenty years ago. Much of this data is innocuous and uninformative—photos of food and videos of cats. But the world is without a doubt facing a socio-technological phenomenon of historical magnitude and significance that feeds on and produces data as its main resource—initially dubbed the “Petabyte Age” and the “Industrial Revolution of Data”, and today known simply as the “Data Revolution”. While this “data revolution” consists of two main movements—Open Data and Big Data—it is the latter that is understandably creating our greatest hopes and fears. With a long history of socio-technological phenomena fostering both public good and problems, the world is facing an age-old question: how can Big Data be leveraged to spur positive social change?


Alex Pentland (left) and Andrew Keen (right) debating on the ethics of Big Data as part of the ‘digitising europe initiative’, November 12, 2015 in Berlin. Photo courtesy of Vodafone Institute.

Technological advances may often appear ‘neutral’—in the sense that their effect depends on how they are put to use: whether to hang a frame or crush a skull, power Christmas light bulbs or turn cities to dust. But it is never a zero-sum game: “Change the instruments and you will change the entire social theory around them”, once wrote Bruno Latour, Professor at Sciences Po. Wendell Wallach’s latest book, A Dangerous Master: How to Keep Technology from Slipping Beyond Our Control, echoes and emphasizes Christian Lous Lange’s cautionary words: “Technology is a good servant but a dangerous master.” Wallach warns against a possible “tech storm”—the plethora of harms that may result from emerging technologies. He points to the critical importance of inflection points and windows of opportunity to alter and shape the course of a technology’s use for good or bad. The pace and scope of change introduced by data amount to a storm, but we should not let it take us by surprise or lead us to reproduce past mistakes.

There are reasons to believe that ‘data’ as an emerging ecosystem of new resources, tools, rules and players may be at such an inflection point, with a window of opportunity to collectively design its future for the better. After a first few years of highly polarized exchanges, more balanced voices have gained strength in articulating its risks and requirements. “Is data a danger to the developing world?” Kate Crawford, Principal Researcher at Microsoft Research, recently asked, noting how data risked “increasing power and wealth asymmetries between those in charge of the data and those who are subjected to actions based on it.” There are ample examples of this risk having materialized. To date, advanced data analytics has benefited first and foremost the two kinds of bureaucracies that hold and control most powers and data: governments and corporations. But it need not be the case in the future.

As Data-Pop Alliance Academic Director Alex Pentland wrote a few weeks ago, “(t)oday there is, for the first time, a real choice”, adding that “(t)he essential factor that led toward democratisation was broad ownership of the resource, which made it difficult to exert autocratic control. The same seems true for data: to avoid autocratic control, citizens must have effective, direct control of data about themselves”. This will both require and spur massive socio-political changes and investments, but it is not out of reach. The technology is there: “It has become technologically possible to set a standard for data stewardship that does not allow even the powerful to subvert data rights except by threat of physical force”, he writes. Civil societies and communities around the globe—but also some private companies and government agencies—seem to have come to the realization that in the future, people should and will have greater control over how their data are used, both by third parties acting as these ‘data stewards’ and by themselves.

New codes of conduct, ethical principles, and legal frameworks will be created. A condition and outcome of this process will be the emergence of ‘data literate’ citizenries—or rather, literate citizenries in the age of data—individuals and groups with “the desire and ability to constructively engage in society through and about data”—which may include (but not be limited to) writing lines of code. Above all, this will require creating deliberative spaces where different perspectives can be openly discussed to find common ground and avoid unnecessary clashes, backlashes, and setbacks on the way to making data ‘work’ for people.

Creating these constructive, interactive and informed spaces is the primary goal of digitising europe 2015—a Europe-wide discussion on the digital economy of Europe launched by the Vodafone Institute for Society and Communications in partnership with Data-Pop Alliance. As part of this initiative, multi-stakeholder dialogues will be organized over the next six months in major European cities—starting November 12 in Berlin and continuing in Brussels, Madrid, and Dublin—to debate what getting the data revolution “right” means and entails. We are hopeful that these events will generate excitement, ideas and connections to contribute to crafting a human-centered data revolution.

A full video from the Nov. 12 ‘digitising europe’ event in Berlin, where our Academic Director Alex ‘Sandy’ Pentland and Andrew Keen debate on the ethics of Big Data, can be found here.

Main sources and references:

Bhargava, R., Deahl, E., Letouzé, E., Noonan, A., Sangokoya, D., & Shoup, N. (2015). “Beyond Data Literacy: Reinventing Community Engagement and Empowerment in the Age of Data.” Data-Pop Alliance.

Crawford, Kate (2015). “Is Data a Danger to the Developing World?” World Economic Forum.

Libert, B. (2013). The Big Data Revolution: Intelligence on Demand. New Word City.

McAfee, A., & Brynjolfsson, E. (2012). “Big Data: The Management Revolution.” Harvard Business Review.

McQuinn, Jason (2013). “A Brief Interview with Noam Chomsky on Anarchy, Civilization, and Technology.” C.A.L. Press.

Pentland, Alex (2015). “Who Should We Trust to Manage Our Data?” World Economic Forum.

Stuart, E., Samman, E., Avis, W., & Berliner, T. (2015). “The Data Revolution: Finding the Missing Millions.” Overseas Development Institute.


10 Key Take-Aways from our COP21 Side Event "Climate Change Resilience in the Age of Data"


On Friday, December 4th, at the Agence française de développement (AFD) headquarters in Paris, Data-Pop Alliance (DPA), AFD, and the UK’s Department for International Development (DfID) co-hosted a COP21 side event on “Climate Change Resilience in the Age of Data”. Over the course of the day, 100 researchers and practitioners shared perspectives and worked together to explore the opportunities, challenges, and requirements for leveraging new data sources and analytical approaches to contribute to the COP21 goals by building greater societal resilience, chiefly through greater empowerment of at-risk communities.

While the main outcome of the COP21 conference itself, attended by close to 150 government representatives and 40,000 participants, was the “Adoption of the Paris Agreement” (which can be found here) that includes a commitment to “hold the increase in the global average temperature to below 2°C above pre-industrial levels”, our side event yielded the following 10 main take-aways:


Expert panelists discuss “The science of data analytics for climate resilience” during the #COP21Data event. From L to R: Marc Levy (Deputy Director of the Earth Institute’s CIESIN at Columbia University), Samuel Rufat (Associate Professor at the Université Cergy-Pontoise), Rebecca Moore (Director of Google Earth Engine and Outreach), Alain Retière (Co-Founder and CEO of EverImpact), and Erik Wetter (Co-Founder and Chairman of Flowminder.org).

10. We don’t fully understand the impact.

As noted in a report published by UNISDR last month, the frequency of climate-related disasters stemming from floods, droughts, cyclones, etc., has roughly doubled over the past two decades; they currently account for 90% of all disasters worldwide. However, while we can better quantify some of these effects, we do not really understand their full impact, especially long-term, and their exact links to climate change. “The dose-response relationship – we currently don’t have that,” said Marc Levy, Deputy Director of CIESIN at Columbia University’s Earth Institute. Moreover, the evidence varies with the type of disaster – it is relatively easy to measure the impact of floods in terms of geographical spread, human casualties or population displacement, but that of droughts is harder to grasp. Oftentimes when people measure change “they leave out drought because they cannot measure it… there are different levels of exposure to climate extremes,” Levy explained. This has serious implications, including political ones. In a New Yorker article last week, Elizabeth Kolbert highlights the impact of the droughts in Syria:

“On the list of horrors that led to Syria’s civil war, it’s hard to know how high to place the drought or its devastating consequences – the spike in food prices, the internal displacement, the further crowding of already overcrowded cities. Certainly, it ranks below the repressive brutality of Bashar al-Assad’s regime. Still, as Secretary of State John Kerry put it recently, in a speech about climate change and national security, it’s probably “not a coincidence” that the war was preceded by four years of failed rains.”

As French President François Hollande put it at the opening of COP21, “Ce qui est en cause, c’est la paix” (“What is at stake is peace”).

Climate change impact does present itself in multifaceted and sometimes unanticipated ways, but we “risk falling into a siloization of climate concerns,” such as focusing only on floods or earthquakes, says Kenneth Chomitz, a Fellow at Forest Trends. Here, the measured impact could integrate data on destroyed crop yields and on a farmer’s income, yet “from a farmer’s perspective, there are lots of shocks coming in.”

9. Coordination and partnerships are crucial.

When posed the question, “What are the biggest economic and political barriers to realizing data use for climate action?” the responses centered overwhelmingly on the lack of coordination and on the strength and number of partnerships. Mariéme Jamme, Co-Founder of Africa Gathering, who sits on Data-Pop Alliance’s Board of Advisers, elaborated: “One of the barriers is coordination. When people come to Senegal to talk about data, they talk to the governments and not [at the] grassroots level… The collaboration I am talking about is very specific. If we can open up a little bit and be more comfortable, at least give them a chance, we can show our skills.” Better coordination and dialogue could go a long way towards narrowing the decision-gap and the disconnect between policy-makers, project implementers, and affected communities. One channel would involve streamlining data collection and management for climate change resilience. Alongside multi-sector coordination – across governmental bodies, civil society, and the private sector – there is a strong need for partnerships among stakeholders of the same sector, especially those with parallel agendas. Craig Hanson, Director of the People and Ecosystems Program at World Resources Institute, lists “Partner” as one of the top six actions people can take in data use for climate action, and in the event’s closing session, Levy stated, “There is an underinvestment in grassroots partnerships.” Without a doubt, this is one place for growth.


Data-Pop Alliance Director and Co-Founder Emmanuel Letouzé discusses the use of Big Data and the 3C’s at our COP21 side event, “Climate Change Resilience in the Age of Data” on December 4, 2015 in Paris.

8. We need an ecosystem approach to Big Data, conceptualized through its 3 C’s.

Big Data was first conceptualized in the late 2000s as exhibiting three main characteristics – volume, velocity, and variety (and sometimes a fourth – veracity). Today, we must expand the conceptualization of Big Data, moving from one of content to one of context, and consider Big Data as an ecosystem made up of three C’s:

• Digital bread crumbs: pieces of data that are the digital translation of human actions and interactions captured by digital devices, the majority of which are passively emitted by users of digital devices and services.

• Big Data capacities or analytics: the set of tools and methods, hardware and software, and know-how and skills necessary to process and analyze these new kinds of data. The tools and methods include visualization techniques, statistical machine learning, algorithms, and more.

• Big Data communities: the actors involved in the Big Data ecosystem, from the generators of data to their analysts and end users – potentially the whole population.

As mentioned by Emmanuel Letouzé, Director and Co-Founder of Data-Pop Alliance, in his opening remarks, “it is only by conceptualizing Big Data as a vibrant ecosystem with actors rather than as an inert raw material to be exploited that we can shape its future for the better. Building greater resilience in complex human ecosystems through Big Data requires considering Big Data as an ecosystem of its own and figuring out how both can support and learn from each other”.


A lively discussion on “Communities and citizens as sensors and responders in humanitarian contexts”. From L to R: Mariéme Jamme (Co-Founder of Africa Gathering and Data-Pop Alliance Advisory Board Member), Julie Cissé (Coordinator of GIPS/WAR), Simone Sala (Associate Director of the Dr. Steve Chan Center for Sensemaking and Data-Pop Alliance Research Affiliate), Patrick Vinck (Assistant Professor at Harvard University and Co-Founder and Co-Director of Data-Pop Alliance), John Crowley (Program Manager of Data for Climate Action at UN Global Pulse), and Silke Roth (Associate Professor of Sociology at the University of Southampton).

7. Involve and invest in local communities and data ecosystems.

Throughout the event, panelists and participants alike stressed the importance of incorporating both top-down and bottom-up processes, with several calls to direct greater attention and investment to the local level. Rebecca Moore, Director of Google Earth Engine and Earth Outreach, spoke to the importance of “data on the ground,” in addition to the sky (i.e. satellite data). “Ground-based evidence can be strongest, it can predict malaria outbreaks… weeks in advance.” She has found that the “pull model from the ground works better than the push from us.”

Nicolas de Cordes, Vice President of Marketing Anticipation for Orange, echoed this sentiment: “Fantastic use cases, so what? What’s the next step after models and pilots?” We need to reduce the cost of creating and making these algorithms (e.g. pilots) in order to grow local ecosystems. “A country that doesn’t grow a healthy data ecosystem will not be able to achieve the SDGs.”

This last one sums it up: “There is a lot of movement to get it working in the political space. Funds flow down to enable local action. They water the plant. When they stop watering, it turns brown, and then they kill it because it’s dead and then they don’t know why it died.” John Crowley, Program Manager of Data for Climate Action at the UN Global Pulse continues, “Top-down… there are long delays.”

6. Real-time data do not imply real-time insights.

“In fact, it can possibly only imply real-time mistakes,” reckoned Erik Wetter, Co-Founder and Chairman of Flowminder Foundation. Even with mobile data sets, which were ranked first among data sources cited in the World Bank’s recent “Survey on the use of big data for the SDGs”, the information is only about anonymous subscribers. With the growing salience of real-time analytics and nowcasting, we still need to be able to integrate multiple data sets in order to derive actionable insights. Real-time data is just that – real-time data; sometimes it is not enough.


One of the groups during our afternoon breakout technical sessions seeking to answer the question, “What does an actionable climate data tool and visualization look like and add?” using Google Earth Engine.

5. Institutions, legislation, and politics matter a great deal.

Data and technology won’t solve the climate crisis by themselves. Amy Luers, Assistant Director of Climate Resilience and Information in The White House Office of Science and Technology Policy, contended, “There can’t just be a technological solution, there has to be an institutional solution.” Likewise, Patrick Vinck, Assistant Professor at Harvard University and Co-Founder and Co-Director of Data-Pop Alliance, said that innovation is happening faster than capacity building, faster than legislation, faster than community involvement. “An institutional overhaul is necessary.” And Cyrille Bellier, Head of the Research and Development Unit at AFD, reminded us that “development, and especially institutional change, is a long process. We have to take this into account… when collaborating with the tech industry which operates within way shorter horizons.” What COP21 has delivered, more than just a climate action proposal, is diplomacy at its best. Let’s see what happens next.

4. The crisis of responsibility must be fixed.

As noted in an article published on Sunday, Jeffrey Sachs, Columbia University Professor and Director of the Earth Institute, calls out those countries potentially responsible:

“At the core of the climate crisis is a challenge of collective action. Every nation emits carbon dioxide and other greenhouse gases that dangerously warm the planet. Yet decarbonization is costly, at least in the short term. Each country therefore tries to push the other countries to take on a disproportionate share of the burden.”


David Chavalarias, Director of the Complex Systems Institute of Paris, showcases visualizations by Tweetoscope that compare data on Twitter keywords and research papers on climate change.

The question of responsibility and enforceability remains central. There are currently no legally binding global climate agreements. The major treaties that came before, the Copenhagen Accord of 2009 and the Kyoto Protocol of 1997, along with many others of lesser environmental (or even political) impact, were non-binding. Developed countries didn’t step up to the plate, and the U.S. failed to ratify the latter in light of the Byrd-Hagel Resolution. At COP21, the EU and others are calling for a legally binding agreement, which will prove extremely challenging. At least 55 countries, representing at least 55% of all emissions, need to ratify the agreement in order for it to become legally binding; it opens for signature on April 22, 2016 in New York City. However, even if a legally binding agreement prevails, and even if all 175 countries uphold their submitted commitments, we will still exceed the 2°C goal. At another side event for the Equator Prize 2015 winners, the International Executive Director of Greenpeace, Kumi Naidoo, said that 2°C (3.6°F) was good for developed countries, but for developing countries and island states, “We demand ‘1.5 to stay alive’.”

Who should be held responsible? Alain Retière, Co-Founder and CTO of EverImpact, asserted that cities have to be held responsible. This could be done, he believes, by providing a service to cities to help them monitor emissions by themselves. “Somebody has to be held responsible for regulating these emissions. What climate action is about in my view is having cities and territories responsible for their adaptation to climate change and for their mitigation of emissions.” He continued, “At the end of the day, most of the climate problems are generated in cities. Most of the victims of climate change are in rural areas.”


Our third panel debated issues surrounding “The economics and politics of data for resilience”. L to R: Claire-Marie Foulquier-Gazagnes (Policy Officer at Etalab in the French Prime Minister’s Office), Amy Luers (Assistant Director of Climate Resilience and Information at The White House Office of Science and Technology Policy), Mathilde Bouyé (Advisor for UN Negotiations at the World Resources Institute), Mariéme Jamme (Co-Founder of Africa Gathering and DPA Advisory Board Member), and Nicolas de Cordes (Vice President of Marketing Anticipation at Orange Group).

3. (Some) Data are a public good.

Specifically climate data, which would benefit and/or adversely affect everyone. Clean oceans, for example, are a public good and a quintessential tragedy-of-the-commons case. It is crucial to open up more climate data, to make the data accessible, to share it, and to integrate it with other data sources at both national and local levels. Likewise, climate data should be non-excludable.

“Resilience is a process in which you can activate or use a resource. Social cohesion is a key component here.” – Patrick Vinck

“You should have a right to know what stands in the way of your development.” – Marc Levy

“Data is critical soft infrastructure.” – Anonymous audience member

“Data as a public good – needs to be catalytic but targeted. There are real issues in funding this public good.” – Kenneth Chomitz

“I truly believe we need to open up. In France, we have opened up climate data… with datasets being open it will not only serve France and French citizens but others abroad. You have a responsibility to try and open more of that data.” – Claire-Marie Foulquier-Gazagnes, Policy Officer at Etalab of the French Prime Minister’s Office

The White House published a blog in May entitled “Two Years of Transformative Open Data for Public Good.” Chief Data Scientist DJ Patil almost had it right. Although correct about data being “a vital natural resource,” he was wrong on its framing. The hope is that, moving forward, it will be “Data as a Public Good.” We need to open up.

2. Data can empower—under certain conditions.

Sort of. Data leads to more knowledge, more ownership, and more decision-making capabilities, to name a few benefits. But more data does not always mean less bias; it can also mean more noise. More data also raises questions of digital privacy, security, politics, and institutional frameworks. For one, “Is the data safe to release?” de Cordes asked the other panelists. When we speak about opening up more data, especially with regards to climate data, we need to ask: What can be done with the data? Who will be empowered with this data, and how will this data be used for or against (or otherwise) others? What are the projected outcomes, positive, negative, and otherwise, of accessing this data? How are power struggles emerging – across public and private sectors, economic classes, political institutions, and geographies – in this new information age, the age of data? Data is powerful – let’s not forget this – and it should be recorded, stored, released, and used as such.

When speaking about “democratizing data,” understanding data, and having agency to engage in social processes because of the data, “data literacy will be the most important skill of the 21st century.” – Emmanuel Letouzé

“Being able to relate to data is being able to relate to power and ask the important questions,” says Johannes Jütting of PARIS21. A video we launched in October entitled “Entering the Age of Data: A Focus on Data Inclusion” addresses more of these politically-charged questions.

“…kinds of platforms have enormous opportunities in terms of mobilizing… democratizing information. Maps have been associated with powers and can be bought.” It’s an issue of the “availability versus blockage of information.” – Silke Roth, Associate Professor of Sociology at the University of Southampton

However, the power debate is not always as we would predict. “I think in general, the information… of science communications around politically-charged issues is certainly flawed,” Amy Luers observed. She asserts that there is a role for data analytics in terms of helping to advance action on climate; however, she notes that she does not believe it will work for the “persuasion of beliefs on climate change.” One area for the role of data could be microtargeting for marketing – getting people to buy other forms of energy for other reasons. Then the aggregate effect could be significant.


L to R: Simone Sala, Gabriel Pestre, Thomas Roca, Beth Tellman, Bessie Schwarz, Emmanuel Letouzé, David Sangokoya, Natalie Shoup, Lauren Barrett, Patrick Vinck, and Mariéme Jamme.

1. It’s a matter of quality over quantity.

“More data is not always necessary. Better data, yes,” says Vinck. The climate change risks are real and the severity of the impact is deepening. The amount of data produced can seem overwhelming, but moving forward, it will be critical to focus on the quality of the data rather than the quantity. Moore talks about the decisions that governments and citizens will increasingly have to make, even ones as great as where cities are built. “Those decisions are being made often on very poor data,” she says. Many of those decisions rely on histories, maps, and data sets that are old, lower in resolution, and not globally consistent. We need better data to be able to make better decisions.

 

Read on for more information on the Climate Change Resilience in the Age of Data event and on our research, capacity building, and engagement work in the thematic area of Climate Change and Resilience. Search #COP21Data on Twitter for highlights. Thank you to Thomas Roca of AFD for his inputs on this post.

Big Data and Migration, Part 1


This is the first in a series of blog posts on our ongoing work exploring the applications and implications of Big Data and migration.

People have always been on the move—whether to seek refuge from war and persecution, find jobs and create businesses, flee in the aftermath of natural disasters, or get a degree somewhere else. From country to city, from country to country, and across continents, migration is a fundamental part of human life and social development. But too often migration has been seen by policymakers and the public as problematic: a problem for destination countries if migrants are unskilled, poor, too numerous, or potentially linked to global terrorism; and a problem for source countries if it leads to the so-called ‘brain drain.’

Data from household surveys and census information, however, paint a rather different picture: migration has yielded trillions of dollars in impact on the global economy from “the money migrants send home, the taxes they pay, the funds they invest, the trade they stimulate, and the knowledge and technology transfer they stimulate.” In addition, migration plays a key role for almost 1 billion people—roughly 214 million international migrants and 740 million internal migrants worldwide—in reducing poverty, escaping conflict and increasing prosperity.

The availability and increased study of migration data—data on forced (conflict- and disaster-induced), internal, international and labor migrations—largely arose from the efforts of the World Bank on the impact of remittances and the International Organization for Migration (IOM). In the past ten years, conferences such as the annual Migration and Development conference and increased empirical analyses on the intersection between migration and development (M4D) have brought attention to migration as a major development issue. Recent trends in the global political economy and technology landscapes point to a need and an opportunity to ground migration debates, policies and decisions on firmer empirical grounds, so as to improve outcomes.

First, though notably excluded during the formation of the MDGs, migration is now firmly part of the post-2015 UN Development Agenda under goal 10 of the SDGs: to “facilitate orderly, safe, regular and responsible migration and mobility of people, including through the implementation of planned and well-managed migration policies.” In addition, the SDGs also promote eradicating human trafficking, the protection of labor rights, safe work environments for migrant workers (particularly women), reducing the cost of migrant remittances, and reducing the number of people affected by disasters. The plight of migrants coming to Europe has also led to calls for a renewed focus on migration as a key policy topic in a world where political, economic and climatological processes are increasingly integrated and contentious.

Second, the ‘data revolution’ is here and should be harnessed for migration. Even with advances in data on remittances, human trafficking, and stocks of migrants and their attributes, the 2009 Commission on International Migration Data for Development highlighted the lack of “detailed, comparable, disaggregated data on migrant stocks and flows” as the main obstacle preventing the “formulation of evidence based policies to maximise the benefits of migration for economic development around the world.” Seven years later, there has been an explosion and sea change in the sources and kinds of data on all facets of human movements that can be collected, analyzed, and visualized.

Originally framed as the “3 V’s” (volume, velocity and variety) in the early 2000s, Big Data has emerged as an ecosystem of “3 C’s”: digital “crumbs” (digital translations of human actions and interactions captured by digital devices); powerful capacities to collect, aggregate and analyze data; and communities involved in generating, governing and using data, including data generators, end users, policy-makers, experts, privacy advocates and civic hacker communities.

The introduction of these new data sources, tools and methods will not replace ongoing efforts to make better use of existing migration data; rather, they can both supplement existing statistical data on mobility and introduce new knowledge on human movement at levels of temporal and geographical granularities that could not previously be achieved.

The table below summarizes relevant data generated from three main sources, falling under exhaust, web-based, and sensing data: mobile phones and automatic data collection systems; email and social media data; and sensors. Of course, to reiterate a key point, a large share of these data is emitted passively, raising major privacy considerations. Collection occurs in both the public and private spheres, as well as in the more complex intersection of the two—e.g. public Tweets or Foursquare check-ins—which raises the question of whether any or all uses of the data require explicit consent from the user.

Migration Big Data sources, by system, device, and public or private collection:

Mobile-generated data
  • GPS – public: fleet tracking, bus AVL, taxis (GPS); private: fleet tracking, Uber, Lyft and taxi apps, ridesharing services
  • Floating vehicle data (dedicated probing) – in-vehicle navigation systems (TomTom, Inrix, Garmin)
  • Cellphone call detail records (CDR) – mobile phone towers or GPS
  • Geo-located social media data (smartphones) – public: Tweets (Twitter API), check-ins (Foursquare); private: tweets, micro-blog posts, check-ins, photos

Automatic data collection systems
  • Fare collection – public: public transport ticketing, smart cards, contactless payments; private: public transport ticketing, contactless payments
  • Road electronic tolling
  • Automatic passenger counts (APC) – in-vehicle (buses) and at stations (subway, BRT)

Websites and social media services
  • Cookies and log-ins – log-in data, Twitter data, Facebook data

Physical sensors
  • Network sensors – CCTV and surveillance cameras, traffic cameras, automatic number plate recognition, Bluetooth detection, loop detectors, environment sensors (temperature, noise, air quality), parking meters, rail track sensors
  • In-vehicle sensors – bus monitoring sensors, driving performance, fuel consumption, engine temperature
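To make the mobile-phone rows of this summary more concrete, here is a minimal sketch, in Python, of how anonymized call detail records (CDRs) could be aggregated into a monthly ‘home’ region per subscriber, with month-to-month changes read as internal moves. The column names, the synthetic records, and the modal-region heuristic are assumptions for illustration only; they do not reproduce any specific project’s pipeline, and real analyses would run on far larger, properly anonymized datasets.

```python
import pandas as pd

# Hypothetical CDR table: one row per call or SMS event (columns are assumptions).
cdr = pd.DataFrame({
    "subscriber_id": ["a", "a", "a", "a", "b", "b", "b", "b"],
    "timestamp": pd.to_datetime([
        "2015-01-03", "2015-01-20", "2015-02-05", "2015-02-25",
        "2015-01-10", "2015-01-15", "2015-02-02", "2015-02-18",
    ]),
    "tower_region": ["North", "North", "South", "South",
                     "East", "East", "East", "East"],
})

cdr["month"] = cdr["timestamp"].dt.to_period("M")

# "Home" region per subscriber and month = the region where most events occur.
home = (cdr.groupby(["subscriber_id", "month"])["tower_region"]
           .agg(lambda s: s.value_counts().idxmax())
           .unstack("month"))

# A change of modal region between consecutive months is treated as a move.
months = sorted(home.columns)
flows = (home[home[months[0]] != home[months[1]]]
         .groupby([months[0], months[1]])
         .size()
         .rename("movers"))

print(home)
print(flows)  # e.g. North -> South : 1 mover
```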

Using Big Data to detect and predict natural hazards better and faster: lessons learned with hurricanes, earthquakes, floods


Data–Pop Alliance has been conducting ongoing research on Big Data, climate change and environmental resilience. With funding from the UK’s Department for International Development (DfID), we published a synthesis report evaluating the opportunities, challenges and required steps for leveraging the new ecosystem of Big Data and its potential applications and implications for climate change and disaster resilience. This report will feed into the World Humanitarian Summit to be organized in Istanbul in May 2016.

This is the first in a series of companion pieces that offer insights from the synthesis report. The authors of the series attempt to go “beyond the buzz” to lay out what we actually know about Big Data’s existing utility for disaster science and for building practical resilience.


Every day, geological, biological, hydrological, and climatic factors produce natural hazards, which in some cases result in natural disasters that can have a devastating impact on ecosystems and human societies. Hazards can be geophysical (e.g. earthquakes, cyclonic storms), biological (e.g. infestation), or generated by a combination of different factors (e.g. floods, wildfires).

Big Data technologies can play a role in:

  • monitoring hazards
  • determining the exposure of human societies to disaster risk
  • tracking impacts of disasters and monitoring recovery efforts
  • mitigating vulnerabilities; and
  • strengthening resilience of communities.

Particularly interesting is the role of Big Data for detecting earthquakes, floods, hurricanes, as well as forecasting future occurrence of such hazards.

Earthquakes

Even though science is clear that earthquakes cannot be predicted, the detection of such events increasingly leverages data from physical sensors and digital social data.

Earthquake epicenters (1963-1998)
Source: NASA, Digital Tectonic Activity Map (DTAM) project

The combined application of accelerometers in mobile phones and computers with cloud computing can speed up the detection of earthquakes. A group of scientists (Cochran, Lawrence, Christensen, and Jakka) employed this approach to develop the Quake-Catcher Network (QCN), a seismic network leveraging distributed/volunteer computing to gain critical insights on an earthquake by bridging traditional seismic stations with innovative data sources. In 2009 the group demonstrated that it is possible to detect small earthquakes through a global network of computers connected via the Internet, highlighting the existing capacity to develop rapid earthquake early warning systems at relatively low cost thanks to distributed data collected from the Internet.
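As a rough illustration of how an accelerometer trace can be turned into a candidate detection on a single volunteered device, the sketch below applies a classic short-term-average/long-term-average (STA/LTA) trigger to a synthetic acceleration signal. This is a generic example of the technique, not QCN’s actual code: the window lengths, threshold, and signal are invented, and a real network would compare triggers across many hosts before declaring an event.

```python
import numpy as np

def sta_lta_trigger(signal, sta_len=50, lta_len=500, threshold=4.0):
    """Return sample indices where the STA/LTA energy ratio exceeds the threshold."""
    energy = signal ** 2
    # Running means of the signal energy over short and long windows.
    sta = np.convolve(energy, np.ones(sta_len) / sta_len, mode="same")
    lta = np.convolve(energy, np.ones(lta_len) / lta_len, mode="same")
    ratio = sta / np.maximum(lta, 1e-12)  # avoid division by zero
    return np.where(ratio > threshold)[0]

# Synthetic trace: background noise with a burst of shaking around sample 5000.
rng = np.random.default_rng(0)
trace = rng.normal(0, 0.01, 10_000)
trace[5000:5200] += rng.normal(0, 0.2, 200)

triggers = sta_lta_trigger(trace)
if triggers.size:
    print(f"candidate event around sample {triggers[0]}")
```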

In 2014 a group of scientists (Musaev, Wang, and Pu) developed LITMUS, a model to detect landslides following earthquakes by integrating multiple data sources. By integrating social sensors (Twitter, Instagram, and YouTube) and physical sensors (USGS seismometers and TRMM satellite), the model scored better than traditional techniques employed by USGS for real-time hazard mapping.

Digital social data from relevant organizations are being integrated to detect when crises happen. For example, the USGS monitors Tweets mentioning earthquakes worldwide with magnitudes of 5.5 and above as a means of detecting them and issuing alerts more broadly through their Twitter Earthquake Dispatch (@USGSted).
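On the social-sensing side, a minimal sketch of the underlying idea is to count keyword mentions per minute and flag minutes that rise far above a baseline rate. The message stream, keywords, and thresholds below are invented for illustration and do not represent the actual @USGSted pipeline.

```python
from collections import Counter
from datetime import datetime

KEYWORDS = ("earthquake", "sismo", "temblor")

# Hypothetical stream of (timestamp, text) pairs, e.g. from a social media API.
messages = [
    (datetime(2015, 4, 25, 6, 11, 5), "Did anyone else feel that earthquake?!"),
    (datetime(2015, 4, 25, 6, 11, 40), "huge temblor just now"),
    (datetime(2015, 4, 25, 6, 12, 2), "earthquake, things fell off the shelf"),
    (datetime(2015, 4, 25, 6, 30, 0), "nice weather today"),
]

# Count keyword mentions per minute.
per_minute = Counter()
for ts, text in messages:
    if any(k in text.lower() for k in KEYWORDS):
        per_minute[ts.replace(second=0, microsecond=0)] += 1

BASELINE = 0.1   # assumed long-run mentions per minute for this region
THRESHOLD = 10   # flag minutes that are 10x above the baseline

for minute, count in sorted(per_minute.items()):
    if count > BASELINE * THRESHOLD:
        print(f"{minute}: possible felt event ({count} mentions)")
```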

Floods

Big Data also allows the early detection of floods. By combining information related to flooding from Twitter and satellite observations, a group of scientists (De Groeve, Kugler, and Brakenridge) built a real-time map of the location, timing, and impact of floods. The map, constantly updated, can be accessed online.

Global flood maps from 1985 to 2007
Source: Dartmouth Flood Observatory Global Active Archive of Large Flood Events

Social media enables qualitative situational analysis before, during, and after disasters. Floodtags (a social media analytics platform) was employed to extract information from Twitter, enabling the filtering, visualization, and mapping of social media content based on location and keywords. Satellite data came from the Global Flood Detection System (GFDS), which provides a service for rapid identification of inundated areas through daily passive microwave satellite observations. The approach was tested in two case studies, respectively in the Philippines and in Pakistan, proving to be particularly appropriate for monitoring large floods in densely populated areas.
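Conceptually, the combination boils down to joining a social-media signal with a satellite signal over the same area and time, and flagging periods where both agree. The sketch below uses made-up daily values and thresholds to show that logic; it is not the Floodtags or GFDS implementation.

```python
import pandas as pd

# Hypothetical daily series for one river basin (all values are illustrative).
days = pd.date_range("2014-12-01", periods=6, freq="D")
signals = pd.DataFrame({
    "flood_tweets": [3, 5, 80, 120, 60, 10],              # filtered, geolocated messages
    "satellite_signal": [0.9, 1.0, 1.4, 1.8, 1.6, 1.1],   # GFDS-style anomaly ratio
}, index=days)

TWEET_THRESHOLD = 50        # assumed burst level for this basin
SATELLITE_THRESHOLD = 1.3   # assumed anomaly ratio indicating inundation

# A day is flagged only when the social and the physical sensor agree,
# which reduces false alarms from either source alone.
signals["flood_flag"] = (
    (signals["flood_tweets"] > TWEET_THRESHOLD)
    & (signals["satellite_signal"] > SATELLITE_THRESHOLD)
)
print(signals)
```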

Twitter pattern linked to 2014 floods in the Philippines
Source: Jongman et al., 2015

In the Netherlands, where the vast majority of the population lives in flood-prone areas, the government has started experimenting with how machine learning may help strengthen preparedness for future floods. In Australia, the New South Wales State Emergency Service developed an early warning system able to perform predictive analysis of floods in the region based on the integration of the Bureau of Meteorology’s external data with additional datasets (e.g. flood plain data and historical information from various databases).
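As a hedged illustration of what ‘machine learning for flood preparedness’ can mean in its simplest form, the sketch below fits a logistic regression to synthetic rainfall and river-level features and outputs a flood probability for a new observation. The features, the labelling rule, and the model choice are assumptions for illustration only, not a description of the Dutch or New South Wales systems.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(42)

# Synthetic training data: [3-day rainfall (mm), current river level (m)]
X = rng.uniform([0, 0.5], [300, 6.0], size=(500, 2))
# Toy rule standing in for historical flood records: heavy rain on a high river floods.
y = ((X[:, 0] > 150) & (X[:, 1] > 3.5)).astype(int)

model = LogisticRegression(max_iter=1000).fit(X, y)

# Probability of flooding tomorrow given today's observations.
tomorrow = np.array([[220.0, 4.2]])
print(f"flood probability: {model.predict_proba(tomorrow)[0, 1]:.2f}")
```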

Storms

Big Data has also proven to be helpful in monitoring and assessing the impacts of storms, whether they be hurricanes, typhoons or cyclones (the distinction depending only on where the storm occurs). Indeed, Big Data technologies allow an unprecedented capacity to crunch data from distributed datasets that will help gain innovative insights on the weather system. The Government of South Korea, for example, upgraded the simulation capacity of its meteorological office by 1,000% – providing it with the most capable storage system in the country. Not only weather data, but also social data as well as data from mobile telecommunication operators can be useful for mapping and analyzing meteorological hazards. In Bangladesh, the Mobile Data, Environmental Extremes and Population (MDEEP) project investigated how data from the national telecommunication operator Grameenphone could have provided insights on the effectiveness of early warning systems during the occurrence of cyclone Mahasen in 2013.

Big Data-powered visualization tools seem particularly promising for helping real-time sensemaking of the weather system as well as for raising awareness of natural hazards among citizens. Earth, an open source animated map integrating data from NOAA’s global forecast system and OSCAR’s ocean currents, among other sources, is a clear example of such potentialities.

Cyclone Pam in the proximity of Vanuatu (March 13, 2015) as portrayed in earth
Explore the visualization on the platform
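For readers curious how such animated wind maps are built, the core ingredient is simply a gridded vector field rendered over a map. The sketch below draws a synthetic cyclonic wind field with matplotlib; the grid, the vortex formula, and the scaling are invented, and real tools like earth ingest actual GFS model output and render it far more elaborately.

```python
import numpy as np
import matplotlib.pyplot as plt

# Synthetic wind field on a coarse lat/lon grid (a stand-in for model output).
lon, lat = np.meshgrid(np.linspace(160, 175, 16), np.linspace(-25, -10, 16))
# A crude clockwise (southern hemisphere) vortex centred near Vanuatu, illustrative only.
dx, dy = lon - 168.0, lat + 17.0
r2 = dx**2 + dy**2 + 1.0
u, v = dy / r2 * 40, -dx / r2 * 40   # wind components (m/s), arbitrary scaling

plt.figure(figsize=(6, 6))
plt.quiver(lon, lat, u, v, np.hypot(u, v), cmap="viridis")
plt.colorbar(label="wind speed (m/s)")
plt.xlabel("longitude")
plt.ylabel("latitude")
plt.title("Synthetic cyclonic wind field (illustrative only)")
plt.show()
```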

Works cited

Cochran, E. S., Lawrence, J. F., Christensen, C., & Jakka, R. S. (2009). The Quake-Catcher Network: Citizen Science Expanding Seismic Horizons. Seismological Research Letters, 80(1), 26–30. http://doi.org/10.1785/gssrl.80.1.26

De Groeve, T., Kugler, Z., & Brakenridge, G. R. (2007). Near Real Time Flood Alerting for the Global Disaster Alert and Coordination System. In Proceedings of the 4th International ISCRAM Conference. Delft, the Netherlands.

Musaev, A., Wang, D., & Pu, C. (2015). LITMUS: A Multi-Service Composition System for Landslide Detection. IEEE Transactions on Services Computing, 8(5), 715–726. http://doi.org/10.1109/TSC.2014.2376558

The Real Promise of Big Data: Seeing Like a State and Like a Citizen


Data–Pop Alliance has been conducting ongoing research on Big Data, climate change and environmental resilience. With funding from the UK’s Department for International Development (DfID), we published a synthesis report evaluating the opportunities, challenges and required steps for leveraging the new ecosystem of Big Data and its potential applications and implications for climate change and disaster resilience. This report will feed into the World Humanitarian Summit to be organized in Istanbul in May 2016.

This is the second in a series of companion pieces that offer insights from the synthesis report. The authors of the series attempt to go “beyond the buzz” to lay out what we actually know about Big Data’s existing utility for disaster science and for building practical resilience.


As a social scientist studying how flows of information and knowledge affect societies’ ability to adapt to environmental constraints, I see two things that could go wrong with the current craze for Big Data in the context of development, resilience and adaptation to climate change.

First, it could fail to improve the accountability and responsiveness of decision-makers (governments, corporations, organizations that wield resources and power) to citizens. In fact, Big Data could weaken relationships of accountability by making decision-makers more responsive to the subset of citizens that possess the human and financial means to engage with information technologies. Second, Big Data could grossly misrepresent people’s experiences and aspirations. It could turn out to be yet another example of misguided technocratic hubris, of the kind denounced by political scientist James Scott in “Seeing Like a State: How Certain Schemes to Improve the Human Condition Have Failed.” In this book, Scott argues that plans based on scientific models and data often ignore local knowledge about complex interdependencies on the ground, and backfire as a result. For example, agronomists for a long time tried to transpose scientific agricultural practices developed for temperate climates to tropical countries. In doing so, they displaced the varied polycultures of farmers, which, he argues, were in fact much better adapted to the infertile soils in these regions.

Various observers have already noted these dangers. Some have pointed out that Big Data could exacerbate power disparities. With titles such as “Seeing Like a Database,” others underscore the uncanny parallel between Scott’s critique of bureaucracies’ past embraces of data and the current rush towards big data solutions. But there is no point arguing about whether citizens, organizations and decision-makers should embrace big data – due to the digitization of society, these data flows are here to stay and grow. Instead I would argue that we have the ability to shape the new data creed so that it meets our shared need for making better individual and collective decisions. Since the use of data is always political, the pre-condition is that data becomes available and legible in ways that improve political processes from local to global scales. This seems possible because we now live in a world where vastly more people are educated, more governments are democratic, and over 90% of people across the world have access to information technologies. Furthermore, humanity has been warned already several times of the dangers of data-fueled technocratic decisions. Minds are thus poised to think about more inclusive ways of generating, diffusing, and understanding data. The report published by DfID in December 2015 provides several important examples of this.

An important aspect of Big Data is that it adds new channels of information to both citizens and governments. These channels carry information about members of society, the situations that are unfolding – such as disasters, inflation, social tensions, etc. – and how members of government react to these situations and citizens’ needs. This has important implications. 

First, decision-makers are better able to “see” their population. Enthusiasm about Big Data in development circles arises in large part from the perception that it can provide more comprehensive and more fine-grained data about populations than what is produced by statistical offices. For example, information about phone call patterns can provide good estimates of population density and income at a much finer spatial and temporal resolution than censuses do. These data sources may also better account for people living in informal settlements. Since there can be no responsiveness if people are not even counted, the potential increase in state capacity thanks to Big Data is an important part of improving accountability.
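A minimal sketch of the kind of aggregation this relies on: counting night-time call activity per cell tower (a common proxy for where people sleep) and dividing by an approximate tower coverage area to obtain a relative density surface. The tower IDs, counts, areas, and the night-time heuristic are all assumptions for illustration, not the method of Deville et al.

```python
import pandas as pd

# Hypothetical hourly call counts per cell tower (synthetic values).
calls = pd.DataFrame({
    "tower_id": ["T1", "T1", "T2", "T2", "T3", "T3"],
    "hour":     [2, 14, 2, 14, 2, 14],
    "n_calls":  [120, 900, 40, 60, 400, 350],
})
# Approximate coverage area of each tower in km^2 (assumed for the sketch).
tower_area_km2 = {"T1": 2.0, "T2": 25.0, "T3": 5.0}

# Night-time activity is used here as a proxy for resident population
# rather than daytime commuters.
night = calls[calls["hour"].between(0, 5)].groupby("tower_id")["n_calls"].sum()

density = (night / pd.Series(tower_area_km2)).rename("relative_density")
# Normalise so values can be compared across towers (share of activity per km^2).
print(density / density.sum())
```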

Seasonal changes in population distribution in Portugal and France, based on Call Detail Record (CDR) data.
Source: Deville et al. (2014).

Second, governments may become more legible to citizens. This is the critical second part of the accountability equation, which is much less often discussed. An example of how Big Data can help here is the Billion Prices Project at MIT, which monitors prices posted online to detect inflation trends and food insecurity. The project has shown that tapping into this passively emitted data (data arising spontaneously as a side-product of commercial activities) can reveal a very different situation than that depicted by official statistics or journalists. The impact of such information on societal resilience can be tremendous. In a 2000 study, Timothy Besley and Robin Burgess of the London School of Economics showed that over a period of 30 years, Indian state governments had been much more prone to distributing food to people suffering from food shortages in states with high levels of newspaper circulation and literacy, i.e. in areas where people were aware of the crisis and of what the government was doing about it. No quantitative research has yet studied the effect of digital technologies and Big Data flows on accountability in the context of disaster management, but this is something worth doing to learn how Big Data can contribute to resilience.
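To give a flavor of how online prices can be turned into a daily inflation signal, the sketch below chains the geometric mean of day-over-day price relatives for a small basket of products (a simple Jevons-style index). The products and prices are invented, and the actual Billion Prices Project methodology is considerably more elaborate (matching products over time, handling entry and exit, weighting, and so on).

```python
import pandas as pd

# Hypothetical daily prices scraped for the same basket of online products.
prices = pd.DataFrame({
    "date":    pd.to_datetime(["2015-06-01"] * 3 + ["2015-06-02"] * 3 + ["2015-06-03"] * 3),
    "product": ["rice", "oil", "bread"] * 3,
    "price":   [1.00, 3.00, 0.50,
                1.02, 3.00, 0.52,
                1.05, 3.10, 0.52],
})

wide = prices.pivot(index="date", columns="product", values="price")

# Daily index step: geometric mean of day-over-day price relatives across products.
# The first day has no relatives (NaN), so its step defaults to 1 and serves as the base.
relatives = wide / wide.shift(1)
daily_step = relatives.prod(axis=1) ** (1 / wide.shape[1])
chained = daily_step.cumprod() * 100

print(chained)  # index level, base 100 on the first day
```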

We see from this discussion that for Big Data to contribute to better political processes, it is fundamental that data (on the environment, communities, the economy, and governments) be open. Hence Big Data and the movement towards open data must go hand in hand. Important initiatives are underway, such as the Open Data for Resilience Initiative (OpenDRI). OpenDRI uses an open-source application, called Geonode, to help local actors process their existing data, engage communities in mapping data about their changing exposure to natural hazards and in participatory risk analysis. It is also fundamental that citizens have the means to be ingenious with that data. This means democratizing the access to analytical tools. Software such as InaSAFE can help create an interface to analyze the risk from disasters. Some engineers are working on an application to allow non-engineers to develop disaster-management applications themselves, from a simple visual user interface. Although little discussed thus far, it would also seem opportune to create partnerships with local universities, which could curate relevant datasets and engage in online and offline data analysis workshops to empower students in these universities to extract relevant knowledge.

The third consequence of new and more diverse information flows is that our perceptions are bound to become more robust, as they become less dependent on a narrow set of academic and administrative surveys generated according to predefined ideas (usually held by Western elites). In other words, Big Data flows can challenge people’s thinking in more and perhaps different ways than do the results of traditional measurement and sampling. Relatedly, more people can contribute to shaping the information flows. For example, citizens can identify new sources of relevant Big Data (as in the Billion Prices Project). Or people can actively generate data, as do those who film encounters between citizens and police, documenting cases of abuse. Big Data and its flow through information technologies–unless hijacked and locked up by bureaucracies and corporations–are more inclusive than data structures that are primarily generated and controlled by decision-makers.

Comparison of OpenStreetMap coverage of Kathmandu, before and after the 2015 Nepal earthquake and the efforts of the Humanitarian OpenStreetMap Team.
Source: Humanitarian OpenStreetMap Team website, accessed August 2015.

As a result, we may avoid the worst pitfalls highlighted by James Scott in “Seeing Like a State,” namely mischaracterizing or simply ignoring ecological and social realities (including know-how and local knowledge) with the false confidence lent by science and data. In fact, the most sophisticated and most exciting Big Data projects in development and resilience work blend machine-generated data (from satellites, cell phones, internet activity and data-analytic algorithms) with local knowledge actively generated and communicated by citizens. For example, OpenStreetMap, a globally distributed organization with 1.5 million registered users and local groups in over 80 countries, is working to create a common open digital map of the world. Users edit a single database of maps remotely, digitizing the presence of roads, buildings, and so on, based on satellite images and their local knowledge. The Humanitarian OpenStreetMap Team (HOT) expands on this idea to create digital maps that are important for disaster preparedness or relief, again using local knowledge to locate vulnerable or damaged infrastructure. These organizations contribute to the formation of knowledge networks (rather than information products).

There is a risk that Big Data gets co-opted to reproduce an elitist and technocratic approach to solving public problems. We have learned from past experience that such an approach tends to weaken civil society and create more problems than it solves. Yet, there is also the promise – and, I would like to believe, the adequate conditions – for Big Data to give civil society new tools to engage in a more pluralist, robust, and resilient approach to development and environmental management.

Works cited

Besley, T., & Burgess, R. (2000). Land Reform, Poverty Reduction, and Growth: Evidence from India. The Quarterly Journal of Economics, 115(2), 389–430. http://doi.org/10.1162/003355300554809

Data-Pop Alliance. (2015). Big Data for Resilience: Realising the Benefits for Developing Countries (Synthesis report.).

Deville, P., Linard, C., Martin, S., Gilbert, M., Stevens, F. R., Gaughan, A. E., … Tatem, A. J. (2014). Dynamic population mapping using mobile phone data. Proceedings of the National Academy of Sciences of the United States of America, 111(45), 15888–15893. http://doi.org/10.1073/pnas.1408439111

Scott, J. C. (1999). Seeing like a State: How Certain Schemes to Improve the Human Condition Have Failed. New Haven: Yale University Press.

Urban Identity and the Fragility-Resilience Double Helix


This is the first in our series of blog posts on our ongoing work on urban fragility and resilience.


“In the 1800s, 1 in 30 people lived in cities; today it is 1 in 2; and tomorrow virtually everyone is going to be there,” Robert Muggah reported in the TED Radio Hour episode Building Better Cities.

Over half of the world’s population (54%) lives in urban areas, and that number is projected to grow to more than two-thirds (66%) by the year 2050. Urban population growth is not only continuing, its speed is dramatically quickening: urbanization is producing between 65 and 70 million new urban residents a year. For the first time in our history, more people live in cities than outside of them, and the UN estimates that more than 90% of future population growth will occur in cities. As the pace of urbanization quickens in our increasingly global and interconnected world, emergencies and disasters will concentrate in cities rather than nation states. In the face of these shocks and stresses, government, institutional, and community capacities to effectively respond, provide services and security, and govern will be tested. These capabilities will be pushed to the brink predominantly in cities not in North America or Western Europe, but rather in Africa, Asia, and Latin America—areas that have the world’s fastest growing cities and are most prone to conflict, disasters and natural hazards.

Figure 1: Map of urbanization growth, 2014 vs. 2050

Although urbanization is often linked to opportunity and innovation—ushering in new possibilities for health, education, wellbeing, and the exchange of goods and services—it also produces new risks to security and stability. Areas that rapidly urbanize are often prone to high rates of violence, drug trafficking, crime, and unemployment, all of which undermine their governance, economic productivity, and livability. As the challenges posed by these risk factors increase, there is a real possibility that such cities become disrupted and fragile.

While all cities possess distinctive strengths and weaknesses, some are more successful and resilient than others. But how, why, and when do cities succumb to fragility or endure in resilience? Surprisingly, there is limited information available—i.e. structured (administrative) and unstructured (ambient) data—about the underlying security and stability of cities before, during, and following exposure to stress. This is a critical knowledge and capacity gap that urgently needs attention, as Sustainable Development Goal 11 of the post-2015 UN Development Agenda emphasizes the importance of constructing stable and resilient cities. As urbanization develops unevenly, inequitably, and at a fast-moving and dynamic speed, understanding resilience and fragility is vital. However, the concepts of fragility and resilience are difficult to define because of the many parameters specific to the communities that make up each unique city and the challenges those communities face.

“Managing urban areas has become one of the most important development challenges of the 21st century. Our success or failure in building sustainable cities will be a major factor in the success of the post-2015 UN development agenda”

John Wilmoth, Director of UN DESA’s Population Division

In recognizing the complexity of urban environments, we may be better equipped to identify fragility by bringing together many factors to inform modeling. Some work has begun in attempting to better define what it means for a city to be resilient and fragile. Robert Muggah, Ronak Patel, and Joseph de Boer have recently initiated work on mapping a conceptual framework for resilience and fragility in “City fragility and resilience: a preliminary conceptual framework,” a background paper for the UNU/World Bank project on fragile and resilient cities. Fragile cities, as described by Muggah, Patel, and de Boer, are “those that are susceptible to a rapid deterioration and erosion to core functions owing to an overwhelming accumulation of internal and external risks.” These risks can take on many forms, including economic, environmental, political, and social. Some empirical evidence has pointed to a few factors that shape urban fragility:

  • Rapid and unregulated urbanization
  • Income and social inequality
  • Concentrated poverty
  • Youth unemployment
  • Policing and justice deficits
  • Real and perceived insecurity
  • Natural hazard exposure

Resilient cities, on the other hand, are cities that have the ability to preserve or improve core functions even in the face of stress and shocks. This is done by reducing exposure, limiting vulnerabilities, and enhancing adaptations to stress through institutional, individual, organizational or community action. This ability to preserve core functions is usually determined by a collection of relations between the state and the society. Factors related to shaping city resilience are usually identified as the reverse of those linked to fragility, yet this presents only a basic understanding of the complex and interactive nature of resilience factors. These factors include:

  • Income and social equity
  • Effective policing and justice mechanisms
  • Microeconomic security/social protection
  • Provision of basic services
  • Social cohesion
  • Social networks/social support
  • Strong community-government cooperation

A common misconception surrounding urban fragility and resilience lies in the perception that fragility and resilience are mutually exclusive. Additionally, a common assumption is that the relationship between fragility and resilience is inverse (i.e. that reducing fragility is the same as promoting resilience, or vice versa); however, this is not necessarily always the case. Urban fragility and resilience can exist acutely and chronically, both at the same time (e.g. San Francisco with pockets of fragility and violence in several neighborhoods, and Mogadishu with pockets of resilience among communities of entrepreneurs). Although typically understood as separate and opposing concepts, fragility and resilience are actually interconnected and overlapping phenomena. Additionally, the presence of fragility does not necessarily imply the absence of resilience.
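A toy way to see why fragility and resilience are not simple inverses is to score a city separately against the two factor lists above: nothing forces one composite down when the other goes up. The factor names follow the bullets above, but the scores and the crude unweighted averaging are invented purely for illustration.

```python
# Toy illustration: a city can score meaningfully on both fragility and resilience
# at the same time. Factor lists follow the bullets above; scores are invented.

FRAGILITY_FACTORS = [
    "rapid_unregulated_urbanization", "inequality", "concentrated_poverty",
    "youth_unemployment", "policing_justice_deficits",
    "perceived_insecurity", "natural_hazard_exposure",
]
RESILIENCE_FACTORS = [
    "income_social_equity", "effective_policing_justice",
    "social_protection", "basic_services", "social_cohesion",
    "social_networks", "community_government_cooperation",
]

def composite(scores, factors):
    """Unweighted mean of 0-1 factor scores; a deliberately crude composite."""
    return sum(scores.get(f, 0.0) for f in factors) / len(factors)

# Hypothetical city with strong hazards and inequality *and* strong cohesion.
city = {
    "natural_hazard_exposure": 0.9, "inequality": 0.8, "concentrated_poverty": 0.7,
    "social_cohesion": 0.8, "social_networks": 0.9, "basic_services": 0.6,
}

print("fragility index: ", round(composite(city, FRAGILITY_FACTORS), 2))
print("resilience index:", round(composite(city, RESILIENCE_FACTORS), 2))
```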

Muggah, Patel, and de Boer illustrate the connection between resilience and fragility within a city using the fragility-resilience double helix (see Figure 2).13 This relationship helps to define the specific capacities of a city to retain core functions in the face of stress and hazards by delineating the resilient and fragile factors the city holds. The unique identity of each city, or the urban identity, is created by factor variation, just as genetic variation accounts for human individuality. The urban identity defines the city’s ability to respond to internal or external stressors. Nonetheless, it is not just about the type of factors the double helix possesses, but also how those factors interact. The coexistence of fragility and resilience brings about a new understanding of why, where, and how these varying concepts produce cities that succumb to fragility or endure in resilience. The interactions of these factors create unique vulnerabilities and strengths in the face of stress, in the same way that our genetic make-up creates immunity or susceptibility to disease in the human body.

Figure 2: The Fragility-Resilience Double Helix14

For example, Celiac disease, which is strongly associated with variants of the HLA-DQB1 gene, produces an abnormal vulnerability to gluten in the human body. Without diagnosis of the disease and strict adherence to a gluten-free diet, exposure to gluten irritates the body and triggers a wide variety of symptoms. Similarly, Medellin, Colombia in the 1990s held fragility and resilience factors that made it extremely vulnerable to chronic crime, making it one of the most murderous cities in the world. Over the past few decades, however, Medellin has incorporated new resilience factors: building cable cars and a metro to connect poor areas to middle-class ones; installing quality infrastructure in the poorest areas; and establishing social programs that target one of the most vulnerable populations – youth – with risk reduction programs. Although fragility factors such as drug trafficking and rapid urbanization persist, these added resilience factors have shifted Medellin to a new urban identity: much less susceptible to acute outbreaks of conflict and better equipped to handle chronic crime. Since the 1990s, Medellin’s murder rate has dropped by 80%.15 Although DNA and genetic variation may be fixed at conception, urban variation and its fragile and resilient factors can wax and wane over time: while the relationship between fragility and resilience can be described as a double helix, the underlying qualities can still be learned.

Because not all cities evolve equally, identifying these factor variants and their relationships within urban helixes could prove extremely useful for the future of urbanization and sustainable development, helping to prevent citywide disturbances and failures. Mapping the extent of fragility and resilience within a city will require an in-depth analysis of its factor variants and of the systems and protocols needed to handle threats. Because of the complexity of cities themselves, identifying connections and interactions between factors will require amassing robust data. As the digital divide continues to shrink, due in part to urbanization, information and communication technologies will become increasingly valuable as a source for measuring the fragility and resilience factors that make up a city’s identity.16
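To illustrate how such factor variants might be combined in practice, the sketch below (in Python) builds a toy fragility and resilience profile for a handful of hypothetical cities. The city names, indicator values, and equal weighting are all invented for illustration; a real analysis would require validated indicators and a defensible weighting and interaction model.

```python
import pandas as pd

# Hypothetical, illustrative indicators (0 = low, 1 = high); not real data.
cities = pd.DataFrame({
    "city":                     ["City A", "City B", "City C"],
    "unregulated_urbanization": [0.8, 0.3, 0.5],
    "income_inequality":        [0.7, 0.4, 0.6],
    "youth_unemployment":       [0.6, 0.2, 0.5],
    "policing_deficit":         [0.9, 0.1, 0.4],
    "social_cohesion":          [0.2, 0.8, 0.5],   # resilience factor
    "basic_services":           [0.3, 0.9, 0.6],   # resilience factor
}).set_index("city")

fragility_factors  = ["unregulated_urbanization", "income_inequality",
                      "youth_unemployment", "policing_deficit"]
resilience_factors = ["social_cohesion", "basic_services"]

# Equal weights for illustration; real work would estimate or elicit weights.
fragility  = cities[fragility_factors].mean(axis=1)
resilience = cities[resilience_factors].mean(axis=1)

# Keep the two dimensions separate rather than collapsing them into one score,
# reflecting the point above that fragility and resilience are not simple opposites.
profile = pd.DataFrame({"fragility": fragility, "resilience": resilience})
print(profile.round(2))
```

Keeping fragility and resilience as two separate columns, rather than a single net score, mirrors the double-helix framing: a city can score high on both at once.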

The emergence of Big Data is opening avenues for research on city resilience and fragility. The data “crumbs” about human behaviors and beliefs generated and collected by technological devices, the “capacities” of powerful computational and analytical tools, and the “community” of actors working within the field will all contribute to an empirical understanding of the factors shaping fragility and resilience and of the interactions between them. Big Data could reveal problems and solutions surrounding current and future urban centers. With the knowledge amassed through Big Data’s three Cs, we may be able to build better cities, able to adapt and maintain equilibrium in the face of political, economic, social, or environmental stresses.

Works cited

1. NPR/TED Staff (2016) “How Are Some Fragile Cities Tackling Their Worst Problem?” TED Radio Hour episode Building Better Cities.
2. United Nations Department of Economic Social Affairs (UN DESA) (2014) “World Urbanization Prospects”
3. World Economic Forum (2015) “Strengthening Resilience in Fragile Cities.” Global Agenda on Fragility, Violence, & Conflict.
4. de Boer, J. (2015) “Resilience and the Fragile City.” Stability Journal 4(1); United Nations Department of Economic Social Affairs (UN DESA) (2014) supra
5. United Nations Office for the Coordination of Humanitarian Affairs (OCHA)(2012). “World Humanitarian Data and Trends.”
6. United Nations Department of Economic Social Affairs (UN DESA) (2014) “World Urbanization Prospects” 2014.
7. United Nations (2016) “Sustainable Development Goals: Goal 11.”
8. Muggah, R., Patel, R. & de Boer, J. (2016) “City fragility and resilience: a preliminary conceptual framework.” Background paper for the UNU/World Bank project on fragile and resilient cities. Tokyo: UNU and World Bank. (to be published)
9. Id.
10. Id.
11. See: Kaplan, S. (2009) “Enhancing Resilience in Fragile States.” European Report on Development; Birkmann, J. (2006) “Measuring Vulnerability to Promote Disaster-Resilient Societies: Conceptual Frameworks and Definitions.” Measuring Vulnerability to Natural Hazards: Towards Disaster Resilient Societies; Pendall, R., Foster, K.A., & Cowell, M. (2009) “Resilience and Regions: Building Understanding of the Metaphor.” Cambridge J Regions Econ 3(1): 71-84.
12. Muggah, R., Patel, R. & de Boer, J. (2016) supra
13. Id.
14. Figure 2 by Carson Martinez, Data-Pop Alliance Research Assistant; derived from Muggah, R., Patel, R., & de Boer, J. (2015) “City Fragility and Resilience: A Preliminary Conceptual Framework.” (Unpublished)
15. Muggah, R. (2015) “How Can We Make Latin American Cities Safer?” World Economic Forum.
16. Muggah, R. (2015) “Manifesto for a Fragile City.” Journal of International Affairs (68) 2: 19-36.

Announcement: Flowminder joins Data-Pop Alliance as fourth core member

Monday, February 22, 2016

Data-Pop Alliance is a global coalition on Big Data and development created by the Harvard Humanitarian Initiative (HHI), MIT Media Lab, and Overseas Development Institute (ODI) that brings together researchers, experts, practitioners, and activists to promote a people-centered Big Data revolution through collaborative research, capacity building, and community engagement.

Flowminder Foundation is a non-profit organization registered in Sweden, founded by academic researchers who pioneered the use of anonymised mobile network data for public health applications, beginning in 2008 with infectious diseases (malaria, cholera, dengue), and for disaster response (Haiti 2010, Nepal 2015). Flowminder is closely integrated with WorldPop, a leading open data repository for geospatial demographic data used by major development agencies and governments worldwide.

“Flowminder has an unparalleled track record of innovation and application of new digital data sources to support development and humanitarian organizations in the field. We are thrilled to welcome them as a core member alongside HHI, MIT Media Lab and ODI as we proceed to promote a people-centered Data Revolution,” says Emmanuel Letouzé, Director of Data-Pop Alliance. “With Flowminder as a core member we hope to continue fostering collaborative and connected approaches missing in the space.”

“Data-Pop Alliance has in a short time become the leading global initiative and think-tank on Big Data and development issues. As we have a shared vision and complementary focus areas we are excited to join forces and together be more active in contributing to the policy dialogue, as well as designing and implementing joint initiatives in support of the Sustainable Development Goals,” says Linus Bengtsson, Executive Director of Flowminder Foundation.

Leveraging the crowd: how people can contribute to disaster management


Data–Pop Alliance has been conducting ongoing research on Big Data, climate change and environmental resilience. With funding from the UK’s Department for International Development (DfID), we published a synthesis report evaluating the opportunities, challenges and required steps for leveraging the new ecosystem of Big Data and its potential applications and implications for climate change and disaster resilience. This report will feed into the World Humanitarian Summit to be organized in Istanbul in May 2016.

This is the third in a series of companion pieces that offer insights from the synthesis report. The authors of the series attempt to go “beyond the buzz” to lay out what we actually know about Big Data’s existing utility for disaster science and for building practical resilience.


When human communities are affected by disasters, collaboration is essential to mitigating their negative impacts. There are many interesting examples of collaborative, bottom-up insights building on the “wisdom of crowds” to help optimize disaster response and humanitarian assistance. This is mostly happening through digital tools, which are playing an increasingly important role in this field. Mobile devices and open source software (e.g. Open Data Kit and FrontlineSMS) enable non-expert users living in remote and/or underserved communities – especially in developing and emerging regions – to collect and share data. Additional tools allow the seamless integration of distributed data, real-time information sharing, and collaborative analysis of the information produced from data collected by the crowd.

In particular, spatial information is central to disaster response. This includes ‘traditional’ geographic information and aerial imagery produced by governmental organizations, as well as Volunteered Geographic Information (VGI), defined by Goodchild as “the widespread engagement of large numbers of private citizens, often with little in the way of formal qualifications, in the creation of geographic information.” Such information is often joined with other unstructured data and information released (not necessarily voluntarily) on the social web during disasters to enable real-time mapping of impacts as well as of response efforts.

The most renowned collaborative platform is Ushahidi (meaning “witness” in Swahili), an application developed by Kenyan journalist Ory Okolloh and technologist Erik Hersman to allow crowdsourced reporting of violent incidents in Kenya after the post-election crisis in 2008. An online map aggregated reports from citizens and journalists, and more than 45,000 people used the website to contribute and/or read reports during the first month of activity. Communities have since employed Ushahidi for a variety of purposes – including election reporting in Mexico and malaria surveillance with micro-monetary incentives in India. The platform has also been leveraged to track the impact of various disasters and to match requests and offers for help, as in the case of the Nepal earthquake in 2015.

Incidents related to the 2008 elections in Kenya visualized on the Ushahidi platform
Source: The Guardian

Thanks to tools like Ushahidi, the humanitarian sector is increasingly relying on the support of online volunteers who collaborate in disaster management operations. Such volunteers have been nicknamed digital humanitarians by Patrick Meier, former Director of Social Innovation at the Qatar Computing Research Institute (QCRI). Meier headed the first crowdsourced online crisis mapping initiative to support post-earthquake relief efforts in Haiti in 2010. With the help of more than 100 graduate and undergraduate students, he used the Ushahidi platform to post geographic information retrieved from social and mainstream media. This effort catalysed other initiatives. For example, Haitian expatriates worldwide volunteered online to translate messages from Creole to English. The OpenStreetMap community developed a detailed map of Haiti’s transportation network within a few days. Digicel, a telecommunication company operating in Haiti, set up a toll-free SMS service for people to report urgent needs and their location, with the help of the Thomson Reuters Foundation, InSTEDD, and the United States Department of State. The working group collected, integrated, and, whenever possible, verified the geographic information to provide a free source of information to organizations and communities providing relief to the population in Haiti after the earthquake.

Map developed by OpenStreetMap community. For further information see Tim Waters’ presentation.
Source: Tim Waters, OpenStreetMap Foundation

Online tools also enable new forms of cooperation to support communities affected by disasters. Volunteers can make an outstanding contribution to disaster response that is complementary to that of standard disaster management organizations, helping to undertake tasks that require large amounts of processing and/or a dedicated workforce. One example is MicroMappers, a platform developed by the Qatar Computing Research Institute (QCRI) in partnership with the Standby Task Force. Volunteers from all over the world are summoned whenever a disaster strikes and use the platform to analyze tweets and pictures, tracking evidence of damage as well as requests coming directly from affected areas.
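To give a flavor of the kind of triage such platforms support, the minimal sketch below applies a crude keyword filter to invented messages. It is purely illustrative and much simpler than the volunteer- and machine-learning-based workflows actually used by platforms like MicroMappers; the keyword lists and example messages are made up.

```python
import re

# Minimal keyword-based triage of crowd messages, in the spirit of (but far
# simpler than) real crisis-mapping workflows. Keywords and messages are invented.
DAMAGE_KEYWORDS = {"collapsed", "destroyed", "flooded", "damage", "rubble"}
REQUEST_KEYWORDS = {"need", "help", "trapped", "water", "medicine", "rescue"}

def triage(message: str) -> str:
    """Return a coarse label for a crowd-sourced message."""
    words = set(re.findall(r"[a-z']+", message.lower()))
    if words & DAMAGE_KEYWORDS:
        return "damage report"
    if words & REQUEST_KEYWORDS:
        return "request for help"
    return "other"

messages = [
    "Bridge on the main road has collapsed, cars cannot pass",
    "We need water and medicine in the northern district",
    "Beautiful sunset over the city tonight",
]
for m in messages:
    print(f"{triage(m):17s} | {m}")
```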

Damage recognition on MicroMappers
Source: MicroMappers website, retrieved February 2016.

Digital tools involving the crowd can also make disaster management more participatory, enabling non-governmental organizations, as well as communities from affected areas, to participate directly in a sector that has traditionally been a top-down activity, in which communities were merely the recipients of aid and had limited space to play a role in disaster management operations.


Big Data and Disability, Part 1


This is the first in a series of blog posts on our ongoing work exploring the applications and implications of Big Data and disability. This serves as an initial scoping of how Big Data can contribute to various research areas related to disability. This will form the basis of a White Paper (forthcoming) exploring the ability of the Big Data ecosystem to monitor and understand the state of research about persons with disabilities and their environments.


A large number of persons with disabilities are faced with barriers to actively participating in society: they are denied their rights to “be included in the general school system, to be employed, to live independently in the community, to move freely, to vote, to participate in sport and cultural activities, to enjoy social protection, to access justice, to choose medical treatment and to enter freely into legal commitments such as buying and selling property,” according to the UN Human Rights Office of the High Commissioner. These people are commonly labeled “invisible” and are continually sidelined in their communities.

In the past ten years, there has been a revolutionary and global shift in the approach taken by UN member states to ensure that persons with disabilities receive and enjoy the same rights, equality, and dignity as everyone else. The United Nations Convention on the Rights of Persons with Disabilities, which entered into force in 2008, called for the promotion, protection, and enjoyment of all human rights and freedoms by all persons with disabilities, and respect for their inherent dignity; yet persons with disabilities continue to be denied their rights to education, employment, healthcare, and accessible sanitation facilities. To effectively protect and promote this marginalized group, a knowledge base of data concerning the situations of persons with disabilities must be built; yet there is currently a lack of data and information on the intersection of persons with disabilities and their environments.

The emergence of ‘Big Data’ has shown real value and potential as a new data source for understanding the situation of persons with disabilities. We conceptualize Big Data not just as large datasets, but as a new socio-technological phenomenon resulting from the emergence and development of an ecosystem characterized by the union of three Cs: “crumbs” of data about human behaviours and beliefs; the “capacities” of digital devices and ever more powerful computing and analytics tools to collect, aggregate, and analyze those data; and a vibrant “community” of actors involved in generating, governing, and using data.

Our work on Big Data and Disability takes root in the following questions:

  1. What sorts of data exist about persons with disabilities and their environments?
  2. In what ways can that data be used (and how has it been used so far) to understand the situation of persons with disabilities?
  3. What are some key areas in which data is lacking, and what kind of interventions might close those gaps?
  4. What are the advantages of using Big Data to understand disability?

The Categories of (Big) Data on Disability

The table below summarizes different categories and types of data, provides examples, and discusses possible opportunities for using (big) data on disability. In general, digital content (category 2) is the category of data that lends itself most readily to studying disability, because information about disability status or accessibility can be more directly/explicitly linked to each record. In the case of exhaust data (category 1), data records are linked to a user profile that does not typically include information on disability status – for instance, financial transactions are associated with an account number, which says nothing about whether the account holder has a disability or not. Opportunities in this area are typically linked to accessibility, but the fact that a service is accessible or has accessible options doesn’t necessarily reveal anything about the actual users. The exception is exhaust data from services that are specifically intended for use by Persons with Disabilities (PwD), such as GPS data from access-a-ride vehicles, or Call Detail Records from TTY/TDD services. Similarly, sensing data (category 3) has few use cases because it is rare to find data in this category that reveals something about disability/accessibility status. Thus, sensing data cannot usually be disaggregated by disability status without combining it with data from other sources/categories.

Table 1. Taxonomy and Examples of (Big) Data Sources Used Across Disability Research

Category 1: Exhaust data
  • Mobile-based – Examples: Call Detail Records (CDRs); GPS (fleet tracking, bus AVL).
  • Financial transactions – Examples: electronic ID; e-licenses (e.g. insurance); transportation cards (including airplane fidelity cards); credit/debit cards; online marketplaces (Airbnb, TaskRabbit, UberRush). Opportunities: using transaction data to compare cost, availability, and use of services that offer accessible options (for example, accessible Airbnb listings).
  • Transportation – Examples: GPS (fleet tracking, bus AVL); EZ passes; e-hailing and rideshare services (Uber, etc.). Opportunities: using fleet tracking data from access-a-ride services to study how PwD move around a city and evaluate coverage of the transit network.
  • Online traces – Examples: cookies; IP addresses. Opportunities: using browser tracking to study how PwD interact with digital content (for example, with e-books).

Category 2: Digital Content
  • Social media – Examples: tweets (Twitter API); check-ins (Foursquare); Facebook content; YouTube videos. Opportunities: using social media data to represent PwD as a network of interactions (for example, using certain Twitter hashtags).
  • Crowd-sourced/online content – Examples: mapping (OpenStreetMap, Google Maps, Yelp); monitoring/reporting (uReport). Opportunities: using crowd-sourcing to map the locations of accessible businesses and public places.

Category 3: Sensing data
  • Physical – Examples: smart meters; speed/weight trackers; USGS seismometers.
  • Remote – Examples: satellite imagery (NASA TRMM, LandSat); unmanned aerial vehicles (UAVs).
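As one concrete illustration of the crowd-sourced content category (category 2) above, OpenStreetMap contributors can tag places with wheelchair accessibility information, which can be queried through the public Overpass API. The sketch below is illustrative only: the bounding box (roughly central Bogotá), amenity type, and endpoint are arbitrary choices, and coverage of the "wheelchair" tag varies widely from place to place.

```python
import requests

# Query OpenStreetMap via the public Overpass API for cafes tagged as
# wheelchair-accessible inside an illustrative bounding box.
OVERPASS_URL = "https://overpass-api.de/api/interpreter"
query = """
[out:json][timeout:25];
node["amenity"="cafe"]["wheelchair"="yes"](4.59,-74.09,4.63,-74.05);
out body;
"""

response = requests.get(OVERPASS_URL, params={"data": query}, timeout=60)
response.raise_for_status()
places = response.json().get("elements", [])

print(f"{len(places)} wheelchair-accessible cafes found in the bounding box")
for place in places[:5]:
    name = place.get("tags", {}).get("name", "(unnamed)")
    print(f"{name}: {place['lat']:.5f}, {place['lon']:.5f}")
```

The same kind of query, repeated over neighborhoods and amenity types, is one way to start mapping accessible businesses and public places from crowd-sourced data, keeping in mind that missing tags say nothing about actual accessibility.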

Four Functions of (Big) Data on Disability

Below we propose a possible taxonomy for discussing the functions of (big) data in relation to studying problems and proposing solutions to issues stemming from disability. What these different functions actually mean, in context, will depend on what sort of data we are working with.

To illustrate these functions in context, we use the example of data on the locations of businesses that are accessible to people with disabilities affecting mobility; a short sketch after the list below illustrates the predictive function with toy data.

  1. Descriptive: describing and representing the collected information — for example, representing the locations of accessible businesses in a city, or using apps or social media to collect information on the measures that businesses have taken to ensure accessibility.
  2. Predictive: making inferences based on collected information (such as forecasting) — for example, showing trends in the growth of number of accessible businesses in certain parts of the city.
  3. Prescriptive (or diagnostic): going beyond description and inference to establish and make recommendations on the basis of causal relations — for example, showing that the addition of a school or hospital increases the number of accessible businesses in a neighborhood (if such is the case).
  4. Discursive (or engagement): spurring and shaping dialogue within and between communities and with key stakeholders through communication of data — for example, using data on accessible business locations in public discourse about the needs and resources of persons with disabilities, and what steps society can take to achieve goals that have been set forth.
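As a toy illustration of the predictive function, the sketch below fits a simple linear trend to invented yearly counts of accessible businesses in one neighborhood; real forecasting would of course require actual data and a more careful model.

```python
import numpy as np

# Invented yearly counts of accessible businesses in one neighborhood.
years  = np.array([2011, 2012, 2013, 2014, 2015])
counts = np.array([12, 15, 19, 22, 27])

# Fit a simple linear trend and extrapolate one year ahead.
slope, intercept = np.polyfit(years, counts, deg=1)
forecast_2016 = slope * 2016 + intercept

print(f"Estimated growth: {slope:.1f} accessible businesses per year")
print(f"Naive forecast for 2016: {forecast_2016:.0f}")
```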

Seven Areas of Research on (Big) Data and Disability

Given the broad range of definitions and types of disability, as well as the numerous functions that traditional data and Big Data can have (as described above), many interesting research questions emerge at the intersection of (Big) Data and Disability. The uses of such data — and potential research questions — range from descriptive applications to discursive applications, depending on the topic area. Indeed, data can be used in a variety of ways to identify and study issues and propose actions and solutions to some of the challenges faced by persons with disabilities. This document pulls together some articles and thoughts on those opportunities and challenges, and is meant as a starting point for identifying promising uses of data and potentially useful datasets. In looking at the types of data that exist and the areas in which data can be used regarding disability, several key themes emerge:

  1. Voting & Representation: Beyond the obvious descriptive uses of data (how are PwD distributed geographically within the electorate, are there trends in how and where they vote, etc.), we can identify some key uses that are more discursive (does anything in the data, or lack of data, point to under-representation or disenfranchisement of PwD among voters?).
  2. Employment: Descriptive uses of data include mapping the availability and location of employment opportunities or workplaces that are accessible/open/useful to PwD; or monitoring compliance with legislation or policies on hiring practices. Predictive and prescriptive uses, which include studying the causes and consequences of trends in employment of PwD, can help evaluate existing policies and shape new ones. This area is particularly interesting for comparisons across countries.
  3. Community & Social Media: Descriptive uses of data can help study PwD as a network, in relation to their peers, thought leaders, political representatives, etc. Social media can also be used as a tool for people who are receiving medical treatment to be in contact with physicians and other patients, in order to receive proper support and follow-up care, etc.
  4. Accessibility: Descriptive uses of data include crowd-sourcing and mapping locations of public places, businesses, lodging, and transportation that are accessible to PwD. On the discursive side, data on compliance with accessibility standards can be used as a tool to promote more inclusive economies, cities, and societies.
  5. National and International Programs: Uses of data include evaluating where data does and doesn’t exist, comparing countries based on their implementation of national and global targets, and setting new global targets or creating metrics to track their implementation.
  6. Education: There are a variety of descriptive uses of data, in particular relating to the proportion of children with disabilities who are included in the education system (either in specially designed programs or integrated into other programs), the opportunities that exist for them to receive education, and the gaps and issues in the education system. Data can also be used for creating educational material, such as through innovative approaches to digitizing books, or for studying the effectiveness of various teaching tools and methods.
  7. Awareness & Advocacy: The topics covered above each present ways to raise awareness about certain causes, in particular through their discursive functions. More generally, tools (such as standardized metrics, and visualization techniques) can be used to collect and communicate data across all the fields described above.
Table 2. Examples of the Functions of (Big) Data in Various Areas of Research

  • Voting & Representation – Descriptive: using location data to map accessible polling stations. Predictive: using polling data to compare voting patterns among PwD versus other voters. Prescriptive: using location and polling data to demonstrate areas where changes are needed or impact is most effective. Discursive: using data to show that PwD may be underrepresented because of inaccessible polling stations.
  • Employment – Descriptive: using location data to map workplaces that are accessible to PwD. Predictive: using job data to find trends in employment of PwD. Prescriptive: using data to study the impact of potential legislation changes on rates of employment of PwD. Discursive: using data to show which countries meet international targets on employment opportunities for PwD.
  • Community & Social Media – Descriptive: using social media data to represent PwD as a network of interactions. Predictive: using data on hashtags and interactions to study how ideas and movements regarding disability spread on social media. Prescriptive: using social networks as a tool to keep people receiving treatment or follow-up care in contact with physicians and other patients to improve healing. Discursive: using social media as a tool to communicate data about disability and promote awareness and advocacy.
  • Accessibility – Descriptive: using crowd-sourcing to map the locations of accessible businesses and public places. Predictive: using data on participation in sharing economies to identify and measure losses caused by businesses and marketplaces that aren’t accessible. Prescriptive: using data to demonstrate the impact of increased accessibility on participation and inclusion, cost, economic gain, etc. Discursive: using data on compliance with accessibility standards as a tool to promote more inclusive economies, cities, and societies.
  • National & International Programs – Descriptive: using data to evaluate where data does and doesn’t exist on meeting national and international targets on disability. Predictive: using existing data to define and build new metrics for measuring implementation of targets on disability across countries. Prescriptive: using data to study the impact of countries’ programs and legislation for meeting international targets on disability. Discursive: using data to show which countries are making progress towards meeting their targets.
  • Education – Descriptive: using data on the inclusion of students with disabilities in the education system to map participation and progress. Predictive: using digitization and processing techniques like optical character recognition and captchas to create educational material. Prescriptive: using analytics data from educational platforms to evaluate and promote effective learning techniques. Discursive: using data to raise awareness and increase understanding of disability by the general public.
  • Awareness & Advocacy – Descriptive: using visualization tools to communicate data about challenges and opportunities. Predictive: using analytical tools to identify and communicate trends in (big) data on disability. Prescriptive: using data to show the impact of various programs, actions, legislation, etc. in the area of (big) data and disability. Discursive: using data as a tool to communicate and solve problems and build on existing opportunities in the area of (big) data and disability.

Unevenly Distributed Resilience


Data–Pop Alliance has been conducting ongoing research on Big Data, climate change and environmental resilience. With funding from the UK’s Department for International Development (DfID), we published a synthesis report evaluating the opportunities, challenges and required steps for leveraging the new ecosystem of Big Data and its potential applications and implications for climate change and disaster resilience. This report will feed into the World Humanitarian Summit to be organized in Istanbul in May 2016.

This is the fourth in a series of companion pieces that offer insights from the synthesis report. The authors of the series attempt to go “beyond the buzz” to lay out what we actually know about Big Data’s existing utility for disaster science and for building practical resilience.


In the emerging field of data-driven resilience, one of the greatest challenges is simply getting everyone on the same page.

Recent years have made it clear that data science’s potential to bring rapid, large-scale changes to the global economy is only increasing. The spread of cheap, fast computing that accompanied and followed the first tech bubble of the 1990s has since paved the way for data analysis and data-driven services to remold sectors of the economy ranging from urban transportation to light manufacturing to agriculture, not to mention defining entirely new technological frontiers in industries such as social media, search, and logistics. While most of these advances continue to be led by the private sector, an increasingly large group of specialists, researchers, and first responders have begun to apply many of the same data techniques to more broadly humanitarian concerns, trying to leverage recent advances in data science to increase the resilience of communities around the world to natural disasters.

As an Assistant Professor of Economics at the University of San Francisco who studies the costs of natural disasters, I see this as good news, and not a moment too soon. While last December’s Paris Agreement may finally signal the beginning of coordinated global action to address climate change, every indicator suggests that even under optimistic scenarios natural hazard risk for much of the world will get worse this century before it gets better. Burgeoning populations with shrinking resource bases will only become more vulnerable to shocks like droughts and cyclones in coming years, a concern highlighted by the now-overwhelming evidence that climate variability can drive conflict. Increasing rates of urbanization in the developing world have shifted populations away from the risk of weather-driven fluctuations in agricultural livelihoods only to increase populations’ exposure to urban climate risks such as deadly levels of air pollution and the risk of coastal flooding. Increasing climate variability and rising seas will in many cases only worsen underlying hazard risks. Estimates of these costs from applied economics and related fields increasingly suggest that they are massive, emphasizing the need to develop new technologies, institutions, and approaches for insulating humanity from the ravages of the natural world.

It is clear that data science can be a powerful tool in increasing society’s resilience to these threats, and a recent overview of the field that I conducted with an interdisciplinary team of researchers suggests enormous potential for such technology to make real differences in people’s lives. Researchers have successfully demonstrated techniques that would allow first responders to use call detail records (CDRs) to observe population movements in real time to assess evacuation efforts, target relief, and better understand response dynamics. Mobile data platforms and social networks such as Twitter are increasingly used to provide citizens with a pathway to engage with governments, first responders, and the scientific community, and have already led to serious improvements in response quality. Advances in data techniques in natural and social science research have meanwhile brought extraordinary advances in risk warning and monitoring systems, allowing both recovery and prevention efforts to be directed where they will do the most good.
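To give a sense of what the core of such a CDR analysis involves, the sketch below aggregates a toy, invented call-record table into an origin-destination matrix of subscribers' modal locations before and after an event. It glosses over the anonymisation, aggregation thresholds, and data-access agreements that any real use of CDRs requires, and the region names and dates are purely illustrative.

```python
import pandas as pd

# Toy CDR-like table: one row per call, with an anonymised subscriber id, a
# timestamp, and the cell tower's administrative region. All values invented.
cdr = pd.DataFrame({
    "subscriber": ["a", "a", "a", "b", "b", "c", "c", "c"],
    "timestamp": pd.to_datetime([
        "2015-04-20", "2015-04-22", "2015-04-28",
        "2015-04-21", "2015-04-29",
        "2015-04-20", "2015-04-27", "2015-04-30",
    ]),
    "region": ["Kathmandu", "Kathmandu", "Gorkha",
               "Kathmandu", "Kathmandu",
               "Lalitpur", "Kathmandu", "Kathmandu"],
})
event_date = pd.Timestamp("2015-04-25")  # e.g. an earthquake

def modal_region(group: pd.Series) -> str:
    """Most frequently observed region for a subscriber."""
    return group.mode().iloc[0]

before = cdr[cdr.timestamp < event_date].groupby("subscriber")["region"].agg(modal_region)
after  = cdr[cdr.timestamp >= event_date].groupby("subscriber")["region"].agg(modal_region)

# Origin-destination matrix: where subscribers were typically seen before vs after.
flows = (pd.concat({"origin": before, "destination": after}, axis=1)
           .dropna()
           .groupby(["origin", "destination"]).size()
           .unstack(fill_value=0))
print(flows)
```

In practice this aggregation is typically run inside the operator's systems so that only coarse, anonymised flow counts ever leave the premises.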

Post-earthquake population movement in Nepal
Source: Flowminder, 2015

At first glance many of the proposed solutions in data-led resilience sound too good to be true, and one would be wise to be skeptical of audacious claims. There is a huge difference between first prototypes and workable products, and any new approach for using data to increase resilience will only be valuable if it is actually used. This means that big data approaches to increasing resilience need to directly address the various complex problems that arise when relying on novel technologies that place high demands on human and organizational capital. New techniques are being developed that require unusually high degrees of technical proficiency, making multiple areas of expertise necessary to implement, or even interpret, data analysis. Mundane but nontrivial hurdles related to practically implementing any data-driven project – where such large data will be stored and analyzed, how the data will be collected and transferred between stakeholders, how quickly analyses can be run – can easily sink an otherwise worthwhile project. Concerns about the legal and ethical ramifications of using data technologies can make stakeholders, ranging from analysts and researchers to government officials to the public, withdraw from supporting projects or allowing data to be used. Differing regulatory frameworks only further compound the problem, exposing policy makers to difficult-to-navigate reputational and political risks when data projects are being discussed.

At the end of the day, the complex problems related to supporting, standardizing, and indeed increasing the “resilience” of these data-driven approaches themselves are the sort of large-scale coordination problems common when new technologies develop; one need only look back to the first widespread roll-out of internet technologies to find similar sets of problems. What is perhaps daunting is the unusually diffuse and public nature of disaster risk management. In the private sector, coordination problems can typically yield to the profit incentive; many of today’s largest tech firms, such as Oracle, Microsoft, and Google, can all in one way or another be viewed as having the seeds of their success in identifying opportunities to monetize standardization and coordination of information technologies so as to ease widespread and flexible use across many realms of society. Disaster risk, however, looks much more like a reinsurance market than a consumer product market, with infrequent large events dominating how firms make decisions. In this context it is unsurprising that coordination mechanisms are lacking; firms would need to be far-sighted, not driven by short-term profit, and willing to navigate the complex legal, governmental, and public relations aspects of disaster risk management and response in order to play their standard role.

While the disorganized and chaotic nature of progress in data-driven resilience can be disheartening, the good news is that there are huge social returns to be made by investing in coordination. Funding agencies, startup firms, and enthusiastic college graduates can all easily make the mistake of thinking that inventing exciting new technologies is the only way to help improve resilience, but one of the primary findings of our survey is that there are enormous opportunities to improve resilience simply by increasing coordination among actors. Centralizing and disseminating best practices, through conferences, workshops, working paper websites, newsletters, and the like can make it vastly easier for disconnected agents to quickly internalize solutions to common problems. Data philanthropy outreach, ranging from platform and infrastructure support to analysis and code development, can have enormous impacts, especially in settings with low prior technological capacity. Establishing common, open frameworks such as standard APIs for cell phone data could make coordination problems between different firms and research groups vastly less difficult to resolve. Indeed, some of the most exciting progress currently being made in data-driven resilience – from citizen-volunteer-led efforts to rapidly improve OpenStreetMap after disasters to the Humanitarian Data Exchange’s public clearinghouse of data sources – stems exactly from such efforts to coordinate actors.
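The Humanitarian Data Exchange itself illustrates how a common, open framework lowers coordination costs: it is built on the open-source CKAN data portal, so its catalogue can be searched programmatically through the standard CKAN action API. The sketch below is a minimal example; the query term and the metadata fields printed are illustrative, and the exact fields available may differ.

```python
import requests

# HDX runs on CKAN, so the standard CKAN action API can be used to search
# its catalogue of datasets. Query term and printed fields are illustrative.
HDX_SEARCH_URL = "https://data.humdata.org/api/3/action/package_search"

response = requests.get(HDX_SEARCH_URL,
                        params={"q": "nepal earthquake", "rows": 5},
                        timeout=30)
response.raise_for_status()
result = response.json()["result"]

print(f"{result['count']} datasets match the query; showing the first few:")
for dataset in result["results"]:
    print(f"- {dataset.get('title')} ({dataset.get('name')})")
```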

The Humanitarian Data Exchange’s page for the Nepal Earthquake
Source: Humanitarian Data Exchange, accessed August 2015

In sum, the field of data-driven resilience brings to mind author William Gibson’s classic observation that the future is here, it’s just not evenly distributed. Rectifying this – by focusing on distributing existing data technologies, ensuring that they work in context when needed, and helping establish best practices so that different actors need not reinvent the wheel – may not be as sexy as creating new analytic techniques or prototype technologies, but it ultimately plays as important a role in ensuring that data techniques actually increase resilience as intended. In this light, the Paris Agreement’s major achievement may ultimately be in simply establishing a common intent to coordinate among the world’s governments. Perhaps now that this coordinating intent has been signaled, the many private sector firms, government agencies, and humanitarian actors that constitute the resilience space will be able to focus on working together to build the disaster response architecture that will defend humanity from the elements in the coming century.

Works cited

Data-Pop Alliance. (2015). Big Data for Resilience: Realising the Benefits for Developing Countries (Synthesis report).

The Color of Surveillance: Government Monitoring of the African American Community, Georgetown Law Center on Privacy & Technology, 8 April 2016


This is the first in our Spotlights on Inequality series that offer an in-depth discussion of the intersection between inequality, data, and group privacy.


On 8 April 2016, the Center on Privacy and Technology (CPT) at Georgetown Law (@GeorgetownCPT) hosted the conference The Color of Surveillance, which brought together academics, activists and intelligence officials to present and discuss different aspects of the history and present-day practices of government monitoring of the African-American community.

The introductory comments by Dean William M. Treanor (Georgetown Law), Prof. Paul Butler (Georgetown Law) and Prof. Alvaro Bedoya highlighted the timeliness and urgency of the topic. Dean Treanor pointed out that the events in Ferguson and the subsequent #BlackLivesMatter movement had rekindled the discussion on criminal justice, civil rights and surveillance in the United States.

Prof. Butler, for his part, explained that, as a former prosecutor, he understood the need for the police to make use of the latest technologies but that, as an African-American man, he was simultaneously part of a community whose members often continue to be treated as enemies of the state, despite the fact that they are usually immensely patriotic. Prof. Butler pointed out that the promise and peril of the present moment was encapsulated by the idea of “one in one million”: that there was one African-American president, but for the foreseeable future also around one million African-Americans in prison; more than there were slaves in 1815. He cautioned that, in discussions surrounding surveillance, it was important to pay attention to the racial gaze and what it sees and does not see, because, ultimately, what you see also depends on where you look.

Finally, Prof. Bedoya emphasized that the pattern of surveillance of African-Americans in the United States in the name of national security continues to this day, and that it was important to combine the debate on national security surveillance with that on the policing and surveillance of African-American communities. After all, the pervasive policing of African-Americans is also enabled by new surveillance technologies. At the same time, one should not only talk about African-Americans as objects of surveillance but also about how new technologies could be used to level the playing field. Prof. Bedoya concluded that the right to privacy remains as important as ever, because it is also a lack of privacy that contributes to the inequalities we are experiencing as a society today.

The rest of this blogpost will emphasize some of the highlights of the conference, with a particular focus on history, inequality and technology.

History

Among the history lessons, Prof. Simone Browne (University of Texas at Austin) presented on some of her research on the various practices of surveillance in the slave era, focusing in particular on the examples of fugitive slave notices, lantern laws and branding. Prof. Browne pointed out that surveillance is ongoing and indeed sustained by antiblackness, and that we need to understand conditions of Blackness in order to meaningfully understand practices of surveillance in the present day.

Prof. Browne’s presentation was followed by that of Prof. David Levering Lewis (New York University) who continued the discussion of the use and abuse of government surveillance of African Americans, with a particular focus on the First World War and immediate aftermath of the Second World War.

One of the highlights of the conference was the panel on “Martin Luther King & the FBI” which included Prof. David Garrow (University of Pittsburgh School of Law), James A. Baker (Federal Bureau of Investigation) and Prof. Alvaro Bedoya (Georgetown Law). Prof. Garrow pointed out that the FBI long had an organizational culture of surveillance and political control, relying in particular on human intelligence (HUMINT). Indeed, according to Prof. Garrow it is impossible to understand the history of what the FBI did to the civil rights movement without also understanding the history of HUMINT. James A. Baker, General Counsel of the FBI, on the other hand, emphasized that the FBI critically engaged with its history in order to make better decisions in the present day. He explained that the protections that the FBI had in place in order to prevent abuses of power today are much more substantial than they were in the past, and that the FBI is critically aware of the importance of protecting privacy and civil liberties while at the same time protecting the country from both foreign and domestic threats.

Finally, Prof. Sahar Aziz (Texas A&M University School of Law) described some more contemporary surveillance practices in the aftermath of 9/11, with a particular emphasis on continuity in the politics of counterterrorism. After all, the same counterterrorism infrastructure that was used to monitor African-Americans in the so-called war on drugs is now used to monitor Muslims in the so-called war on terror.

Inequality

One of the central lessons of the conference was that not all surveillance is equal. For instance, in the panel discussion on “Surveillance of Low-Income Communities” Prof. Khiara Bridges (Boston University) described her research on the experiences of African-American women from low-income backgrounds wanting to become mothers, who are constantly surveilled by various state bureaucracies as soon as they enroll in Medicaid. As Prof. Bridges points out, wealthier women are clearly not exposed to the same level of state surveillance. Prof. Bridges was joined by Hamid Khan (Stop LAPD Spying Coalition) who made the strong point that the American people need to stop trying to police their way out of social and political problems. After all, the pervasive criminalization of certain communities also leads to a pervasive dehumanization of the members of those communities.

Similarly, in her presentation on “Stop & Frisk: Surveillance on the Street” Prof. Kristin Henning (Georgetown Law) described how African-American youth are disproportionately targeted by the police on the street and in the school system, thus also drawing attention to how regular African-Americans are constantly monitored, watched, regulated and marginalized in everyday life.

One of the most fascinating presentations of the conference was on Prof. Osagie K. Obasogie’s (UC Hastings College of the Law) research on how the blind perceive race. Perhaps surprisingly, the blind also overwhelmingly describe race in visual terms, which is indicative of the pervasiveness and depth of the social construction of race in American society.

Technologies

Finally, some of the surveillance technologies that were discussed at the conference include stingrays (Freddy Martinez, Lucy Parsons Labs), predictive policing (Andrew Ferguson, University of the District of Columbia David A. Clarke School of Law; Kristi Lane Scott, U.S. Department of Justice; Sonja Starr, Michigan Law; Chinyeré Tutashinda, Center for Media Justice; and Arjun S. Sethi, Georgetown Law); police body cameras (Harlan Yu, Upturn); and encryption (Matthew Mitchell, Cryptoharlem). The conference participants emphasized that transparency was not enough; what was needed, rather, was a clear legal framework to ensure that this new technological infrastructure enhances, rather than detracts from, civil rights.

But data can also be used to shift power dynamics: this was the topic of Brandon Anderson (The Swat App) and Prof. Anthony Cook’s (Georgetown Law) tech/teach-in on The Swat App, which allows ordinary citizens to describe their experiences with the police without having to fear retaliation. The Swat App provides statistics and metrics of how the police are performing in different communities, thus effectively using crowdsourced data to enhance accountability in the American justice system.

To conclude, as Brandi Collins (ColorofChange.org) also pointed out in her presentation of “Surveillance of Black Lives Matter,” the story of surveillance is not new but neither is the story of resistance. In posing the question of “what happens to surveillance when we question the conditions of blackness” (Simone Browne), The Color of Surveillance conference at Georgetown Law’s Center on Privacy & Technology made an important contribution to sustaining a public narrative on the history and present-day practices of government surveillance in the United States, and what ordinary citizens can do to challenge and resist it.

For a full list of conference panels and speakers, see:

https://www.eventbrite.com/e/the-color-of-surveillance-government-monitoring-of-the-black-community-tickets-21461318381

Conference videos are available on CSPAN here: http://www.c-span.org/video/?407901-1/discussion-racial-bias-government

http://www.c-span.org/video/?407901-102/government-surveillance-race-part-3

http://www.c-span.org/video/?407901-103/government-surveillance-race-part-4

Ciudad Laboratorio – Bogotá


One of Data-Pop’s Research Assistants, Rodrigo Lara Molina, reports on the kick-off Ciudad Laboratorio-Bogotá event, held at Plaza de los Artesanos in Bogotá. The event was organized by the Bogotá Chamber of Commerce, the Economic Development Directorate of Bogotá, Tadeo University, Tell (a Colombian communications company), CESA University, and the Government of Cundinamarca. In total 400 individuals participated in the event over two sessions: a conference session in the morning and a workshop in the afternoon. A version of this post is available in Spanish here.


The main objective of this meeting was to set up “Ciudad Laboratorio,” an initiative that leads human-centered public innovation in a participatory and socially-inclusive way.

Presenters in the morning included Gabriella Gomez-Mont from “Laboratorio para la Ciudad” (Mexico) and Hector Vergara from “Laboratorio de Gobierno” (Chile), representing similar foreign initiatives. The Colombian presenters were Andrés Clavijo, our Data-Pop representative, and Rafael Villa from Policy-Lab.

Gabriella emphasized that a City Lab is transformative for both municipal governments and their citizens. She outlined that it could serve as a means for a city to look at itself, to detect its own problems, and to experiment with solutions adapted to the city itself. This was the case with “Mapatón CDMX”, a remarkable low-cost digital project that mapped Mexico City’s franchised public transport system, one of the biggest in the world, in a two-week period through citizen participation.

She also underlined the importance of the City Lab as a place of dialogue between authorities and different societal groups. As an example, she mentioned “Debate Digital CDMX: Tecnología Y Movilidad”, a debate organized by “Laboratorio para la Ciudad” to deal with the conflict between cab drivers and transport services offered by digital apps. As a result, they generated a proposal to redesign mobility and transportation by making cab fares more competitive, regulating the new services, and generating an open mobility database that would lay the groundwork for new transport solutions and catalyze innovation in these areas.

Andrés argued that Big Data can empower inhabitants to solve urban problems from a human-centered point of view. He said that a City Lab has to be composed of a physical, a virtual, and a human dimension, and that it must allow experimentation, errors, and innovation for the city and by the city.

He proposed four key ideas for building a City Laboratory: it should promote data access (Open Data); it should push for information quality standards and good metadata; it should support data literacy; and finally it should build up local and global ecosystems of Big Data to confront city challenges.

Hector, the Chilean representative, declared that “Laboratorio de Gobierno” was created to reconfigure interactions between state and citizens through cooperation between all city agents, by the mechanism of co-creation. It aims to create a space where individuals can actively participate in public services. He also stated that “Laboratorio de Gobierno” is an initiative by the people and for the people.

He mentioned four different projects: the design of a new electricity bill to build trust and educate about energy consumption; optimizing waiting times in Santiago’s public transport; redesigning a public health appointment scheduling system; and improving communication and information during natural disasters (very frequent in Chile).

The last participant, Rafael Villa, highlighted the fact that a City Lab is everybody’s responsibility, and that this is what makes it valuable. He said that laboratories should aim to be efficient, transparent, and inclusive; and he reiterated that it’s a place where both governments and community members can cooperate to achieve a human-centered vision.

He claimed Laboratories are inspired by a new collaborative governance approach, one that creates public value and engenders local empowerment. In his experience laboratories have to deal with uncertainty because the solutions do not come from the Lab but from the people. He reiterated that Labs have to allow errors, especially cheap and fast errors.

Rafael animated the afternoon workshop with a structured series of questions: Who are we? Why are we working together? How are we going to work and organize? What do we do now? Different attendees and organizers took on institutional roles as they defined the next steps of “Ciudad Laboratorio Bogotá”, all in a lively and democratic process.

Data-Pop will try to lead this initiative from the data perspective and will provide technical support for all data-related projects. For Data-Pop it is a duty to endorse a project of this kind, which is fully aligned with our fundamental principles of inclusion, innovation, public value, and human-centered design.

Ciudad Laboratorio – Bogotá


Data-Pop participó, como organizador, en el evento de lanzamiento de “Ciudad Laboratorio-Bogotá” en la Plaza de los Artesanos en Bogotá. El evento fue organizado conjuntamente por la Cámara de Comercio de Bogotá, la Secretaría de Desarrollo Económico de Bogotá, la Universidad Tadeo, Tell (empresa colombiana de comunicaciones), Universidad CESA y la gobernación de Cundinamarca. Más de 400 personas participaron en el evento, que se llevó a cabo en dos sesiones: conferencias por la mañana y un taller para desarrollar la iniciativa por la tarde. La versión en inglés está disponible aquí.


El objetivo central de esta reunión era establecer la Ciudad Laboratorio, una iniciativa que pretende estar a la vanguardia en la innovación pública centrada en el hombre desarrollada de forma participativa e inclusiva.

Los exponentes de la mañana encargados de presentar experiencias internacionales similares fueron Gabriella Gómez-Mont de “Laboratorio para la Ciudad” (México) y Héctor Vergara de “Laboratorio de Gobierno” (Chile). Como exponentes nacionales fueron invitados Andrés Clavijo, nuestro representante de Data-pop, y Rafael Villa de Policy-Lab.

Gabriella destacó que un Laboratorio de Ciudad es transformador tanto para las estructuras de gobierno municipales como para sus ciudadanos. Ella subrayó que un laboratorio de este tipo podía servir como medio para que la ciudad se mire a sí misma, para que detecte sus problemas y se experimente con diversas soluciones, resolviendo sus problemas de manera autentica y adaptada a la realidad de la ciudad misma. A título de ejemplo, mostró el proyecto “Mapatón CDMX”, una iniciativa de bajo costo capaz de mapear uno de los más grandes sistemas concesionados de transporte público, el de Ciudad de México, en tan solo de dos semanas, a través de la participación ciudadana.

Ella también recalcó la importancia de que tiene un Laboratorio de Ciudad como plataforma de dialogo entre las autoridades y diversos grupos sociales. Como muestra de ello, citó el proyecto “Debate Digital CDMX: Tecnología Y Movilidad”, un debate organizado por “Laboratorio para la Ciudad” para intentar lidiar con el conflicto entre los sindicatos de taxistas y los servicios de transporte ofrecidos a través de aplicaciones de Smartphone. Los resultados de dicha instancia de diálogo, fueron propuestas para rediseñar la movilidad y el transporte a través de diferentes medidas: reducir las tarifas de taxi, regular los nuevos servicios y generando una base de datos abierta de movilidad, que servirá de base para crear nuevas soluciones de transporte y para catalizar la innovación en estas áreas.

Andrés enfatizó sobre las posibilidades que presenta el Big Data para empoderar a los habitantes y convertirlos en agentes de resolución de los problemas de la ciudad, desde una perspectiva centrada en el hombre. Para él, un Laboratorio de Ciudad debe estar compuesto por tres dimensiones: la física, la virtual y la humana. Además, explicó un laboratorio de esta índole debe permitir la experimentación, el error y la innovación para la ciudad y desde la ciudad.

Nuestro representante propuso cuatro ideas directrices para el Laboratorio de Ciudad: él debería fomentar el acceso a datos (Datos Abiertos); él debería promover y abogar por estándares de calidad de la información y de metadatos; él debería apoyar la alfabetización de datos; y finalmente, él debería fortalecer la creación de ecosistemas locales y globales de Big Data para afrontar los desafíos urbanos.

Héctor, el representante de la experiencia chilena, manifestó que el “Laboratorio de Gobierno” fue creado para reconfigurar las interacciones entre el estado y la ciudadanía mediante la cooperación entre todos los actores de la ciudad, a través del mecanismo de la co-creación. El laboratorio busca crear un espacio donde los individuos puedan participar activamente de los servicios públicos. Héctor también declaró que el “Laboratorio de Gobierno” es una iniciativa desde y para las personas.

El señor Vergara mencionó cuatro proyectos de su laboratorio: un nuevo diseño de la cuenta de electricidad que acreciente la confianza en el proveedor del servicio y que incentive el ahorro energético; la optimización del tiempo de espera del transporte público de Santiago (capital de Chile); la reestructuración del sistema de programación de citas de la salud pública; y el mejoramiento del acceso a la comunicación y a la información en caso de desastres naturales, que son muy frecuentes en ese país.

El último participante, Rafael Villa, resaltó el hecho de que un Laboratorio de Ciudad es responsabilidad de todos y que es esto lo que lo hace valioso. Él dijo que estos laboratorios deben aspirar a ser eficientes, transparentes e inclusivos. Rafael reiteró que es un espacio en donde tanto las autoridades como las diferentes comunidades pueden cooperar y así conseguir una verdadera perspectiva centrada en el hombre.

Este último expositor afirmó que los Laboratorios de Ciudad están inspirados en un nuevo enfoque colaborativo de gobierno, y que este enfoque genera valor público e impulsa en empoderamiento local. Desde su experiencia, los laboratorios deben saber afrontar la incertidumbre, pues las soluciones no provienen del laboratorio, sino de la ciudadanía. Adicionalmente, reiteró que los laboratorios deben permitir los errores, pero especialmente los errores de bajo costo y de corto plazo.

Rafael animó y estructuró el taller de la tarde mediante una serie de preguntas: ¿Quiénes somos?, ¿por qué trabajamos juntos?, ¿cómo trabajaremos y nos organizaremos?, ¿cuál es nuestro próximo paso? Diferentes asistentes al taller, como representantes de sus respectivas organizaciones asumieron roles específicos y al mismo tiempo definieron los pasos a seguir de “Ciudad Laboratorio Bogotá”, todo en una ambiente dinámico y democrático.

Data-Pop intentará liderar esta iniciativa desde la perspectiva de los datos y proveerá soporte técnico en los proyectos con un fuerte componente de datos. Para Data-Pop es un deber otorgar respaldo a un proyecto de este tipo, que está totalmente alineado con nuestros principios fundamentales de inclusión, innovación, valor público y diseño centrado en el hombre.

Promoting a “people-centered” data revolution at the World Humanitarian Summit—and beyond


Data-Pop Alliance

Data-Pop Alliance will co-host two side events during the first World Humanitarian Summit (WHS) held in Istanbul, Turkey this week: a remote side event with UNOCHA in New York on May 23rd, and an official WHS side event in Istanbul with UNDP, UNICEF and WFP, on May 24th.

Both events will deal with the applications and implications of data and Big Data across various core themes of WHS and our work: humanitarian effectiveness; reducing vulnerability and managing risk; transformation through innovation; and serving the needs of people in conflict.

Details are as follows:

  • Remote Side Event in New York on Humanitarian Data:
    Monday, May 23, 2016 from 6-8pm, ThoughtWorks NYC, 99 Madison Ave, 15th Floor, New York, NY 10016
    On Monday, May 23, we will be hosting a remote side event together with UN OCHA’s Humanitarian Data Exchange (HDX) at our New York headquarters. Join us at ThoughtWorks NYC for a discussion on Big Data and HDX, the launch of a map explorer, and the screening of a new film produced by HDX. Drinks and food will be provided. Space is limited; please register.
     
  • Official WHS Side Event in Istanbul on Real-Time Information Systems and New Data:
    Tuesday, May 24, 2016 from 9-10:30am, Rumeli Hall 7, Lütfi Kırdar Convention & Exhibition Center (LKCC), Istanbul, Turkey

    Data-Pop Alliance and its four core members (the Harvard Humanitarian Initiative (HHI), MIT Media Lab, the Overseas Development Institute, and the Flowminder Foundation) are partnering with UNDP, UNICEF and WFP to co-host an official WHS side event focused on the potential and limitations of real-time information and new and emerging data to improve decision-making, as well as areas for future investment with a focus on strengthening local capacities. The side event will include rapid lightning talks that canvass emerging (Big) data-driven applications in the humanitarian and development spaces, as well as an expert panel discussing new partnerships to leverage these opportunities for both applied collaborative research and practical solutions.
     
    Panelists and speakers for the official side event include:
    1. Milica Begovic Radojevic, Innovation Team Lead Eastern Europe and Central Asia, UNDP
    2. Linus Bengtsson, Executive Director, Flowminder Foundation and Executive Committee, Data-Pop Alliance
    3. Stuart Campo, Innovation Deployment Specialist, UNICEF Global Innovation Centre
    4. Martin Bille Hermann, State Secretary for Development Policy, Ministry for Foreign Affairs, Denmark
    5. Nick Imboden, Head of Humanitarian Programme Cycle, UN OCHA
    6. Gizem Kececi, Head of Corporate Affairs, Vodafone Turkey
    7. Bernhard Kowatsch, Head of Innovation Accelerator, World Food Programme
    8. Allen Bailochan Tuladhar, Microsoft Regional Director, Microsoft Innovation Center Nepal
    9. Patrick Vinck, Assistant Professor, Harvard University, Co-Director and Co-Founder, Data-Pop Alliance, who will introduce our Synthesis Report on “Big Data for Climate Change and Resilience: Realising the Benefits for Developing Countries” published as a background paper to WHS.

In addition to Patrick Vinck and Linus Bengtsson, our Research Manager David Sangokoya will be attending the Summit; come find them to learn more about our past and ongoing work!

Our related work

In addition to these events, below is a quick snapshot of our current and past projects related to the core themes of WHS:

Humanitarian effectiveness

  • Big Data and Conflict Prevention: Our Peacebuilding and Violence program focuses on the possibilities and impact of Big Data for peacebuilding, conflict prevention and public safety. It builds on our ongoing work with HHI and the 2013 paper on Big Data and Conflict Prevention co-authored by Emmanuel Letouzé and Patrick Vinck (with Patrick Meier), which explored Big Data’s applications in early warning, real-time awareness and real-time feedback in relation to government- and community-led conflict prevention and public safety initiatives.
     
  • Flowminder’s pioneering work with mobile network data: A core member of Data-Pop Alliance leading our applied research work, the Flowminder Foundation, a non-profit organization registered in Sweden and founded by academic researchers, pioneered the use of anonymised mobile network data, beginning in 2008, for public health applications such as infectious disease response (malaria, cholera, dengue) and for disaster response (Haiti 2010, Nepal 2015). Flowminder is closely integrated with WorldPop, a leading open data repository for geospatial demographic data used by major development agencies and governments worldwide.
     
  • Data and Migration Program: In collaboration with the Overseas Development Institute (ODI), the International Organization for Migration (IOM), Flowminder, the UN Research Institute for Social Development (UNRISD) and UNFPA, we are developing a long-term research, training and engagement program on Data and Migration. This joint program will focus on filling data and information gaps, providing greater access to accurate data on global migration flows and building an evidence base for better policies.

Reducing vulnerability and managing risk

  • DfID-funded Synthesis Report on Big Data for Climate Change and Resilience in Developing Countries: Data-Pop Alliance has been conducting ongoing research and trainings on Big Data, climate change vulnerability and risk, and environmental resilience. Our Climate Change and Resilience program is evaluating ways in which Big Data can be used not only to predict and prevent climate-related disasters, but also to mitigate their deleterious effects and, beyond that, to increase communities’ resilience to their impacts. As part of this ongoing research, we published a Synthesis Report in September titled “Big Data for Climate Change and Resilience: Realising the Benefits for Developing Countries” with funding from the UK’s Department for International Development (DfID), ESRC and NERC. The paper evaluates the opportunities, challenges, and required steps for leveraging the new ecosystem of Big Data with regard to climate change and for building resilience, reducing vulnerability, and managing risks, so that communities and countries, as complex human ecosystems, not only “bounce back” but also learn to adapt and maintain equilibrium in the face of natural hazards. This paper serves as a background paper for the World Humanitarian Summit.
     
  • COP21 Side Event: Additionally, in 2015 we co-hosted an event during COP21 with the Agence française de développement (AFD), in partnership with DfID, to discuss climate change and resilience in the age of data. The event drew more than 100 researchers, practitioners, and participants to explore how new data sources and analytical approaches can be leveraged to achieve these goals, chief among them empowering at-risk communities. It also included a hands-on session where participants worked with cell-phone data and geospatial data.

Transformation through innovation

  • Resilience research and training: Our Research Affiliates Beth Tellman and Bessie Schwarz have been conducting ongoing research and trainings on using Google Earth Engine to map flooding vulnerabilities in coastal communities through their Cloud to Street project. Last year, they held a workshop at the United Nations Headquarters in New York, “United Nations Sustainable Development Training: Modeling Flooding Vulnerability”, which launched as a Learning Session for the High-Level Political Forum on Sustainable Development. This training addressed the use of data in modeling, mapping, and monitoring vulnerability to climate change and included case studies on Laos, Tanzania, Australia, South Africa, the West African coastline, and New York. These trainings and models were also presented during COP21 in Paris.
     
  • Data literacy program: Building on this work and our White Paper “Beyond Data Literacy: Reinventing Community Engagement and Empowerment in the Age of Data”, Data-Pop Alliance is developing a large-scale professional and community capacity-building and engagement program on Big Data and development dedicated to official statisticians, public and elected officials, community organizers and representatives, journalists, librarians, teachers and more, with workshops planned in Senegal, Rwanda, Kenya, South Africa, Colombia, Brazil, and Thailand in 2016-18.
     
  • The Global Partnership for Sustainable Development Data: Data-Pop Alliance has been a key partner in the formation of the Global Partnership for Sustainable Development Data (GPSDD), which aims to support data-driven decision-making by making more new, open and usable data available to help end extreme poverty, combat climate change and ensure healthy lives for all.

Serving the needs of people in conflict

  • Data-Pop Alliance was also a key strategic partner in PeaceHack Barcelona (#peacehackBCN) in September 2015, hosted by BuildUp in Barcelona, Spain. PeaceHack, an initiative by International Alert, convenes technologists, designers, developers, and peace practitioners to formulate ideas and solutions for identifying and stopping violent conflict in order to build peace.
     
  • Call detail records, urban dynamics and public safety: Our work in our Peacebuilding and Violence program has deepened its focus on serving conflict-affected populations living in fragile cities in Latin America. Since 2015 we have been working on collaborative research pilots with Telefónica and Fondazione Bruno Kessler (FBK), with support from the World Bank and several of our research affiliates. Using call detail records (CDRs), we analyzed the possibilities of applying data science techniques to questions of crime and urban safety (a rough sketch of this kind of CDR aggregation appears after this list). These pilots will be released in June.
     
  • Text-mining, gender discrimination and accountability: In 2014, we began a project in partnership with the International Center for Advocates Against Discrimination (ICAAD) and other data science and gender-specialist organizations on using semantic analysis to increase the efficacy of the United Nations Universal Periodic Review (UPR), a mechanism designed to monitor human rights violations in UN member states and issue recommendations for improvement. In a pilot project in Fiji, our consortium used automated information extraction to monitor the implementation of those recommendations and to promote accountability on human rights indicators, particularly those addressing discrimination against women and girls.
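
As a rough, hypothetical illustration of the CDR aggregation mentioned in the call detail records item above (this is not the pilot code, and the input format is assumed), anonymised records can be reduced to hourly activity counts per cell-tower area before being related to crime and safety statistics:

    # Rough sketch (assumed CSV format, not the pilot code): aggregate anonymised
    # CDRs into hourly activity counts per cell-tower area.
    import csv
    from collections import Counter
    from datetime import datetime

    def hourly_activity(cdr_csv_path):
        """Assumes a CSV with columns 'tower_id' and 'timestamp' (ISO 8601).
        Returns counts keyed by (tower_id, hour_of_day)."""
        counts = Counter()
        with open(cdr_csv_path, newline="") as f:
            for row in csv.DictReader(f):
                hour = datetime.fromisoformat(row["timestamp"]).hour
                counts[(row["tower_id"], hour)] += 1
        return counts

    # Usage with a hypothetical file: hourly_activity("cdrs_anonymised.csv")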

Exploring the Applications and Implications of Social Media and Demographic Research


The Social Media and Demographic Research Workshop, co-organized by the IUSSP Scientific Panel on Big Data and Population Processes as a side-event at the 10th International AAAI Conference on Web and Social Media (ICWSM-16), took place in Cologne, Germany, on May 17, 2016. The goal of the workshop was to bring together population researchers and data scientists to discuss the applications and implications of social media for demographic research, and foster communication between these two communities.

The day was divided into three sessions: "Population Estimates", "Migrations", and "Demography and the Web: New Directions" [full program]. Each session brought together a range of presenters – demographers, sociologists, computer scientists – to share their work.

A number of presentations offered promising approaches for tracing migration patterns and measuring flows of migrants between countries while taking advantage of what social media can offer (real-time coverage, availability in data-poor environments, insight into opinions and beliefs, etc.). They also proposed techniques for matching digital data with census and survey data, and for determining the age, gender, and income breakdown of users of a given website, set of apps, or social network, which in turn might be used to infer the demographic characteristics of geographic areas.

Figure 1: Methodology overview. A mobility dataset can be built in the following steps: (1) Public user profiles of a photosharing service are crawled and photo metadata are extracted into a database (Data Collection). (2) Corresponding photos are labeled (with labels for ethnicity, gender, etc.) by crowd workers in an online labor marketplace (User Labeling). (3) The dataset is further enhanced with auxiliary data, e.g., with the information that a certain location is close to a restaurant (Adding Auxiliary Information). (4) The dataset can then be used to analyze attributes on various demographic levels or train and test classifiers for individual inferences.
Source: Riederer, C. J., Zimmeck, S., Phanord, C., Chaintreau, A., & Bellovin, S. M. (2015). “I don’t have a photograph, but you can have my footprints.”: Revealing the Demographics of Location Data (pp. 185–195). ACM Press. http://doi.org/10.1145/2817946.2817968

For example, Chris Riederer's work on demographic estimates based on Instagram profiles [see diagram], presented by Augustin Chaintreau, shows that the photos, text, and geotags from those profiles can be used to estimate age, race, and gender for census tracts in and around Manhattan. The authors suggest using this type of data to study segregation in daily interactions (i.e. across all locations visited), rather than relying only on home ZIP code as is usually the case.
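
As a toy illustration of that last step (not the authors' code, and with hypothetical inputs), per-user inferred attributes can be combined with geotagged activity to estimate the composition of each area:

    # Toy sketch (hypothetical data, not the study's pipeline): combine per-user
    # inferred attributes with geotagged check-ins to estimate area composition.
    from collections import Counter, defaultdict

    def area_composition(checkins, user_attr):
        """checkins: iterable of (user_id, area_id) pairs, e.g. geotags mapped to
        census tracts; user_attr: dict mapping user_id to an inferred attribute."""
        per_area = defaultdict(Counter)
        for user_id, area_id in checkins:
            if user_id in user_attr:
                per_area[area_id][user_attr[user_id]] += 1
        shares = {}
        for area, counts in per_area.items():
            total = sum(counts.values())
            shares[area] = {attr: n / total for attr, n in counts.items()}
        return shares

    # Hypothetical example: three users, two tracts.
    checkins = [("u1", "tract_A"), ("u2", "tract_A"), ("u3", "tract_B")]
    user_attr = {"u1": "18-29", "u2": "30-49", "u3": "18-29"}
    print(area_composition(checkins, user_attr))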

Perhaps even more interesting than what these new data sources can do, however, is the discussion of where they might fail. Several of the presentations took a deeper look at how these models can create or reinforce bias. In the case of the Instagram study, for instance, although the model can determine gender with fairly high accuracy and confidence, the model is much more problematic for race: not only is it less accurate, but it also tends to overstate the confidence in its predictions. This bias can come from a variety of sources: people in a majority group may be more likely to post photos that identify them as part of the majority; human observers may accidentally (or intentionally) provide biased interpretations when coding training data for the models; machine learning algorithms may "learn" from bias in the data they are trained on, and reinforce that bias through feedback loops; etc. This demonstrates that seemingly "neutral" algorithms may in fact systematically under-represent specific groups, even when accounting for other forms of bias (such as over- or under-representation of certain groups in Instagram's user base).
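
One simple way to surface this kind of overconfidence, sketched below with hypothetical field names rather than the study's actual code, is to compare each group's accuracy with the model's average stated confidence on that group:

    # Minimal sketch (hypothetical records, not the study's code): per-group
    # accuracy versus average predicted confidence; a positive gap means the
    # model is overconfident for that group.
    from collections import defaultdict

    def calibration_by_group(records):
        """records: iterable of dicts with keys 'group', 'true', 'pred',
        'confidence' (confidence in [0, 1])."""
        stats = defaultdict(lambda: {"n": 0, "correct": 0, "conf": 0.0})
        for r in records:
            s = stats[r["group"]]
            s["n"] += 1
            s["correct"] += int(r["pred"] == r["true"])
            s["conf"] += r["confidence"]
        return {
            group: {
                "accuracy": s["correct"] / s["n"],
                "mean_confidence": s["conf"] / s["n"],
                "overconfidence": s["conf"] / s["n"] - s["correct"] / s["n"],
            }
            for group, s in stats.items()
        }

    # Hypothetical example with two groups.
    records = [
        {"group": "A", "true": "x", "pred": "x", "confidence": 0.9},
        {"group": "B", "true": "y", "pred": "x", "confidence": 0.8},
    ]
    print(calibration_by_group(records))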

For Data-Pop Alliance, this workshop was an opportunity to present some of our work on sample bias correction [slides] using cell-phone data from Orange's D4D Challenge and Census data from Senegal's Agence Nationale de la Statistique et de la Démographie, which will be presented later this month at the World Bank's Annual Bank Conference on Development Economics 2016: Data and Development Economics (ABCDE).
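
The method itself is described in the slides and the ABCDE paper; purely to illustrate the general idea behind correcting a non-representative sample against census margins, a minimal post-stratification sketch (with made-up strata and counts) could look like this:

    # Illustrative post-stratification sketch (made-up numbers, not the paper's
    # method): reweight a biased sample (e.g. mobile-phone users) so its
    # composition matches known census population shares.
    def poststratification_weights(sample_counts, census_counts):
        """Both arguments map a stratum, e.g. (region, age_band), to a count.
        Returns a weight per stratum: census share divided by sample share."""
        n_sample = sum(sample_counts.values())
        n_census = sum(census_counts.values())
        weights = {}
        for stratum, c_census in census_counts.items():
            c_sample = sample_counts.get(stratum, 0)
            if c_sample == 0:
                # Stratum unobserved in the sample: weighting alone cannot fix it.
                continue
            weights[stratum] = (c_census / n_census) / (c_sample / n_sample)
        return weights

    # Hypothetical example: young subscribers over-represented in the sample.
    sample = {("Dakar", "15-29"): 700, ("Dakar", "30+"): 300}
    census = {("Dakar", "15-29"): 450, ("Dakar", "30+"): 550}
    print(poststratification_weights(sample, census))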

The day ended with a very engaging roundtable discussion on important questions in the area of social media for demographic research, with particular focus on which real-world problems this emerging field can hope to address, and what IUSSP and other professional organizations can do to help advance these research goals. Participants mentioned nowcasting as an area of great potential, especially for deepening our understanding of migration. They also expressed a need for better materials and tools – such as a primer on "What is Social Media for Demographic Research?" – to help focus the discussion between computer scientists and demographers on how the field conceives of itself, what important research questions it hopes to explore, and which approaches are most appropriate for controlling bias, understanding/communicating uncertainty, and evaluating/comparing models.


Silke Roth on Crowdsourcing, Digital Inequalities, and Combining Cultural Knowledge with Big Data


Data-Pop Alliance has been conducting ongoing research on Big Data, climate change and environmental resilience. With funding from the UK’s Department for International Development (DfID), we published a synthesis report evaluating the opportunities, challenges and required steps for leveraging the new ecosystem of Big Data and its potential applications and implications for climate change and disaster resilience. The report fed into the World Humanitarian Summit organized in Istanbul in May 2016.

This is the fourth podcast in a series of companion pieces that offer insights from the synthesis report.


This companion podcast to the synthesis report “Big Data for Climate Change and Disaster Resilience: Realising the Benefits for Developing Countries,” funded by the UK’s Department for International Development (DfID), was designed to probe experts to speak more about their work and their ideas about the potential and challenges of Big Data.

Titled “Silke Roth on Crowdsourcing, Digital Inequalities, and Combining Cultural Knowledge with Big Data,” this podcast features Silke Roth. Silke reveals how she came to be involved in the field of Big Data for climate resilience, especially through her interest in digital inequality and the role it plays in crowdsourced disaster resilience. She draws on case studies to highlight the importance of recognizing the limitations of crowdsourcing for disaster resilience, and how it might produce “invisible” populations.

Silke Roth (@SilkeRoth)

    Silke Roth is Associate Professor of Sociology at the University of Southampton (UK). She is interested in engagement for social justice and overcoming multiple forms of social inequality. Her areas of research include the participation in and the impact of voluntary organizations, social movements and non-governmental organizations. Together with Markus Luczak-Roesch, she is studying digital inequality.

For more on our series:
Listen to the full podcast
Read the summary of the DfID videos and podcasts
Listen to the full playlist of podcasts

Serge Guillas on Tsunami Hazard Systems in India, Local Action, & Uncertainties


Data-Pop Alliance has been conducting ongoing research on Big Data, climate change and environmental resilience. With funding from the UK’s Department for International Development (DfID), we published a synthesis report evaluating the opportunities, challenges and required steps for leveraging the new ecosystem of Big Data and its potential applications and implications for climate change and disaster resilience. The report fed into the World Humanitarian Summit organized in Istanbul in May 2016.

This is the seventh podcast in a series of companion pieces that offer insights from the synthesis report.


This companion podcast to the synthesis report “Big Data for Climate Change and Disaster Resilience: Realising the Benefits for Developing Countries,” funded by the UK’s Department for International Development (DfID), was designed to probe experts to speak more about their work and their ideas about the potential and challenges of Big Data.

Titled “Serge Guillas on Tsunami Hazard Systems in India, Local Action, & Uncertainties,” this podcast features expert Serge Guillas. Serge identifies both academia and governments as the actors needed to advance the potential of Big Data for climate change resilience in India. He emphasizes the importance of demonstrating that the quantified results of Big Data studies for climate change are of interest to governments and can be employed for decision-making in hazard planning, risk assessment, and early warning systems.

Serge Guillas

    Dr. Serge Guillas is an Associate Professor in the Department of Statistical Science, University College London. He works on environmental statistics and uncertainty quantification of complex computer models. He is currently the vice-chair of the SIAM Activity Group on Uncertainty Quantification. He has long been involved in an interdisciplinary effort to assess the long-term evolution of stratospheric ozone. His most recent work includes uncertainty quantification for tsunami models, with applications to insurance calculations and early warning systems.

For more on our series:
Listen to the full podcast
Read the summary of the DfID videos and podcasts
Listen to the full playlist of podcasts

Friederike Otto & Mamun Rashid on CPDN, Climate Modeling, and Extreme Weather


Data-Pop Alliance has been conducting ongoing research on Big Data, climate change and environmental resilience. With funding from the UK’s Department for International Development (DfID), we published a synthesis report evaluating the opportunities, challenges and required steps for leveraging the new ecosystem of Big Data and its potential applications and implications for climate change and disaster resilience. The report fed into the World Humanitarian Summit organized in Istanbul in May 2016.

This is the sixth podcast in a series of companion pieces that offer insights from the synthesis report.


This companion podcast to the synthesis report “Big Data for Climate Change and Disaster Resilience: Realising the Benefits for Developing Countries,” funded by the UK’s Department for International Development (DfID), was designed to probe experts to speak more about their work and their ideas about the potential and challenges of Big Data.

Titled “Friederike Otto & Mamun Rashid on CPDN, Climate Modeling, and Extreme Weather,” this podcast features two experts: Friederike Otto and Mamun Rashid. Friederike speaks about the importance of involving communities in the process of tapping Big Data for climate change resilience, from the creation of the data to analyzing, representing, and understanding it, thus increasing the impact the analysis can have on the ground. Mamun points to three requirements for effective Big Data analysis: quantified data, a good analysis model, and the right computing infrastructure to analyze the data efficiently.

Friederike Otto (@FrediOtto)

    Friederike Otto is a senior researcher at the Environmental Change Institute (ECI) at the University of Oxford and leads and coordinates the distributed computing climate modelling project climateprediction.net. Her main research interest is extreme weather events, and improving and developing methodologies to answer the question of ‘whether and to what extent external climate drivers alter the likelihood of extreme weather.’

Mamun Rashid (@CPDN_BOINC)

    Mamun Rashid is a Scientific Research Computing Specialist working on the climateprediction.net (CPDN) project at the University of Oxford. The project runs climate models on people’s home computers to help answer questions about how climate change is affecting our world. He is involved in infrastructure development and deployment for large-scale climate extreme event set analyses for the World Weather Attribution (WWA) project.

For more on our series:
Listen to the full podcast
Read the summary of the DfID videos and podcasts
Listen to the full playlist of podcasts

Patrick Vinck on Data Breadcrumbs, Who Should Get Involved, and Kobo Toolbox


Data-Pop Alliance has been conducting ongoing research on Big Data, climate change and environmental resilience. With funding from the UK’s Department for International Development (DfID), we published a synthesis report evaluating the opportunities, challenges and required steps for leveraging the new ecosystem of Big Data and its potential applications and implications for climate change and disaster resilience. The report fed into the World Humanitarian Summit organized in Istanbul in May 2016.

This is the first podcast in a series of companion pieces that offer insights from the synthesis report.


This companion podcast to the synthesis report “Big Data for Climate Change and Disaster Resilience: Realising the Benefits for Developing Countries,” funded by the UK’s Department for International Development (DfID), was designed to probe experts to speak more about their work and their ideas about the potential and challenges of Big Data.

Titled “Patrick Vinck on Data Breadcrumbs, Who Should Get Involved, and Kobo Toolbox,” this podcast features expert Patrick Vinck. Patrick delineates what Big Data is and is not. He identifies individual data producers and users as primary stakeholders needed to apply Big Data to climate resilience. He emphasizes the need for these individuals not only to know about the data they are generating, but also to know how to use that data.

Patrick Vinck (@developmentdata)

    Patrick Vinck, Ph.D., is the director of the Program for Vulnerable Populations at the Harvard Humanitarian Initiative. He works on peace, reconstruction and development projects throughout Africa and Asia including most recently in the Democratic Republic of the Congo and the Ivory Coast. Patrick co-founded KoBoToolbox, a digital data collection project to advance human rights, humanitarian, and social science data collection.

For more on our series:
Listen to the full podcast
Read the summary of the DfID videos and podcasts
Listen to the full playlist of podcasts

Marion Dumas on “Democratic Data,” Crowdsourcing, and Opportunities and Pitfalls of Big Data


Data-Pop Alliance has been conducting ongoing research on Big Data, climate change and environmental resilience. With funding from the UK’s Department for International Development (DfID), we published a synthesis report evaluating the opportunities, challenges and required steps for leveraging the new ecosystem of Big Data and its potential applications and implications for climate change and disaster resilience. The report fed into the World Humanitarian Summit organized in Istanbul in May 2016.

This is the fourth podcast in a series of companion pieces that offer insights from the synthesis report.


This companion podcast to the synthesis report “Big Data for Climate Change and Disaster Resilience: Realising the Benefits for Developing Countries,” funded by UK’s Department for International Development (DfID), was designed to probe experts to speak more about their work and their ideas about the potential and challenges of Big Data.

Titled “Marion Dumas on ‘Democratic Data,’ Crowdsourcing, and Opportunities and Pitfalls of Big Data,” this podcast features expert Marion Dumas. Marion addresses the importance of improving collective decision processes through Big Data to better deal with problems of sustainable development, like climate change. She views Big Data as inherent to strengthening democracies as we deal with more complex climate issues. This “Democratic Data” use makes communities more inclusive and allows their voices to be heard in decision-making surrounding natural hazards.

Marion Dumas (@MarionDumas1)

    Marion Dumas is an Omidyar Postdoctoral Fellow at the Santa Fe Institute. Marion seeks to understand how institutions work and how they shape social change. The questions she asks about institutions and about mechanisms of social change are motivated by the challenges of sustainability, which she sees as a problem of collective adaptation to a complex environment. Thus she seeks to characterize the adaptive capacity of societies as shaped by their institutions.

For more on our series:
Listen to the full podcast
Read the summary of the DfID videos and podcasts
Listen to the full playlist of podcasts