Using internet activity data to analyze human resources issues

This special issue of the International Journal of Manpower examines the potential and challenges of Internet data or Big Data for research in the social sciences, with a special focus on human resources issues. Internet data increasingly represent a large part of everyday life. The information is timely, perhaps even available daily, closely following the actual events. It typically involves large numbers of observations and allows for flexible conceptual forms and experimental settings.

The introductory paper by Nikolaos Askitas and Klaus F. Zimmermann, The Internet as a Data Source for Advancement in Social Sciences, reviews the issues and surveys the relevant literature. Internet data can be applied to a wide range of issues, including forecasting (e.g., of unemployment, consumption goods, tourism, festival winners and the like), nowcasting (obtaining relevant information much earlier than through traditional data-collection techniques), detecting health issues and well-being (e.g., flu, malaise and ill-being during economic crises), documenting matching processes in various areas of individual life (e.g., jobs, partnership, shopping, preferences), and measuring complex processes where traditional data have known deficits (e.g., international migration, collective bargaining agreements in developing countries).

The paper by Emilio Zagheni and Ingmar Weber, Demographic Research with Non-Representative Internet Data, addresses the two most critical methodological issues in the use of internet data: non-representativeness and selection bias. It proposes a framework for collecting web data and discusses possible estimation methods. The paper also surveys the relevant demographic literature, particularly in the area of migration, where useful data about the mobility process are typically scarce in traditional data sources.

Two papers study well-being using different data sources. In Health and Well-Being in the Great Recession, Nikolaos Askitas and Klaus F. Zimmermann use Google activity data to trace and document the impact of the 2008 Financial and Economic Crisis on well-being. They confirm previous knowledge from the economics of health, well-being and the business cycle. In their article A Web Survey Analysis of Subjective Well-being, Martin Guzi and Pablo de Pedraza employ data from the voluntary web survey of the WageIndicator project. They confirm that job characteristics affect job satisfaction and identify spillovers, since satisfaction in one domain affects other domains.

Margaret Maurer-Fazio and Lei Lei study the Chinese Internet job board labor market in their paper “As Rare as a Panda”: How Facial Attractiveness, Gender, and Occupation Affect Interview Callbacks at Chinese Firms. In a resume audit (correspondence) study, they examine how discrimination based on gender and facial attractiveness varies across occupation, location, and firms’ ownership type and size. They find that women are generally preferred to men and that unattractive job candidates are at a disadvantage.

In their paper Comparing Collective Bargaining Agreements for Developing Countries, Janna Besamusca and Kea Tijdens are the first to employ the web-based WageIndicator Collective Bargaining Agreement Database, covering 11 developing countries. They find that almost all collective agreements contain clauses on wages, but few specify wage levels. Their study also documents working hours, paid-leave arrangements and work-family arrangements.

The final paper, by Concha Artola, Fernando Pinto and Pablo de Pedraza, entitled Can Internet Searches Forecast Tourism Inflows?, represents the large literature on using internet data for forecasting purposes. Employing Google activity data, the authors demonstrate that traditional time-series forecasting models for tourism inflows into Spain can be improved by adding Google activity measures.

This special issue is of interest to researchers in this evolving field who want to keep up to date with developments in the area, to students who want to examine the potential application of such data to their own research, and to the wider public that wants to understand the reality it will face in the not-so-distant future.

