Christina Koller, Sverre J. Herstad and I recently received the news that our joint paper “Does the composition of regional knowledge bases influence extra-regional collaboration for innovation?” is being published in Applied Economics Letters today.

The paper emphasizes that there is growing research interest in the relationship between the composition of regional knowledge bases and the extra-regional collaborative ties maintained by actors during their development work. To investigate this relationship, we use patent data to characterize European NUTS 3 regions by their i) comparative Technological Specializations and ii) Related Technological Variety. We find that domestic, extra-regional collaboration is negatively associated with both regional Technological Specialization and Related Technological Variety. At the same time, we find that Related Technological Variety supports international innovation collaboration.

With the data ready, the estimation was technically a bit demanding, because we had a balanced panel and a fractional response variable. At first, this reminded us very much of the situation in Papke and Wooldridge (1996), who introduce a regression model for fractional responses in cross-sectional data. Their model resembles the traditional logit or probit model for binary responses. Papke and Wooldridge more recently extended the model to balanced panels (Papke and Wooldridge, 2008; Wooldridge, 2011) and suggest estimating population-averaged panel-data models with a Generalized Estimating Equation (GEE; Liang and Zeger, 1986). We therefore estimate a generalized linear model of our fractional dependent variable on the independent variables. In our particular case, we use the logit function as the link function and the binomial distribution as the distribution family. Additionally, GEE estimation requires the specification of a working correlation matrix, which postulates that the correlations are not a function of the independent variables. We employ a so-called ‘exchangeable’ correlation matrix, which assumes the same correlation between any two observations of the same panel unit and is particularly suitable here, as our panel has a rather small time dimension (Papke and Wooldridge, 2008). To control for fixed technology and country effects, we follow Papke and Wooldridge (2008).
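To make the estimation idea more concrete, here is a minimal from-scratch sketch of a GEE with logit link, binomial variance function and exchangeable working correlation on a balanced panel. It is not the code used for the paper: the single regressor, the function names, the simulated data and all parameter values are assumptions made purely for illustration, and in practice one would rely on the GEE routine of a statistics package and add the technology and country controls.

```python
import math
import random


def expit(z):
    """Inverse logit link: maps the linear index to a mean in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


def fit_gee_logit_exchangeable(panels, max_iter=100, tol=1e-8):
    """Fit E[y | x] = expit(b0 + b1 * x) by GEE on a balanced panel.

    panels: one list per region, each holding T (x, y) pairs with y in [0, 1].
    Returns (b0, b1, rho), rho being the common within-region correlation.
    """
    n, T, p = len(panels), len(panels[0]), 2
    b0 = b1 = rho = 0.0
    for _ in range(max_iter):
        # Pass 1: Pearson residuals and variance-weighted regressors.
        rows = []
        for region in panels:
            mu = [expit(b0 + b1 * x) for x, _ in region]
            sd = [math.sqrt(m * (1.0 - m)) for m in mu]  # binomial variance
            res = [(y - m) / s for (_, y), m, s in zip(region, mu, sd)]
            z1 = [s * x for s, (x, _) in zip(sd, region)]
            rows.append((sd, z1, res))

        # Moment estimates of the dispersion and the exchangeable correlation.
        phi = sum(r * r for _, _, res in rows for r in res) / (n * T - p)
        cross = sum((sum(res) ** 2 - sum(r * r for r in res)) / 2.0
                    for _, _, res in rows)
        rho = cross / ((n * T * (T - 1) / 2.0 - p) * phi)

        # Closed-form inverse of R = (1 - rho) * I + rho * J.
        a = 1.0 / (1.0 - rho)
        c = rho / (1.0 + (T - 1) * rho)

        def quad(u, v):
            """Return u' R^{-1} v for the exchangeable matrix R."""
            dot = sum(ui * vi for ui, vi in zip(u, v))
            return a * (dot - c * sum(u) * sum(v))

        # Pass 2: accumulate the estimating equations and their Jacobian.
        h00 = h01 = h11 = u0 = u1 = 0.0
        for z0, z1, res in rows:
            h00 += quad(z0, z0)
            h01 += quad(z0, z1)
            h11 += quad(z1, z1)
            u0 += quad(z0, res)
            u1 += quad(z1, res)

        # Solve the 2x2 system for the Fisher-scoring step.
        det = h00 * h11 - h01 * h01
        d0 = (h11 * u0 - h01 * u1) / det
        d1 = (h00 * u1 - h01 * u0) / det
        b0, b1 = b0 + d0, b1 + d1
        if max(abs(d0), abs(d1)) < tol:
            break
    return b0, b1, rho


# Simulated (not real) data: 300 regions observed over 4 periods.
random.seed(0)
true_b0, true_b1 = -1.0, 1.0
panels = []
for _ in range(300):
    u = random.gauss(0.0, 0.5)  # region effect shared across periods
    region = []
    for _ in range(4):
        x = random.gauss(0.0, 1.0)
        y = expit(true_b0 + true_b1 * x + u + random.gauss(0.0, 0.3))
        region.append((x, y))
    panels.append(region)

b0, b1, rho = fit_gee_logit_exchangeable(panels)
print(f"b0={b0:.3f}  b1={b1:.3f}  rho={rho:.3f}")
```

Because the model is population-averaged, the estimated slope should come out somewhat below the value used inside the simulated index: the unobserved region effect is averaged out rather than conditioned on. The positive estimate of rho reflects that same shared region effect.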


Liang, K.-Y. and Zeger, S. L. (1986) Longitudinal data analysis using generalized linear models, Biometrika, 73, 13–22.

Papke, L. E. and Wooldridge, J. M. (1996) Econometric methods for fractional response variables with an application to 401(k) plan participation rates, Journal of Applied Econometrics, 11, 619–32.

Papke, L. E. and Wooldridge, J. M. (2008) Panel data methods for fractional response variables with an application to test pass rates, Journal of Econometrics, 145, 121–33.